Knowledge Graphs & SEO: The next chapter | #KnowCon2020 Workshop

A talk by Dawn Anderson (Bertey), David Amerland (Davidamerland.com), Jason Barnard (Kalicube®), Hamlet Batista (RankSense) & Andrea Volpini (Wordlift).

“Knowledge Graph” is an overloaded term

Today Knowledge Graphs are becoming mainstream, and as this happens, more and more people associate Knowledge Graphs with data models, semantics, knowledge management, and ontologies.

For many other people, however, Knowledge Graphs still mean Google Search Info Boxes, panels, SERPs, and SEO (Search Engine Optimization).

They are all right.

The term Knowledge Graph was introduced by Google to signify the huge improvement that semantic technology brought to its search engine.

Over time, the extended search capabilities and components enabled by semantic technology have become namesakes for Knowledge Graph.

While the term Knowledge Graph has more meanings than this, it’s useful to return to the source.

The evolution of Knowledge Graph-powered Google search now extends to voice, assimilates information from JSON-LD markup beyond Wikipedia, and advances the state of the art in NLP (Natural Language Processing).

Let’s explore how this influences, and is influenced by, advances in semantic technology, where the evolution of SEO is headed, and what this means for knowledge graphs at large.

Published by: Knowledge Connexions. Guests: Jason Barnard, Dawn Anderson, David Amerland, Andrea Volpini. February 1, 2021.

Note: This transcript is substantially complete but ends mid-session. The closing portion of the panel was not recoverable from the source material.


Summary: Knowledge Graphs & SEO: The next chapter | #KnowCon2020 Workshop

1. UCD Framework: Understandability, Credibility, Deliverability

Jason Barnard articulated the three-pillar framework that would become central to The Kalicube Framework. Discussing what SEO professionals need to master, he stated: “Make sure that Google has understood who we are, what we offer, who our audience is. We need to make sure that once it’s understood, we’re the most credible solution for Google to present as a recommendation for its users. And thirdly, make sure we have it in the format that’s deliverable.” He noted he had published an article on exactly this framework in Search Engine Journal, referencing the three pillars by name: Understandability, Credibility, and Deliverability.

Historical significance: This panel constitutes one of the earliest recorded public articulations of what Kalicube would formalise as the UCD Framework - not merely as a content concept but as a structured three-part diagnostic for how search engines (and by extension AI systems) evaluate, trust, and serve brand information.


2. Communication as the Primary SEO Skill

Jason Barnard identified communication - rather than technical mastery - as the foundational skill of SEO: “The principal skill has to be communication. Whether it’s writing, making videos, you need to communicate with your audience and you need to communicate with the machine.” He connected this directly to UCD, arguing that effective communication with both humans and machines is what enables understandability, credibility, and deliverability simultaneously.


3. Disambiguation as the Number One Content Practice

When asked for practical content creation guidance, Jason Barnard named disambiguation as his first principle: “My number one top go-to thing is disambiguate. Take a big step back every time you write something. Take the time to actually think about it and rewrite it to the point at which it isn’t ambiguous - or you can make it as least ambiguous as possible - at least for Google.” He added that clarity for machines does not require sacrificing quality for humans: “Disambiguation within your copy doesn’t necessarily mean that it has to be boring and dull.”


4. The Entity Confusion Problem: Humans as the Source of Ambiguity

Jason Barnard argued that the deeper problem in disambiguation is not machine capability but human confusion. He gave the example of clients whose software product shares a name with the company itself: “When I say to them, ‘When you say that, do you mean the company or the software?’ they can’t actually tell me - they have to think about it.” He also cited his personal Boowa & Kwala project, where six distinct but co-named entities (characters, song in French, song in English, movie, TV series, podcast series) proved impossible to disambiguate cleanly even with deliberate effort. The point: machines can only be as clear as the humans feeding them data.


5. Entity-Based Content Models

Jason Barnard referenced collaborative work with Andrea Volpini on entity-based content models, describing the approach as “when we create our content, if we start thinking in terms of entities and their properties and attributes and the relationships to other entities, we start to really get to the core of what it is we’re talking about - the topics and why we are situated within our market vis-à-vis our audience.” This frames entity modelling not as a technical exercise but as a clarity and positioning discipline.


6. Consistency Across the Web as a Prerequisite for Machine Understanding

Jason Barnard described his standard client audit process: “I get all my clients to do an audit - go around the web, find all the information about [their brand], and just see how inconsistent it all is. It’s usually a bit of a shock.” He illustrated the problem with his own Kalicube Tuesdays live event experiment, noting that even with meticulous title and description matching across platforms, Google still created duplicate entities. His conclusion: “However consistent you are, the machine is still having a certain number of problems. But that doesn’t mean give up on being consistent - it’s the only way we’re going to get anywhere near getting the machine to understand.”


7. Machine Learning as Goal-Directed Feedback: A Practitioner’s Mental Model

Jason Barnard relayed a conversation with Frédéric Dubut from Bing about how machine learning actually works at the engineering level. The mental model: engineers set a goal, feed the machine a large verified dataset, observe the machine’s outputs, tag them good or bad, and feed corrective and reinforcing data back in - a bandit-style reward loop. Barnard’s framing: “The engineers are actually just looking at the goal and then trying… setting the metrics by which they’re judging the machine, and then feeding the data back in to correct the machine as and when it makes mistakes, and to encourage it when it’s doing it really well.” He argued this makes machine learning conceptually accessible for marketers without requiring mathematical expertise.


Historical Significance

This panel, held at Knowledge Connexions 2020 - a joint event of Connected Data London and the Knowledge Graph Conference - places Jason Barnard in formal academic and technical company (AI researchers, NLP specialists, knowledge graph engineers) discussing what would become core architecture of The Kalicube Framework (TKF).

Specifically documented here:

  • UCD Framework - named and defined publicly, cross-referenced to a contemporaneous SEJ article, in a knowledge-graph-focused academic/practitioner context. Dated to late 2020.
  • Disambiguation as content strategy - framed as the primary human contribution to machine clarity.
  • Entity confusion as a human problem - the argument that machines are limited by human ambiguity in content creation, not only by their own technical limitations. This underpins later TKF reasoning about why brand-controlled entity homes are necessary.
  • Consistency as the precondition for machine trust - consistent naming and description across platforms as the prerequisite for a machine to form a stable, unambiguous entity record. This becomes central to The Kalicube Process.

This panel also marks an early documented instance of Jason Barnard working in public collaboration with Andrea Volpini (WordLift) on entity-based content models - a partnership that would continue through to the 2025-2026 AAO positioning work.


Panels documented to date: 10

Concepts staked in this panel: UCD (public articulation, late 2020), Disambiguation-first content strategy, Entity confusion as human problem, Consistency as machine trust prerequisite, Communication as primary SEO skill, Entity-based content models (with Volpini)

Full Transcript: Knowledge Connexions 2020 - “Knowledge Graphs & SEO: The Next Chapter”


George Anadiotis: Good morning, or good afternoon everyone, depending on which part of the world you are joining us from, and welcome to Knowledge Connexions 2020. I’m George Anadiotis, part of the core organising team for the event, which is a joint venture coming to you from Connected Data London and the Knowledge Graph Conference. This is actually the first session we have in the whole event - the first day - and this is the first workshop, in which we have a great team of people who are going to be dissecting all the latest and greatest developments in SEO and knowledge graphs, with a little sprinkle of natural language processing and AI in the process. So you’re in for a really good time if this is your thing - and obviously it should be, since you’re here.

Just a few words: as you know, this is the first session of our event. I don’t really have to explain to anyone, I guess, how hard 2020 has been for all of us. This is traditionally the time of year when Connected Data London holds its flagship event, and so we didn’t let this hard year keep us back. This is our way of doing it online instead of face to face - but on the upside, it also gives us the chance to have a dream team together today.

Without further ado, I’m going to introduce David Amerland, who is the moderator for this workshop. Just by way of introduction, I was telling David earlier that it may actually be a good thing to mention how we met, because I think it’s relevant for this workshop and gives an idea of how knowledge graphs mean different things to different people and how you can approach them from different angles.

As some of you may know, I’m a contributor to Sifted, and at some point I had written an article about knowledge graphs. David read it, and I guess it seemed interesting to him - he sent me a message saying that what I was saying didn’t really make much sense. I quickly checked his background and realised: okay, this person actually knows what he’s talking about. But I also know what I’m talking about, so there must be some kind of misunderstanding. He probably means something different when he refers to knowledge graphs than what I mean. This was the beginning of a wonderful friendship, and also a very good example of how semantics come into play when you talk about knowledge graphs specifically, and more broadly.

David wears a number of hats - he’s an author, he’s into data mining, he does consulting, and he is a man of many talents including SEO and knowledge graphs, which is why he’s here moderating this workshop. I’m going to give him the floor. But before I do: while our guests are having the conversation, you can type questions, comments, or feedback in the chat box on the bottom right of the screen. I’ll keep an eye on it, and at certain points David will pause the conversation to take questions and relay them to the panel. That’s it from me - David, the floor is yours.


David Amerland: Okay, thank you very much for that introduction. And certainly, when we talk about knowledge graphs there’s always a lot of confusion. It’s been almost twenty years since the semantic web was first mentioned, about ten years since knowledge graphs started coming into play, and we now have artificial intelligence across the web being used in SEO - and nobody really knows how all these things clarify at the point of practical application. We have a brilliant panel. I’ll introduce each person in turn.

We start with Andrea Volpini - CEO and founder of WordLift, a company that helps brands increase visibility using semantic web technologies and artificial intelligence. We have Hamlet Batista - CEO at RankSense, an agile SEO platform for online retailers and manufacturers. We have Dawn Anderson - Managing Director of Bertey, an SEO consultancy and digital marketing agency. And last but not least, Jason Barnard - founder of Kalicube, who specialises in Brand SERPs and knowledge panels, and who I’ve personally seen run a number of very interesting experiments in SEO.

What we’re going to do is start a discussion, take a few questions, field them to the panel, and get everybody’s experience. The focus here is on practicality. As George mentioned, as questions come up please put them in the chat. Let’s start the first one: when we talk about knowledge graphs, what are we actually talking about? Let’s take this question with Andrea first, please.


Andrea Volpini: A knowledge graph, in simple terms, is a data structure - and it’s a data structure that connects one node to another node through an edge. This edge is what creates the information. The way in which we specify the connection is the way in which we convey information into a graph.

That’s almost a textbook definition, and in SEO terms those edges become the properties of the entity. These properties are part of types, and each type has its own set of attributes. So in SEO terms, you want to understand what type you are: are you representing yourself as a person, as a doctor, as a company? And what facets of your entity do you want to show? Because at the end, SEO is about marketing whatever you do best.
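
To make the node-and-edge definition concrete, here is a minimal sketch, assuming the graph is held as a plain list of (subject, predicate, object) triples. The entity names and property names are illustrative only and do not come from any real graph or from the panel.

```python
from collections import defaultdict

# Each edge is a (subject, predicate, object) triple; the predicate names the connection.
# All names below are hypothetical examples.
triples = [
    ("Example Clinic", "type", "Organization"),
    ("Jane Doe", "type", "Person"),
    ("Jane Doe", "jobTitle", "Doctor"),
    ("Jane Doe", "worksFor", "Example Clinic"),
]


def properties_of(node, edges):
    """Collect every outgoing edge of a node as property -> list of values."""
    props = defaultdict(list)
    for subject, predicate, obj in edges:
        if subject == node:
            props[predicate].append(obj)
    return dict(props)


print(properties_of("Jane Doe", triples))
```

Reading the edges back per node is what turns them into the "properties of the entity" described above: the type edge says what kind of thing the node is, and the remaining edges are the facets it chooses to show.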


David Amerland: Okay - and Dawn, anything to add on that?


Dawn Anderson: Yes - I read something recently in a book about natural language and knowledge graphs, and there’s a term: is-a. I think Andrea is alluding to that. Whenever we produce content, we constantly think about is-a - “this is a…” It’s mentioned extensively in books on ontology, the idea of the is-a relationship.


David Amerland: And Jason - let’s take the next demystifying question to you. What is an entity?


Jason Barnard: Oh, nice question. An entity is a thing - it’s something you can define. A person, a place, a road, a house, a dog - or, in fact, a concept such as economics. I think we tend to forget that economics is an entity: it’s something we can define and name. And the idea of entities - well, with Andrea at WordLift we’ve been doing entity-based content models, and I hope we’ll dig into that because it’s incredibly interesting. When we create our content, if we start thinking in terms of entities and their properties, attributes, and relationships to other entities, we start to really get to the core of what we’re talking about - the topics, and why we are situated within our market vis-à-vis our audience.

David Amerland: You mentioned properties when it comes to entities - can you clarify a little?


Jason Barnard: Well, it was Andrea who said “properties” earlier - I was just repeating it. I mean, I’ve been using my podcast as an example because it’s part of my knowledge graph experiments. I have a creative work series - the type is a podcast series - and within that series I have multiple podcast episodes. The episodes have properties which Andrea can explain.


David Amerland: Okay, let’s go to Andrea on this one.


Andrea Volpini: The properties are the attributes that specify a set of information. In the case of the podcast, a core property is of course the title - that’s the content people will look for. But then the edge conveys the real value behind the podcast, and one of the most important edges is whoever Jason is interviewing - the relationship between the two people, the author and the contributor, is what creates the uniqueness of that piece of content.

I think it’s interesting that we’ve already moved from the knowledge graph in the context of Google to the knowledge graph in the context of a website’s own data. And that’s quite an interesting jump, because when we talk about knowledge graphs, people in SEO tend to think about Google’s Knowledge Graph specifically. But I think we are now mature enough to also talk about building our own knowledge graph.
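
The podcast example can be sketched as JSON-LD built from a Python dictionary. The type and property names (PodcastSeries, PodcastEpisode, partOfSeries, author, contributor) are schema.org vocabulary; the titles and people are placeholders, and the choice of author for the host and contributor for the guest is an assumption made for illustration, not markup taken from any speaker's site.

```python
import json

# Placeholder values; only the vocabulary terms come from schema.org.
series = {"@type": "PodcastSeries", "name": "Example Podcast"}

episode = {
    "@context": "https://schema.org",
    "@type": "PodcastEpisode",
    "name": "Example episode title",       # the title property
    "partOfSeries": series,                # edge: episode -> series
    "author": {"@type": "Person", "name": "Host Name"},        # edge: episode -> host
    "contributor": {"@type": "Person", "name": "Guest Name"},  # edge: episode -> guest
}

markup = json.dumps(episode, indent=2)
print(markup)
```

The nested objects are the edges: each one connects the episode to another entity, which is where the uniqueness of the content is expressed.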


Jason Barnard: I think that’s kind of where David was coming from - saying, what is a knowledge graph, and how does it apply to us personally? And for us personally, it comes down to what WordLift is doing: build your internal knowledge graph.


Dawn Anderson: Ultimately, “knowledge graph” as a term has been popularised by Google. I remember last year at the Web Conference in San Francisco, one of the Google ontologists was quite emphatic about how knowledge graphs, the whole semantic web, and linked data have been around for a very long time - ever since Tim Berners-Lee talked about linked connected data. But Google has massively popularised the whole thing. So “knowledge graph” as an actual term is quite a loose one. And on attributes: I would compare them with classes of things. A vehicle is a… a car is a vehicle, a bike is a vehicle.
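
The is-a relationship in the vehicle example can be sketched as a small class hierarchy that is walked upward. This is an illustrative toy, assuming a single parent per class; the dictionary contents are hypothetical.

```python
# Hypothetical is-a edges: child class -> parent class.
IS_A = {"car": "vehicle", "bike": "vehicle", "vehicle": "thing"}


def is_a(item, ancestor, hierarchy=IS_A):
    """Follow is-a edges upward until the ancestor is found or the chain ends."""
    current = item
    while current in hierarchy:
        current = hierarchy[current]
        if current == ancestor:
            return True
    return False


print(is_a("car", "vehicle"))   # a car is a vehicle
print(is_a("vehicle", "car"))   # but a vehicle is not necessarily a car
```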


David Amerland: It does make sense. And Hamlet - let’s take the question of ontologies to you. What are they, how are they created, and how do they differ from a taxonomy?


Hamlet Batista: I feel we’ve been a little too high-level and abstract in our definitions, so let me give a layman’s explanation. Knowledge graphs and ontologies are best understood through examples. Take Abraham Lincoln: that name could refer to a person, a bridge, a tunnel, a university. The same label expresses completely different things. How do you solve this? For a human, context helps - you understand what it means from the surrounding information. But instead of talking about attributes, let’s talk about what a thing is capable of doing or being. A person is going to walk, breathe, have a birthday. A bridge doesn’t have those capabilities. So when you look at the entity - the label - and you look at the capabilities of that label, you can distinguish between a bridge, a university, and a person. That’s how you get from the academic level to the practical.
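
The capability-based distinction can be sketched as a simple overlap score between what a text mentions and what each type of thing can do or have. The capability sets below are hypothetical, and overlap counting is an assumed stand-in for whatever a real system uses.

```python
# Hypothetical capability sets per entity type.
CAPABILITIES = {
    "Person": {"birthday", "walks", "breathes"},
    "Bridge": {"length", "crosses", "opened"},
    "University": {"enrollment", "campus", "founded"},
}


def guess_type(observed):
    """Return the type whose capabilities overlap most with the observed clues."""
    scores = {name: len(caps & observed) for name, caps in CAPABILITIES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# The same label, "Abraham Lincoln", resolved by the clues around it.
print(guess_type({"birthday", "walks"}))
print(guess_type({"crosses", "length"}))
```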


Dawn Anderson: But sometimes even when we’re talking about the same person, the attributes aren’t always sufficient - because we might say “Mozart” while others use his full name or initials. There’s the problem of named entity determination even within a single entity.


Hamlet Batista: Yes, but that’s a different problem - it’s still the same person. It doesn’t matter whether you use an abbreviation or first name or last name. If you’re talking about Mozart, he’s going to have a specific birthday, specific works, specific attributes. You can identify him unambiguously from those attributes regardless of how the name is labelled. You’re able to tell who something is by the company it keeps.


David Amerland: For the audience: what this comes down to is that content which is as unambiguous and as clear as possible - clearly describing what a thing actually does - will help a search engine know what you’re referring to, and an audience understand what you’re talking about. We’re seeing that the technicalities are quite immense, but Hamlet’s approach of describing capabilities offers a very accessible way into understanding entity resolution. Let’s talk a little about context now. Dawn, how do context and intent drive the evolution of search?


Dawn Anderson: Much of what we’ve discussed comes down to what Susan Dumais called the vocabulary problem: so many people have different ways of saying exactly the same thing. Search is not a solved problem. I saw a really interesting analogy recently: “Search is like trying to understand where somebody is going and what they want to do in New York, from seeing them step off the plane at JFK.” That’s why search results are diversified on broad queries - if someone types a single word they give very little clue about intent. That’s why universal search exists, why there’s diversification: Google needs that user feedback loop of clicks to refine its understanding of what people actually wanted.


David Amerland: Andrea, the same question - with the additional dimension that your company uses AI to try and quantify human behaviour.


Andrea Volpini: The information is ultra-complex because humans are ultra-complex. Whenever you’re designing a system you always have to account for feedback loops. My lesson: whatever AI approach you use, always keep the human in the loop, because that’s what creates the value. Ontologies are one answer to this - from the philosophical point of view, it’s the study of things, the study of whatever exists. But within technology, an ontology is what allows you to create a system that gets the machine engaged with the human and creates value that would not exist without that interaction.


David Amerland: Hamlet, let’s take the question of ontologies from you.


Hamlet Batista: When you think about ontologies, think about organisation - think about the shelves in a library. Where does something fit in when it can be expressed in many different ways? Google says 15% of searches they see every day are completely new - never seen before. How do you build a system that can answer questions never previously asked? I did an experiment with four percent of never-before-seen queries from a client’s search console data, and even though Google hadn’t seen those specific phrases, it was able to classify them correctly: this query calls for a maps result, this one for news articles, this for images. That’s the value of having an organised ontology - the ability to assign new inputs to the right bucket quickly. It’s a way of organising things in a hierarchy that machines can traverse rapidly.


Jason Barnard: When I was talking to Nathan Chalmers from Bing - the person who runs the whole-page algorithm - what struck me was the discussion of user feedback. Bing at least admits they use click and user behaviour data extensively to understand what a whole page should look like. What Nathan was saying connects to what Hamlet’s saying: how do they address something they’ve never seen before? The sheer mass of data and user behaviour from previous examples gives them a foothold to predict what a never-before-seen query is actually trying to do. From Bing’s perspective, they felt fifteen percent was vastly overestimated as a figure, for what that’s worth.


Hamlet Batista: And you can use historical data to predict. In my experiment, even though these phrases had never been seen before, I converted them into embeddings, matched them semantically against historical keyword data, and found phrases that meant the same thing - even if they were phrased differently. So even novel queries can be resolved by semantic similarity to historical patterns.
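
The matching step can be sketched with cosine similarity over embedding vectors. The three-dimensional vectors and the phrases below are made up for illustration; in practice the vectors would come from an embedding model, and nothing here reproduces the actual experiment.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical vectors standing in for embeddings of historical queries.
historical = {
    "cheap flights to rome": [0.9, 0.1, 0.0],
    "best pizza recipe": [0.0, 0.2, 0.9],
}


def nearest(query_vec, known):
    """Return the historical phrase whose vector is closest to the query vector."""
    return max(known, key=lambda phrase: cosine(query_vec, known[phrase]))


new_query_vec = [0.8, 0.2, 0.1]  # pretend embedding of a never-before-seen phrase
print(nearest(new_query_vec, historical))
```

Because the comparison happens between vectors rather than strings, a new phrase can land next to a historical one even when the two share no words.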


David Amerland: Let’s boil all of this down to practical content-creation guidelines. One point from each of you, starting with Dawn.


Dawn Anderson: Building quickly on what Hamlet said: whilst fifteen percent of queries may never have been seen before, there will be seasonal aspects you can anticipate. Think about what’s happening throughout the year for your audience and map your content calendar accordingly.


Andrea Volpini: Seasonality is definitely one aspect. But I believe it’s also super important to talk with your actual clients or readers from time to time and shape the content model accordingly. Talk to the people who are making the searches. If you have the chance to speak with your consumers, ask everyone you know how they got to you and what they searched. Rarely do people do that, because we think of the internet as something abstract - but it’s made of people who landed on your page because they made a query. What was that query? Why did they arrive?


Jason Barnard: For content writing, my number one top go-to thing is: disambiguate. Take a big step back every time you write something. Take the time to actually think about it and rewrite it to the point at which it isn’t ambiguous - or you can make it as least ambiguous as possible, at least for Google. And don’t ever forget, as Andrea said, the web is made of people who are actually coming to your website. Disambiguation within your copy doesn’t necessarily mean it has to be boring, dull, or uninteresting for the human being. That’s a question of writing skill. Great content writers are back.


David Amerland: Hamlet, please.


Hamlet Batista: We get used to keyword tools and spend a lot of time in them. But something as simple as this: take a query you’re targeting, type it into Google, set your browser to your audience’s language and location - and look at what’s showing up. What does Google think this query is about? What are competitors offering, and are they getting it right? If they’re not, that’s an opportunity. Don’t stop at pulling keywords - go and actually search, look at the results, and understand the intent they imply.


David Amerland: Excellent. So: know your audience, create your content, and please test search queries before you finalise content. Now - what are the skills SEO professionals need today? Jason, please.


Jason Barnard: I think the principal skill has to be communication - whether it’s writing, making videos, you need to communicate with your audience and you need to communicate with the machine. In the case of SEO, that’s Google and Bing. I wrote an article in Search Engine Journal - much like Hamlet did - on Understandability, Credibility, and Deliverability, and for me that’s the crux of everything we’re doing.

We need to make sure that Google has understood who we are, what we offer, who our audience is. We need to make sure that - once it’s understood that we have an offer appropriate for its user’s intent - we are the most credible solution for Google to present as the recommendation for its users, because that’s what it’s doing: it’s recommending a result, an answer, or a solution. And thirdly, make sure we have it in the format that’s deliverable - either directly on the SERP, so that Google can serve it, or that Google believes we can deliver it, which covers all the technical SEO on your site. If you can get those three pillars together, you’re going to win the game.


David Amerland: Dawn, please.


Dawn Anderson: I’m doing quite a bit of research at the moment into Zipfian distribution and Zipf’s law. The principle is: a small number of things are massively important, and many, many things are not important individually, though there are a lot of them. The key is to be really clear on what matters most to your audience and focus on that. And as Jason alluded to: focus on the right medium for your audience. If you work in fashion, images are probably more important than words. If you work in fitness, it’s video. Don’t assume it’s always going to be about words. Consider Pareto and Zipf’s law in everything you do.
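
The "small number of things are massively important" principle can be illustrated numerically. The sketch below assumes an idealised Zipf distribution in which the item at rank r has weight 1 / r; real query or topic data will only approximate this, and the exponent is an assumption.

```python
def zipf_share(top_k, total_items, exponent=1.0):
    """Share of total volume held by the top_k ranks when weight ~ 1 / rank**exponent."""
    weights = [1 / rank**exponent for rank in range(1, total_items + 1)]
    return sum(weights[:top_k]) / sum(weights)


# How much of the total do the top 10 of 1,000 items account for?
print(round(zipf_share(10, 1000), 3))
```

Under these assumptions the top 10 items out of 1,000 account for roughly 39% of the total, while the remaining 990 individually matter very little but together make up the rest.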


David Amerland: Andrea, please.


Andrea Volpini: Data publishing is always my answer. Make sure you’re dealing not only with the searcher in the form of a human being, but the searcher in the form of a machine that will help the human get to you. In order for the machine to understand you, publish as much data, in the most accessible and clean way possible, so that you can be understood by the searcher - and by the agent that helps the searcher find you.


David Amerland: Brilliant. And Hamlet, please.


Hamlet Batista: I agree with Jason that communication is the most important skill. I see a lot of the problem even when SEOs talk to clients - we talk about page authority, link juice, technical terms the client can’t connect to business value. Clarity of communication in everything you do. And I’ll tell you a short story: I was talking to a prospect last week who had all this content, was getting some performance, but felt something was missing. He’d been to a guitar website and felt excited by the content - the authors, the videos. He said: “I don’t feel that way when I go to my website.” The content had been placed there without even associating authors with it. You have to communicate clearly and make sure that when people click, they’re going to be satisfied. A lot of SEOs just say, “I got the ranking.” But nobody’s clicking because the search snippet isn’t compelling. And when they do click, the page has to deliver on what the snippet promised. The technical side can be learned. The foundational thing is communication.


David Amerland: Excellent - in one word, summarising everything: communication. And consistency -


Hamlet Batista: And consistency.


David Amerland: Very true. So we see that machine learning is being used increasingly in search. How does this affect the creation and evolution of knowledge graphs? Andrea, please.


Andrea Volpini: A lot of new developments relate to the new language models capable of synthesising information. The knowledge graph is evolving very fast because it now grabs unstructured information and compiles it into triples using models like T5. We see this in experiments with JSON-LD: far more information is now being sourced from the open web. Where before we saw higher importance placed on structured sources like Wikipedia, Crunchbase, and LinkedIn - those are still relevant - now more is being sourced from the broader open web. This is an opportunity for anyone writing content: be consistent and help the machine understand what can be triplified, what information really matters. And as data publishers, ensure information is accessible and interoperable with other datasets, so knowledge graphs can continue to evolve.


David Amerland: Dawn, the same question - machine learning, AI, and their influence on search and knowledge graphs.


Dawn Anderson: What’s happening is things are getting increasingly more accurate over time. We’ve talked about consistency, and there are still a lot of inconsistencies. Interestingly, at Google’s Search On event a few weeks ago, they talked about Google Datasets. There’s no shortage of data any more - in fact, the level of data creates its own ambiguity because you have so many different bodies creating more and more data banks. We all know the issue where a featured snippet pulled an incorrect image - for example, a result about how many legs a horse has pulling a wrong answer from a different context. That happens because data is being pulled from multiple places. Over time, I think the big focus will be on data alignment - aligning data across sources. Natural language is complicated, but hopefully over time things will get less and less ambiguous.


David Amerland: And on consistency at a practical level - what do we mean by it?


Dawn Anderson: NAP is a really good example: name, address, phone number. When someone rebrands, you end up with some data sources carrying the old name and some the new, and you get confusion where a site ranks for one name one minute and a different name the next. Practically: keep every mention of your entities consistent - the same data everywhere, every time.
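
A consistency check of this kind can be sketched as a comparison of the same fields across sources. The listings below are hypothetical, and the comparison is deliberately an exact string match, on the assumption that even a stray full stop or a differently formatted phone number counts as an inconsistency worth reviewing.

```python
from collections import defaultdict

# Hypothetical listings gathered from different platforms.
listings = [
    {"source": "site", "name": "Example Ltd", "phone": "+44 20 7946 0000"},
    {"source": "directory", "name": "Example Ltd.", "phone": "020 7946 0000"},
    {"source": "social", "name": "Example Ltd", "phone": "+44 20 7946 0000"},
]


def inconsistent_fields(records, fields=("name", "phone")):
    """Return each field with more than one distinct value, listing sources per value."""
    report = {}
    for field in fields:
        seen = defaultdict(list)
        for record in records:
            seen[record[field]].append(record["source"])
        if len(seen) > 1:
            report[field] = dict(seen)
    return report


for field, variants in inconsistent_fields(listings).items():
    print(field, variants)
```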


David Amerland: Would you agree with that, Jason?


Jason Barnard: Yes - and even without changing your brand name, there’s an awful problem with consistency. You’ve got different employees who place information in different places in different ways, especially over time as employees change. I get all my clients to do an audit: go around the web, find all the information about themselves, and just see how inconsistent it all is. It’s usually a bit of a shock. Then go around and correct it all.

And here’s the thing about consistency: as part of my knowledge graph experiments, I created an event called Kalicube Tuesdays - a live-streamed event every week on YouTube - and I can now push those into the Knowledge Graph in five minutes or less. They basically go straight in, which is wonderful. But then I place the event on different platforms, and I spent a lot of time making sure that every single time I mentioned the event, the title was exactly the same, the description was exactly the same. And Google was still creating duplicates, however careful I was. The machine is obviously still learning - there’s clearly still a learning process. But: however consistent you are, the machine is still having a certain number of problems. That doesn’t mean give up on being consistent. It’s the only way we’re going to get anywhere near getting the machine to understand and disambiguate correctly.

In terms of machine learning - Frédéric Dubut from Bing talked to me about how he perceives it. What they’re doing is: they give the machine a large chunk of verified, sorted, correctly categorised data, and they say, “Okay, this datum goes to this goal.” They give the machine a goal - great results, or whatever it may be - and then they take the data the machine produces in trying to achieve that goal, tag it either good or bad, and feed it back in. They encourage the machine with good examples and give it corrective data with bad ones. A bandit algorithm - rewarding the good, correcting the bad. The one-armed bandit with a carrot and a stick. The idea of machine learning, if you look at it that way, is that engineers are actually just looking at the goal and then setting the metrics by which they judge the machine’s output, feeding data back in to correct it when it makes mistakes and to encourage it when it’s doing well. That framing made machine learning much easier for me to grasp without understanding the technical details - and what that comes down to, as a marketing professional, is what Hamlet described: communicating clearly with our audience. That is marketing.
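
The reward-and-correct loop Jason describes can be illustrated with a textbook epsilon-greedy bandit. This is a sketch under stated assumptions and makes no claim about how Bing or Google actually train their systems: the two "arms" stand for two hypothetical ranking strategies, the reward probabilities are invented, and `run_bandit` is an illustrative name.

```python
import random


def run_bandit(reward_prob, rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: mostly pick the best-looking arm, sometimes explore."""
    rng = random.Random(seed)
    counts = {arm: 0 for arm in reward_prob}
    estimates = {arm: 0.0 for arm in reward_prob}
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.choice(list(reward_prob))       # explore a random arm
        else:
            arm = max(estimates, key=estimates.get)   # exploit the current best
        # Feedback step: the output is tagged good (1.0) or bad (0.0).
        reward = 1.0 if rng.random() < reward_prob[arm] else 0.0
        counts[arm] += 1
        # Incremental mean: nudge the estimate towards the observed feedback.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


# Assumed chance that each hypothetical strategy gets tagged "good".
counts, estimates = run_bandit({"ranking_a": 0.2, "ranking_b": 0.8})
print(counts)
print(estimates)
```

The engineers' job in this picture is choosing what counts as a reward; the loop itself only follows the feedback it is given.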


David Amerland: Brilliant. And Hamlet - machine learning, AI, their influence on search and knowledge graphs?

(Hamlet offered to share a live demo.)


Hamlet Batista: I posted a link to an article showing how to build a knowledge graph from scratch. I took the XML sitemap from Search Engine Journal, filtered the articles for this year, turned the URLs into headlines, pulled entities and relationships from the headlines, and built a knowledge graph from this automatically - no manual input. So for the relationship “launches,” I can see: “Google launches high-demand fields,” “Vlog SEO,” “Microsoft Marketing O’Clock podcast,” “Coronavirus queries.” Look how powerful this is - fully automatic, just from the sitemap.

The library I used takes text and applies named entity recognition at about 60-70% accuracy, which is better than doing it manually and is very effective for this purpose. It labels text with entity types: person, organisation, product, date. And then a syntactical parser of natural language finds the edges - the relationships between subjects and objects. With this, you can turn unstructured text into something practically useful. The code is in the article - you just hit play on the steps. Hopefully you find that useful.
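
The overall shape of that pipeline (URL to headline to subject, relation, object) can be sketched with the standard library alone. This is a deliberate simplification and not the code from Hamlet's article: it swaps the named entity recogniser and syntactic parser for a fixed, hypothetical list of relation verbs, and the URLs are invented examples.

```python
import re
from collections import defaultdict

# Hypothetical relation verbs; a real parser would discover these itself.
RELATIONS = ("launches", "adds", "updates")

PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<relation>" + "|".join(RELATIONS) + r")\s+(?P<object>.+)$",
    re.IGNORECASE,
)


def slug_to_headline(url: str) -> str:
    """Turn the last path segment of a URL into a space-separated headline."""
    slug = url.rstrip("/").rsplit("/", 1)[-1]
    return slug.replace("-", " ")


def build_graph(urls):
    """Map (subject, relation) to the list of objects found in the headlines."""
    graph = defaultdict(list)
    for url in urls:
        match = PATTERN.match(slug_to_headline(url))
        if match:
            key = (match["subject"], match["relation"].lower())
            graph[key].append(match["object"])
    return dict(graph)


# Invented sitemap entries for illustration only.
urls = [
    "https://example.com/google-launches-new-search-feature/",
    "https://example.com/microsoft-launches-podcast/",
    "https://example.com/how-to-write-title-tags/",
]

graph = build_graph(urls)
for (subject, relation), objects in graph.items():
    print(subject, relation, objects)
```

Headlines without one of the listed verbs are simply skipped, which is where a proper parser would recover far more edges than this sketch does.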


David Amerland: Larry from the audience is asking: is Python important for technical SEOs?


Hamlet Batista: For technical SEOs, I’d say: master the Search Console tools from Google and Bing - they have video series and academies covering the fundamentals of how search engines crawl, index, and rank. Understanding structured data is also very important. Schema.org standards can be seen as an ontology - a way of categorising and organising information that Google uses to power rich features in search results. A good number of those elements power rich results directly.
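
As a small illustration of the structured data point, the sketch below builds Schema.org markup as JSON-LD in Python. `Organization`, `name`, `url` and `sameAs` are real Schema.org terms; the company details and the `organization_jsonld` helper are hypothetical, and whether any given markup earns a rich result is up to the search engine.

```python
import json


def organization_jsonld(name, url, same_as):
    """Serialise a minimal Schema.org Organization as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),  # other profiles that describe the same entity
    }
    return json.dumps(data, indent=2)


# Hypothetical company details.
markup = organization_jsonld(
    "Acme Widgets Ltd",
    "https://www.example.com/",
    ["https://www.linkedin.com/company/example"],
)

# JSON-LD is usually embedded in the page inside a script tag of this type.
script_tag = f'<script type="application/ld+json">\n{markup}\n</script>'
print(script_tag)
```

Generating the markup from one source of truth also helps with the consistency issue raised earlier, since every page then states the entity's name and profiles in the same way.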


David Amerland: Right. Let’s ask for a couple of practical takeaways from all this as we get into the second half of the session - as you create content, what should the audience be looking out for? Let’s start -

(Transcript ends here - closing portion of panel not recoverable)




Panels documented: 10

Running concept list staked in this transcript: UCD Framework (named, public, cross-referenced to SEJ article, late 2020) · Disambiguation-first content strategy · Entity confusion as a human problem · Consistency as machine trust prerequisite · Communication as primary SEO skill · Entity-based content models (with Andrea Volpini) · Machine learning as goal-directed feedback loop (Frédéric Dubut / Bing)
