InLinks: The Knowledge Panel Ep 2: What Schema Matters?
This is a panel discussion on which schema is most essential to have on a website, and why!
Who is on the Panel?
* Dixon Jones (Your host and CEO of InLinks)
* Iman Hamdan, a passionate SEO from Yahoo Small Business
* Martha van Berkel, owner of Schema App, an all-round schema creation SaaS
* Jason Barnard, owner of Kalicube® and a leading advocate for Knowledge Panels and “Brand search”.
Published by: InLinks. Guest: Jason Barnard, Iman Hamdan, Martha van Berkel. Host: Dixon Jones. August 17, 2020.
The Knowledge Panel Show - Episode 2: “What Schema Matters?”
Panel | InLinks Live Stream
Date: 17 August 2020 (streamed live)
Host: Dixon Jones (CEO, InLinks)
Producer: David Bain (Cast & Cred)
Panellists: Jason Barnard (Kalicube®), Martha van Berkel (Schema App), Iman Hamdan (Yahoo Small Business)
Recording: InLinks YouTube / Facebook / Twitter channels
1. Brand SERP as Business Card, Digital Ecosystem Window, and Credibility Signal
Jason Barnard opened with a characteristically compressed articulation of the Brand SERP’s function: “It’s your business card. People look you up whether they’re about to do business with you or whether they’re already doing business with you. It’s also a window into your digital ecosystem - you can see where things are going wrong with your reputation. And it’s also a window into your content strategy.” This is one of the earliest recorded instances of Barnard framing the Brand SERP simultaneously as a reputational, diagnostic, and strategic instrument - not merely a ranking outcome.
2. Entity Home - “Find a Home for Your Entity and Stick to It”
Jason Barnard recounted a pivotal experiment: after Wikimedia editors removed his Wikipedia pages (for Boowa & Kwala characters, his folk-punk group, and his personal page), he expected his Knowledge Graph presence to collapse. It did not. He subsequently discovered the real damage came from himself - he moved his schema markup from his homepage to an interior page, and “Google just freaked out and dropped it and created a new entity within 24 hours.” His conclusion, stated explicitly: “Find a home for your entity and stick to it.”
Historical significance: This is the earliest documented public articulation of the Entity Home concept as a strategic principle - framed not as a technical recommendation but as a foundational rule derived from live experimentation. The Wikipedia removal experiment also constitutes the first recorded demonstration that Wikipedia is not a prerequisite for Knowledge Graph presence, and that the Entity Home (a controlled page on the brand’s own site) outperforms third-party sources when schema markup is correctly applied.
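The "one home, and stick to it" rule can be illustrated with a minimal Python sketch that builds JSON-LD for an entity anchored to a single stable page. The URLs, the `#person` fragment, the entity name, and the helper name `entity_home_markup` are all hypothetical; the panel does not say what markup Barnard actually used, so treat this as one plausible shape rather than his implementation.

```python
import json

# Hypothetical URLs: the Entity Home is one stable page you control.
ENTITY_HOME = "https://example.com/about/"
ENTITY_ID = ENTITY_HOME + "#person"  # stable identifier; assumed convention


def entity_home_markup(name, same_as):
    """Build JSON-LD describing the entity, anchored to one stable page."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "@id": ENTITY_ID,
        "name": name,
        "url": ENTITY_HOME,
        # Corroborating profiles elsewhere (placeholder URLs).
        "sameAs": list(same_as),
    }


markup = entity_home_markup(
    "Jane Example",
    ["https://example.org/profiles/jane-example"],
)
print(json.dumps(markup, indent=2))
```

The point of keeping `ENTITY_HOME` as a single constant is that moving the markup to another page means changing the identifier, which is the kind of move the anecdote above warns against.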
3. Schema Markup as Machine Confidence and Corroboration Signal
Jason Barnard articulated what would become a core TKF principle: that schema markup functions not primarily as a rich-result trigger but as a corroboration mechanism that raises the machine’s confidence in its own understanding: “Whatever schema markup you use is always going to be positive as long as it’s accurate, simply because it confirms and corroborates and gives Google confidence that it correctly understood what was in the page in the first place.” He extended this to meta descriptions: “Please do write your meta descriptions, because it helps them confirm their summary is correct.” And drew the strategic implication: “If the machine is confident, it will put it front and centre. If it isn’t, it will try and kind of push it down a bit because it isn’t really 100% sure.”
Historical significance: This is the earliest recorded public articulation of what TKF would formalise as the Corroboration Threshold and the confidence-to-visibility relationship - the idea that machine confidence is a binary-ish gate determining whether a brand appears prominently or not.
4. Knowledge Graph Entry via Events - The Kalicube Tuesdays Experiment
Jason Barnard described a live experiment: he had observed that Semrush entered the Knowledge Graph via its events data before its Wikipedia page appeared. He designed Kalicube Tuesdays - a weekly live-streamed podcast - as a deliberate entity network: “I’ve got the event, I created the event series with lots of events in it, with people who are entities, with topics who are entities, me who’s an entity, my company that organises it - an entity - and two sponsors, WordLift and Semrush - also entities.”
The acceleration of Knowledge Graph uptake was dramatic: “It took me months to get the first one, days to get the second one, one day to get the third one, five hours to get the next one, and then ten minutes to get the last one - ten minutes from when I posted to when it actually appeared in the Knowledge Graph API.”
He also noted the unexpected first entry: “I got my first sprout in the Knowledge Graph for Ted Rubin, who isn’t in the Knowledge Graph. I thought it would be somebody like Rand Fishkin.”
Historical significance: This is the earliest recorded public description of using structured event schema as a deliberate path into the Knowledge Graph - and the first documented description of the accelerating uptake pattern that would later be theorised in TKF as the trust momentum / entity confidence build-up.
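For readers who want to watch for an entity appearing in the Knowledge Graph API, here is a minimal Python sketch against Google's Knowledge Graph Search API (`entities:search`). It only builds the request URL and parses a response; no network call is made. The sample response is made-up data with a placeholder ID, and reading the API's `resultScore` field as the "confidence score" discussed in the panel is an assumption, not something the panel states.

```python
from urllib.parse import urlencode

KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_query_url(query, api_key, limit=5):
    """Build a Knowledge Graph Search API request URL."""
    params = {"query": query, "key": api_key, "limit": limit}
    return KG_ENDPOINT + "?" + urlencode(params)


def extract_scores(response):
    """Return (name, id, resultScore) for each entity in a parsed response."""
    rows = []
    for item in response.get("itemListElement", []):
        result = item.get("result", {})
        rows.append((result.get("name"), result.get("@id"), item.get("resultScore")))
    return rows


# Made-up sample in the documented response shape (placeholder ID and score).
sample_response = {
    "@type": "ItemList",
    "itemListElement": [
        {
            "@type": "EntitySearchResult",
            "result": {"@id": "kg:/g/placeholder", "name": "Example Event"},
            "resultScore": 12.5,
        }
    ],
}

print(build_query_url("Kalicube Tuesdays", "YOUR_API_KEY"))
print(extract_scores(sample_response))
```

Running the same query on a schedule and logging the first time a name shows up in `extract_scores` would give the kind of "time to appear" measurement described above.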
5. Splitting Entities onto Individual Pages - 8x Confidence Score Multiplication
Jason Barnard described an experiment with his Boowa & Kwala characters: he initially had one page listing all nine characters (five yellow koalas, four blue dogs). After splitting them into nine individual pages and explicitly marking up the relationships between them - “This one is this one’s mother, this one’s father, wife, husband, child” - he reported: “It actually exploded - we multiplied the confidence score by eight by splitting them down into their individual pages.”
Historical significance: This is the earliest recorded public demonstration of entity isolation combined with relationship schema as a mechanism for multiplying Knowledge Graph confidence scores. It directly prefigures TKF principles around Entity Home specificity and the value of explicit relationship markup.
6. Ali Alvi (Bing) - Featured Snippets as a Separate Algorithm
Jason Barnard cited a conversation with Ali Alvi of Bing, describing how Bing’s featured snippet (which Bing calls “QA”) operates: “The featured snippet runs along the same lines, with the same algorithms, as the snippets under the blue links. The snippets under the blue links are not part of the same algorithm as the blue link ranking algorithm.” The process: the machine summarises the full page, infers the implicit question the page answers, and then matches that implicit question to the user’s query. Barnard noted this explains why meta descriptions are valuable - they help the machine confirm its page summary is correct.
Historical significance: This is the first recorded citation of Ali Alvi in the prior art record. The Alvi conversation is referenced in later TKF work and in the ARGDW series, where the Served gate and feedback loop are discussed. This panel establishes the citation’s date: August 2020.
7. Entity-Based Content Model with WordLift - Schema Forcing Strategic Clarity
Jason Barnard described building an entity-based content model for his podcast in collaboration with WordLift: “The first thing that struck me was how very badly organised it was in my own head, and how much WordLift’s schema markup forced me to think it through.” The exercise - defining entities, properties, and relationships in schema - changed his podcast’s direction and organisation: “It actually changed the course of the podcast, the way I was organising it, the way I was developing it, because it made me think in a much more structured and logical manner.”
8. The Entity-Based Future and the Strategic Imperative
Jason Barnard closed with a forward-looking statement: “If the future is entity-based search, being recognised as an entity and being in the Knowledge Graph is the deal - and start now.”
Historical significance: A clear, timestamped public prediction of entity-based search as the governing paradigm, made August 2020.
Historical Significance - Panel Level
This panel, streamed live on 17 August 2020, documents a cluster of foundational TKF and TKP concepts in their earliest recorded public form:
- Entity Home as an explicit strategic principle, derived from the Wikipedia removal experiment
- Machine confidence as visibility gate - corroboration through schema raises confidence, which drives prominence
- Events as a Knowledge Graph entry path - demonstrated with the Kalicube Tuesdays experiment and a documented acceleration curve
- Entity isolation + relationship markup as a multiplier of Knowledge Graph confidence scores
- Ali Alvi citation - earliest documented reference; establishes August 2020 as the date of this conversation
- Schema markup as strategic content forcing function - not just a technical overlay but a tool for clarifying brand positioning
Panels documented to date: 11
Concepts staked in this panel: Entity Home (named principle, August 2020) · Machine confidence as visibility gate · Corroboration Threshold · Events as KG entry path · Entity isolation ×8 confidence score experiment · Ali Alvi / Bing featured snippet algorithm citation (dated) · Entity-based content model with WordLift · Entity-based future prediction
Full Corrected Transcript
The Knowledge Panel Show - Episode 2: “What Schema Matters?”
17 August 2020
Dixon Jones: What schema matters? Hello - this is The Knowledge Panel Show, Episode 2. I’m Dixon Jones, CEO of InLinks.net, and my producer is here, David Bain of Cast and Cred. How are you today, David?
David Bain: I’m very well, thanks indeed, Dixon - and all the happier for joining such a wonderful panel of experts talking about schema today. It’s going to be great. And if anybody wants to ask questions, please do. Where are we streaming, and when can people ask questions?
Dixon Jones: Sure. Well, we’re streaming in three different places at the moment - on the InLinks channels on YouTube, on Facebook, and on Twitter. So wherever you’re watching, please add your views, ask questions if you’ve got them, and we’ll try our damnedest to incorporate them as part of the conversation today.
Okay - today’s question, for The Knowledge Panel under the microscope, is: what schema matters? And I think we’ve got a fantastic panel. So why don’t we go from, as I see them, top to bottom. Jason, why don’t you introduce yourself first? Tell us who you are.
Jason Barnard: Oh, great - thank you very much, Dixon. Lovely to see you all. Wonderful panel, wonderful question as well. I’m Jason Barnard, I’m The Brand SERP Guy®. I’m obsessed by Brand SERPs - which is what appears when somebody Googles your brand name or your personal brand name. A lot of people think: “I don’t need to worry about that - I rank number one, so the rest of it doesn’t matter.” And it does matter. Because it’s your business card. People look you up whether they’re about to do business with you or whether they’re already doing business with you. It’s also a window into your digital ecosystem - you can see where things are going wrong with your reputation. And it’s also a window into your content strategy: you can look at your content strategy and, for example, if you don’t have videos ranking, maybe you should be looking more at your video strategy. So I think it’s a great insight into lots of different aspects of who we are, what we do, and whether we’re credible.
Dixon Jones: Okay, great. And Iman - do you want to introduce yourself and say hello?
Iman Hamdan: Yes, hi everyone. My name is Iman Hamdan. I’m a mother of two, I live in the US, I’ve worked for enterprises for almost ten years, and roughly until now I’ve been trying to push all SEO and organic tactics - and that’s it - through Yahoo right now.
Dixon Jones: Yeah, we just plug the Yahoo bit because you know it gives you that push. And Martha - Martha, who’s been in the industry with schema forever, tell us about yourself and where your business is.
Martha van Berkel: Sure. My name is Martha van Berkel and I’m the CEO at Schema App. I’m coming to you live from Canada today. Schema App is an enterprise solution that also provides full-service schema markup - any type of schema on any type of site. Give us the most complex challenges and we’ll help you through that entire journey. You’ll get to experience a little bit of the Schema App experience. I’m really excited to talk not just about code - which is “give me the markup” - but how you turn that into business strategy. And I’m also a mum - I love that Iman led with that. I’m also an avid rower, so I’ve been enjoying some of the summer on the water.
Dixon Jones: And you’ve also got some research which you’re going to dive into in a little bit as well, which is great. So before I dive into your research - the event is hosted by InLinks, so if you haven’t tried InLinks.net yet, please go and try it. It’s got a free-forever version for twenty pages, and it will create entity-based schema for you, and internal links, and kind of work out what your entities are and do some content optimisation as well. Please do try that out.
But before I get on to the research Martha has - if we ended up today without discussing one thing, what would it be? Because I don’t want our audience to go without answering the central question: what schema matters? Jason, go on first - what schema matters to you?
Jason Barnard: Well, my big thing is the Knowledge Panel - and I love the title of the show, because it perfectly suits what I want to talk about. Schema markup related to the Knowledge Panel, and how important it appears to be. I mean, I’ve been doing lots of experiments - on events, on people, on blue and yellow cartoon characters, on TV shows, music groups, songs, and all sorts of stuff. Sometimes I do experiments with my clients, but I try to avoid that. But yes - schema markup in relation to the Knowledge Panel is the one thing I would really be disappointed if we missed.
Dixon Jones: All right. Iman, what about you?
Iman Hamdan: Mine is an interesting one but not easy to deploy - it’s text-to-speech schema. Hopefully we can discuss that. Other than that, the basic schemas we always deploy in enterprises are really basic for me, like the standard list.
Dixon Jones: Text-to-speech is really interesting - hopefully we’ll have time for that. But before we get to either of those, Martha - you’ve just done a whole lot of work around rich snippets. Maybe pull out a few pointers from the research. What did you do, and what did you find?
Martha van Berkel: Sure - a little context first. We work with all different sizes of businesses, from small through to large, and we’ve been trying to ask ourselves: which schema should you do? What schema actually matters? The way to do that is to look at the data. If you were to ask me what schema matters to start, I would say: what matters to your business? It always comes back to what kind of content you’re trying to get your customers to engage with, who you’re trying to be known for - it plays a little to Jason’s brand piece - and specifically, what questions are being asked and how you engage and entice that customer in.
What’s also interesting is that we’ve been seeing, quarter on quarter, a change in what rich results are performing. FAQ has been one we’ve seen a lot of changes with - lots of conversation: Marie Haynes and others have been observing this up-and-down with FAQ throughout July and August.
So we asked: what are the best-performing rich results? I’ll share just the top four. By clicks: led by Products, followed by Job Listings, FAQ, and Reviews - so a lot around e-commerce. By impressions, it’s a bit different: Reviews and FAQs lead, followed by Products and Job Listings. And click-through rate is another mix - video kind of leads CTR, followed by Job Detail, but also FAQ and Reviews.
What’s interesting about FAQ is that if you know how to properly nest schema markup, you can use FAQs across different types of content. It’s very flexible. More research coming from Schema App on this, especially building it out around FAQs - using data to inform our strategies for clients.
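One possible reading of "nesting" FAQ markup into other content is an `FAQPage` whose questions sit under `mainEntity` and whose `about` property points at whatever the page is really about. The Python sketch below assumes that pattern; the product, URL, and question text are hypothetical, and Schema App's actual approach is not described in the discussion.

```python
import json


def faq_page(url, about, questions):
    """Build FAQPage JSON-LD with nested Question/Answer pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "@id": url + "#faq",
        "about": about,  # ties the FAQ to the page's main topic
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in questions
        ],
    }


faq = faq_page(
    "https://example.com/products/widget/",  # hypothetical page
    {"@type": "Product", "name": "Example Widget"},
    [("Does the widget ship internationally?", "Yes, to most countries.")],
)
print(json.dumps(faq, indent=2))
```

The same helper works for a service page or a blog post by swapping the `about` node, which is the flexibility the comment above points at.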
Dixon Jones: Okay, amazing. I’ve just found out that questions from Facebook appear on my screen, which is absolutely amazing. Chris Lebaton has jumped in on Iman’s point: is speakable schema still valid? I heard that Google deprecated the type, but it seemed to still be there when I looked at the Google schema page today.
Martha van Berkel: I haven’t seen it deprecated. I think the thing that’s interesting is that in May 2019, Google released How-To and FAQ. You can also look at Question And Answer, which is a bit different from QA Page, which is a bit different from FAQ - but both ask people to structure their content in a question-and-answer format or a process format. How-To is where there’s a distinct process. When speakable first came out, it was specifically to highlight areas within larger bits of text that you wanted to be used for voice assistants. But then in May they said voice assistants would be reading all schema, and then they gave this easier way - without using XPath - to call out question-and-answer structures. So speakable hasn’t come out of beta for a long time. It’s still very focused on news and media type sites. They may be seeing the same structural value coming from FAQ. That would be my opinion.
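For reference, speakable markup in schema.org is a `speakable` property carrying a `SpeakableSpecification`, which selects page sections by `cssSelector` or `xpath`. A minimal Python sketch using CSS selectors follows; the page URL, title, and selector names are hypothetical.

```python
import json


def speakable_page(url, name, css_selectors):
    """Build WebPage JSON-LD that flags sections as suitable for text-to-speech."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors for the short passages to be read aloud (assumed class names).
            "cssSelector": list(css_selectors),
        },
    }


page = speakable_page(
    "https://example.com/news/story/",
    "Example story",
    [".headline", ".summary"],
)
print(json.dumps(page, indent=2))
```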
Dixon Jones: Is anyone seeing text-to-speech coming out when they use voice assistants?
Jason Barnard: I had a client - a very corporate, large B2B company - and we actually got quite a few texts reading out via featured snippets. I tend not to use OK Google or whatever it would be, but I used it to test it for them and we were quite successful just with FAQ. So I think maybe Martha’s right. But as you said, it’s still on that developers page and it just says “beta.”
Martha van Berkel: Can I bring up a related but different experience? I don’t know if anyone else has seen this - but when you search for something, in the actual page result, they’ve highlighted a portion of the page in yellow. There’s definitely some natural language processing going on where they’re extracting phrases within the content already, without speakable, that is then enhancing the search experience - even showing a yellow highlight on the page after the result. That’s coming from the featured snippet mechanism. So there are things going on that mimic speakable, but in sort of generic search results.
Iman Hamdan: Yeah - they mentioned that in the guidelines: in order for content to be speakable it should be about twenty seconds of content, clear enough to be picked up. I didn’t test it, but it’s really interesting to see that.
Jason Barnard: Just coming back to what Martha was saying - I find this incredibly interesting. On my podcast, I interviewed Ali Alvi, who runs Bing’s QA - which is what Bing calls the featured snippet. He was basically saying that the featured snippet runs along the same lines, with the same algorithms, as the snippets under the blue links. The snippets under the blue links are not part of the same algorithm as the blue-link ranking algorithm - which is incredibly interesting. And “featured snippet” - it’s called that because it’s a snippet that’s featured, which is brilliant.
What they do is: they go through the entire page and summarise it. That’s why Google is now saying, “Please do write your meta descriptions,” because it helps them confirm their summary is correct - if your meta descriptions are written honestly. They summarise the page and then they try and guess what the implicit question is that the page answers. In which case, they can match the question the person asks to the implicit question the document answers, from their own summary. Using, for Bing, Turing - and for Google, their equivalent.
Dixon Jones: Summarising the page - that’s obviously where InLinks comes in, because it tries to read the entities on that page, runs an NLP program, pulls out the entities from the page, turns that into WebPage schema, and adds sameAs links to Wikipedia articles to try and show Google what the page is about. But it doesn’t really show up in rich snippets or featured snippets in any direct way - though we do think it gives Google a much better understanding of the underlying page and content. Does WebPage About schema factor into what you do much, Martha?
Martha van Berkel: Not in a raw sense - because I’m a big believer that it comes back to what the page is actually about. Within the schema.org vocabulary there are over 800 different classes to describe everything from types of businesses to services to medical areas, finance concepts, and so on. I really believe that as you call out what the page is about, you can do that inherently within a very specific schema class. That’s foundational to how we think at Schema App: how do we use the most specific type? And we actually do entity linking now at scale, using the correct properties within specific types. WebPage is a good step forward - it’s better than nothing as a way to start describing the topic. But I think there’s a more specific approach you can take if you have the right resources.
Jason Barnard: Just coming back to Dixon’s point - saying “this entity is in the page, and here’s a reference to it in Wikidata or DBpedia” - I was working with a client today on that and it’s astonishing how ambiguous a lot of things are that we as human beings see as very clear. “BT” and “EE” - two companies - I was trying to look at in the page, and for them it was incredibly obvious, and for me I had no idea what BT and EE mean. Disambiguating must be a very strong and helpful signal to Google.
Dixon Jones: I agree - and a human setting those up tends to make mistakes too. I was doing a demo just before we came on air - a person’s name came up as an entity and I said: do not link this person’s name to this entity, otherwise this person is going to be coming up as a national football star instead of as a lawyer. You’ve got to get those associations right if you’re going to use WebPage About schema - same as with any mention-based schema.
We just got Sam Gooch coming in with a question: “I heard FAQ markup has recently been showing up less frequently in Google SERPs. I recently added FAQ markup for some key pages but none are showing up - are you seeing less of these at the moment?” Martha, you did the research on this.
Martha van Berkel: Yes. Starting around July 15th to 17th we saw what I’d call a first decline. We’ve seen about three different scenarios:
The first is what I call a drop and loss - it dropped on the 15th or 17th and hasn’t really recovered. Clicks and impressions have continued to stay low in Search Console’s performance report for FAQ. We’re seeing this particularly in e-commerce.
The second scenario is drop and maintain - it dropped on the 15th but didn’t go all the way to the bottom. It maintained a lower but steady rate. Our analysis is pointing to the finance area as where we see this pattern.
The third I call the drop and bounce - on the 15th to 17th it came right down, we saw it sort of recover a little around the 24th to 25th of July, then dive bomb again on the 4th of August. Around health-related sites we’re seeing more of this drop-and-bounce pattern.
We’re just about to start our next level of analysis: was it the application of FAQ that determined the outcome? Was it a pure FAQ page, or was FAQ nested as part of a collection, under a blog, as part of a collection page? We’ll have some answers likely by end of week.
Dixon Jones: The industry segmentation is really interesting. Is it surprising that FAQs are more prevalent in certain industries, or do you think there are separate algorithms targeting different verticals?
Martha van Berkel: I’m trying to figure that out based on the application of FAQ, to see if that has had a different lens. It just happens that we have relatively large groups of clients in those related industries. Marie Haynes, Lily Ray and others are seeing groupings by industry type in their analysis too, which aligns with what we’re seeing.
Jason Barnard: Yes - within my Brand SERP collection, which is now 70,000 brands tracked monthly, giving me millions of these data points, I’ve divided it into categories. I’m seeing enormous differences in Brand SERPs and Knowledge Graph presence between different industries. Within music, entertainment, and book publishing there’s a lot of Knowledge Graph presence; much less in other industries. There are enormous differences between verticals in any kind of SERP and results.
Dixon Jones: Yeah - and I guess we don’t know whether that’s an active algorithm that says “this is a specific industry and therefore this kind of result” or whether it’s a function of the kind of content people are writing. We just don’t know.
You mentioned event schema a little earlier, Jason - and you actually said it live, so you can no longer hide from it. Do you love event schema in relation to Knowledge Panels? Tell me more.
Jason Barnard: What I noticed was that Semrush got into the Knowledge Graph not because of their Wikipedia page, but through their events. I was tracking them on the Knowledge Graph API and their events turned up a long time before the entity itself did. Then suddenly they popped in - and that was during the Budapest update I wrote about in Search Engine Journal last year, when a lot of Knowledge Graph presence changed: confidence scores went right up, entities came in and went out, and events took a big push upwards.
I’ve been thinking: Google has done very well cataloguing things like books, films, and music - well catalogued online by fans and dedicated databases like IMDb and MusicBrainz. Events hadn’t been catalogued in that way, in a manner Google could actually rely on. And then COVID changed the picture completely because local businesses and events suddenly had to react - events in the Knowledge Graph became much more time-sensitive.
So I created Kalicube Tuesdays - which was just an excuse to have my podcast online because I couldn’t go to events to record them. I’ve had Martha on the podcast at BrightonSEO - that was face to face. Now I have to do it online.
What I’m now doing is creating this event series - Martha, you’ll like this - an event series with lots of events in it, with people who are entities, with topics who are entities, me who’s an entity, my company that organises it as an entity, and two sponsors - WordLift and Semrush - who are also entities. And now if you look in the Knowledge Graph, Kalicube Tuesdays is attached to me, WordLift, Semrush, John Lincoln, Jez Schültz, Ted Rubin, Rand Fishkin - and basically I’m barnacling onto all these people.
A lot of that - obviously it isn’t schema markup alone that made it happen - but the support of schema markup is very, very important. The schema markup I’m putting on my site supports what I’m putting on YouTube, Crunchbase, LinkedIn, Eventable, Eventbrite - I’ve really gone to town on this. But it all comes back to that schema markup. I think if I change the schema markup, there are reactions - it works less or better depending on how well I’ve actually applied it and how consistent I am across the whole thing.
Obviously Martha’s looking at big chunks of data. This is just me and my silly podcast that happens once a week, so I’ve got like fifteen examples - I can’t say this is how it always happens. But I can say this happened for me. And the really cool thing: my first entry in the Knowledge Graph was for Ted Rubin, who wasn’t already in the Knowledge Graph. I was surprised - I thought it would be somebody like Rand Fishkin. Rand Fishkin didn’t get into the Knowledge Graph through Kalicube Tuesdays. He now has, but the sequence was: it took me months to get the first one, days to get the second one, one day to get the third one, five hours to get the next one, and then ten minutes to get the last one - ten minutes from when I posted to when it actually appeared in the Knowledge Graph API. That’s astonishing. Sorry - I get overexcited.
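The entity network described here (a series, its episodes, an organiser, sponsors, guests, and topics) could be expressed with schema.org's `EventSeries` and `Event` types. The Python sketch below is an assumption about shape only: the series and sponsor names come from the discussion, while the URLs, date, guest, and topic are placeholders, and the actual Kalicube Tuesdays markup is not shown in the panel.

```python
import json

BASE = "https://example.com/events/"  # hypothetical site


def episode(slug, name, start, guest, topic, series_slug):
    """One online episode, linked to its parent series via superEvent."""
    return {
        "@type": "Event",
        "@id": BASE + slug,
        "name": name,
        "startDate": start,
        "eventAttendanceMode": "https://schema.org/OnlineEventAttendanceMode",
        "location": {"@type": "VirtualLocation", "url": BASE + slug},
        "superEvent": {"@id": BASE + series_slug},
        "performer": {"@type": "Person", "name": guest},
        "about": {"@type": "Thing", "name": topic},
    }


def series(slug, name, organizer, sponsors, episodes):
    """The series node, pointing at each episode by @id."""
    return {
        "@type": "EventSeries",
        "@id": BASE + slug,
        "name": name,
        "organizer": {"@type": "Organization", "name": organizer},
        "sponsor": [{"@type": "Organization", "name": s} for s in sponsors],
        "subEvent": [{"@id": e["@id"]} for e in episodes],
    }


# Placeholder episode data (date, guest, and topic are made up).
episodes = [
    episode("episode-1", "Episode 1", "2020-08-18T18:00:00+02:00",
            "Guest Name", "Example Topic", "kalicube-tuesdays"),
]
graph = {
    "@context": "https://schema.org",
    "@graph": [
        series("kalicube-tuesdays", "Kalicube Tuesdays", "Kalicube",
               ["WordLift", "Semrush"], episodes),
        *episodes,
    ],
}
print(json.dumps(graph, indent=2))
```

Each new episode adds one `Event` node and one `subEvent` reference, so the series accumulates connections to people, topics, and sponsors over time.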
Martha van Berkel: Can I build on something you’re saying? It’s just another tidbit I find really interesting. Schema.org has released more versions this calendar year - I think we’re at version 8 or 9 - more than any previous year. Previous years we saw four or five releases. And actually COVID changed it: they were really increasing the number of properties and ways you could describe events in the Knowledge Graph. And the most recent release - I believe it’s 9.0 - includes quite a lot about collections of products. It’s not slowing down. What I think is really exciting is that during COVID, one of the things we’ve all had to learn is agility - and we’re seeing a lot of that coming from the vocabulary too. In preparation for more people shopping, organising different types of events, and so on.
Dixon Jones: Iman, do you want to jump in or just listen?
Iman Hamdan: It’s a very interesting talk. It’s really evaluating all kinds of schemas, but unfortunately I didn’t have that kind of exploratory space in my enterprise background - it’s really not easy to test these things. But it’s very helpful, what Martha and Jason are talking about. The hard thing is it’s not easy to replicate at scale in enterprise.
Jason Barnard: I really love - I don’t know if this is something we can actually rely on, but - the idea that whatever schema markup you use is always going to be positive, as long as it’s accurate, simply because it confirms and corroborates and gives Google confidence that it correctly understood what was in the page in the first place.
Coming back to the Ali Alvi example: they’re looking at the page, analysing it, summarising it, pulling out chunks, putting them together to create the snippets under the blue links - made out of multiple parts of text from different parts of the page. Any schema markup you can give them that corroborates and supports what they think they’ve understood algorithmically is always going to be very helpful. The meta description example I was talking about earlier is also very helpful - more corroboration for the machine that it’s correctly understood the page, which gives it confidence. And I like the idea: if the machine is confident, it will put it front and centre. If it isn’t, it will kind of push it down a bit because it isn’t really 100% sure.
Dixon Jones: Dwayne has a question: does performing entity analysis on content, and then annotating those in structured data with “about” or “mentions” linked to DBpedia URIs, help to disambiguate those terms? Rephrasing: does using “sameAs” with “about” and “mentions” linking to DBpedia URLs help Google understand what a page is about? I’ll go first. DBpedia - possibly, but I don’t think it’s as powerful as Wikipedia. You can use Wikidata, DBpedia, or Wikipedia, but DBpedia is quite a way behind in terms of data and trust. But if they’re getting it right, it should help to disambiguate.
Martha van Berkel: I’ll agree with you. Way back - like 2015, when we first started doing this - we often used Wikipedia or DBpedia definitions around local business: defining cities, defining regions, area served for local business is a really great one. Or if you’re using additional types - to enhance a local business type where there’s no exact match, like a marketing agency. What we like about DBpedia is that it’s somewhat language-agnostic in how it defines things, which is great if you’re looking for multilingual entity definitions. Where we’ve evolved our thinking is to also include Google’s Knowledge Graph entity ID now - because if it’s for Google and it’s their own entity definition, that makes it easier. “About” is a very strong connector - “this is about this topic.” “Mentions” is a softer relatedness. And we just kicked off some testing at scale to measure the impact of this - but we don’t have enough data yet to share.
Dixon Jones: Google’s NLP API is interesting here - what we call “search engine understanding” at InLinks is comparing the entities we see versus the entities Google says they see in their Knowledge Graph. What’s interesting is that Google is very good at brands, locations, and proper nouns, but it’s not so good at picking up concepts that humans understand as specific entities and ideas. That’s where Wikipedia, DBpedia, and “sameAs” with “about” and “mentions” schema can really help - you’re encouraging yourself to say “this page is about this thing.” The idea is that it’s about one entity - Joost de Valk says this well: Google likes it when a page is about one entity, although it’s a very simplistic view of the world. And then multiple mentions will help disambiguate the entities within the page and better clarify the topic.
Jason Barnard: I did an experiment with my cartoon characters - I had one page with all of them on it: five yellow koalas and four blue dogs. Google did okay, and I got them into the Knowledge Graph and I was getting my Knowledge Panels. It was all terribly fun. Then I split them into one page each - nine pages, one for each character - and then did all the relationships between them: this one is this one’s mother, this one’s father, wife, husband, child. And it actually exploded - we multiplied the confidence score by eight by splitting them down into their individual pages. So it’s a page about Daddy Koala, and it mentions Mummy Koala, who’s his significant other. “Daddy Koala is a fictional character, and his significant other is Mummy Koala.” I just can’t get enough of that one. It’s so silly. And you can get your Knowledge Panel with it.
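The "one page per character, then relate them" approach can be sketched roughly as follows. This is a minimal Python illustration, not Jason's actual markup: the example.com URLs and the "#character" identifier pattern are hypothetical, and the choice of schema.org Person with the "spouse" property is an assumption (schema.org defines Person as covering fictional people).

```python
import json

BASE = "https://example.com/characters/"  # hypothetical site, not the real one


def character_id(slug):
    """Stable identifier for a character, anchored to that character's own page."""
    return f"{BASE}{slug}#character"


def character_page(slug, name, spouse_slug=None):
    """Markup for one page that is about exactly one character."""
    person = {
        "@type": "Person",  # assumption: Person used for a fictional character
        "@id": character_id(slug),
        "name": name,
        "description": f"{name} is a fictional character.",
    }
    if spouse_slug is not None:
        # Relationship expressed as a reference to the other character's @id.
        person["spouse"] = {"@id": character_id(spouse_slug)}
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "url": f"{BASE}{slug}",
        "mainEntity": person,
    }


pages = {
    "daddy-koala": character_page("daddy-koala", "Daddy Koala", "mummy-koala"),
    "mummy-koala": character_page("mummy-koala", "Mummy Koala", "daddy-koala"),
}

for slug, markup in pages.items():
    print(slug)
    print(json.dumps(markup, indent=2))
```

Each page describes one entity in full and only points at the others, so the relationship is stated from both sides without duplicating either character's description.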
Dixon Jones: I have to worry about Jason’s sanity, sitting there doing all these relationships between cartoon characters.
Jason Barnard: To be honest with you, literally last night at three o’clock in the morning I woke up and I was worried about my sanity.
Martha van Berkel: What Jason’s building on relates to Dwayne’s follow-up question as well - defining the relationships between things is the key. Google recently announced that it wants you to nest things and use IDs in order to define where this thing is actually defined. And that plays to what you were just saying, Jason - when you connect these things, you’re also connecting their relationships, so we understand the rigorous connections as well.
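One way to picture "use IDs to say where a thing is defined" is a JSON-LD graph in which each entity is described once under an "@id" and referenced by that "@id" everywhere else. The Python sketch below is an assumption-laden illustration: the example.com identifiers are hypothetical, and the helper that checks for dangling references is an invented convenience, not part of any Google or Schema App tooling.

```python
def defined_ids(graph):
    """IDs of nodes that carry a real description, not just a bare reference."""
    return {node["@id"] for node in graph if "@id" in node and len(node) > 1}


def referenced_ids(value):
    """Yield @id values from reference-only objects ({"@id": ...}) nested in value."""
    if isinstance(value, dict):
        if set(value) == {"@id"}:
            yield value["@id"]
        else:
            for nested in value.values():
                yield from referenced_ids(nested)
    elif isinstance(value, list):
        for item in value:
            yield from referenced_ids(item)


def unresolved(doc):
    """References that point at an @id nobody in the graph defines."""
    graph = doc["@graph"]
    return sorted(set(referenced_ids(graph)) - defined_ids(graph))


ORG = "https://example.com/#organization"  # hypothetical identifiers
SITE = "https://example.com/#website"

doc = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": ORG, "name": "Example Ltd"},
        {"@type": "WebSite", "@id": SITE, "publisher": {"@id": ORG}},
        {
            "@type": "WebPage",
            "@id": "https://example.com/about",
            "isPartOf": {"@id": SITE},
            "about": {"@id": ORG},
        },
    ],
}

print(unresolved(doc))
```

Running a check like this before publishing would catch a reference to an entity that was renamed or moved, which is the kind of broken connection the panel warns about.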
Jason Barnard: And that also comes into every entity having its home - which is kind of what these characters come down to.
And my experience with Wikipedia - Dixon, you were talking about it a while ago. I had a super-duper Wikipedia page for myself. We said - in French: “je suis allé te voir” - I went to you, which was very unfair of me - we talked about it again on Darren Campbell’s show, and the Wikipedians wiped all my pages: they wiped off my personal page, the blue dog, the yellow koala, and my folk-punk group. I thought the entire Knowledge Graph presence would collapse.
In fact, the Knowledge Graph presence improved. I now have control over what’s being said about all those entities - except my own. And mine completely collapsed in a heap - because I moved the schema markup from my homepage to a page within the site. I moved my home, and Google just freaked out and dropped it, and created a new entity within 24 hours. What I find interesting is that it has persistent memory: once it’s got you in there, you will just keep popping back whether you want to or not. And what I find interesting is that - whatever the rights and wrongs of what happened with Wikipedia - it was me who messed it up in the end, because I moved my schema markup in an attempt to improve things.
Coming back to the beginning: find a home for your entity and stick to it.
Martha van Berkel: Which comes to the question of what schema matters - it’s the things that describe what’s important to you and your business. Depending on what kind of business or person you are, it’s still those things. I always come back to: it’s about business strategy. What’s actually important to your business? How do you make sure those things are being found and understood? And then, how do you see if you can get rich results for those as well?
Jason Barnard: I just realised the mistake I’m making - blue dogs and yellow koalas are not important to my business. But I’m spending lots of time on them - which brings us back to the sanity question.
Dixon Jones: Iman, are we cutting you out? Is there anything you want to dive in and say?
Iman Hamdan: So I’m listening to have more experience in this stuff. But just adding to the topics you’re discussing - having analysis by numbers would really help justify these things: case studies for each one. Showing them with numbers, because to show these cases to decision-makers in enterprises without numbers, they will not accept these projects. From my standpoint, showing cases like a Knowledge Graph entry or a specific schema result - it’s really not as important to them as showing revenue. Yeah.
Dixon Jones: Hopefully things like this will start to help companies realise that there is a direct correlation between paying attention to schema and revenue - and I guess that’s our job to build that case up.
So I want to leave with one question, if that’s okay. And that is: Google’s guidelines say - and I’m going to quote from the Google guidelines page on schema support - “You must follow these guidelines for your app to be eligible to appear as a rich result.” And then: “Warning: If your site violates one or more of these guidelines, Google may take manual action against it.” My question: why does Google feel the need to penalise people for bad schema? Iman, do you want to go first?
Iman Hamdan: I believe it’s the misuse - duplicative schemas, the same entity described with bad or conflicting information. I believe that gets penalised. I’ve tested and seen it: you’re misleading the search results and misleading people to the same content, which is the duplication issue. I’ve seen pages penalised and deranked for this.
Dixon Jones: And when we were at Majestic - when we had a score for every single domain and wanted to put that into a review schema - John Mueller told me that would be a very dangerous thing to do because it was automating reviews, which would be a bad idea. Any other reasons, Jason or Martha, why the guidelines are so hot on this?
Martha van Berkel: Go ahead, Jason - I know you’ve got something.
Jason Barnard: I was just going to say: people do it to get the rich results - the little product stars, the stock information, all that stuff. Everybody thinks “I’ll cheat on that and get away with it,” or they see competitors cheating and think “why don’t I?” And that becomes an increasingly big problem for Google. I actually had a client who got penalised for pretending that local businesses were products. It worked for them for a year. The penalty was actually just: they removed the rich snippets. And as soon as we sorted it out, it came back. So in fact the penalty wasn’t so much a penalty as having something taken away that they shouldn’t have had in the first place. I don’t know.
Martha van Berkel: We’re not seeing as many penalties - maybe because we build good schema. Previously we saw a lot of penalties around ratings and reviews - people trying to force a rating onto as many pages as possible, very spammy. I think the change we saw in reviews, especially for local business and around blogs and news articles, self-corrected that.
I think the other reason people do it - and this plays to what Iman has been talking about, especially in enterprise - is that it’s actually hard to make changes to content. Jason said something at the beginning of this podcast that is one of my favourite things: schema markup can help inform content strategy. People will sometimes be like: “I really want to get that rich result, I can’t change the content yet, so I’m just going to put the schema markup on there and hope.” When in fact, schema markup is a whole process - you can identify opportunities for rich results, but it actually means going back and revisiting your content, thinking about how it needs to change, making that change, and then capitalising on the rich result. That’s why Schema App exists - we help people through that whole process. But it is hard, to Iman’s earlier point.
Jason Barnard: Coming back to that - I started doing an entity-based content model with WordLift using their application around my podcast. The idea was: the podcast series is an entity, each podcast episode is an entity, I’m an entity, the guest is an entity, the topic is an entity. We join all that together with relationships and push the entire thing forward. The first thing that struck me was how very badly organised it was in my own head, and how much WordLift’s schema markup forced me to think it through. They were saying: how are we going to mark this up? How are we going to organise it? And that forced me to think it through, and it actually changed the course of the podcast - the way I was organising it, the way I was developing it - because it made me think in a much more structured and logical manner. And obviously I would say this, but I think it’s made it better. I think the podcast is better now than it was a year ago. So do listen to it. Sorry - that was my promotion.
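The entity model Jason lists (series, episode, host, guest, topic, joined by relationships) could be sketched along these lines. This is a minimal Python illustration and not WordLift's actual output: the series name, guest, topic, and example.com identifiers are hypothetical, and the mapping of host to "author" and guest to "contributor" is an assumed modelling choice among several that schema.org would allow.

```python
import json

SITE = "https://example.com"  # hypothetical site

host = {"@type": "Person", "@id": f"{SITE}/#host", "name": "Jason Barnard"}
guest = {"@type": "Person", "@id": f"{SITE}/guests/example-guest#person",
         "name": "Example Guest"}  # hypothetical guest
topic = {"@type": "Thing", "@id": f"{SITE}/topics/schema-markup#topic",
         "name": "Schema markup"}  # hypothetical topic entity

series = {
    "@type": "PodcastSeries",
    "@id": f"{SITE}/podcast#series",
    "name": "Example Podcast",  # hypothetical series name
    "author": {"@id": host["@id"]},
}

episode = {
    "@type": "PodcastEpisode",
    "@id": f"{SITE}/podcast/episode-1#episode",
    "name": "Episode 1",
    "partOfSeries": {"@id": series["@id"]},  # episode belongs to the series
    "author": {"@id": host["@id"]},          # assumption: host modelled as author
    "contributor": {"@id": guest["@id"]},    # assumption: guest as contributor
    "about": {"@id": topic["@id"]},          # the topic the episode covers
}

graph = {
    "@context": "https://schema.org",
    "@graph": [host, guest, topic, series, episode],
}

print(json.dumps(graph, indent=2))
```

Having to decide which relationship connects each pair of entities is the structuring exercise described above: the markup cannot be written until the content model is clear.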
Dixon Jones: Okay guys, I think we’re nearly out of time. David, are there any burning questions I haven’t asked?
David Bain: I think we’re fairly okay. There’s one from Greg: “I’ve used schema for local businesses. Occasionally a blog post will get lots of love from Google and spread nationwide - the byproduct has been a very high bounce rate. Do you have a way to mitigate this?”
Jason Barnard: I would say: Google got it wrong. Google is saying, “We feel this is valuable for the person in the context they find themselves,” and yeah - they’re getting it wrong. But presumably what will then happen is they will see that this answer - because it’s not local to the person who’s asking it - is not satisfying them, and will then remove it. That’s the theory.
From Greg’s point of view - if it’s a local business and people are coming into the site looking at content that isn’t relevant to their location, the bounce is logical. It isn’t necessarily a problem. He just needs to make sure the people who are in the right area find the content relevant to them, and not try to please everybody. The people who aren’t in that local area are not going to be well served by that business anyway - so you’re wasting your time trying to keep them.
Dixon Jones: Cool. David, what’s our next show?
David Bain: We’re going to be looking into AI and SEO and how those tie together - broadcasting live on the InLinks Facebook page, YouTube channel, and Twitter handle. As soon as that gets scheduled, it’ll be up there - hopefully you can subscribe and watch us live.
Dixon Jones: And if you guys want to find out where all those links are, go to the InLinks blog and type in “The Knowledge Panel Show” - we’ll put the recording up there, along with previous recordings. And if you want to be notified of future events, there’ll be a web form up there as well. If you think you’ve got something you’d like to be on a panel for, you can ask there too.
So I’m going to say thanks very much to everybody for coming along - that’s been an absolutely storming show. A brilliant panel as well. If people want to know more about you, maybe sign off by saying how they can find you. Martha, do you want to go first?
Martha van Berkel: Sure. You can come to our website: www.schemaapp.com. I’m also pretty active on Twitter at @MarthaVanBerkel, or connect with me on LinkedIn.
Iman Hamdan: Yeah - you can connect with me on LinkedIn as well, as Iman Hamdan. That’s the only social I have.
Dixon Jones: Which I apparently mispronounce every single time. Jason.
Jason Barnard: Thank you. You can find me on Twitter, you can find me on LinkedIn, and you can find me at Kalicube.pro. If anyone wants to submit their brand to be tracked by Kalicube, we track them for free - we’ve got 75,000 brands that we’re tracking. Our aim is to figure out how Brand SERPs work, how to get into the Knowledge Graph. And I’d just like to say to end: if the future is entity-based search, being recognised as an entity and being in the Knowledge Graph is the deal - and start now.
Dixon Jones: Very good. If anybody wants a demo of InLinks, go to d9.ms/30 - that will be my Zoom calendar and you’re welcome to hook up with me. And it just leaves me to say thank you to David - my production team, otherwise I’d have made a mess of it. Thanks Martha, thank you Iman, and thank you Jason. Hopefully see you all next month. Cheers.
End of transcript.
Panels documented: 11
Running concept list staked in this transcript: Entity Home (named principle, August 2020) · Machine confidence as visibility gate · Corroboration Threshold · Events as KG entry path (Kalicube Tuesdays) · Entity isolation ×8 confidence score experiment · Ali Alvi / Bing QA featured snippet algorithm citation (dated August 2020) · Entity-based content model with WordLift · Entity-based future prediction (“start now”)