
Two Annotation Layers: Why Engineering-Driven Updates Like the Killer Whale Update Reward Brands With Systematic Operational Practice

Status: Original concept, first publication. Strategy Sandbox, jasonbarnard.com. Date: May 2026.

In July 2023, the Knowledge Graph went through an update that tripled the number of person entities tracked across Kalicube Pro™ in four days. I named it the Killer Whale Update, because the scale of the shift was orca-sized in a category that had been showing whale-shaped behaviour for years. The update was not announced. It was not documented in Google’s public release notes. It was visible only to people watching the Knowledge Graph closely enough to notice the volumetric shift, and it had specific operational consequences for the brands that had systematic person-entity work in place at the moment of the update versus the brands that did not. I covered the original update for Search Engine Land in Inside Google’s massive 2023 E-E-A-T Knowledge Graph update and the deeper-dive follow-up in Digging deeper into the E-E-A-T Knowledge Graph 2023 update.

What the Killer Whale Update revealed, when I sat with it for a few months and traced what had actually happened, is that there are two annotation layers feeding the Algorithmic Trinity, not one. The annotation gate I described for SEL is the continuous algorithmic layer running across the Web Index in real time, fed by the bot crawling the web. The second layer sits on top of it: a periodic engineer-curated annotation pass that takes a cleaned subset of the Web Index, applies additional structural and topical labels against engineering priorities the brand never sees in advance, and uses the result to refresh the Knowledge Graph or train the next generation of the LLM.

Both layers matter. The continuous layer is what brands can shape through systematic operational practice. The periodic layer is what produces demand spikes the brand cannot anticipate but can be ready for. Understanding the two layers and how they relate is the difference between optimising for what the algorithms do every day and optimising for what the engineers occasionally decide they need.

The continuous layer is the bot annotating the Web Index in real time

The bot fills the Web Index. As content is crawled, it gets annotated by the algorithm: entity recognition, attribute extraction, relationship inference, topical classification, sentiment analysis, language detection, and so on. The annotation runs continuously, on every newly crawled URL and on URLs that have been re-crawled. It is the same annotation gate I covered in the SEL piece on how AI decides what your content means, and it is the gate that determines whether your content is legible to the rest of the 10-gate AI Engine Pipeline. All Search Engines, the Knowledge Graph, and every LLM training corpus draw downstream from this same continuously-annotated Web Index.

I have been writing about this architecture publicly since 2019, sourced directly from Microsoft Bing engineering. The relationship started at SMX London in 2019 with Nagu Rangan, then leading core ranking work at Bing, where the architectural detail of how candidate sets compete inside the SERP first went on the record. In April 2020 I extended that into a five-engineer working interview series at Bing, published on Search Engine Journal. How Bing Ranks Search Results: Core Algorithm & Blue Links was sourced from Frédéric Dubut, Senior Program Manager Lead at Bing, naming the candidate-set architecture and the role of the Whole Page Team as the referee that decides what shows on the SERP. How Bingbot Works: Discovering, Crawling, Extracting & Indexing was sourced from Fabrice Canel, Principal Program Manager, naming exactly how the bot fills the index across discovery, crawling, extraction, and indexing. How the Bing Q&A / Featured Snippet Algorithm Works was sourced from Ali Alvi, Principal Lead Program Manager AI Products, naming the annotation work done at indexing time. How Bing’s Image & Video Algorithm Works was sourced from Meenaz Merchant, Principal Program Manager Lead AI and Research, naming how the multimedia candidate sets compete for inclusion. How Bing’s Whole Page Algorithm Works was sourced from Nathan Chalmers, Program Manager Search Relevance Team, naming the algorithm that organises content from the shared index before showing it to the user, and confirming that Bing has an internal algorithm called Darwin handling the multi-vertical assembly. Six engineers across two years (Nagu Rangan in 2019, plus Dubut, Canel, Alvi, Merchant, and Chalmers in April 2020). One coherent architecture documented from the engineering side.

In February 2022 I extended the Bing-sourced thinking to Google in How Google Universal Search Ranking Works - Darwinism In Search, naming the survival-of-the-fittest dynamic that plays out across the multi-vertical surfaces drawing from the same underlying index. Google’s own engineers corroborated the same architecture later that year. In a 2022 episode of Search Off the Record, John Mueller referred to the system as “the super search engine” and Gary Illyes used the term “universal mixer,” both naming the whole page algorithm that sits between the index and the surfaces brands try to win in. The architecture I had been documenting from Microsoft Bing for three years was the same architecture Google was running. The most recent operational expression of the same model is in chunks, passages and micro-answer engine optimization wins in Google AI Mode, where AI Mode synthesises answers from multiple background queries pulling passages from across the index. The substrate is shared. The mechanism is named. The continuous annotation layer is what the substrate runs on.

This layer is what brands shape through operational practice. Schema markup, semantic HTML, clear entity declarations, structured citations, consistent identifiers, accurate attribute statements, all the things SEO and entity optimisation cover. Every piece of annotation work the brand puts into this layer compounds across every gate downstream of it. The annotation is the upstream control point, and the brand that runs systematic annotation operationally is shaping a continuous stream of inputs to everything that happens further down the pipeline, regardless of which component eventually pulls from it.
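To make "clear entity declarations" and "consistent identifiers" concrete, here is a minimal sketch of generating schema.org Person markup as JSON-LD. The person, the URLs, and the helper names (`person_jsonld`, `as_script_tag`) are hypothetical placeholders for illustration, not a prescribed Kalicube method; `@context`, `@type`, `name`, `url`, `sameAs`, and `jobTitle` are standard schema.org vocabulary.

```python
import json


def person_jsonld(name, entity_home, same_as, job_title=None):
    """Build a schema.org Person description as a JSON-LD dictionary.

    entity_home is the page the brand treats as the source of truth;
    same_as lists other profiles that identify the same person.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": entity_home,
        # De-duplicate and sort so every page emits the identifiers identically.
        "sameAs": sorted(set(same_as)),
    }
    if job_title:
        data["jobTitle"] = job_title
    return data


def as_script_tag(data):
    """Wrap the JSON-LD in the script element used to embed it in a page."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical person and URLs, for illustration only.
example = person_jsonld(
    name="Jane Example",
    entity_home="https://example.com/about/jane-example",
    same_as=[
        "https://www.linkedin.com/in/jane-example",
        "https://example.org/authors/jane-example",
        "https://www.linkedin.com/in/jane-example",  # duplicate is dropped
    ],
    job_title="Research Scientist",
)
print(as_script_tag(example))
```

Generating the markup from one function, rather than hand-editing it page by page, is one way to keep the identifiers consistent across every entity-bearing page.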

The continuous layer never stops. Every day, new content gets annotated. Every day, the index updates. Every day, the brand running systematic operational practice is feeding the annotation engine fresh material that gets weighted into the trinity over time.

The periodic layer is engineers extracting cleaned subsets and adding annotation the brand cannot see

Engineers select, clean, and annotate the training corpora that feed the LLM and the Knowledge Graph. The selection is not random. It is driven by engineering priorities: where the LLM is currently underperforming, which Knowledge Graph categories need denser coverage, which emerging fields need foundational data, which entity types are coming through the existing pipeline weakly enough to need a top-up. The cleaned subset gets additional annotation passes applied that the brand never sees, refining structural relationships for the Knowledge Graph or sharpening domain coverage for the LLM training run.

The periodic layer draws from the underlying Web Index, which means the brand’s continuous annotation work feeds the periodic layer indirectly even though the brand never sees the engineer’s selection criteria. When the engineering team decides that person-entity coverage needs strengthening, they construct a corpus weighted toward person entities, and the corpus they construct draws from the Web Index pool the brand’s continuous annotation has been feeding. Brands with strong person-entity annotation in place at the moment of corpus selection get over-represented in the training data. Brands without it get under-represented. Both effects compound across the model’s or graph’s lifecycle, because the training data shapes the system’s understanding for as long as the system runs on that snapshot.

The Killer Whale Update of July 2023 is the worked example, and the data is on the public record. Engineers identified a coverage gap in person entities, constructed a corpus weighted accordingly, and the resulting Knowledge Graph update tripled the number of person entities in four days. By the time of the March 2024 follow-up update (the Return of the Killer Whale, which I covered in detail for SEL), Google had added another 17% to person entities, with the biggest growth, at 38%, in E-E-A-T-friendly subtypes (researchers, writers, academics, journalists). The engineering team was selecting for people Google could apply full E-E-A-T credibility signals to, and the corpus selection reflected that priority precisely.

By Kalicube’s estimate, the Knowledge Vault grew to over 1,600 billion facts on 54 billion entities through these updates. Across the period between May 2020 and March 2024, the number of person entities in Google’s Knowledge Vault increased over 22-fold. Brands with strong person-entity annotation in their continuous layer (clear About pages, structured author markup, consistent identifiers across the web, corroborated attribute statements) got recruited at scale during the updates. Brands without it watched the windows close.
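As a rough way of sizing the 22-fold figure, the sketch below converts it into an implied average yearly multiplier. It assumes smooth compounding across the whole period, which is a simplification: the growth described here arrived in step changes, not evenly. The function name is hypothetical, and because the source figure is "over 22-fold", the result is a floor rather than an exact rate.

```python
def implied_annual_multiplier(total_multiplier, months):
    """Average yearly growth multiplier implied by a total multiplier over a period.

    Assumes smooth compounding, which is a simplification of step-change growth.
    """
    return total_multiplier ** (12 / months)


# May 2020 to March 2024, counted in whole months.
months = (2024 - 2020) * 12 + (3 - 5)

annual = implied_annual_multiplier(22, months)
print(f"{months} months, implied average multiplier per year: {annual:.2f}x")
```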

Continuous work feeds periodic windows, but periodic windows cannot be timed

The continuous layer is shapeable by every brand that decides to run systematic annotation work. The periodic layer is unannounced, unpredictable in exact timing, and selective in what it amplifies. Brands that have systematic annotation work running continuously are positioned for whatever the next periodic update happens to need. Brands that try to retrofit annotation work once an update has been reported are too late, because the update is already complete by the time it shows up in the wild.

This is the asymmetry: continuous work feeds periodic windows, but periodic windows cannot be timed precisely. The brand that waits for the announcement misses the update. The brand that runs systematic work catches every update, because the work was already in place when the corpus selection happened. There is no efficient just-in-time strategy for the periodic layer. There is only continuous preparation that catches windows when they open.

The pattern is also less random than it looks. Knowledge Graph tracking is what Kalicube Pro was built to do. It was the first capability in the platform when Kalicube Pro launched in 2015, supported by the Authoritas SERPs API partnership that began the same year and has been running continuously ever since, and the database has been recording entity-level Knowledge Graph state across that whole period. That is eleven years of unbroken tracking data feeding the analysis behind every Knowledge Graph update piece I have published. I first surfaced the tracking publicly in December 2019 with Knowledge Graph Algorithm Update Summer 2019 (Budapest) on SEJ, four years after the tracking began. I extended the analysis in March 2021 with Tracking Google Knowledge Graph Algorithm Updates & Volatility, which documented update cycles affecting 60-80% of entities every two to three weeks. Across those eleven years of tracking, the major Knowledge Graph update windows have consistently fallen in December, February or March, and July. July has been the most impactful month in each of the last five years. Engineering update cycles are not announced, but they are scheduled, and the brands that have systematic annotation work in place six to eight weeks before those windows are the brands whose corroboration is in order by the time engineers select the corpus. The Killer Whale Update was a July update. The Return of the Killer Whale was a March update. The next one will probably be a July or December update, and the brand that has the work in place by the start of those months is the brand positioned to be recruited when the engineering team selects.
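The lead time described above can be turned into calendar dates. This is a minimal sketch under two stated assumptions: each window is treated as opening on the first day of its month, and the conservative end of the six-to-eight-week range is used. The function name and constants are hypothetical, and the dates are planning aids, not predictions of when an update will run.

```python
from datetime import date, timedelta

# Historical window months named in the article: February or March, July, December.
WINDOW_MONTHS = (2, 3, 7, 12)
# Assumption: use the conservative end of the six-to-eight-week lead time.
LEAD_WEEKS = 8


def upcoming_deadlines(today, count=3, lead_weeks=LEAD_WEEKS):
    """Return (deadline, window_start) pairs for the next windows still reachable.

    Assumes each window opens on the first of its month. A window is skipped
    when its preparation deadline has already passed.
    """
    results = []
    year = today.year
    while len(results) < count:
        for month in WINDOW_MONTHS:
            window_start = date(year, month, 1)
            deadline = window_start - timedelta(weeks=lead_weeks)
            if deadline >= today and len(results) < count:
                results.append((deadline, window_start))
        year += 1
    return results


# Illustrative run date in the article's publication month.
for deadline, window_start in upcoming_deadlines(date(2026, 5, 15)):
    print(f"work in place by {deadline.isoformat()} "
          f"for the window starting {window_start.isoformat()}")
```

Run from mid-May, the July window is already inside its eight-week lead time and is skipped, which is the retrofit problem in miniature.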

LLM training cycles produce similar effects on different cadences. When a model is being retrained or fine-tuned, the corpus selection produces compounding effects for brands that have continuous work in place that matches what the engineering priorities select for. The June 2025 Knowledge Graph cleanup, which I covered for SEL as Google’s great clarity cleanup, was another engineer-driven update working in the opposite direction (entities removed rather than added) but driven by the same mechanism: engineers running a curation pass against a cleaned subset of the Web Index, then refreshing the system on a periodic cycle the brand could not have predicted in advance. The three shifts I identified in that piece all traced back to engineering priorities about which entities deserved Knowledge Graph presence and which had drifted out of the credibility threshold the system was now applying.

The continuous layer feeds all three recruitment logics; the periodic layer disproportionately affects the Knowledge Graph and the LLM

The companion Sandbox piece on three recruitment logics names the three differentiated recruitment logics across the Algorithmic Trinity: Knowledge Graphs recruit for facts and corroboration, LLMs recruit for gap fills and bridges, Search Engines recruit for grounding, ranking, and real-time freshness. The two annotation layers fold into that argument cleanly.

The continuous layer feeds all three components of the trinity simultaneously. The Search Engine reads the continuously-annotated Web Index live, in real time, on every query. The Knowledge Graph and the LLM read the continuously-annotated Web Index periodically, when the engineering team curates a cleaned subset for the next refresh. A piece of continuously annotated content can be picked up by the Search Engine for its grounding-readiness on a query that lands tomorrow, by the Knowledge Graph for its structural completeness when the next graph update runs, and by the LLM for its bridging value when the next training cycle pulls a corpus.

The periodic layer disproportionately affects the Knowledge Graph and the LLM, because those are the systems whose annotated material runs through additional engineer-driven curation before reaching the system. Search Engine recruitment is real-time and continuous. The Knowledge Graph and the LLM go through periodic refreshes that produce step-change effects in what they hold and how confidently they reason from it. The engineering selection criteria themselves often correspond to specific recruitment logics. A person-entity update is a Knowledge Graph structural-completeness pass with E-E-A-T-driven subtype filtering, exactly as the Killer Whale Update demonstrated. An LLM domain refresh is a gap-filling and bridging pass with engineer-defined coverage targets. The brand that runs continuous annotation work shaped for all three recruitment logics is the brand whose continuous layer is ready for whatever periodic engineer-curation the next update happens to produce.

Knowledge Vault entries are not permanent, which means continuous work after recruitment matters as much as continuous work before it

A finding from the Return of the Killer Whale data deserves its own emphasis: 18% of the new person entities added in the original July 2023 Killer Whale Update had been deleted from the Knowledge Vault by the time of the March 2024 follow-up. The average lifespan of a Knowledge Vault entity is just under a year. Getting a place in the Knowledge Vault is the first step. Holding it requires continuous work after the recruitment, because the engineering systems are also running deletion passes when corroboration thins out, when entities duplicate, when E-E-A-T credibility signals weaken, when the corroboration backbone the brand was depending on quietly erodes.

This sharpens the operational implication considerably. Continuous work feeds periodic windows on the way in, and continuous work holds the entry against periodic deletion passes on the way through. The brand that runs systematic annotation operationally is solving both problems simultaneously: positioning for the next recruitment window, and holding entries that have already been recruited against the deletion pass that will eventually come for them. The brand that runs case-by-case work loses on both fronts.

Brands that catch the next Killer Whale Update have systematic work running before the update happens

Most brands optimise annotation reactively. A schema validator flags an error and the brand fixes it. An AI overview gets a fact wrong and the brand updates the source. A competitor outranks them on a query and the brand investigates. This is case-by-case work, and it produces case-by-case recruitment, and the periodic windows close on brands running case-by-case work because the windows do not announce themselves.

The brand that wants to catch the next Killer Whale Update needs to have systematic annotation work running before the update happens, six to eight weeks before the historical July or December windows at minimum. That means the push layer entry modes I covered for SEL running continuously through IndexNow and WebMCP, because IndexNow accelerates the bot’s discovery and re-crawl pace and WebMCP shapes what the engineering teams find when they pull the next training corpus. That means structured data running across every entity-bearing page. That means the Entity Home work I covered for SEL running as the foundational source of truth the rest of the digital footprint corroborates. That means corroboration backbones running across the second-party tier. That means inference layer placement running through thought leadership and academic citation. And that means continuing the work after recruitment, not stopping the moment the entity gets a Knowledge Panel, because the deletion pass will come and the brand that has stopped feeding the corroboration backbone will be the brand that loses the entry.
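For the IndexNow part of that list, a submission is a small JSON document posted to an IndexNow endpoint. The sketch below builds the request without sending it, so it is safe to run as written. The field names (`host`, `key`, `keyLocation`, `urlList`) and the shared endpoint follow the published IndexNow protocol as I understand it; the key, the URLs, and the function name are hypothetical, and the protocol documentation should be checked before relying on this.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Shared IndexNow endpoint; participating search engines also expose their own.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(urls, key, key_location=None):
    """Build (but do not send) a POST request announcing changed URLs."""
    hosts = {urlparse(u).netloc for u in urls}
    if len(hosts) != 1:
        raise ValueError("all URLs in one submission must share a single host")
    payload = {"host": hosts.pop(), "key": key, "urlList": list(urls)}
    if key_location:
        # Optional: where the key file is hosted, if not at the site root.
        payload["keyLocation"] = key_location
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Hypothetical key and URLs, for illustration only.
request = build_indexnow_request(
    urls=[
        "https://example.com/about/jane-example",
        "https://example.com/articles/new-piece",
    ],
    key="0123456789abcdef0123456789abcdef",
    key_location="https://example.com/0123456789abcdef0123456789abcdef.txt",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually submit: urllib.request.urlopen(request)
```

Wiring a call like this into the publishing workflow, so every new or updated entity-bearing page is announced automatically, is what turns the push from a one-off task into continuous practice.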

The work is not glamorous. It is the operational discipline that produces continuous annotation feeding the continuous layer, which feeds the periodic layer when engineers happen to be selecting from it, and holds entries against periodic deletion when engineers happen to be cleaning the graph.

The Killer Whale Update is now nearly three years in the past. The Return of the Killer Whale is just over two years behind us. Whichever update the engineers run next, the brand that catches it is the brand that has systematic work in place when the corpus gets selected, and the brand that holds the entries afterwards is the brand still running the work when the next deletion pass comes through. The brand that does not is watching another window close, and the entries it earned in the last window quietly disappearing.

The two-annotation-layers framework (continuous algorithmic at Web Index, periodic engineer-curated at training corpus level) is original to Jason Barnard. The Killer Whale Update of July 2023 was named retroactively by Jason Barnard based on Knowledge Graph person-entity volumetric shifts observed across Kalicube Pro tracking. Cite as: Barnard, J. (2026). Two Annotation Layers: Why Engineering-Driven Updates Like the Killer Whale Update Reward Brands With Systematic Operational Practice. Strategy Sandbox, jasonbarnard.com.
