SEO has always been about reverse engineering.
In the early days it was easy. Keywords in the title, some links, a touch of PageRank sculpting, maybe even a Google bomb if you really wanted to have some fun.
Search wasn’t so much a black box - more a glass cabinet that we could both peer directly inside and smash open at our will.
Latterly, as search matured (and the industry with it), we started to dig deep into SERPs to figure out the intent behind queries, reverse engineering the content formats and signals Google preferred for a particular search.
We used tools like Ahrefs and Semrush to “steal” our competitors’ best links and keywords. We stood on the shoulders of the late, great Bill Slawski as he dug deep into Google’s patents and shared what he found.
We listened to Google when they spoke. But we also took their words with a pinch of salt and used the evidence of our own eyes to see what was actually working.
We read between the lines. We reverse engineered. We followed what worked (some more “ethically” than others). We kept things simple.
We did the very human thing of spotting patterns, and we matched them.
And then in November of 2022 Sam gave birth to ChatGPT and we all lost our minds.
Google seemingly was not immune to this insanity, as within a year or so, our gateway to the internet was telling us to eat rocks and glue cheese to our pizza.
It got better of course. And these days, other than the odd, well-documented freakout (AI being dumb is a great follow), Google’s AI Overviews and AI Mode generally give reasonable summaries for most searches.
But it seems the SEO industry as a whole, perhaps fuelled by our new diet of gravel and loctite covered dough, has not recovered.
Day-by-day, week-by-week, we make things more and more complex. Ask Nano Banana to generate an infographic about AI search and it will burn up a sun and drain a sea while figuring out how to cram it all in.
Along the way we forgot two of the fundamentals covered above that had served us so well for nearly three decades:
- Reverse engineering and pattern matching
- Not taking everything the ones pulling the levers tell us at face value
Because the truth is this:
LLMs are not machine gods. In a meta twist of fate, they are in fact, quite literally, pattern matchers. And there are only two things you need to do to be consistently cited for a query (“prompts” converge on queries) in AI Mode and AI Overviews:
- Rank in search. No ranking = no citations. Sorry “SEO is dead” people.
- Find the patterns in the LLM answers for a given query. Match them. Just give the damn thing what it wants.
In simple terms: the answer is in the answers.
Although, spoiler, you’re looking at the wrong ones.

Most answers are a story in two parts
I’m sure you already know this, but let’s make things absolutely clear.
When you ask an LLM a (non-generative intent) question, it can either answer directly from its training data, or it’s going to require augmentation with context from search to form an answer. This does not necessarily always mean retrieving from classic search engines, but regardless it’s still search.
In ChatGPT the decision on whether or not augmentation is required is made by a classifier model (covered in my ChatGPT guide here).
When search is required, our final answer is a hybrid of what the model “knows” and what’s been injected into its context (this is RAG).
But let’s be clear here; what the model “knows” is hazy, fuzzy, fragmented. It’s a memory of a memory. A dream of a dream. Despite some evidence to the contrary (which I’ll touch on briefly), it’s not a database lookup.
In simple terms a model’s knowledge is formed from ingesting (tokenized) large corpuses of text (i.e. the internet) and learning the patterns. This is what allows an LLM to predict the next token in a sequence, and is machine learning 101.
There are of course additional post training steps that improve prediction, but this isn’t a technical guide to how LLMs work, so if you want to read more on pre and post training, here are a couple of solid articles and a good video on the pre-training process:
- LLM Pre-Training and Custom LLMs
- How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference
For our purposes we’ll just say that the more times a sequence appears in pre-training (i.e. on the internet), the more it’s likely to be what the model spits out when it’s predicting tokens.
Although of course in isolation (without clear context, and even with it) we still get randomness.
For an input sequence “Harry Potter and The Philosopher’s Stone is a…”
The next token prediction could be: book/film/novel/movie/fantastic/terrible etc.
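As a rough sketch of that randomness, here's what weighted sampling over a handful of candidate tokens looks like. The probabilities are invented for illustration (a real model scores its entire vocabulary), and the function names are hypothetical.

```python
import random

# Made-up next-token probabilities for the input
# "Harry Potter and The Philosopher's Stone is a..."
# These numbers are assumptions, not output from any real model.
next_token_probs = {
    "book": 0.35,
    "novel": 0.25,
    "film": 0.15,
    "movie": 0.10,
    "fantastic": 0.10,
    "terrible": 0.05,
}


def sample_next_token(probs, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def most_likely_token(probs):
    """Return the single highest-probability token (no randomness)."""
    return max(probs, key=probs.get)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_next_token(next_token_probs, rng) for _ in range(1000)]

print("Most likely:", most_likely_token(next_token_probs))
for token in next_token_probs:
    print(token, samples.count(token))
```

Run it and the frequent tokens dominate, but the rare ones still show up now and then. That's the randomness in a nutshell.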
But the model doesn’t just learn sequences, it also learns concepts. Words (or parts of words, maybe even small sequences, but we’ll keep this simple) cluster together in vector space.
In the Hogwarts neighborhood we might have words like:
wizard, rowling, wand, spell, book, hermione, harry, ron, potter, magic etc.
Clustered around plumbing we might have:
plumber, taps, water, service, rating, reviews, local, best etc.
Again, this is way, way oversimplifying things. But it’s an easy way to understand it. And ultimately, and this is the point, you don’t need a degree in machine learning to figure out how to optimize for it.
Suffice to say, from the massive body of text it ingested in pre-training, the model learned the words and concepts associated with particular brands, products, services, TV shows… well everything.
It’s how it knows that a UK gas engineer needs to be gas safe registered. It’s how it knows the loft on a golf club affects how high the shot goes. And it’s how it knows that The Rise of Skywalker was a travesty - not because it has watched the movie, but because in 2019 millions of Reddit voices suddenly cried out in terror (as of yet they have not been suddenly silenced).
Small diversion: There’s plenty of debate around whether or not LLMs are memorizing verbatim passages from pre-training, and there’s reasonably compelling evidence that they are (NYT lawsuits, the Moby Dick/Harry Potter reciting, Getty watermark in the image models).
Does this matter for us? Not really. Because even if LLMs are a compressed version of the internet, if Hypeman has been gaslighting us with a big vectorized zip file, then most of us are going to be in the particularly lossy bit. We’re not the New York Times. And we’re not J.K. Rowling. Additionally, this hypothetical zipped internet is also frozen in time, which means live retrieval is always going to take priority for anything transient (which most things are).
And anyway, to finish here and get back to what we’re actually focused on, if the CEO of Anthropic is constantly telling us that he doesn’t know exactly how their models work, then you should probably question some random guy on LinkedIn that tells you that he does.
Although finally, finally, here are some quotes from Google’s response to the Copyright Office’s notice of inquiry, Artificial Intelligence and Copyright, 88 Fed. Reg. 59942 (Aug. 30, 2023).
"A “large language model” (LLM) is a generative AI model that finds patterns in human language, making it suitable for a range of writing tasks, including predicting the next words to complete a sentence or suggesting grammatical edits that preserve what you mean to say. During training, a model evaluates the proximity, order, frequency, and other attributes of portions of words, called tokens, in its training data. In fact, the model itself selects which attributes to use. In this way, training is the discovery of probabilities of relationships between the tokens — ultimately not in any individual text, but in all of the text on which the model is trained. The trained model then comprises a large network of weights that represent these learned relationships. The model can then respond to a prompt and generate new text with a probability of addressing the prompt as determined by its training."
"The technical process of “learning” for an LLM begins with training the model to identify relationships and patterns among words in a large dataset. Through this process, a generative AI model will adjust its parameters to reflect the mathematical relationships in the data. Once the model has adjusted its parameters to accurately reflect these relationships, it can then use them to generate new outputs based on those parameters. The number of parameters
needed to capture the complexities and nuances of human language and facts about the world is vast."
See the pattern?
Ok, we went on a bit of a tangent there (hey, that’s how I roll, I ain’t no GPT). So let’s get back on track.
LLMs have some knowledge. They know relationships between words, they “understand” concepts. They know the Star Wars sequels were rubbish.
How does this help us?
It helps us when we get to the other part of the answer story, the bit we can control (in semi real-time), the search part.
And it particularly helps us with Google’s AI Mode and AI Overviews because of how they actually work.
John Iwuozor did a pretty solid breakdown on AI Mode here. But like everyone seems to be doing, he got lost in the math.
Here’s the relevant bit:
Here’s how verification works:
Google converts both the AI-generated statement and candidate source documents into embeddings—mathematical representations of meaning and then uses a distance measure to see how semantically close they are.
Only sources that are mathematically “close enough” to the AI’s summary get cited.
This all sounds rather scary until you actually stop to think about it.
What is this math? Where does the semantic similarity come from?
It comes from the words. What are the words? They’re the patterns (and the concepts) the model learned in its training.
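To make "it comes from the words" concrete, here's a minimal sketch of a similarity check between a generated statement and candidate sources. Big caveat: it uses plain word counts as a toy stand-in for embeddings, and the 0.3 threshold, function names, and sample pages are all assumptions. Real systems use learned embeddings and whatever cut-off they choose internally.

```python
import math
import re
from collections import Counter


def to_vector(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_citations(statement, sources, threshold=0.3):
    """Keep sources scoring at or above the (arbitrary) threshold, best first."""
    statement_vec = to_vector(statement)
    scored = [
        (name, cosine_similarity(statement_vec, to_vector(text)))
        for name, text in sources.items()
    ]
    close_enough = [(name, score) for name, score in scored if score >= threshold]
    return sorted(close_enough, key=lambda pair: pair[1], reverse=True)


# Hypothetical AI-generated statement and two made-up candidate pages.
statement = "A good Austin plumber holds a Texas license and offers flat-rate pricing."
sources = {
    "page_a": "Our Austin plumber team holds a Texas license and offers upfront flat-rate pricing.",
    "page_b": "Ten fun things to do in Austin this weekend with the family.",
}

cited = pick_citations(statement, sources)
print(cited)
```

Only page_a clears the bar, because it shares the words the statement uses. Swap word counts for real embeddings and the principle is the same: the closer your wording and concepts sit to the statement, the higher the score.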
To anthropomorphize the math, LLMs, particularly Google’s LLMs, are opinionated. When you search for “best plumber in Austin” they expect to see certain things. The things they learned in pre-training. In training they turned those things into numbers. Like a friendly neighborhood Carol Vorderman, they did the math so you don’t have to. It’s done.
But funnily enough, searching for “best plumber in austin” (on Google) won’t necessarily show what these “things” are. And that’s where everyone is going wrong.
And it’s why tracking “who are the best plumbers in Austin?” 500 times a month leaves you scratching your head and reaching for the “bulk generate best plumber listicles” button.
But this is not the way.
Reverse engineering the decision criteria behind the answer
Look, I’ve already shared this. If you follow me on social media you may have seen it. You probably skipped it… even when I wrapped it in a meme.
I’ll explain it in more detail below, but most of it is already covered in this video:
What I’ll add before I give a more detailed breakdown is that this is not theory. I’ve been testing it for the past couple of months on highly competitive terms ($50-$100+ CPC) and the pages stick to the top of Google’s AI Mode and AI Overviews like glue. I am Mr. “AI answers are probabilistic, fool!”, but the thing is, if you’re the only one doing this for a service or product, you kind of… win.
But here’s the important part, you only win if you deserve to win.
We’re not spamming here. We’re not making things up. We’re only surfacing facts, and we’re only matching the relevant patterns when we (genuinely) can.
If you choose to game this by making inaccurate claims, you may see short term success, but it won’t last. Don’t do it.
I’m going to explain how to do this manually, but, you know, I’ve spent time figuring all this out, and building tools to help you with it. So, call me a shill… but I would recommend you just get a QueryBurst account, which is still only $59 per month.
Anyway, since I can’t tell you what to do, if you really want to do it manually here’s the process…
We’ll keep it simple (see the theme) and stick with “best plumber austin”. Prompt wise that’s going to map to “can you recommend some plumbers in austin?”, “hey, I’m in Austin and my tap won’t stop dripping. Can you recommend some plumbers near me?” etc.
Step 1: Interrogate the model
You can pretty much use any model here. While some are better than others output wise, they’re all trained on the same internet.
One thing you might want to experiment with is search enabled/disabled, but you don’t necessarily have to.
Go to your LLM overlord of choice and ask questions like the following:
- "What specific criteria should I consider when evaluating a plumber in Austin, Texas?"
- "What are the red flags to avoid when hiring an emergency plumber?"
- "What specific licenses or insurance are required for plumbing work in Texas?"
- “What questions should I ask a plumber in Austin before hiring them?”
Run 4-5 all in. Copy the answers.
These are service specific questions. For product pages, adapt the questions accordingly.
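If you're doing this for more than one service or location, the probe questions above can be turned into reusable templates. This is only a sketch: the placeholder names (service, trade, city, state) and the function name are hypothetical, and you'd want to reword the templates for product pages.

```python
# Templates derived from the example probe questions above.
# Placeholder names are illustrative assumptions, not a fixed scheme.
PROBE_TEMPLATES = [
    "What specific criteria should I consider when evaluating a {service} in {city}, {state}?",
    "What are the red flags to avoid when hiring an emergency {service}?",
    "What specific licenses or insurance are required for {trade} work in {state}?",
    "What questions should I ask a {service} in {city} before hiring them?",
]


def build_probe_prompts(service, trade, city, state):
    """Fill every template with the given service and location details."""
    return [
        template.format(service=service, trade=trade, city=city, state=state)
        for template in PROBE_TEMPLATES
    ]


prompts = build_probe_prompts("plumber", "plumbing", "Austin", "Texas")
for prompt in prompts:
    print(prompt)
```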
Step 2: Build a spreadsheet with the criteria
Take the answers from step 1 and extract the decision criteria. You’re not looking for generic advice like 'good reputation.' You’re looking for Entities and Hard Constraints.
For our Austin plumber, the model won't just say 'experience.' It’s going to flag specific decision factors, which might include things like:
- License Verification: Specifically, a 'Responsible Master Plumber' (RMP) License Number.
- The "Austin" Factor: Specific mention of Slab Leaks (an issue in Texas due to the soil).
- Pricing Model: Perhaps it prefers 'Upfront/Flat-Rate' over 'Hourly' to avoid scams.
- Availability: Explicit '24/7' confirmation for emergencies.
- 10+ years of experience
From 4-5 strategic probe prompts you’ll likely end up with 50+ criteria. Go wide here, extract as many as you can and don’t worry about duplicates across answers - duplicates are in fact good.
Create a spreadsheet. List the criteria extracted from each answer in a column (i.e. criteria from probe prompt 1 in column A, criteria from probe prompt 2 in column B etc.)
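If you'd rather script the spreadsheet than copy and paste, here's a minimal sketch that lays the criteria out one column per probe prompt and produces CSV text you can open in any spreadsheet app. The function name and sample criteria are assumptions for illustration.

```python
import csv
import io
from itertools import zip_longest


def criteria_to_csv(criteria_by_prompt):
    """Build CSV text with one column per probe prompt, one criterion per row."""
    headers = list(criteria_by_prompt)
    columns = [criteria_by_prompt[header] for header in headers]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    # Columns can differ in length, so pad the shorter ones with empty cells.
    for row in zip_longest(*columns, fillvalue=""):
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical criteria extracted from two probe prompt answers.
criteria_by_prompt = {
    "Probe 1": ["RMP license number", "24/7 availability"],
    "Probe 2": ["Flat-rate pricing"],
}

csv_text = criteria_to_csv(criteria_by_prompt)
print(csv_text)
```

Duplicates across columns are left alone on purpose, since repeated criteria are a useful signal later.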
Step 3: Use an LLM to deduplicate and assign confidence scores
You can do this manually, but this is the kind of thing LLMs are particularly good at. Give your model of choice the spreadsheet (or the copy pasta data) and ask it to consolidate, deduplicate, and assign confidence scores to each of the criteria.
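The counting half of that job is mechanical, so it can also be sketched in code. The example below assumes the criteria have already been normalized to the same wording (deciding that two differently worded criteria mean the same thing is the part the LLM is good at), and it borrows the 3+/2/1 thresholds from the example prompt in this step. The function name and sample data are hypothetical.

```python
from collections import Counter


def score_criteria(responses):
    """Count how many responses mention each criterion and label confidence."""
    counts = Counter()
    for criteria in responses:
        # Use a set so a criterion repeated within one response counts once.
        counts.update({criterion.strip().lower() for criterion in criteria})

    def label(n):
        if n >= 3:
            return "high"
        if n == 2:
            return "medium"
        return "low"

    return {criterion: (n, label(n)) for criterion, n in counts.most_common()}


# Hypothetical criteria from three probe prompt answers.
responses = [
    ["RMP license", "24/7 availability"],
    ["rmp license", "Flat-rate pricing"],
    ["RMP License ", "24/7 availability"],
]

scores = score_criteria(responses)
for criterion, (count, confidence) in scores.items():
    print(f"{criterion}: {count} ({confidence})")
```

This only merges exact matches after lowercasing and trimming, so treat it as a sanity check on the LLM's counts and not a replacement for the consolidation itself.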
Here’s an example prompt you can adapt:
You are synthesizing criteria extracted from {n} different LLM responses about the same topic.
TOPIC: {topic}
EXTRACTED CRITERIA FROM ALL RESPONSES:
{all_criteria}
1. DEDUPLICATE only TRUE DUPLICATES - criteria that mean exactly the same thing
- MERGE: "contingency fee" and "no upfront cost" → same concept
- DO NOT MERGE: "24/7 availability" and "responds within 2 hours" → these are DISTINCT criteria
- Preserve granularity - each unique actionable insight should remain separate
- When in doubt, keep criteria separate rather than merging
- IMPORTANT: Retain 40-60% of input criteria. If you're outputting less than 30%, you're merging too aggressively.
2. COUNT frequency - how many of the {n} responses mentioned each criterion
- High confidence: mentioned in 3+ responses (of {n})
- Medium confidence: mentioned in 2 responses
- Low confidence: mentioned in 1 response
- PRESERVE LOW CONFIDENCE CRITERIA - these unique insights are valuable even if only mentioned once
3. CATEGORIZE into logical groups:
- Expertise & Specialization
- Trust Signals & Credibility
- Accessibility & Service
- Pricing & Fees
- Content & Information
4. FLAG actionability:
- "content": Can be addressed by adding/improving website content
- "business": Requires operational/business changes
- "external": Requires third-party validation (awards, reviews, etc.)
5. Provide a brief RECOMMENDATION for each criterion
Step 4: Find the gaps
So now you know which criteria the models deem important for a product or service.
If you’re using QueryBurst you’ll be able to find out whether or not you satisfy them already with our Site Investigation; an agentic loop, which equips the model with tools for lexical search, semantic search, hybrid search and more (get headings, read specific sections etc).
But if not, you’ll need to manually go through and check. Start with the core product or service page you’re working on, the one that ranks in search, as that’s going to be where you’ll want to be putting all this.
Do you satisfy each of the criteria? Can you?
In the wise words of David Beckham… “be honest”.
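For a first pass at the manual check, a crude phrase lookup can flag the obvious gaps. This is a sketch under heavy assumptions: the criteria names, the evidence phrases, and the sample page text are all made up, and simple substring matching will miss paraphrases (and can match short phrases inside longer words), so a human read is still required.

```python
def find_gaps(page_text, criteria):
    """Split criteria into covered and missing, using phrase matching.

    `criteria` maps a criterion name to phrases that would evidence it.
    """
    text = page_text.lower()
    covered, missing = [], []
    for name, phrases in criteria.items():
        if any(phrase.lower() in text for phrase in phrases):
            covered.append(name)
        else:
            missing.append(name)
    return covered, missing


# Hypothetical page copy and criteria for the Austin plumber example.
page_text = "We are led by a Responsible Master Plumber and quote flat-rate pricing upfront."
criteria = {
    "License verification": ["responsible master plumber", "rmp"],
    "Pricing model": ["flat-rate", "upfront"],
    "Availability": ["24/7"],
    "Slab leaks": ["slab leak"],
}

covered, missing = find_gaps(page_text, criteria)
print("Covered:", covered)
print("Missing:", missing)
```

Anything in the missing list is either a content gap you can genuinely fill or a criterion you don't satisfy. Only the first kind goes on the page.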
Step 5: Place this prominently
That new “GEO” tactic about answering the question quickly. The one about starting with a summary. It’s not that it’s bad advice, it’s just that it’s something we’ve been doing in SEO for well over a decade.
Anyway, same rules apply for product and service pages.
Kick off with a 100(ish) word paragraph that knocks off the absolutely critical criteria (the ones appearing in the most probe prompt answers). Confidently state what your service or product is, who it’s for, and why it’s the best fit.
Follow it up with a couple of headings, each with 5-6 bullet points underneath that cover additional criteria.
In short: summarize (pun intended).
Knock off 20+ criteria in the first 200-300 words (1 paragraph, 2 headings, 2 lists, oh my!).
Watch Google’s AI Mode and AI Overviews fall in love with your page.
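To check how front-loaded a page is, here's a small sketch that looks only at the opening words and lists which criteria phrases appear there. The 300-word window comes from the guidance above. The function name, the phrases, and the sample text are assumptions, and the matching is plain substring lookup.

```python
def opening_coverage(page_text, criteria_phrases, word_limit=300):
    """Return the criteria phrases found within the first `word_limit` words."""
    words = page_text.split()
    opening = " ".join(words[:word_limit]).lower()
    return [phrase for phrase in criteria_phrases if phrase.lower() in opening]


# Hypothetical page: one criterion up top, another buried after 300 filler words.
page_text = "licensed flat-rate plumber " + "filler " * 300 + "24/7 emergency"
hits = opening_coverage(page_text, ["flat-rate", "24/7"], word_limit=300)

print(f"{len(hits)} criteria in the opening: {hits}")
```

If the count is well short of the 20 or so you're aiming for, the criteria are probably on the page but buried too far down.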
Step 6: Write for humans and machines (in that priority)
The rest of your page? Write it for humans. Don’t slopify it.
That’s not to say the intro paragraph and lists at the start aren’t written for humans. They very much should be, and they’re going to be covering what humans are looking for anyway, since that’s what the LLM learned from.
We’re just giving them a little extra TLC to make sure they’re easy to synthesize and include the key information that the LLM is looking for. We’re pre-matching the answer.
You’ll want to make sure the rest of your page is well written and well structured. But I mean, you really don’t need someone to tell you to use headings in (looks at calendar in disbelief) 2026 do you?
SEO has always, always, been about writing for both humans and machines. If someone tells you that you have to write for machines only now, tell them to recursively embed that methodology into a posterior context window.
Step 6b: Add some FAQs for the key criteria at the end
You know LLMs like FAQs, Google does too. Your customers like them. Add some. Again, not exactly jaw dropping news here.
Step 6c: Create individual pages covering the key criteria
Spot something (one of the evaluation criteria) that’s particularly important. Create a dedicated page for it.
Watch it show up in the fan-out queries (this is where they come from guys - the LLM is doing research on these criteria).
Step 6d: Profit?
Why this works (something, something probabilities…)
What we’ve just done seems almost insultingly simple. We asked the funny robot what it wants, and then we gave it to it.
I could wrap all this in technical jargon and math. I choose not to.
So why does it work?
1. By covering the criteria, you’re using the right words, and you’re optimizing the math
You could sit there all day robotically writing, scoring, and rewriting the perfect chunk. You might get reasonable results, I’m not here to argue with you.
But remember the patent: the AI generates a statement first (its 'opinion'), then searches for a source to back it up.
The searching here is the math. By covering the criteria, you’re using the right words to make the math work.
You’re gift-wrapping a fact-checked dossier that mathematically aligns with the LLM’s pre-conceived outline of the correct answer.
2. You're forcing the probability (closer) to 1.
LLMs are probabilistic. But probability collapses when there’s only one clear choice.
Remember the verbatim recital of Harry Potter? The model recites the first line of the book verbatim not because it's 'thinking,' but because the probability of the next word in that specific sequence is nearly 100%.
When the LLM has a checklist of ten things, and your page is the sole result in the top 10-20 organic results that satisfies them all (in a clean, extractable format), the odds (and the math) are very much in your favor.
3. You're the path of least resistance (you short-circuit the fan-out).
Some might say that LLMs are lazy. Others might say that they’re computationally efficient. But either way, by doing all this, you’re doing the AI’s homework. You’re making its job easier.
No need to perform five extra ‘fan-out’ searches to verify your license, your experience, and your warranty. It got it all in one go.
It can read your summaries, verify all its criteria in a single pass, and confidently generate the answer. You didn't just give it the best answer; you gave it the easiest one.
You’re not gaming the math, if anything, you’re gaming the LLM. But importantly, you’re gaming it with the truth.
Is this the whole story?
Of course it’s not the whole story.
Was on-page SEO the whole story to ranking?
Do you want to be mentioned where your competitors are mentioned? Yes you do. Do you want to be cited in reviews and high quality third-party “best X for Y” listicles? Of course. Do you need to cover the mid to long-tail with separate pages? Naturally.
If there are a load of bad reviews about your product or service on the internet are you going to be cited? Probably not.
If you’re a well known brand that appears over and over in training data do you have an advantage? 100%.
Do you need to add keywords to your URLs? Dude… come on…
Did the same apply in October 2022?
Yes it did.
And back on site…
Do you want to make sure facts, credentials etc are consistent across your website? Yes. Guess what, you can do that with QueryBurst.
Do you want to make sure marketing claims have sufficient supporting evidence? Yes. Let the record state that you can do that with QueryBurst.
Do you need to check the cosine similarity and lexical score of every paragraph, heading section, and sentence on your site? Do you need to compare entities on your page against a competitor?
As I’ve just explained, probably not. But if you really, really want to, QueryBurst has a full set of tools for doing just that.
Some new tools for "chunk" testing, semantic/BM25/hybrid scoring + entity extraction.
Do you need to bother with this stuff?
Most of the time... probably not.
More on that soon. pic.twitter.com/iIrY9OzmKt
David McSweeney (@top5seo), February 4, 2026
And finally, do you need to rank in search for any of this to work?
Yes my friend, you do.
It's why all this continues to be part of SEO. And it's why us experienced SEOs (I've been in the trenches for almost three decades now) are in the best position to advise on how to navigate these new search surfaces (which is what "AI" is ultimately).
So think very carefully before filling your website with bulk generated, low effort, AI slop. There are indications that it’s already starting to get filtered out, and downstream, that of course means a decline in AI visibility.
I started by talking about how we figured things out by observing Google. I’ll finish by saying that Google has always combated spam by staring right back at us.
And I’ll reiterate once again that LLMs are pattern matchers. So match the patterns.

