What Does AI Know About My Brand?

AI systems form their understanding of a brand from multiple sources: the brand's own website, third-party mentions, knowledge graph entries (Google's Knowledge Graph and emerging equivalents), and patterns compressed into model weights during training. This understanding includes entities, claims, relationships, and topical associations. The resulting representation determines how AI answers questions about the brand, whether the brand gets recommended, and how accurately it is described. Most of these sources are outside the brand's direct control, but the brand's own website should be the primary, authoritative source of truth. If the website doesn't clearly and consistently express its entities, claims, and relationships, AI systems will fill the gaps from third-party sources or fail to represent the brand at all.

Site Intelligence in QueryBurst extracts this same structured representation from a website — entities, claims, relationships, topics — making visible what AI can learn from the site's own content, and where gaps leave the brand's AI representation to others.

What the Pipeline Produces

Site Intelligence extracts five layers of structured data from your content:

| Layer | Description |
| --- | --- |
| Entities | People, organisations, products, concepts, locations, and events mentioned across the site |
| Statements | Subject–predicate–object triples expressing relationships (e.g. "Company → founded in → 2015") |
| Topics | Thematic clusters of related entities that define the site's subject matter |
| Claims | Factual assertions, opinions, and claims made in the content, verifiable or not |
| Questions | Question-style headings that signal FAQ-type content to AI systems |

These layers power every subtab within the Intelligence section, as well as Entity Flow, Entity Analysis (page-level), and the verification tools.
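
One way to picture the five layers is as a small set of record types. The Python sketch below is illustrative only: the class and field names (`Entity`, `Statement`, `Claim`, `SiteLayers`, `source_url`) are assumptions made for this example, not QueryBurst's actual schema or export format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str  # e.g. "person", "organisation", "product", "concept", "location", "event"


@dataclass(frozen=True)
class Statement:
    """A subject-predicate-object triple."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Claim:
    text: str
    source_url: str  # hypothetical field: the page the claim was found on


@dataclass
class SiteLayers:
    """Hypothetical container for the five extracted layers."""
    entities: list[Entity] = field(default_factory=list)
    statements: list[Statement] = field(default_factory=list)
    topics: dict[str, list[str]] = field(default_factory=dict)  # topic label -> entity names
    claims: list[Claim] = field(default_factory=list)
    questions: list[str] = field(default_factory=list)


# Example data mirroring the "Company -> founded in -> 2015" triple from the table.
layers = SiteLayers(
    entities=[Entity("Company", "organisation")],
    statements=[Statement("Company", "founded in", "2015")],
    topics={"Company history": ["Company"]},
    claims=[Claim("Company was founded in 2015", "https://example.com/about")],
    questions=["When was Company founded?"],
)
```

Keeping statements as plain triples makes it easy to ask "what does the site say about this entity?" by filtering on `subject`.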

How It Works

Two-Stage Extraction

  1. Stage 1 — Primary Entity Pass: A lightweight pass over each page's title and meta description identifies the primary entity (the main subject of the page).
  2. Stage 2 — Full Extraction: A deeper pass over the page content uses the primary entity as context to extract all entities, relationships (triples), summaries, target queries, and claims.
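
The control flow of the two stages can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the page dictionary keys, and the toy string-based stand-ins are all hypothetical, and the real extraction logic is not described in this document. The only point shown is that the output of Stage 1 is passed into Stage 2 as context.

```python
from typing import Callable


def run_two_stage(
    page: dict,
    identify_primary: Callable[[str, str], str],
    extract_full: Callable[[str, str], dict],
) -> dict:
    """Run a lightweight pass first, then a deeper pass that uses its result."""
    # Stage 1: only the title and meta description are inspected.
    primary = identify_primary(page["title"], page["meta_description"])
    # Stage 2: the full content is processed with the primary entity as context.
    details = extract_full(page["content"], primary)
    return {"url": page["url"], "primary_entity": primary, **details}


# Toy stand-ins (assumptions for illustration, not the real extractors).
def toy_identify_primary(title: str, meta_description: str) -> str:
    # Treat the text before the first "|" in the title as the main subject.
    return title.split("|")[0].strip()


def toy_extract_full(content: str, primary_entity: str) -> dict:
    # Keep sentences that mention the primary entity as candidate claims.
    sentences = [s.strip() for s in content.split(".") if s.strip()]
    return {"claims": [s for s in sentences if primary_entity in s]}


page = {
    "url": "https://example.com/",
    "title": "Acme Widgets | Home",
    "meta_description": "Acme Widgets makes widgets.",
    "content": "Acme Widgets was founded in 2015. We ship worldwide.",
}
result = run_two_stage(page, toy_identify_primary, toy_extract_full)
```

Passing the two extractors in as arguments keeps the orchestration separate from whatever model or heuristic performs each pass.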

Post-Processing

After extraction:

  • Entities are deduplicated using embedding similarity and automated review
  • Entity prominence scores are computed based on mention frequency, page coverage, and internal linking
  • Topic clusters are formed by grouping semantically similar entities
  • Missing internal links between related pages are identified
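
To illustrate the deduplication step, the sketch below groups entity names whose embedding vectors are close under cosine similarity. It is a simplified example under assumptions: the vectors are made up, the 0.9 threshold is arbitrary, and the function names are hypothetical. The automated review mentioned above is not modelled here.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dedup_groups(embeddings: dict[str, list[float]], threshold: float = 0.9) -> list[list[str]]:
    """Group names whose embeddings are at least `threshold` similar (union-find)."""
    names = list(embeddings)
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(b)] = find(a)

    groups: dict[str, list[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    # Only groups with more than one member are name variants worth reviewing.
    return [g for g in groups.values() if len(g) > 1]


# Made-up 3-dimensional vectors; real embeddings have far more dimensions.
toy_embeddings = {
    "QueryBurst": [1.0, 0.0, 0.1],
    "Query Burst": [0.98, 0.05, 0.12],
    "Knowledge Graph": [0.0, 1.0, 0.0],
}
variant_groups = dedup_groups(toy_embeddings)
```

The pairwise comparison is quadratic in the number of entities, which is fine for a sketch but would need an approximate nearest-neighbour index on large sites.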

Subtabs

Site Intelligence contains seven subtabs, each providing a different view of the extracted data:

| Subtab | What It Shows |
| --- | --- |
| Knowledge Graph | The main entity table: every entity profile with statements, reinforcement, focus ratings, prominence scores, and linking gaps. Includes the Entity Universe scatter plot and Dedup Groups panel. |
| Graph Explorer | Interactive hop-by-hop navigation through entity relationships, with degree distance and personalised PageRank. |
| Topics & Focus | Topical concentration score, semantic link alignment score, topic ring map, redundant page detection, and linking opportunity analysis. |
| Facts & Claims | Every claim extracted from the site, filterable by type (claim, fact, opinion, statement), searchable by keyword or semantic query, with inline verification. |
| Questions | All question-format headings across the site, the patterns AI models look for when generating answers. |
| Comparison Pages | Brand and product mentions extracted from roundup and comparison content, with positioning and co-occurrence data. |
| Architecture | Interactive treemap of the site's URL structure showing content distribution across folders. |

Running the Pipeline

  1. Navigate to Site Intelligence in the sidebar
  2. If no analysis has been run, click Run analysis
  3. Processing time depends on site size (typically a few minutes for sites under 500 pages)
  4. Results appear progressively — the entity table loads first, followed by enrichment scores (Focus, Score, Gaps, Peers)

On subsequent runs, the pipeline processes only pages with new or changed content.
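
How changed pages are detected is not specified here. One common approach, shown purely as an assumption, is to store a hash of each page's content and reprocess a page only when its hash differs from the stored one. The function names below are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a page's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pages_to_process(current: dict[str, str], previous_hashes: dict[str, str]) -> list[str]:
    """Return URLs that are new or whose content hash has changed."""
    return [
        url
        for url, text in current.items()
        if previous_hashes.get(url) != content_hash(text)
    ]


# Hashes stored from an earlier run (example data).
previous = {
    "/about": content_hash("We make widgets."),
    "/pricing": content_hash("Plans start at $10."),
}
# Content seen on the current run: one unchanged, one edited, one new page.
current = {
    "/about": "We make widgets.",
    "/pricing": "Plans start at $12.",
    "/blog/launch": "We launched.",
}
changed = pages_to_process(current, previous)
```

Here only `/pricing` (edited) and `/blog/launch` (new) would be reprocessed, while `/about` is skipped.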

Key Metrics

| Metric | Description | Healthy Range |
| --- | --- | --- |
| Entity profiles | Total unique entities extracted | Depends on site size |
| Statements | Total subject–predicate–object triples | More = richer knowledge |
| Reinforced | Entities mentioned on more than one page | Higher = stronger signal |
| Topical Focus | How tightly content clusters around core themes (0–100) | 60+ |
| Semantic Link Alignment | How well internal links match semantic similarity (0–100) | 60+ |
| Dedup groups | Entity name variants detected | Fewer = cleaner |
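
The Reinforced metric follows directly from its definition: an entity counts as reinforced when it is mentioned on more than one page. A minimal sketch, assuming the per-page entity sets are already available (the function name and sample data are hypothetical):

```python
from collections import Counter


def reinforced_entities(page_entities: dict[str, set[str]]) -> set[str]:
    """Entities that appear on more than one page."""
    # Each page holds a set, so an entity is counted once per page.
    page_counts = Counter(
        entity for entities in page_entities.values() for entity in entities
    )
    return {entity for entity, pages in page_counts.items() if pages > 1}


sample = {
    "/about": {"Acme Widgets", "Jane Doe"},
    "/pricing": {"Acme Widgets"},
    "/blog/launch": {"Acme Widgets", "Widget Pro"},
}
reinforced = reinforced_entities(sample)
```

In this sample only "Acme Widgets" is reinforced; "Jane Doe" and "Widget Pro" each appear on a single page.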

How Intelligence Connects to Other Reports

Site Intelligence data feeds into several other areas of QueryBurst:

  • Entity Flow — Uses entity prominence and linking data to score structural support
  • Entity Analysis (page-level) — Shows per-page entity extraction plotted against the site-wide entity universe
  • Page Overview — Displays AI-extracted summary, claims, and target queries for individual pages
  • Claim Verification / Fact Verification — Uses extracted claims as the dataset to verify against
  • Answer Spy — Extracted entities and claims provide context for AI recommendation criteria analysis

Frequently Asked Questions

What does AI actually "know" about my brand?

AI systems form their understanding of a brand from multiple sources: the brand's own website content, third-party mentions (reviews, news articles, directories), knowledge graph entries (Google's Knowledge Graph, and emerging equivalents from other providers), and patterns compressed into model weights during training. The result is a composite representation — entities, claims, relationships, and topical associations — that determines how AI answers questions about the brand, whether it recommends the brand, and how accurately it describes it.

Does AI literally build a knowledge graph from my website?

Not exactly. During pre-training, models compress patterns from vast amounts of text into statistical weights; they don't store a structured graph. However, Google maintains an explicit Knowledge Graph, and other providers are building similar structured representations. At query time, retrieval-augmented generation (RAG) systems retrieve and summarise content from live web pages. The end effect is similar: AI systems behave as though they hold a structured understanding of entities and relationships, even though the internal mechanism varies. Site Intelligence shows what that structure looks like when extracted from your content.

Why does my own website matter if AI also uses other sources?

Your website is the one source you fully control. Third-party sources may be outdated, inaccurate, or missing entirely. If your site doesn't clearly express your entities, claims, and relationships, AI systems will either fill the gaps from whatever third-party content they find — or leave your brand underrepresented. A well-structured site acts as the primary authoritative signal that anchors AI's understanding, even when other sources contribute.

Can I influence what AI knows about my brand at the pre-training level?

Not directly for models that have already been trained, because their weights are fixed. But future training runs incorporate updated web content, and retrieval-augmented systems (used by ChatGPT, Perplexity, Google AI Overviews, and others) pull live content at query time. Improving your website's entity coverage, claim clarity, and structural linking therefore affects retrieval-based AI answers first, and pre-trained knowledge over the longer term as models are updated.

What's the difference between Site Intelligence and page-level Entity Analysis?

Entity Analysis (in the Page Reports section) extracts entities and relationships from a single page. Site Intelligence processes the entire site, producing a site-wide knowledge graph with cross-page reinforcement, prominence scoring, topic clustering, and gap detection. Page-level analysis shows what one page contributes; Site Intelligence shows the complete picture and how pages relate to each other.

How often should I run Site Intelligence?

After every significant content update — new pages, major rewrites, or structural changes. The pipeline only re-processes pages with changed content, so incremental runs are fast. For active sites, running after each crawl ensures the knowledge representation stays current.

What does a "gap" in the knowledge graph mean?

A gap means there are pages discussing the same entity that don't link to each other internally. This weakens the entity's authority signal because search engines and AI systems use internal links to infer topical relationships. Closing gaps — by adding internal links between related pages — reinforces the entity and strengthens the site's topical authority around it.
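
The gap definition above can be expressed as a small check over the internal link graph. The sketch below is an assumption-laden illustration: it treats a pair of pages as a gap when neither page links to the other, and the function name and input shapes are hypothetical rather than how QueryBurst computes the Gaps column.

```python
from itertools import combinations


def linking_gaps(
    entity_pages: dict[str, set[str]],
    links: set[tuple[str, str]],
) -> dict[str, list[tuple[str, str]]]:
    """For each entity, list page pairs that mention it but share no internal link."""
    gaps: dict[str, list[tuple[str, str]]] = {}
    for entity, pages in entity_pages.items():
        missing = [
            (a, b)
            for a, b in combinations(sorted(pages), 2)
            # Assumption: a link in either direction closes the gap.
            if (a, b) not in links and (b, a) not in links
        ]
        if missing:
            gaps[entity] = missing
    return gaps


entity_pages = {"Acme Widgets": {"/about", "/pricing", "/blog/launch"}}
internal_links = {("/about", "/pricing")}  # (source page, target page)
gaps = linking_gaps(entity_pages, internal_links)
```

With this data, `/about` and `/pricing` are already connected, so the two remaining pairs involving `/blog/launch` are reported as gaps for "Acme Widgets".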

  • Knowledge Graph — The default subtab: entity table, universe scatter plot, dedup groups
  • Entity Flow — Entity prominence and structural support analysis
  • Entity Profiles — Deep dive into any individual entity
  • Page Reports — Page-level health metrics and filtering