How to Extract Entities and Semantic Relationships from Page Content
Entity Analysis in QueryBurst extracts every named entity (people, products, organizations, concepts, locations, events) and semantic relationship from a page, then scores the results for density, coherence, and alignment. Results are visualized as an arc diagram, a filterable entity list, relationship triples, topic clusters, and a document view whose attention mode highlights what an AI model might focus on. Pages with low entity density or orphan entities (mentioned but not connected to any other entity) are flagged.
How to Access Entity Analysis
- Navigate to the Page Reports tab in QueryBurst
- Select a page from the crawled pages list
- Click on the Entities tab in the page details view
- Click Start Entity Analysis to run the analysis
Note: Entity Analysis is computationally intensive and may take 30-60 seconds depending on content length.
Understanding the Results
Summary Bar
After analysis completes, you'll see a compact summary showing:
| Metric | Description | Good Score |
|---|---|---|
| Overall Score | Combined score of all metrics | 70+ |
| Density | How many entities per unit of text | 60+ |
| Relations | How many relationships between entities | 50+ |
| Coherence | How well entities cluster into topics | 65+ |
| Alignment | How well content matches expected topics | 60+ |
| Semantic | Ratio of entity signal to filler words | 40+ |
You'll also see counts of:
- Entities - Total unique entities extracted
- Triples - Total relationships discovered
- Clusters - Topic groups formed
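If you note the scores down from the summary bar, a small script can check them against the "Good Score" column. The sketch below is only an illustration: the function name, the dictionary keys, and the example scores are assumptions, not part of QueryBurst.

```python
# Thresholds copied from the "Good Score" column of the summary table.
# The lowercase keys are an assumed naming convention for this sketch.
GOOD_SCORE_THRESHOLDS = {
    "overall": 70,
    "density": 60,
    "relations": 50,
    "coherence": 65,
    "alignment": 60,
    "semantic": 40,
}


def metrics_below_threshold(scores: dict[str, float]) -> list[str]:
    """Return the metrics whose score is below the 'good' threshold.

    A metric missing from `scores` is treated as 0 and therefore reported.
    """
    return [
        name
        for name, threshold in GOOD_SCORE_THRESHOLDS.items()
        if scores.get(name, 0) < threshold
    ]


# Hypothetical scores typed in from a summary bar.
example_scores = {
    "overall": 72, "density": 55, "relations": 61,
    "coherence": 70, "alignment": 58, "semantic": 44,
}
print(metrics_below_threshold(example_scores))  # ['density', 'alignment']
```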
Primary Topics
Tags showing the main topics your content covers, derived from entity clustering.
Inferred Topic
An AI-generated summary of what your page appears to be about, based on the extracted entities and relationships.
Orphan Entities Warning
Entities that are mentioned but have no relationships to other entities. These represent potential content gaps where concepts are introduced but not connected to your main narrative.
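The orphan definition above can be expressed in a few lines: an entity is an orphan if it never appears as the subject or object of any triple. This is a minimal sketch of that idea, not QueryBurst's implementation; the function name and the sample data (including "free shipping") are hypothetical.

```python
def find_orphans(
    entities: list[str], triples: list[tuple[str, str, str]]
) -> list[str]:
    """Return entities that appear in no (subject, predicate, object) triple."""
    connected: set[str] = set()
    for subject, _predicate, obj in triples:
        connected.add(subject)
        connected.add(obj)
    return [entity for entity in entities if entity not in connected]


# Hypothetical extraction result for a product page.
entities = ["Organic mattress", "natural latex", "GOTS", "free shipping"]
triples = [
    ("Organic mattress", "is made from", "natural latex"),
    ("Organic mattress", "certified by", "GOTS"),
]
print(find_orphans(entities, triples))  # ['free shipping']
```

In this example, "free shipping" is mentioned but never related to anything else, so it is the one entity worth connecting back to the main narrative.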
View Modes
Graph View
An interactive arc diagram visualization showing:
- Nodes - Entities positioned in a circle, sized by importance (salience/mentions)
- Arcs - Curved lines connecting related entities
- Colors - Each entity type has a distinct color
Interactions:
- Hover over an entity to highlight its connections
- Click an entity to lock the selection and see all its relationships in the detail panel
- Click again to deselect
The right panel shows all relationships for the selected entity in the format:
Subject → predicate → Object
Entities View
A filterable list of all extracted entities.
Entity Types:
| Type | Color | Examples |
|---|---|---|
| Person | Blue | Names, roles, titles |
| Organization | Purple | Companies, institutions, brands |
| Product | Emerald | Products, services, offerings |
| Concept | Amber | Abstract ideas, methodologies |
| Location | Rose | Places, regions, addresses |
| Event | Cyan | Dates, occasions, milestones |
| Attribute | Slate | Properties, characteristics |
Features:
- Filter by entity type using the pills at the top
- Click any entity row to expand it and see its context, that is, where the entity appears in your content
- See mention count for each entity
Relationships View
Shows all semantic triples extracted from your content in the format:
Subject → predicate → Object
Examples:
- "Organic mattress" → is made from → "natural latex"
- "Company" → founded in → "2015"
- "Product" → certified by → "GOTS"
Relationships reveal how your content connects concepts. An entity that is linked to others through explicit statements is easier for an AI system to interpret than one that is only mentioned.
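A triple can be modeled as a small record with three fields. The sketch below uses the example triples from this section and shows how the relationships of one selected entity could be looked up, similar to the detail panel in Graph View. The `Triple` class and `relationships_for` function are illustrative names, not a QueryBurst API.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One semantic relationship: subject, predicate, object."""

    subject: str
    predicate: str
    obj: str

    def __str__(self) -> str:
        return f"{self.subject} → {self.predicate} → {self.obj}"


def relationships_for(entity: str, triples: list[Triple]) -> list[Triple]:
    """Return every triple in which `entity` is the subject or the object."""
    return [t for t in triples if entity in (t.subject, t.obj)]


triples = [
    Triple("Organic mattress", "is made from", "natural latex"),
    Triple("Company", "founded in", "2015"),
    Triple("Product", "certified by", "GOTS"),
]

for triple in relationships_for("GOTS", triples):
    print(triple)  # Product → certified by → GOTS
```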
Clusters View
Groups of semantically related entities that form topic clusters.
Each cluster shows:
- Primary topic - The central concept
- Entity count - How many entities belong to this cluster
- Member entities - All entities in the cluster
Well-structured content typically has:
- 2-5 clear topic clusters
- Strong central themes with supporting entities
- Minimal overlap between clusters
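One way to make "2-5 clusters" and "minimal overlap" concrete is to count the clusters and measure how many entities any two clusters share. The sketch below uses Jaccard similarity for the overlap. That choice is an assumption for illustration, as are the function names and the sample clusters; QueryBurst may measure overlap differently.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared members divided by total distinct members (0.0 = no overlap)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_report(clusters: dict[str, set[str]]) -> dict[str, object]:
    """Check the cluster count and find the largest pairwise overlap."""
    overlaps = [
        jaccard(clusters[x], clusters[y]) for x, y in combinations(clusters, 2)
    ]
    return {
        "cluster_count_ok": 2 <= len(clusters) <= 5,
        "max_overlap": max(overlaps, default=0.0),
    }


# Hypothetical clusters keyed by primary topic.
clusters = {
    "Materials": {"natural latex", "organic cotton", "wool"},
    "Certification": {"GOTS", "organic cotton"},
}
print(cluster_report(clusters))
# {'cluster_count_ok': True, 'max_overlap': 0.25}
```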
Document View
The most powerful view for understanding how entities appear in context. It has four sub-modes:
Normal Mode
Full document text with entities highlighted in their type colors. Hover over any entity to see its type and mention count.
Attention Mode
Dims all non-entity text, making entities "pop" visually. This shows what an AI might focus on when extracting meaning from your content.
Condensed Mode
Removes all non-entity text entirely, showing only the entities in sequence. This represents the "essence" of your content after removing filler words.
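The idea behind Condensed Mode can be approximated with a pattern match: keep each entity mention in the order it appears and drop everything else. This is a rough sketch that assumes the entity names are already known and match the text literally. The `condense` function is hypothetical and much simpler than a real NER-based pipeline.

```python
import re


def condense(text: str, entity_names: list[str]) -> list[str]:
    """Return entity mentions in document order, dropping all other text."""
    if not entity_names:
        return []
    # Try longer names first so "natural latex" wins over a shorter "latex".
    names = sorted(entity_names, key=len, reverse=True)
    pattern = re.compile(
        "|".join(rf"\b{re.escape(name)}\b" for name in names),
        re.IGNORECASE,
    )
    return [match.group(0) for match in pattern.finditer(text)]


text = "Our organic mattress is made from natural latex and certified by GOTS."
print(condense(text, ["GOTS", "natural latex", "organic mattress"]))
# ['organic mattress', 'natural latex', 'GOTS']
```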
Themes Mode
Shows the extracted core themes with their descriptions and member entities. This is the final "distilled" representation of your content.
Entity Type Filters: Use the colored pills to toggle which entity types are highlighted. This helps focus on specific aspects of your content.
Chunk Density Scores: Each content chunk shows a density percentage indicating how much of that section's text consists of meaningful entities vs filler.
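As a rough illustration of a chunk density percentage, the sketch below counts the words that fall inside an entity mention and divides by the total word count. The exact formula QueryBurst uses is not documented here, so treat this word-based definition, the `chunk_density` function, and the `span_of` helper as assumptions.

```python
import re


def chunk_density(chunk: str, entity_spans: list[tuple[int, int]]) -> float:
    """Percentage of words lying fully inside a (start, end) entity span."""
    words = [(m.start(), m.end()) for m in re.finditer(r"\w+", chunk)]
    if not words:
        return 0.0
    inside = sum(
        1
        for word_start, word_end in words
        if any(start <= word_start and word_end <= end for start, end in entity_spans)
    )
    return 100.0 * inside / len(words)


def span_of(text: str, phrase: str) -> tuple[int, int]:
    """Character span of the first occurrence of `phrase` (example helper)."""
    start = text.index(phrase)
    return (start, start + len(phrase))


chunk = "Our organic mattress is made from natural latex and certified by GOTS."
spans = [span_of(chunk, p) for p in ("organic mattress", "natural latex", "GOTS")]
print(round(chunk_density(chunk, spans), 1))  # 41.7 (5 of 12 words)
```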
Watch Distillation
Click the Watch Distillation button to see an animated visualization of how your content is progressively condensed:
- Web Page → The original page
- Extracted Text → Raw text content (word count)
- Attention → Key terms highlighted (filler fading)
- Condensed → Only entities remain
- Themes → Final core themes
This visualization helps explain how AI systems might process and understand your content.
Interpreting Results
Healthy Entity Structure
✅ High entity density - Content rich in meaningful concepts
✅ Many relationships - Concepts are connected, not just listed
✅ Clear topic clusters - Content is well-organized thematically
✅ Few orphan entities - All concepts tie back to main narrative
✅ Strong coherence - Topics stay focused, don't drift
Warning Signs
⚠️ Low entity density - Too much filler, not enough substance
⚠️ Few relationships - Concepts mentioned but not connected
⚠️ Many orphan entities - Disconnected ideas that confuse readers
⚠️ Too many clusters - Content lacks focus, tries to cover too much
⚠️ Low coherence - Topics jump around without clear structure
Score Interpretation
| Score Range | Interpretation |
|---|---|
| 80-100 | Excellent - Well-structured, entity-rich content |
| 60-79 | Good - Solid foundation with room for improvement |
| 40-59 | Fair - Consider adding connections and depth |
| 0-39 | Needs work - Content may be too thin or unfocused |
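The bands in the table translate directly into a lookup function, which can be handy when reviewing scores for many pages in a spreadsheet or script. The function name is illustrative.

```python
def interpret_score(score: float) -> str:
    """Map a 0-100 score to the interpretation band from the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "Excellent"
    if score >= 60:
        return "Good"
    if score >= 40:
        return "Fair"
    return "Needs work"


for value in (85, 60, 45, 12):
    print(value, interpret_score(value))
# 85 Excellent
# 60 Good
# 45 Fair
# 12 Needs work
```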
Tips & Best Practices
Improving Entity Density
- Be specific - Use proper nouns and specific terms instead of generic ones
- Name things - Give products, features, and concepts clear names
- Add context - Explain what entities are when introducing them
Building Stronger Relationships
- Use connecting language - "X is designed for Y", "A includes B"
- Explain causation - "Because of X, we developed Y"
- Show hierarchy - "X, which is part of Y, enables Z"
Reducing Orphan Entities
- Connect back - When introducing a new concept, relate it to something already mentioned
- Use transitions - Bridge between topics explicitly
- Summarize - Recap how concepts relate at section ends
Improving Topic Coherence
- Outline first - Plan content structure before writing
- One topic per section - Don't mix unrelated concepts
- Use headers - Clear section breaks help maintain focus
Technical Details
How Entity Extraction Works
- Chunking - Content is split into semantic chunks
- NER Processing - Named Entity Recognition identifies entities
- Type Classification - Entities are categorized by type
- Relationship Extraction - Semantic triples are identified
- Embedding - Entities are converted to vectors
- Clustering - Similar entities are grouped by semantic similarity
- Topic Inference - An overall topic is synthesized
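To make the embedding and clustering steps more tangible, the sketch below groups entities whose vectors point in a similar direction. It is a toy under stated assumptions: the two-dimensional vectors are made up (real embeddings come from a model and have hundreds of dimensions), the entity names are hypothetical, and the greedy threshold clustering is a simple stand-in rather than the algorithm QueryBurst uses.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(vectors: dict[str, list[float]], threshold: float = 0.8) -> list[list[str]]:
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters: list[list[str]] = []  # the first member of each cluster is its seed
    for name, vector in vectors.items():
        for members in clusters:
            if cosine(vector, vectors[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])  # no similar cluster found, start a new one
    return clusters


# Made-up 2-D "embeddings": first axis ~ materials, second ~ certifications.
vectors = {
    "natural latex": [0.9, 0.1],
    "organic cotton": [0.8, 0.2],
    "GOTS": [0.1, 0.9],
    "OEKO-TEX": [0.2, 0.8],
}
print(cluster(vectors))
# [['natural latex', 'organic cotton'], ['GOTS', 'OEKO-TEX']]
```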
Entity Salience
Entities are ranked by "salience", a measure of how important they are to the content. Salience takes the following factors into account:
- Frequency of mentions
- Position in content (earlier = more important)
- Relationship connections
- Semantic centrality in clusters
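One plausible way to combine these four factors is a weighted sum of normalized values. The weights, field names, and normalization below are assumptions chosen for illustration; the actual salience formula is not specified in this document.

```python
from dataclasses import dataclass


@dataclass
class EntityStats:
    name: str
    mentions: int            # frequency of mentions
    first_position: float    # 0.0 = start of content, 1.0 = end
    relationship_count: int  # number of triples the entity takes part in
    centrality: float        # 0.0 to 1.0, closeness to its cluster's center


# Hypothetical weights; they sum to 1.0 so salience stays between 0 and 1.
WEIGHTS = {"frequency": 0.4, "position": 0.2, "relationships": 0.2, "centrality": 0.2}


def salience(e: EntityStats, max_mentions: int, max_relationships: int) -> float:
    """Weighted sum of the four factors, each scaled to the range 0..1."""
    frequency = e.mentions / max_mentions if max_mentions else 0.0
    position = 1.0 - e.first_position  # earlier mention scores higher
    relationships = e.relationship_count / max_relationships if max_relationships else 0.0
    return (
        WEIGHTS["frequency"] * frequency
        + WEIGHTS["position"] * position
        + WEIGHTS["relationships"] * relationships
        + WEIGHTS["centrality"] * e.centrality
    )


def rank_by_salience(entities: list[EntityStats]) -> list[EntityStats]:
    """Sort entities from most to least salient."""
    max_mentions = max((e.mentions for e in entities), default=0)
    max_relationships = max((e.relationship_count for e in entities), default=0)
    return sorted(
        entities,
        key=lambda e: salience(e, max_mentions, max_relationships),
        reverse=True,
    )


stats = [
    EntityStats("GOTS", mentions=2, first_position=0.6, relationship_count=1, centrality=0.4),
    EntityStats("Organic mattress", mentions=8, first_position=0.0, relationship_count=3, centrality=0.9),
]
print([e.name for e in rank_by_salience(stats)])  # ['Organic mattress', 'GOTS']
```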
Analysis Caching
Previous analyses are saved and can be loaded from the history panel. This lets you:
- Compare analyses over time as content changes
- Quickly review past results without re-running
- Track improvements in entity structure
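When comparing two saved analyses, the useful signal is usually which scores moved and by how much. The sketch below assumes you have copied the scores of two analyses into dictionaries; the function name and the numbers are hypothetical, and it does not read from QueryBurst's history panel.

```python
def score_changes(
    previous: dict[str, float], current: dict[str, float]
) -> dict[str, float]:
    """Return the change per metric, skipping metrics that did not move."""
    return {
        metric: current[metric] - previous[metric]
        for metric in previous
        if metric in current and current[metric] != previous[metric]
    }


# Hypothetical scores before and after a content rewrite.
before = {"overall": 58, "density": 52, "relations": 45}
after = {"overall": 66, "density": 52, "relations": 57}
print(score_changes(before, after))  # {'overall': 8, 'relations': 12}
```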
Related Reports
- Page Reports - Page-level health metrics, filtering, and semantic search
- Links Analysis - Internal linking structure
- AI Simulation - How AI models respond to queries about your page
- Knowledge Graph - Site-wide entity profiles and relationships