How URL Structure Affects Crawlability and AI Content Discovery
A site's URL structure determines how crawlers traverse content and how AI systems infer topical grouping. Deep or unbalanced folder hierarchies, excessive nesting, and inconsistent path conventions reduce crawl efficiency and weaken structural signals. These imbalances are invisible in flat page lists — they only become apparent when the full hierarchy is visualised.
The Architecture view in QueryBurst renders a site's full URL structure as an interactive treemap, showing content distribution, folder depths, and page counts per directory at a glance.
How to Access
- Open a site in QueryBurst
- Click Site Intelligence in the sidebar
- Select the Architecture subtab
Understanding the Interface
Stats Strip
Five key metrics appear at the top of the view:
| Stat | Description |
|---|---|
| Pages | Total pages in the current view |
| Folders | Number of distinct folders (directory-level URL path segments) |
| Max depth | Deepest nesting level in the URL structure |
| Total words | Combined word count across all pages |
| Avg words/page | Average content length |
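As a rough illustration of how these five figures relate to one another, the sketch below derives them from a list of (URL, word count) pairs. The function name `stats_strip`, the input shape, and the rule that the last path segment is the page while everything before it is a folder are assumptions made for this example, not a description of QueryBurst's internals.

```python
from urllib.parse import urlsplit

def stats_strip(pages):
    """pages: iterable of (url, word_count) pairs. Returns the five stats as a dict."""
    folders = set()
    max_depth = 0
    total_words = 0
    count = 0
    for url, words in pages:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # Assumption: the last segment is the page; the segments before it are folders.
        folder_parts = segments[:-1]
        for i in range(1, len(folder_parts) + 1):
            folders.add("/".join(folder_parts[:i]))  # record every ancestor folder once
        max_depth = max(max_depth, len(folder_parts))
        total_words += words
        count += 1
    return {
        "pages": count,
        "folders": len(folders),
        "max_depth": max_depth,
        "total_words": total_words,
        "avg_words_per_page": round(total_words / count) if count else 0,
    }

example = [
    ("https://example.com/", 100),
    ("https://example.com/blog/a", 300),
    ("https://example.com/blog/2024/b", 500),
]
stats = stats_strip(example)
```

Under these assumptions the example yields 3 pages, 2 folders (blog and blog/2024), a max depth of 2, 900 total words, and an average of 300 words per page.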
Treemap
A squarified treemap where each rectangle represents a URL folder or page:
- Size corresponds to the number of pages (toggle Balanced for log-scaled sizing that makes small sections more visible)
- Colour indicates folder depth
- Click a folder to drill into it
- Use the breadcrumb above to navigate back up
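To see why log-scaled sizing makes small sections easier to spot, the sketch below compares each folder's share of the treemap area under proportional and log-scaled weights. The `log1p` weighting and the function name `area_shares` are assumptions chosen for illustration; the exact scaling that Balanced mode applies is not specified here.

```python
import math

def area_shares(page_counts, balanced=False):
    """Return each folder's fraction of the total treemap area."""
    if balanced:
        # Assumed log scaling: log(1 + pages) compresses the gap between large and small folders.
        weights = {name: math.log1p(n) for name, n in page_counts.items()}
    else:
        weights = dict(page_counts)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()} if total else {}

counts = {"blog": 500, "docs": 50, "about": 5}
proportional = area_shares(counts)
balanced = area_shares(counts, balanced=True)
```

With these sample counts, the 5-page folder takes under 1% of the area in proportional mode and roughly 15% under the assumed log scaling, which is why it becomes visible.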
Folder Table
Below the treemap, a table lists subfolders at the current level:
| Column | Description |
|---|---|
| Folder | URL path segment |
| Pages | Total pages in this folder and its subfolders |
| Direct | Pages directly in this folder (not in subfolders) |
| Avg words | Average word count of pages in this folder |
| Total words | Combined word count |
Click a folder name to drill into it.
Pages Table
On the right side, a searchable list of all pages at or below the current path:
- URL — Shown as clickable path segments for quick drill-down, with a link to the page detail in the Crawl tab
- Title — Page title
- Words — Word count
Search
The search box supports advanced syntax:
- term — Match pages containing this term
- "exact phrase" — Exact match
- -exclude — Exclude pages with this term
- term1 term2 — AND (both required)
- term1 *or* term2 — OR (either matches)
Toggle Path and Title checkboxes to control whether search matches against URLs, titles, or both.
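The search rules above can be sketched as a small matcher. This is a minimal illustration, not the product's parser: the names `parse_query` and `matches`, case-insensitive substring matching, treating the keyword as case-insensitive `or`, and letting AND bind tighter than OR are all assumptions.

```python
import re

# A token is either an optionally negated quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'-?"[^"]+"|\S+')

def parse_query(query):
    """Split a query into OR-groups; each group is (required_terms, excluded_terms)."""
    groups = [([], [])]
    for token in TOKEN.findall(query):
        if token.lower() == "or":
            groups.append(([], []))  # start a new alternative
            continue
        excluded = token.startswith("-") and len(token) > 1
        text = (token[1:] if excluded else token).strip('"').lower()
        if text:
            groups[-1][1 if excluded else 0].append(text)
    return [g for g in groups if g[0] or g[1]]

def matches(query, path, title, use_path=True, use_title=True):
    """Mirror the Path and Title checkboxes: only enabled fields are searched."""
    fields = [f.lower() for f, on in ((path, use_path), (title, use_title)) if on]

    def found(term):
        return any(term in field for field in fields)

    groups = parse_query(query)
    if not groups:
        return True  # an empty query matches everything
    return any(
        all(found(t) for t in required) and not any(found(t) for t in excluded)
        for required, excluded in groups
    )
```

For example, `matches("blog -draft", "/blog/draft-post", "Draft")` is false because the excluded term appears, while `matches("faq or pricing", "/pricing", "Plans")` is true because the second alternative matches the path.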
Interpreting Results
Healthy Signs
- Balanced structure — Content is distributed evenly across sections, not all in one flat folder
- Logical grouping — Related pages are in the same folder
- Reasonable depth — Most content is 2–4 levels deep
- Consistent section sizes — No single folder dominates while others are nearly empty
Warning Signs
- Flat structure — Hundreds of pages at the root level with no folder organisation
- Extreme depth — Content buried 5+ levels deep may be hard for crawlers to reach
- Imbalanced sections — One folder has 500 pages while others have 5
- Orphaned folders — Sections with very few pages that could be consolidated
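The warning signs above can be approximated with a few simple checks over a list of URL paths. The sketch below is illustrative only: the function name `structure_warnings` and every threshold (100 root pages, 5 folder levels, a 50:1 size ratio, fewer than 3 pages) are assumptions picked to echo the examples in the list, not values used by QueryBurst.

```python
from collections import Counter

def structure_warnings(paths, flat_limit=100, deep_limit=5, ratio_limit=50, small_limit=3):
    """paths: URL paths such as '/blog/2024/post'. Returns a list of warning strings."""
    root_pages = 0
    deepest = 0
    sections = Counter()  # pages per top-level folder
    for path in paths:
        segments = [s for s in path.split("/") if s]
        folders = segments[:-1]  # assumption: the last segment is the page itself
        deepest = max(deepest, len(folders))
        if folders:
            sections[folders[0]] += 1
        else:
            root_pages += 1

    warnings = []
    if root_pages >= flat_limit:
        warnings.append(f"flat structure: {root_pages} pages at root level")
    if deepest >= deep_limit:
        warnings.append(f"extreme depth: content {deepest} levels deep")
    if len(sections) > 1:
        largest, smallest = max(sections.values()), min(sections.values())
        if largest / smallest >= ratio_limit:
            warnings.append(f"imbalanced sections: {largest} vs {smallest} pages")
    small = sorted(name for name, n in sections.items() if n < small_limit)
    if small:
        warnings.append("small folders: " + ", ".join(small))
    return warnings

sample = [f"/blog/post-{i}" for i in range(500)] + ["/about/team", "/a/b/c/d/e/page"]
report = structure_warnings(sample)
```

The thresholds are deliberately exposed as parameters, since what counts as too flat or too deep depends on the size and type of site.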
Tips
- Drill into large sections — Use the treemap to explore your biggest content areas
- Check for content sprawl — Look for folders with too many direct pages that could benefit from subfolder organisation
- Search for page types — Use search to find specific content (e.g., "blog", "product", "faq")
- Compare word counts — Folders with very low average word counts may contain thin content
- Toggle Balanced mode — Switch to balanced sizing to see small sections that might be invisible in proportional mode
Technical Details
The treemap is built from the URL path hierarchy of all crawled pages. Each URL path is split on / to construct the folder tree, and page counts and word counts are aggregated up the hierarchy so that every folder's totals include its subfolders. The layout uses a squarified treemap algorithm, which favours rectangles with aspect ratios close to 1 so that small sections stay legible.
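The path splitting and upward aggregation described here can be sketched as follows. This is a minimal illustration under stated assumptions: the node layout, the names `new_node` and `build_tree`, and the rule that the last path segment is the page itself are hypothetical, and the squarified layout step is left out.

```python
from urllib.parse import urlsplit

def new_node():
    # pages/words include subfolders; direct counts only pages sitting in this folder.
    return {"pages": 0, "direct": 0, "words": 0, "children": {}}

def build_tree(pages):
    """pages: iterable of (url, word_count) pairs. Returns the root folder node."""
    root = new_node()
    for url, words in pages:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        node = root
        node["pages"] += 1
        node["words"] += words
        for folder in segments[:-1]:  # assumption: the last segment is the page itself
            node = node["children"].setdefault(folder, new_node())
            node["pages"] += 1   # aggregate counts into every ancestor folder
            node["words"] += words
        node["direct"] += 1      # the page lives directly in the deepest folder reached
    return root

tree = build_tree([
    ("https://example.com/", 100),
    ("https://example.com/blog/a", 300),
    ("https://example.com/blog/2024/b", 500),
])
blog = tree["children"]["blog"]
```

In this example the blog node reports 2 pages, 1 direct page, and 800 words, which corresponds to the Pages, Direct, and Total words columns of the folder table.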
Related Reports
- Topics & Focus — Topical clustering analysis
- Link Overview — Internal link structure and click depth
- Crawl — Full page list with filtering and search