Internal Link Architecture Explained: Hub-and-Spoke, Link Depth, and PageRank Flow

Jun 8, 2026 · 11 min read
Site architecture is the mechanism by which a website distributes two resources that are always finite: PageRank and crawl budget. Every internal link is both a crawl pathway and an authority transfer. The hub-and-spoke model organizes these pathways deliberately, concentrating authority at the top of the hierarchy and flowing it downward to content that needs to rank. Pages more than four clicks from the homepage receive systematically lower crawl priority. Orphan pages receive no internal PageRank at all. Redirect chains lose authority at every hop through compounding decay. This article teaches site architecture as a crawl and authority-distribution problem, because that is the frame that explains why link decisions produce ranking consequences.


Site Architecture as a Crawl and Authority Problem

Most internal linking decisions are made with navigation in mind: can users find what they need? That is a valid concern, but it is secondary to the question that determines whether search engines will rank any page at all: can Googlebot find it, and does it carry enough authority to compete?

A page can have excellent content, a clean URL structure, and a correct sitemap entry. If it sits seven clicks from the homepage, receives two internal links from low-authority tag pages, and is not included in any navigation element, it will carry near-zero PageRank and receive intermittent crawling at best. No amount of content quality or external link building fully compensates for an architecture that starves pages of internal authority.

The decision to treat architecture as a user-experience problem produces sites that are navigable but not crawlable. The decision to treat it as a crawl and authority problem produces sites where every important page is both reachable and well-supplied with ranking signals.


PageRank is a page-level score calculated iteratively across the entire link graph. Every page begins with a base score. When Page A links to Page B, it passes a fraction of its PageRank to B. The more PageRank A has, and the fewer outbound links it carries, the more authority each individual link passes.[1][2]

The original PageRank formula from Brin and Page's 1998 paper is:

\[\mathrm{PR}(A)=(1-d)+d\sum_i\frac{\mathrm{PR}(T_i)}{C(T_i)}\]

Where d is the damping factor (approximately 0.85), PR(Tᵢ) is the PageRank of each page linking to A, and C(Tᵢ) is the total number of outbound links on that page. The formula is computed iteratively across the full graph until scores converge.
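As a rough illustration of the iteration, the sketch below applies the formula above to a four-page toy graph. The page names, the `pagerank` helper, and the fixed iteration count are assumptions made for this example; a real implementation would also need to handle pages with no outbound links and test for convergence.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over a small link graph.

    `links` maps each page to the set of pages it links to (hypothetical data).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    scores = {page: 1.0 for page in pages}  # every page starts with the same base score
    for _ in range(iterations):
        updated = {}
        for page in pages:
            # Sum PR(T) / C(T) over every page T that links to `page`.
            inbound = sum(
                scores[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            updated[page] = (1 - d) + d * inbound
        scores = updated
    return scores


# Toy site: the homepage links to one hub, which links to two spokes and back home.
site = {
    "home": {"hub"},
    "hub": {"home", "spoke-a", "spoke-b"},
    "spoke-a": {"hub"},
    "spoke-b": {"hub"},
}
scores = pagerank(site)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the hub ends up with the highest score, because three pages link to it and each of them has only one outbound link, so each passes its full share.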

The Damping Factor: Why Authority Decays with Each Hop

The damping factor of approximately 0.85 models the probability that a random web user continues clicking rather than stopping. With each additional link hop, the transmitted authority is multiplied by 0.85. Setting aside the division across outbound links, a page one hop from the homepage receives about 85% of the share the homepage sends through that link. A page two hops away receives roughly 72%, and a page three hops away roughly 61%. The authority that a deeply buried page receives from a high-PageRank source diminishes geometrically with each intermediary.
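Written out, and ignoring how each page's share is split across its outbound links, the fraction that survives n hops is d raised to the n-th power (the symbol n is introduced here only for illustration):

\[\begin{aligned}d^{1}&=0.85^{1}=0.85\\d^{2}&=0.85^{2}=0.7225\\d^{3}&=0.85^{3}\approx 0.614\\d^{4}&=0.85^{4}\approx 0.522\\d^{5}&=0.85^{5}\approx 0.444\end{aligned}\]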

This decay is not a flaw in the algorithm. It is intentional: it prevents authority from artificially concentrating in link loops and models the diminishing likelihood that a user exploring a site will reach a page buried many clicks deep.

How Outbound Links Divide the Authority Flow

When a hub page links to ten spoke pages, each spoke receives one-tenth of the hub's available authority. When the same hub links to fifty pages, each receives one-fiftieth. This means that navigation elements linking to every page on the site are efficient for user access but inefficient for authority concentration. A footer containing 200 links to individual blog posts dilutes the authority those links carry to the point of near-meaninglessness.
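The division can be sketched in a few lines. The `authority_per_link` helper and the hub score of 1.0 are hypothetical, chosen only to make the ratios easy to read; the calculation follows the d * PR(T) / C(T) term of the formula.

```python
def authority_per_link(page_score, outbound_links, d=0.85):
    """Share of a page's score passed through each of its outbound links."""
    if outbound_links < 1:
        raise ValueError("a page must have at least one outbound link to pass authority")
    return d * page_score / outbound_links


hub_score = 1.0  # hypothetical hub score, not a measured value
for link_count in (10, 50, 200):
    share = authority_per_link(hub_score, link_count)
    print(f"{link_count} links -> {share:.5f} per link")
```

Going from 10 links to 50 cuts the per-link share by a factor of five, which is the dilution the footer example describes.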

Strategic internal linking concentrates links on the pages that most need authority: pages targeting competitive queries, new pages that need indexing momentum, and conversion pages that produce business results.


What Is the Hub-and-Spoke Architecture?

The hub-and-spoke model organizes a site's content into clusters. Each cluster has a central hub page covering a broad topic with depth, and a set of spoke pages covering specific subtopics that link back to the hub. The hub links to every spoke; every spoke links back to the hub; related spokes cross-link where topically relevant.

The Three Tiers

A typical hub-and-spoke architecture has three tiers that correspond to a natural hierarchy of authority and specificity:

Tier 1: Homepage. The highest-authority page on most sites, carrying the most inbound backlinks and serving as the crawl entry point. It links to the major hub pages and concentrates authority there.

Tier 2: Hub pages. Category pages, pillar articles, or service pages covering a topic area comprehensively. They receive authority from the homepage and pass it to spoke pages. Examples: a product category page linking to every product in that category, or a pillar article on "technical SEO" linking to individual articles on crawl budget, canonicalization, and site speed.

Tier 3: Spoke pages. Individual content pieces, product pages, or detailed guides on specific subtopics. They receive authority from their hub, link back to reinforce it, and cross-link to adjacent spokes where relevant.

The Bidirectional Linking Rule and Why It Works

The hub-and-spoke model requires links in both directions: hub-to-spoke and spoke-to-hub. This bidirectionality serves two functions. First, the hub-to-spoke link distributes the hub's authority to spoke pages that need it to rank. Second, the spoke-to-hub link consolidates authority back to the hub page, reinforcing its position as the topical center of the cluster and improving its ranking for the cluster's broadest queries.

When an external backlink lands on any spoke page, some of that authority flows back to the hub through the spoke-to-hub internal link. The entire cluster rises together when any member earns external authority, because of this recirculation.

Notice how this architecture differs from a flat blog structure where every post links to every other post randomly. The hub-and-spoke model concentrates authority at deliberate focal points. A flat structure diffuses it across the entire graph with no prioritization.
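One way to make the bidirectional rule checkable is to represent a cluster as a link map and look for spokes that fail to link back. This is a minimal sketch: the URLs, `build_cluster`, and `missing_backlinks` are illustrative names, and `cross_links` is assumed to reference only spokes that belong to the cluster.

```python
def build_cluster(hub, spokes, cross_links=()):
    """Return a link map where the hub and every spoke link to each other."""
    links = {hub: set(spokes)}          # hub-to-spoke links
    for spoke in spokes:
        links[spoke] = {hub}            # spoke-to-hub links
    for first, second in cross_links:   # optional links between related spokes
        links[first].add(second)
        links[second].add(first)
    return links


def missing_backlinks(links, hub):
    """List spokes the hub links to that do not link back to the hub."""
    return sorted(spoke for spoke in links[hub] if hub not in links.get(spoke, set()))


cluster = build_cluster(
    "/technical-seo/",
    ["/crawl-budget/", "/canonicalization/", "/site-speed/"],
    cross_links=[("/crawl-budget/", "/site-speed/")],
)
print(missing_backlinks(cluster, "/technical-seo/"))  # an empty list means the rule holds
```

Running the same check against a link map exported from a real crawl would surface spokes that were published without a link back to their hub.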


Why Does Click Depth Determine Crawl Priority?

Click depth is the minimum number of links Googlebot must follow from the homepage to reach a given page. It is a crawl priority signal because Googlebot's discovery process is a breadth-first-like traversal: it prioritizes pages it can reach quickly over pages it must chain many link hops to reach.[4]

Data from large-scale crawl audits quantifies this effect. Analysis of crawl behavior across sites of varying depth shows that sites with an average page depth under 4 clicks achieve a crawl ratio of approximately 58 percent. Sites with an average depth of 6 to 9 clicks see that ratio drop to approximately 39 percent. Every additional layer of depth reduces the probability that Googlebot will reach and reindex a page within a given crawl cycle.

The practical threshold used in this curriculum, which is consistent with industry guidance, is that important pages should be reachable within 4 clicks of the homepage. Pages beyond that depth are at meaningfully higher risk of being discovered inconsistently, crawled infrequently, and accumulating less internal PageRank through the geometric decay described above.
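Because click depth is a shortest-path measure, a breadth-first search over the internal link graph computes it directly. The sketch below assumes the link graph is already available as a dictionary; the URLs and helper names are hypothetical, and the 4-click default simply mirrors the threshold above.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: minimum number of clicks from `start` to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:        # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_beyond(depths, max_depth=4):
    """Pages deeper than the threshold, deepest first."""
    deep = [(page, depth) for page, depth in depths.items() if depth > max_depth]
    return sorted(deep, key=lambda item: (-item[1], item[0]))


# Hypothetical site where a paginated archive buries one article six clicks deep.
site = {
    "/": ["/blog/", "/products/"],
    "/blog/": ["/blog/page-2/"],
    "/blog/page-2/": ["/blog/page-3/"],
    "/blog/page-3/": ["/blog/page-4/"],
    "/blog/page-4/": ["/blog/page-5/"],
    "/blog/page-5/": ["/blog/old-article/"],
    "/products/": ["/products/widget/"],
}
depths = click_depths(site, "/")
print(pages_beyond(depths))
```

Pages that never appear in `depths` are unreachable from the start page by internal links, which is a separate problem from being too deep.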

| Click depth | Approximate crawl ratio (from audit data) | Authority received relative to a homepage link |
|---|---|---|
| 1 click | High | ~85% (one damping step) |
| 2 clicks | High | ~72% (two damping steps) |
| 3 clicks | Moderate | ~61% (three damping steps) |
| 4 clicks | Moderate | ~52% (four damping steps) |
| 5+ clicks | Low (drops to ~39% average) | ~44% at five steps, declining with each further step |

This table makes the architectural case for shallow site structures: reducing click depth simultaneously improves crawl coverage and increases the PageRank received by every affected page.


What Happens to PageRank When a Page Is an Orphan?

An orphan page has no inbound internal links from any other page on the same site. The PageRank formula shows the consequence directly: if no PR(Tᵢ) values exist for a page because no other pages link to it, the sum is empty and the page receives only the base term of the formula, 1 - d, or approximately 0.15. Its internal PageRank is near zero regardless of the site's overall authority.

Orphan pages face two compounding problems, as covered in article 2.3:

Discovery failure. With no inbound internal links, Googlebot has no link path to follow to reach the page. It remains invisible to normal crawl traversal and depends entirely on sitemap submission or external backlinks to enter the frontier.

Authority starvation. Even when Google discovers an orphan page through a sitemap, the page receives no PageRank distribution from the rest of the site. Without internal authority, the page starts at a severe ranking disadvantage for every query it targets, regardless of its content quality.

The fix is always the same: add at least one contextually relevant internal link from an already-indexed, authoritative page. For pages that need competitive ranking potential, multiple internal links from high-authority pages in the same topic cluster produce proportionally better results.
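Finding the pages that need this fix is a set difference: every known URL minus every URL that some other page links to. The sketch below is a minimal version of that idea; `find_orphans`, its inputs, and the example URLs are hypothetical, and the `entry_points` parameter is an assumption added so the homepage is not flagged.

```python
def find_orphans(known_pages, links, entry_points=("/",)):
    """Pages that no other page on the site links to.

    `known_pages` is every URL you expect to exist (for example, from a sitemap);
    `links` maps each page to the pages it links to. Both are hypothetical inputs.
    """
    linked = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source          # a self-link does not count as an inbound link
    }
    return sorted(set(known_pages) - linked - set(entry_points))


known_pages = ["/", "/guides/", "/guides/redirects/", "/guides/orphans/", "/landing/2019-promo/"]
links = {
    "/": ["/guides/"],
    "/guides/": ["/", "/guides/redirects/", "/guides/orphans/"],
    "/guides/redirects/": ["/guides/"],
    "/landing/2019-promo/": ["/landing/2019-promo/", "/"],
}
print(find_orphans(known_pages, links))
```

The promo page links out to the homepage and to itself, but nothing links in, so it is the only orphan reported.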


How Do Redirect Chains Lose Authority?

This article models a single well-implemented 301 redirect as passing approximately 95 percent of the link equity from the origin URL to the destination URL. That figure is a conservative working assumption rather than a published measurement: Google's stated position, confirmed by Gary Illyes in 2016, is that 30x redirects no longer lose PageRank.[3]

The problem is chains. When URL A redirects to URL B, which redirects to URL C, the 5 percent loss applies again at each hop, so retention compounds downward:

\[\begin{aligned}1\text{-hop chain: }&0.95^1=95.0\%\text{ retained}\\2\text{-hop chain: }&0.95^2=90.3\%\text{ retained}\\3\text{-hop chain: }&0.95^3=85.7\%\text{ retained}\\4\text{-hop chain: }&0.95^4=81.5\%\text{ retained}\\5\text{-hop chain: }&0.95^5=77.4\%\text{ retained}\end{aligned}\]

A backlink pointing at the start of a 5-hop redirect chain delivers only about 77 percent of its authority to the final destination. If ten backlinks point at that chain, each one loses the same share, so the site is effectively discarding about 23 percent of the external authority those links earned.

Redirect chains accumulate naturally over the life of a site: a URL gets restructured, then the restructured URL gets migrated to a new domain, then the domain switches to HTTPS, then a subfolder is reorganized. Each event adds a hop. Left unaudited, core pages on mature sites can accumulate 3-to-5 hop chains that silently bleed authority from every external link pointing at them.

Google's documentation notes that Googlebot follows up to 10 redirect hops in a chain before abandoning the request; the lower five-hop limit applies to robots.txt fetches. A chain longer than the limit produces a crawl failure: the destination is not fetched, any content there is not indexed, and the link equity is not transferred at all.

The fix: audit all redirect chains and flatten each one to a single direct 301 from the origin URL to the final destination. Intermediate URLs that still receive traffic or backlinks should each redirect straight to the final destination rather than to one another.
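Flattening can be sketched as resolving each redirect source to the end of its chain. The code below assumes the redirect rules have been exported as a simple source-to-target mapping; `flatten_redirects`, the example URLs, and the `max_hops` safety cap are illustrative choices, not the behaviour of any particular server or crawler.

```python
def flatten_redirects(redirects, max_hops=20):
    """Map every redirecting URL straight to its final destination.

    `redirects` maps a source URL to its immediate redirect target.
    Returns {source: (final_url, hops)}; final_url is None for loops
    or for chains longer than `max_hops` (an arbitrary safety cap).
    """
    flattened = {}
    for origin in redirects:
        current, hops, seen = origin, 0, {origin}
        while current in redirects:
            current = redirects[current]
            hops += 1
            if current in seen or hops > max_hops:
                current = None       # loop or runaway chain: needs manual review
                break
            seen.add(current)
        flattened[origin] = (current, hops)
    return flattened


# Hypothetical chain: HTTP to HTTPS, then a rename, then a folder move.
chain = {
    "http://example.com/old": "https://example.com/old",
    "https://example.com/old": "https://example.com/new",
    "https://example.com/new": "https://example.com/guides/new",
}
for source, (final, hops) in flatten_redirects(chain).items():
    print(f"{source} -> {final} (was {hops} hops)")
```

Each source in the output can then be rewritten as a single 301 to its final URL, and any entry whose final URL is `None` points at a loop that has to be untangled by hand.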


A practical audit follows four steps that move from the high level to the specific.

Step 1: Map click depth distribution. Use a crawler tool (Screaming Frog, Sitebulb, Ahrefs Site Audit) starting from your homepage. The resulting report shows the click depth of every crawled page. Identify any important pages (high-commercial-value, high-traffic, or newly published) sitting at depth 5 or greater. These are priority targets for internal link additions.

Step 2: Identify orphan pages. Compare the crawler's results against your XML sitemap. Pages that appear in the sitemap but were not reached by the crawler following internal links are confirmed orphans. The URL Inspection tool confirms this: a "None detected" result for the referring page field means no crawled link path exists to that URL.

Step 3: Trace redirect chains. Screaming Frog's redirect chain report identifies all multi-hop chains across the site. Sort by chain length and address the longest chains first, particularly those involving URLs with significant inbound backlink counts.

Step 4: Audit PageRank distribution. Tools like Ahrefs (Internal Link Rank), Sitebulb, and Botify simulate internal PageRank distribution and identify pages with disproportionately low authority relative to their importance. Pages sitting in hub positions should carry the most internal PageRank; if a key category page has less internal authority than an old blog post linked from a popular sidebar, the architecture is not serving the site's priority pages.
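The sitemap comparison in Step 2 can be scripted once the crawler's URL list is exported. This is a minimal sketch using Python's standard XML parser; the inline sitemap, the `crawled` list, and both helper names are hypothetical, and a real audit would load the sitemap and the crawl export from files and normalize URLs before comparing them.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a standard XML sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text}


def sitemap_only_urls(xml_text, crawled_urls):
    """URLs listed in the sitemap that the crawl never reached via internal links."""
    return sorted(sitemap_urls(xml_text) - set(crawled_urls))


SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/guides/</loc></url>
  <url><loc>https://example.com/landing/promo/</loc></url>
</urlset>"""

# Hypothetical export of URLs a crawler reached by following links from the homepage.
crawled = ["https://example.com/", "https://example.com/guides/"]
print(sitemap_only_urls(SITEMAP, crawled))
```

Any URL this returns is a candidate orphan to verify before adding internal links to it.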

Sources

  1. Page, L., Brin, S., Motwani, R., and Winograd, T. "The PageRank Citation Ranking: Bringing Order to the Web." Stanford InfoLab Technical Report 1999-66. 1998/1999.

  2. Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schutze. Introduction to Information Retrieval, Chapter 21: Link Analysis. Cambridge University Press, 2008.

  3. Search Engine Roundtable. "Google: Any 301, 302, 3xx Redirect Does Not Lose PageRank At All." July 2016.

  4. Botify. "Crawl Ratio, Render Ratio, & Why They Matter For SEO." Botify blog, July 2019.

Frequently Asked Questions (FAQs)

What is the hub-and-spoke model in SEO?

The hub-and-spoke model is an internal linking architecture where a central hub page covering a broad topic links to multiple spoke pages covering specific subtopics, and each spoke links back to the hub. The bidirectional linking concentrates PageRank at the hub, distributes it to spokes, and ensures that external authority landing on any cluster page flows back to reinforce the hub's topical dominance.

How does click depth affect SEO?

Click depth is the number of link hops from the homepage to a given page. Google's BFS-like crawl traversal means pages at greater depth receive proportionally less crawl priority. Audit data shows sites with average depth under 4 clicks achieve a crawl ratio of approximately 58 percent; sites with depth 6-9 drop to 39 percent. Each additional depth level also adds a PageRank damping step of approximately 0.85, reducing authority transmitted to deeper pages.

Do redirect chains lose PageRank?

Under the per-hop model used in this article, yes. A single 301 redirect is modeled as passing approximately 95 percent of link equity (Google's Gary Illyes stated in 2016 that 30x redirects no longer lose PageRank), and each additional hop in a chain applies that 5 percent decay again. A 3-hop chain retains approximately 85.7 percent; a 5-hop chain retains only 77.4 percent. Googlebot also stops following a redirect chain after 10 hops, causing a complete crawl and authority failure beyond that point.

Why does an orphan page rank poorly even when it has good content?

An orphan page (no inbound internal links) receives near-zero PageRank from the site's internal graph because no other page's authority flows to it. Content quality is a ranking factor, but it operates on top of the base authority signal. A high-quality orphan page competing against lower-quality pages with strong internal authority typically loses, because Google's ranking systems weight authority heavily. Adding internal links to an orphan page directly increases its ranking potential.

How many internal links should a hub page have?

There is no universal number, but the strategic principle is: a hub page should link to every spoke in its cluster and receive links from all spokes, the homepage or a parent navigation element, and contextually related pages across the site. The practical range for a well-defined topic cluster is typically 5 to 20 spoke pages per hub. Significantly more than 20 often indicates over-segmentation; fewer than 5 may indicate the topic is not broad enough to warrant a dedicated cluster.

How do I fix orphan pages?

Identify them by comparing your sitemap's URL list against your crawler's discovered URL list. Any sitemap URL not reachable by following internal links from the homepage is an orphan. The fix is to add at least one contextually relevant internal link from an already-indexed page to the orphan. For pages that need competitive ranking potential, add links from topic-adjacent hub pages or from the most authoritative pages in the same cluster.
