Site architecture is the mechanism by which a website distributes two resources that are always finite: PageRank and crawl budget. Every internal link is both a crawl pathway and an authority transfer. The hub-and-spoke model organizes these pathways deliberately, concentrating authority at the top of the hierarchy and flowing it downward to content that needs to rank. Pages more than four clicks from the homepage receive systematically lower crawl priority. Orphan pages receive no internal PageRank at all. Redirect chains lose authority at every hop through compounding decay. This article teaches site architecture as a crawl and authority-distribution problem, because that is the frame that explains why link decisions produce ranking consequences.
Site Architecture as a Crawl and Authority Problem
Most internal linking decisions are made with navigation in mind: can users find what they need? That is a valid concern, but it is secondary to the question that determines whether search engines will rank any page at all: can Googlebot find it, and does it carry enough authority to compete?
A page can have excellent content, a clean URL structure, and a correct sitemap entry. If it sits seven clicks from the homepage, receives two internal links from low-authority tag pages, and is not included in any navigation element, it will carry near-zero PageRank and receive intermittent crawling at best. No amount of content quality or external link building fully compensates for an architecture that starves pages of internal authority.
The decision to treat architecture as a user-experience problem produces sites that are navigable but not crawlable. The decision to treat it as a crawl and authority problem produces sites where every important page is both reachable and well-supplied with ranking signals.
How Does PageRank Flow Through Internal Links?
PageRank is a page-level score calculated iteratively across the entire link graph. Every page begins with a base score. When Page A links to Page B, it passes a fraction of its PageRank to B. The more PageRank A has, and the fewer outbound links it carries, the more authority each individual link passes.[1][2]
The original PageRank formula from Brin and Page's 1998 paper is:
$$PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)$$
Where d is the damping factor (approximately 0.85), PR(Tᵢ) is the PageRank of each page linking to A, and C(Tᵢ) is the total number of outbound links on that page. The formula is computed iteratively across the full graph until scores converge.
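The iteration described above can be sketched in a few lines of Python. This is a minimal illustration of the unnormalized formula, not how any search engine computes it at scale. The function name `pagerank`, the fixed iteration count, and the four-page `site` graph are assumptions made for the example.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over every page T linking to A."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # every page starts from the same base score
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page contributes its score divided by its outbound link count.
            inbound = sum(
                pr[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * inbound
        pr = new_pr
    return pr


# Hypothetical site: the homepage links to one hub, the hub links to two spokes,
# and every page links back up.
site = {
    "home": ["hub"],
    "hub": ["home", "spoke-a", "spoke-b"],
    "spoke-a": ["hub"],
    "spoke-b": ["hub"],
}
scores = pagerank(site)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.2f}")
```

In this toy graph the hub ends up with the highest score because every other page links to it, while the two spokes receive identical scores because they occupy identical positions in the graph.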
The Damping Factor: Why Authority Decays with Each Hop
The damping factor of approximately 0.85 models the probability that a random web user continues clicking rather than stopping. With each additional link hop, the transmitted authority is multiplied by 0.85. Setting aside how each page divides its authority among its outbound links, a page one hop from a source receives 85% of the authority that source passes. A page two hops away receives roughly 72% (0.85²). Three hops: about 61% (0.85³). The authority that a deeply buried page receives from a high-PageRank source diminishes geometrically with each intermediary.
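The percentages above are powers of the damping factor. A short sketch makes the arithmetic explicit; the helper name `authority_after_hops` is an assumption for this example, and the calculation deliberately ignores how each page splits authority across its outbound links.

```python
DAMPING = 0.85  # approximate damping factor cited in the text


def authority_after_hops(hops, d=DAMPING):
    """Fraction of authority remaining after `hops` consecutive link hops."""
    return d ** hops


for hops in range(1, 6):
    print(f"{hops} hop(s): {authority_after_hops(hops):.0%}")
# 1 hop(s): 85%
# 2 hop(s): 72%
# 3 hop(s): 61%
# 4 hop(s): 52%
# 5 hop(s): 44%
```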
This decay is not a flaw in the algorithm. It is intentional: it prevents authority from artificially concentrating in link loops and models the diminishing likelihood that a user exploring a site will reach a page buried many clicks deep.
How Outbound Links Divide the Authority Flow
When a hub page links to ten spoke pages, each spoke receives one-tenth of the hub's available authority. When the same hub links to fifty pages, each receives one-fiftieth. This means that navigation elements linking to every page on the site are efficient for user access but inefficient for authority concentration. A footer containing 200 links to individual blog posts dilutes the authority those links carry to the point of near-meaninglessness.
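The dilution is simple division. As a rough sketch, with `share_per_link` a hypothetical helper name, the share of a page's passable authority carried by each link falls as the link count grows:

```python
def share_per_link(outbound_links):
    """Fraction of a page's passable authority carried by each of its outbound links."""
    if outbound_links < 1:
        raise ValueError("a page needs at least one outbound link to pass authority")
    return 1 / outbound_links


for count in (10, 50, 200):
    print(f"{count} links: each carries {share_per_link(count):.1%}")
# 10 links: each carries 10.0%
# 50 links: each carries 2.0%
# 200 links: each carries 0.5%
```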
Strategic internal linking concentrates links on the pages that most need authority: pages targeting competitive queries, new pages that need indexing momentum, and conversion pages that produce business results.
What Is the Hub-and-Spoke Architecture?
The hub-and-spoke model organizes a site's content into clusters. Each cluster has a central hub page covering a broad topic with depth, and a set of spoke pages covering specific subtopics that link back to the hub. The hub links to every spoke; every spoke links back to the hub; related spokes cross-link where topically relevant.
The Three Tiers
A typical hub-and-spoke architecture has three tiers that correspond to a natural hierarchy of authority and specificity:
Tier 1: Homepage. The highest-authority page on most sites, carrying the most inbound backlinks and serving as the crawl entry point. It links to the major hub pages and concentrates authority there.
Tier 2: Hub pages. Category pages, pillar articles, or service pages covering a topic area comprehensively. They receive authority from the homepage and pass it to spoke pages. Examples: a product category page linking to every product in that category, or a pillar article on "technical SEO" linking to individual articles on crawl budget, canonicalization, and site speed.
Tier 3: Spoke pages. Individual content pieces, product pages, or detailed guides on specific subtopics. They receive authority from their hub, link back to reinforce it, and cross-link to adjacent spokes where relevant.
The Bidirectional Linking Rule and Why It Works
The hub-and-spoke model requires links in both directions: hub-to-spoke and spoke-to-hub. This bidirectionality serves two functions. First, the hub-to-spoke link distributes the hub's authority to spoke pages that need it to rank. Second, the spoke-to-hub link consolidates authority back to the hub page, reinforcing its position as the topical center of the cluster and improving its ranking for the cluster's broadest queries.
When an external backlink lands on any spoke page, some of that authority flows back to the hub through the spoke-to-hub internal link. The entire cluster rises together when any member earns external authority, because of this recirculation.
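The bidirectional rule is easy to express as a link graph and to check mechanically. The sketch below is one possible representation, assuming a dictionary that maps each URL to the set of URLs it links to. The function names and the example paths are hypothetical.

```python
def build_cluster_links(hub, spokes, related=()):
    """Return the internal links for one cluster: hub <-> spokes, plus related spoke pairs."""
    links = {hub: set(spokes)}       # hub links to every spoke
    for spoke in spokes:
        links[spoke] = {hub}         # every spoke links back to the hub
    for a, b in related:             # cross-links between topically related spokes
        links[a].add(b)
        links[b].add(a)
    return links


def missing_bidirectional_links(links, hub):
    """List (source, target) pairs that would be needed to satisfy the hub <-> spoke rule."""
    missing = []
    for spoke in (page for page in links if page != hub):
        if spoke not in links[hub]:
            missing.append((hub, spoke))
        if hub not in links[spoke]:
            missing.append((spoke, hub))
    return missing


hub = "/technical-seo"
spokes = ["/crawl-budget", "/canonicalization", "/site-speed"]
cluster = build_cluster_links(hub, spokes, related=[("/crawl-budget", "/canonicalization")])
print(missing_bidirectional_links(cluster, hub))  # → []
```

Running the same check against a graph exported from a real crawl would surface spokes that never link back to their hub, which is the more common failure in practice.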
Notice how this architecture differs from a flat blog structure where every post links to every other post randomly. The hub-and-spoke model concentrates authority at deliberate focal points. A flat structure diffuses it across the entire graph with no prioritization.
Why Does Click Depth Determine Crawl Priority?
Click depth is the minimum number of links Googlebot must follow from the homepage to reach a given page. It is a crawl priority signal because Googlebot's discovery process is a breadth-first-like traversal: it prioritizes pages it can reach quickly over pages it must chain many link hops to reach.[4]
Data from large-scale crawl audits quantifies this effect. Analysis of crawl behavior across sites of varying depth shows that sites with an average page depth under 4 clicks achieve a crawl ratio of approximately 58 percent. Sites with an average depth of 6 to 9 clicks see that ratio drop to approximately 39 percent. Every additional layer of depth reduces the probability that Googlebot will reach and reindex a page within a given crawl cycle.
The practical threshold used in this curriculum, which is consistent with industry guidance, is that important pages should be reachable within 4 clicks of the homepage. Pages beyond that depth are at meaningfully higher risk of being discovered inconsistently, crawled infrequently, and accumulating less internal PageRank through the geometric decay described above.
| Click depth | Approximate crawl ratio (from audit data) | Authority received relative to homepage link |
|---|---|---|
| 1 click | High | ~85% (one damping step) |
| 2 clicks | High | ~72% (two damping steps) |
| 3 clicks | Moderate | ~61% (three damping steps) |
| 4 clicks | Moderate | ~52% (four damping steps) |
| 5+ clicks | Low (drops to ~39% average) | ~44% or less, and declining |
This table makes the architectural case for shallow site structures: reducing click depth simultaneously improves crawl coverage and increases the PageRank received by every affected page.
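Click depth, as defined at the start of this section, is a shortest-path distance, so it can be computed with a breadth-first search over the internal link graph. The following is a minimal sketch; `click_depths` and the example URLs are assumptions, and a dedicated crawler reports the same figure without any code.

```python
from collections import deque


def click_depths(links, homepage):
    """Return the minimum number of clicks from `homepage` to every reachable page."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit in BFS is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph.
site = {
    "/": ["/guides", "/products"],
    "/guides": ["/guides/crawl-budget"],
    "/guides/crawl-budget": ["/guides/crawl-budget/log-analysis"],
    "/products": [],
}
depths = click_depths(site, "/")
too_deep = sorted(url for url, depth in depths.items() if depth > 4)
```

Pages that never appear in the result are unreachable through internal links from the homepage.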
What Happens to PageRank When a Page Is an Orphan?
An orphan page has no inbound internal links from any other page on the same site. The PageRank formula shows the consequence directly: if no PR(Tᵢ) values exist for a page because no other pages link to it, the sum is empty and the page receives only the base constant from the formula (1 - d, approximately 0.15 in the unnormalized form). Its PageRank stays at that floor regardless of the site's overall authority.
Orphan pages face two compounding problems, as covered in article 2.3:
Discovery failure. With no inbound internal links, Googlebot has no link path to follow to reach the page. It remains invisible to normal crawl traversal and depends entirely on sitemap submission or external backlinks to enter the frontier.
Authority starvation. Even when Google discovers an orphan page through a sitemap, the page receives no PageRank distribution from the rest of the site. Without internal authority, the page starts at a severe ranking disadvantage for every query it targets, regardless of its content quality.
The fix is always the same: add at least one contextually relevant internal link from an already-indexed, authoritative page. For pages that need competitive ranking potential, multiple internal links from high-authority pages in the same topic cluster produce proportionally better results.
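Given a link graph and a list of pages that should exist, orphans are the pages whose inbound internal link count is zero. The sketch below assumes the graph is a dictionary of URL to linked URLs and that the list of known pages comes from somewhere like the sitemap. The function names and URLs are hypothetical.

```python
def inbound_link_counts(links, known_pages):
    """Count distinct internal pages linking to each page, ignoring self-links."""
    counts = {page: 0 for page in known_pages}
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] = counts.get(target, 0) + 1
    return counts


def find_orphans(links, known_pages, homepage="/"):
    """Return known pages with no inbound internal links, excluding the homepage."""
    counts = inbound_link_counts(links, known_pages)
    return sorted(p for p in known_pages if counts[p] == 0 and p != homepage)


links = {
    "/": ["/guides"],
    "/guides": ["/", "/guides/a"],
}
known = ["/", "/guides", "/guides/a", "/landing/old-campaign"]
print(find_orphans(links, known))  # → ['/landing/old-campaign']
```

The same counts also show which non-orphan pages rely on a single inbound link and may benefit from additional links within their cluster.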
How Do Redirect Chains Lose Authority?
A single well-implemented 301 redirect passes approximately 95 percent of the link equity from the origin URL to the destination URL. This is consistent with Google's official position, confirmed by Gary Illyes in 2016, that 30x redirects no longer lose significant PageRank.[3]
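The problem is chains. When URL A redirects to URL B, which redirects to URL C, the loss compounds at each hop. Assuming each hop retains about 95 percent, a chain of n hops retains roughly 0.95ⁿ of the original equity: about 90 percent after two hops, 86 percent after three, 81 percent after four, and 77 percent after five.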
A backlink pointing at the start of a 5-hop redirect chain delivers only about 77 percent of its authority to the final destination. However many backlinks point into such a chain, the site is effectively discarding 23 percent of the external authority they earned; with ten backlinks, that is the equivalent of losing more than two of them.
Redirect chains accumulate naturally over the life of a site: a URL gets restructured, then the restructured URL gets migrated to a new domain, then the domain switches to HTTPS, then a subfolder is reorganized. Each event adds a hop. Left unaudited, core pages on mature sites can accumulate 3-to-5 hop chains that silently bleed authority from every external link pointing at them.
Google's documentation notes that Googlebot follows at most five redirect hops in a chain before abandoning the request. A six-hop chain produces a crawl failure: the destination is not fetched, any content there is not indexed, and the link equity is not transferred at all.
The fix: audit all redirect chains and flatten each one to a single direct 301 from the origin URL to the final destination. The intermediate URLs in a multi-hop chain should not exist.
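Flattening can be sketched as resolving every redirecting URL to its final destination and rewriting the rule to point there directly. The code below assumes the redirect rules are available as a simple origin-to-target dictionary. The function names, the `max_hops` safety limit, and the example URLs are all assumptions for illustration, and the 0.95 per-hop figure is the approximation used in this article rather than a measured value.

```python
def resolve_chain(start, redirects, max_hops=20):
    """Follow redirects from `start`; return (final_url, hops). Raise on loops or long chains."""
    seen = {start}
    url, hops = start, 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen:
            raise ValueError(f"redirect loop detected at {url}")
        if hops > max_hops:
            raise ValueError(f"chain from {start} exceeds {max_hops} hops")
        seen.add(url)
    return url, hops


def flatten(redirects):
    """Map every redirecting URL straight to its final destination."""
    return {origin: resolve_chain(origin, redirects)[0] for origin in redirects}


# Hypothetical chain built up by a restructure, an HTTPS switch, and a folder move.
redirects = {
    "http://example.com/old": "http://example.com/new",
    "http://example.com/new": "https://example.com/new",
    "https://example.com/new": "https://example.com/guides/new",
}
final, hops = resolve_chain("http://example.com/old", redirects)
print(final, hops, f"{0.95 ** hops:.0%}")  # https://example.com/guides/new 3 86%
flat = flatten(redirects)  # every origin now needs only one hop
```

Sorting the results of `resolve_chain` by hop count gives the priority order for cleanup, longest chains first.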
How Do You Audit Your Internal Link Architecture?
A practical audit follows four steps that move from the high level to the specific.
Step 1: Map click depth distribution. Use a crawler tool (Screaming Frog, Sitebulb, Ahrefs Site Audit) starting from your homepage. The resulting report shows the click depth of every crawled page. Identify any important pages (high-commercial-value, high-traffic, or newly published) sitting at depth 5 or greater. These are priority targets for internal link additions.
Step 2: Identify orphan pages. Compare the crawler's results against your XML sitemap. Pages that appear in the sitemap but were not reached by the crawler following internal links are confirmed orphans. The URL Inspection tool confirms this: a "None detected" result for the referring page field means no crawled link path exists to that URL.
Step 3: Trace redirect chains. Screaming Frog's redirect chain report identifies all multi-hop chains across the site. Sort by chain length and address the longest chains first, particularly those involving URLs with significant inbound backlink counts.
Step 4: Audit PageRank distribution. Tools like Ahrefs (Internal Link Rank), Sitebulb, and Botify simulate internal PageRank distribution and identify pages with disproportionately low authority relative to their importance. Pages sitting in hub positions should carry the most internal PageRank; if a key category page has less internal authority than an old blog post linked from a popular sidebar, the architecture is not serving the site's priority pages.
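Steps 1 and 2 reduce to two set operations once the crawler's output is exported. The sketch below assumes the crawl export has been loaded into a dictionary of URL to click depth and the sitemap into a list of URLs. The function name `audit`, the result keys, and the sample data are hypothetical, and real exports from Screaming Frog or Sitebulb would need their own parsing.

```python
def audit(crawl_depths, sitemap_urls, important_urls, max_depth=4):
    """Flag important pages deeper than `max_depth` and sitemap URLs the crawl never reached."""
    reached = set(crawl_depths)
    return {
        # Step 1: important pages at depth 5 or greater.
        "deep_important": sorted(
            url for url in important_urls
            if url in reached and crawl_depths[url] > max_depth
        ),
        # Step 2: in the sitemap, but not reachable by following internal links.
        "sitemap_orphans": sorted(set(sitemap_urls) - reached),
    }


crawl_depths = {"/": 0, "/shop": 1, "/shop/widgets": 2, "/blog/2019/archive/post": 6}
sitemap_urls = ["/", "/shop", "/shop/widgets", "/blog/2019/archive/post", "/landing/promo"]
important_urls = ["/shop/widgets", "/blog/2019/archive/post"]

report = audit(crawl_depths, sitemap_urls, important_urls)
print(report)
# {'deep_important': ['/blog/2019/archive/post'], 'sitemap_orphans': ['/landing/promo']}
```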
Sources
[1] Page, L., Brin, S., Motwani, R., and Winograd, T. "The PageRank Citation Ranking: Bringing Order to the Web." Stanford InfoLab Technical Report 1999-66. 1998/1999.
[2] Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval, Chapter 21: Link Analysis. Cambridge University Press, 2008.
[3] Search Engine Roundtable. "Google: Any 301, 302, 3xx Redirect Does Not Lose PageRank At All." July 2016.
[4] Botify. "Crawl Ratio, Render Ratio, & Why They Matter For SEO." Botify blog, July 2019.





