October 6, 2023 • 9 min read
One of the most common quick-win opportunities that I see when auditing sites right now is resolving stacks and stacks of orphan pages.
And one of the fastest ways to drive growth on a site that has lots of these in place is to clean them up. Essentially, help Google to find the pages that matter and make sure there are signals across your site that highlight the most important content. At the very least, make sure you’re not hiding important content away in a place that’s barely, or even not at all, accessible to crawlers or users.
In this guide, you’ll learn how to quickly find orphan pages on your site and some of the best ways to fix these, as well as a little more on why they’re maybe more of an issue than you previously thought.
An orphan page is one that isn’t linked to from any other page on the site and, therefore, cannot be accessed unless the direct link is known.
They’re often pages that Google doesn’t know exist.
These are pages that are cut off from the rest of the site, and this can impact their ability to rank as prominently on the SERPs as they otherwise could do.
Source: Semrush
And whilst whether a page is linked to from elsewhere on your site might not be something you really stop to think about when hitting publish, pages can end up orphaned for numerous reasons.
Think about this scenario…
Whilst a simplified scenario, this is common. Way more common than you’d expect, especially when publishing a page and adding internal links involves multiple teams. Sometimes it’s not as easy as just doing the things we know we need to do.
And things get forgotten (but that’s why I’m the first to recommend building a solid monthly SEO workflow to help you find issues like this regularly, as well as things like linked 404s, to be able to keep on top of them).
You could be forgiven for thinking that, as long as Google knows your page exists (usually because it’s listed in your XML sitemap), it’ll rank.
But that’s not necessarily the case, and to explain this in a little more detail, let’s take a look at the real reasons why orphan pages are an issue.
In fact, a quick read of Google’s ‘How Search Works’ guide should give all the justification you need to see why orphan pages can result in this content not performing on the SERPs…
“Most of our Search index is built through the work of software known as crawlers. These automatically visit publicly accessible web pages and follow links on those pages, much like you would if you were browsing content on the web. They go from page to page and store information about what they find on these pages and other publicly accessible content in Google’s Search index.” – How Google Search organises information
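To make that crawling behaviour concrete, here is a minimal sketch of link-following discovery in Python. The `site_links` data and the `reachable_pages` helper are hypothetical, invented purely for illustration, and this is a simplification rather than how Google's crawler is actually implemented. It shows only that a page with no inbound links is never reached when discovery starts from the homepage.

```python
from collections import deque


def reachable_pages(links, start):
    """Return every page a link-following crawler can reach from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each key is a page, each value lists the pages it links to.
site_links = {
    "/": ["/blog/", "/services/"],
    "/blog/": ["/blog/post-a/"],
    "/services/": ["/"],
    "/blog/post-a/": [],
    "/landing/old-campaign/": ["/"],  # links out, but nothing links in
}

crawled = reachable_pages(site_links, "/")
orphans = sorted(set(site_links) - crawled)
print(orphans)  # ['/landing/old-campaign/']
```

Note that the orphaned page links out to the homepage, but that doesn't help: what matters is whether anything links in.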
And SEO aside, there are other reasons why you should be taking the time to find and fix orphan pages, too.
The main issues with orphan pages are that:
Of course, it’s one thing finding and fixing orphan pages, but successful SEO should be as much about preventing issues from happening again as it is about resolving ones that already exist.
If you don’t take the steps to find the cause of issues and make changes to avoid the same thing happening again, you’ll always be stuck fixing things after they’ve happened.
Be proactive, not reactive.
So with that in mind, what are some of the most common causes of orphan pages?
There may be times when you intentionally have orphan pages, such as landing pages from paid search, and these are fine. If you’re purposefully creating orphan pages for a specific reason, there’s no need for concern; we’re talking here about orphan pages that shouldn’t exist.
In short, if you want a page to rank on the search engines, it shouldn’t be orphaned.
So how do you go about finding orphan pages?
It very much depends on your SEO toolset of choice, and rather than go into detail on every one of these, here are links to guides for the most common crawlers:
Personally, I’m a fan of using Semrush’s Site Audit tool (connected to Google Analytics) to find orphan pages that are either found within a site’s XML sitemap or that have had pageviews tracked in GA.
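If you want to sanity-check what a tool reports, the underlying idea is a simple comparison: URLs listed in the XML sitemap minus URLs that a crawl found via internal links. The sketch below is a rough illustration in Python, not how Semrush or any other crawler works internally. The sample sitemap, the `linked_urls` set (which you would export from your crawler of choice) and the helper names are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from an XML sitemap string."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text}


def normalise(url):
    """Treat URLs with and without a trailing slash as the same page."""
    return url.strip().rstrip("/")


def find_orphan_candidates(sitemap_xml, linked_urls):
    """Return sitemap URLs that no crawled internal link points to."""
    in_sitemap = {normalise(u) for u in sitemap_urls(sitemap_xml)}
    linked = {normalise(u) for u in linked_urls}
    return sorted(in_sitemap - linked)


# Hypothetical sitemap content.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
  <url><loc>https://example.com/guides/orphan-pages/</loc></url>
</urlset>"""

# Hypothetical crawl export: URLs found as internal link targets.
linked_urls = {"https://example.com/", "https://example.com/blog"}

candidates = find_orphan_candidates(sample_sitemap, linked_urls)
print(candidates)  # ['https://example.com/guides/orphan-pages']
```

Treat the output as a list of candidates to review rather than a definitive answer, since intentional orphan pages (such as paid search landing pages) will show up here too.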
As there’s more than one cause of orphan pages, there’s also more than one way to go about fixing these issues.
Which approach you take depends on what the cause is, of course, but effective approaches to eliminating orphan pages are as follows:
Your site’s most important pages should be accessible from your site’s main navigation menus. It’s as simple as that.
And whilst there’s often a tendency for non-SEOs to want to try and oversimplify menus for various reasons, restructuring these is one of the most common reasons why orphan pages exist in the first place.
Whilst there’s never going to be an ideal way to include everything you want to include in any menu, ask yourself when fixing orphan page issues whether each page you’re looking at should realistically be in the main menu.
Take a look at ASOS’ menu for inspiration on navigation prioritisation…
Realistically, there are hundreds of links that could be included here, but prioritisation has been used to determine those deemed to be most important, likely by a mix of both search traffic potential and revenue opportunities.
Maybe the most impactful way to fix orphan page issues and ensure that they’re receiving the contextual signals and PageRank that they need in order to rank to their full potential is by internally linking through to these.
Add internal links from other content on the site (but make sure they fit naturally; don’t force them in for the sake of it), being sure to consider:
By strategically internally linking, you’re fully eliminating the issue of orphan pages, making sure you’re passing both link authority (PageRank) and contextual signals, and opening up the realistic possibility of ranking gains.
Whilst I wouldn’t recommend relying only on an HTML sitemap to resolve orphan page issues, if your site doesn’t have one, go ahead and get one in place.
At the bare minimum, an HTML sitemap means that all of these pages are at least accessible, but they won’t be acquiring the contextual signals that internal linking should pass.
That said, having an HTML sitemap is something I commonly recommend that all sites put in place, if only to help the discovery of new content.
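At its simplest, an HTML sitemap is just a page containing a plain list of links. As a rough sketch of the idea in Python (the `build_html_sitemap` helper and the sample pages are hypothetical, and a real implementation would pull URLs and titles from your CMS):

```python
from html import escape


def build_html_sitemap(pages):
    """Render (url, title) pairs as a plain HTML list of crawlable links."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in pages
    )
    return f"<ul>\n{items}\n</ul>"


# Hypothetical pages to include.
pages = [
    ("/", "Home"),
    ("/blog/", "Blog"),
    ("/services/seo-audits/", "SEO Audits & Reviews"),
]

sitemap_html = build_html_sitemap(pages)
print(sitemap_html)
```

Because the links are written as standard anchor tags in the HTML itself, they can be followed without any JavaScript being rendered.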
If you’re using WordPress, you can easily create an HTML sitemap using one of these plugins:
And if you’re using another CMS, take a look at these guides:
One thing that I’ve seen a couple of times in recent months is orphan pages being flagged on a site crawl that, on first look, shouldn’t be.
These specific examples were the faceted navigation pages on Magento 2 stores.
In these cases, the links were available on the page, and users could access these pages via the faceted navigation … but search engines couldn’t!
These links were being generated via client-side JavaScript, and the bulk of these hadn’t actually been indexed by Google.
This is just a quick reminder to check and double-check that Google can actually crawl links as HTML across your site. On these sites, we also discovered that internal link blocks on category pages were being generated in this way…
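One quick way to check this is to look at the raw HTML response (before any JavaScript runs) and see which links are actually present as anchor tags. The Python sketch below uses the standard library's `html.parser` on a hypothetical snippet; the `LinkCollector` class, the sample markup and the URLs are assumptions for illustration, not taken from the Magento 2 sites mentioned above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_in_raw_html(html_text):
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.hrefs


# Hypothetical page source: one real link, one link injected by a script.
raw_html = """
<nav><a href="/category/shoes/">Shoes</a></nav>
<div id="filters"></div>
<script>
  document.getElementById("filters").innerHTML =
    '<a href="/category/shoes/?colour=red">Red</a>';
</script>
"""

found = links_in_raw_html(raw_html)
print(found)  # ['/category/shoes/']
```

The filter link only exists once the script has run in a browser, so it never appears in the list. If links you expect to see are missing from the raw HTML, that's a strong hint they're being generated client-side.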
Resolving orphan page issues is a quick-win way to help these pages rank better on the SERPs, yet these issues are so commonly left to stack up.
Be sure to run this as a monthly check. It’s one of those things that’s way easier to keep on top of monthly, and checking for these issues should be part of your ongoing workflow.
If you see success from this, I’d love to see the impact! Drop me a line on Twitter or LinkedIn and share your wins!