Non-Indexable Pages with Bot Hits

All pages that are non-indexable but still receive Googlebot hits.

Priority: Low

Impact: Negative

What issues it may cause

Crawl budget is being spent on pages that cannot drive any organic search traffic. That budget could be better spent on higher-value pages or on pages that are updated more frequently.
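To see where the crawl budget is going, the report rows can be ranked by bot hits. The following is a minimal sketch, assuming the rows have already been fetched and parsed into Python dictionaries keyed by the field names used in the query on this page (`url`, `logRequestsTotal`). The function name and the sample rows are hypothetical.

```python
def rank_by_bot_hits(rows, top=10):
    """Return (total bot hits, rows with the most bot hits first)."""
    # Treat a missing or null logRequestsTotal as zero hits.
    def hits(row):
        return row.get("logRequestsTotal") or 0

    total = sum(hits(row) for row in rows)
    ranked = sorted(rows, key=hits, reverse=True)
    return total, ranked[:top]


# Made-up sample rows, for illustration only.
sample_rows = [
    {"url": "https://example.com/a", "logRequestsTotal": 5},
    {"url": "https://example.com/b", "logRequestsTotal": 12},
    {"url": "https://example.com/c", "logRequestsTotal": None},
]
total_hits, worst = rank_by_bot_hits(sample_rows, top=2)
```

The pages at the top of the ranking are the ones where a fix is likely to free up the most crawl budget.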

How do you fix it

The pages can be disallowed in robots.txt to prevent them from being crawled, although this also prevents any PageRank from their backlinks being passed on to other pages.
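As an illustration of the disallow option, the sketch below turns a list of URLs into robots.txt rules. It assumes all URLs belong to the same host, and the function name and example URLs are hypothetical. Disallow rules match by prefix, so each rule also blocks every URL whose path starts with the same characters. Review the output before publishing it.

```python
from urllib.parse import urlsplit


def disallow_rules(urls, user_agent="*"):
    """Build robots.txt lines that disallow the paths of the given URLs."""
    # Deduplicate paths; query strings are dropped because the path
    # prefix already covers them.
    paths = {urlsplit(url).path for url in urls}
    # Skip the root path: "Disallow: /" would block the whole site.
    paths = sorted(p for p in paths if p not in ("", "/"))
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines)


rules = disallow_rules([
    "https://example.com/cart",
    "https://example.com/search?q=a",
    "https://example.com/cart",
])
```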

Internal links to the pages can be removed, or have a nofollow applied, which will hide the links from search engines.
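To find which pages to edit, the rows can be grouped by the page they were found on. This is a rough sketch, assuming that `foundAtUrl` holds a page that links to the reported URL. It may not list every linking page, so treat the result as a starting point. The function name and sample rows are hypothetical.

```python
from collections import defaultdict


def targets_by_linking_page(rows):
    """Map each linking page to the non-indexable URLs it points at."""
    grouped = defaultdict(list)
    for row in rows:
        source = row.get("foundAtUrl")
        if source:  # skip rows with no known linking page
            grouped[source].append(row["url"])
    return dict(grouped)


# Made-up sample rows, for illustration only.
link_rows = [
    {"url": "https://example.com/cart", "foundAtUrl": "https://example.com/"},
    {"url": "https://example.com/login", "foundAtUrl": "https://example.com/"},
    {"url": "https://example.com/old", "foundAtUrl": None},
]
links_to_review = targets_by_linking_page(link_rows)
```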

The pages should be removed from any Sitemaps, because inclusion in a Sitemap signals to search engines that the pages have value and might have become indexable.
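The `foundInSitemap` field returned by the query on this page can be used to list the URLs that still need to be taken out of Sitemaps. This is a minimal sketch, assuming the rows have been parsed into dictionaries. The function name and sample rows are hypothetical.

```python
def urls_to_drop_from_sitemaps(rows):
    """Return the URLs of report rows that were found in a Sitemap."""
    return [row["url"] for row in rows if row.get("foundInSitemap")]


# Made-up sample rows, for illustration only.
sitemap_rows = [
    {"url": "https://example.com/cart", "foundInSitemap": True},
    {"url": "https://example.com/login", "foundInSitemap": False},
    {"url": "https://example.com/old", "foundInSitemap": True},
]
to_remove = urls_to_drop_from_sitemaps(sitemap_rows)
```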

What is the positive impact

Crawl budget spent on the non-indexable pages may be reduced, allowing it to be used on more important pages, or saving on server costs.

How to fetch the data for this report template

You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:

query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            pageTitle
            url
            foundAtUrl
            logRequestsTotal
            indexable
            httpStatusCode
            noindex
            canonicalizedPage
            nofollowedPage
            disallowedPage
            foundInGoogleAnalytics
            foundInGoogleSearchConsole
            foundInBacklinks
            foundInList
            foundInLogSummary
            foundInWebCrawl
            foundInSitemap
          }
        }
      }
    }
  }
}
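
A GraphQL query like this is usually sent as a JSON body with `query` and `variables` keys. The sketch below only builds that body and does not send it. The crawl ID and report template code are placeholders, not real values, and the query is trimmed to a few fields for brevity. The endpoint URL and authentication are not covered here and should be taken from the API documentation.

```python
import json

# Trimmed version of the query above; add the other fields as needed.
QUERY = """
query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            url
            logRequestsTotal
            foundInSitemap
          }
        }
      }
    }
  }
}
"""


def build_request_body(crawl_id, report_template_code):
    """Serialize the query and its variables as a JSON request body."""
    variables = {
        "crawlId": crawl_id,
        "reportTemplateCode": report_template_code,
    }
    return json.dumps({"query": QUERY, "variables": variables})


# Placeholder values (hypothetical); replace with your own.
body = build_request_body("YOUR_CRAWL_ID", "YOUR_REPORT_TEMPLATE_CODE")
```

The resulting string can be posted with any HTTP client, using a `Content-Type: application/json` header.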