Error Pages with Bot Hits

All pages that are invalid (return an error HTTP Status code) but still receive google bot hits

Priority: High

Impact: Negative

What issues it may cause

Pages which have changed to an error status, such as 404 or 410, need to be crawled by search engines in order for the error status to be discovered so the pages can be removed from the index.

Search engines also need to periodically recrawl pages which return an error status to check whether the pages have started working again; otherwise they would not be able to reindex them.

However, if a significant amount of crawl budget is being spent crawling permanently removed pages, that budget could be better used crawling and updating other pages.

How do you fix it

  • If a 404 status is being used, consider switching to a 410 status, which is a stronger indication that the page has been permanently removed and will not need to be crawled as often.

  • Internal links to the broken pages should be removed.
  • The error pages should be removed from any Sitemaps they are included in.
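The first fix above can be sketched in code. This is a minimal illustration, assuming a hypothetical set of permanently removed paths (`REMOVED_PATHS`) and known live paths; it is not tied to any particular web framework.

```python
# Sketch: return 410 Gone for permanently removed pages (stronger removal
# signal, crawled less often) and 404 Not Found for unknown ones.
# REMOVED_PATHS and the known_paths default are hypothetical examples.
from http import HTTPStatus

REMOVED_PATHS = {"/old-product", "/discontinued-range"}

def status_for(path: str, known_paths=frozenset({"/"})) -> int:
    """Pick an HTTP status code for a requested path."""
    if path in REMOVED_PATHS:
        return HTTPStatus.GONE.value        # 410: permanently removed
    if path not in known_paths:
        return HTTPStatus.NOT_FOUND.value   # 404: may be recrawled more often
    return HTTPStatus.OK.value              # 200: page exists
```

For example, `status_for("/old-product")` returns 410, while a path that was never published returns 404.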

What is the positive impact

Crawl budget can be saved so that other pages may be crawled more frequently, and server costs may be reduced.

How to fetch the data for this report template

You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:

query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            pageTitle
            url
            foundAtUrl
            deeprank
            level
            logRequestsTotal
            httpStatusCode
            indexable
            duplicatePage
            foundInGoogleAnalytics
            foundInGoogleSearchConsole
            foundInBacklinks
            foundInList
            foundInLogSummary
            foundInWebCrawl
            foundInSitemap
          }
        }
      }
    }
  }
}
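The query above can be sent as a standard GraphQL POST request. The sketch below uses only Python's standard library; the endpoint URL, the `Bearer` auth scheme, and the example report template code passed by the caller are assumptions for illustration, not values confirmed by this page.

```python
# Sketch: POST the GetReportForCrawl query to a GraphQL endpoint.
# The endpoint URL and Authorization header scheme are assumptions.
import json
import urllib.request

QUERY = """\
query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows { nodes { ... on CrawlUrls { url httpStatusCode logRequestsTotal } } }
    }
  }
}
"""

def build_payload(crawl_id: str, report_template_code: str) -> dict:
    """Assemble the JSON body a GraphQL endpoint expects."""
    return {
        "query": QUERY,
        "variables": {
            "crawlId": crawl_id,
            "reportTemplateCode": report_template_code,
        },
    }

def fetch_report(api_url: str, token: str, crawl_id: str,
                 report_template_code: str) -> dict:
    """POST the query and return the decoded JSON response."""
    req = urllib.request.Request(
        api_url,
        data=json.dumps(build_payload(crawl_id, report_template_code)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The response arrives under `data.getCrawl.reportsByCode`, matching the shape of the query.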
