
Unauthorized Pages

Pages that return a 401, 403, or 407 HTTP status code, indicating that the content could not be served. These pages will not be indexed by search engines.

Priority: Critical

Impact: Negative

What issues it may cause

These URLs will not be indexed in search engines, and the body content will be ignored.

Search engines will have to crawl the pages in order to discover the restricted status code, wasting crawl budget.

How do you fix it

These pages should be reviewed to confirm that access to them is genuinely meant to be restricted.
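As an aid to that review, the following is a minimal sketch that groups report rows by status code so each group can be checked in turn. It assumes the rows have already been fetched and loaded as dictionaries with the `url` and `httpStatusCode` fields; the function name and the sample rows are hypothetical and use made-up URLs.

```python
from collections import defaultdict

# The three status codes covered by this report.
RESTRICTED_CODES = {
    401: "Unauthorized",
    403: "Forbidden",
    407: "Proxy Authentication Required",
}


def group_by_restricted_status(rows):
    """Group URLs by restricted status code, ignoring any other codes."""
    grouped = defaultdict(list)
    for row in rows:
        code = row.get("httpStatusCode")
        if code in RESTRICTED_CODES:
            grouped[code].append(row["url"])
    return dict(grouped)


# Made-up sample rows, for illustration only.
sample_rows = [
    {"url": "https://example.com/account", "httpStatusCode": 401},
    {"url": "https://example.com/admin", "httpStatusCode": 403},
    {"url": "https://example.com/", "httpStatusCode": 200},
    {"url": "https://example.com/admin/users", "httpStatusCode": 403},
]

for code, urls in sorted(group_by_restricted_status(sample_rows).items()):
    print(code, RESTRICTED_CODES[code], urls)
```

Grouping by code can make the review quicker, since a cluster of 403 responses under one path often points to a single access rule rather than many separate problems.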

If the pages should not be restricted and could be indexed in search engines, they should be changed to return a 200 status code.

If the pages should be restricted, consider preventing the URLs from being crawled, for example with a disallow rule in robots.txt.
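One way to sanity-check a disallow rule before deploying it is sketched below, using Python's standard `urllib.robotparser` module. The robots.txt content, the `/account/` path, the URLs, and the `blocked_urls` helper are all hypothetical; substitute the paths that are actually restricted on your site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content blocking a restricted section.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
"""

# Made-up URLs: one restricted, one public.
URLS = [
    "https://example.com/account/settings",
    "https://example.com/blog/post",
]


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


print(blocked_urls(ROBOTS_TXT, URLS))
```

This only checks how the rules are interpreted by Python's parser. Individual search engines may handle some robots.txt syntax differently, so treat the result as a first check rather than a guarantee.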

What is the positive impact

Crawl budget spent crawling the restricted pages may be reduced, allowing it to be used on more important pages, or saving on server costs.

How to fetch the data for this report template

You will need to run a crawl for this report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the report data using the following query:

query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            pageTitle
            url
            description
            foundAtUrl
            deeprank
            level
            httpStatusCode
            foundInGoogleAnalytics
            foundInGoogleSearchConsole
            foundInBacklinks
            foundInList
            foundInLogSummary
            foundInWebCrawl
            foundInSitemap
          }
        }
      }
    }
  }
}
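The query takes two variables, `$crawlId` and `$reportTemplateCode`. The sketch below shows how a request body for it might be assembled in Python, following the common GraphQL-over-HTTP convention of a JSON object with `query` and `variables` keys. The helper name and both placeholder values are hypothetical, the field list is shortened for brevity, and the API endpoint and authentication are not covered here.

```python
import json

# Shortened version of the query above; add further fields as needed.
QUERY = """
query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            url
            httpStatusCode
            foundAtUrl
          }
        }
      }
    }
  }
}
"""


def build_request_body(crawl_id, report_template_code):
    """Serialize the query and its variables as a JSON request body."""
    payload = {
        "query": QUERY,
        "variables": {
            "crawlId": crawl_id,
            "reportTemplateCode": report_template_code,
        },
    }
    return json.dumps(payload)


# Placeholder values: replace with your own crawl ID and template code.
body = build_request_body("your-crawl-id", "your-report-template-code")
print(body[:80])
```

Passing the values as variables, rather than formatting them into the query string, keeps the query text constant and avoids quoting mistakes.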
