Unauthorized Pages

Pages that return a 401, 403, or 407 HTTP status code. These codes indicate that the content could not be served, so the pages will not be indexed in search engines.

Priority: Critical

Impact: Negative

What issues it may cause

These URLs will not be indexed in search engines, and the body content will be ignored.

Search engines will have to crawl the pages in order to discover the restricted status code, wasting crawl budget.

How do you fix it

These pages should be reviewed to confirm that access to them is genuinely meant to be restricted.
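As a starting point for that review, the sketch below groups report rows by status code so each group can be checked in turn. It assumes the rows are dictionaries carrying the `url` and `httpStatusCode` fields requested in the query for this report; the sample rows and the helper names `group_by_status` and `describe` are hypothetical.

```python
from collections import defaultdict
from http import HTTPStatus

# Status codes covered by this report, per the description above.
RESTRICTED_CODES = {401, 403, 407}


def group_by_status(nodes):
    """Group report rows by restricted status code for manual review."""
    groups = defaultdict(list)
    for node in nodes:
        code = node.get("httpStatusCode")
        if code in RESTRICTED_CODES:
            groups[code].append(node["url"])
    return dict(groups)


def describe(code):
    """Return a readable label such as '403 Forbidden'."""
    return f"{code} {HTTPStatus(code).phrase}"


# Hypothetical sample rows, for illustration only.
sample = [
    {"url": "https://example.com/account", "httpStatusCode": 401},
    {"url": "https://example.com/admin", "httpStatusCode": 403},
    {"url": "https://example.com/pricing", "httpStatusCode": 403},
]

for code, urls in sorted(group_by_status(sample).items()):
    print(describe(code), urls)
```

Grouping first makes it easier to spot URLs that look public by name, such as a pricing page, among pages that are expected to require a login.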

If the pages should not be restricted and could be indexed in search engines, they should be changed to return a 200 status code.

If the pages should be restricted, consider methods to prevent the URLs from being crawled, such as a disallow rule in robots.txt.
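Before deploying a disallow rule, it can help to check that it blocks the restricted URLs without blocking public ones. The following is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt rules, the `/account/` and `/admin/` paths, and the `is_crawlable` helper are hypothetical examples, not rules taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the paths are examples only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Disallow: /admin/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A restricted URL and a public URL, checked against the same rules.
print(is_crawlable(ROBOTS_TXT, "https://example.com/account/settings"))
print(is_crawlable(ROBOTS_TXT, "https://example.com/pricing"))
```

With these example rules, the first URL should be reported as blocked and the second as allowed. Note that the standard-library parser applies simple prefix matching, so individual search engines may interpret wildcard rules differently.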

What is the positive impact

Crawl budget spent on the restricted pages may be reduced, allowing it to be used on more important pages, or saving on server costs.

How to fetch the data for this report template

You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:

Query

query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes {
        pageTitle
        url
        description
        foundAtUrl
        deeprank
        level
        httpStatusCode
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInWebCrawl
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}

Variables

{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"unauthorised_pages"}

cURL

curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlUrls(after: $after, reportType: Basic) { nodes { pageTitle url description foundAtUrl deeprank level httpStatusCode foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"unauthorised_pages"}}' \
  https://api.lumar.io/graphql
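The query accepts an `after` cursor and returns `pageInfo { endCursor hasNextPage }`, so large reports can be read page by page. The sketch below shows one way to follow the cursor in Python using only the standard library. It assumes the response uses the standard GraphQL `data` envelope, requests a reduced set of fields for brevity, and sends only the content-type and auth headers from the cURL example; `post_graphql` and `fetch_all_nodes` are hypothetical helper names, not part of any official client.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"  # endpoint from the cURL example

# Same query as above, trimmed to a few fields to keep the sketch short.
QUERY = """
query GetReportStatForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!, $after: String) {
  getReportStat(input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes { url httpStatusCode foundAtUrl }
      totalCount
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(query, variables, token):
    """Send one GraphQL request; token is your API session token."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "x-auth-token": token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_nodes(crawl_id, run_query, report_code="unauthorised_pages"):
    """Follow endCursor until hasNextPage is false and collect every node."""
    nodes, after = [], None
    while True:
        variables = {
            "crawlId": crawl_id,
            "reportTemplateCode": report_code,
            "after": after,
        }
        result = run_query(QUERY, variables)
        # Assumed response shape: the standard GraphQL "data" envelope.
        page = result["data"]["getReportStat"]["crawlUrls"]
        nodes.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return nodes
        after = page["pageInfo"]["endCursor"]
```

To run it against the API, pass a function that supplies your token, for example `fetch_all_nodes("TjAwNUNyYXdsNDAwMA", lambda q, v: post_graphql(q, v, "YOUR_API_SESSION_TOKEN"))`. Taking the request function as a parameter keeps the paging loop easy to test without network access.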
