Disallowed Pages with Bot Hits

Pages that are disallowed but were still crawled by search engine crawlers.

Priority: Critical

Impact: Negative

What issues it may cause

These pages may have been disallowed during the timeframe of the log data (in which case you should confirm that they are intentionally disallowed), or the server logs may be recording a different URL from the one that was requested.

How do you fix it

If the pages were disallowed during the timeframe of the log data, they will disappear from this report when the timeframe of the log data is changed.

If the pages were not disallowed during the timeframe of the log data, the server logs should be checked to ensure they use the exact requested URLs.
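One way to spot-check a logged URL against your robots.txt rules is sketched below, using Python's standard `urllib.robotparser`. The helper name `is_disallowed`, the sample rules, and the example.com URLs are illustrative assumptions, not part of the Lumar API. The standard-library parser only approximates how search engines evaluate rules (for example, it does not implement wildcard matching), so treat the result as a hint rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def is_disallowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if `url` is blocked for `user_agent` by the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical rules and URLs, for illustration only.
SAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"

blocked = is_disallowed(SAMPLE_ROBOTS_TXT, "https://example.com/private/page")
open_page = is_disallowed(SAMPLE_ROBOTS_TXT, "https://example.com/public/page")
```

If the URL recorded in the log differs from the one the crawler requested (for instance, a rewritten path), the same check can give a different answer for each, which is why the logs need to record the exact requested URL.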

What is the positive impact

The log files will provide a more accurate understanding of crawl budget usage.

How to fetch the data for this report template

You will need to run a crawl so that this report template can generate a report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:

Query
Variables
cURL

query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes {
        pageTitle
        url
        description
        foundAtUrl
        logRequestsTotal
        deeprank
        level
        disallowedPage
        logRequestsDesktop
        logRequestsMobile
        robotsTxtRuleMatch
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInWebCrawl
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}

{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"disallowed_pages_with_bot_hits"}

curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlUrls(after: $after, reportType: Basic) { nodes { pageTitle url description foundAtUrl logRequestsTotal deeprank level disallowedPage logRequestsDesktop logRequestsMobile robotsTxtRuleMatch foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"disallowed_pages_with_bot_hits"}}' https://api.lumar.io/graphql
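The query returns one page of results at a time, with `pageInfo.endCursor` and `pageInfo.hasNextPage` indicating whether more pages remain. The following is a minimal Python sketch of following that cursor until the report is exhausted. It assumes the endpoint and `x-auth-token` header shown in the cURL example; the function names `post_graphql` and `fetch_all_nodes`, and the trimmed field list, are illustrative choices rather than part of the API.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"  # endpoint taken from the cURL example

# Trimmed copy of the query; add more node fields as needed.
QUERY = """
query GetReportStatForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!, $after: String) {
  getReportStat(input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes { url logRequestsTotal robotsTxtRuleMatch }
      totalCount
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(payload, token):
    """Send one GraphQL request and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_nodes(crawl_id, send, report_code="disallowed_pages_with_bot_hits"):
    """Collect nodes from every page; `send` takes a payload dict and returns JSON."""
    nodes, after = [], None
    while True:
        variables = {"crawlId": crawl_id, "reportTemplateCode": report_code, "after": after}
        result = send({"query": QUERY, "variables": variables})
        page = result["data"]["getReportStat"]["crawlUrls"]
        nodes.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return nodes
        # Pass the cursor of the last item back in to request the next page.
        after = page["pageInfo"]["endCursor"]
```

A call might look like `fetch_all_nodes("TjAwNUNyYXdsNDAwMA", lambda payload: post_graphql(payload, "YOUR_API_SESSION_TOKEN"))`. Passing the sender in as a parameter keeps the paging loop separate from the HTTP call, so it can be exercised with a stub; error handling for failed requests or GraphQL `errors` responses is omitted here.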

