Disallowed Pages with Bot Hits
Pages that are disallowed, but which were crawled by search engine crawlers
Priority: Critical
Impact: Negative
What issues it may cause
These pages may have been disallowed partway through the timeframe covered by the log data (in which case you should ensure that they are intentionally disallowed), or the server logs may be recording a different URL from the one that was actually requested.
How do you fix it
If the pages were disallowed during the timeframe of the log data, they will disappear from this report when the timeframe of the log data is changed.
If the pages were not disallowed during the timeframe of the log data, the server logs should be checked to ensure they use the exact requested URLs.
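To confirm that a logged URL is intentionally disallowed, its path can be tested against the site's robots.txt rules. The following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the user agent, and the URLs are hypothetical placeholders, not values taken from this report. The standard-library parser uses simple prefix matching, so its result may differ from how a given search engine interprets wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute the rules your site actually serves.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_disallowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text blocks user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical URLs as they might appear in the server logs.
logged_urls = [
    "https://www.example.com/private/report.html",
    "https://www.example.com/public/index.html",
]

for url in logged_urls:
    status = "disallowed" if is_disallowed(ROBOTS_TXT, "Googlebot", url) else "allowed"
    print(f"{url}: {status}")
```

A URL that is flagged as disallowed here but still shows bot hits is a candidate for the two causes described above: a rule that changed during the log timeframe, or a log entry that does not match the requested URL.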
What is the positive impact
The log files will provide a more accurate understanding of crawl budget usage.
How to fetch the data for this report template
You will need to run a crawl for this report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:
- Query
- Variables
- cURL
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes {
        pageTitle
        url
        description
        foundAtUrl
        logRequestsTotal
        deeprank
        level
        disallowedPage
        logRequestsDesktop
        logRequestsMobile
        robotsTxtRuleMatch
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInWebCrawl
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"disallowed_pages_with_bot_hits"}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlUrls(after: $after, reportType: Basic) { nodes { pageTitle url description foundAtUrl logRequestsTotal deeprank level disallowedPage logRequestsDesktop logRequestsMobile robotsTxtRuleMatch foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"disallowed_pages_with_bot_hits"}}' https://api.lumar.io/graphql
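The query accepts an `after` cursor and returns `pageInfo { endCursor hasNextPage }`, which suggests that results are paginated. The sketch below shows one way the pages might be collected in Python using only the standard library. It is an illustration under assumptions, not an official client: the function names `post_graphql` and `fetch_all_urls` are hypothetical, the response is assumed to follow the usual GraphQL shape with a top-level `data` key, only a subset of the fields is requested for brevity, and handling of a GraphQL `errors` payload is omitted.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"  # endpoint taken from the cURL example
REPORT_TEMPLATE_CODE = "disallowed_pages_with_bot_hits"

# Same operation as the documented query, trimmed to a few fields for brevity.
QUERY = """
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes { url logRequestsTotal robotsTxtRuleMatch }
      totalCount
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(payload, token):
    """POST a GraphQL payload and return the decoded JSON response (hypothetical helper)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_urls(crawl_id, token, send=post_graphql):
    """Follow endCursor until hasNextPage is false and return every node."""
    nodes = []
    after = None  # no cursor on the first request
    while True:
        variables = {
            "crawlId": crawl_id,
            "reportTemplateCode": REPORT_TEMPLATE_CODE,
            "after": after,
        }
        result = send({"query": QUERY, "variables": variables}, token)
        # Assumed response shape: data -> getReportStat -> crawlUrls.
        page = result["data"]["getReportStat"]["crawlUrls"]
        nodes.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return nodes
        after = page["pageInfo"]["endCursor"]
```

Calling `fetch_all_urls("TjAwNUNyYXdsNDAwMA", "YOUR_API_SESSION_TOKEN")` would request each page in turn. The `send` parameter exists only so the transport can be swapped out, for example with a stub when testing without network access.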