Non-Indexable Pages with Bot Hits
All pages that are non-indexable but still receive Googlebot hits.
Priority: Low
Impact: Negative
What issues it may cause
Crawl budget is being spent on pages that cannot drive any organic search traffic. That budget could be better spent on higher-value pages or on pages that are updated more frequently.
How do you fix it
The pages can be disallowed in robots.txt to prevent them from being crawled, although this also prevents any PageRank from their backlinks being passed on to other pages.
Internal links to the pages can be removed, or given a nofollow attribute, which will hide the links from search engines.
The pages should be removed from any sitemaps, because inclusion in a sitemap is a signal to search engines that the pages have value and might have become indexable.
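As a rough illustration, the three fixes above can be mapped onto the fields this report returns. The sketch below is a hypothetical heuristic, not part of the API: the function name `suggest_fixes` is invented here, and it assumes `disallowedPage`, `foundInWebCrawl`, and `foundInSitemap` are booleans with the meanings their names suggest.

```python
from typing import Dict, List


def suggest_fixes(node: Dict[str, object]) -> List[str]:
    """Map one report row to the fixes described above (illustrative only)."""
    fixes: List[str] = []
    # Assumption: disallowedPage is True when robots.txt already blocks the URL.
    if not node.get("disallowedPage"):
        fixes.append("consider disallowing in robots.txt")
    # Assumption: foundInWebCrawl is True when the crawler reached the URL via links.
    if node.get("foundInWebCrawl"):
        fixes.append("remove or nofollow internal links")
    # Assumption: foundInSitemap is True when the URL is listed in a sitemap.
    if node.get("foundInSitemap"):
        fixes.append("remove from sitemaps")
    return fixes


# Hypothetical row shaped like one entry of the report's `nodes` list.
example = {
    "url": "https://example.com/old-page",
    "disallowedPage": False,
    "foundInWebCrawl": True,
    "foundInSitemap": True,
}
print(suggest_fixes(example))
```

Because disallowing a page stops PageRank from its backlinks flowing onward, a row with `foundInBacklinks` set may deserve a manual review before applying the first suggestion.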
What is the positive impact
Crawl budget spent on non-indexable pages may be reduced, allowing it to be used on more important pages, or saving on server costs.
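To get a feel for how much crawl activity is involved, the bot requests recorded against the pages in this report can be totalled. The following is a minimal sketch, assuming each row is a dictionary with the `url` and `logRequestsTotal` fields returned by the report query in this document; the helper name `bot_hit_summary` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def bot_hit_summary(
    nodes: Iterable[Dict[str, object]], top_n: int = 3
) -> Tuple[int, List[Tuple[str, int]]]:
    """Return total bot requests and the top_n most-requested URLs."""
    # Treat a missing or null logRequestsTotal as zero requests.
    rows = [(str(n["url"]), int(n.get("logRequestsTotal") or 0)) for n in nodes]
    total = sum(hits for _, hits in rows)
    top = sorted(rows, key=lambda row: row[1], reverse=True)[:top_n]
    return total, top


# Hypothetical sample rows for illustration.
sample = [
    {"url": "https://example.com/a", "logRequestsTotal": 45},
    {"url": "https://example.com/b", "logRequestsTotal": 120},
    {"url": "https://example.com/c", "logRequestsTotal": None},
]
total_hits, busiest = bot_hit_summary(sample, top_n=2)
print(total_hits, busiest)
```

The pages at the top of this list are the ones where a fix is likely to free the most crawl activity.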
How to fetch the data for this report template
You will need to run a crawl for this report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the report data with the query below, which is followed by its variables and an equivalent cURL request:
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes {
        pageTitle
        url
        foundAtUrl
        logRequestsTotal
        indexable
        httpStatusCode
        noindex
        canonicalizedPage
        nofollowedPage
        disallowedPage
        unavailableAfter
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInWebCrawl
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"non_indexable_pages_with_bot_hits"}
curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlUrls(after: $after, reportType: Basic) { nodes { pageTitle url foundAtUrl logRequestsTotal indexable httpStatusCode noindex canonicalizedPage nofollowedPage disallowedPage unavailableAfter foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"non_indexable_pages_with_bot_hits"}}' \
  https://api.lumar.io/graphql
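The query exposes `after`, `endCursor`, and `hasNextPage`, so a report with many rows can be read page by page. The sketch below shows one way to do this in Python using only the standard library. It is an illustration under assumptions, not an official client: the helper names `post_graphql` and `iter_report_urls` are invented here, the query is trimmed to a few fields for brevity, and the response is assumed to follow the usual GraphQL shape with `data` and an optional `errors` key.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterator, Optional

API_URL = "https://api.lumar.io/graphql"  # endpoint from the cURL example

# Same operation as above, trimmed to a few fields to keep the sketch short.
QUERY = """
query GetReportStatForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!, $after: String) {
  getReportStat(input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes { url logRequestsTotal foundInSitemap }
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(payload: Dict, token: str) -> Dict:
    """POST a GraphQL payload with the x-auth-token header from the cURL example."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_report_urls(
    crawl_id: str,
    send: Callable[[Dict], Dict],
    template: str = "non_indexable_pages_with_bot_hits",
) -> Iterator[Dict]:
    """Yield every row of the report, following endCursor until hasNextPage is false."""
    after: Optional[str] = None
    while True:
        variables = {"crawlId": crawl_id, "reportTemplateCode": template, "after": after}
        result = send({"query": QUERY, "variables": variables})
        if result.get("errors"):
            raise RuntimeError(f"GraphQL errors: {result['errors']}")
        page = result["data"]["getReportStat"]["crawlUrls"]
        yield from page["nodes"]
        if not page["pageInfo"]["hasNextPage"]:
            break
        after = page["pageInfo"]["endCursor"]


# Usage (performs a real network call, so it is left commented out):
# rows = list(iter_report_urls(
#     "TjAwNUNyYXdsNDAwMA",
#     lambda payload: post_graphql(payload, "YOUR_API_SESSION_TOKEN"),
# ))
```

Passing the transport in as `send` keeps the pagination loop separate from the HTTP call, so the loop can be exercised with a stub before pointing it at the real API.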