Non-Indexable Pages with Bot Hits
All pages that are non-indexable but still receive Googlebot hits.
Priority: Low
Impact: Negative
What issues it may cause
Crawl budget is spent on pages that cannot drive any organic search traffic. That budget could be better spent on higher-value pages or on pages that are updated more frequently.
How do you fix it
The pages can be disallowed in robots.txt to prevent them from being crawled, although this also prevents any PageRank from their backlinks being passed to other pages.
Internal links to the pages can be removed, or given a nofollow attribute, which hides the links from search engines.
The pages should be removed from any Sitemaps, because inclusion in a Sitemap signals to search engines that the pages have value and might have become indexable.
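Before deploying a disallow rule, it can help to confirm that it blocks the intended pages and nothing else. The following is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the `/internal-search/` path, and the example.com URLs are hypothetical placeholders, not values from this report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; replace with the rules you plan to deploy.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal-search/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/internal-search/results",
        "https://example.com/products/shoes",
    ):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

Note that the standard-library parser follows the original robots.txt conventions and may not match every search engine's wildcard handling, so treat the result as a sanity check only.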
What is the positive impact
Crawl budget spent on the non-indexable pages may be reduced, allowing that budget to be used on more important pages, or saving on server costs.
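To get a rough sense of how much crawl activity is involved, the rows returned for this report can be totalled and ranked. This is only a sketch: it assumes each row is a dictionary keyed by the field names used in the report query on this page (`url`, `logRequestsTotal`, `foundInSitemap`), and that `logRequestsTotal` holds the number of bot requests recorded for the URL. The sample rows are invented for illustration.

```python
def summarize_bot_hits(rows):
    """Total the bot hits and rank URLs by hit count, highest first."""
    # Treat a missing or null logRequestsTotal as zero hits.
    hits = [(row["url"], row.get("logRequestsTotal") or 0) for row in rows]
    ranked = sorted(hits, key=lambda item: item[1], reverse=True)
    return {
        "total_hits": sum(count for _, count in hits),
        "ranked": ranked,
        # URLs still listed in a Sitemap are candidates for removal from it.
        "in_sitemap": [row["url"] for row in rows if row.get("foundInSitemap")],
    }


if __name__ == "__main__":
    # Hypothetical sample rows, not real report output.
    sample_rows = [
        {"url": "https://example.com/a", "logRequestsTotal": 12, "foundInSitemap": True},
        {"url": "https://example.com/b", "logRequestsTotal": 40, "foundInSitemap": False},
        {"url": "https://example.com/c", "logRequestsTotal": None, "foundInSitemap": False},
    ]
    print(summarize_bot_hits(sample_rows))
```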
How to fetch the data for this report template
You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the report data with the query, variables, and cURL request below:
query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            pageTitle
            url
            foundAtUrl
            logRequestsTotal
            indexable
            httpStatusCode
            noindex
            canonicalizedPage
            nofollowedPage
            disallowedPage
            foundInGoogleAnalytics
            foundInGoogleSearchConsole
            foundInBacklinks
            foundInList
            foundInLogSummary
            foundInWebCrawl
            foundInSitemap
          }
        }
      }
    }
  }
}
{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"non_indexable_pages_with_bot_hits"}
curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) { getCrawl(id: $crawlId) { reportsByCode( input: { reportTypeCodes: Basic reportTemplateCodes: [$reportTemplateCode] } ) { rows { nodes { ... on CrawlUrls { pageTitle url foundAtUrl logRequestsTotal indexable httpStatusCode noindex canonicalizedPage nofollowedPage disallowedPage foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } } } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"non_indexable_pages_with_bot_hits"}}' \
  https://api.lumar.io/graphql
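The same request can also be sent from a script. The sketch below uses only the Python standard library and mirrors the endpoint, `x-auth-token` header, and variables from the cURL example. The function names are illustrative, the selection set is trimmed to four of the fields for brevity, and the response is returned as decoded JSON without assuming its exact shape.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"

# Trimmed version of the report query; add further CrawlUrls fields as needed.
QUERY = """
query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            url
            logRequestsTotal
            foundInSitemap
            disallowedPage
          }
        }
      }
    }
  }
}
"""


def build_request(crawl_id, token,
                  template_code="non_indexable_pages_with_bot_hits"):
    """Build the POST request without sending it."""
    payload = {
        "query": QUERY,
        "variables": {"crawlId": crawl_id, "reportTemplateCode": template_code},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
        method="POST",
    )


def fetch_report(crawl_id, token):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(build_request(crawl_id, token)) as response:
        return json.load(response)
```

A call such as `fetch_report("TjAwNUNyYXdsNDAwMA", "YOUR_API_SESSION_TOKEN")` would issue the request; keeping `build_request` separate makes it possible to inspect the payload before any network traffic is sent.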