Broken/Disallowed Sitemaps
Sitemaps included in the crawl which return a broken status code (400, 404, 410, 500, or 501) or are disallowed by the robots.txt file.
Priority: Critical
Impact: Negative
What issues it may cause
Sitemaps must be crawlable and return a 200 status in order to be processed by search engines.
How do you fix it
If the Sitemaps are broken, review them to determine whether they are expected to work. If they are no longer used, remove them from the crawl and, where they are referenced, from the robots.txt file.
If the Sitemap URLs are disallowed, review the robots.txt file to identify the rules that block them, and change those rules to allow the Sitemaps to be crawled.
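The two checks above can be sketched locally before re-crawling. The following is a minimal illustration, not part of the product: the function name `sitemap_problem` and the example URLs are hypothetical, and the status code list is taken from the report description. It uses Python's standard `urllib.robotparser`, which may interpret some rules (for example wildcards) differently from a given search engine's crawler, so treat the result as a first pass only.

```python
from urllib.robotparser import RobotFileParser

# Status codes this report treats as broken (from the report description).
BROKEN_STATUS_CODES = {400, 404, 410, 500, 501}


def sitemap_problem(url, status_code, robots_txt, user_agent="*"):
    """Return "disallowed", "broken", or None for a single Sitemap URL.

    robots_txt is the text of the site's robots.txt file; status_code is
    the HTTP status the Sitemap URL returned when it was requested.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # A disallowed Sitemap is reported first, since crawlers that obey
    # robots.txt would not request it at all.
    if not parser.can_fetch(user_agent, url):
        return "disallowed"
    if status_code in BROKEN_STATUS_CODES:
        return "broken"
    return None


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    print(sitemap_problem("https://example.com/private/sitemap.xml", 200, robots))
    print(sitemap_problem("https://example.com/sitemap.xml", 404, robots))
```

A return value of `None` means neither condition was detected by this sketch; it does not guarantee that the Sitemap is valid XML or that search engines will process it.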
What is the positive impact
The Sitemap can be crawled and processed by search engines.
How to fetch the data for this report template
You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the report data using the following query:
- Query
- Variables
- cURL
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlSitemaps(after: $after, reportType: Basic) {
      nodes {
        url
        sitemapType
        urlCount
        httpStatusCode
        restrictedReason
        logRequestsTotal
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"sitemaps_broken_disallowed"}
curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlSitemaps(after: $after, reportType: Basic) { nodes { url sitemapType urlCount httpStatusCode restrictedReason logRequestsTotal } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"sitemaps_broken_disallowed"}}' \
  https://api.lumar.io/graphql
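The query accepts an `after` cursor and returns `pageInfo`, so results can be fetched page by page. The sketch below shows one way this might be done in Python using only the standard library. The helper names `post_graphql` and `iter_sitemaps` are hypothetical, the token is a placeholder as in the cURL example, and the code assumes the response follows the standard GraphQL shape with results under a top-level `data` key.

```python
import json
import urllib.request
from typing import Callable, Iterator

API_URL = "https://api.lumar.io/graphql"

QUERY = """
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlSitemaps(after: $after, reportType: Basic) {
      nodes { url sitemapType urlCount httpStatusCode restrictedReason logRequestsTotal }
      totalCount
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(variables: dict, token: str) -> dict:
    """Send the query with the same headers as the cURL example."""
    body = json.dumps({"query": QUERY, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "apollographql-client-name": "docs-example-client",
            "apollographql-client-version": "1.0.0",
            "x-auth-token": token,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_sitemaps(run_query: Callable[[dict], dict], crawl_id: str) -> Iterator[dict]:
    """Yield every Sitemap node, following endCursor until hasNextPage is false.

    run_query is any callable that takes the variables dict and returns the
    decoded JSON response, which keeps the paging logic testable offline.
    """
    variables = {
        "crawlId": crawl_id,
        "reportTemplateCode": "sitemaps_broken_disallowed",
        "after": None,
    }
    while True:
        result = run_query(variables)
        page = result["data"]["getReportStat"]["crawlSitemaps"]
        yield from page["nodes"]
        if not page["pageInfo"]["hasNextPage"]:
            break
        # Request the next page, starting after the last item received.
        variables = {**variables, "after": page["pageInfo"]["endCursor"]}
```

With a real session token, usage might look like `for sitemap in iter_sitemaps(lambda v: post_graphql(v, "YOUR_API_SESSION_TOKEN"), "TjAwNUNyYXdsNDAwMA"): print(sitemap["url"], sitemap["httpStatusCode"])`. Error handling (for example a response containing an `errors` key) is left out of this sketch.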