Sitemaps with >50,000 URLs
Sitemaps included in the crawl that contain more than 50,000 URLs.
Priority: Critical
Impact: Negative
What issues it may cause
A single sitemap may contain up to 50,000 URLs. URLs beyond this limit may be ignored by search engines, so those pages might not be discovered through the sitemap.
How do you fix it
If a sitemap contains more than 50,000 URLs, split it into smaller sitemaps with no more than 50,000 URLs each, and reference them from a sitemap index file.
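The split described above can be sketched as follows. This is a minimal illustration, not a complete sitemap generator: the function names (`chunk_urls`, `render_sitemap`, `render_sitemap_index`) are hypothetical, and it assumes you already have the full list of URLs and only need `<loc>` entries.

```python
from typing import Iterable, Iterator, List
from xml.sax.saxutils import escape

# Per-file URL limit from the sitemaps protocol.
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls: Iterable[str], size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` URLs (hypothetical helper)."""
    batch: List[str] = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


def render_sitemap(urls: Iterable[str]) -> str:
    """Render one sitemap file containing only <loc> entries."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


def render_sitemap_index(sitemap_urls: Iterable[str]) -> str:
    """Render a sitemap index that points at each of the smaller sitemaps."""
    entries = "\n".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</sitemapindex>\n"
    )
```

Each batch from `chunk_urls` would be written to its own file (for example `sitemap-1.xml`, `sitemap-2.xml`, which are placeholder names), and the resulting file URLs passed to `render_sitemap_index`.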
What is the positive impact
Search engines can discover all the pages included in the sitemaps and process additional information about them, such as when they were last updated and whether they have alternate language versions.
How to fetch the data for this report template
You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:
- Query
- Variables
- cURL
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlSitemaps(after: $after, reportType: Basic) {
      nodes {
        url
        sitemapType
        urlCount
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"sitemaps_too_many_urls"}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlSitemaps(after: $after, reportType: Basic) { nodes { url sitemapType urlCount } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"sitemaps_too_many_urls"}}' https://api.lumar.io/graphql
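The query accepts an `after` cursor and returns `pageInfo`, so results spanning several pages can be collected by passing each `endCursor` back in until `hasNextPage` is false. The sketch below shows one way to do that using only the Python standard library. The function names and the injectable `post` parameter are hypothetical, and the code assumes the response follows the usual GraphQL shape of a top-level `data` object with an optional `errors` list.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"

QUERY = """
query GetReportStatForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!, $after: String) {
  getReportStat(input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}) {
    crawlSitemaps(after: $after, reportType: Basic) {
      nodes { url sitemapType urlCount }
      totalCount
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(payload, token):
    """Send one GraphQL request and return the decoded JSON body."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_sitemaps(crawl_id, token, post=post_graphql):
    """Follow the `after` cursor until every page of sitemaps is collected.

    `post` is injectable so the loop can be exercised without network access.
    """
    variables = {
        "crawlId": crawl_id,
        "reportTemplateCode": "sitemaps_too_many_urls",
        "after": None,  # first page: no cursor
    }
    nodes = []
    while True:
        body = post({"query": QUERY, "variables": variables}, token)
        if body.get("errors"):
            # Assumed error shape: standard GraphQL `errors` list.
            raise RuntimeError(f"GraphQL errors: {body['errors']}")
        page = body["data"]["getReportStat"]["crawlSitemaps"]
        nodes.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return nodes
        variables["after"] = page["pageInfo"]["endCursor"]
```

Calling `fetch_all_sitemaps("TjAwNUNyYXdsNDAwMA", "YOUR_API_SESSION_TOKEN")` would return a list of dictionaries with `url`, `sitemapType` and `urlCount` keys, one per oversized sitemap.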