Disallowed/Malformed URLs in Sitemaps

URLs that were found in sitemaps but could not be crawled because they were disallowed or malformed.

Priority: Medium

Impact: Negative

What issues it may cause

Disallowed and malformed URLs cannot be crawled by search engines and should not be included within sitemaps.

How do you fix it

Review the robots.txt rules to confirm the URLs are intended to be disallowed; if they are, remove the disallowed URLs from the sitemaps, and if not, update the robots.txt rules.

Review the malformed URLs and either remove them from the sitemap or correct them so they are valid URLs.
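The two review steps above can be sketched with Python's standard library: `urllib.parse` to catch malformed URLs and `urllib.robotparser` to test URLs against robots.txt rules. The robots.txt content and URL list below are illustrative examples, not real site data.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Example robots.txt disallowing one path prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def classify_sitemap_urls(urls, robots_txt):
    """Split sitemap URLs into ok / disallowed / malformed buckets."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    ok, disallowed, malformed = [], [], []
    for url in urls:
        parts = urlsplit(url)
        # A valid sitemap URL needs an http(s) scheme and a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            malformed.append(url)
        elif not parser.can_fetch("*", url):
            disallowed.append(url)
        else:
            ok.append(url)
    return ok, disallowed, malformed

urls = [
    "https://example.com/page",
    "https://example.com/private/report",
    "htp:/example.com/broken",
]
ok, disallowed, malformed = classify_sitemap_urls(urls, ROBOTS_TXT)
```

URLs in the `disallowed` bucket should either be removed from the sitemap or have their robots.txt rules relaxed; URLs in `malformed` should be removed or corrected.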

What is the positive impact

Having clean sitemaps that contain only valid, indexable, and unique pages helps search engines like Google crawl, index, and update all of the important pages of your website more efficiently.

How to fetch the data for this report template

You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the report data using the following query:

query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUncrawledUrls(after: $after, reportType: Basic) {
      nodes {
        url
        foundAtUrl
        foundAtSitemap
        rewriteChain
        level
        restrictedReason
        robotsTxtRuleMatch
        foundInWebCrawl
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
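The query returns one page of results at a time; the `pageInfo` fields drive cursor pagination, where each request passes the previous page's `endCursor` as `$after` until `hasNextPage` is false. A minimal Python sketch of that loop is below. `execute_query` is a hypothetical callable that sends one GraphQL request (an authenticated HTTP POST to the API) and returns the parsed response data; the endpoint and authentication details are not shown here.

```python
def fetch_all_uncrawled_urls(execute_query, crawl_id, report_template_code):
    """Collect every node across all pages of crawlUncrawledUrls."""
    nodes, after = [], None
    while True:
        # One GraphQL request per page; `after` is None on the first page.
        data = execute_query(
            variables={
                "crawlId": crawl_id,
                "reportTemplateCode": report_template_code,
                "after": after,
            }
        )
        page = data["getReportStat"]["crawlUncrawledUrls"]
        nodes.extend(page["nodes"])
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            return nodes
        # Continue from where this page ended.
        after = info["endCursor"]
```

Because the transport is injected, the same loop works with any GraphQL client, and the pagination logic can be tested against stubbed responses.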