Pages with Repeated Paths
Priority: Critical
Impact: Negative
What issues it may cause
Google may not index a URL in which the same path is repeated 3 or more times, because it matches the pattern of a malformed URL that results in an infinite spider trap.
A significant number of malformed URLs may waste crawl budget, or cause indexing issues if the pages are duplicated and not corrected by a canonical tag.
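As a rough illustration of the pattern described above, the following sketch flags URLs in which any single path segment appears 3 or more times. The exact rule behind `urlContainsRepeatedPaths` is not stated here, so the function name `has_repeated_path`, the per-segment counting, and the default threshold are assumptions made for this example only.

```python
from collections import Counter
from urllib.parse import urlsplit


def has_repeated_path(url: str, threshold: int = 3) -> bool:
    """Return True if any path segment occurs `threshold` or more times.

    Assumption: repetition is counted per segment, anywhere in the path.
    The real check used by a crawler or by Google may differ.
    """
    # Split the path on "/" and drop empty segments from leading,
    # trailing, or doubled slashes.
    segments = [s for s in urlsplit(url).path.split("/") if s]
    counts = Counter(segments)
    return any(n >= threshold for n in counts.values())
```

For example, `has_repeated_path("https://example.com/blog/blog/blog/post")` is true, while the same URL with only two `blog` segments is not flagged.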
How do you fix it
- Identify whether the URL is created by malformed internal links. If it is, correct the links that point to the URL.
- Otherwise, the URL format may need to be modified to prevent the same path from being repeated.
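One common way a malformed internal link produces repeated paths is a relative `href` that is missing its leading slash. This is offered as a likely cause, not the only one, and the helper `follow_link` and the example URLs are hypothetical. The sketch uses Python's standard `urljoin` to show how a crawler that keeps following such a link gets a longer URL on every hop, while the root-relative form stays stable.

```python
from urllib.parse import urljoin


def follow_link(start_url: str, href: str, hops: int) -> list[str]:
    """Resolve the same href repeatedly, as a crawler following it would."""
    urls = [start_url]
    for _ in range(hops):
        # Each hop resolves the href against the URL reached so far.
        urls.append(urljoin(urls[-1], href))
    return urls


# Relative href without a leading slash: the path grows on every hop.
trap = follow_link("https://example.com/blog/", "blog/", 2)

# Root-relative href: every hop resolves to the same URL.
fixed = follow_link("https://example.com/blog/", "/blog/", 2)
```

Here `trap` ends at `https://example.com/blog/blog/blog/`, while every entry in `fixed` is `https://example.com/blog/`.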
What is the positive impact
The URLs are likely to be duplicates, and search engines may crawl them, wasting crawl budget and causing indexing issues. Fixing them avoids both problems.
How to fetch the data for this report template
You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:
- Query
- Variables
- cURL
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes {
        pageTitle
        url
        description
        foundAtUrl
        deeprank
        level
        urlContainsRepeatedPaths
        httpStatusCode
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInWebCrawl
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"pages_with_repeated_paths"}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlUrls(after: $after, reportType: Basic) { nodes { pageTitle url description foundAtUrl deeprank level urlContainsRepeatedPaths httpStatusCode foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"pages_with_repeated_paths"}}' https://api.lumar.io/graphql
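The query returns `pageInfo` with `endCursor` and `hasNextPage`, and accepts an `$after` variable, so large reports can be read page by page. The following is a minimal sketch of that loop in Python, assuming the response uses the standard GraphQL `data` wrapper. The helper names `post_graphql` and `iter_report_urls` are hypothetical, and the query is trimmed to a few of the fields listed above.

```python
import json
import urllib.request
from typing import Callable, Iterator

API_URL = "https://api.lumar.io/graphql"  # endpoint from the cURL example

# Trimmed version of the query above; add more node fields as needed.
QUERY = """
query GetReportStatForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!, $after: String) {
  getReportStat(input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes { url foundAtUrl httpStatusCode urlContainsRepeatedPaths }
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(payload: dict, token: str) -> dict:
    """Send one GraphQL request using the x-auth-token header from the cURL example."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_report_urls(
    crawl_id: str,
    post: Callable[[dict], dict],
    report_code: str = "pages_with_repeated_paths",
) -> Iterator[dict]:
    """Yield every node in the report, following endCursor until hasNextPage is false."""
    after = None  # None requests the first page
    while True:
        variables = {
            "crawlId": crawl_id,
            "reportTemplateCode": report_code,
            "after": after,
        }
        body = post({"query": QUERY, "variables": variables})
        # Assumption: results sit under the standard GraphQL "data" key.
        page = body["data"]["getReportStat"]["crawlUrls"]
        yield from page["nodes"]
        if not page["pageInfo"]["hasNextPage"]:
            break
        after = page["pageInfo"]["endCursor"]
```

A call might look like `iter_report_urls("TjAwNUNyYXdsNDAwMA", lambda p: post_graphql(p, "YOUR_API_SESSION_TOKEN"))`. Passing the transport in as `post` keeps the pagination logic separate from the HTTP call, so it can be exercised without network access.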