Duplicate Pages in SERPs
Duplicate pages which had impressions in Google Organic SERPs.
Priority: High
Impact: Negative
What issues it may cause
Although search engines will attempt to automatically identify duplicate pages and roll them together, this may not be completely effective on very large websites or those with a high churn of URLs.
Duplicate pages can dilute authority signals, which can hurt ranking performance. They also waste crawl budget, which reduces the crawl efficiency of the site.
How do you fix it
It's likely that the duplicates shown are not considered the primary version, but this should be verified. The duplicates can be eliminated by:
- removing internal links to the URLs
- redirecting all duplicate URLs to the primary URL in each set
- adding canonical tags that point to the primary URL in each set
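
The redirect and canonical options can be sketched in a few lines of Python. This is a minimal illustration, not part of any Lumar tooling: the `DUPLICATE_TO_PRIMARY` mapping, the example URLs, and both helper names are hypothetical, and it assumes you have already decided which URL is primary in each set. The redirect pairs are generic and would need translating into your server's own redirect syntax.

```python
from __future__ import annotations

from html import escape

# Hypothetical mapping of duplicate URL -> primary URL in its set.
DUPLICATE_TO_PRIMARY = {
    "https://example.com/shoes?sort=price": "https://example.com/shoes",
    "https://example.com/shoes/index.html": "https://example.com/shoes",
}


def canonical_tag(primary_url: str) -> str:
    """Return the <link> element to place in the <head> of a duplicate page."""
    return f'<link rel="canonical" href="{escape(primary_url, quote=True)}">'


def redirect_pairs(mapping: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source, target) pairs for permanent redirects, skipping self-references."""
    return [(dup, primary) for dup, primary in sorted(mapping.items()) if dup != primary]


if __name__ == "__main__":
    for source, target in redirect_pairs(DUPLICATE_TO_PRIMARY):
        print(f"301 {source} -> {target}")
    print(canonical_tag("https://example.com/shoes"))
```

Skipping self-references avoids creating a redirect from the primary URL to itself, which would cause a loop.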
What is the positive impact
Reducing the number of duplicate pages can avoid the dilution of PageRank, helping the remaining pages to rank better and resulting in more traffic and conversions.
Canonicalised or redirected pages will be crawled less often, improving crawl efficiency and saving on server costs.
How to fetch the data for this report template
You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:
Query:

query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes {
        pageTitle
        url
        foundAtUrl
        deeprank
        level
        searchConsoleTotalClicks
        searchConsoleTotalImpressions
        httpStatusCode
        indexable
        duplicatePage
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInWebCrawl
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}

Variables:

{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"duplicate_pages_in_serp"}

cURL:

curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlUrls(after: $after, reportType: Basic) { nodes { pageTitle url foundAtUrl deeprank level searchConsoleTotalClicks searchConsoleTotalImpressions httpStatusCode indexable duplicatePage foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"duplicate_pages_in_serp"}}' \
  https://api.lumar.io/graphql
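
The query accepts an `after` cursor and returns `pageInfo.endCursor` and `pageInfo.hasNextPage`, so a report with more than one page of results has to be fetched in a loop. The following Python sketch shows one way to do that using only the standard library. It makes several assumptions: the response follows the usual GraphQL `{"data": ...}` shape, the node fields are trimmed to two for brevity, error handling is omitted, and the function names `post_graphql` and `iter_report_urls` are illustrative rather than part of any official client.

```python
import json
import urllib.request
from typing import Callable, Iterator

API_URL = "https://api.lumar.io/graphql"

# Trimmed version of the query above; add further node fields as needed.
QUERY = """
query GetReportStatForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!, $after: String) {
  getReportStat(input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes { url searchConsoleTotalImpressions }
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(payload: dict, token: str) -> dict:
    """Send one GraphQL request and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_report_urls(
    crawl_id: str,
    token: str,
    send: Callable[[dict, str], dict] = post_graphql,
) -> Iterator[dict]:
    """Yield every node in the report, following endCursor until hasNextPage is false."""
    after = None  # No cursor on the first request.
    while True:
        variables = {
            "crawlId": crawl_id,
            "reportTemplateCode": "duplicate_pages_in_serp",
            "after": after,
        }
        body = send({"query": QUERY, "variables": variables}, token)
        # Assumed response shape: standard GraphQL "data" envelope, no error handling.
        page = body["data"]["getReportStat"]["crawlUrls"]
        yield from page["nodes"]
        if not page["pageInfo"]["hasNextPage"]:
            break
        after = page["pageInfo"]["endCursor"]
```

Calling `iter_report_urls("TjAwNUNyYXdsNDAwMA", "YOUR_API_SESSION_TOKEN")` would then yield one dictionary per URL across all pages. The `send` parameter exists so the request function can be swapped out, for example with a stub when testing without network access.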