Duplicate Body Sets

Sets of pages that share the same body content.

Priority: Critical

Impact: Negative

What issues it may cause

Avoid pages with duplicated or near-duplicate content: search engines may not index them, and they can waste crawl budget and dilute PageRank.

How do you fix it

Review pages whose body content nearly duplicates that of another page, then either update them to include more unique text or redirect them to the primary version.
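As a rough illustration of the redirect option, the sketch below turns report rows into a duplicate-to-primary redirect map. It assumes each row is a dictionary carrying the `primaryUrl`, `exampleDuplicate1` and `exampleDuplicate2` fields returned by this report's query; the helper name `build_redirect_map` is hypothetical and not part of any API. The report exposes only two example duplicates per set, so the map will not necessarily cover every duplicate page.

```python
from typing import Dict, Iterable, Mapping, Optional


def build_redirect_map(
    rows: Iterable[Mapping[str, Optional[str]]]
) -> Dict[str, str]:
    """Map each example duplicate URL to the primary URL of its set.

    Hypothetical helper: assumes rows shaped like the report's nodes
    (primaryUrl, exampleDuplicate1, exampleDuplicate2).
    """
    redirects: Dict[str, str] = {}
    for row in rows:
        primary = row.get("primaryUrl")
        if not primary:
            continue  # nothing to redirect to
        for key in ("exampleDuplicate1", "exampleDuplicate2"):
            duplicate = row.get(key)
            # Skip missing examples and never redirect a page to itself.
            if duplicate and duplicate != primary:
                # Keep the first primary seen if a URL appears twice.
                redirects.setdefault(duplicate, primary)
    return redirects
```

Review the resulting map before applying it: a near-duplicate page may be better served by adding unique text than by a redirect.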

What is the positive impact

  1. Reducing the number of duplicate pages in search engines' indexes can save crawl budget for more important pages and avoid diluting PageRank, helping the remaining pages rank better.

  2. Canonicalised or redirected pages will be crawled less often, improving crawl efficiency and saving on server costs.

How to fetch the data for this report template

You need to run a crawl to generate the report for this report template. Once the report has been generated and you have the crawl ID, you can fetch the report data using the following query:

query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlDuplicateUrls {
            pageTitle
            description
            primaryUrl
            exampleDuplicate1
            exampleDuplicate2
            duplicateCount
            deeprank
            level
            duplicateType
          }
        }
      }
    }
  }
}
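The query above can be sent from a script as a standard GraphQL-over-HTTP POST. The following is a minimal sketch using only the Python standard library, not an official client. The endpoint URL and the authentication headers are not given on this page, so they are passed in by the caller. The helper names (`build_payload`, `extract_rows`, `fetch_report`) are hypothetical. The sketch also assumes that `reportsByCode` returns either a list of reports or a single report object, and that the response follows the usual `{"data": ...}` envelope.

```python
import json
import urllib.request
from typing import Any, Dict, List

QUERY = """
query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlDuplicateUrls {
            pageTitle
            description
            primaryUrl
            exampleDuplicate1
            exampleDuplicate2
            duplicateCount
            deeprank
            level
            duplicateType
          }
        }
      }
    }
  }
}
"""


def build_payload(crawl_id: str, report_template_code: str) -> Dict[str, Any]:
    """Build the JSON body for the GraphQL request."""
    return {
        "query": QUERY,
        "variables": {
            "crawlId": crawl_id,
            "reportTemplateCode": report_template_code,
        },
    }


def extract_rows(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten rows.nodes from the response into one list of rows."""
    reports = response["data"]["getCrawl"]["reportsByCode"]
    if isinstance(reports, dict):  # assumption: may be a single report object
        reports = [reports]
    return [node for report in reports for node in report["rows"]["nodes"]]


def fetch_report(
    endpoint: str,
    headers: Dict[str, str],
    crawl_id: str,
    report_template_code: str,
) -> List[Dict[str, Any]]:
    """POST the query; endpoint and auth headers are supplied by the caller."""
    body = json.dumps(build_payload(crawl_id, report_template_code)).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json", **headers},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return extract_rows(json.load(resp))
```

Keeping `build_payload` and `extract_rows` separate from the network call makes it possible to check the request body and the response handling without contacting the API.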
