
Duplicate Pages with Backlinks

Pages which have backlinks and share an identical title, description, and near-identical content with other pages found in the same crawl, excluding the primary page from each duplicate set

Priority: High

Impact: Negative

What issues it may cause

Duplicate pages can result in the dilution of authority signals which can negatively impact the rankings of all the pages within the duplicate set.

Although search engines will attempt to automatically identify duplicate pages and roll them together, this may not be completely effective, particularly on very large websites or pages with a short lifespan. Search engines may choose a duplicate version of the page which is not considered the primary version.

The duplicate pages will be crawled by search engines, wasting crawl budget, incurring additional server costs and reducing the crawl efficiency of the site.

How do you fix it

It's likely that the duplicates shown are not considered the primary version, but this should be verified. The duplicates can be eliminated by either:

  • redirecting all duplicate URLs to the primary URL in each set (a minimal sketch follows this list)
  • adding canonical tags which point to the primary duplicate
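
The redirect option boils down to mapping each duplicate URL onto the primary URL in its set and returning a permanent (301) redirect. The sketch below is a minimal illustration only, assuming a Flask-based application; the framework, route paths and mapping are hypothetical, and in practice the mapping would be built from the duplicate sets in this report.

from flask import Flask, redirect, request

app = Flask(__name__)

# Hypothetical mapping of duplicate URLs to the primary URL in each set;
# in practice this would be generated from the report data.
DUPLICATE_TO_PRIMARY = {
    "/products/widget/index.html": "/products/widget",
    "/Products/Widget": "/products/widget",
}

@app.before_request
def redirect_duplicates():
    primary = DUPLICATE_TO_PRIMARY.get(request.path)
    if primary and primary != request.path:
        # A permanent (301) redirect consolidates authority signals
        # on the primary URL.
        return redirect(primary, code=301)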

What is the positive impact

  1. Reducing the number of duplicate pages can avoid the dilution of PageRank, helping the remaining pages to rank better and resulting in more traffic and conversions.

  2. Canonicalised or redirected pages will be crawled less often, improving crawl efficiency and saving on server costs.

How to fetch the data for this report template

You will need to run a crawl for this report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:

query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            pageTitle
            url
            foundAtUrl
            backlinkDomainCount
            backlinkCount
            deeprank
            level
            httpStatusCode
            indexable
            duplicatePage
            foundInGoogleAnalytics
            foundInGoogleSearchConsole
            foundInBacklinks
            foundInList
            foundInLogSummary
            foundInWebCrawl
            foundInSitemap
          }
        }
      }
    }
  }
}
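
The query can be sent to the API as a standard GraphQL HTTP POST. The sketch below is a minimal Python example with an abbreviated field selection for brevity; the endpoint URL, authentication header, crawl ID and report template code are assumptions and should be replaced with the values for your own account.

import requests

# Assumed values - substitute your own endpoint, token, crawl ID and
# report template code.
API_URL = "https://api.lumar.io/graphql"
API_TOKEN = "YOUR_API_TOKEN"

QUERY = """
query GetReportForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!) {
  getCrawl(id: $crawlId) {
    reportsByCode(
      input: {
        reportTypeCodes: Basic
        reportTemplateCodes: [$reportTemplateCode]
      }
    ) {
      rows {
        nodes {
          ... on CrawlUrls {
            url
            pageTitle
            backlinkCount
            duplicatePage
          }
        }
      }
    }
  }
}
"""

variables = {
    "crawlId": "YOUR_CRAWL_ID",
    "reportTemplateCode": "duplicate_pages_with_backlinks",  # assumed code
}

response = requests.post(
    API_URL,
    json={"query": QUERY, "variables": variables},
    headers={"X-Auth-Token": API_TOKEN},  # assumed auth header
)
response.raise_for_status()

# Inspect the raw response; the rows for the report sit under
# data -> getCrawl -> reportsByCode.
print(response.json())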
