Broken Pages (4xx Errors)
URLs that return a 4xx status code, such as a 404, indicating that the server could not return a valid page, typically because the page does not exist.
Priority: Critical
Impact: Negative
What issues it may cause
If the URL previously served a working page that ranked in search engines and drove traffic, the URL will be dropped from the index and that traffic and its conversions may be lost.
If a user visits the broken page whilst the page is still indexed, or is generating traffic from backlinks on external websites, the experience may be poor and is more likely to result in a bounce or exit from the site.
If the URLs are linked internally or included in Sitemaps, search engines are encouraged to keep crawling them, which wastes some crawl budget.
How do you fix it
- If the page was removed accidentally, it can be restored.
- If the page was removed deliberately and there is an equivalent page, then the URL can be redirected to it.
- If there is no appropriate alternative, a 410 status may be used instead of a 404 to reduce the time it takes to remove the URL from the search engine's index and to reduce further recrawling, which may save crawl budget.
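The choice between the options above can be sketched as a small lookup. This is a minimal illustration rather than a recommended implementation: the function name `response_for_removed_url`, the `redirects` mapping and the `gone` set are hypothetical, and the use of a 301 (permanent) redirect is an assumption, since the text only says the URL "can be redirected".

```python
def response_for_removed_url(path, redirects, gone):
    """Return (status_code, location) for a URL that no longer serves a page.

    redirects: dict mapping a removed path to its equivalent page (hypothetical).
    gone: set of removed paths that have no appropriate alternative (hypothetical).
    """
    if path in redirects:
        # An equivalent page exists: redirect to it (301 is an assumption).
        return 301, redirects[path]
    if path in gone:
        # Deliberately removed with no alternative: signal permanent removal.
        return 410, None
    # Anything else is simply not found.
    return 404, None


redirects = {"/old-pricing": "/pricing"}
gone = {"/summer-sale-2019"}

for path in ("/old-pricing", "/summer-sale-2019", "/unknown"):
    print(path, response_for_removed_url(path, redirects, gone))
```

Pages removed accidentally do not appear here, because restoring them means they return a normal response again.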
What is the positive impact
A redirect to an appropriate alternative URL may result in authority signals being transferred to the redirect target, and rankings potentially retained by the redirect target page.
Removing any internal links to pages which return a 4xx status is likely to reduce crawling activity and save crawl budget.
How to fetch the data for this report template
You will need to run a crawl for the report template to generate the report. Once the report has been generated and you have the crawl ID, you can fetch the data for the report using the following query:
- Query
- Variables
- cURL
query GetReportStatForCrawl(
  $crawlId: ObjectID!
  $reportTemplateCode: String!
  $after: String
) {
  getReportStat(
    input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}
  ) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes {
        pageTitle
        url
        description
        foundAtUrl
        deeprank
        level
        httpStatusCode
        foundInGoogleAnalytics
        foundInGoogleSearchConsole
        foundInBacklinks
        foundInList
        foundInLogSummary
        foundInWebCrawl
        foundInSitemap
      }
      totalCount
      pageInfo {
        endCursor
        hasNextPage
      }
    }
  }
}
{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"4xx_errors"}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query GetReportStatForCrawl( $crawlId: ObjectID! $reportTemplateCode: String! $after: String ) { getReportStat( input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode} ) { crawlUrls(after: $after, reportType: Basic) { nodes { pageTitle url description foundAtUrl deeprank level httpStatusCode foundInGoogleAnalytics foundInGoogleSearchConsole foundInBacklinks foundInList foundInLogSummary foundInWebCrawl foundInSitemap } totalCount pageInfo { endCursor hasNextPage } } } }","variables":{"crawlId":"TjAwNUNyYXdsNDAwMA","reportTemplateCode":"4xx_errors"}}' https://api.lumar.io/graphql
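The query above returns one page of results at a time, with `pageInfo.endCursor` and `pageInfo.hasNextPage` available for fetching the next page through the `after` variable. The following Python sketch shows one way to collect every URL in the report. It is an illustration under stated assumptions: the helper names `post_graphql` and `iter_broken_urls` are hypothetical, the query is trimmed to three fields for brevity, and the response is assumed to follow the usual GraphQL shape with results under a top-level `data` key.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"

# Trimmed version of the query above; only a few fields are requested here.
QUERY = """
query GetReportStatForCrawl($crawlId: ObjectID!, $reportTemplateCode: String!, $after: String) {
  getReportStat(input: {crawlId: $crawlId, reportTemplateCode: $reportTemplateCode}) {
    crawlUrls(after: $after, reportType: Basic) {
      nodes { url httpStatusCode foundAtUrl }
      totalCount
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def post_graphql(payload, token):
    """Send one GraphQL request and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-auth-token": token},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_broken_urls(crawl_id, token, send=post_graphql):
    """Yield every node in the 4xx_errors report, following the page cursors."""
    after = None  # None requests the first page
    while True:
        variables = {
            "crawlId": crawl_id,
            "reportTemplateCode": "4xx_errors",
            "after": after,
        }
        body = send({"query": QUERY, "variables": variables}, token)
        # Assumption: standard GraphQL response with results under "data".
        page = body["data"]["getReportStat"]["crawlUrls"]
        yield from page["nodes"]
        if not page["pageInfo"]["hasNextPage"]:
            break
        after = page["pageInfo"]["endCursor"]
```

A call such as `for node in iter_broken_urls("TjAwNUNyYXdsNDAwMA", "YOUR_API_SESSION_TOKEN"): print(node["url"], node["foundAtUrl"])` would list each broken URL next to the `foundAtUrl` value returned for it. The `send` parameter exists only so the request function can be swapped out, for example in tests.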