Get URL Data
Using the correct query
There are two ways to retrieve raw URL data (or link, sitemap, and other datasource data) from Lumar:
- The query described on this page retrieves selected metrics for URLs in a crawl. Results can be filtered and sorted, but you must paginate through them 100 URLs at a time. This is ideal for getting a sample of the available data, but is not well suited to retrieving all data for a crawl.
- The Download Raw Data query lets you download all data from a datasource in a single request, but the results cannot be filtered or sorted. This is the most efficient way to access all data.
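Because the first option returns URLs 100 at a time, collecting more than one page takes a loop. The following is a minimal sketch of such a loop in Python, not a confirmed Lumar client. It assumes Relay-style cursor pagination: a `pageInfo { hasNextPage endCursor }` selection and an `after` argument, neither of which is shown on this page. `iter_nodes`, `fetch_page`, and `fake_fetch_page` are hypothetical names, and the fake fetcher stands in for a real API call so the loop can be run offline.

```python
from typing import Callable, Iterator, Optional

PAGE_SIZE = 100  # page limit stated above


def iter_nodes(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every node by following cursors until the last page.

    fetch_page(cursor) is a hypothetical callable that would run the query
    with first: PAGE_SIZE and after: cursor, and return the crawlUrls object.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["nodes"]
        info = page["pageInfo"]  # assumed Relay-style field, not shown on this page
        if not info["hasNextPage"]:
            break
        cursor = info["endCursor"]


def fake_fetch_page(cursor: Optional[str]) -> dict:
    """Stand-in for a real API call: serves 250 fake URLs in pages of PAGE_SIZE."""
    start = int(cursor) if cursor else 0
    end = min(start + PAGE_SIZE, 250)
    return {
        "nodes": [{"url": f"https://example.com/{i}"} for i in range(start, end)],
        "pageInfo": {"hasNextPage": end < 250, "endCursor": str(end)},
    }


if __name__ == "__main__":
    urls = [node["url"] for node in iter_nodes(fake_fetch_page)]
    print(len(urls))  # 250
```

Keeping the fetch step behind a callable separates the paging logic from the HTTP details, so the loop can be checked without network access. For a full export of a crawl, the Download Raw Data query remains the more efficient route.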
Using the getReportStat query to access Crawl URL data
The sample query below returns five properties (fetchTime, pageTitle, responsive, url, wordCount) for each crawled URL, but hundreds are available. For the comprehensive list, inspect the CrawlUrl type.
Query:
query GetUrlData($crawlId: ObjectID!) {
  getReportStat(input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }) {
    crawlUrls(reportType: Basic, first: 3) {
      nodes {
        fetchTime
        pageTitle
        responsive
        url
        wordCount
      }
      totalCount
    }
  }
}
Response:
{
  "data": {
    "getReportStat": [
      {
        "crawlUrls": {
          "nodes": [
            {
              "fetchTime": 0.38,
              "pageTitle": "FAQ - Lumar",
              "responsive": true,
              "url": "https://www.lumar.io/faq/",
              "wordCount": 4055
            },
            {
              "fetchTime": 0.03,
              "pageTitle": "About - Lumar",
              "responsive": true,
              "url": "https://www.lumar.io/about",
              "wordCount": 1074
            },
            {
              "fetchTime": 0.04,
              "pageTitle": "Blog - Lumar",
              "responsive": true,
              "url": "https://www.lumar.io/blog/",
              "wordCount": 605
            }
          ],
          "totalCount": 2186
        }
      }
    ]
  }
}
cURL:
curl -X POST \
  -H "Content-Type: application/json" \
  -H "apollographql-client-name: docs-example-client" \
  -H "apollographql-client-version: 1.0.0" \
  -H "x-auth-token: YOUR_API_SESSION_TOKEN" \
  --data '{"query":"query GetUrlData($crawlId: ObjectID!) { getReportStat(input: { crawlId: $crawlId, reportTemplateCode: \"all_pages\" }) { crawlUrls(reportType: Basic, first: 3) { nodes { fetchTime pageTitle responsive url wordCount } totalCount } } }","variables":{"crawlId":"YOUR_CRAWL_ID"}}' \
  https://api.lumar.io/graphql
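The same request can also be sent from a script. The sketch below mirrors the cURL command using only the Python standard library. `build_request` and `run_query` are hypothetical helper names, the crawl ID and token are placeholders, and the `variables` entry that supplies `$crawlId` follows the usual GraphQL-over-HTTP convention rather than anything specific to this page.

```python
import json
import urllib.request

API_URL = "https://api.lumar.io/graphql"  # endpoint from the cURL example

QUERY = """
query GetUrlData($crawlId: ObjectID!) {
  getReportStat(input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }) {
    crawlUrls(reportType: Basic, first: 3) {
      nodes { fetchTime pageTitle responsive url wordCount }
      totalCount
    }
  }
}
"""


def build_request(crawl_id, token):
    """Build the POST request; crawl_id fills the $crawlId variable."""
    payload = {"query": QUERY, "variables": {"crawlId": crawl_id}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "apollographql-client-name": "docs-example-client",
            "apollographql-client-version": "1.0.0",
            "x-auth-token": token,
        },
        method="POST",
    )


def run_query(request):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    req = build_request("YOUR_CRAWL_ID", "YOUR_API_SESSION_TOKEN")
    print(json.loads(req.data)["variables"])  # {'crawlId': 'YOUR_CRAWL_ID'}
```

Passing the crawl ID through `variables` keeps the query text constant and avoids interpolating values into it. Calling `run_query(req)` with a real token and crawl ID would perform the request itself.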