
Tutorial: Run a Crawl and Get Results

This tutorial walks through the complete workflow of authenticating, creating a project, running a crawl, waiting for it to finish, and fetching the results.

Step 1: Verify authentication

Before doing anything, confirm your token is valid by querying your user details and available accounts.

```graphql
query VerifyAuth {
  me {
    id
    username
    accounts(first: 1) {
      nodes {
        id
        name
      }
    }
  }
}
```

Response example:

```json
{
  "data": {
    "me": {
      "id": "TjAwNFVzZXI4NjE",
      "username": "your.username@example.com",
      "accounts": {
        "nodes": [
          { "id": "TjAwN0FjY291bnQ3MTU", "name": "Your Account Name" }
        ]
      }
    }
  }
}
```

If this returns an error, see Authentication to obtain a valid token.
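All of the operations in this tutorial are ordinary HTTP POST requests. Below is a minimal TypeScript sketch of a reusable `executeQuery` helper like the one assumed by the polling example later in this tutorial. The endpoint URL and the `Bearer` token header format are assumptions here, not confirmed by this page; substitute the values from your API credentials documentation.

```typescript
// Minimal GraphQL-over-HTTP client sketch.
// ASSUMPTIONS: the endpoint URL and the Authorization header format
// are placeholders; replace them with your API's real values.
const API_ENDPOINT = "https://api.example.com/graphql"; // placeholder

// A GraphQL HTTP request body is a JSON object with `query` and `variables`.
function buildRequestBody(
  query: string,
  variables: Record<string, unknown> = {}
): string {
  return JSON.stringify({ query, variables });
}

async function executeQuery(
  query: string,
  variables: Record<string, unknown> = {},
  token = "YOUR_API_TOKEN" // placeholder; supply your real token
): Promise<any> {
  const response = await fetch(API_ENDPOINT, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`, // header format is an assumption
    },
    body: buildRequestBody(query, variables),
  });
  return response.json();
}
```

Every query and mutation that follows can be passed to this helper as a string, with its variables object as the second argument.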

Step 2: Create a project (if needed)

If you do not already have a project, create one. See How to Create a Project for the full guide. Here is a quick example:

```graphql
mutation CreateSEOProject($input: CreateProjectInput!) {
  createSEOProject(input: $input) {
    project {
      ...ProjectDetails
    }
  }
}

fragment ProjectDetails on Project {
  id
  name
  primaryDomain
  # ...other fields you want to retrieve
}
```

Variables:

```json
{
  "input": {
    "accountId": "TjAwN0FjY291bnQ3MTU",
    "name": "www.lumar.io SEO Project",
    "primaryDomain": "https://www.lumar.io/"
  }
}
```

Response example:

```json
{
  "data": {
    "createSEOProject": {
      "project": {
        "id": "TjAwN1Byb2plY3Q2MTM0",
        "name": "www.lumar.io SEO Project",
        "primaryDomain": "https://www.lumar.io/"
      }
    }
  }
}
```
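As a sketch, the variables object for this mutation can be assembled from the account id returned in Step 1 and your site's domain. The field names come from the example above; the "hostname + SEO Project" naming convention is just an illustration, not an API requirement.

```typescript
// Sketch: build the CreateProjectInput variables used in Step 2.
// The accountId is the one returned by the VerifyAuth query in Step 1.
function createProjectVariables(
  accountId: string,
  primaryDomain: string
): { input: { accountId: string; name: string; primaryDomain: string } } {
  // Derive a human-readable project name from the domain, e.g. "www.lumar.io".
  const host = new URL(primaryDomain).hostname;
  return {
    input: {
      accountId,
      name: `${host} SEO Project`, // naming convention is illustrative only
      primaryDomain,
    },
  };
}
```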

Step 3: Run a crawl

Trigger a crawl for your project using the runCrawlForProject mutation:

```graphql
mutation RunCrawl($input: RunCrawlForProjectInput!) {
  runCrawlForProject(input: $input) {
    crawl {
      id
      statusEnum
      createdAt
    }
  }
}
```

Variables:

```json
{
  "input": {
    "projectId": "TjAwN1Byb2plY3Q2MTMy"
  }
}
```

Response example:

```json
{
  "data": {
    "runCrawlForProject": {
      "crawl": {
        "id": "TjAwNUNyYXdsMTc2NjI0MQ",
        "statusEnum": "Queued",
        "createdAt": "2025-01-15T10:00:00.000Z"
      }
    }
  }
}
```

The crawl starts in Queued status and progresses through Crawling, Finalizing, and finally Finished.
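The lifecycle above can be sketched as a small type, which comes in handy for the polling step. The status names are the ones this tutorial lists; anything outside them should be treated as unknown rather than assumed.

```typescript
// Sketch: model the crawl lifecycle described above.
// Status names are taken from this tutorial, not a full API enum.
type CrawlStatus = "Queued" | "Crawling" | "Finalizing" | "Finished";

const LIFECYCLE: CrawlStatus[] = ["Queued", "Crawling", "Finalizing", "Finished"];

// True only for the successful terminal state.
function isFinished(status: string): boolean {
  return status === "Finished";
}

// Rough progress as a position in the lifecycle, from 0 to 1.
function progress(status: CrawlStatus): number {
  return LIFECYCLE.indexOf(status) / (LIFECYCLE.length - 1);
}
```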

Step 4: Poll for crawl completion

Query the crawl status periodically until it reaches Finished. A polling interval of 30-60 seconds is reasonable for most crawls.

```graphql
query PollCrawlStatus($crawlId: ObjectID!) {
  getCrawl(id: $crawlId) {
    id
    status
    createdAt
    finishedAt
    crawlUrlsTotal
  }
}
```

Variables:

```json
{ "crawlId": "TjAwNUNyYXdsMTc2NjI0MQ" }
```

Response example:

```json
{
  "data": {
    "getCrawl": {
      "id": "TjAwNUNyYXdsMTc2NjI0MQ",
      "status": "Finished",
      "createdAt": "2025-01-15T10:00:00.000Z",
      "finishedAt": "2025-01-15T10:30:00.000Z",
      "crawlUrlsTotal": 2186
    }
  }
}
```
In code, the polling loop might look like this, assuming an `executeQuery` helper that sends authenticated GraphQL requests and a `POLL_QUERY` constant holding the query above:

```typescript
async function waitForCrawl(crawlId: string): Promise<void> {
  while (true) {
    const result = await executeQuery(POLL_QUERY, { crawlId });
    const status = result.data.getCrawl.status;

    if (status === "Finished") {
      console.log("Crawl finished!");
      return;
    }

    // Treat terminal non-success states as errors rather than polling forever.
    if (status === "Archived" || status === "Paused") {
      throw new Error(`Crawl ended with status: ${status}`);
    }

    console.log(`Crawl status: ${status}. Checking again in 30s...`);
    await new Promise(resolve => setTimeout(resolve, 30000));
  }
}
```

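If you prefer to start checking sooner and back off toward the 60-second end of the recommended range, a capped backoff is one option. This is a sketch, not part of the API; the growth factor is arbitrary.

```typescript
// Sketch: poll delay that starts at 30s and grows to a 60s cap,
// matching the 30-60 second guidance above. The 1.5x growth factor
// is an arbitrary choice for illustration.
function pollDelayMs(attempt: number): number {
  const base = 30_000; // 30 seconds
  const cap = 60_000;  // 60 seconds
  return Math.min(cap, base * Math.pow(1.5, attempt));
}
```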
Step 5: Fetch the results

Once the crawl is finished, query the report data:

```graphql
query GetCrawlResults($crawlId: ObjectID!) {
  getReportStat(
    input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }
  ) {
    basic
    crawlUrls(first: 5) {
      nodes {
        url
        httpStatusCode
        pageTitle
      }
      totalCount
    }
  }
}
```

Variables:

```json
{ "crawlId": "TjAwNUNyYXdsMTc2NjI0MQ" }
```

Response example:

```json
{
  "data": {
    "getReportStat": {
      "basic": 2186,
      "crawlUrls": {
        "nodes": [
          {
            "url": "https://www.example.com/",
            "httpStatusCode": 200,
            "pageTitle": "Example - Home"
          },
          {
            "url": "https://www.example.com/about",
            "httpStatusCode": 200,
            "pageTitle": "About Us"
          }
        ],
        "totalCount": 2186
      }
    }
  }
}
```

You can request different report templates by changing the reportTemplateCode parameter. See Report Templates Overview for how to discover available templates.
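Once you have the response, you might flatten the `crawlUrls` nodes into CSV rows for a quick report. This is a sketch; the field names match the query above, and the quoting rule is a simplified version of standard CSV escaping.

```typescript
// Sketch: flatten crawlUrls nodes from the GetCrawlResults response
// into CSV lines. Field names match the query fields above.
interface CrawlUrlNode {
  url: string;
  httpStatusCode: number;
  pageTitle: string;
}

function toCsv(nodes: CrawlUrlNode[]): string {
  // Quote a field if it contains a comma, quote, or newline.
  const esc = (v: string) =>
    /[",\n]/.test(v) ? `"${v.replace(/"/g, '""')}"` : v;
  const header = "url,httpStatusCode,pageTitle";
  const rows = nodes.map(n =>
    [esc(n.url), String(n.httpStatusCode), esc(n.pageTitle)].join(",")
  );
  return [header, ...rows].join("\n");
}
```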

Next steps