Create URL File Upload
When you want to crawl URLs from a predefined list, you need to create a URL File Upload. It's a two-step process. First, call the createSignedUrlFileUpload mutation and retrieve the signedS3UploadUrl:
- Mutation
- Variables
- Response
- cURL
mutation createSignedUrlFileUpload($input: CreateSignedUrlFileUploadInput!) {
  createSignedUrlFileUpload(input: $input) {
    signedS3UploadUrl
    urlFileUpload {
      id
      fileName
      status
    }
  }
}
{
  "input": {
    "crawlTypeCode": "List",
    "fileName": "url_list.txt",
    "projectId": 282970,
    "projectUploadType": "ListTxt"
  }
}
{
  "data": {
    "createSignedUrlFileUpload": {
      "signedS3UploadUrl": "https://devops-infra-s3-urlfileuploads-resources-prod-use1.s3.us-east-1.amazonaws.com/UrlFileUploads/42319/url_list.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=CREDENTIAL&X-Amz-Date=20230621T073841Z&X-Amz-Expires=900&X-Amz-Security-Token=VERY_LONG_SECURITY_TOKEN&X-Amz-Signature=SIGNATURE&X-Amz-SignedHeaders=host&x-id=PutObject",
      "urlFileUpload": {
        "id": "TjAxM1VybEZpbGVVcGxvYWQ0MjMxOQ",
        "fileName": "url_list.txt",
        "status": "Draft"
      }
    }
  }
}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"mutation createSignedUrlFileUpload($input: CreateSignedUrlFileUploadInput!) { createSignedUrlFileUpload(input: $input) { signedS3UploadUrl urlFileUpload { id fileName status } } }","variables":{"input":{"crawlTypeCode":"List","fileName":"url_list.txt","projectId":282970,"projectUploadType":"ListTxt"}}}' https://api.lumar.io/graphql
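If you prefer to call the API from a script rather than cURL, the same mutation can be sent with any HTTP client. Below is a minimal Python sketch using the requests library; the session token and project ID are placeholders for your own values, and error handling is kept to a minimum.

import requests

API_URL = "https://api.lumar.io/graphql"
HEADERS = {
    "Content-Type": "application/json",
    "apollographql-client-name": "docs-example-client",
    "apollographql-client-version": "1.0.0",
    "x-auth-token": "YOUR_API_SESSION_TOKEN",  # placeholder: your API session token
}

MUTATION = """
mutation createSignedUrlFileUpload($input: CreateSignedUrlFileUploadInput!) {
  createSignedUrlFileUpload(input: $input) {
    signedS3UploadUrl
    urlFileUpload { id fileName status }
  }
}
"""

variables = {
    "input": {
        "crawlTypeCode": "List",
        "fileName": "url_list.txt",
        "projectId": 282970,  # placeholder: your project ID
        "projectUploadType": "ListTxt",
    }
}

# Send the GraphQL mutation and extract the pre-signed S3 URL from the response.
response = requests.post(API_URL, json={"query": MUTATION, "variables": variables}, headers=HEADERS)
response.raise_for_status()
result = response.json()["data"]["createSignedUrlFileUpload"]
signed_s3_upload_url = result["signedS3UploadUrl"]
print(result["urlFileUpload"]["status"])  # "Draft" until the file is uploaded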
Initially, your URL File Upload is in a Draft state: it is awaiting the upload of the file, which is the second step. The signedS3UploadUrl is a pre-signed Amazon S3 bucket URL that allows you to upload your file so that we can process it. Assuming the file is named "url_list.txt" and you are in the same directory as the file, you can upload it with a curl command (quote the URL so your shell does not interpret the & characters):
curl -X PUT --data-binary @url_list.txt "https://devops-infra-s3-urlfileuploads-resources-prod-use1.s3.us-east-1.amazonaws.com/UrlFileUploads/42319/url_list.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=CREDENTIAL&X-Amz-Date=20230621T073841Z&X-Amz-Expires=900&X-Amz-Security-Token=VERY_LONG_SECURITY_TOKEN&X-Amz-Signature=SIGNATURE&X-Amz-SignedHeaders=host&x-id=PutObject"
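The upload can also be scripted. A minimal Python sketch, continuing from the previous example (it reuses the signed_s3_upload_url variable) and assuming url_list.txt is in the current directory:

import requests

# PUT the file to the pre-signed S3 URL returned by createSignedUrlFileUpload.
with open("url_list.txt", "rb") as f:
    upload_response = requests.put(signed_s3_upload_url, data=f)
upload_response.raise_for_status()  # a 2xx status means S3 accepted the file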
The signedS3UploadUrl is valid for 15 minutes, so make sure you upload your file within that period. Afterwards, we will process your file, count the URLs inside, and change the URL File Upload status to Processed:
- Query
- Variables
- Response
- cURL
query getProjectAndUrlFileUploads($projectId: ObjectID!, $fileName: String!) {
  getProject(id: $projectId) {
    urlFileUploads(filter: { fileName: { eq: $fileName } }) {
      nodes {
        id
        fileName
        status
        totalRows
      }
    }
  }
}
{
  "fileName": "url_list.txt",
  "projectId": 282970
}
{
  "data": {
    "getProject": {
      "urlFileUploads": {
        "nodes": [
          {
            "id": "TjAxM1VybEZpbGVVcGxvYWQ0MjMxOQ",
            "fileName": "url_list.txt",
            "status": "Processed",
            "totalRows": 1
          }
        ]
      }
    }
  }
}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query getProjectAndUrlFileUploads($projectId: ObjectID!, $fileName: String!) { getProject(id: $projectId) { urlFileUploads(filter: { fileName: { eq: $fileName } }) { nodes { id fileName status totalRows } } } }","variables":{"fileName":"url_list.txt","projectId":282970}}' https://api.lumar.io/graphql
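Because processing happens asynchronously, the status may not be Processed immediately after the upload. Below is a minimal Python sketch of polling for it, reusing API_URL and HEADERS from the first example; the 5-second interval is an arbitrary choice, and failure statuses are not handled here.

import time
import requests

QUERY = """
query getProjectAndUrlFileUploads($projectId: ObjectID!, $fileName: String!) {
  getProject(id: $projectId) {
    urlFileUploads(filter: { fileName: { eq: $fileName } }) {
      nodes { id fileName status totalRows }
    }
  }
}
"""

variables = {"projectId": 282970, "fileName": "url_list.txt"}

# Poll until the URL File Upload reaches the Processed status.
while True:
    response = requests.post(API_URL, json={"query": QUERY, "variables": variables}, headers=HEADERS)
    response.raise_for_status()
    nodes = response.json()["data"]["getProject"]["urlFileUploads"]["nodes"]
    if nodes and nodes[0]["status"] == "Processed":
        print(nodes[0]["totalRows"])  # number of URLs counted in the file
        break
    time.sleep(5)  # arbitrary polling interval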
Now your file can be used for crawling.