Create URL File Upload
When you want to crawl URLs from a predefined list, you need to create a URL File Upload. It's a two-step process. First, call the createSignedUrlFileUpload mutation and retrieve the signedS3UploadUrl:
- Mutation
- Variables
- Response
- cURL
mutation createSignedUrlFileUpload($input: CreateSignedUrlFileUploadInput!) {
  createSignedUrlFileUpload(input: $input) {
    signedS3UploadUrl
    urlFileUpload {
      id
      fileName
      status
    }
  }
}
{
  "input": {
    "crawlTypeCode": "List",
    "fileName": "url_list.txt",
    "projectId": 282970,
    "projectUploadType": "ListTxt"
  }
}
{
  "data": {
    "createSignedUrlFileUpload": {
      "signedS3UploadUrl": "https://devops-infra-s3-urlfileuploads-resources-prod-use1.s3.us-east-1.amazonaws.com/UrlFileUploads/42319/url_list.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=CREDENTIAL&X-Amz-Date=20230621T073841Z&X-Amz-Expires=900&X-Amz-Security-Token=VERY_LONG_SECURITY_TOKEN&X-Amz-Signature=SIGNATURE&X-Amz-SignedHeaders=host&x-id=PutObject",
      "urlFileUpload": {
        "id": "TjAxM1VybEZpbGVVcGxvYWQ0MjMxOQ",
        "fileName": "url_list.txt",
        "status": "Draft"
      }
    }
  }
}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"mutation createSignedUrlFileUpload($input: CreateSignedUrlFileUploadInput!) { createSignedUrlFileUpload(input: $input) { signedS3UploadUrl urlFileUpload { id fileName status } } }","variables":{"input":{"crawlTypeCode":"List","fileName":"url_list.txt","projectId":282970,"projectUploadType":"ListTxt"}}}' https://api.lumar.io/graphql
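If you prefer to call the API from a script rather than cURL, the same mutation can be sent with any HTTP client. Below is a minimal Python sketch using the requests library; the session token and project ID are placeholders for your own values, and error handling is kept to a minimum.

import requests

API_URL = "https://api.lumar.io/graphql"
HEADERS = {
    "Content-Type": "application/json",
    "apollographql-client-name": "docs-example-client",
    "apollographql-client-version": "1.0.0",
    "x-auth-token": "YOUR_API_SESSION_TOKEN",  # placeholder: your API session token
}

MUTATION = """
mutation createSignedUrlFileUpload($input: CreateSignedUrlFileUploadInput!) {
  createSignedUrlFileUpload(input: $input) {
    signedS3UploadUrl
    urlFileUpload { id fileName status }
  }
}
"""

variables = {
    "input": {
        "crawlTypeCode": "List",
        "fileName": "url_list.txt",
        "projectId": 282970,  # placeholder: your project ID
        "projectUploadType": "ListTxt",
    }
}

# Send the GraphQL mutation and extract the pre-signed S3 URL from the response.
response = requests.post(API_URL, json={"query": MUTATION, "variables": variables}, headers=HEADERS)
response.raise_for_status()
result = response.json()["data"]["createSignedUrlFileUpload"]
signed_s3_upload_url = result["signedS3UploadUrl"]
print(result["urlFileUpload"]["status"])  # "Draft" until the file is uploaded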
Initially, your URL File Upload is in a Draft state: it is awaiting the upload of the file, which is the second step. The signedS3UploadUrl is a pre-signed Amazon S3 bucket URL that allows you to upload your file so that we can process it. Assuming the file is named "url_list.txt" and you are in the same directory as the file, you can upload it with a curl command (quote the URL so your shell does not interpret the & characters):
curl -X PUT --data-binary @url_list.txt "https://devops-infra-s3-urlfileuploads-resources-prod-use1.s3.us-east-1.amazonaws.com/UrlFileUploads/42319/url_list.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=CREDENTIAL&X-Amz-Date=20230621T073841Z&X-Amz-Expires=900&X-Amz-Security-Token=VERY_LONG_SECURITY_TOKEN&X-Amz-Signature=SIGNATURE&X-Amz-SignedHeaders=host&x-id=PutObject"
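The upload can also be scripted. A minimal Python sketch, continuing from the previous example (it reuses the signed_s3_upload_url variable) and assuming url_list.txt is in the current directory:

import requests

# PUT the file to the pre-signed S3 URL returned by createSignedUrlFileUpload.
with open("url_list.txt", "rb") as f:
    upload_response = requests.put(signed_s3_upload_url, data=f)
upload_response.raise_for_status()  # a 2xx status means S3 accepted the file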
The signedS3UploadUrl is valid for 15 minutes, so make sure you upload your file within that period. Afterwards, we will process your file, count the URLs inside, and change the URL File Upload status to Processed:
- Query
- Variables
- Response
- cURL
query getProjectAndUrlFileUploads($projectId: ObjectID!, $fileName: String!) {
  getProject(id: $projectId) {
    urlFileUploads(filter: { fileName: { eq: $fileName } }) {
      nodes {
        id
        fileName
        status
        totalRows
      }
    }
  }
}
{
  "fileName": "url_list.txt",
  "projectId": 282970
}
{
  "data": {
    "getProject": {
      "urlFileUploads": {
        "nodes": [
          {
            "id": "TjAxM1VybEZpbGVVcGxvYWQ0MjMxOQ",
            "fileName": "url_list.txt",
            "status": "Processed",
            "totalRows": 1
          }
        ]
      }
    }
  }
}
curl -X POST -H "Content-Type: application/json" -H "apollographql-client-name: docs-example-client" -H "apollographql-client-version: 1.0.0" -H "x-auth-token: YOUR_API_SESSION_TOKEN" --data '{"query":"query getProjectAndUrlFileUploads($projectId: ObjectID!, $fileName: String!) { getProject(id: $projectId) { urlFileUploads(filter: { fileName: { eq: $fileName } }) { nodes { id fileName status totalRows } } } }","variables":{"fileName":"url_list.txt","projectId":282970}}' https://api.lumar.io/graphql
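Because processing happens asynchronously, the status may not be Processed immediately after the upload. Below is a minimal Python sketch of polling for it, reusing API_URL and HEADERS from the first example; the 5-second interval is an arbitrary choice, and failure statuses are not handled here.

import time
import requests

QUERY = """
query getProjectAndUrlFileUploads($projectId: ObjectID!, $fileName: String!) {
  getProject(id: $projectId) {
    urlFileUploads(filter: { fileName: { eq: $fileName } }) {
      nodes { id fileName status totalRows }
    }
  }
}
"""

variables = {"projectId": 282970, "fileName": "url_list.txt"}

# Poll until the URL File Upload reaches the Processed status.
while True:
    response = requests.post(API_URL, json={"query": QUERY, "variables": variables}, headers=HEADERS)
    response.raise_for_status()
    nodes = response.json()["data"]["getProject"]["urlFileUploads"]["nodes"]
    if nodes and nodes[0]["status"] == "Processed":
        print(nodes[0]["totalRows"])  # number of URLs counted in the file
        break
    time.sleep(5)  # arbitrary polling interval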
Now your file can be used for crawling.