Filtering
To filter results, pass the filter argument. Its value is the ConnectionFilterInput that corresponds to the entity
being retrieved. These inputs are defined as:
input ExampleConnectionFilterInput {
_and: [ExampleConnectionFilterInput!]
_or: [ExampleConnectionFilterInput!]
url: ConnectionStringFilterInput
# ...and so on for other field types like Boolean, Date, Int etc.
}
Each field in the ConnectionFilterInput can be filtered by a subset of predicates available for the given data type. The sections below list every predicate grouped by type.
String predicates
ConnectionStringFilterInput -- used for text fields such as URLs, names, and descriptions.
| Predicate | Type | Description |
|---|---|---|
| eq | String | Exact match. |
| ne | String | Not equal. |
| contains | String | Field contains the substring. |
| notContains | String | Field does not contain the substring. |
| beginsWith | String | Field starts with the value. |
| endsWith | String | Field ends with the value. |
| matchesRegex | String | Field matches the regular expression. |
| notMatchesRegex | String | Field does not match the regular expression. |
| in | [String!] | Field value is one of the provided values. |
| notIn | [String!] | Field value is not one of the provided values. |
| isEmpty | Boolean | When true, matches empty strings. When false, matches non-empty strings. |
| isNull | Boolean | When true, matches null values. When false, matches non-null values. |
Examples:
# URLs containing "blog"
filter: { url: { contains: "blog" } }
# URLs starting with "https://example.com"
filter: { url: { beginsWith: "https://example.com" } }
# Match a regex pattern
filter: { url: { matchesRegex: "^https://[^/]+/products/\\d+" } }
# One of several exact values
filter: { name: { in: ["Google", "Wikipedia", "GitHub"] } }
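The doubled backslash in the matchesRegex example is string escaping: the pattern the server receives contains a single backslash (`\d+`). The following Python sketch shows the same filter built as a plain dict and serialized as a JSON variable. It assumes the filter is sent as a variable rather than written inline, and it uses Python's `re` module only as a stand-in for the API's regex engine, which this document does not name.

```python
import json
import re

# The pattern as the server should receive it (single backslash).
pattern = r"^https://[^/]+/products/\d+"

# Filter built as a plain dict, to be sent as a GraphQL variable.
variables = {"filter": {"url": {"matchesRegex": pattern}}}

# json.dumps escapes the backslash, producing the same "\\d" seen in the inline example.
encoded = json.dumps(variables)

# Local sanity check of the pattern. Python's `re` is an assumption here;
# the API's regex dialect may differ in details.
sample_urls = [
    "https://example.com/products/42",
    "https://example.com/blog/products",
]
matches = [url for url in sample_urls if re.search(pattern, url)]
```

Testing a pattern locally before sending it can catch escaping mistakes early, but the result is only indicative until it is confirmed against the API.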
Numeric predicates (Int, Float, BigInt)
ConnectionIntFilterInput, ConnectionFloatFilterInput, and ConnectionBigIntFilterInput share the same set of predicates.
| Predicate | Type | Description |
|---|---|---|
| eq | Int | Equal to. |
| ne | Int | Not equal to. |
| gt | Int | Greater than. |
| ge | Int | Greater than or equal to. |
| lt | Int | Less than. |
| le | Int | Less than or equal to. |
| in | [Int!] | Value is one of the provided values. |
| notIn | [Int!] | Value is not one of the provided values. |
| isNull | Boolean | When true, matches null values. |
Examples:
# Pages returning a 404
filter: { httpStatusCode: { eq: 404 } }
# Pages with more than 100 inlinks
filter: { inLinksInternalCount: { gt: 100 } }
# Status codes in a set
filter: { httpStatusCode: { in: [301, 302, 307] } }
# Combine range predicates for a between filter
filter: { pageSize: { ge: 1000, le: 5000 } }
Boolean predicates
ConnectionBooleanFilterInput -- used for true/false fields.
| Predicate | Type | Description |
|---|---|---|
| eq | Boolean | Matches true or false. |
| ne | Boolean | Does not match the given value. |
| isNull | Boolean | When true, matches null values. |
Example:
# Pages without structured data
filter: { hasStructuredData: { eq: false } }
Date predicates
ConnectionDateFilterInput -- used for timestamp fields such as createdAt and updatedAt.
| Predicate | Type | Description |
|---|---|---|
| eq | DateTime | Exact date match. |
| ne | DateTime | Not equal. |
| gt | DateTime | After the given date. |
| ge | DateTime | On or after the given date. |
| lt | DateTime | Before the given date. |
| le | DateTime | On or before the given date. |
| in | [DateTime!] | One of the provided dates. |
| notIn | [DateTime!] | Not one of the provided dates. |
| isNull | Boolean | When true, matches null values. |
Example:
# Projects created after Jan 1 2025
filter: { createdAt: { gt: "2025-01-01T00:00:00Z" } }
Enum predicates
Enum fields (e.g. CrawlStatus, CrawlPriority) use dedicated filter inputs that follow the same pattern.
| Predicate | Type | Description |
|---|---|---|
| eq | <EnumType> | Exact match on enum value. |
| ne | <EnumType> | Not equal. |
| in | [<EnumType>!] | Value is one of the provided enum values. |
| notIn | [<EnumType>!] | Value is not one of the provided enum values. |
| isNull | Boolean | When true, matches null values. |
Examples:
# Only finished crawls
filter: { status: { eq: Finished } }
# Crawls that are queued or running
filter: { status: { in: [Queued, Running] } }
Array predicates
ConnectionIntArrayFilterInput and ConnectionStringArrayFilterInput -- used for array-valued fields.
| Predicate | Type | Description |
|---|---|---|
| arrayContains | String / Int | Array includes the given element. |
| arrayNotContains | String / Int | Array does not include the given element. |
| arrayContainsLike | String | Array includes an element matching the pattern (string arrays only). |
| arrayNotContainsLike | String | Array does not include an element matching the pattern (string arrays only). |
| isNull | Boolean | When true, matches null values. |
Example:
# Pages that have a specific tag in their tags array
filter: { tags: { arrayContains: "navigation" } }
Combining filters with _and and _or
The _and and _or arrays are special properties that let you write more complex filters. ConnectionFilterInput
objects in the _and array are combined with a logical AND, and those in the _or array are combined with a
logical OR. All root-level conditions, including the _and and _or groups themselves, are always combined with a
logical AND. That means a filter such as this:
{
_and: [
{ sitemapsInCount: { eq: 0 } }
],
_or: [
{ url: { contains: "news" } },
{ url: { contains: "guides" } }
],
hasStructuredData: { eq: false }
}
is equivalent to:
{
_and: [
{ sitemapsInCount: { eq: 0 } },
{
_or: [
{ url: { contains: "news" } },
{ url: { contains: "guides" } }
]
},
{ hasStructuredData: { eq: false } },
]
}
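The equivalence between the two filters above can be checked locally with a small evaluator. This is a sketch under stated assumptions: `matches` and `PREDICATES` are hypothetical Python stand-ins that mimic the semantics described in this section (root-level keys are ANDed, `_and` requires all members, `_or` requires at least one). They are not part of the API, and only a few predicates are modeled.

```python
from typing import Any

# Hypothetical local stand-ins for a few predicates from the tables above.
PREDICATES = {
    "eq": lambda value, arg: value == arg,
    "ne": lambda value, arg: value != arg,
    "contains": lambda value, arg: arg in value,
    "gt": lambda value, arg: value > arg,
}


def matches(record: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Return True if `record` satisfies `flt`; root-level keys are ANDed."""
    for key, condition in flt.items():
        if key == "_and":
            ok = all(matches(record, sub) for sub in condition)
        elif key == "_or":
            ok = any(matches(record, sub) for sub in condition)
        else:
            # Several predicates on one field must all hold.
            ok = all(PREDICATES[name](record[key], arg) for name, arg in condition.items())
        if not ok:
            return False
    return True


url_group = {"_or": [{"url": {"contains": "news"}}, {"url": {"contains": "guides"}}]}

# Root-level form: _and, _or, and a plain field side by side.
flat = {
    "_and": [{"sitemapsInCount": {"eq": 0}}],
    **url_group,
    "hasStructuredData": {"eq": False},
}

# Explicit form: everything wrapped in a single _and.
nested = {
    "_and": [
        {"sitemapsInCount": {"eq": 0}},
        url_group,
        {"hasStructuredData": {"eq": False}},
    ]
}

records = [
    {"url": "https://example.com/news/a", "sitemapsInCount": 0, "hasStructuredData": False},
    {"url": "https://example.com/guides/b", "sitemapsInCount": 2, "hasStructuredData": False},
    {"url": "https://example.com/shop/c", "sitemapsInCount": 0, "hasStructuredData": False},
    {"url": "https://example.com/guides/d", "sitemapsInCount": 0, "hasStructuredData": True},
]

flat_result = [r["url"] for r in records if matches(r, flat)]
nested_result = [r["url"] for r in records if matches(r, nested)]
```

Both forms select the same records from the sample data, which is what the equivalence claims. Only the first record passes all three conditions.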
Nesting _and inside _or (and vice versa)
You can nest logical operators to express arbitrarily complex conditions:
filter: {
_or: [
{
_and: [
{ httpStatusCode: { eq: 200 } },
{ pageSize: { gt: 50000 } }
]
},
{
_and: [
{ httpStatusCode: { eq: 301 } },
{ redirectUrl: { contains: "legacy" } }
]
}
]
}
This returns pages that either respond with 200 and have a pageSize above 50000, or respond with 301 and redirect to a URL containing "legacy".
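When nested filters are assembled in client code, small builder helpers can keep the structure readable. The helpers below (`all_of`, `any_of`, `where`) are hypothetical names introduced for illustration, not part of any client library. The sketch assumes the resulting dict is sent as a JSON variable.

```python
import json


def all_of(*filters: dict) -> dict:
    """Wrap the given filters in an _and group."""
    return {"_and": list(filters)}


def any_of(*filters: dict) -> dict:
    """Wrap the given filters in an _or group."""
    return {"_or": list(filters)}


def where(field: str, **predicates) -> dict:
    """Build a single-field condition, e.g. where("pageSize", gt=50000).

    `in` is a Python keyword, so pass it as where("name", **{"in": [...]}).
    """
    return {field: predicates}


# Same structure as the nested example above.
nested_filter = any_of(
    all_of(where("httpStatusCode", eq=200), where("pageSize", gt=50000)),
    all_of(where("httpStatusCode", eq=301), where("redirectUrl", contains="legacy")),
)

# The plain dict serializes directly for use as a query variable.
encoded = json.dumps({"filter": nested_filter})
```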
Multiple predicates on the same field
You can apply more than one predicate to a single field to create range filters:
filter: {
httpStatusCode: { ge: 400, lt: 500 }
}
This returns all pages with a 4xx client error status code.
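Combining `ge` with `lt` describes a half-open range: the lower bound is included and the upper bound is excluded. The Python sketch below builds such a filter for any status class and checks the boundaries locally. `status_class_filter` and `in_range` are hypothetical helpers written for this illustration, and the local check only mirrors the predicate descriptions in the numeric table.

```python
import operator

# Local equivalents of the four range predicates.
OPS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def status_class_filter(hundreds: int) -> dict:
    """Build a filter for one status class, e.g. 4 -> 400 <= code < 500."""
    return {"httpStatusCode": {"ge": hundreds * 100, "lt": (hundreds + 1) * 100}}


def in_range(value: int, predicates: dict) -> bool:
    """All predicates on the same field must hold (logical AND)."""
    return all(OPS[name](value, bound) for name, bound in predicates.items())


client_errors = status_class_filter(4)
codes = [200, 399, 400, 404, 499, 500]
kept = [code for code in codes if in_range(code, client_errors["httpStatusCode"])]
```

In the sample list, 400 is kept while 500 is not, which is the difference between `lt: 500` and `le: 500`.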
Example -- Getting Projects with a filter
query getAccount($id: ObjectID!) {
getAccount(id: $id) {
projects(
filter: {
_or: [{ name: { eq: "Google" } }, { name: { eq: "Wikipedia" } }]
}
) {
nodes {
name
}
}
}
}
Example -- Complex filtering on crawl URLs
The following example combines _and and _or to return crawl URLs that respond with HTTP 200 and whose URL contains either /blog/ or /news/:
query FilteredCrawlUrls($crawlId: ObjectID!) {
getReportStat(
input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }
) {
crawlUrls(
first: 5
filter: {
_and: [
{ httpStatusCode: { eq: 200 } }
{
_or: [
{ url: { contains: "/blog/" } }
{ url: { contains: "/news/" } }
]
}
]
}
) {
nodes {
url
httpStatusCode
}
totalCount
}
}
}
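A query like the one above is typically sent as a JSON POST body with `query` and `variables` keys. The sketch below prepares such a request using only the Python standard library. The endpoint URL, the bearer-token header, and the sample ID and token values are placeholders assumed for illustration. This document does not specify the API's address or authentication scheme.

```python
import json
import urllib.request

API_URL = "https://api.example.com/graphql"  # hypothetical endpoint

QUERY = """
query FilteredCrawlUrls($crawlId: ObjectID!) {
  getReportStat(input: { crawlId: $crawlId, reportTemplateCode: "all_pages" }) {
    crawlUrls(
      first: 5
      filter: {
        _and: [
          { httpStatusCode: { eq: 200 } }
          { _or: [{ url: { contains: "/blog/" } }, { url: { contains: "/news/" } }] }
        ]
      }
    ) {
      nodes { url httpStatusCode }
      totalCount
    }
  }
}
"""


def build_request(crawl_id: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the filtered query."""
    body = json.dumps({"query": QUERY, "variables": {"crawlId": crawl_id}}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; replace with whatever the API requires.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_request("example-crawl-id", "example-token")

# To execute against a real endpoint:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.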