Crawl

Asynchronously crawl an entire website and collect every page as clean content.

POST/api/v1/crawl

Start a crawl

FieldTypeDescription
urlrequiredstringRoot URL to start crawling from.
limitnumberMaximum pages to crawl (1–500). Credits are reserved up-front.
Default: 50
includePathsstring[]Glob patterns to include (e.g. ["/docs/*"]).
excludePathsstring[]Glob patterns to exclude.
maxDepthnumberHow many link hops to follow.
Default: 3
scrapeOptionsobjectForwarded to each page scrape — e.g. { formats: ["markdown", "html"], onlyMainContent: true }. Defaults to { formats: ["markdown"] }.
bash
curl https://neuroapi.me/api/v1/crawl \
-H "Authorization: Bearer $NEUROAPI_KEY" \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com/docs","limit":100,"includePaths":["/docs/*"]}'

The API returns a job descriptor immediately. The id is a UUID:

json
{
"success": true,
"data": { "id": "550e8400-e29b-41d4-a716-446655440000", "status": "running" },
"creditsCharged": 100
}

Check crawl status

GET/api/v1/crawl/{id}
bash
curl https://neuroapi.me/api/v1/crawl/550e8400-e29b-41d4-a716-446655440000 \
-H "Authorization: Bearer $NEUROAPI_KEY"
json
{
"success": true,
"data": {
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "done",
"pages_processed": 100,
"credits_charged": 100,
"result": {
"status": "completed",
"data": [
{ "markdown": "...", "metadata": { "sourceURL": "https://example.com/docs/intro" } }
]
}
}
}

Cancel a crawl

DELETE/api/v1/crawl/{id}

Cancel an in-flight crawl. Pages already processed are billed; the remaining reserved credits are refunded to your balance automatically.

bash
curl -X DELETE https://neuroapi.me/api/v1/crawl/550e8400-e29b-41d4-a716-446655440000 \
-H "Authorization: Bearer $NEUROAPI_KEY"
json
{
"success": true,
"data": { "id": "550e8400-e29b-41d4-a716-446655440000", "status": "cancelled", "refunded": 73 }
}
Polling cadence
Poll every 2–5 seconds for small crawls and 10–30 seconds for large ones. We charge nothing extra for status checks.