Solutions · RAG
Ingestion that just works.
The hardest part of RAG isn't the embeddings; it's getting clean text out of the open web. NeuroAPI gives you markdown-first ingestion with batch jobs, scheduled recrawls, and webhooks.
Markdown-first
Boilerplate-stripped, heading-preserved markdown ready to chunk. No HTML cleanup.
Batch crawl
Fan out 1,000 URLs in one async job with concurrency control and smart retries.
Scheduled recrawls
Keep your index fresh. Trigger crawls on a cron schedule and ingest only the pages that changed.
Webhooks
Push completed jobs into Pinecone, Weaviate, Supabase, or your own pipeline.
Structured + unstructured
Mix raw markdown with extracted JSON. Hybrid retrieval ready out of the box.
Citation-ready
Every chunk traces back to a source URL and DOM path. Grounded answers, no guessing.
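To make "ready to chunk" and "citation-ready" concrete, here is a minimal sketch of heading-aware chunking that keeps a source URL on every chunk. It is an illustration, not NeuroAPI's own chunker: the `Chunk` type, the `chunk_markdown` function, and the `heading_path` field are hypothetical names, and the heading trail stands in for whatever DOM path the service actually returns.

```python
import re
from dataclasses import dataclass

# Matches ATX-style markdown headings such as "## Setup".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


@dataclass
class Chunk:
    text: str
    source_url: str       # where the text came from, for citations
    heading_path: tuple   # hypothetical stand-in for a DOM path


def chunk_markdown(markdown: str, source_url: str) -> list[Chunk]:
    """Split markdown into one chunk per heading section."""
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # stack of (level, title)
    buf: list[str] = []

    def flush() -> None:
        # Emit the buffered body text under the current heading trail.
        text = "\n".join(buf).strip()
        if text:
            chunks.append(Chunk(text, source_url, tuple(t for _, t in path)))
        buf.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            buf.append(line)
    flush()
    return chunks
```

This sketch does not special-case fenced code blocks, so a `#` comment at the start of a line inside a fence would be read as a heading. A production chunker would also cap chunk size.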
Reference pipeline
1. POST /v1/map: Discover every URL on the target domain.
2. POST /v1/batch-scrape: Pass the URL list. Get markdown + metadata in one async job.
3. Webhook fires on completion: Stream chunks straight into your vector store.
4. POST /v1/crawl on a cron: Detect updates. Re-ingest only the deltas.
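The four steps above could be wired together roughly as follows. Only the endpoint paths come from this page. The base URL, the bearer-token header, every request and response field (`url`, `urls`, `format`, `webhook`, `pages`, `markdown`), and the helper names are assumptions made for illustration, so check the API reference before relying on any of them. The hash-based change check is a client-side stand-in for whatever delta detection the service provides.

```python
import hashlib
import json
import urllib.request
from typing import Callable

BASE_URL = "https://api.example.com"  # placeholder host, not the real endpoint


def http_post(path: str, payload: dict, api_key: str = "YOUR_API_KEY") -> dict:
    """POST JSON and decode the JSON response (auth scheme is assumed)."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def start_ingestion(
    domain: str,
    webhook_url: str,
    post: Callable[[str, dict], dict] = http_post,
) -> dict:
    """Steps 1 and 2: map the domain, then submit one batch scrape job."""
    mapped = post("/v1/map", {"url": domain})  # hypothetical payload shape
    urls = mapped.get("urls", [])
    return post(
        "/v1/batch-scrape",
        {"urls": urls, "format": "markdown", "webhook": webhook_url},
    )


def handle_webhook(event: dict, upsert: Callable[[str, str], None]) -> int:
    """Step 3: push each completed page into a store; returns pages handled."""
    pages = event.get("pages", [])  # hypothetical event shape
    for page in pages:
        upsert(page["url"], page["markdown"])
    return len(pages)


def changed_pages(pages: list[dict], seen_hashes: dict[str, str]) -> list[dict]:
    """Step 4 helper: keep only pages whose markdown hash differs from last run."""
    changed = []
    for page in pages:
        digest = hashlib.sha256(page["markdown"].encode("utf-8")).hexdigest()
        if seen_hashes.get(page["url"]) != digest:
            seen_hashes[page["url"]] = digest
            changed.append(page)
    return changed
```

Passing `post` as a parameter keeps the flow testable without network access. A scheduled job would call `post("/v1/crawl", ...)` and feed the resulting pages through `changed_pages` before upserting.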