Solutions · RAG

Ingestion that just works.

The hardest part of RAG isn't the embeddings — it's getting clean text out of the open web. NeuroAPI gives you markdown-first ingestion with batch jobs, scheduled recrawls, and webhooks.

Markdown-first

Boilerplate-stripped, heading-preserved markdown ready to chunk. No HTML cleanup.
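
Because headings survive, chunking can follow the document's own outline. A minimal sketch, assuming ATX-style (`#`) headings in the returned markdown; `chunk_by_heading` is an illustrative helper, not part of NeuroAPI, and it does not special-case `#` lines inside fenced code:

```python
import re

# Matches ATX headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split markdown into one chunk per heading section.

    Each chunk carries the trail of headings above it, so retrieval
    results keep their place in the page outline.
    """
    chunks: list[dict] = []
    path: list[str] = []   # current heading trail, e.g. ["Docs", "Install"]
    body: list[str] = []   # lines collected since the last heading

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append({"headings": list(path), "text": text})
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at this level or deeper, then record the new one.
            del path[level - 1:]
            path.append(match.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks
```

Each chunk's `headings` trail can be stored as metadata next to the embedding.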

Batch crawl

Fan out 1,000 URLs in one async job with concurrency control and smart retries.

Scheduled recrawls

Keep your index fresh. Cron-trigger crawls and ingest only changed pages.
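
How NeuroAPI decides a page has changed is not described here. On the client side, one common way to re-embed only the deltas is to compare content hashes between crawls. A minimal sketch under that assumption; `changed_pages` and the `seen` store are hypothetical names:

```python
import hashlib


def changed_pages(pages: dict[str, str], seen: dict[str, str]) -> list[str]:
    """Return the URLs whose markdown differs from the last crawl.

    `pages` maps URL -> markdown from the current crawl.
    `seen` maps URL -> SHA-256 digest from earlier crawls and is
    updated in place, so it can be persisted between runs.
    """
    changed = []
    for url, markdown in pages.items():
        digest = hashlib.sha256(markdown.encode("utf-8")).hexdigest()
        if seen.get(url) != digest:
            changed.append(url)
            seen[url] = digest
    return changed
```

Only the URLs returned need to be re-chunked and re-embedded.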

Webhooks

Push completed jobs into Pinecone, Weaviate, Supabase, or your own pipeline.
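
The webhook payload format is not specified on this page. As a rough sketch, assuming a JSON body with a `job_id` and a `pages` list of `url` and `markdown` fields (all hypothetical), a receiver could map it to vector-store records like this:

```python
import json


def records_from_webhook(body: bytes) -> list[dict]:
    """Turn a completed-job webhook body into upsert-ready records.

    Assumed (hypothetical) payload shape:
    {"job_id": "...", "pages": [{"url": "...", "markdown": "..."}]}
    """
    event = json.loads(body)
    records = []
    for page in event.get("pages", []):
        records.append({
            "id": page["url"],
            "text": page["markdown"],
            "metadata": {
                "job_id": event.get("job_id"),
                "source_url": page["url"],
            },
        })
    return records
```

The resulting records can then be embedded and upserted with whichever vector-store client you use.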

Structured + unstructured

Mix raw markdown with extracted JSON. Hybrid retrieval ready out of the box.

Citation-ready

Every chunk traces back to a source URL and DOM path. Grounded answers, no guessing.
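
To illustrate what citation-ready means downstream, here is a small sketch of carrying the source URL and DOM path alongside each chunk and rendering them as numbered citations. The `Chunk` fields and the `cite` helper are assumptions for illustration, not NeuroAPI's response schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source_url: str   # page the chunk was extracted from
    dom_path: str     # location within that page, e.g. a CSS-like path


def cite(chunks: list[Chunk]) -> str:
    """Render a numbered citation list for the chunks behind an answer."""
    lines = []
    for number, chunk in enumerate(chunks, start=1):
        lines.append(f"[{number}] {chunk.source_url} ({chunk.dom_path})")
    return "\n".join(lines)
```

Appending this list to a generated answer lets a reader check each claim against its source.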

Reference pipeline

  1. POST /v1/map

    Discover every URL on the target domain.

  2. POST /v1/batch-scrape

    Pass the URL list. Get markdown + metadata in one async job.

  3. Webhook fires on completion

    Stream chunks straight into your vector store.

  4. POST /v1/crawl on a cron

    Detect updates. Re-ingest only the deltas.
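
The request side of this pipeline might look roughly like the sketch below. Only the three endpoint paths come from this page. The base URL, the bearer-token header, and the body fields (`url`, `urls`, `webhook`) are placeholders, so check the API reference for the real names. The functions only build requests; nothing is sent until you pass one to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not the real host


def _post(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def map_request(domain: str, api_key: str) -> urllib.request.Request:
    # Step 1: discover the URLs on the target domain.
    return _post("/v1/map", {"url": domain}, api_key)


def batch_scrape_request(
    urls: list[str], webhook_url: str, api_key: str
) -> urllib.request.Request:
    # Steps 2 and 3: scrape the URL list and notify the webhook when done.
    return _post("/v1/batch-scrape", {"urls": urls, "webhook": webhook_url}, api_key)


def crawl_request(domain: str, api_key: str) -> urllib.request.Request:
    # Step 4: run this from a cron job to pick up changes.
    return _post("/v1/crawl", {"url": domain}, api_key)
```

Keeping request construction separate from sending makes each step easy to inspect and test before any network call happens.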