Parse documents
Upload a PDF, Word, Excel, HTML, ODT or RTF file and get back clean, agent-ready content — no URL required. Great for private documents, contracts, reports, and user-uploaded files.
POST
/api/v1/parseLimits & pricing
Maximum file size is 50 MB. Billed at 1 credit base + 1 credit per pageactually processed. A single-page PDF costs 2 credits; a 20-page PDF costs 21.
Request — multipart/form-data (recommended)
| Field | Type | Description |
|---|---|---|
| filerequired | binary | The document to parse. Up to 50 MB. |
| options | string (JSON) | Optional JSON string with the same shape as the JSON request body below (formats, pdf, onlyMainContent, timeout). |
Request — application/json
Provide either url (we fetch the document server-side) or fileBase64 + filename.
| Field | Type | Description |
|---|---|---|
| url | string | Absolute URL to a document. We fetch it and infer the filename from the path. Use this OR fileBase64. |
| fileBase64 | string | Base64-encoded file bytes. Use this OR url. |
| filename | string | Original filename — required with fileBase64; inferred from url otherwise. |
| contentType | string | Optional MIME type override (e.g. "application/pdf"). |
| options.formats | string[] | Any of "markdown", "html", "rawHtml", "json". Default: ["markdown"] |
| options.onlyMainContent | boolean | Strip headers/footers from each page. Default: true |
| options.pdf.mode | "fast" | "auto" | "ocr" | Parsing strategy for PDFs. Use "ocr" for scanned documents. Default: "auto" |
| options.pdf.maxPages | number | Cap the number of pages parsed (and credits charged). |
| options.timeout | number | Hard request timeout in ms. Default: 60000 |
Example — upload a PDF
bash
curl https://neuroapi.me/api/v1/parse \ -H "Authorization: Bearer $NEUROAPI_KEY" \ -F "file=@./contract.pdf" \ -F 'options={"formats":["markdown"],"pdf":{"mode":"auto"}}'Example — parse a remote URL
bash
curl https://neuroapi.me/api/v1/parse \ -H "Authorization: Bearer $NEUROAPI_KEY" \ -H "Content-Type: application/json" \ -d '{ "url": "https://example.com/whitepaper.pdf", "options": { "formats": ["markdown"] } }'Example — JSON / base64
bash
curl https://neuroapi.me/api/v1/parse \ -H "Authorization: Bearer $NEUROAPI_KEY" \ -H "Content-Type: application/json" \ -d '{ "fileBase64": "JVBERi0xLjQK...", "filename": "contract.pdf", "options": { "formats": ["markdown"], "pdf": { "mode": "auto" } } }'Example response
json
{ "success": true, "data": { "markdown": "# Mutual NDA\n\nThis Mutual Non-Disclosure Agreement...", "metadata": { "filename": "contract.pdf", "contentType": "application/pdf", "pages": 4 } }, "creditsCharged": 5}Tips
- Use
pdf.mode: "ocr"for scanned PDFs without a text layer. - Set
pdf.maxPagesto cap cost on very large documents. - For HTML files,
onlyMainContent: truebehaves the same way as/scrape.