v2 Stateful API
The v2 API introduces a stateful layer on top of the existing extraction engine — upload files once, save configs, and reference them across calls using `file://` and `config://` URIs.
- File management — upload a file with `POST /api/v2/files` and receive a `file://<uuid>` URI. Use that URI in any subsequent parse, extract, batch, or classify call instead of re-uploading the file.
- Config management — save a named extraction config with `POST /api/v2/configs` and reference it as `config://<uuid>`. Supports parse, extract, and classify config types so you can reuse the same schema or prompt across requests without repeating parameters.
- Parse endpoint — `POST /api/v2/parse` converts documents to markdown or HTML. Accepts a file upload, a URL, or a `file://` URI.
- Extract endpoint — `POST /api/v2/extract` runs structured extraction and returns JSON or CSV. Accepts a file upload, URL, or `file://` URI plus an inline config or `config://` URI.
- Classify endpoint — `POST /api/v2/classify` classifies documents by type with optional split mode that identifies page-level document boundaries within a single multi-page file.
- Batch endpoint — `POST /api/v2/batch` accepts a list of items, each with its own input and config, and returns a per-item result array in one response.
Example: Upload then extract
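The example for this entry did not survive, so what follows is only a minimal sketch of the upload-then-extract flow. The endpoint paths come from the list above. The host, the bearer-token header, the `file` form field, and the `input` and `config` body keys are assumptions made for illustration, not confirmed request shapes. The requests are built but not sent.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://api.example.invalid"  # hypothetical host, replace with the real one


def build_upload_request(filename: str, content: bytes, api_key: str) -> urllib.request.Request:
    """Build a multipart POST to /api/v2/files (form field name 'file' is assumed)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v2/files",
        data=head + content + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # auth scheme is assumed
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def is_ref(value: str, scheme: str) -> bool:
    """Return True if value has the form '<scheme>://<uuid>'."""
    prefix = scheme + "://"
    if not value.startswith(prefix):
        return False
    try:
        uuid.UUID(value[len(prefix):])
    except ValueError:
        return False
    return True


def build_extract_request(file_uri: str, config_uri: str, api_key: str) -> urllib.request.Request:
    """Build a POST to /api/v2/extract that references stored resources by URI."""
    if not is_ref(file_uri, "file") or not is_ref(config_uri, "config"):
        raise ValueError("expected file://<uuid> and config://<uuid> references")
    body = json.dumps({"input": file_uri, "config": config_uri})  # key names are assumed
    return urllib.request.Request(
        BASE_URL + "/api/v2/extract",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    )
```

In a real call, the URI returned by the upload response would be passed to `build_extract_request`, and each request would be sent with `urllib.request.urlopen`.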
OCR 3 Engine for JSON Extraction
JSON extraction now uses the NanoNets OCR 3 model, improving accuracy for structured field extraction on dense or complex documents. The upgraded engine is applied automatically for both single-page and multi-page flows — no change to your API calls required.
Automatic Image Unwarping
Documents with physically curved, skewed, or warped pages are now automatically corrected before extraction. The unwarping step runs prior to OCR and significantly improves accuracy on scanned books, photographed receipts, and other non-flat documents.
AWS Marketplace
Nanonets Extraction API is now available as a SaaS subscription through AWS Marketplace. Marketplace customers can subscribe directly from the AWS console and have usage metered and billed through their existing AWS account.
- Consolidated billing — charges appear on your AWS invoice alongside other Marketplace subscriptions.
- Existing Stripe plans unaffected — current Stripe subscribers continue on their existing plan; Marketplace billing is a separate path.
- Same API, same keys — Marketplace accounts use the same API surface and credentials as direct-signup accounts.
New File Format Support
The following file types can now be submitted to all extraction endpoints:

| Category | New formats |
|---|---|
| Images | PSD, PCX, PPM, APNG, CUR, DCX |
| Documents | DOTX, WPD |
| Spreadsheets | XLSM, XLTX, XLTM, QPW |
MCP Access and Reusable Extraction Workflows
Since the February 20 updates, Nanonets has added several new ways to connect documents and reuse extraction workflows without resubmitting the same files.
- Connect AI assistants through the hosted MCP server — use OAuth 2.1 sign-in for Claude and other MCP clients, with built-in document extraction and exploration tools.
- Reuse existing uploads by `record_id` — rerun sync, async, streaming, or batch extraction on previously uploaded files instead of sending the same file again.
- Upload files in one request via `/api/v1/upload/file` — send multipart form data and get a `record_id` back immediately for follow-on extraction calls.
- Embed metadata directly in JSON results — use `metadata_options=unified` to merge bounding boxes and confidence data into the response structure you consume.
Batch File Organization
Files can now be organized into named batches (folders) directly from the Files page.
- Create and delete batches — give each batch a name and optional description.
- Move files into batches — assign any uploaded file to a batch using the batch picker.
- Filter by batch — the Files page filters to show only files in the selected batch, keeping large workspaces organized.
Document Navigation
Multi-page document results now include next/previous page controls:
- Page navigation — step through pages of extraction results without returning to the file list.
- JSON viewer improvements — unified metadata fields (value, confidence score, and bounding box) are displayed inline per field. Table results with many rows are paginated at 50 rows per page.
Larger File Uploads
The file upload limit has been increased to 500 MB for paid plans (free plan remains at 50 MB). The limit is now consistent across the API, MCP proxy, and UI upload paths.
TypeScript SDK
The official TypeScript SDK for the Nanonets Extraction API is now available, with full type coverage for all v1 endpoints.
- Full type coverage — every request and response is typed, including nested objects, enums, and optional fields.
- Sync and async extraction — call `/api/v1/extract/sync`, `/api/v1/extract/async`, `/api/v1/extract/batch`, and `/api/v1/extract/stream` with typed parameters.
- File upload helpers — pass a `File`, `Blob`, or `ReadStream` directly; the SDK handles multipart encoding.
- Documentation — full usage guide and examples available in the TypeScript SDK docs.
Excel Viewer
Extraction results for Excel and spreadsheet files are now displayed in a native viewer directly in the results UI.
- In-browser spreadsheet rendering — rows, columns, and cell values are rendered in a spreadsheet-style grid without downloading the file.
- Multi-sheet navigation — documents with multiple sheets show a tab bar; switch between sheets without leaving the results view.
- Integrated with the results page — the viewer loads automatically when the extracted file is a spreadsheet format.
Excel Extraction Pipeline
Excel files are now processed through a dedicated extraction pipeline that preserves spreadsheet structure in the output.
- Sheet-aware extraction — each sheet is extracted separately and included in the structured response.
- Preprocessing step — spreadsheet data is normalized before extraction to improve field and table accuracy.
- Compatible with existing options — `output_format`, `json_options`, and `metadata_options` all work with Excel inputs.
Python SDK
The official Python SDK for the Nanonets Extraction API is now available, with support for both sync and async usage patterns.
- Sync and async clients — use `NanonetsClient` for synchronous calls or `AsyncNanonetsClient` with `asyncio`.
- Typed models — all request and response objects are Pydantic models with IDE autocompletion.
- File upload support — pass a file path, bytes, or file-like object directly to extraction methods.
- Documentation — full usage guide and examples available in the Python SDK docs.
AI-Powered Schema Generation from Natural Language
You can now generate JSON extraction schemas directly from natural language descriptions in the schema builder interface.
- Natural language input — describe what you want to extract in plain English instead of manually building complex JSON schemas.
- One-click generation — click the new schema generation icon in the schema builder and provide a text prompt to instantly generate a valid JSON schema.
- Automatic validation — generated schemas are validated and can be edited using the visual schema builder or kept as raw JSON.
- Smart parsing — complex schemas with nested objects and arrays are automatically detected and handled appropriately.
How to Use
- Navigate to the Configuration page in your extraction workflow
- In the JSON Options schema builder section, click the new Generate Schema from Prompt icon (layers icon) next to the Paste Schema button
- Enter a description of your desired schema, for example: “I want to extract invoice information including invoice number, date, total amount, customer name, and a list of line items with product name, quantity, and price.”
- Click Generate Schema and the AI will create a structured JSON schema matching your requirements
- The generated schema automatically populates the schema builder where you can review, edit, or use it directly
Example Input
Generated Schema
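The example input and generated schema were not preserved in this entry. As an illustration only, the sketch below shows the kind of JSON Schema that the invoice description quoted in the steps above might produce. The field names and types are assumptions, not actual generator output, and `leaf_paths` is a hypothetical helper for reviewing which fields a schema would extract.

```python
import json
from typing import List

# Hypothetical schema for the invoice prompt; names and types are assumed.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "date": {"type": "string"},
        "total_amount": {"type": "number"},
        "customer_name": {"type": "string"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "quantity": {"type": "number"},
                    "price": {"type": "number"},
                },
            },
        },
    },
}


def leaf_paths(schema: dict, prefix: str = "") -> List[str]:
    """List dotted paths of all scalar fields; array items are marked with '[]'."""
    kind = schema.get("type")
    if kind == "object":
        paths: List[str] = []
        for name, sub in schema.get("properties", {}).items():
            child = f"{prefix}.{name}" if prefix else name
            paths.extend(leaf_paths(sub, child))
        return paths
    if kind == "array":
        return leaf_paths(schema.get("items", {}), prefix + "[]")
    return [prefix]


print(json.dumps(invoice_schema, indent=2))
print(leaf_paths(invoice_schema))
```

Listing the leaf paths is a quick way to confirm that nested objects and arrays came out as intended before saving the schema.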
Webhooks
You can now configure a webhook URL to receive the full extraction result automatically when async file processing completes — no polling required.
- Complete data delivery — the webhook POST payload contains the same `result` structure as the `GET /v1/extract/results/{record_id}` API, including all requested formats (markdown, json, csv, html) with metadata.
- New Webhooks page — configure and manage your webhook URL from the dedicated Webhooks page in the sidebar.
- Request context included — the payload includes `filename`, `output_format`, `pages_processed`, `processing_time`, `created_at`, and the original `request_config`.
Example Webhook Payload
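The example payload is missing from this entry. The sketch below shows one way a receiver might check an incoming body for the fields listed above before using it. The sample values and the flat layout (context fields sitting next to `result`) are assumptions for illustration, not a captured payload.

```python
import json

# Fields named in the list above; `result` carries the extraction output.
REQUIRED_FIELDS = (
    "result", "filename", "output_format", "pages_processed",
    "processing_time", "created_at", "request_config",
)


def parse_webhook(body: bytes) -> dict:
    """Decode a webhook POST body and check that the expected fields are present."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("webhook payload must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError("webhook payload is missing: " + ", ".join(missing))
    return payload


# Illustrative payload only; every value below is made up.
sample_body = json.dumps({
    "result": {"markdown": "# Invoice"},
    "filename": "invoice.pdf",
    "output_format": "markdown",
    "pages_processed": 1,
    "processing_time": 2.4,
    "created_at": "2025-01-01T00:00:00Z",
    "request_config": {"output_format": "markdown"},
}).encode("utf-8")

payload = parse_webhook(sample_body)
print(payload["filename"], payload["pages_processed"])
```

Rejecting incomplete payloads early keeps downstream code from failing on a missing key halfway through processing.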
Document Classification API
You can now classify documents with dedicated endpoints for single-file and high-volume workflows.
- New `/api/v1/classify/sync` endpoint — classify a single document and get category, document type, and explanation in one response.
- New `/api/v1/classify/batch` endpoint — classify up to 50 files per request for higher-throughput document triage.
- Routing-ready output — classification labels make it easier to route documents to downstream extraction or review flows.
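Because the batch endpoint takes at most 50 files per request, a larger set has to be split on the client side. A minimal sketch of that split follows; the helper name `batches` is hypothetical and the sketch does not send any requests.

```python
from typing import Iterator, List, Sequence

MAX_FILES_PER_BATCH = 50  # per-request limit stated above


def batches(paths: Sequence[str], size: int = MAX_FILES_PER_BATCH) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` paths, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


files = [f"doc_{i}.pdf" for i in range(120)]
print([len(group) for group in batches(files)])  # [50, 50, 20]
```

Each yielded group can then be submitted as one `/api/v1/classify/batch` request.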
Bounding Boxes in JSON Responses
Bounding boxes are now supported in JSON extraction responses (`output_format=json`), for both single-page images and multi-page PDFs.
- Per-field bounding boxes — each extracted field in the JSON output includes a bounding box mapping back to its location in the source document.
- Multi-page support — page-aware resolution of bounding boxes.
Example Request
Example Response
Word-Level Bounding Boxes
New word-level bounding box extraction for precise per-word coordinate mapping:
- Block-level (`include_metadata=bounding_boxes`) — Existing behavior. Returns one bounding box per paragraph or region detected by layout analysis. Ideal for highlighting sections, table rows, or paragraphs.
- Word-level (`include_metadata=bounding_boxes_word`) — New. Returns one bounding box per individual word using advanced OCR. Enables fine-grained annotation, search-hit highlighting, and word-by-word document overlays.
Each word-level box carries a reference (`markdown_line`, `word_offset`) that maps the word back to its position in the original markdown, preserving full markdown rendering (tables, headings, formatting) while enabling individual word highlighting.
Response Structure
Both block and word-level bounding box responses follow the same structure.
API Playground UI
New interactive web playground for testing document extraction:
- Output format selection - Choose from Markdown, JSON, CSV/Excel, or HTML output
- Schema Builder - Visual JSON schema editor with support for nested objects, arrays, and enums up to 10 levels deep
- Field List mode - Quick extraction with simple field name arrays
- Metadata options - Enable confidence scores and bounding boxes per field
Streaming Extraction Endpoint
New `/v1/extract/stream` endpoint for real-time extraction via Server-Sent Events (SSE):
- Streaming mode - Content delivered in small chunks as it’s generated
- Batch mode - Content sent all at once when extraction completes
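On the client side, an SSE stream is read as text lines, with each event ended by a blank line. The sketch below parses `data:` fields following the standard SSE framing rules. It treats each event's data as opaque text, since the changelog does not describe the event payload; `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data of each SSE event from an iterable of text lines.

    Multiple `data:` lines in one event are joined with newlines. An event
    that is not terminated by a blank line is dropped, as the SSE format
    specifies.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            buffer.append(value)


stream = ["data: Hello\n", "\n", ": keep-alive\n", "data: wor\n", "data: ld\n", "\n"]
print(list(iter_sse_data(stream)))  # ['Hello', 'wor\nld']
```

In streaming mode the generator would yield many small chunks; in batch mode it would yield the full content once.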
v1 Extraction API
New `/v1/extract` endpoints with a cleaner, more consistent interface:
- Sync extraction (`/v1/extract/sync`) - Process documents synchronously with immediate results
- Async extraction (`/v1/extract/async`) - Queue documents for background processing
- Batch extraction (`/v1/extract/batch`) - Process up to 50 files in a single request
- Fetch result (`/v1/extract/results/<record_id>`) - Fetch the results for a single `record_id`
- Results endpoints (`/v1/extract/results`) - List and retrieve extraction results with pagination
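With async extraction, the client submits a document and later fetches the result by `record_id`. One way to wait for it is a simple polling loop, sketched below. The `status` key and its `"processing"` value are hypothetical, as the changelog does not describe the result body. `fetch` stands in for whatever function performs the GET on `/v1/extract/results/<record_id>`.

```python
import time
from typing import Callable


def wait_for_result(
    record_id: str,
    fetch: Callable[[str], dict],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(record_id) until the record is no longer processing.

    Assumes the result body has a `status` key that reads "processing"
    while work is still queued or running (a hypothetical field).
    """
    for attempt in range(max_attempts):
        result = fetch(record_id)
        if result.get("status") != "processing":
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"record {record_id} still processing after {max_attempts} polls")
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any HTTP client and easy to test with stubs.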
Multi-Page JSON Extraction with Confidence Scoring
Enhanced JSON extraction now processes multi-page documents and returns responses based on the best confidence score, improving accuracy for complex documents.
Bounding Box Extraction API
New `/extract-with-bounding-boxes` endpoint that returns extracted data with precise coordinate information for each field, enabling document annotation and validation workflows.
Response Dimensions
API responses now include dimension metadata (width/height) for processed documents, useful for coordinate calculations and rendering.
Streaming & Partial Results
- New streaming extraction endpoint at `/v1/extract/stream`
- Partial results API to retrieve in-progress extractions
- Improved delimiter handling for chunked responses
Billing & Usage APIs
- Credit usage reporting integration with Stripe
- Subscription status tracking
- Document processing limits per plan
On-Premise License APIs
New license management APIs for enterprise on-premise deployments, including activation and validation endpoints.
Repetition Detection & Retry
Improved extraction reliability with automatic retry when repetition patterns are detected in model outputs.
Excel & DOCX Processing
Fixed file processing for Excel spreadsheets and Word documents with improved error handling.
OpenAI-Compatible Chat Completions API
Full OpenAI-compatible `/v1/chat/completions` endpoint supporting:
- PDF and document uploads directly in requests
- All major file types (PDF, Excel, Word, images)
- Drop-in replacement for OpenAI SDK integrations
Hierarchy Extraction API
New API for extracting document hierarchies and structure, including parent-child relationships and table of contents with linked IDs.
Custom Prompt Instructions
Support for custom prompt instructions in markdown format, allowing fine-tuned extraction behavior for specific use cases.
Expanded File Type Support
Extended support for additional file formats in chat completions:
- PDF documents
- Excel spreadsheets (.xlsx, .xls)
- Word documents (.docx)
- All major image formats
OpenAI SDK Upgrade
Updated to latest OpenAI SDK version for improved compatibility and performance.
Coming Soon
Batch Processing
Process multiple documents in a single API call with consolidated results.
Webhook Retries
Automatic retry with exponential backoff for failed webhook deliveries.
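For readers unfamiliar with the term, exponential backoff means the wait between retries grows by a constant factor after each failure, usually up to a cap. The sketch below computes such a schedule. The base delay, factor, and cap are illustrative values, not the schedule this feature will use.

```python
from typing import List


def backoff_delays(retries: int, base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> List[float]:
    """Return the wait in seconds before each retry: base * factor**n, limited to cap."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
print(backoff_delays(8))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

A webhook receiver should still respond quickly and handle duplicate deliveries, since a retried delivery may arrive after the first attempt was partly processed.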