The TrueParser API is a REST-based service that handles document ingestion, status tracking, and result retrieval. All protected endpoints require a Bearer JWT issued by the TrueParser Dashboard.
Base URL
The TrueParser API uses a single production endpoint for all traffic.
| Environment | URL |
|---|---|
| Production | https://api.trueparser.com |
Authentication
Authentication is handled via the Authorization: Bearer <JWT> header.
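A minimal sketch of attaching the token with Python's standard urllib. The helper name authorized_request and the placeholder token are illustrative and not part of the TrueParser API.

```python
import urllib.request

BASE_URL = "https://api.trueparser.com"


def authorized_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request for a protected endpoint with the Bearer JWT attached."""
    # `path` is an API path such as "/api/v1/documents/<id>/status".
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical usage; "<JWT>" stands in for a token issued by the Dashboard.
req = authorized_request("/api/v1/documents/doc-1/status", "<JWT>")
```

The request is only constructed here; pass it to urllib.request.urlopen to send it.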
Public Endpoints
The following endpoints do not require a security token:
- GET /health/live
- GET /health/ready
- GET /health/node-state
- GET /openapi/v1.json
Document Parsing
Ingest Document
POST /api/v1/documents/upload
Submit a document for asynchronous parsing. This endpoint accepts the file as the raw HTTP request body.
Request Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| fileName | Query String | No | File name used for extension/type detection. |
| documentId | String | No | A custom identifier. If provided, it is used for idempotency/overwrites. |
| documentType | String | No* | The format (e.g., Pdf, ShpZip, MapInfo, FileGdb). Required for ZIP uploads. |
| csvRoute | String | No* | Required for CSV. Must be Spatial or Tabular. |
| sqlDialect | String | No* | Required for SQL files (e.g., PostgreSql, MsSql, Snowflake). |
| pdfMode | String | No* | Required for PDF. Valid values: SingleColumn, MultiColumn, Ocr, Advanced. |
| parquetMode | String | No | Optional for Parquet. Valid values: MetadataOnly, MetadataPlusRows. Default is MetadataOnly. |
| customMetadataJson | String (JSON) | No | JSON object encoded as a query value; attached to the job. |
| chunkHints | Boolean | No | Optional chunking hint for supported Office/OpenDocument/PDF flows. |

\* Conditionally required for the file types named in the description.
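The conditional requirements in the table can be checked client-side before uploading. The sketch below is one way to do that. The mapping from file extension to required parameter is an assumption (the table only names the file types), and validate_upload_params is a hypothetical helper.

```python
from pathlib import PurePosixPath

PDF_MODES = {"SingleColumn", "MultiColumn", "Ocr", "Advanced"}
CSV_ROUTES = {"Spatial", "Tabular"}
PARQUET_MODES = {"MetadataOnly", "MetadataPlusRows"}

# Assumption: which file extension triggers which required parameter.
REQUIRED_BY_EXTENSION = {
    ".zip": "documentType",
    ".csv": "csvRoute",
    ".sql": "sqlDialect",
    ".pdf": "pdfMode",
}
# Parameters whose valid values are enumerated in the table.
ALLOWED_VALUES = {
    "csvRoute": CSV_ROUTES,
    "pdfMode": PDF_MODES,
    "parquetMode": PARQUET_MODES,
}


def validate_upload_params(file_name: str, params: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    ext = PurePosixPath(file_name).suffix.lower()
    required = REQUIRED_BY_EXTENSION.get(ext)
    if required and required not in params:
        problems.append(f"{required} is required for {ext} uploads")
    for name, allowed in ALLOWED_VALUES.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    return problems
```

The server remains the authority on validation; a check like this only catches obvious omissions early.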
Response (202 Accepted)
{
  "documentId": "7b1c0d8c-...",
  "status": "queued",
  "statusInt": 2
}
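A sketch of building the upload call with urllib. It assumes that all parameters travel in the query string (the table states this explicitly only for fileName and customMetadataJson) and that application/octet-stream is an acceptable content type for the raw body. build_upload_request is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.trueparser.com"


def build_upload_request(token, body, file_name, metadata=None, **options):
    """Build (but do not send) a POST to /api/v1/documents/upload."""
    query = {"fileName": file_name, **options}
    if metadata is not None:
        # customMetadataJson is a JSON object encoded as a query value.
        query["customMetadataJson"] = json.dumps(metadata)
    url = f"{BASE_URL}/api/v1/documents/upload?{urllib.parse.urlencode(query)}"
    return urllib.request.Request(
        url,
        data=body,  # the file bytes are sent as the raw request body
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: a generic binary content type is accepted.
            "Content-Type": "application/octet-stream",
        },
    )


# Placeholder token and file bytes, for illustration only.
req = build_upload_request(
    "<JWT>", b"%PDF-1.7 ...", "report.pdf",
    metadata={"source": "demo"}, pdfMode="Ocr",
)
```

Sending the request with urllib.request.urlopen(req) should yield the 202 Accepted body shown above, from which documentId can be read for later status checks.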
Create Presigned Upload Request
POST /api/v1/documents/upload-request
Creates a tenant-scoped upload session for larger files.
Complete Presigned Upload
POST /api/v1/documents/upload-complete
Finalizes a previously created upload session and enqueues parsing.
Lifecycle & Results
Check Status
GET /api/v1/documents/{documentId}/status
Returns the current processing state of the job.
Status Values
statusInt is the numeric job-state code. status is the human-readable string returned by the API.
| statusInt | status | Meaning |
|---|---|---|
| 0 | job not yet queued | Upload has not been accepted into queue state yet. |
| 1 | received | The upload has been accepted, but queue admission has not completed yet. |
| 2 | queued | Waiting in the orchestration queue. |
| 3 | processing | The engine is actively extracting data. |
| 4 | materializing | Parsed content is being materialized. |
| 5 | retrying | The job is being retried after a transient failure. |
| 6 | waiting-for-provider | The job is waiting for an external provider slot. |
| 7 | provider-processing | Provider-backed extraction is running. |
| 8 | completed | Result is ready for retrieval. |
| 9 | failed | Extraction failed. See the error field in the response. |
Response (200 OK)
{
  "status": "completed",
  "statusInt": 8,
  "startedAt": "2024-03-20T10:00:00Z",
  "completedAt": "2024-03-20T10:00:05Z"
}
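Since only statusInt 8 (completed) and 9 (failed) are terminal, a client typically polls until one of them appears. The sketch below takes the status fetch as a callable so it stays independent of any HTTP library. The function name, interval, and attempt limit are illustrative choices, not documented values.

```python
import time
from typing import Callable

COMPLETED, FAILED = 8, 9  # terminal statusInt values from the status table


def wait_for_terminal_status(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until statusInt is 8 or 9, then return that status response."""
    for attempt in range(max_attempts):
        status = fetch_status()
        if status["statusInt"] in (COMPLETED, FAILED):
            return status
        if attempt < max_attempts - 1:
            sleep(interval_s)  # no sleep after the final attempt
    raise TimeoutError(f"job not finished after {max_attempts} polls")


# Demo with canned responses in place of real HTTP calls.
canned = iter([
    {"status": "queued", "statusInt": 2},
    {"status": "processing", "statusInt": 3},
    {"status": "completed", "statusInt": 8},
])
final = wait_for_terminal_status(lambda: next(canned), sleep=lambda _s: None)
```

In real use, fetch_status would issue GET /api/v1/documents/{documentId}/status and return the decoded JSON body.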
Retrieve Result
GET /api/v1/documents/{documentId}/result
Streams the final, materialized JSON artifact directly.
Response Behaviors
- 200 OK: Returns the parsed JSON file as application/json.
- 202 Accepted: Document is still processing. Poll the status endpoint again.
- 422 Unprocessable Entity: The extraction failed. The body contains the specific error reason.
- 404 Not Found: The document ID is invalid or the artifact has expired (3-hour default retention).
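The four response behaviors above can be handled in one place. In this sketch, interpret_result and fetch_result are hypothetical helpers, and the 422 body is returned as raw text because its exact shape is not specified here.

```python
import json
import urllib.error
import urllib.request


def interpret_result(http_status: int, body: bytes) -> tuple[str, object]:
    """Map a result-endpoint response to a (state, payload) pair."""
    if http_status == 200:
        return "ready", json.loads(body)
    if http_status == 202:
        return "pending", None  # still processing; poll status again
    if http_status == 422:
        return "failed", body.decode("utf-8", errors="replace")
    if http_status == 404:
        return "missing", None  # unknown ID or expired artifact
    raise ValueError(f"unexpected HTTP status {http_status}")


def fetch_result(document_id: str, token: str,
                 base_url: str = "https://api.trueparser.com"):
    """Call the result endpoint and classify the response."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/documents/{document_id}/result",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return interpret_result(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the error object still carries the body.
        return interpret_result(err.code, err.read())
```

Keeping the status-code mapping separate from the network call makes the branching easy to unit-test.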
Check Status (Batch)
POST /api/v1/documents/status/batch
Polls status for multiple documents in one call.
Request body:
{
  "documentIds": ["doc-1", "doc-2"]
}
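A sketch of constructing the batch call. The Content-Type header is an assumption based on the JSON body, build_batch_status_request is a hypothetical helper, and the response shape is not documented here, so the example stops at building the request.

```python
import json
import urllib.request


def build_batch_status_request(token, document_ids,
                               base_url="https://api.trueparser.com"):
    """Build (but do not send) a POST to /api/v1/documents/status/batch."""
    payload = json.dumps({"documentIds": list(document_ids)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/documents/status/batch",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the endpoint expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


# Placeholder token and IDs, for illustration only.
req = build_batch_status_request("<JWT>", ["doc-1", "doc-2"])
```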
Health & Monitoring
Readiness Probe
GET /health/ready
Checks if the node is ready to accept traffic.
- Status Checks: Currently verifies Redis connectivity and registry initialization.
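A readiness check can be scripted as below. This sketch assumes that any non-2xx answer or connection failure means "not ready", which is conventional for readiness probes but is not stated here. is_ready and its opener parameter are hypothetical.

```python
import urllib.request


def is_ready(base_url="https://api.trueparser.com", timeout_s=2.0,
             opener=urllib.request.urlopen) -> bool:
    """Return True if GET /health/ready answers with a 2xx status."""
    try:
        with opener(f"{base_url}/health/ready", timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        # Assumption: each of these is treated as "not ready".
        return False
```

The opener parameter exists only so the function can be exercised without network access; the endpoint is public, so no token is sent.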
Node State
GET /health/node-state
Returns fine-grained telemetry for the individual node, including CPU, memory, and active request counts. Used primarily for load balancer routing decisions.
Common Error Codes
| Error Code | HTTP Status | Description |
|---|---|---|
| missing_token | 401 | No Bearer token provided in the Authorization header. |
| token_expired | 401 | The provided JWT has expired. |
| domain_not_allowed | 403 | The request Origin/Referer is not in the App’s allowed domain list. |
| quota_exceeded | 429 | Document unit quota reached for the current billing window. |
| document_too_large | 413 | The file exceeds the maximum allowed size for your plan. |
| extraction_failed | 422 | The parser engine encountered an unrecoverable error. |
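The codes above can be mirrored in client code as a lookup table and a typed exception. How the error code is carried in the response body is not specified here, so this sketch leaves extraction to the caller. TrueParserError and error_for are hypothetical names.

```python
# Error code -> HTTP status, copied from the error code table.
ERROR_HTTP_STATUS = {
    "missing_token": 401,
    "token_expired": 401,
    "domain_not_allowed": 403,
    "quota_exceeded": 429,
    "document_too_large": 413,
    "extraction_failed": 422,
}


class TrueParserError(Exception):
    """Client-side wrapper for a documented API error."""

    def __init__(self, code: str, http_status: int):
        super().__init__(f"{code} (HTTP {http_status})")
        self.code = code
        self.http_status = http_status


def error_for(code: str, http_status: int) -> TrueParserError:
    """Build an exception, checking the pair against the documented table."""
    expected = ERROR_HTTP_STATUS.get(code)
    if expected is not None and expected != http_status:
        raise ValueError(
            f"{code} is documented as HTTP {expected}, got {http_status}"
        )
    return TrueParserError(code, http_status)
```

Unknown codes pass through unchecked, so the helper keeps working if further error codes exist beyond those listed.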