The TrueParser API is a REST-based service that handles document ingestion, status tracking, and result retrieval. All protected endpoints require a Bearer JWT issued by the TrueParser Dashboard.

Base URL

The TrueParser API uses a single production endpoint for all traffic.

| Environment | URL |
| --- | --- |
| Production | https://api.trueparser.com |

Authentication

Authentication is handled via the Authorization: Bearer <JWT> header.
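
As a minimal sketch of the header format described above, the helper below builds (but does not send) a request that carries the Bearer JWT. The function name `authorized_request` and the placeholder token are illustrative assumptions; only the header format and the base URL come from this page.

```python
import urllib.request

BASE_URL = "https://api.trueparser.com"


def authorized_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request with the Authorization: Bearer <JWT> header attached."""
    if not token:
        raise ValueError("a JWT issued by the TrueParser Dashboard is required")
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


# "<JWT>" is a placeholder; substitute a real token before sending.
req = authorized_request("/api/v1/documents/doc-1/status", token="<JWT>")
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the header logic easy to check in isolation.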

Public Endpoints

The following endpoints do not require a security token:
  • GET /health/live
  • GET /health/ready
  • GET /health/node-state
  • GET /openapi/v1.json
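
A client wrapper may want to skip token handling for these routes. The sketch below encodes the list above as a lookup; the helper name `requires_token` is hypothetical, and treating every unlisted route as protected is an assumption based on the statement that all protected endpoints need a JWT.

```python
# The four public routes listed above, keyed by (method, path).
PUBLIC_ENDPOINTS = frozenset({
    ("GET", "/health/live"),
    ("GET", "/health/ready"),
    ("GET", "/health/node-state"),
    ("GET", "/openapi/v1.json"),
})


def requires_token(method: str, path: str) -> bool:
    """Return True unless the route is one of the listed public endpoints."""
    return (method.upper(), path) not in PUBLIC_ENDPOINTS
```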

Document Parsing

Ingest Document

POST /api/v1/documents/upload

Submit a document for asynchronous parsing. This endpoint accepts the file as the raw HTTP request body.

Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| fileName | Query String | No | File name used for extension/type detection. |
| documentId | String | No | A custom identifier. If provided, used for idempotency/overwrites. |
| documentType | String | No* | The format (e.g., Pdf, ShpZip, MapInfo, FileGdb). Required for ZIP uploads. |
| csvRoute | String | No | Required for CSV. Must be Spatial or Tabular. |
| sqlDialect | String | No | Required for SQL files (e.g., PostgreSql, MsSql, Snowflake). |
| pdfMode | String | No | Required for PDF. Valid values: SingleColumn, MultiColumn, Ocr, Advanced. |
| parquetMode | String | No | Optional for Parquet. Valid values: MetadataOnly, MetadataPlusRows. Default is MetadataOnly. |
| customMetadataJson | String (JSON) | No | JSON object encoded as a query value; attached to the job. |
| chunkHints | Boolean | No | Optional chunking hint for supported Office/OpenDocument/PDF flows. |
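
Several fields in the table are only required for particular file types, so a client-side check before uploading can catch mistakes early. The sketch below is one way to do that. The extension-to-field mapping and the function name `validate_upload_params` are assumptions for illustration; the API itself may detect types differently. Fields whose valid values are given only as examples (documentType, sqlDialect) are not value-checked.

```python
from pathlib import PurePosixPath

# Assumed mapping from file extension to the field the table marks as required.
REQUIRED_BY_EXTENSION = {
    ".csv": "csvRoute",
    ".sql": "sqlDialect",
    ".pdf": "pdfMode",
    ".zip": "documentType",
}

# Only fields whose full set of valid values is listed in the table.
ALLOWED_VALUES = {
    "csvRoute": {"Spatial", "Tabular"},
    "pdfMode": {"SingleColumn", "MultiColumn", "Ocr", "Advanced"},
    "parquetMode": {"MetadataOnly", "MetadataPlusRows"},
}


def validate_upload_params(file_name: str, params: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the parameters look consistent."""
    problems = []
    ext = PurePosixPath(file_name).suffix.lower()
    required = REQUIRED_BY_EXTENSION.get(ext)
    if required and required not in params:
        problems.append(f"{required} is required for {ext} uploads")
    for name, allowed in ALLOWED_VALUES.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    return problems
```

For example, `validate_upload_params("report.pdf", {})` reports that pdfMode is missing, while a CSV upload with `csvRoute` set to `Spatial` passes.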

Response (202 Accepted)

{
  "documentId": "7b1c0d8c-...",
  "status": "Queued",
  "statusInt": 2
}
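
The following sketch shows how an upload request could be assembled with the Python standard library. It assumes that every field in the parameter table travels in the query string (the table says so explicitly only for fileName and customMetadataJson), that `application/octet-stream` is an acceptable Content-Type for the raw body, and that booleans are written as `true`/`false`. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.trueparser.com"


def build_upload_request(data: bytes, token: str, file_name: str, **params) -> urllib.request.Request:
    """Build a POST with the file as the raw body and options in the query string."""
    query = {"fileName": file_name}
    for name, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # assumed boolean encoding
        elif isinstance(value, dict):
            value = json.dumps(value, separators=(",", ":"))  # e.g. customMetadataJson
        query[name] = value
    url = f"{BASE_URL}/api/v1/documents/upload?{urllib.parse.urlencode(query)}"
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",  # assumption
        },
    )


def send_upload(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body of the 202 Accepted response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

The `documentId` field of the decoded response is the value to use with the status and result endpoints.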

Create Presigned Upload Request

POST /api/v1/documents/upload-request

Creates a tenant-scoped upload session for larger files.

Complete Presigned Upload

POST /api/v1/documents/upload-complete

Finalizes a previously created upload session and enqueues parsing.

Lifecycle & Results

Check Status

GET /api/v1/documents/{documentId}/status

Returns the current processing state of the job.

Status Values

statusInt is the numeric job-state code. status is the human-readable string returned by the API.

| statusInt | status | Meaning |
| --- | --- | --- |
| 0 | job not yet queued | Upload has not been accepted into queue state yet. |
| 1 | received | The upload has been accepted, but queue admission has not completed yet. |
| 2 | queued | Waiting in the orchestration queue. |
| 3 | processing | The engine is actively extracting data. |
| 4 | materializing | Parsed content is being materialized. |
| 5 | retrying | The job is being retried after a transient failure. |
| 6 | waiting-for-provider | The job is waiting for an external provider slot. |
| 7 | provider-processing | Provider-backed extraction is running. |
| 8 | completed | Result is ready for retrieval. |
| 9 | failed | Extraction failed. See the error field in the response. |
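
Client code can mirror the table with an enumeration keyed on statusInt, which avoids comparing status strings. The member names below are illustrative, and treating only completed and failed as terminal states is an inference from the table, not something the API states.

```python
from enum import IntEnum


class JobStatus(IntEnum):
    """Numeric job-state codes from the status table."""
    NOT_QUEUED = 0
    RECEIVED = 1
    QUEUED = 2
    PROCESSING = 3
    MATERIALIZING = 4
    RETRYING = 5
    WAITING_FOR_PROVIDER = 6
    PROVIDER_PROCESSING = 7
    COMPLETED = 8
    FAILED = 9


# Assumption: these are the only states after which polling can stop.
TERMINAL = frozenset({JobStatus.COMPLETED, JobStatus.FAILED})


def is_terminal(status_int: int) -> bool:
    """Return True when the job will not change state again (raises ValueError on unknown codes)."""
    return JobStatus(status_int) in TERMINAL
```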

Response (200 OK)

{
  "status": "completed",
  "statusInt": 8,
  "startedAt": "2024-03-20T10:00:00Z",
  "completedAt": "2024-03-20T10:00:05Z"
}
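
A polling loop over this endpoint might look like the sketch below. It takes the status call as a function argument so the loop stays independent of the HTTP client. The exponential backoff, the interval and timeout defaults, and the name `wait_for_completion` are all assumptions; this page does not prescribe a polling cadence.

```python
import time
from typing import Callable

COMPLETED, FAILED = 8, 9  # statusInt values from the status table


def wait_for_completion(
    fetch_status: Callable[[], dict],
    *,
    interval: float = 1.0,
    max_interval: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until statusInt is completed or failed, backing off between calls."""
    deadline = clock() + timeout
    while True:
        body = fetch_status()
        if body["statusInt"] in (COMPLETED, FAILED):
            return body
        if clock() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(interval)
        interval = min(interval * 2, max_interval)  # double the wait, up to a cap
```

Here `fetch_status` would wrap a GET to the status endpoint and return the decoded JSON body.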

Retrieve Result

GET /api/v1/documents/{documentId}/result

Streams the final, materialized JSON artifact directly.

Response Behaviors

  • 200 OK: Returns the parsed JSON file as application/json.
  • 202 Accepted: Document is still processing. Poll the status endpoint again.
  • 422 Unprocessable Entity: The extraction failed. The body contains the specific error reason.
  • 404 Not Found: The document ID is invalid or the artifact has expired (3-hour default retention).
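
The four behaviors above can be handled with a small dispatch on the HTTP status code, sketched here with the standard library. The state labels (`ready`, `pending`, `failed`, `missing`) and the function names are illustrative. The 422 body is returned as text because its exact shape is not documented on this page.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.trueparser.com"


def interpret_result(http_status: int, body: bytes):
    """Map a result-endpoint response to a (state, payload) pair."""
    if http_status == 200:
        return "ready", json.loads(body)
    if http_status == 202:
        return "pending", None  # still processing; poll the status endpoint again
    if http_status == 422:
        return "failed", body.decode("utf-8", errors="replace")
    if http_status == 404:
        return "missing", None  # invalid ID or expired artifact
    raise RuntimeError(f"unexpected HTTP status {http_status}")


def fetch_result(document_id: str, token: str):
    """Request the result and interpret the response, including error statuses."""
    doc = urllib.parse.quote(document_id, safe="")
    request = urllib.request.Request(
        f"{BASE_URL}/api/v1/documents/{doc}/result",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return interpret_result(response.status, response.read())
    except urllib.error.HTTPError as err:  # urllib raises for 4xx/5xx responses
        return interpret_result(err.code, err.read())
```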

Check Status (Batch)

POST /api/v1/documents/status/batch

Polls status for multiple documents in one call.

Request body:
{
  "documentIds": ["doc-1", "doc-2"]
}
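
A request with this body could be built as in the sketch below. The `application/json` Content-Type and the de-duplication of IDs are assumptions, and the response is left undecoded because its schema is not shown on this page.

```python
import json
import urllib.request

BASE_URL = "https://api.trueparser.com"


def build_batch_status_request(document_ids: list[str], token: str) -> urllib.request.Request:
    """Build a POST whose JSON body lists the document IDs to check."""
    unique_ids = list(dict.fromkeys(document_ids))  # drop repeats, keep order
    payload = json.dumps({"documentIds": unique_ids}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/documents/status/batch",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption
        },
    )
```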

Health & Monitoring

Readiness Probe

GET /health/ready

Checks if the node is ready to accept traffic.
  • Status Checks: Currently verifies Redis connectivity and registry initialization.
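
A readiness check from a deployment script could be sketched as follows. It assumes that a 2xx response means ready and that any other status or a connection failure means not ready; the page does not document the response codes of this probe. The `opener` parameter is a hypothetical hook that exists only to make the function testable without network access.

```python
import urllib.error
import urllib.request


def is_ready(base_url: str = "https://api.trueparser.com",
             timeout: float = 2.0,
             opener=urllib.request.urlopen) -> bool:
    """Return True when GET /health/ready answers with a 2xx status."""
    try:
        with opener(f"{base_url}/health/ready", timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, TimeoutError):
        # HTTPError (non-2xx) is a subclass of URLError, so it lands here too.
        return False
```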

Node State

GET /health/node-state

Returns high-granularity telemetry for the specific node, including CPU, memory, and active request counts. Used primarily for load balancer routing decisions.

Common Error Codes

| Error Code | HTTP | Description |
| --- | --- | --- |
| missing_token | 401 | No Bearer token provided in the Authorization header. |
| token_expired | 401 | The provided JWT has expired. |
| domain_not_allowed | 403 | The request Origin/Referer is not in the App’s allowed domain list. |
| quota_exceeded | 429 | Document unit quota reached for the current billing window. |
| document_too_large | 413 | The file exceeds the maximum allowed size for your plan. |
| extraction_failed | 422 | The parser engine encountered an unrecoverable error. |
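
One way a client might act on these codes is a lookup from error code to a follow-up action, as sketched below. The action labels and the policy behind them are assumptions for illustration, not guidance from the API. How the error code is extracted from a response body is also left out, since the error payload shape is not shown here.

```python
# error code -> (HTTP status from the table, assumed client-side action)
ERROR_ACTIONS = {
    "missing_token": (401, "reauthenticate"),
    "token_expired": (401, "reauthenticate"),
    "domain_not_allowed": (403, "fix_configuration"),
    "quota_exceeded": (429, "wait_for_quota"),
    "document_too_large": (413, "reduce_size_or_change_plan"),
    "extraction_failed": (422, "do_not_retry"),
}


def action_for(error_code: str) -> str:
    """Return the suggested action for a known error code, or 'unknown' otherwise."""
    _http_status, action = ERROR_ACTIONS.get(error_code, (None, "unknown"))
    return action
```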
Last modified on April 28, 2026