MsOffice
Use this contract for Microsoft Office and web-text results. It is designed for everyday document intelligence workflows such as knowledge extraction, document QA, editorial review, report ingestion, and RAG across Word, Excel, PowerPoint, and web-text content.
Top-level envelope
Document fields
| Field | Notes |
|---|---|
| `source_file` | Source file name. |
| `format` | Concrete source format. |
| `format_family` | One of `word`, `excel`, `powerpoint`, or `web_text`. |
| `title`, `author`, `subject`, `company` | Document metadata fields. |
| `created_at`, `modified_at` | Timestamps, when available. |
| `page_count`, `sheet_count`, `slide_count` | Family-specific summary counts. |
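As an illustration of how a client might read the family-specific counts, here is a minimal Python sketch. The pairing of each `format_family` value with one count field is an assumption (the table lists the fields but does not state which family uses which), and the `summary_count` helper and the sample values are hypothetical.

```python
from typing import Any, Optional

# Assumed pairing of format_family values with count fields.
# The contract lists the three count fields but does not state this mapping.
COUNT_FIELD_BY_FAMILY = {
    "word": "page_count",
    "excel": "sheet_count",
    "powerpoint": "slide_count",
}


def summary_count(document: dict[str, Any]) -> Optional[int]:
    """Return the family-specific count, or None when it is absent."""
    field = COUNT_FIELD_BY_FAMILY.get(document.get("format_family", ""))
    if field is None:
        # For example web_text, for which no count field is assumed here.
        return None
    return document.get(field)


# Hypothetical document fields, for illustration only.
doc = {
    "source_file": "report.docx",
    "format": "docx",
    "format_family": "word",
    "page_count": 12,
}
print(summary_count(doc))
```

Returning `None` rather than raising keeps the helper usable for families or documents where a count is not reported.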
Universal content/block shape
Every public content record uses the same base shape.
| Field | Notes |
|---|---|
| `id` | Stable record id. |
| `type` | Public record type. |
| `path` | Public structural path. |
| `parent_id` | Parent record id. |
| `depth` | Structural depth. |
| `page_number` | Page, sheet, or slide reference, when applicable. |
| `source_ref` | Provenance object. |
| `is_inferred` | Inference marker. |
| `chunk_hint` | Present only when you request it. |
| `text` | Searchable text projection. |
| `attributes` | Office-specific structured data. |
Record `type` values for Word documents include `word_metadata`, `section`, `heading`, `paragraph`, `list`, `table`, `table_row`, `table_cell`, `image`, `header_footer`, `bookmark`, `footnote`, `endnote`, `toc`, `field`, `hyperlink`, `word_formula`, `word_chart`, `word_smartart`, and `page_break`.
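Because every record carries `id`, `parent_id`, and `depth`, a flat list of records can be turned back into a hierarchy. The Python sketch below shows one way to do that. It assumes that records arrive as dictionaries, that top-level records have a null or missing `parent_id`, and that `depth` starts at 0; none of these details is stated by the contract. The helper names and sample records are hypothetical.

```python
from collections import defaultdict
from typing import Any


def children_by_parent(records: list[dict[str, Any]]) -> dict[Any, list[str]]:
    """Group record ids under their parent_id, preserving input order."""
    children: dict[Any, list[str]] = defaultdict(list)
    for record in records:
        # Assumption: top-level records have parent_id None or omit it.
        children[record.get("parent_id")].append(record["id"])
    return dict(children)


def outline(records: list[dict[str, Any]]) -> list[str]:
    """Render one indented line per record from depth, type, and text."""
    return [
        "  " * record["depth"] + f'{record["type"]}: {record.get("text", "")}'
        for record in records
    ]


# Hypothetical records, for illustration only.
records = [
    {"id": "r1", "type": "section", "parent_id": None, "depth": 0, "text": "Overview"},
    {"id": "r2", "type": "heading", "parent_id": "r1", "depth": 1, "text": "Scope"},
    {"id": "r3", "type": "paragraph", "parent_id": "r1", "depth": 1, "text": "This report covers Q1."},
]

print(children_by_parent(records))
print("\n".join(outline(records)))
```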
Warnings
- `warnings` is always present.
- Use plain strings.
- Keep warnings readable.
- Do not use warnings as a substitute for missing fields.
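A client can check these rules defensively before using the value. The following Python sketch assumes `warnings` is a top-level list on the parsed result and that an empty list means there is nothing to report; the contract only says the field is always present and holds plain strings. `read_warnings` is a hypothetical helper.

```python
from typing import Any


def read_warnings(result: dict[str, Any]) -> list[str]:
    """Return warnings after checking presence and plain-string content."""
    if "warnings" not in result:
        raise ValueError("contract violation: 'warnings' is missing")
    warnings = result["warnings"]
    # Assumption: warnings is a list; the contract only says "plain strings".
    if not isinstance(warnings, list) or not all(isinstance(w, str) for w in warnings):
        raise ValueError("contract violation: 'warnings' must be a list of strings")
    return warnings


# Hypothetical result payloads, for illustration only.
print(read_warnings({"warnings": []}))
print(read_warnings({"warnings": ["Embedded chart could not be read."]}))
```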
What clients can rely on
- Order stays deterministic.
- Structure stays explicit.
- `attributes` holds family-specific details.
- Optional fields may be omitted when they do not apply.
- The public contract does not expose internal transport or worker details.
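Since optional fields may be omitted, client code is safer when it checks for a key instead of indexing it directly. The Python sketch below copies optional fields into a search-index entry only when they are present. Treating `page_number` as optional is an assumption based on its "when applicable" note; `chunk_hint` is present only when requested. The `to_index_entry` helper and the sample record are hypothetical.

```python
from typing import Any

# Fields treated as optional in this sketch. chunk_hint is present only on
# request; treating page_number as optional is an assumption.
OPTIONAL_FIELDS = ("page_number", "chunk_hint")


def to_index_entry(record: dict[str, Any]) -> dict[str, Any]:
    """Build a small index entry, copying optional fields only if present."""
    entry = {
        "id": record["id"],
        "type": record["type"],
        "text": record.get("text", ""),
    }
    for name in OPTIONAL_FIELDS:
        if name in record:
            entry[name] = record[name]
    return entry


# Hypothetical record without chunk_hint, for illustration only.
record = {"id": "r7", "type": "paragraph", "text": "Revenue grew.", "page_number": 3}
print(to_index_entry(record))
```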

