OCR structure analysis
The "OCR structure analysis" service task automatically recognizes the layout structure of a document — including tables, headings, text areas, and other visual elements. Unlike plain text extraction, this service analyzes the spatial arrangement of the document.
This enables tabular content such as item lists on delivery notes, invoice line items, or structured forms to be captured automatically and processed in subsequent steps.
Input parameters
Provide the following fields as task input:
{
  "pdf": {
    "referenceId": "...",
    "filename": "invoice.pdf"
  },
  "enableHandwriting": true,
  "outputFormat": "json"
}
Explanation:
- pdf.referenceId: Reference to the uploaded file (PDF or image). The file must be available as a fileReference.
- pdf.filename: The filename including extension. Supported formats: PDF, JPG, PNG, BMP, TIFF.
- enableHandwriting: Optional (true/false, default: true). Enables handwriting recognition and automatic document de-skewing.
- outputFormat: Optional (json or markdown, default: json). When set to markdown, the recognized structure is returned as Markdown text, which is particularly useful as input for AI services.
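The input described above can also be assembled programmatically before the task is started. The following Python sketch is only an illustration: the helper name build_ocr_input, the placeholder reference "ref-123", and the exact set of accepted file extensions (for example ".jpeg" and ".tif" as variants of JPG and TIFF) are assumptions, not part of the service.

```python
from pathlib import PurePath

# Extensions derived from the supported formats listed above. Whether the
# service also accepts the ".jpeg" and ".tif" spellings is an assumption.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}
OUTPUT_FORMATS = {"json", "markdown"}


def build_ocr_input(reference_id, filename, enable_handwriting=True, output_format="json"):
    """Return the task input as a dict, rejecting values the documentation does not list."""
    extension = PurePath(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(OUTPUT_FORMATS)}")
    return {
        "pdf": {"referenceId": reference_id, "filename": filename},
        "enableHandwriting": enable_handwriting,
        "outputFormat": output_format,
    }


# "ref-123" is a hypothetical reference ID used only for illustration.
task_input = build_ocr_input("ref-123", "invoice.pdf", output_format="markdown")
```

Validating the extension and output format up front avoids starting a task that may run for a long time with input it cannot process.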
Output
JSON format (default)
With outputFormat: "json", the recognized structure is returned page by page:
{
  "metadata": {
    "source_file": "invoice.pdf",
    "total_pages": 1,
    "total_text_blocks": 0,
    "total_tables": 1
  },
  "pages": [
    {
      "page_number": 0,
      "text": "Item | Qty | Product | Price",
      "tables": [
        {
          "index": 0,
          "html": "<table><tr><td>Item</td><td>Qty</td><td>Product</td><td>Price</td></tr><tr><td>1</td><td>50</td><td>Bolt M8</td><td>€0.12</td></tr></table>"
        }
      ]
    }
  ],
  "full_text": "Item | Qty | Product | Price"
}
Markdown format
With outputFormat: "markdown", the result is returned as Markdown text:
{
  "markdown": "## Invoice\n\n| Item | Qty | Product | Price |\n|------|-----|---------|-------|\n| 1 | 50 | Bolt M8 | €0.12 |",
  "ocr_text": "Invoice\nItem Qty Product Price\n1 50 Bolt M8 €0.12"
}
Explanation:
- tables: Recognized tables as HTML, ready for parsing in subsequent steps.
- markdown: Structured Markdown text preserving tables, headings, and paragraphs.
- ocr_text: Additionally extracted plain text (included in Markdown format) that also captures text areas outside of tables.
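Because each table arrives as an HTML string, a subsequent step can turn it into row records with a few lines of code. The following Python sketch uses only the standard library and the table from the JSON example above. The names TableRowParser and table_to_records are hypothetical, and the sketch assumes simple tables without nested tables or merged cells, with the first row acting as the header.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the cell texts of each <tr> in a simple, non-nested table."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(table_html):
    """Treat the first row as the header and return one dict per data row."""
    parser = TableRowParser()
    parser.feed(table_html)
    parser.close()
    if not parser.rows:
        return []
    header, *data_rows = parser.rows
    return [dict(zip(header, row)) for row in data_rows]


# Table HTML taken from the JSON output example above.
sample_html = (
    "<table><tr><td>Item</td><td>Qty</td><td>Product</td><td>Price</td></tr>"
    "<tr><td>1</td><td>50</td><td>Bolt M8</td><td>€0.12</td></tr></table>"
)
records = table_to_records(sample_html)
print(records)
# prints [{'Item': '1', 'Qty': '50', 'Product': 'Bolt M8', 'Price': '€0.12'}]
```

All values are returned as strings. Converting quantities or prices to numbers is left to the consuming step, since the currency and number formats depend on the document.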
JSONata examples
// Reference a file from a previous step
{
  "pdf": {
    "referenceId": uploadResult.referenceId,
    "filename": uploadResult.filename
  },
  "outputFormat": "markdown"
}
// Check whether tables were recognized on the first page (gateway condition)
$count(ocrResult.pages[0].tables) > 0
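The gateway condition above inspects only pages[0], so a table that first appears on a later page would not be detected. Where the result is processed in code instead, the check can cover every page. The following Python sketch assumes the JSON output structure shown earlier. The helper name count_tables and the two-page sample result are hypothetical and serve only as illustration.

```python
def count_tables(ocr_result):
    """Count recognized tables over every page of a JSON-format result."""
    return sum(len(page.get("tables", [])) for page in ocr_result.get("pages", []))


# Hypothetical two-page result: the only table sits on the second page.
ocr_result = {
    "metadata": {"total_pages": 2, "total_tables": 1},
    "pages": [
        {"page_number": 0, "text": "Cover letter", "tables": []},
        {
            "page_number": 1,
            "text": "Item | Qty",
            "tables": [{"index": 0, "html": "<table></table>"}],
        },
    ],
}

# Equivalent of the first-page-only gateway condition.
first_page_has_tables = len(ocr_result["pages"][0]["tables"]) > 0
# Check across all pages.
any_page_has_tables = count_tables(ocr_result) > 0
```

In this sample the first-page check is false while the all-pages check is true. If metadata.total_tables reliably holds the sum over all pages, which this sketch does not assume, comparing that field against zero would be an even shorter condition.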
Notes
- Processing time varies depending on document size — typically 20–120 seconds, somewhat longer than plain text extraction.
- The Markdown format is particularly well suited as input for AI services such as "AI: Extract structured data" or "AI: Query JSON content".
- Tables are returned as HTML, making programmatic processing straightforward.
Tip
For documents where both the running text and the table structure are needed, the "OCR full analysis" service task is recommended — it combines text extraction and structure analysis in a single pass.