OCR full analysis
The "OCR full analysis" service task combines text extraction and structure analysis in a single processing step. It recognizes tables, headings, and layout regions together with the complete running text.
This makes it well suited to documents that contain both text paragraphs and tabular content, such as invoices with line items, delivery notes, or forms with free-text fields.
Input parameters
Provide the following fields as task input:
{
"pdf": {
"referenceId": "...",
"filename": "delivery-note.pdf"
},
"enableHandwriting": true,
"outputFormat": "json"
}
Explanation:
- pdf.referenceId: Reference to the uploaded file (PDF or image). The file must be available as a fileReference.
- pdf.filename: The filename including extension. Supported formats: PDF, JPG, PNG, BMP, TIFF.
- enableHandwriting: Optional (true/false, default: true). Enables handwriting recognition and automatic document de-skewing.
- outputFormat: Optional (json or markdown, default: json). When set to markdown, the result is returned as structured Markdown text.
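If the task input is assembled by a script rather than in the workflow editor, the parameter rules above can be checked before submission. The following Python sketch is only an illustration: build_task_input is a hypothetical helper, not part of the service, and it assumes that only the extensions listed above are accepted (whether variants such as .jpeg or .tif also work is not stated here).

```python
from pathlib import PurePosixPath

# Formats and output modes taken from the parameter list above.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".png", ".bmp", ".tiff"}
OUTPUT_FORMATS = {"json", "markdown"}


def build_task_input(reference_id, filename, enable_handwriting=True, output_format="json"):
    """Build the task input dict (hypothetical helper, not a service API)."""
    extension = PurePosixPath(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(OUTPUT_FORMATS)}")
    return {
        "pdf": {"referenceId": reference_id, "filename": filename},
        "enableHandwriting": enable_handwriting,
        "outputFormat": output_format,
    }


# "ref-123" is a placeholder, not a real reference ID.
task_input = build_task_input("ref-123", "delivery-note.pdf")
```

The defaults mirror the documented ones (enableHandwriting: true, outputFormat: json), so omitting the optional arguments yields the same input as the example above.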
Output
JSON format (default)
With outputFormat: "json", the result includes both text and structural data per page:
{
"metadata": {
"source_file": "delivery-note.pdf",
"total_pages": 1,
"total_text_blocks": 49,
"total_tables": 1
},
"pages": [
{
"page_number": 0,
"text": "Delivery Note No. 7208166\nDate: 2025-03-19",
"text_blocks": [
{
"text": "Delivery Note No. 7208166",
"confidence": 0.98,
"bbox": { "x_min": 50, "y_min": 30, "x_max": 400, "y_max": 60 }
}
],
"tables": [
{
"index": 0,
"html": "<table><tr><td>Item</td><td>Qty</td><td>Product</td></tr><tr><td>1</td><td>50</td><td>Bolt M8</td></tr></table>"
}
]
}
],
"full_text": "Delivery Note No. 7208166\nDate: 2025-03-19\nCustomer: Sample Ltd."
}
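Because tables arrive as HTML strings, a consumer usually has to turn them back into rows. The sketch below does this with Python's standard html.parser module. It assumes simple, non-nested tables like the one in the example, and it treats the first row as the header, which is an assumption about the data rather than something the service guarantees.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell texts of a simple, non-nested <table> as rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only text inside a cell is kept; whitespace between tags is ignored.
        if self._in_cell:
            self.rows[-1][-1] += data


def table_rows(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]


# Table HTML copied from the example output above.
html = (
    "<table><tr><td>Item</td><td>Qty</td><td>Product</td></tr>"
    "<tr><td>1</td><td>50</td><td>Bolt M8</td></tr></table>"
)
header, *lines = table_rows(html)
records = [dict(zip(header, line)) for line in lines]
```

All cell values stay strings; converting quantities or amounts to numbers is left to the caller.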
Markdown format
With outputFormat: "markdown", the result is returned as combined Markdown text:
{
"markdown": "## Delivery Note No. 7208166\n\nDate: 2025-03-19\n\n| Item | Qty | Product |\n|------|-----|---------|\n| 1 | 50 | Bolt M8 |",
"ocr_text": "Delivery Note No. 7208166\nDate: 2025-03-19\nItem Qty Product\n1 50 Bolt M8"
}
Explanation:
- text_blocks: Individual text blocks with position and confidence score, for precise text localization.
- tables: Recognized tables as HTML.
- full_text: The combined text from all pages as plain text. Ideal as input for AI services.
- markdown: Structured Markdown text with tables and paragraphs (only with outputFormat: "markdown").
- ocr_text: Additional plain text independent of the structure analysis (only with outputFormat: "markdown").
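The confidence score in text_blocks can be used to flag passages for manual review. The Python sketch below walks the JSON result and collects blocks under a threshold. The threshold of 0.9 is an arbitrary value chosen for illustration, and the second block in the sample data is made up to show a low-confidence hit.

```python
def low_confidence_blocks(result, threshold=0.9):
    """Return (page_number, text, confidence) for blocks below the threshold."""
    flagged = []
    for page in result.get("pages", []):
        for block in page.get("text_blocks", []):
            if block["confidence"] < threshold:
                flagged.append((page["page_number"], block["text"], block["confidence"]))
    return flagged


# Trimmed-down result in the JSON output shape shown above.
sample = {
    "pages": [
        {
            "page_number": 0,
            "text_blocks": [
                {"text": "Delivery Note No. 7208166", "confidence": 0.98},
                {"text": "Sarnple Ltd.", "confidence": 0.61},  # invented block for illustration
            ],
        }
    ]
}
flagged = low_confidence_blocks(sample)
```

A suitable threshold depends on the document type and scan quality, so it is worth calibrating against a few real results.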
JSONata examples
// Reference a file from an upload step
{
"pdf": {
"referenceId": uploadResult.referenceId,
"filename": uploadResult.filename
},
"enableHandwriting": true,
"outputFormat": "json"
}
// Pass the recognized full text to an AI service
{
"text": ocrResult.full_text,
"prompt": "Extract the invoice number, date, and total amount."
}
// Check whether tables were recognized (gateway condition)
ocrResult.metadata.total_tables > 0
Notes
- Processing time depends on document size and is typically 30–180 seconds, because both text and structure recognition are performed.
- When only the running text is needed (without tables), the "OCR text extraction" service task is faster and more resource-efficient.
- When only the table structure is relevant, the "OCR structure analysis" service task can be used instead.
- All three OCR variants support image files (JPG, PNG, BMP, TIFF) in addition to PDF, so photographed documents can be processed directly.
Tip
The Markdown format (outputFormat: "markdown") works well as input for AI services. The structured text with table formatting enables the AI to recognize items, quantities, and amounts more reliably than with plain text alone.
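When a downstream step has to handle results in either output format, one option is to prefer the Markdown text and fall back to plain text. The Python sketch below shows that selection. text_for_ai is a hypothetical helper, and the fallback order (markdown, then full_text, then ocr_text) is an assumption based on the field descriptions above.

```python
def text_for_ai(result):
    """Pick the richest available text field from an OCR result."""
    for key in ("markdown", "full_text", "ocr_text"):
        text = result.get(key)
        if text:
            return text
    raise ValueError("result contains no usable text field")


# Shortened results in the two output shapes described above.
markdown_result = {
    "markdown": "## Delivery Note No. 7208166\n\nDate: 2025-03-19",
    "ocr_text": "Delivery Note No. 7208166\nDate: 2025-03-19",
}
json_result = {"full_text": "Delivery Note No. 7208166\nDate: 2025-03-19"}

prompt_text = text_for_ai(markdown_result)
```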