
Read PDF metadata

The "Read PDF metadata" service task reads structural information and metadata from a PDF file. Properties such as the page count, title, or author can be extracted, for example for use as a condition in subsequent process steps.


Input parameters

The task expects the following fields:

{
  "pdf": {
    "referenceId": "...",
    "filename": "annual-report.pdf",
    "contentType": "application/pdf"
  }
}

Explanation:

  • pdf: File reference of the PDF whose metadata should be read (required).

Output

The task returns the available metadata as an object.

{
  "pageCount": 12,
  "title": "Annual Report 2025",
  "author": "John Smith",
  "subject": null,
  "creator": null,
  "producer": "pdf-lib",
  "creationDate": "2025-01-15T10:30:00.000Z",
  "modificationDate": null,
  "filename": "annual-report.pdf"
}

Explanation:

  • pageCount: Number of pages in the document.
  • title: Document title (if stored in the PDF).
  • author: Document author (if stored).
  • subject: Document subject (if stored).
  • creator: Application used to create the document (if stored).
  • producer: Software that generated the PDF file (if stored).
  • creationDate: Creation date in ISO 8601 format (if stored).
  • modificationDate: Last modification date in ISO 8601 format (if stored).
  • filename: The filename from the input.

Missing metadata

Not all PDF files contain complete metadata. Fields without a value are returned as null.
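
Because missing fields arrive as null, a later mapping may want a fallback value. The following is a minimal JSONata sketch, assuming the task output is the input context of the expression (as with $.pageCount elsewhere on this page). The output names displayTitle and hasAuthor are hypothetical and chosen only for illustration.

```jsonata
{
  /* Fall back to the filename when no title is stored (title is null). */
  "displayTitle": $.title ? $.title : $.filename,

  /* true only when an author value is present. */
  "hasAuthor": $.author != null
}
```

Note that an empty title string would also trigger the filename fallback, since JSONata treats an empty string as false in a condition.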


JSONata examples

Use the page count as a gateway condition. The input mapping passes the file reference (here taken from $.document) to the task:

{
  "pdf": $.document
}

The result can then be used in a process gateway, e.g. $.pageCount > 10, to control branching.


Notes

  • The task does not modify the PDF file – it only reads metadata.
  • The results can be used in subsequent steps as conditions or for display purposes.

Tip

The page count is especially useful as a gateway condition: for example "If the document has more than 20 pages → compress automatically" or "Forward single-page PDFs directly, send multi-page ones for review first".
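
The two rules from the tip could be written as JSONata conditions roughly as follows. This is a sketch only: it assumes the task output is the input context, and the names compress and needsReview are hypothetical. How a gateway consumes such values (a single boolean expression per branch, or a mapped object like this one) depends on the process configuration.

```jsonata
{
  /* More than 20 pages: compress automatically. */
  "compress": $.pageCount > 20,

  /* Single-page PDFs are forwarded directly; multi-page ones go to review. */
  "needsReview": $.pageCount > 1
}
```

If the gateway expects one expression per branch, the right-hand sides can be used on their own, e.g. $.pageCount > 20.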