Merge PDFs

POST /api/pdf-merge · 2 quota units per call · cache hits free

Merge multiple PDFs into one. Lossless: pages are preserved verbatim with no re-rasterization. Optional title / author metadata for the resulting document.

Concatenates the supplied PDFs in order and returns a single merged document. All pages are preserved verbatim: no re-rasterization, no re-compression, original quality. Optional metadata fields (`title`, `author`, `subject`, `keywords`) populate the resulting document's info dictionary. Encrypted source PDFs are rejected by default; pass `ignoreEncryption: true` to load them anyway. This works for the common case of PDFs that scanners and form software flag as encrypted without setting a password; truly password-protected PDFs still fail.

Inputs

Name	Type	Default	Description
pdfs*	file	—	Array of base64-encoded PDFs to concatenate, in order.
title	string	—	Optional Title metadata for the merged document.
author	string	—	Optional Author metadata for the merged document.
ignoreEncryption	boolean	false	Load PDFs flagged as encrypted anyway. Works for scanner / form-software output that's marked encrypted without a password; truly password-protected PDFs still fail.
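
As a rough illustration of the request body, the sketch below assembles the JSON from a title and a list of files. It assumes `pdfs` is sent as a JSON array of base64 strings, as the table describes. `build_merge_body` is a hypothetical helper name, not part of the API, and it does no JSON escaping of the title.

```shell
# build_merge_body TITLE FILE...
# Prints a JSON body with "title" and a "pdfs" array of base64 strings.
# Assumption: TITLE contains no quotes or backslashes (nothing is escaped).
build_merge_body() {
  title=$1
  shift
  printf '{"title":"%s","pdfs":[' "$title"
  sep=''
  for f in "$@"; do
    # tr removes the line wraps that some base64 implementations insert
    printf '%s"%s"' "$sep" "$(base64 < "$f" | tr -d '\n')"
    sep=','
  done
  printf ']}\n'
}

# Possible usage (not executed here):
#   build_merge_body merged hello.pdf world.pdf | curl ... -d @-
```

The order of the file arguments becomes the order of the array, which is the order in which the pages are concatenated.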

Response

Modes: binary, stream, json. Cache: yes (24h TTL).

Large outputs stream: responses over the 4 MiB cache limit are piped through as they are produced instead of buffered. They carry x-cache-skip: size and are never cached — identical follow-up calls recompute.

Every response carries x-cache (HIT / MISS / BYPASS), x-cache-signature (stable across identical inputs), and the rate-limit headers listed below.
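
One way to inspect these headers from a script is to have curl write them to a file with `-D` and read individual values back. The sketch below is a minimal example of that; `header_value` and the file name `headers.txt` are hypothetical, and only the header names come from this page.

```shell
# header_value NAME FILE
# Prints the value of response header NAME from a header dump written by
# `curl -D FILE`. Names are matched case-insensitively; the first match wins.
header_value() {
  awk -v name="$1" '
    BEGIN { want = tolower(name) ":" }
    tolower(substr($0, 1, length(want))) == want {
      value = substr($0, length(want) + 1)
      gsub(/^[ \t]+|[ \t\r]+$/, "", value)   # trim spaces and the trailing CR
      print value
      exit
    }' "$2"
}

# Possible usage after `curl -D headers.txt -o merged.pdf ...` (not executed here):
#   echo "cache:     $(header_value x-cache headers.txt)"
#   echo "signature: $(header_value x-cache-signature headers.txt)"
#   echo "skipped:   $(header_value x-cache-skip headers.txt)"
```

An empty result means the header was absent, which is what to expect for `x-cache-skip` on responses under the cache limit.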

Quota & limits

Cost per call	2 units against the 10,000-unit monthly quota. Cache hits are free — only cache misses decrement.
Burst limit	200 requests per rolling minute, shared across all tools.
Max request size	`80 MiB` (base64 inflates file payloads ~33% — the limit applies to the decoded bytes).
Max response size	`50 MiB`
Timeout	20s wall clock.
Rate-limit headers	`X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Burst-Limit` on every response; `X-Quota-Warning` once 80% of the monthly quota is spent; `Retry-After` on 429s.
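
A client can check sizes before uploading. The sketch below assumes the 80 MiB limit is compared against the combined decoded size of all PDFs in one request, which is one reading of the table and is not confirmed here. The helper names are hypothetical.

```shell
MAX_REQUEST_BYTES=$((80 * 1024 * 1024))   # 80 MiB, from the table above

# base64_length N: length of the padded base64 encoding of N bytes.
base64_length() {
  echo $(( ($1 + 2) / 3 * 4 ))
}

# total_bytes FILE...: sum of the raw (decoded) file sizes in bytes.
total_bytes() {
  total=0
  for f in "$@"; do
    total=$((total + $(wc -c < "$f")))
  done
  echo "$total"
}

# fits_request_limit FILE...: succeeds when the decoded total is within the limit.
# Assumption: the limit applies to the sum of all files in the request.
fits_request_limit() {
  [ "$(total_bytes "$@")" -le "$MAX_REQUEST_BYTES" ]
}

# Possible usage (not executed here):
#   fits_request_limit hello.pdf world.pdf || echo "too large to merge in one call"
```

`base64_length` shows where the ~33% figure comes from: every 3 input bytes become 4 output characters. On the quota side, at 2 units per cache miss the 10,000-unit monthly quota covers 5,000 uncached merges.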

Errors

Errors are JSON with an error string; schema failures add a details object (Zod's flattened field errors).

Status	When	Body
400	Input fails the tool's schema (wrong type, missing required field, malformed base64) or the file bytes can't be parsed as the expected format.	{ "error": "invalid inputs", "details": { "fieldErrors": { … } } }
401	Missing or invalid X-Api-Key / X-Account-Id headers (or no session when called from the browser).	{ "error": "unauthorized" }
404	Unknown tool id, or calling /api/* on the website hostname instead of api.xtract.bot.	{ "error": "not found" }
413	Request body over the tool's max request size, or an image header declaring more megapixels than the tool's pixel budget.	{ "error": "…exceeds maxRequestBytes…" }
415	Content-Type isn't application/json on the json-body transport.	{ "error": "unsupported content-type; expected application/json" }
429	Monthly quota or the 200/min burst limit exhausted. Check Retry-After and the X-RateLimit-* headers.	{ "error": "monthly quota exceeded", "monthRemaining": 0, … }
5xx	Conversion engine failure. Safe to retry; if it persists, the input is hitting a bug — please report it.	{ "error": "…" }
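
The table suggests a simple client-side policy: retry 5xx, wait on 429, and treat the other 4xx codes as request bugs. The sketch below is one possible way to encode that. The function names and the doubling fallback delay are assumptions for illustration, not documented behavior.

```shell
# classify_status CODE: prints what a client might do next.
classify_status() {
  case "$1" in
    2??) echo done ;;
    429) echo wait ;;    # honor Retry-After before trying again
    5??) echo retry ;;   # engine failure, documented as safe to retry
    *)   echo fail ;;    # 400/401/404/413/415: fix the request instead
  esac
}

# retry_delay RETRY_AFTER ATTEMPT: seconds to sleep before the next attempt.
# Uses Retry-After when it is a plain number of seconds; otherwise falls
# back to 1, 2, 4, ... seconds by attempt number (an assumed default).
retry_delay() {
  case "$1" in
    ''|*[!0-9]*) echo $(( 1 << ($2 - 1) )) ;;
    *)           echo "$1" ;;
  esac
}

# Possible usage (not executed here), with the status taken from
# `curl -w '%{http_code}'` and Retry-After read from the response headers:
#   [ "$(classify_status "$status")" = retry ] && sleep "$(retry_delay "" 2)"
```

A 429 caused by an exhausted monthly quota will not clear after a short wait, so a client may want to check `monthRemaining` in the body before sleeping.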

Code samples

Built from the two-page-pair example.

# Download the example inputs, or substitute your own PDFs:
#   curl -O https://xtract.bot/examples/pdf-merge/hello.pdf
#   curl -O https://xtract.bot/examples/pdf-merge/world.pdf
# base64 -w0 (GNU coreutils) disables line wrapping.
PDFS_0=$(base64 -w0 < hello.pdf)
PDFS_1=$(base64 -w0 < world.pdf)

# The body is read from stdin (-d @-) so the shell can substitute the
# base64 strings; `pdfs` is sent as an array, as the Inputs table describes.
curl -X POST https://api.xtract.bot/api/pdf-merge \
  -H "Content-Type: application/json" \
  -H "Accept: application/octet-stream" \
  -H "X-Account-Id: $XTRACT_ACCOUNT_ID" \
  -H "X-Api-Key: $XTRACT_API_KEY" \
  -o merged.pdf \
  -d @- <<EOF
{
  "title": "merged",
  "pdfs": ["$PDFS_0", "$PDFS_1"]
}
EOF