HTML sanitize

POST /api/html-sanitize · 1 quota unit per call · cache hits free

Sanitize untrusted HTML: strip `<script>`, event handlers, and dangerous attributes. Whitelist-based — only known-safe elements and attributes survive.

Takes potentially unsafe HTML (from a rich-text editor, an email body, a CMS, etc.) and returns markup that is safe to render. The sanitizer is whitelist-based: only known-safe elements survive (paragraphs, headings, lists, tables, common inline formatting, links, images), and only known-safe attributes are preserved on each. `<script>`, `<style>`, `<iframe>`, `onerror=`, `onclick=`, `javascript:` URLs, and similar constructs are all removed. Use this anywhere you embed user-supplied HTML in your own page.

Inputs

Name	Type	Default	Description
html*	string	—	Untrusted HTML to sanitize.
preset	enum (strict | rich)	"strict"	`strict` (default) for user-generated content; `rich` for article-style content (allows headings, code blocks, tables).
allowedTags	string	—	Optional JSON object mapping tag → array of allowed attributes. When set, replaces the preset's tag list entirely.
stripDisallowed	boolean	false	When true, disallowed tags are dropped entirely (along with their content for `<script>` and `<style>`). When false (default), they are escaped as text.
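
The inputs above can be assembled into a request body roughly as follows. This is a minimal sketch, not an official client: the helper name `build_sanitize_payload` is hypothetical, and because `allowedTags` is typed as a string, the sketch assumes the tag map must be JSON-encoded into a string before it is placed in the body.

```python
import json


def build_sanitize_payload(html, preset="strict", allowed_tags=None,
                           strip_disallowed=False):
    """Build the JSON body for POST /api/html-sanitize (hypothetical helper)."""
    if preset not in ("strict", "rich"):
        raise ValueError("preset must be 'strict' or 'rich'")
    payload = {
        "html": html,
        "preset": preset,
        "stripDisallowed": strip_disallowed,
    }
    if allowed_tags is not None:
        # Assumption: the table types allowedTags as a string, so the
        # tag -> attributes map is serialized to JSON text here.
        payload["allowedTags"] = json.dumps(allowed_tags)
    return payload


# Allow only <p> (no attributes) and <a href>, dropping everything else.
payload = build_sanitize_payload(
    "<p>Hi <a href='https://example.com' onclick='x()'>link</a></p>",
    allowed_tags={"p": [], "a": ["href"]},
    strip_disallowed=True,
)
body = json.dumps(payload)  # text to send as the request body
```

Omitting `allowed_tags` leaves the preset's tag list in effect, which is usually the safer starting point.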

Response

Modes: json. Cache: yes (24h TTL).

Every response carries x-cache (HIT / MISS / BYPASS), x-cache-signature (stable across identical inputs), and the rate-limit headers listed below.

Quota & limits

Cost per call	1 unit against the 10,000-unit monthly quota. Cache hits are free — only cache misses decrement.
Burst limit	200 requests per rolling minute, shared across all tools.
Max request size	`10 MiB` (base64 inflates file payloads ~33% — the limit applies to the decoded bytes).
Max response size	`10 MiB`
Timeout	15s wall clock.
Rate-limit headers	`X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Burst-Limit` on every response; `X-Quota-Warning` once 80% of the monthly quota is spent; `Retry-After` on 429s.
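
The cache and rate-limit headers can be folded into a small status summary on the client side. The sketch below is illustrative only: `summarize_headers` is a hypothetical helper, it assumes the `X-RateLimit-*` values are plain integers (the exact format is not stated here), and it treats the mere presence of `X-Quota-Warning` as the signal.

```python
def summarize_headers(headers):
    """Summarize cache and quota headers from a response (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    h = {k.lower(): v for k, v in headers.items()}

    def as_int(name):
        # Assumption: the header carries a plain integer; return None otherwise.
        try:
            return int(h.get(name))
        except (TypeError, ValueError):
            return None

    return {
        "cache": h.get("x-cache"),                      # HIT / MISS / BYPASS
        "cache_hit": h.get("x-cache", "").upper() == "HIT",
        "limit": as_int("x-ratelimit-limit"),
        "remaining": as_int("x-ratelimit-remaining"),
        "burst_limit": as_int("x-ratelimit-burst-limit"),
        "quota_warning": "x-quota-warning" in h,        # sent once 80% is spent
    }


# Header values below are made up for illustration.
sample = summarize_headers({
    "X-Cache": "HIT",
    "X-RateLimit-Remaining": "1500",
    "X-RateLimit-Burst-Limit": "200",
})
```

Since cache hits are free, a `cache_hit` of true means the call did not decrement the monthly quota.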

Errors

Errors are JSON objects with an `error` string; schema failures add a `details` object (Zod's flattened field errors).

Status	When	Body
400	Input fails the tool's schema (wrong type, missing required field, malformed base64) or the file bytes can't be parsed as the expected format.	{ "error": "invalid inputs", "details": { "fieldErrors": { … } } }
401	Missing or invalid X-Api-Key / X-Account-Id headers (or no session when called from the browser).	{ "error": "unauthorized" }
404	Unknown tool id, or calling /api/* on the website hostname instead of api.xtract.bot.	{ "error": "not found" }
413	Request body over the tool's max request size, or an image header declaring more megapixels than the tool's pixel budget.	{ "error": "…exceeds maxRequestBytes…" }
415	Content-Type isn't application/json on the json-body transport.	{ "error": "unsupported content-type; expected application/json" }
429	Monthly quota or the 200/min burst limit exhausted. Check Retry-After and the X-RateLimit-* headers.	{ "error": "monthly quota exceeded", "monthRemaining": 0, … }
5xx	Conversion engine failure. Safe to retry; if it persists, the input is hitting a bug — please report it.	{ "error": "…" }
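
One way to act on the table above is to retry only 429 and 5xx responses and treat every other error status as final. The following is a sketch under stated assumptions: `retry_delay` is a hypothetical helper, `Retry-After` is assumed to be a number of seconds (the HTTP-date form is not handled), and the exponential backoff parameters are arbitrary choices, not values from this API.

```python
def retry_delay(status, headers, attempt, base=1.0, cap=30.0):
    """Return seconds to wait before retrying, or None if a retry won't help."""
    backoff = min(cap, base * 2 ** attempt)  # arbitrary exponential backoff
    if status == 429:
        h = {k.lower(): v for k, v in headers.items()}
        try:
            # Assumption: Retry-After is given in seconds.
            return max(0.0, float(h.get("retry-after")))
        except (TypeError, ValueError):
            return backoff
    if 500 <= status <= 599:
        # The table describes 5xx failures as safe to retry.
        return backoff
    # 400, 401, 404, 413, 415: the request itself must change first.
    return None
```

A 429 caused by an exhausted monthly quota will not clear on a short retry, so a caller may want to give up when the returned delay is larger than it is willing to wait.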

Code samples

Built from the strip-script example.


curl -X POST https://api.xtract.bot/api/html-sanitize \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "X-Account-Id: $XTRACT_ACCOUNT_ID" \
  -H "X-Api-Key: $XTRACT_API_KEY" \
  -d '{
  "html": "<p>Hello <script>alert(\"xss\")</script> world!</p>",
  "preset": "strict"
}'
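
For callers not using curl, the same request can be built with Python's standard library. This is a rough equivalent rather than an official client: `build_request` is a hypothetical helper, credentials are read from the same environment variables as the curl sample, and the response is left unparsed because its fields are not documented in this section.

```python
import json
import os
import urllib.request

API_URL = "https://api.xtract.bot/api/html-sanitize"


def build_request(html, preset="strict"):
    """Build the POST request from the curl sample (hypothetical helper)."""
    body = json.dumps({"html": html, "preset": preset}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "X-Account-Id": os.environ.get("XTRACT_ACCOUNT_ID", ""),
            "X-Api-Key": os.environ.get("XTRACT_API_KEY", ""),
        },
    )


req = build_request('<p>Hello <script>alert("xss")</script> world!</p>')

# Sending is left commented out so the sketch runs without network access.
# The 15-second timeout mirrors the documented wall-clock limit.
# with urllib.request.urlopen(req, timeout=15) as resp:
#     result = json.load(resp)
```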