xtract.bot
POST /api/html-to-text

Convert HTML to plain text. Block-level elements become paragraph breaks; lists become indented bullets; links keep their text. Configurable line wrapping.

Strips all HTML markup and returns the readable plain-text content. Block-level elements (paragraphs, headings, list items, table rows) become paragraph breaks; inline elements (spans, anchors, emphasis) collapse to their text content. Options: - `wrap` (default 80): line-wrap long paragraphs to this width. Set to `0` to disable wrapping. - `noLinks` (default false): drop link URLs and keep just the link text. With the default, links are kept as `[text](url)`. - `imgAlt` (default true): show the `alt` attribute as the text for `<img>` tags.

Inputs

NameTypeDefaultDescription
html*stringHTML source.
wordwrapnumber (0…1000)80Hard-wrap column (0 = disabled).
preserveLinksbooleantrueAppend href URLs inline next to link text.
preserveAltTextbooleantrueReplace `<img>` with their alt text.

Response

Modes: json, text. Cache: yes (24h TTL).

Code samples

Built from the article example.


curl -X POST https://api.xtract.bot/api/html-to-text \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "X-Account-Id: $XTRACT_ACCOUNT_ID" \
  -H "X-Api-Key: $XTRACT_API_KEY" \
  -d '{
  "html": "<h1>Hello</h1><p>Visit <a href=\"https://example.com\">our site</a>.</p><ul><li>One</li><li>Two</li></ul>"
}'