xtract.bot
POST /api/docx-to-text

Extracts the plain text from a Microsoft Word .docx file. The response contains the document's paragraphs in order, with all formatting, styles, and embedded images removed. Useful for indexing documents for full-text search, building a Markdown or LLM-friendly version, or checking the prose word count without the visual noise of the original layout.
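
To make "paragraphs in order, no formatting" concrete, here is a minimal local sketch of the same idea in Python. It is not the service's implementation, only an approximation that relies on the fact that a .docx is a ZIP archive whose main text lives in word/document.xml as w:p (paragraph) and w:t (text run) elements. The function names docx_to_text and make_minimal_docx are hypothetical, and the sketch ignores headers, footnotes, and text boxes.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(docx_bytes: bytes, paragraph_separator: str = "\n\n") -> str:
    """Join the text of every w:p paragraph, in document order."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # Concatenate all text runs inside the paragraph; styles are ignored.
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraph_separator.join(paragraphs)


def make_minimal_docx(paragraphs) -> bytes:
    """Build a tiny in-memory archive holding only word/document.xml.

    Enough for this demo; Word itself needs more parts to open the file.
    """
    body = "".join(
        f"<w:p><w:r><w:t>{escape(p)}</w:t></w:r></w:p>" for p in paragraphs
    )
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("word/document.xml", xml)
    return buf.getvalue()


sample = make_minimal_docx(["First paragraph.", "Second paragraph."])
print(docx_to_text(sample))
```

The second argument plays the role of the paragraph separator: passing "\n" instead of the default would put each paragraph on its own line with no blank line in between.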

Inputs

Name                Type    Default  Description
docx*               file             Input .docx bytes.
paragraphSeparator  string  "\n\n"   String inserted between paragraphs. Defaults to a blank line.
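
A request carrying these two inputs might be built as in the sketch below. Several details are assumptions that this page does not state: the host (https://xtract.bot, inferred from the page title), the use of multipart/form-data with the input names as field names, and the helper name build_request. The sketch only constructs the request; it does not send it, and the shape of the response is not shown because it is not documented here.

```python
import uuid
import urllib.request

BASE_URL = "https://xtract.bot"  # assumption: host inferred from the page title
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def build_request(docx_bytes: bytes, paragraph_separator: str = "\n\n",
                  filename: str = "input.docx") -> urllib.request.Request:
    """Build (but do not send) a POST to /api/docx-to-text.

    Assumes the endpoint accepts multipart/form-data with fields named
    after the documented inputs: docx and paragraphSeparator.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="paragraphSeparator"',
        b"",
        paragraph_separator.encode("utf-8"),
        delimiter,
        f'Content-Disposition: form-data; name="docx"; filename="{filename}"'.encode("utf-8"),
        b"Content-Type: " + DOCX_MIME.encode("ascii"),
        b"",
        docx_bytes,
        delimiter + b"--",
        b"",
    ]
    return urllib.request.Request(
        BASE_URL + "/api/docx-to-text",
        data=b"\r\n".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


request = build_request(b"PK\x03\x04 placeholder bytes", paragraph_separator="\n")
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it, provided the assumptions above match how the service actually accepts input.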

Response

Modes: json, binary. Cache: yes (24h TTL).