Sometimes you just want the words out of an ebook — to search them, count them, quote them, or feed them into another tool. This extractor pulls the full text out of an EPUB right in your browser, in proper reading order, with no reader app and no upload.
How it works
An EPUB is a ZIP archive of XHTML chapter files plus a package document. The extractor:
- Unzips the archive, inflating compressed entries with the browser’s native decompression API.
- Reads META-INF/container.xml to find the OPF package, then reads the OPF spine, the list that defines the book’s reading order.
- For each content document in that order, it removes <script> and <style> blocks, converts block-level tags to line breaks, strips the remaining HTML, and decodes character entities.
- Joins the chapters together and reports a word count.
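The per-chapter cleanup and word count described above can be sketched roughly as follows. This is a minimal regex-based illustration, not the extractor's actual code: the names htmlToText, decodeEntities, countWords, and BLOCK_TAGS are hypothetical, the block-tag list and entity table are deliberately small, and a word is assumed to be any whitespace-separated token.

```typescript
// Hypothetical sketch: none of these names come from the extractor itself.

// A handful of named entities; a fuller version would cover the whole HTML table.
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"], ["nbsp", "\u00a0"],
]);

// Decode numeric (&#65; / &#x41;) and a few named entities; leave unknown ones as-is.
function decodeEntities(text: string): string {
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, (match: string, body: string) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    return NAMED_ENTITIES.get(body.toLowerCase()) ?? match;
  });
}

// Assumed set of block-level tags that should become line breaks.
const BLOCK_TAGS = "p|div|h[1-6]|li|tr|blockquote|section|article|br|hr";

function htmlToText(xhtml: string): string {
  const text = xhtml
    // Drop self-closing <script/> and <style/> elements (legal in XHTML).
    .replace(/<(?:script|style)\b[^>]*\/>/gi, "")
    // Drop paired <script> and <style> elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // Source line wrapping is not meaningful, so collapse all whitespace first.
    .replace(/\s+/g, " ")
    // Opening, closing, and self-closing block-level tags become line breaks.
    .replace(new RegExp(`</?(?:${BLOCK_TAGS})\\b[^>]*>`, "gi"), "\n")
    // Strip every remaining tag.
    .replace(/<[^>]+>/g, "");
  return decodeEntities(text)
    .replace(/ *\n */g, "\n") // trim spaces around line breaks
    .replace(/\n+/g, "\n")    // collapse runs of empty lines
    .trim();
}

// Count whitespace-separated tokens.
function countWords(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Join chapters (already in spine order) and report the total word count.
const chapters = ["<h1>One</h1><p>Tom &amp; Jerry</p>", "<p>The   end</p>"];
const book = chapters.map(htmlToText).join("\n");
console.log(book);
console.log(`${countWords(book)} words`);
```

Entities are decoded only after the tags are stripped, so an escaped "&lt;b&gt;" in the prose survives as literal text instead of being mistaken for markup. In a browser, parsing each chapter with DOMParser would be a sturdier choice than regular expressions; the regex form is used here only to keep the sketch self-contained.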
Tips and notes
- The output is plain text: paragraph and heading breaks are kept as newlines, but all styling, fonts, and images are dropped — exactly what you want for analysis or search.
- Use Copy text to paste into a document, or Download .txt to save the whole book as a single file.
- Because the spine drives the order, chapters come out as the author intended rather than in arbitrary file order.
- DRM-free, text-based EPUBs work best. DRM-encrypted books cannot be read, and image-only books contain no text to extract.
- Everything runs locally, so your ebook never leaves your device.
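To illustrate the note on reading order, here is a rough sketch of how the spine, rather than the manifest or file order, can determine the chapter sequence. It assumes the archive is already unzipped and the two XML files are available as strings. The helpers attr, findOpfPath, and spineOrder are hypothetical names, the regex parsing assumes unprefixed tags and simple relative hrefs (no percent-encoding, no "../"), and a real implementation would more likely use an XML parser.

```typescript
// Hypothetical sketch: regex-based parsing under the assumptions stated above.

// Return the value of one attribute from a single tag string, or undefined.
function attr(tag: string, name: string): string | undefined {
  const m = new RegExp(`\\s${name}\\s*=\\s*("([^"]*)"|'([^']*)')`).exec(tag);
  return m ? (m[2] ?? m[3]) : undefined;
}

// META-INF/container.xml names the OPF package in <rootfile full-path="...">.
function findOpfPath(containerXml: string): string | undefined {
  const rootfile = /<rootfile\b[^>]*>/i.exec(containerXml);
  return rootfile ? attr(rootfile[0], "full-path") : undefined;
}

// Resolve the spine's <itemref idref> entries against the manifest's <item id href>,
// returning archive paths in reading order.
function spineOrder(opfXml: string, opfPath: string): string[] {
  const hrefById = new Map<string, string>();
  for (const tag of opfXml.match(/<item\b[^>]*>/gi) ?? []) {
    const id = attr(tag, "id");
    const href = attr(tag, "href");
    if (id !== undefined && href !== undefined) hrefById.set(id, href);
  }
  // Manifest hrefs are relative to the folder that holds the OPF file.
  const baseDir = opfPath.includes("/") ? opfPath.slice(0, opfPath.lastIndexOf("/") + 1) : "";
  const paths: string[] = [];
  for (const tag of opfXml.match(/<itemref\b[^>]*>/gi) ?? []) {
    const idref = attr(tag, "idref");
    const href = idref !== undefined ? hrefById.get(idref) : undefined;
    if (href !== undefined) paths.push(baseDir + href);
  }
  return paths;
}

// Example: the manifest lists chapter 2 first, but the spine puts chapter 1 first.
const containerXml =
  '<container><rootfiles><rootfile full-path="OEBPS/content.opf"/></rootfiles></container>';
const opfXml = `<package><manifest>
  <item id="c2" href="text/ch2.xhtml" media-type="application/xhtml+xml"/>
  <item id="c1" href="text/ch1.xhtml" media-type="application/xhtml+xml"/>
</manifest><spine><itemref idref="c1"/><itemref idref="c2"/></spine></package>`;

const opfPath = findOpfPath(containerXml) ?? "";
console.log(spineOrder(opfXml, opfPath));
```

Sorting the archive's file names would usually give a similar result, but only by convention; the spine is the one place where the EPUB states the reading order explicitly.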