A PDF word counter tells you how many words, characters and pages a document contains — useful for translators billing per word, students checking essay length, and legal teams sizing disclosure bundles. Unlike most online counters, this tool never uploads your file: the PDF is decoded entirely in your browser, so confidential contracts and unpublished manuscripts stay on your machine.
How it works
A PDF is not a flat text file. Each page’s text lives inside a content stream — a sequence of drawing operators — and those streams are usually compressed with FlateDecode (zlib/deflate). To count words the tool:
- Reads the file as raw bytes and locates every stream … endstream object.
- Inflates any FlateDecode streams using the browser’s native DecompressionStream.
- Scans the decoded operators for text-showing commands (Tj, TJ, ' and ") and pulls the string literals (the parts between ( and )) and hex strings out of them.
- Joins the strings into page text, inserting spacing where the layout operators imply a gap, then counts words, characters and sentences.
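The scanning step can be illustrated with a short TypeScript sketch. It is not the tool's actual source: the names extractShownText, ESCAPES, DELIMITERS and GAP_THRESHOLD are hypothetical, and the threshold of -200 for treating a TJ offset as a word gap is an assumed value. The sketch expects a content stream that has already been inflated, and it skips octal escapes and font encodings (hex strings in particular often hold glyph codes that need the font's ToUnicode map).

```typescript
// Illustrative sketch only; names and the gap threshold are assumptions.
const TEXT_OPERATORS = new Set<string>(["Tj", "TJ", "'", '"']);
const ESCAPES = new Map<string, string>([
  ["n", "\n"], ["r", "\r"], ["t", "\t"], ["b", "\b"], ["f", "\f"],
]);
const GAP_THRESHOLD = -200; // assumed: TJ offsets below this count as a word gap
const DELIMITERS = "()<>[]/%";

function isSpace(ch: string): boolean {
  return ch === " " || ch === "\n" || ch === "\r" || ch === "\t" || ch === "\f" || ch === "\0";
}

/** Returns one string per text-showing operator found in a decoded content stream. */
function extractShownText(content: string): string[] {
  const shown: string[] = [];
  let pending = "";    // string operands collected since the last operator
  let inArray = false; // true while inside a [ ... ] operand (used by TJ)
  let i = 0;
  while (i < content.length) {
    const ch = content.charAt(i);
    if (ch === "(") {
      // Literal string: parentheses may nest, backslash escapes the next character.
      let depth = 1;
      i++;
      while (i < content.length) {
        const c = content.charAt(i);
        if (c === "\\") {
          const next = content.charAt(i + 1);
          pending += ESCAPES.get(next) ?? next; // octal escapes are not handled here
          i += 2;
          continue;
        }
        if (c === "(") depth++;
        if (c === ")") {
          depth--;
          if (depth === 0) { i++; break; }
        }
        pending += c;
        i++;
      }
    } else if (ch === "<" && content.charAt(i + 1) === "<") {
      i += 2; // dictionary opener, not a hex string
    } else if (ch === "<") {
      // Hex string: pairs of hex digits; an odd trailing digit is padded with 0.
      const end = content.indexOf(">", i);
      if (end === -1) break;
      const hex = content.slice(i + 1, end).replace(/\s+/g, "");
      for (let j = 0; j < hex.length; j += 2) {
        const pair = hex.slice(j, j + 2);
        pending += String.fromCharCode(parseInt(pair.length === 2 ? pair : pair + "0", 16));
      }
      i = end + 1;
    } else if (ch === "[" || ch === "]") {
      inArray = ch === "[";
      i++;
    } else if (ch === "%") {
      // Comment: skip to the end of the line.
      while (i < content.length && content.charAt(i) !== "\n" && content.charAt(i) !== "\r") i++;
    } else if (isSpace(ch) || ch === ">") {
      i++;
    } else {
      // Any other token: a number, a /Name operand, or an operator.
      let j = i + 1;
      while (j < content.length && !isSpace(content.charAt(j)) && DELIMITERS.indexOf(content.charAt(j)) === -1) j++;
      const token = content.slice(i, j);
      i = j;
      if (TEXT_OPERATORS.has(token)) {
        if (pending !== "") shown.push(pending);
        pending = "";
      } else if (!Number.isNaN(Number(token))) {
        if (inArray && Number(token) < GAP_THRESHOLD) pending += " ";
      } else if (!token.startsWith("/")) {
        pending = ""; // a non-text operator: discard any strings that belonged to it
      }
    }
  }
  return shown;
}
```

For example, extractShownText("[(Wor)-20(ld)-300(again)] TJ") returns ["World again"]: the small offset is treated as kerning and the large one as a word gap. Deciding how to join the returned strings (space, newline, or nothing) depends on positioning operators such as Td and T*, which this sketch ignores.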
Words are runs of non-whitespace characters separated by whitespace. Sentences are approximated by counting terminal punctuation (., !, ?). Because PDFs position glyphs rather than store paragraphs, the result is a close estimate and may differ slightly from the count reported by the application that authored the document.
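The counting rules are simple enough to sketch in a few lines of TypeScript. The names countText and TextStats are hypothetical, and two details are assumptions rather than documented behaviour of the tool: characters are counted as Unicode code points including whitespace, and a run of terminal punctuation counts as one sentence end only when followed by whitespace or the end of the text (so "3.14" and "?!" are not over-counted).

```typescript
// Illustrative sketch; names and the exact counting rules are assumptions.
interface TextStats {
  words: number;
  characters: number;
  sentences: number;
}

function countText(text: string): TextStats {
  const trimmed = text.trim();
  // Words: runs of non-whitespace separated by whitespace.
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  // Characters: code points, whitespace included (assumed convention).
  const characters = Array.from(text).length;
  // Sentences: runs of . ! ? followed by whitespace or the end of the text.
  const sentences = (text.match(/[.!?]+(?=\s|$)/g) ?? []).length;
  return { words, characters, sentences };
}
```

Calling countText("Hello world. How are you? Fine!") gives 6 words, 31 characters and 3 sentences. Text without terminal punctuation, such as headings or table cells, contributes words but no sentences, which is one reason the sentence figure is only an approximation.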
Tips and notes
- If the count comes back as zero, the PDF is almost certainly a scan (image-only) with no text layer — there is nothing to count without OCR.
- Encrypted or password-protected PDFs cannot be decoded in the browser; remove the protection first.
- The per-page table is handy for spotting where the bulk of a document’s content sits, for example a long appendix.
- Everything runs locally, so you can safely use it on sensitive or unpublished material.