How does this count words in a PDF without uploading it?

The PDF is parsed entirely in your browser. The tool reads the file as bytes, inflates the FlateDecode content streams and extracts the text-showing operators, then counts words from the resulting text. No data leaves your device.

Why might the word count differ slightly from Adobe or Word?

PDFs store text as positioned glyphs, not flowing paragraphs, so word boundaries are inferred from spacing operators. Different readers reconstruct spaces and line breaks slightly differently, which can shift the count by a small margin on heavily formatted documents.

Does it work on scanned PDFs?

No. Scanned PDFs are images of pages with no embedded text layer, so there are no characters to count. This tool only counts real, selectable text. Run OCR first if you need to count scanned content.

What counts as a word?

A word is any run of non-whitespace characters separated by spaces, tabs or line breaks. The character count includes spaces; a separate count of characters excluding spaces is also shown.
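As a rough illustration of that definition, the sketch below counts words and characters in TypeScript. The names countText and TextStats are made up for this example and are not the tool's actual code. It assumes that "excluding spaces" means excluding all whitespace, and that characters are counted as UTF-16 code units.

```typescript
// Hypothetical result shape for this example only.
interface TextStats {
  words: number;
  characters: number;          // includes spaces, tabs and line breaks
  charactersNoSpaces: number;  // assumption: all whitespace is excluded
}

function countText(text: string): TextStats {
  // A word is any run of non-whitespace characters.
  const words = (text.match(/\S+/g) || []).length;
  // Assumption: length in UTF-16 code units, so an emoji may count as two.
  const characters = text.length;
  const charactersNoSpaces = text.replace(/\s/g, "").length;
  return { words, characters, charactersNoSpaces };
}

const stats = countText("The quick\tbrown\nfox ");
console.log(stats.words, stats.characters, stats.charactersNoSpaces);
```

Under this definition, punctuation attached to a word does not add to the word count, and a hyphenated term such as "in-browser" counts as one word.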

Is there a file size limit?

There is no hard limit. The whole file is parsed in your browser's memory, though, so very large PDFs (hundreds of pages) may be slow on low-memory devices. Most documents process in well under a second.

What is PDF Word Count?

PDF Word Count is a free in-browser PDF word counter on Gera Tools. It extracts text from every page of a PDF and counts words, characters, sentences and pages, with a per-page breakdown. Nothing is uploaded: the file is parsed entirely on your device.

PDF Word Count — Gera Tools

Name: PDF Word Count
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

A PDF word counter tells you how many words, characters and pages a document contains — useful for translators billing per word, students checking essay length, and legal teams sizing disclosure bundles. Unlike most online counters, this tool never uploads your file: the PDF is decoded entirely in your browser, so confidential contracts and unpublished manuscripts stay on your machine.

How it works

A PDF is not a flat text file. Each page’s text lives inside a content stream — a sequence of drawing operators — and those streams are usually compressed with FlateDecode (zlib/deflate). To count words the tool:

1. Reads the file as raw bytes and locates every stream … endstream object.
2. Inflates any FlateDecode streams using the browser’s native DecompressionStream.
3. Scans the decoded operators for text-showing commands — Tj, TJ, ' and " — and pulls the string literals and hex strings out of them.
4. Joins the strings into page text, inserting spacing where layout operators imply a gap, then counts words, characters, and sentences.
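The scanning and joining steps can be sketched roughly as follows, for a content stream that has already been inflated to a string. This is an illustrative TypeScript sketch, not the tool's actual code: the names extractShownText, readLiteral and readHex, the gapThreshold default of -200, and the choice to start a new line on Td, TD, Tm and T* are all assumptions. It also treats each string byte as a Latin-1 character, which only holds for fonts with simple encodings.

```typescript
type Operand = string | number;

const ESCAPES: Record<string, string> = { n: "\n", r: "\r", t: "\t", b: "\b", f: "\f" };

// Reads a literal string whose "(" is at `open`; returns [decoded text, next index].
function readLiteral(src: string, open: number): [string, number] {
  let depth = 1;
  let text = "";
  let i = open + 1;
  while (i < src.length) {
    const ch = src.charAt(i);
    if (ch === "\\") {
      const octal = /^[0-7]{1,3}/.exec(src.slice(i + 1, i + 4));
      if (octal) {
        text += String.fromCharCode(parseInt(octal[0], 8) & 0xff);
        i += 1 + octal[0].length;
      } else {
        const next = src.charAt(i + 1);
        // A backslash before a line break is a continuation and adds nothing.
        if (next !== "\n" && next !== "\r") text += ESCAPES[next] ?? next;
        i += 2;
      }
    } else if (ch === ")" && --depth === 0) {
      return [text, i + 1];
    } else {
      if (ch === "(") depth++; // balanced parentheses may appear unescaped
      text += ch;
      i++;
    }
  }
  return [text, i];
}

// Reads a hex string whose "<" is at `open`; returns [decoded text, next index].
function readHex(src: string, open: number): [string, number] {
  const close = src.indexOf(">", open);
  const end = close === -1 ? src.length : close;
  let digits = src.slice(open + 1, end).replace(/[^0-9a-fA-F]/g, "");
  if (digits.length % 2 === 1) digits += "0"; // an odd final digit is padded with 0
  let text = "";
  for (let i = 0; i < digits.length; i += 2) {
    text += String.fromCharCode(parseInt(digits.slice(i, i + 2), 16));
  }
  return [text, end + 1];
}

function extractShownText(content: string, gapThreshold = -200): string {
  const delimiter = /[\s()<>\[\]\/%]/;
  let operands: Operand[] = [];
  let out = "";
  // Adds a separator unless the text is empty or already ends in whitespace.
  const gap = (sep: string): void => {
    if (out !== "" && !/\s$/.test(out)) out += sep;
  };
  let i = 0;
  while (i < content.length) {
    const ch = content.charAt(i);
    if (/\s/.test(ch) || ch === "[" || ch === "]" || ch === ">") { i++; continue; }
    if (ch === "%") { // comment: skip to end of line
      while (i < content.length && !/[\r\n]/.test(content.charAt(i))) i++;
      continue;
    }
    if (ch === "(") {
      const [text, next] = readLiteral(content, i);
      operands.push(text); i = next; continue;
    }
    if (ch === "<") {
      if (content.charAt(i + 1) === "<") { i += 2; continue; } // dictionary start
      const [text, next] = readHex(content, i);
      operands.push(text); i = next; continue;
    }
    // Anything else is a name, a number, or an operator.
    let end = i + 1;
    while (end < content.length && !delimiter.test(content.charAt(end))) end++;
    const token = content.slice(i, end);
    i = end;
    if (ch === "/") continue; // names such as /F1 carry no text
    const value = Number(token);
    if (!isNaN(value)) { operands.push(value); continue; }
    // Assumption: any text-positioning operator starts a new line of text.
    if (["'", '"', "T*", "Td", "TD", "Tm"].indexOf(token) !== -1) gap("\n");
    if (token === "Tj" || token === "TJ" || token === "'" || token === '"') {
      for (const operand of operands) {
        if (typeof operand === "string") out += operand;
        // In a TJ array, a large negative adjustment is read as a word gap.
        else if (token === "TJ" && operand < gapThreshold) gap(" ");
      }
    }
    operands = []; // operands belong to the operator that follows them
  }
  return out;
}

const sample = "BT /F1 12 Tf 72 720 Td (Hello) Tj 0 -14 Td [(wor) 20 (ld) -300 (again)] TJ ET";
console.log(JSON.stringify(extractShownText(sample)));
```

The gap threshold is the fragile part of any such heuristic: set it too close to zero and ordinary kerning splits words, set it too far and real word gaps are missed, which is one reason counts differ between readers.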

Words are runs of non-whitespace separated by whitespace. Sentences are approximated by counting terminal punctuation (., !, ?). Because PDFs position glyphs rather than store paragraphs, the count is a close estimate rather than an exact match to the authoring tool's figure; differences of a few percent are normal.
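A minimal sketch of the sentence approximation, in TypeScript. The function name countSentences is hypothetical, and treating a run of punctuation such as "..." or "?!" as a single terminator is an assumption made here, not something the tool is confirmed to do.

```typescript
// Counts runs of terminal punctuation as sentence ends.
// Assumption: consecutive marks ("...", "?!") end one sentence, not several.
function countSentences(text: string): number {
  return (text.match(/[.!?]+/g) || []).length;
}

console.log(countSentences("Wait... what?! Fine."));
console.log(countSentences("The file is 3.5 MB. Is that too big?"));
```

The second call shows why the figure is only an approximation: the decimal point in "3.5" is counted as a sentence end, so abbreviations and numbers inflate the result.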

Who uses PDF word counts and why

Translators are the primary use case. Professional translation is commonly billed per source word, and many clients supply PDFs rather than editable source files. This tool is built for exactly that situation: getting a fast, privacy-safe word count from a non-editable or confidential PDF before accepting a job.

Students and academics face word limits for essays, dissertations, and journal submissions. Checking a PDF export against the limit catches any discrepancy between the word processor’s count and what the PDF actually contains (headers, footnotes, and captions can be counted differently by different tools).

Legal and compliance teams sizing discovery or disclosure bundles need page and word counts to estimate review effort and cost before committing resources.

Technical writers and content teams use page-level breakdowns to spot sections that are unusually long or short before editing.

The per-page breakdown

The table showing word count per page is useful beyond the total. If one page of a 50-page document, say page 35, shows zero words or far fewer than the pages around it, it is probably a separator, a blank page, or an image-only page with no text layer. Spotting those pages explains why the total is lower than expected and identifies where OCR would be needed for a complete count.
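One way to automate that check is to compare each page against the document's median, sketched below in TypeScript. The function name findSparsePages and the 10% cutoff are assumptions chosen for illustration; the tool itself only shows the table.

```typescript
// Returns 1-based page numbers whose word count is zero or well below the median.
// Assumption: "well below" means under `ratio` times the median (10% by default).
function findSparsePages(wordsPerPage: number[], ratio = 0.1): number[] {
  if (wordsPerPage.length === 0) return [];
  const sorted = wordsPerPage.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const cutoff = median * ratio;
  const flagged: number[] = [];
  wordsPerPage.forEach((count, index) => {
    if (count === 0 || count < cutoff) flagged.push(index + 1);
  });
  return flagged;
}

// Pages 3 and 5 stand out against a median of 300 words.
console.log(findSparsePages([310, 295, 0, 320, 12, 305]));
```

The median is used rather than the mean so that a few very long pages do not raise the cutoff and flag ordinary pages by mistake.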

Tips and common issues

Zero count on all pages: the PDF is almost certainly a scan with no text layer. Run OCR (Adobe Acrobat, Tesseract, or similar) to generate a text layer first.
Count much lower than expected: check whether some pages are image-heavy, or whether the PDF uses subset-embedded fonts with custom encodings, where the character codes stored in the file are not the literal characters shown on the page.
Encrypted PDFs: password-protected files cannot be decoded in the browser without the password. Remove the protection first.
Everything runs locally; no file data is transmitted.
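To tell an encrypted file apart from a scan when the count is zero, one rough heuristic is to look for the /Encrypt key that encrypted PDFs carry in their trailer dictionary. The TypeScript sketch below is an illustration only: the function name looksEncrypted is hypothetical, it is not confirmed to be what the tool does, and a plain byte search can give a false positive if the same bytes happen to appear elsewhere in the file.

```typescript
// Heuristic: report true if the byte sequence "/Encrypt" occurs anywhere in the file.
// Assumption: a raw byte search is good enough for a hint; it does not parse the trailer.
function looksEncrypted(bytes: Uint8Array): boolean {
  const needle = "/Encrypt";
  outer: for (let i = 0; i + needle.length <= bytes.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (bytes[i + j] !== needle.charCodeAt(j)) continue outer;
    }
    return true;
  }
  return false;
}

// Helper for the demo only: turn an ASCII string into bytes.
function asciiBytes(text: string): Uint8Array {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i) & 0xff;
  return bytes;
}

console.log(looksEncrypted(asciiBytes("trailer << /Size 12 /Root 1 0 R /Encrypt 11 0 R >>")));
console.log(looksEncrypted(asciiBytes("trailer << /Size 12 /Root 1 0 R >>")));
```

In a browser, the bytes would typically come from the selected file's arrayBuffer(), wrapped in a Uint8Array.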
