If you are budgeting an API call or trying to fit text into a model’s context window, the practical question is: how many tokens will my text use? The quick answer for ordinary English is about 1.3 tokens per word, or equivalently roughly 0.75 words per token — so 1,000 tokens is around 750 words. But the relationship is an average, not a fixed rule, and understanding why helps you estimate more reliably.
Tokens are not words
Language models do not read words; they read tokens. A token is a chunk of text the model’s tokenizer has learned to treat as a unit. Common short words like “the” or “cat” are usually a single token. Longer or rarer words get split: a word like “tokenization” might become several tokens such as “token” + “ization”. Whitespace and punctuation are often bundled onto adjacent tokens rather than counted on their own. Because of this, dividing a character or word count by a fixed ratio never gives an exact token count; it gives an estimate, which you should verify with a real tokenizer if precision matters.
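To make the splitting behavior concrete, here is a toy sketch in Python. It is not how any production tokenizer works (real ones typically learn merge rules from data); it simply takes the longest matching entry from a small, hypothetical vocabulary, which is enough to show a rare word breaking into pieces and a space riding along with the following word. The names `greedy_tokenize` and `TOY_VOCAB` are invented for this illustration.

```python
def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Toy tokenizer: repeatedly take the longest vocabulary entry that matches."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character
        # when nothing in the vocabulary matches at this position.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


# Hypothetical vocabulary, chosen only to mirror the examples in the text.
TOY_VOCAB = {"the", " cat", "token", "ization"}

print(greedy_tokenize("tokenization", TOY_VOCAB))  # ['token', 'ization']
print(greedy_tokenize("the cat", TOY_VOCAB))       # ['the', ' cat']
```

Note how " cat" carries its leading space: two words and one space produce two tokens, not three.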
A practical rule of thumb
For everyday English prose, these approximations are close enough for planning:
- 1 token is roughly 4 characters of English text.
- 1 token is roughly 0.75 words.
- 100 tokens is about 75 words.
- 1,000 tokens is about 750 words, or one to two pages of plain text.
These figures hold for normal writing. They drift for text that is mostly code, numbers, URLs, or unusual symbols, all of which tokenize less efficiently and therefore use more tokens per visible character.
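The rules of thumb above can be turned into a small planning helper. The sketch below is only an estimate for ordinary English prose; the function name `estimate_tokens` and the choice to take the larger of the two estimates (to err toward over-budgeting) are assumptions made for this example, not part of any provider's API.

```python
import math

CHARS_PER_TOKEN = 4.0    # rule of thumb: about 4 characters per token
WORDS_PER_TOKEN = 0.75   # rule of thumb: about 0.75 words per token


def estimate_tokens(text: str) -> int:
    """Rough token estimate for everyday English text."""
    by_chars = len(text) / CHARS_PER_TOKEN
    by_words = len(text.split()) / WORDS_PER_TOKEN
    # Use the larger estimate so the budget errs on the safe side.
    return math.ceil(max(by_chars, by_words))


sample = "word " * 750            # 750 short words
print(estimate_tokens(sample))    # 1000
```

For code, numbers, URLs, or symbol-heavy text, treat the result as a lower bound rather than a typical value.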
Why language matters
Most tokenizers are trained on predominantly English text, so English is usually the most token-efficient language for a given model. Text in other scripts (for example Chinese, Japanese, Arabic, or Hindi) often breaks into more tokens for the same content, sometimes one token per character or more. The consequence is real: the same meaning expressed in a non-English language can cost more in API tokens and fill the context window faster. If you build a multilingual product, budget extra token headroom for non-English content rather than assuming the English ratios apply.
Why the count matters for cost and limits
Two things in any LLM API are measured in tokens: price and capacity. APIs typically bill per token for both the prompt you send (input) and the text the model returns (output), so a chatty system prompt repeated on every call quietly adds up. Separately, every model has a context window, a maximum number of tokens it can consider at once, and exceeding it causes errors or truncation. Estimating tokens before you send large prompts, and trimming anything unnecessary, keeps costs predictable and avoids requests that silently overflow the window. When exact numbers matter, run your text through the model provider’s tokenizer rather than trusting the rule of thumb alone.
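A pre-flight check along these lines might look like the following sketch. The `Budget` class, its field names, and the numbers used are all hypothetical placeholders; real prices and window sizes come from your provider. It also assumes the prompt and the reply share one context window, which is common but worth confirming for the model you use.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    context_window: int         # max tokens per request (placeholder value)
    input_price_per_1k: float   # price per 1,000 input tokens (placeholder)
    output_price_per_1k: float  # price per 1,000 output tokens (placeholder)


def fits_window(prompt_tokens: int, max_output_tokens: int, budget: Budget) -> bool:
    """True if the prompt plus the reserved reply space fits in the window."""
    return prompt_tokens + max_output_tokens <= budget.context_window


def estimate_cost(prompt_tokens: int, output_tokens: int, budget: Budget) -> float:
    """Cost of one call, billing input and output tokens at separate rates."""
    return (prompt_tokens / 1000) * budget.input_price_per_1k + (
        output_tokens / 1000
    ) * budget.output_price_per_1k


# Made-up numbers, purely to show the arithmetic.
budget = Budget(context_window=8000, input_price_per_1k=1.0, output_price_per_1k=2.0)
print(fits_window(7000, 500, budget))    # True
print(fits_window(7800, 500, budget))    # False
print(estimate_cost(2000, 500, budget))  # 3.0
```

Feeding these functions an estimated token count gives a planning figure; feeding them a count from the provider's tokenizer gives a figure you can rely on.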