Is a token the same as a word?

Not exactly. A token is a chunk of text from the model's vocabulary: sometimes a whole word, sometimes a piece of one, such as "ing" or "tion". Short, frequent words are often a single token, while long or rare words get split into several pieces. On average, one English word works out to roughly 1.3 tokens.

Why don't models just use whole words?

A fixed vocabulary of pieces lets the model represent any word, including ones it has never seen, by combining known chunks. This keeps the vocabulary a manageable size while still handling new names, typos, code, and other languages, which a pure word list never could.
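To make the "combine known chunks" idea concrete, here is a minimal sketch in Python. It assumes a tiny invented vocabulary and a simple greedy longest-match rule. Real tokenisers learn their vocabulary from large amounts of text and often use different splitting rules (byte-pair encoding, for example), so treat the names `VOCAB` and `split_word` as hypothetical illustrations rather than any model's actual method.

```python
# Toy vocabulary: a few whole words, some word-pieces, plus every lowercase
# letter so that any lowercase word can be spelled out as a last resort.
# These entries are invented for illustration, not taken from a real model.
VOCAB = {"the", "token", "un", "break", "able", "ing", "tion", "s"} | set(
    "abcdefghijklmnopqrstuvwxyz"
)


def split_word(word: str) -> list[str]:
    """Split a lowercase word into vocabulary pieces, longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in VOCAB:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # Only reached for characters outside the toy vocabulary.
            raise ValueError(f"cannot tokenise {word[start]!r}")
    return pieces


print(split_word("the"))          # ['the']
print(split_word("unbreakable"))  # ['un', 'break', 'able']
print(split_word("tokens"))       # ['token', 's']
print(split_word("zorbing"))      # ['z', 'o', 'r', 'b', 'ing']
```

The last example is the interesting one: "zorbing" is not in the vocabulary, yet it still gets represented, just with more pieces than a common word would need.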

Why should I care about tokens?

Tokens are the unit AI providers charge by and the unit context windows are measured in. Knowing roughly how text turns into tokens helps you estimate cost, stay under context limits, and understand why a long document costs more to process than a short one.

Do spaces and punctuation count as tokens?

Yes. The leading space before a word is normally part of that word's token, and punctuation marks like commas and full stops are usually their own tokens. That is why "hello" and " hello" can tokenise differently, and why a word count or character count only approximates the token count.
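The leading-space behaviour can be sketched with a small regular expression. This is a simplified, assumed pattern that only illustrates the first splitting step some tokenisers apply. A real tokeniser would go on to break these chunks into vocabulary pieces, and its exact rules will differ. The names `PRE_SPLIT` and `pre_split` are hypothetical.

```python
import re

# Simplified pattern (an assumption for illustration): an optional leading
# space glued to a run of letters or digits, or a run of punctuation,
# or any leftover whitespace.
PRE_SPLIT = re.compile(r" ?[A-Za-z]+| ?[0-9]+|[^\sA-Za-z0-9]+|\s+")


def pre_split(text: str) -> list[str]:
    """Cut text into chunks, keeping a leading space attached to its word."""
    return PRE_SPLIT.findall(text)


print(pre_split("hello"))          # ['hello']
print(pre_split(" hello"))         # [' hello']
print(pre_split("Hello, world."))  # ['Hello', ',', ' world', '.']
```

Because "hello" and " hello" are different strings by the time they reach the vocabulary lookup, they can end up as different tokens.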

AI Tokens ELI5: What the Model Actually Reads

What a token actually is

When you send text to an AI model, it never sees your words or letters directly. It sees tokens — small chunks of text drawn from a fixed vocabulary. A token is often a whole short word like “the”, sometimes a word-piece like “ing” or “tion”, and sometimes a single character or punctuation mark. The model’s entire view of language is this stream of numbered chunks, which is why “tokenisation” is the very first thing that happens to anything you type.

How it works

Use the box below to type a sentence and watch it break into coloured token chunks in real time. You will notice a pattern: frequent everyday words usually become a single token, while long, rare, or made-up words get sliced into several pieces. Spaces and punctuation get counted too — the leading space before a word is normally bundled into that word’s token. The tool also shows you the word count and character count alongside the token count, so you can feel the gap between how you read text and how the model reads it.

Why this matters

Tokens are not an academic detail: they are the unit that runs the economics of AI. Providers price their APIs per token, and every model’s context window (how much it can read at once) is measured in tokens, not words. A useful rule of thumb is that one English word is about 1.3 tokens on average, so a 1,000-word document is roughly 1,300 tokens. Code, unusual names, and other languages often tokenise less efficiently, so they tend to cost more per word. Understanding this is the difference between guessing at your AI bill and estimating it, and between mysteriously hitting a context limit and knowing roughly how much room you have left.
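The rule of thumb above turns into a quick back-of-the-envelope estimate. The sketch below assumes English prose and the 1.3 tokens-per-word figure. The price and context-window numbers are made-up placeholders, not any provider's real values, so substitute the ones from your provider's documentation. For an exact count you would need the model's own tokeniser.

```python
TOKENS_PER_WORD = 1.3  # rule of thumb for English prose; not exact


def estimate_tokens(text: str) -> int:
    """Estimate the token count from a whitespace-separated word count."""
    words = len(text.split())
    return round(words * TOKENS_PER_WORD)


def estimate_cost(tokens: int, price_per_million: float) -> float:
    """Estimate cost given a (hypothetical) price per million tokens."""
    return tokens / 1_000_000 * price_per_million


def fits_context(tokens: int, context_window: int, reserved_for_reply: int = 0) -> bool:
    """Check whether the text plus room for a reply fits in the window."""
    return tokens + reserved_for_reply <= context_window


document = "word " * 1000  # stand-in for a 1,000-word document
tokens = estimate_tokens(document)
print(tokens)  # 1300

# Placeholder price and window size, chosen only for the example.
print(f"{estimate_cost(tokens, price_per_million=2.00):.4f}")  # 0.0026
print(fits_context(tokens, context_window=8_000, reserved_for_reply=1_000))  # True
```

For code or non-English text, a higher tokens-per-word figure would be a safer assumption, since those tend to split into more pieces.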