GPT-4o token counter
Paste any text and instantly estimate how many tokens it uses under GPT-4o’s tokenizer. Tokens — not words or characters — are the unit OpenAI bills and limits on, so knowing the count helps you stay inside the 128K context window and predict cost before you send a request.
How the estimate works
GPT-4o tokenizes with o200k_base, a byte-pair encoding (BPE) vocabulary that
splits text into tokens averaging roughly 4 characters each for typical English. This tool
applies that measured ratio, blended with a word-boundary heuristic, so the result
tracks the real tiktoken count closely without bundling the full vocabulary in
the browser. A system prompt, if you add one, is counted separately and summed into
the total request size.
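As a rough illustration of the approach described above, here is a minimal Python sketch of a blended estimator. It is not this tool's actual code: the function names, the equal weighting of the two estimates, and the 4/3 tokens-per-word factor (the common "100 tokens is about 75 words" rule of thumb) are all assumptions made for the example. Only the 4-characters-per-token ratio and the 128K window come from the text.

```python
import math
import re

CHARS_PER_TOKEN = 4.0     # ratio quoted above for typical English
TOKENS_PER_WORD = 4 / 3   # assumption: "100 tokens ~ 75 words" rule of thumb
CONTEXT_WINDOW = 128_000  # GPT-4o context window, in tokens


def estimate_tokens(text: str) -> int:
    """Blend a character-ratio estimate with a word-count estimate."""
    if not text:
        return 0
    by_chars = len(text) / CHARS_PER_TOKEN
    # Word-boundary heuristic: count runs of non-whitespace characters.
    by_words = len(re.findall(r"\S+", text)) * TOKENS_PER_WORD
    # Assumption: equal weighting of the two estimates, rounded up.
    return math.ceil((by_chars + by_words) / 2)


def estimate_request(user_text: str, system_prompt: str = "") -> dict:
    """Count the system prompt separately, then sum into a request total."""
    system = estimate_tokens(system_prompt)
    user = estimate_tokens(user_text)
    total = system + user
    return {
        "system": system,
        "user": user,
        "total": total,
        "fits_window": total <= CONTEXT_WINDOW,
    }
```

Calling `estimate_request("hello world", "You are a helpful assistant.")` returns the two counts, their sum, and whether the sum fits the window. Because the ratios are tuned for English prose, a sketch like this will undercount code, JSON, and non-Latin text.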
Tips and notes
- Code, JSON, and non-Latin scripts pack fewer characters per token — expect a higher count than the English-tuned estimate.
- o200k_base is more token-efficient than the older cl100k_base, so the same text often costs slightly fewer tokens on GPT-4o than on GPT-4.
- For an exact count before a high-volume production call, run OpenAI’s own tiktoken library with the o200k_base encoding.
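
For the exact count mentioned in the last tip, a short Python sketch along these lines should work, assuming the tiktoken package is installed (`pip install tiktoken`). The helper name `count_tokens_exact` is hypothetical; `tiktoken.get_encoding` and `Encoding.encode` are the library's own calls. The first use of an encoding typically downloads and caches its vocabulary file, so it may need network access once.

```python
def count_tokens_exact(text: str, encoding_name: str = "o200k_base") -> int:
    """Return the exact token count for `text` under the named encoding."""
    # Imported lazily so the module loads even where tiktoken is absent.
    import tiktoken  # third-party package

    encoding = tiktoken.get_encoding(encoding_name)
    # By default, encode() raises if the text contains special-token
    # markers such as "<|endoftext|>".
    return len(encoding.encode(text))


if __name__ == "__main__":
    sample = "Paste any text and count its tokens."
    print(count_tokens_exact(sample))
```

Passing `"cl100k_base"` as `encoding_name` gives the count under the older encoding, which is one way to check the efficiency difference noted above on your own text.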