Code token estimator
Sending code to an LLM usually costs more than sending prose of the same length, because operators, indentation, and short identifiers all fragment into extra tokens. This estimator measures your code with a per-language density and reports the token count, the share of a context window it occupies, and the input cost, so you find out a file is too big before you spend an API call on it.
How it works
The tool applies a characters-per-token density tuned for each language: dense formats like JSON and minified code pack the fewest characters per token, while comment-heavy, prose-like code sits closer to natural text. It divides the length of your pasted code by that density to estimate tokens, prices the result as input against your chosen model, and shows the fraction of a typical context window the file occupies. Everything runs in your browser.
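The calculation described above can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: the density values, the fallback density, the function name `estimate`, and the price and context window passed in the usage note are all assumptions chosen for the example.

```python
import math

# Hypothetical characters-per-token densities. The tool's real values are
# not stated in this document; these only illustrate the ordering
# (dense formats low, prose-like text high).
CHARS_PER_TOKEN = {
    "json": 2.5,
    "python": 3.5,
    "text": 4.0,
}
DEFAULT_DENSITY = 3.5  # assumed fallback for unlisted languages


def estimate(code: str, language: str, price_per_million: float,
             context_window: int) -> dict:
    """Estimate tokens, input cost, and context window share for `code`."""
    density = CHARS_PER_TOKEN.get(language, DEFAULT_DENSITY)
    # Round up: a partial token still counts as a token.
    tokens = math.ceil(len(code) / density)
    return {
        "tokens": tokens,
        "cost": tokens / 1_000_000 * price_per_million,
        "window_fraction": tokens / context_window,
    }
```

For example, `estimate(source, "python", 3.0, 200_000)` on a 3,500-character file yields 1,000 estimated tokens under these assumed densities. The price (3.0 per million input tokens) and the 200,000-token window are placeholders; substitute the figures for the model you actually use.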
Tips and notes
For large codebases, the practical question is usually “does this fit?” The context window bar answers it at a glance and tells you whether to chunk or summarize. If you are sending the same files repeatedly (for example, a fixed framework header on every request), prompt caching can make the repeated portion much cheaper, since providers that support it bill cached input at a discounted rate. Stripping comments and dead code cuts tokens but can degrade the model’s reasoning about the code, so trim carefully. Treat the count as an estimate and confirm with the provider’s tokenizer before sizing a large automated run.
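Both questions above, whether the code fits and what repeated sends cost, reduce to simple arithmetic. The sketch below is illustrative only: the function names, the `reserved_tokens` budget for the prompt and the reply, and the `cached_read_multiplier` are assumptions. Real caching discounts, cache-write surcharges, and expiry rules differ by provider and are not modeled here.

```python
import math


def chunks_needed(total_tokens: int, context_window: int,
                  reserved_tokens: int = 0) -> int:
    """Number of chunks required so each chunk fits in the window.

    reserved_tokens is an assumed budget held back for instructions
    and the model's reply.
    """
    usable = context_window - reserved_tokens
    if usable <= 0:
        raise ValueError("reserved_tokens must be smaller than context_window")
    return max(1, math.ceil(total_tokens / usable))


def repeated_input_cost(prefix_tokens: int, unique_tokens: int, requests: int,
                        price_per_million: float,
                        cached_read_multiplier: float = 1.0) -> float:
    """Input cost of `requests` calls sharing one fixed prefix.

    The first call pays full price for the prefix; later calls pay
    price * cached_read_multiplier for it. A multiplier of 1.0 means
    no caching. Cache-write fees and expiry are ignored.
    """
    if requests < 1:
        raise ValueError("requests must be at least 1")
    rate = price_per_million / 1_000_000
    first_call = (prefix_tokens + unique_tokens) * rate
    later_calls = (requests - 1) * (
        prefix_tokens * cached_read_multiplier + unique_tokens
    ) * rate
    return first_call + later_calls
```

As a usage example, `chunks_needed(450_000, 200_000, reserved_tokens=20_000)` returns 3. Comparing `repeated_input_cost(10_000, 500, 100, 3.0)` with the same call plus `cached_read_multiplier=0.1` shows how much of the bill a fixed header accounts for under an assumed 90% discount on cached reads.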