Context window planner
Every model has a hard ceiling on how many tokens it can hold at once — the context window — and it is shared between your input and the model’s reply. Paste your system prompt and conversation history, reserve room for the response, and this planner tells you whether it all fits the model you picked, with a clear color-coded indicator.
How it works
The tool estimates tokens for your system prompt and conversation history using an English-calibrated heuristic, then adds the reserved completion tokens you specify. That total is compared against the selected model’s context window. The fit indicator turns green when you are comfortably under, amber as you approach the ceiling, and red when input plus reserved output would overflow — meaning the call would truncate or fail.
Tips and notes
Treat ~80% utilization as your practical ceiling: token estimates carry 5-10% error, special and formatting tokens add overhead, and responses often run longer than planned. If you are overflowing, the highest-leverage fixes are trimming or summarizing old conversation turns, moving long reference material to retrieval (RAG) instead of stuffing it inline, or switching to a larger-context model. Reserve generous completion room for reasoning models, which emit hidden thinking tokens.