Token cost heatmap by model
Picking the cheapest model is not just “smallest number wins”: it depends on your prompt-to-completion ratio, because providers typically price output tokens well above input tokens. This heatmap prices your specific request across 30+ models and colors each row by cost, so the best-value choice is obvious at a glance.
How it works
For every model, the cost of one request is:
cost = (input_tokens / 1,000,000) × input_price
     + (output_tokens / 1,000,000) × output_price
Here input_price and output_price are the model's rates per 1,000,000 tokens.
The tool computes this for each model at your token counts, sorts cheapest to most expensive, and maps cost onto a green-to-red color scale relative to the models shown. A quality-tier filter lets you compare flagship, mid-range and fast models on equal footing instead of mixing a frontier model with a budget one.
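The steps in this section can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the model names and prices are hypothetical, the `Model`, `request_cost`, `heat_color` and `rank` names are invented for the example, and the color mapping assumes simple linear min-max scaling, which the real heatmap may or may not use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # price per 1M input tokens (hypothetical values)
    output_price: float  # price per 1M output tokens (hypothetical values)


def request_cost(m: Model, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request, using the formula above."""
    return ((input_tokens / 1_000_000) * m.input_price
            + (output_tokens / 1_000_000) * m.output_price)


def heat_color(cost: float, lo: float, hi: float) -> str:
    """Map a cost to a hex color between green (cheapest) and red (priciest).

    Assumes linear min-max scaling relative to the models shown.
    """
    t = 0.0 if hi == lo else (cost - lo) / (hi - lo)
    red = round(255 * t)
    green = round(255 * (1 - t))
    return f"#{red:02x}{green:02x}00"


def rank(models, input_tokens, output_tokens):
    """Return (name, cost, color) tuples sorted cheapest first."""
    costs = [(m.name, request_cost(m, input_tokens, output_tokens)) for m in models]
    costs.sort(key=lambda pair: pair[1])
    lo, hi = costs[0][1], costs[-1][1]
    return [(name, cost, heat_color(cost, lo, hi)) for name, cost in costs]


# Hypothetical models and prices, for illustration only.
MODELS = [
    Model("model-a", input_price=3.00, output_price=15.00),
    Model("model-b", input_price=0.50, output_price=1.50),
    Model("model-c", input_price=10.00, output_price=30.00),
]

if __name__ == "__main__":
    for name, cost, color in rank(MODELS, input_tokens=2_000, output_tokens=500):
        print(f"{name:10s} {cost:.6f} {color}")
```

Because the scale is relative to the models shown, the cheapest row is always pure green and the most expensive pure red, whatever the absolute prices are. Filtering the list to one quality tier before calling `rank` therefore rescales the colors within that tier.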
Tips for choosing a model
- Match the ratio to the task. Summarizing a long document (big input, small output) rewards models with cheap input; brainstorming (small input, big output) rewards cheap output.
- Start in the fast tier. For classification, extraction and routing, a fast/mini model is often 10–20× cheaper and good enough.
- Reserve frontier models. Use them where reasoning quality clearly moves the needle, not as a default.
- Re-run when prices change. Update the presets to match your current contracted rates for an accurate ranking.
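To see why the ratio matters, here is a small sketch with two made-up models: one with cheap input and expensive output, the other the reverse. The names, prices and token counts are all hypothetical; the point is only that the cheapest model changes with the shape of the request.

```python
# Hypothetical prices per 1M tokens: (input_price, output_price).
PRICES = {
    "cheap-input": (0.20, 8.00),
    "cheap-output": (1.00, 2.00),
}


def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Name of the model with the lowest cost for this request shape."""
    return min(PRICES, key=lambda m: cost(m, input_tokens, output_tokens))


# Long document in, short summary out.
summarize = cheapest(input_tokens=50_000, output_tokens=500)
# Short prompt in, long list of ideas out.
brainstorm = cheapest(input_tokens=200, output_tokens=4_000)

print(summarize, brainstorm)  # prints "cheap-input cheap-output"
```

With these particular prices the two models break even at 7.5 input tokens per output token; requests above that ratio favor the cheap-input model, and requests below it favor the cheap-output one.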