Token Cost Heatmap by Model

Color heatmap of 30+ LLMs from cheapest to priciest for your token ratio.

Picking the cheapest model is not just “smallest number wins”: it depends on your prompt-to-completion ratio, because providers typically price output tokens several times higher than input tokens. This heatmap prices your specific request across 30+ models and colors each row by cost, so the best-value choice is obvious at a glance.

How it works

For every model, the cost of one request (with prices quoted per million tokens) is:

cost = (input_tokens / 1,000,000) × input_price
     + (output_tokens / 1,000,000) × output_price
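
As a quick worked example, assume hypothetical prices of $3.00 per million input tokens and $15.00 per million output tokens (illustrative figures, not any provider's actual rates). A request with 2,000 input tokens and 500 output tokens would then cost:

cost = (2,000 / 1,000,000) × 3.00
     + (500 / 1,000,000) × 15.00
     = 0.006 + 0.0075
     = $0.0135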

The tool computes this for each model at your token counts, sorts cheapest to most expensive, and maps cost onto a green-to-red color scale relative to the models shown. A quality-tier filter lets you compare flagship, mid-range and fast models on equal footing instead of mixing a frontier model with a budget one.
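
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the model names and prices are made up, the `Model`, `request_cost`, `heat_color` and `rank` names are hypothetical, and the linear green-to-red interpolation is an assumption (the real tool may use a different scale).

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


def request_cost(m: Model, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request, using the per-million-token formula."""
    return ((input_tokens / 1_000_000) * m.input_price
            + (output_tokens / 1_000_000) * m.output_price)


def heat_color(cost: float, lo: float, hi: float) -> str:
    """Map cost linearly onto a green (cheapest) to red (priciest) hex color."""
    t = 0.0 if hi == lo else (cost - lo) / (hi - lo)
    red, green = round(255 * t), round(255 * (1 - t))
    return f"#{red:02x}{green:02x}00"


def rank(models, input_tokens, output_tokens):
    """Return (name, cost, color) rows sorted cheapest first.

    Colors are relative to the models passed in, so filtering the list
    (for example by quality tier) rescales the heatmap.
    """
    costs = sorted(
        (request_cost(m, input_tokens, output_tokens), m.name) for m in models
    )
    lo, hi = costs[0][0], costs[-1][0]
    return [(name, cost, heat_color(cost, lo, hi)) for cost, name in costs]


# Hypothetical models and prices, for illustration only.
MODELS = [
    Model("model-a", 3.00, 15.00),
    Model("model-b", 0.50, 1.50),
    Model("model-c", 10.00, 30.00),
]

for name, cost, color in rank(MODELS, input_tokens=2_000, output_tokens=500):
    print(f"{name:10s} ${cost:.6f} {color}")
```

Because the color scale is computed from the minimum and maximum of whatever list is passed to `rank`, applying a tier filter before ranking changes which model appears fully green or fully red. The sketch assumes a non-empty model list.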

Tips for choosing a model

  • Match the ratio to the task. Summarizing a long document (big input, small output) rewards models with cheap input; brainstorming (small input, big output) rewards cheap output.
  • Start in the fast tier. For classification, extraction and routing, a fast/mini model is often 10–20× cheaper than a flagship model and good enough.
  • Reserve frontier models. Use them where reasoning quality clearly moves the needle, not as a default.
  • Re-run when prices change. Edit the presets to your current contracted rates for an accurate ranking.
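
To see why the token ratio matters, here is a small sketch with two hypothetical models: one with cheap input but pricey output, the other the reverse. The names and prices in `PRICES` are invented for illustration; substitute your own contracted rates.

```python
# Hypothetical per-1M-token prices (input, output) in USD; not real rates.
PRICES = {
    "cheap-input": (0.50, 8.00),
    "cheap-output": (2.00, 4.00),
}


def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Name of the model with the lowest cost for this token mix."""
    return min(PRICES, key=lambda m: cost(m, input_tokens, output_tokens))


# Summarization: big input, small output.
print(cheapest(input_tokens=50_000, output_tokens=500))   # cheap-input
# Brainstorming: small input, big output.
print(cheapest(input_tokens=200, output_tokens=4_000))    # cheap-output
```

The same two price lists produce opposite winners depending on the workload, which is why the ranking should be recomputed for each task's typical token counts rather than read off a single price column.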