Inference Cost vs Quality Frontier Explorer

Plot major LLMs against the cost-quality Pareto frontier

See the cost-quality trade-off at a glance

Choosing an LLM is a trade-off between price and capability. This explorer plots major models by cost per 1M tokens against a benchmark quality score, then highlights the Pareto frontier: the models that no other model matches or beats on both cost and quality. Any model off the frontier is a worse deal than at least one model on it.

How the frontier works

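A model is Pareto-optimal if no other model dominates it. Model B dominates model A when B costs no more than A, scores no lower than A, and is strictly better on at least one of the two axes:

dominated(A) = exists B such that cost(B) <= cost(A)
                                  and quality(B) >= quality(A)
                                  and (cost(B) < cost(A) or quality(B) > quality(A))
frontier = models that are not dominated

The strict condition matters: without it, two models with identical cost and quality would each count as dominating the other, and both would drop off the frontier.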

The frontier traces the efficient trade-off curve. Choosing a model that is not on the frontier leaves quality or money on the table: some other model costs no more and scores no lower, and is strictly better on at least one of the two.
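
The dominance rule can be sketched in a few lines of Python. This is a minimal illustration, not the explorer's actual implementation: the `Model` type, the function names, and every number in the sample data are made up for the example and are not real prices or benchmark scores.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    cost: float     # USD per 1M tokens (lower is better)
    quality: float  # benchmark score (higher is better)


def dominates(b: Model, a: Model) -> bool:
    """True if b is no worse than a on both axes and strictly better on one."""
    no_worse = b.cost <= a.cost and b.quality >= a.quality
    strictly_better = b.cost < a.cost or b.quality > a.quality
    return no_worse and strictly_better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the non-dominated models, sorted by ascending cost."""
    frontier = [a for a in models if not any(dominates(b, a) for b in models)]
    return sorted(frontier, key=lambda m: m.cost)


# Placeholder data for illustration only (hypothetical models and numbers).
models = [
    Model("model-a", 0.5, 60.0),
    Model("model-b", 1.0, 55.0),   # dominated by model-a
    Model("model-c", 3.0, 75.0),
    Model("model-d", 10.0, 88.0),
    Model("model-e", 12.0, 80.0),  # dominated by model-d
]

frontier = pareto_frontier(models)
print([m.name for m in frontier])  # ['model-a', 'model-c', 'model-d']
```

The pairwise check is quadratic in the number of models, which is fine for the few dozen points a plot like this would hold. For much larger sets, sorting by cost and sweeping once while tracking the best quality seen so far would be the usual alternative.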

Tips for using the plot

  • Pick the cheapest frontier model above your quality bar. Set the minimum quality you need, then take the least expensive model that clears it.
  • Switch metrics to match the job. A coding workload should rank on HumanEval, not MMLU, because the set of frontier models changes with each metric.
  • Validate on your own task. Public benchmarks guide the shortlist; your actual prompts decide the winner.
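
The first two tips can be sketched as a small selection routine: compute the frontier for the chosen metric, then take the cheapest frontier model that clears the quality bar. This is an illustrative sketch under assumed data: the dictionary layout, the `frontier` and `cheapest_above` helpers, and all scores and prices are hypothetical, not figures from the explorer.

```python
def frontier(models, metric):
    """Non-dominated models for one metric (lower cost and higher score are better)."""
    def dominated(a):
        return any(
            b["cost"] <= a["cost"] and b[metric] >= a[metric]
            and (b["cost"] < a["cost"] or b[metric] > a[metric])
            for b in models
        )
    return [m for m in models if not dominated(m)]


def cheapest_above(models, metric, min_score):
    """Cheapest frontier model whose score on `metric` meets the bar, or None."""
    candidates = [m for m in frontier(models, metric) if m[metric] >= min_score]
    return min(candidates, key=lambda m: m["cost"], default=None)


# Placeholder data for illustration only (hypothetical models and numbers).
models = [
    {"name": "model-a", "cost": 0.5, "mmlu": 60.0, "humaneval": 40.0},
    {"name": "model-b", "cost": 1.0, "mmlu": 58.0, "humaneval": 70.0},
    {"name": "model-c", "cost": 3.0, "mmlu": 75.0, "humaneval": 65.0},
    {"name": "model-d", "cost": 10.0, "mmlu": 88.0, "humaneval": 85.0},
]

# The frontier depends on the metric: model-b is dominated on MMLU
# but sits on the frontier for HumanEval, and the reverse holds for model-c.
mmlu_names = [m["name"] for m in frontier(models, "mmlu")]
code_names = [m["name"] for m in frontier(models, "humaneval")]

general_pick = cheapest_above(models, "mmlu", min_score=70.0)
coding_pick = cheapest_above(models, "humaneval", min_score=60.0)
print(general_pick["name"], coding_pick["name"])
```

With this sample data the same quality bar logic picks different models for a general workload and a coding workload, which is the reason to switch metrics before reading the plot. The result is still only a shortlist; the final choice should come from running your own prompts.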