How current are the prices?

Prices reflect published list rates per million tokens and are clearly labelled as estimates. Providers change pricing frequently, so confirm the live rate in each provider's dashboard before committing to a model.

Why is output more expensive than input?

Generating tokens costs more compute than reading them, so almost every provider charges 2-5x more per output token. The table shows input and output prices separately for this reason.

What does the context window mean?

It is the maximum number of tokens the model can consider at once, counting both the prompt and the response. Larger windows let you feed in whole documents or codebases. Of the models in the table, Gemini 1.5 Pro has the largest window, at up to 2M tokens.
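Because the window covers prompt and response together, a long prompt leaves less room for the answer. A minimal sketch of that check, assuming you already have token counts from your provider's tokenizer; the function name and parameters are hypothetical, and 200K is taken to mean 200,000 tokens:

```python
def fits_in_context(prompt_tokens: int, max_response_tokens: int,
                    context_window: int) -> bool:
    """Return True if the prompt plus the reserved response budget fits the window."""
    if min(prompt_tokens, max_response_tokens, context_window) < 0:
        raise ValueError("token counts must be non-negative")
    # The window is shared: prompt tokens and response tokens both count.
    return prompt_tokens + max_response_tokens <= context_window


# A 180,000-token document with 4,000 tokens reserved for the answer
# fits a 200,000-token window; a 199,000-token prompt does not.
print(fits_in_context(180_000, 4_000, 200_000))  # True
print(fits_in_context(199_000, 4_000, 200_000))  # False
```

Token counts vary by tokenizer, so treat the result as a planning check rather than a guarantee.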

Does the table call any APIs?

No. It is a static, curated reference rendered entirely in your browser. Filtering and sorting happen locally with no network requests.

What is the AI Model Comparison Table?

An interactive comparison matrix of leading large language models across context window, input and output price, speed, vision support and core strengths. Filter by provider, sort by cost, context or speed, and show vision-capable models only. It runs free in your browser on Gera Tools, with nothing uploaded.

AI Model Comparison Table

Name: AI Model Comparison Table
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Every major LLM, side by side

Choosing a model means trading off cost, context window, speed and capability. This matrix puts the leading models from OpenAI, Anthropic, Google, Meta and Mistral in one place so you can compare them on the axes that actually drive your decision — and filter to just the ones that fit.

How it works

The table is a curated dataset of current flagship and workhorse models. Each row lists the context window, input and output price per million tokens, relative speed, whether it supports vision (image input), and a short note on what the model is best at. Use the provider filter to focus on one vendor, the sort control to rank by cheapest, largest context or fastest, and the vision toggle to hide text-only models. All filtering happens in your browser.
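The filter, sort and vision toggle described above can be sketched as plain list operations. This is an illustration only, not the tool's actual code: the row fields, the `view` function, the "cheapest" ranking (input price plus output price) and the numeric speed rank are all assumptions, and the rows are invented placeholders rather than real models or prices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelRow:
    name: str
    provider: str
    context_tokens: int
    input_price: float   # USD per million input tokens
    output_price: float  # USD per million output tokens
    speed_rank: int      # 0 = fastest (hypothetical encoding of relative speed)
    vision: bool


# Placeholder rows with invented values, NOT real models or prices.
ROWS = [
    ModelRow("alpha-large", "VendorA", 200_000, 3.00, 15.00, 1, True),
    ModelRow("alpha-mini", "VendorA", 128_000, 0.20, 0.80, 0, False),
    ModelRow("beta-pro", "VendorB", 1_000_000, 1.50, 6.00, 1, True),
    ModelRow("beta-reason", "VendorB", 128_000, 10.00, 40.00, 2, False),
]

# One sort key per option in the sort control (assumed definitions).
SORT_KEYS = {
    "cheapest": lambda r: r.input_price + r.output_price,
    "context": lambda r: -r.context_tokens,  # largest window first
    "fastest": lambda r: r.speed_rank,
}


def view(rows, provider: Optional[str] = None, sort_by: str = "cheapest",
         vision_only: bool = False):
    """Apply the provider filter and vision toggle, then sort in memory."""
    selected = [
        r for r in rows
        if (provider is None or r.provider == provider)
        and (r.vision or not vision_only)
    ]
    return sorted(selected, key=SORT_KEYS[sort_by])


print([r.name for r in view(ROWS, vision_only=True, sort_by="context")])
# ['beta-pro', 'alpha-large']
```

Everything runs on an in-memory list, which mirrors the point that no network request is needed to filter or sort.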

How to read it

Cost vs capability: the cheapest models (GPT-4o mini, Gemini 1.5 Flash) handle the majority of everyday tasks; reserve premium reasoning models (o1, Claude 3 Opus) for genuinely hard problems.
Context window: if you are feeding whole documents or codebases, Gemini 1.5 Pro’s 2M-token window or Claude’s 200K window matters more than raw quality.
Speed: “Fast” models suit interactive chat and high-volume pipelines; “Slow” reasoning models trade latency for harder problem-solving.

The three decisions the table is built for

Picking a default workhorse. Sort by cheapest and scan the “Best at” column. The lowest-cost models that cover your main use case (summarisation, classification, drafting) become your default. Premium models are for the exceptions.

Verifying that a model upgrade is worth it. When a newer version of a model arrives, compare the two rows directly: if the price dropped and the context window grew, upgrading is usually a free improvement. If the price rose, look at what capability you are actually buying.
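The upgrade rule of thumb can be written down as a small comparison. This is a rough sketch under stated assumptions: `Row`, `upgrade_verdict` and the verdict strings are hypothetical, and "price dropped" is taken to mean that neither price rose and at least one fell.

```python
from typing import NamedTuple


class Row(NamedTuple):
    input_price: float    # USD per million input tokens
    output_price: float   # USD per million output tokens
    context_tokens: int


def upgrade_verdict(old: Row, new: Row) -> str:
    """Classify an upgrade using the price and context columns only."""
    price_rose = (new.input_price > old.input_price
                  or new.output_price > old.output_price)
    price_dropped = (not price_rose
                     and (new.input_price < old.input_price
                          or new.output_price < old.output_price))
    if price_dropped and new.context_tokens > old.context_tokens:
        return "free improvement"
    if price_rose:
        return "check what capability you are buying"
    return "compare the other columns"


# Placeholder rows with invented values, not real models.
old = Row(input_price=5.00, output_price=15.00, context_tokens=128_000)
new = Row(input_price=2.50, output_price=10.00, context_tokens=200_000)
print(upgrade_verdict(old, new))  # free improvement
```

The verdict deliberately ignores speed and quality; those still need a look at the other columns and at your own evaluation results.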

Evaluating a provider for a new use case. Filter to vision-capable models when you need image input, then compare context windows and cost. Not every vision model handles the same tasks equally — the “Best at” note flags where each provider tends to lead.

Why input and output prices are listed separately

Almost every provider charges significantly more per output token than per input token, because generating tokens requires more compute than reading them. For a workload that is prompt-heavy (for example, sending a long document for summarisation), input cost dominates. For a workload that produces long responses (code generation, long-form drafts), output cost is the bigger driver. Knowing both lets you estimate the real cost of your specific usage pattern rather than relying on a single blended rate.
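A quick way to see which side dominates is to price each side of a request separately. The sketch below assumes placeholder rates of 1.00 USD per million input tokens and 4.00 USD per million output tokens (invented, not taken from the table), and the token counts are illustrative; `request_cost` is a hypothetical helper.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> tuple[float, float]:
    """Return (input_cost, output_cost) in USD; prices are per million tokens."""
    input_cost = input_tokens * input_price / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    return input_cost, output_cost


# Placeholder prices, not real list rates.
IN_PRICE, OUT_PRICE = 1.00, 4.00

# Prompt-heavy: a long document summarised into a short answer.
summ_in, summ_out = request_cost(50_000, 500, IN_PRICE, OUT_PRICE)
print(f"summarisation: input ${summ_in:.4f}, output ${summ_out:.4f}")
# summarisation: input $0.0500, output $0.0020

# Output-heavy: a short prompt that produces a long code listing.
gen_in, gen_out = request_cost(500, 4_000, IN_PRICE, OUT_PRICE)
print(f"code generation: input ${gen_in:.4f}, output ${gen_out:.4f}")
# code generation: input $0.0005, output $0.0160
```

Multiplying the per-request totals by your expected request volume gives a rough monthly figure that reflects your own mix of input and output.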

Treat the prices as a planning estimate. For an exact monthly figure based on your own token volume, pair this with the LLM Pricing Calculator.