LLM Energy Cost Calculator

Estimate the electricity cost and carbon footprint of your LLM usage


Every token your application generates burns real electricity on a GPU somewhere. This calculator turns monthly token volume into estimated compute hours, an electricity bill, and a CO2-equivalent footprint so you can budget AI spend in both dollars and carbon.

How it works

The model is deliberately simple and transparent:

  1. Energy per token. Each GPU has a typical power draw (kW) and a sustained inference throughput (tokens/second). Dividing power in watts by throughput gives joules per token; dividing that by 3.6 million (the number of joules in a kWh) gives kWh per token.
  2. GPU energy. Multiply energy per token by your monthly token volume.
  3. Grid energy. Multiply by a PUE factor to add data-center overhead (cooling, power distribution).
  4. Cost and carbon. Multiply grid energy by your electricity price for cost, and by your grid’s carbon intensity for emissions.
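
The four steps above can be sketched in Python as follows. This is a minimal illustration under the stated model, not the calculator's actual source; the names `Estimate` and `estimate` are hypothetical.

```python
from dataclasses import dataclass

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


@dataclass
class Estimate:
    gpu_hours: float  # compute time at the sustained throughput
    gpu_kwh: float    # energy drawn by the GPU itself
    grid_kwh: float   # GPU energy plus data-center overhead (PUE)
    cost: float       # in the currency of price_per_kwh
    co2_kg: float     # kg CO2 for the month


def estimate(tokens, power_kw, tokens_per_s, pue, price_per_kwh, kg_co2_per_kwh):
    """Hypothetical helper that applies steps 1 to 4 in order."""
    # Step 1: energy per token. kW -> W, then W / (tokens/s) = joules per token.
    joules_per_token = power_kw * 1000.0 / tokens_per_s
    kwh_per_token = joules_per_token / JOULES_PER_KWH

    # Step 2: GPU energy for the monthly token volume.
    gpu_kwh = kwh_per_token * tokens

    # Step 3: grid energy, scaled by PUE for cooling and power distribution.
    grid_kwh = gpu_kwh * pue

    # Step 4: cost and carbon from the grid energy.
    return Estimate(
        gpu_hours=tokens / tokens_per_s / 3600.0,
        gpu_kwh=gpu_kwh,
        grid_kwh=grid_kwh,
        cost=grid_kwh * price_per_kwh,
        co2_kg=grid_kwh * kg_co2_per_kwh,
    )


if __name__ == "__main__":
    # 500 million tokens on a 0.7 kW GPU at 3,000 tokens/s.
    print(estimate(500e6, 0.7, 3000.0, 1.2, 0.15, 0.25))
```

Because every step is a plain multiplication, the result scales linearly with token volume, PUE, price, and carbon intensity, which makes it easy to sanity-check by hand.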

Power defaults reflect published board-power specs, and throughput defaults are ballpark figures for a mid-size model: an A100 draws ~0.4 kW and serves on the order of 1,500 tokens/s; an H100 draws ~0.7 kW but sustains roughly 3,000 tokens/s, so it uses less energy per token despite its higher peak power.
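
To make the per-token comparison concrete, here is a small Python sketch using only the default figures quoted above. The names `GPU_DEFAULTS` and `joules_per_token` are illustrative, not part of the calculator.

```python
# Default figures quoted above: (power draw in kW, throughput in tokens/s).
GPU_DEFAULTS = {
    "A100": (0.4, 1500.0),
    "H100": (0.7, 3000.0),
}


def joules_per_token(power_kw: float, tokens_per_s: float) -> float:
    """Energy per generated token in joules (kW -> W, then W / tokens-per-second)."""
    return power_kw * 1000.0 / tokens_per_s


per_token = {name: joules_per_token(*spec) for name, spec in GPU_DEFAULTS.items()}
for name, joules in per_token.items():
    print(f"{name}: {joules:.3f} J/token")

# Relative saving of the H100 versus the A100 under these defaults.
saving = 1.0 - per_token["H100"] / per_token["A100"]
print(f"H100 uses {saving:.1%} less energy per token")
```

Under these defaults the A100 comes out at about 0.27 J per token and the H100 at about 0.23 J, a saving of roughly one eighth. Real throughput depends heavily on model size, batch size, and serving stack, so the gap can be wider or narrower in practice.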

Worked example

Serving 500 million tokens/month on H100s at $0.15/kWh, with a grid carbon intensity of 0.25 kg CO2/kWh and a PUE of 1.2:

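  • Energy/token: 0.7 kW ÷ 3,000 tok/s ≈ 0.233 J/token ≈ 6.48e-8 kWh/token
  • GPU energy: 500 million tokens × 6.48e-8 kWh/token ≈ 32.4 kWh → with PUE 1.2 ≈ 38.9 kWh
  • Cost: 38.9 kWh × $0.15/kWh ≈ $5.83/month
  • Carbon: 38.9 kWh × 0.25 kg CO2/kWh ≈ 9.7 kg CO2/month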

That is roughly what a petrol car emits over 50 to 80 km: small for a single app, but material at fleet scale.
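
The arithmetic in this example, including the driving comparison, can be reproduced with a few lines of Python. The car emission range of 0.12 to 0.20 kg CO2 per km is an assumption introduced here for illustration; the page does not state which factor it uses.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules

# Inputs from the worked example.
tokens = 500e6                # tokens per month
power_kw = 0.7                # H100 power draw
tokens_per_s = 3000.0         # H100 sustained throughput
pue = 1.2                     # data-center overhead factor
price_per_kwh = 0.15          # USD per kWh
kg_co2_per_kwh = 0.25         # grid carbon intensity

kwh_per_token = power_kw * 1000.0 / tokens_per_s / JOULES_PER_KWH
gpu_kwh = kwh_per_token * tokens
grid_kwh = gpu_kwh * pue
cost = grid_kwh * price_per_kwh
co2_kg = grid_kwh * kg_co2_per_kwh

# ASSUMPTION: a petrol car emits between 0.12 and 0.20 kg CO2 per km.
CAR_KG_PER_KM_LOW, CAR_KG_PER_KM_HIGH = 0.12, 0.20
km_max = co2_kg / CAR_KG_PER_KM_LOW    # efficient car: more km per kg of CO2
km_min = co2_kg / CAR_KG_PER_KM_HIGH   # thirsty car: fewer km per kg of CO2

print(f"Energy/token: {kwh_per_token:.2e} kWh")
print(f"GPU energy:   {gpu_kwh:.1f} kWh, grid energy: {grid_kwh:.1f} kWh")
print(f"Cost:         ${cost:.2f}/month")
print(f"Carbon:       {co2_kg:.1f} kg CO2/month")
print(f"Driving equivalent: {km_min:.0f} to {km_max:.0f} km")
```

With the assumed range, the driving equivalent lands at roughly 49 to 81 km, consistent with the 50 to 80 km quoted above.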

Tips

  • H100s usually win on energy-per-token despite higher wattage; throughput matters more than peak draw.
  • Cleaner cloud regions can cut your reported footprint by 5 to 10× with no code changes; schedule batch jobs where the grid is greenest.
  • Combine with the LLM API Cost Calculator to see dollar and carbon budgets together.
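
As an illustration of the region tip above: because emissions are grid energy times carbon intensity, moving the same workload between regions changes the footprint by the ratio of their intensities. The region names and intensity values in this Python sketch are hypothetical placeholders, not measured data; substitute figures from your cloud provider or grid operator.

```python
# HYPOTHETICAL grid carbon intensities in kg CO2 per kWh (illustrative only).
REGION_INTENSITY = {
    "region-a": 0.60,
    "region-b": 0.25,
    "region-c": 0.06,
}


def footprint_by_region(grid_kwh, intensities):
    """Monthly kg CO2 for the same grid energy in each region."""
    return {region: grid_kwh * kg for region, kg in intensities.items()}


def greenest_region(intensities):
    """Region with the lowest carbon intensity."""
    return min(intensities, key=intensities.get)


# 38.9 kWh is the monthly grid energy from the worked example.
footprints = footprint_by_region(38.9, REGION_INTENSITY)
best = greenest_region(REGION_INTENSITY)
worst = max(REGION_INTENSITY, key=REGION_INTENSITY.get)
ratio = footprints[worst] / footprints[best]

for region, kg in sorted(footprints.items(), key=lambda item: item[1]):
    print(f"{region}: {kg:.1f} kg CO2/month")
print(f"Moving from {worst} to {best} cuts emissions by about {ratio:.0f}x")
```

With these placeholder values the spread is about 10×, which is the upper end of the range mentioned in the tip; the real spread depends entirely on the regions you can actually deploy to.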