Training large models consumes a lot of electricity, and the resulting carbon emissions depend heavily on where and when you train. This estimator turns your hardware and run details into a CO2e figure using the same multiply-through approach as the widely cited Strubell et al. and Patterson et al. papers.
How it works
The calculation is a chain of four steps:
power_kW = chip_TDP_watts × chips / 1000
energy_kWh = power_kW × hours × PUE
emissions_g = energy_kWh × grid_gCO2e_per_kWh
tonnes = emissions_g / 1,000,000
PUE (power usage effectiveness) scales the IT load up to the whole facility, covering overheads such as cooling and power conversion, and the grid intensity converts energy into carbon. Default accelerator power draws use published TDPs: A100 ≈ 400 W, H100 ≈ 700 W, V100 ≈ 300 W, and a TPU v4 chip ≈ 200 W. The GPU figures correspond to the SXM variants; PCIe cards are rated lower.
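The chain above can be sketched in a few lines of Python. This is a minimal illustration rather than the estimator's actual code: the function name `estimate_tonnes_co2e`, the `DEFAULT_TDP_WATTS` table, and the 64-chip run in the usage lines are all hypothetical, and the sketch assumes every chip draws its full TDP for the whole run.

```python
# Default per-chip power draws in watts, taken from the figures quoted above.
# The dictionary name and keys are illustrative assumptions.
DEFAULT_TDP_WATTS = {"A100": 400, "H100": 700, "V100": 300, "TPUv4": 200}


def estimate_tonnes_co2e(chip, chips, hours, pue, grid_g_per_kwh, tdp_watts=None):
    """Return (energy_kWh, tonnes_CO2e) for a training run.

    Assumes every chip draws its full TDP for the entire run.
    Pass tdp_watts to override the default for the named chip.
    """
    watts = tdp_watts if tdp_watts is not None else DEFAULT_TDP_WATTS[chip]
    power_kw = watts * chips / 1000            # total IT load in kW
    energy_kwh = power_kw * hours * pue        # facility energy in kWh
    emissions_g = energy_kwh * grid_g_per_kwh  # grams CO2e
    return energy_kwh, emissions_g / 1_000_000  # grams -> tonnes


# Hypothetical run: 64 H100s for 100 hours at PUE 1.2 on a 400 g/kWh grid.
energy, tonnes = estimate_tonnes_co2e(
    "H100", chips=64, hours=100, pue=1.2, grid_g_per_kwh=400
)
print(f"{energy:,.0f} kWh, {tonnes:.2f} t CO2e")
```

For this made-up run the sketch gives roughly 5,376 kWh and 2.15 tonnes CO2e. Returning the energy alongside the emissions keeps the intermediate value visible, which makes it easier to sanity-check a result by hand.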
Example and notes
Running 256 A100s for 300 hours at PUE 1.1 on a 400 g/kWh grid uses about 256 × 0.4 kW × 300 h × 1.1 ≈ 33,800 kWh and emits roughly 13.5 tonnes CO2e, about 2.4% of the estimated GPT-3 training footprint (roughly 552 tonnes CO2e in Patterson et al.). Move the same run to a 50 g/kWh low-carbon grid and emissions fall to just under 1.7 tonnes, an eightfold reduction, which shows why region choice often matters more than raw efficiency. This is a planning estimate: it omits chip utilisation, networking, idle time, and embodied hardware carbon.
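The region comparison can be reproduced with a short script. This is a sketch under the same simplifying assumptions as the estimate itself; the helper name `run_emissions_tonnes` and the grid labels are hypothetical and do not refer to real regions.

```python
def run_emissions_tonnes(chips, chip_kw, hours, pue, grid_g_per_kwh):
    """Tonnes CO2e for a run, assuming chips draw full rated power throughout."""
    energy_kwh = chips * chip_kw * hours * pue
    return energy_kwh * grid_g_per_kwh / 1_000_000  # grams -> tonnes


# Grid intensities from the example; the labels are illustrative only.
grids = {"400 g/kWh grid": 400, "50 g/kWh grid": 50}

# Same run on each grid: 256 A100s (0.4 kW each), 300 hours, PUE 1.1.
results = {
    label: run_emissions_tonnes(256, 0.4, 300, 1.1, intensity)
    for label, intensity in grids.items()
}
for label, tonnes in results.items():
    print(f"{label}: {tonnes:.2f} t CO2e")

ratio = results["400 g/kWh grid"] / results["50 g/kWh grid"]
print(f"reduction factor: {ratio:.1f}x")
```

Because emissions scale linearly with grid intensity, the reduction factor is simply the ratio of the two intensities (400 / 50 = 8), independent of the hardware or run length.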