Knowledge distillation ROI calculator
Distilling a large model like GPT-4o into a smaller fine-tuned student can cut your per-request cost dramatically — but only if your volume is high enough to earn back the upfront investment. This calculator weighs the one-time cost of generating teacher data and fine-tuning against the ongoing per-request savings to find your break-even point.
How it works
There are two costs to recover. First, you pay for the GPT-4o tokens used to generate high-quality labeled outputs for your training set. Second, you pay for the fine-tuning job itself. Together these form the upfront investment. Each request served by the cheaper student model then returns a fixed saving: the difference between the teacher's and the student's per-request inference cost (inference_cost_delta in the formulas that follow).
upfront = teacher_generation_cost + fine_tuning_cost
daily_saving = inference_cost_delta × daily_requests
payback_days = upfront / daily_saving
year_net = daily_saving × 365 − upfront
If your volume is low, the payback may stretch beyond a year, which makes year_net negative. In that case, staying on the large model is the cheaper choice over a one-year horizon.
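The four formulas above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual implementation: the function name `distillation_roi`, the returned dictionary keys, and the dollar figures in the usage example are all hypothetical. Treating a non-positive daily saving as "never pays back" is likewise an assumption made here.

```python
import math


def distillation_roi(teacher_generation_cost, fine_tuning_cost,
                     inference_cost_delta, daily_requests):
    """Apply the break-even formulas to one set of inputs.

    All costs are in the same currency. inference_cost_delta is the
    per-request saving (teacher cost minus student cost).
    """
    upfront = teacher_generation_cost + fine_tuning_cost
    daily_saving = inference_cost_delta * daily_requests

    # Assumption: with no positive saving, the investment is never recovered.
    if daily_saving > 0:
        payback_days = upfront / daily_saving
    else:
        payback_days = math.inf

    year_net = daily_saving * 365 - upfront
    return {
        "upfront": upfront,
        "daily_saving": daily_saving,
        "payback_days": payback_days,
        "year_net": year_net,
    }


# Hypothetical inputs, chosen only to exercise the formulas.
result = distillation_roi(
    teacher_generation_cost=300.0,
    fine_tuning_cost=200.0,
    inference_cost_delta=0.008,
    daily_requests=5000,
)
print(result)
```

With these made-up inputs the upfront cost is 500 and the daily saving is about 40, so the payback lands at roughly 12.5 days. Dropping daily_requests to 50 with the same costs pushes the payback past three years and makes year_net negative.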
Tips and notes
- Volume is everything. Distillation pays back fast at thousands of daily requests and may never pay back at dozens.
- Budget for evaluation. A student model needs a quality gate before it replaces the teacher; factor that effort in even though it is not a token cost.
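- Re-distill as the teacher improves. When the teacher model gets cheaper or better, re-run the math: a cheaper teacher shrinks the per-request saving, and each re-distillation adds a new upfront cost, so the break-even shifts.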
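To put a number on the "volume is everything" tip, the payback formula can be rearranged to give the smallest daily volume that recovers the upfront cost within a chosen window: daily_requests ≥ upfront / (inference_cost_delta × target_days). The sketch below assumes this rearrangement. The helper name `min_daily_requests` and the example figures are hypothetical.

```python
import math


def min_daily_requests(upfront, inference_cost_delta, target_days):
    """Smallest whole number of daily requests that pays back
    the upfront cost within target_days."""
    if inference_cost_delta <= 0 or target_days <= 0:
        raise ValueError("inference_cost_delta and target_days must be positive")
    # Rearranged from payback_days = upfront / (delta * daily_requests).
    # Floating-point rounding can add one request when the ratio is
    # an exact integer; that is harmless for a volume estimate.
    return math.ceil(upfront / (inference_cost_delta * target_days))


# Hypothetical figures: 500 upfront, 0.008 saved per request.
for days in (30, 90, 365):
    print(days, min_daily_requests(500.0, 0.008, days))
```

Under these assumed figures, a one-year payback needs only a few hundred requests per day, while a 30-day payback needs a couple of thousand, which matches the dozens-versus-thousands intuition in the first tip.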