Fine-tuning dataset size estimator
“How many examples do I need?” is the first question of every fine-tuning project, and the honest answer is “it depends.” This tool turns that into a concrete starting number by weighing the three factors that matter most: what kind of task it is, how good the base model already is, and how high your target accuracy is.
How it works
Each task type carries a base example count reflecting its difficulty: teaching a fixed output format or a writing style needs far fewer examples than teaching new classification boundaries or domain knowledge. The estimator then scales that figure by your baseline capability (a strong base model needs fewer examples to reach the target) and by your target accuracy (chasing the last few percent costs disproportionately more data). The result is a minimum to start with and a recommended ceiling to plan toward.
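The scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual formula: the base counts, the two scaling factors, the 3x ceiling multiplier, and the names `BASE_EXAMPLES` and `estimate_examples` are all assumptions chosen only to reproduce the qualitative behaviour (harder tasks need more data, stronger baselines need less, and higher targets cost disproportionately more).

```python
# Hypothetical base counts per task type. The page gives no figures,
# so these are placeholders that only preserve the ordering it describes.
BASE_EXAMPLES = {
    "format": 100,
    "style": 200,
    "classification": 500,
    "domain_knowledge": 1000,
}

# Assumed ratio between the starting minimum and the planning ceiling.
CEILING_MULTIPLIER = 3


def estimate_examples(task_type, baseline_accuracy, target_accuracy):
    """Return (minimum, recommended_ceiling) as example counts.

    Accuracies are fractions: baseline in [0, 1], target in (0, 1).
    """
    if task_type not in BASE_EXAMPLES:
        raise ValueError(f"unknown task type: {task_type!r}")
    if not 0.0 <= baseline_accuracy <= 1.0:
        raise ValueError("baseline_accuracy must be in [0, 1]")
    if not 0.0 < target_accuracy < 1.0:
        raise ValueError("target_accuracy must be strictly between 0 and 1")

    base = BASE_EXAMPLES[task_type]

    # Assumed linear factor: 1.5x for a base model at 0% accuracy,
    # down to 0.5x for one that is already at 100%.
    baseline_factor = 1.5 - baseline_accuracy

    # Assumed inverse-error factor, normalised to 1.0 at an 80% target.
    # Halving the allowed error rate doubles the estimate.
    target_factor = 0.2 / (1.0 - target_accuracy)

    minimum = round(base * baseline_factor * target_factor)
    return minimum, minimum * CEILING_MULTIPLIER
```

With these placeholder numbers, `estimate_examples("classification", 0.5, 0.8)` returns `(500, 1500)`. Raising the target toward 0.95 or 0.99 grows the estimate much faster than the accuracy gain itself, which is the "last few percent" effect in miniature.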
Tips and notes
- Start at the minimum, not the maximum. Collect the smaller number, run a pilot fine-tune, and measure on a held-out set before labelling more.
- Quality beats quantity. A few hundred clean, diverse examples usually outperform thousands of noisy ones. Deduplicate and balance your classes.
- Match the eval to the goal. Your held-out set should reflect real production inputs, or your accuracy number will mislead you.
- Reserve a test split. Always hold back data the model never sees during training so your reported accuracy is honest.
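
The deduplication, class-balance, and held-out-split tips above can be sketched with the Python standard library alone. This is one possible approach under stated assumptions: examples are `(text, label)` pairs, duplicates are detected by a simple lowercase and whitespace normalisation, and the function names and the 20% test fraction are illustrative rather than anything the tool prescribes.

```python
import random
from collections import Counter


def dedupe(examples):
    """Drop examples whose text repeats after simple normalisation.

    Assumption: two texts are duplicates if they match once lowercased
    and with runs of whitespace collapsed. The first occurrence is kept.
    """
    seen = set()
    unique = []
    for text, label in examples:
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append((text, label))
    return unique


def class_counts(examples):
    """Count examples per label so imbalance is visible at a glance."""
    return Counter(label for _, label in examples)


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed and hold back a test split.

    The test examples are never returned as part of the training list.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

A typical order is `dedupe` first, then `class_counts` to check balance, then `train_test_split`. Deduplicating before splitting matters: otherwise the same example can land in both the training and test sets and inflate the reported accuracy.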