Fine-Tuning (AI Glossary)

Updating a pre-trained model's weights on a task-specific dataset

Definition

Fine-tuning is the process of taking a model that has already been pre-trained on vast general data and continuing to train it on a smaller, task-specific dataset. The model’s weights are updated so it specialises in your domain, style, or output format while retaining the broad language ability it learned during pre-training. The result is a custom model that behaves the way you want without needing long, example-heavy prompts at run time.

Flavours of fine-tuning

The term covers several overlapping techniques, which differ in how many weights are updated and in what kind of data is used:

  • Full fine-tuning — every weight in the model is updated. Most powerful but expensive in compute and storage, and produces a full-size copy per task.
  • Supervised fine-tuning (SFT) — training on labelled (input, desired output) pairs; the standard way to teach a base model a specific task.
  • Instruction tuning — a form of SFT on many (instruction, response) pairs that turns a raw base model into a helpful, instruction-following assistant.
  • PEFT (parameter-efficient fine-tuning) — methods like LoRA and QLoRA that train only a tiny set of extra parameters, dramatically cutting cost and memory.
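To make the (instruction, response) pairs above concrete, here is a minimal sketch of how such pairs might be turned into training records. The `TEMPLATE` string, the `prompt`/`completion` field names, and the helper functions are hypothetical choices for illustration; real training frameworks each define their own expected format.

```python
import json

# Hypothetical prompt template (an assumption for this sketch); real projects
# use whatever format their base model or training framework expects.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def to_sft_record(instruction: str, response: str) -> dict:
    """Turn one (instruction, response) pair into a prompt/completion record."""
    return {
        "prompt": TEMPLATE.format(instruction=instruction.strip()),
        "completion": response.strip(),
    }


def to_jsonl(pairs: list[tuple[str, str]]) -> str:
    """Serialise pairs as JSON Lines: one training example per line."""
    return "\n".join(json.dumps(to_sft_record(i, r)) for i, r in pairs)


# Two made-up labelled examples.
pairs = [
    ("Summarise: The meeting moved to Friday.", "Meeting rescheduled to Friday."),
    ("Translate to French: Good morning.", "Bonjour."),
]
sft_jsonl = to_jsonl(pairs)
print(sft_jsonl)
```

During supervised fine-tuning the model is typically trained to produce the `completion` text given the `prompt` text, so the quality and consistency of these pairs matter more than their number.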

LoRA and QLoRA

LoRA (Low-Rank Adaptation) freezes the original weights and learns small, low-rank “adapter” matrices whose product is added to selected weight matrices to adjust the model’s behaviour. Because these adapters typically hold well under one percent of the model’s parameters, you can train and store many task-specific versions cheaply. QLoRA goes further by quantizing the frozen base model to 4-bit precision and training LoRA adapters on top of it, which cuts memory enough that models which would otherwise need several GPUs can be fine-tuned on a single one.
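The idea can be sketched in a few lines of plain Python. This is an illustrative toy, not a training implementation: the matrix sizes, the `alpha / r` scaling and zero initialisation of `B` (both common LoRA conventions), and the 7-billion-parameter model used in the memory estimate are assumptions chosen for the example.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lora_forward(W, A, B, x, alpha, r):
    """Compute y = W x + (alpha / r) * B (A x).

    W stays frozen; only the small matrices A and B would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_fraction(d_out, d_in, r):
    """Adapter parameters (A and B) as a fraction of the full weight matrix."""
    return r * (d_in + d_out) / (d_out * d_in)


def weight_memory_gb(n_params, bits):
    """Rough memory for the weights alone; ignores activations, optimizer
    state, and quantization overhead."""
    return n_params * bits / 8 / 1e9


random.seed(0)
d_out, d_in, r, alpha = 6, 4, 2, 4
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
B = [[0.0] * r for _ in range(d_out)]  # trainable; zeros make the adapter a no-op at start
x = [1.0, 2.0, 3.0, 4.0]

y = lora_forward(W, A, B, x, alpha, r)
print("output equals frozen model before training:", y == matvec(W, x))
print(f"adapter share for a 4096x4096 layer, r=8: {lora_param_fraction(4096, 4096, 8):.2%}")
print("hypothetical 7B model, 16-bit weights (GB):", weight_memory_gb(7e9, 16))
print("hypothetical 7B model, 4-bit weights (GB):", weight_memory_gb(7e9, 4))
```

For a 4096 by 4096 layer with rank 8, the adapter holds about 0.39% of that layer's parameters. The memory figures (14 GB versus 3.5 GB for the assumed 7B model) cover weights only, so they understate real training memory, but they show why 4-bit quantization of the frozen base model is what makes the QLoRA approach fit on smaller hardware.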

Fine-tuning vs. retrieval

A common mistake is fine-tuning to inject facts. Fine-tuning is excellent for teaching behaviour, tone, and format — but knowledge baked into weights goes stale and is hard to update. When information changes often or must be cited, retrieval-augmented generation (RAG) is usually the better tool, because it supplies fresh source documents at query time. Many production systems combine both: fine-tune for style, retrieve for facts.
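As a rough illustration of "supplying source documents at query time", the sketch below retrieves the best-matching document with a toy word-overlap score and places it in the prompt. The scorer, the prompt wording, and the example documents are all hypothetical; production systems usually use embedding-based search rather than word overlap.

```python
import re


def tokens(text):
    """Lower-case word set, used by the toy overlap scorer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query."""
    q = tokens(query)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Put the retrieved sources in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"


# Made-up knowledge base for the example.
docs = [
    "Refund window: 30 days from delivery.",
    "Support hours: weekdays 09:00 to 17:00.",
]
question = "How many days is the refund window?"
prompt = build_prompt(question, docs)
print(prompt)

# When the fact changes, only the document is edited; no retraining is needed.
docs[0] = "Refund window: 14 days from delivery."
updated_prompt = build_prompt(question, docs)
print(updated_prompt)
```

The second prompt picks up the new refund window immediately, which is the property that makes retrieval a better fit than fine-tuning for facts that change.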

When to fine-tune

Reach for fine-tuning when you need consistent output at scale (thousands of calls), when prompts with all needed examples would be too long or costly, or when prompting simply can’t deliver the reliability you need. Otherwise, start with prompting and retrieval — they are faster to iterate and avoid the data preparation, training cost, and maintenance burden that fine-tuning introduces.
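The "too long or costly" trade-off can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation under stated assumptions: every number is hypothetical, and it assumes the fine-tuned model is billed at the same per-token rate as the base model, which is not always the case.

```python
import math


def break_even_calls(one_off_cost, few_shot_tokens, short_tokens, price_per_token):
    """Number of calls after which prompt-token savings cover a one-off
    fine-tuning cost. Returns None if the shorter prompt saves nothing."""
    saving_per_call = (few_shot_tokens - short_tokens) * price_per_token
    if saving_per_call <= 0:
        return None
    return math.ceil(one_off_cost / saving_per_call)


# Hypothetical figures: a 2,000-token few-shot prompt is replaced by a
# 200-token prompt after fine-tuning that costs 200 currency units in total.
calls_needed = break_even_calls(
    one_off_cost=200.0,
    few_shot_tokens=2000,
    short_tokens=200,
    price_per_token=0.000002,
)
print("calls needed to break even:", calls_needed)
```

With these made-up figures the break-even point is 55,556 calls, so a workload of a few thousand calls would not recover the cost on token savings alone; the case for fine-tuning would then rest on reliability rather than price.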
