Question 1

What is the difference between zero-shot and few-shot prompting?

Accepted Answer

Zero-shot prompting gives the model only an instruction and the task, with no worked examples. Few-shot prompting includes a handful of input-output examples in the prompt to demonstrate the desired pattern before the real task. Few-shot usually improves accuracy and format consistency on tricky or unusual tasks, at the cost of a longer, more expensive prompt.
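
As a minimal sketch of the difference, the two prompt styles can be built from the same instruction. The function names, the "Input:/Output:" layout, and the sentiment task are illustrative assumptions, not a required format.

```python
def build_zero_shot(instruction: str, task_input: str) -> str:
    """Instruction plus the real task, with no worked examples."""
    return f"{instruction}\n\nInput: {task_input}\nOutput:"


def build_few_shot(instruction: str,
                   examples: list[tuple[str, str]],
                   task_input: str) -> str:
    """Instruction, then worked input-output pairs, then the real task."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {task_input}\nOutput:"


# Hypothetical task and examples, chosen only for illustration.
instruction = "Classify the sentiment of the review as positive or negative."
examples = [
    ("The battery lasts all day.", "positive"),
    ("It broke after a week.", "negative"),
]

zero_shot_prompt = build_zero_shot(instruction, "Setup was painless.")
few_shot_prompt = build_few_shot(instruction, examples, "Setup was painless.")

print(zero_shot_prompt)
print("---")
print(few_shot_prompt)
```

The few-shot prompt is the zero-shot prompt with the demonstrations spliced in before the real input, which is where the extra length comes from.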

Question 2

When should I fine-tune instead of using examples in the prompt?

Accepted Answer

Fine-tune when you need consistent behaviour or a specific style across many requests, when few-shot examples would make the prompt too long or costly, or when latency matters and you want a short prompt. Fine-tuning bakes the behaviour into the model's weights. It is overkill for one-off tasks and a poor way to inject facts that change frequently — use retrieval for those.
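
The criteria above can be written down as a rough rule of thumb. This is only a sketch: the TaskProfile fields and the suggest_approach function are hypothetical names, and a real decision would weigh these factors rather than check them in a fixed order.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    one_off: bool                   # a single task, not a recurring workload
    facts_change_frequently: bool   # the needed knowledge goes stale quickly
    needs_consistent_style: bool    # same behaviour or style across many requests
    few_shot_prompt_too_long: bool  # examples make the prompt too long or costly
    latency_sensitive: bool         # a short prompt matters for response time


def suggest_approach(task: TaskProfile) -> str:
    """Rough first guess; simplifies the trade-offs into an ordered checklist."""
    if task.facts_change_frequently:
        return "retrieval"      # fine-tuning is a poor way to inject changing facts
    if task.one_off:
        return "prompting"      # fine-tuning is overkill for one-off tasks
    if (task.needs_consistent_style
            or task.few_shot_prompt_too_long
            or task.latency_sensitive):
        return "fine-tuning"
    return "prompting"


support_bot = TaskProfile(
    one_off=False,
    facts_change_frequently=False,
    needs_consistent_style=True,
    few_shot_prompt_too_long=True,
    latency_sensitive=False,
)
print(suggest_approach(support_bot))
```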

Question 3

Does few-shot prompting cost more than zero-shot?

Accepted Answer

Yes. Every example you add to the prompt is extra input tokens, billed on every call and counted against the context window. For high-volume tasks, the cumulative cost of long few-shot prompts can exceed the one-time cost of fine-tuning a smaller model. The trade-off is convenience and flexibility now versus lower marginal cost later.
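
A back-of-the-envelope break-even calculation makes the trade-off concrete. The prices and token counts below are placeholder assumptions, not real provider rates. The sketch also assumes the fine-tuned model costs the same per token as the base model, which may not hold in practice.

```python
import math


def break_even_requests(extra_tokens_per_request: int,
                        price_per_million_input_tokens: float,
                        one_time_fine_tune_cost: float) -> int:
    """Number of requests at which the cumulative cost of the few-shot
    examples reaches the one-time fine-tuning cost."""
    cost_in_token_units = one_time_fine_tune_cost * 1_000_000
    per_request_units = price_per_million_input_tokens * extra_tokens_per_request
    return math.ceil(cost_in_token_units / per_request_units)


# Placeholder figures: 1,500 tokens of examples per call,
# 2.00 per million input tokens, 300.00 to fine-tune once.
requests_needed = break_even_requests(
    extra_tokens_per_request=1_500,
    price_per_million_input_tokens=2.00,
    one_time_fine_tune_cost=300.00,
)
print(f"Break-even after about {requests_needed:,} requests")
```

Below the break-even volume the few-shot prompt is the cheaper option; above it, the per-call overhead of the examples keeps accumulating.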

Question 4

Can I combine these approaches?

Accepted Answer

Absolutely, and strong systems often do. A common pattern is to fine-tune a model for a consistent base behaviour, then still use a short few-shot prompt or retrieved context at request time for the specific case. Start with the simplest method that works — usually zero-shot — and add examples or fine-tuning only when measurement shows you need them.
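
The "simplest method that works" advice can be sketched as an escalation loop: evaluate candidates from simplest to most involved and stop at the first one that meets your target. The function name, the stub evaluators, and the scores are made up for illustration. In practice each evaluator would run the method against a held-out test set.

```python
from typing import Callable, Optional


def pick_simplest(methods: list[tuple[str, Callable[[], float]]],
                  target_accuracy: float) -> Optional[str]:
    """Return the first method, in the order given, whose measured accuracy
    meets the target. Later methods are never evaluated."""
    for name, evaluate in methods:
        if evaluate() >= target_accuracy:
            return name
    return None  # nothing met the target


# Stub evaluators returning made-up accuracies, ordered simplest first.
candidates = [
    ("zero-shot", lambda: 0.78),
    ("few-shot", lambda: 0.91),
    ("fine-tuned + few-shot", lambda: 0.95),
]

chosen = pick_simplest(candidates, target_accuracy=0.90)
print(chosen)
```

Ordering the list from cheapest to most expensive means the costlier options are only measured when the cheaper ones fall short.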

Zero-Shot vs Few-Shot vs Fine-Tuning: When to Use Each

Three ways to adapt a model

Zero-shot: just ask

Few-shot: teach by example

Fine-tuning: bake it in

A quick decision guide