Definition
Instruction following is a large language model’s ability to take a request written in natural language and actually carry it out — answering the question asked, producing the requested format, and honouring constraints — instead of merely continuing the surrounding text. It is the capability that turns a raw language model into something people can use as an assistant, and it is the result of deliberate post-training rather than something that emerges from pre-training alone.
The base-model gap
A freshly pre-trained model — a base model — is optimised solely to predict the next token in web-scraped text. Give it the instruction “Write a haiku about autumn,” and it may well continue as though the instruction were part of a document: adding more instructions, drifting into unrelated text, or restating the prompt. It has no inherent concept that a request implies an obligation to comply. This gap between predicting plausible text and doing what is asked is exactly what instruction following closes.
How models learn to follow instructions
Two complementary techniques bridge the gap. The first is instruction tuning: supervised fine-tuning on large, diverse collections of instruction–response pairs, teaching the model the general pattern that a request should be answered directly and usefully. The second is RLHF (Reinforcement Learning from Human Feedback), which goes further by optimising the model against a reward model built from human preference rankings, sharpening helpfulness, tone, and compliance. Most modern assistants — the “instruct” or “chat” variants of a model — have gone through both.
Measuring instruction following
Because “did it follow the instruction?” can be subjective, researchers built benchmarks around verifiable instructions. IFEval is the best-known: it poses requests with automatically checkable constraints, such as “answer in exactly three bullet points,” “include the word ‘sustainability’ twice,” or “respond in all lowercase.” A simple program can verify compliance without human judgement, giving a clean numeric score for how reliably a model obeys precise requirements.
Why it matters
Instruction following is the foundation of practical LLM applications. Reliable formatting, constraint adherence, and faithful task completion are what make a model usable in pipelines, agents, and products where output is consumed programmatically. When a model ignores constraints — overshooting a length limit or using a banned word — downstream systems break. Strong instruction following is therefore not a nicety but a core requirement, and it remains an active area of research as instructions grow longer, more numerous, and more contradictory.