What is hallucination?
In AI, a hallucination is output that is fluent, confident, and plausible — but factually wrong or unsupported. Because a large language model generates text by predicting likely next tokens rather than looking up verified facts, it can produce a convincing answer that has no basis in reality. The danger is that hallucinations rarely look wrong; they are phrased with the same authority as correct answers.
Intrinsic vs extrinsic hallucination
Researchers usually split hallucinations into two kinds:
- Intrinsic — the output directly contradicts the source you provided. For example, you supply a contract and the model states a clause that says the opposite of what the document actually says.
- Extrinsic — the output adds claims that cannot be verified from the source at all. The information is not contradicted by the source, but it is invented and unsupported.
Both are problematic, but extrinsic hallucinations are often harder to catch because nothing in the supplied context conflicts with them: the reader has to notice that support is missing rather than spot a contradiction.
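The distinction can be made concrete with a toy sketch. It assumes the source has already been reduced to simple field/value facts, which real documents rarely are; the names `Verdict`, `classify_claim`, and the contract fields are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Verdict(Enum):
    SUPPORTED = "supported"
    INTRINSIC = "intrinsic hallucination"
    EXTRINSIC = "extrinsic hallucination"


def classify_claim(source_facts: dict[str, str], field: str, value: str) -> Verdict:
    """Compare one claimed (field, value) pair against the facts in the source."""
    if field not in source_facts:
        # The source is silent on this field, so the claim cannot be verified.
        return Verdict.EXTRINSIC
    if source_facts[field] == value:
        return Verdict.SUPPORTED
    # The source covers this field but states something different.
    return Verdict.INTRINSIC


# Hypothetical facts extracted from a supplied contract.
contract = {"notice_period": "30 days", "governing_law": "England and Wales"}

print(classify_claim(contract, "notice_period", "30 days").value)
print(classify_claim(contract, "notice_period", "90 days").value)
print(classify_claim(contract, "late_fee", "5% per month").value)
```

The third claim is the extrinsic case: the contract says nothing about a late fee, so no lookup against the source can confirm or refute it.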
Why language models hallucinate
Several factors drive hallucination:
- Next-token prediction — the model optimises for plausibility, not truth.
- Training gaps — when a topic is rare or poorly represented in training data, the model interpolates a likely-sounding answer.
- Pressure to answer — models are tuned to be helpful, so they tend to produce an answer rather than admit uncertainty.
- Decoding settings — higher temperature increases randomness and the chance of drifting away from grounded facts.
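The effect of temperature can be illustrated with a small softmax calculation. This is a minimal sketch of temperature scaling in general, not the decoding code of any particular model; the three logit values are made up, with the first standing in for the grounded answer.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token scores; index 0 is the well-supported candidate.
logits = [4.0, 2.0, 1.0]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all probability mass sits on the top candidate; as the temperature rises, the weaker candidates receive a larger share and are sampled more often.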
How to detect and reduce hallucination
Hallucination cannot be eliminated entirely, but several techniques reduce it:
- Grounding (RAG) — give the model retrieved, trusted documents and ask it to answer only from them.
- Ask for citations — require the model to quote or reference its source so unsupported claims become obvious.
- Lower temperature — for factual tasks, a temperature near 0 reduces creative drift.
- Permit uncertainty — explicitly tell the model it may answer “I do not know” instead of guessing.
- Verify — for anything that matters, a human or a second system should check the claim against a reliable source.
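Several of these techniques can be combined in a few lines. The sketch below builds a prompt that restricts the model to supplied passages, asks for exact quotes, and permits "I do not know", then checks that every quoted span really occurs in a passage. The function names, the prompt wording, and the sample passages and answer are all assumptions for illustration; the call to an actual model is left out because it depends on the API in use.

```python
import re


def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that restricts the model to numbered passages."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer using only the passages below.\n"
        "Support each claim with an exact quote in double quotes and its passage number.\n"
        'If the passages do not contain the answer, reply "I do not know".\n\n'
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}"
    )


def _normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def unsupported_quotes(answer: str, passages: list[str]) -> list[str]:
    """Return quoted spans in the answer that appear in none of the passages."""
    haystack = [_normalise(p) for p in passages]
    quotes = re.findall(r'"([^"]+)"', answer)
    return [q for q in quotes if not any(_normalise(q) in p for p in haystack)]


# Hypothetical retrieved passages and a hypothetical model answer.
passages = [
    "Either party may terminate this agreement with 30 days written notice.",
    "This agreement is governed by the laws of England and Wales.",
]
answer = (
    'The notice period is "30 days written notice" [1], '
    'and a "late fee of 5% per month" applies [1].'
)

print(build_grounded_prompt("What is the notice period?", passages))
print(unsupported_quotes(answer, passages))
```

Here the second quote is flagged because it appears in neither passage. A check like this only catches fabricated quotations; paraphrased or unquoted claims still need a human or a second system to verify them.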
The safest mindset is to treat every confident factual statement from an LLM as a draft that needs verification, not as a settled fact.