Question 1

What causes LLMs to hallucinate?

Accepted Answer

A language model predicts plausible text, not verified facts. When it lacks the right information, it fills the gap with confident-sounding fabrication. Hallucinations spike on niche, recent, or precise questions, on ambiguous prompts, and when the model is pushed to answer rather than allowed to say it does not know.

Question 2

Does RAG eliminate hallucinations?

Accepted Answer

No, but it sharply reduces them. Grounding the model in retrieved documents steers it to answer from supplied text rather than from memory. Hallucinations still occur if retrieval misses the right passage or if the prompt does not firmly instruct the model to answer only from context, so RAG must be paired with retrieval quality checks and an instruction to abstain.
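The pairing of a retrieval quality check with an abstention instruction could be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the function name `build_grounded_prompt`, the `(text, score)` passage shape, the `min_score` threshold, and the exact abstention wording are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical abstention phrase; any fixed, easily detected string would do.
ABSTAIN = "I don't know based on the provided context."


def build_grounded_prompt(
    question: str,
    passages: List[Tuple[str, float]],  # assumed shape: (passage text, retrieval score)
    min_score: float = 0.5,             # assumed threshold; tune per retriever
) -> Optional[str]:
    """Return a context-only prompt, or None when retrieval looks too weak."""
    # Retrieval quality check: drop passages below the relevance threshold.
    relevant = [text for text, score in passages if score >= min_score]
    if not relevant:
        # Nothing trustworthy was retrieved, so the caller should abstain
        # instead of letting the model answer from memory.
        return None

    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(relevant, start=1))
    return (
        "Answer using only the context below. "
        f'If the context does not contain the answer, reply exactly: "{ABSTAIN}"\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

A caller would send the returned prompt to the model, and return the abstention phrase directly whenever the function yields `None`.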

Question 3

What is self-consistency sampling?

Accepted Answer

It means generating several answers to the same prompt at a non-zero temperature and comparing them. If the samples agree, confidence is higher, although a model can still be consistently wrong; if they diverge, the question is likely beyond the model's reliable knowledge and the answer should be flagged or withheld. It trades extra cost for a measurable confidence signal.
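One way to turn agreement between samples into a number is a simple majority vote over normalized answers. The sketch below assumes short answers that can be compared as strings; `sample_fn` is a hypothetical callable standing in for a model call at non-zero temperature, and `n` and `min_agreement` are illustrative defaults.

```python
from collections import Counter
from typing import Callable, Optional, Tuple


def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing periods before comparing."""
    return " ".join(answer.lower().split()).rstrip(".")


def self_consistent_answer(
    sample_fn: Callable[[str], str],  # hypothetical: one sampled completion per call
    prompt: str,
    n: int = 5,
    min_agreement: float = 0.6,       # assumed threshold
) -> Tuple[Optional[str], float]:
    """Sample n answers; return (majority answer or None, agreement ratio)."""
    samples = [normalize(sample_fn(prompt)) for _ in range(n)]
    top, count = Counter(samples).most_common(1)[0]
    agreement = count / n
    # Withhold the answer when the samples diverge too much.
    return (top if agreement >= min_agreement else None), agreement
```

Exact string matching only suits short factual answers. Longer free-form answers would need a semantic comparison, which this sketch does not attempt.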

Question 4

How do I enforce citations?

Accepted Answer

Provide source passages with identifiers, instruct the model to attach a citation to every claim, and then programmatically verify that each cited passage actually supports the statement. Reject or flag any answer with unsupported or invented citations rather than trusting that they are real.
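A rough sketch of the verification step is shown below. It assumes citations appear as bracketed identifiers such as `[P1]` placed before the sentence-ending punctuation, and it uses plain word overlap as a crude stand-in for a real support check (in practice an entailment model or an LLM judge would likely replace it). The names `check_citations` and `min_overlap` are hypothetical.

```python
import re
from typing import Dict, List

CITATION = re.compile(r"\[(\w+)\]")  # assumed citation format: [P1], [P2], ...


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def check_citations(
    answer: str,
    passages: Dict[str, str],   # passage id -> passage text
    min_overlap: float = 0.5,   # assumed threshold for the overlap heuristic
) -> List[str]:
    """Return a list of problems; an empty list means every claim passed."""
    problems = []
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    for sentence in sentences:
        ids = CITATION.findall(sentence)
        if not ids:
            problems.append(f"uncited: {sentence}")
            continue
        claim = _tokens(CITATION.sub("", sentence))
        for pid in ids:
            if pid not in passages:
                # The model cited an identifier that was never supplied.
                problems.append(f"invented citation [{pid}]: {sentence}")
            elif claim and len(claim & _tokens(passages[pid])) / len(claim) < min_overlap:
                # Heuristic only: too few claim words appear in the cited passage.
                problems.append(f"unsupported by [{pid}]: {sentence}")
    return problems
```

An answer would be rejected or flagged whenever the returned list is non-empty.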

Question 5

How do I monitor hallucinations in production?

Accepted Answer

Log inputs, retrieved context, outputs, and any citations. Sample live traffic for automated faithfulness scoring with an LLM judge, track user feedback and correction signals, and alert when faithfulness drops. Every confirmed hallucination should become a new eval case so the failure is regression-tested going forward.
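The logging, sampling, and alerting loop might look something like the sketch below. Everything named here is an assumption for illustration: `judge` is a hypothetical callable that returns a faithfulness score between 0 and 1 (for example, a wrapper around an LLM judge), and the sample rate, window size, and alert threshold are placeholder values.

```python
import json
import logging
import random
from collections import deque
from dataclasses import asdict, dataclass, field
from typing import Callable, Deque, List, Optional

log = logging.getLogger("llm_monitor")  # hypothetical logger name


@dataclass
class Interaction:
    prompt: str
    context: List[str]
    output: str
    citations: List[str] = field(default_factory=list)


class FaithfulnessMonitor:
    def __init__(
        self,
        judge: Callable[[Interaction], float],  # hypothetical scorer in [0, 1]
        sample_rate: float = 0.1,
        window: int = 100,
        alert_below: float = 0.8,
        rng: Optional[random.Random] = None,
    ) -> None:
        self.judge = judge
        self.sample_rate = sample_rate
        self.alert_below = alert_below
        self.scores: Deque[float] = deque(maxlen=window)
        self.rng = rng or random.Random()

    def record(self, item: Interaction) -> Optional[str]:
        """Log every interaction; score a sample; return an alert message or None."""
        log.info(json.dumps(asdict(item)))
        if self.rng.random() >= self.sample_rate:
            return None  # not sampled for judging
        self.scores.append(self.judge(item))
        mean = sum(self.scores) / len(self.scores)
        if mean < self.alert_below:
            return f"ALERT: rolling faithfulness {mean:.2f} is below {self.alert_below:.2f}"
        return None


def to_eval_case(item: Interaction, expected: str) -> dict:
    """Turn a confirmed hallucination into a regression case with the corrected answer."""
    return {"prompt": item.prompt, "context": item.context, "expected": expected}
```

User feedback and correction signals are not modeled here; they could be attached to the same `Interaction` record and tracked alongside the judge scores.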

How to Reduce LLM Hallucinations in Production

Why models hallucinate

Grounding and abstention

Verification and sampling

Monitoring in production