AI response confidence estimator
Not every AI claim deserves the same suspicion. A model summarising a well-known concept is usually right; the same model citing a specific study, quoting a figure, or describing an event from last week is far more likely to invent something convincing. This estimator turns that intuition into a structured estimate: pick the domain, claim type, and time sensitivity, and get a reliability band plus a recommended verification step.
How it works
The estimate combines three risk factors, each documented independently. Domain sets a baseline: general knowledge and everyday reasoning start high, while law, medicine, and fast-moving technical specifics start lower because errors there are both more frequent and more costly. Claim type then adjusts that baseline: specific citations and exact statistics carry the largest penalty because fabrication rates for references and precise numbers are well above those for qualitative explanations. Finally, time sensitivity applies a further discount, since anything depending on recent events falls outside or near the edge of a model’s training data. The combined score maps to a confidence band and an action: trust, spot-check, or verify against a primary source.
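The three-factor combination described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the estimator's actual implementation: every category name, weight, and threshold is a hypothetical value chosen for the example, and multiplying the factors together is an assumption about how the adjustments might be combined.

```python
from dataclasses import dataclass

# ASSUMPTION: all keys, weights, and thresholds below are illustrative
# placeholders, not the values used by the real estimator.
DOMAIN_BASELINE = {
    "general_knowledge": 0.90,
    "everyday_reasoning": 0.85,
    "technical_specifics": 0.70,
    "law": 0.65,
    "medicine": 0.65,
}
CLAIM_TYPE_FACTOR = {
    "qualitative_explanation": 1.00,
    "exact_statistic": 0.55,
    "specific_citation": 0.50,  # largest penalty
}
TIME_FACTOR = {
    "timeless": 1.00,
    "slow_changing": 0.85,
    "recent": 0.50,  # near or past the training cutoff
}


@dataclass(frozen=True)
class Estimate:
    score: float
    band: str
    action: str


def estimate(domain: str, claim_type: str, time_sensitivity: str) -> Estimate:
    """Combine the three factors and map the score to a band and an action."""
    # Domain sets the baseline; claim type and time sensitivity scale it down.
    score = (
        DOMAIN_BASELINE[domain]
        * CLAIM_TYPE_FACTOR[claim_type]
        * TIME_FACTOR[time_sensitivity]
    )
    # Hypothetical cut-offs for the three bands.
    if score >= 0.75:
        return Estimate(score, "high", "trust")
    if score >= 0.50:
        return Estimate(score, "medium", "spot-check")
    return Estimate(score, "low", "verify against a primary source")


if __name__ == "__main__":
    for args in [
        ("general_knowledge", "qualitative_explanation", "timeless"),
        ("law", "qualitative_explanation", "timeless"),
        ("medicine", "specific_citation", "recent"),
    ]:
        result = estimate(*args)
        print(args, result.band, result.action)
```

With these placeholder weights, a timeless qualitative explanation in general knowledge lands in the high band, while any specific citation, exact statistic, or recent-event claim falls into the low band regardless of domain. Different weights would move those boundaries.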
Tips and notes
- Always verify citations and numbers. They are the most common type of hallucination, so treat a low score there as a hard rule.
- Recent-event claims need a live source. Models cannot reliably know what happened after their cutoff, even when they answer confidently.
- Use it as triage, not a verdict. A high score means “less scrutiny,” not “guaranteed correct.” High-stakes decisions still need a real source.