LLMs are trained to trust the context you give them — which is exactly what makes RAG pipelines vulnerable to wrong or poisoned documents. This tester injects a false premise into a context document, sends it to your own model, and checks whether the model echoes the falsehood or correctly resists it.
How it works
You provide a true fact, a plausible-but-false alternative, and a question whose answer depends on that fact. The tool builds a short context document stating the false claim, then asks your model the question against that context using your own API key (sent directly to OpenAI or Anthropic from your browser). It inspects the answer for the injected false value and for hedging/correction language, and reports a verdict: echoed the falsehood (bad) or resisted / flagged it (good).
Notes
The verdict is a heuristic — always read the full response yourself, because the dangerous case is an answer that quietly repeats the injected value without flagging it. A robust, well-grounded model should correct the claim, note the contradiction, or express uncertainty. Run the same test across models and temperatures to see which configuration is most faithful. Your API key is used only for the direct provider request and is never stored.