Annotate LLM output with per-sentence confidence
When an LLM gives you a paragraph of claims, some are rock-solid and some are plausible guesses. This tool uses your own API key to ask a model to re-read the text sentence by sentence and assign each one a confidence score, then highlights the shaky claims so you know where to focus your fact-checking. Your key is sent only to the provider you choose; it never passes through any other server.
How it works
- The pasted output is split into sentences locally.
- A single request is sent directly from your browser to OpenAI or Anthropic, asking the model to return a JSON array of { sentence, confidence } objects, where confidence is 0–100.
- The results are color-coded: green for high confidence, amber for medium, and red for low, which marks the claims most worth verifying.
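The local steps in the list above (splitting and color-coding) can be sketched roughly as follows. This is an illustration, not the tool's actual code: the function names `splitSentences` and `colorFor`, the naive punctuation-based splitter, and the 80/50 thresholds are all assumptions.

```typescript
type Color = "green" | "amber" | "red";

// Naive splitter (assumption): a sentence ends at a run of '.', '!' or '?',
// or at the end of the text. It will mis-split abbreviations and decimals
// such as "3.14", so a real splitter would need more care.
function splitSentences(text: string): string[] {
  const parts = text.match(/[^.!?]+(?:[.!?]+|$)/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Map a 0-100 confidence score to a highlight color.
// The default thresholds (80 and 50) are hypothetical.
function colorFor(confidence: number, high = 80, medium = 50): Color {
  if (confidence >= high) return "green";
  if (confidence >= medium) return "amber";
  return "red";
}
```

For example, `splitSentences("Paris is in France. Is it big? Yes!")` yields three sentences, and `colorFor(60)` falls in the amber band under these assumed thresholds.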
No proxy, no server: the only network call is the one from your machine to the provider you chose, authenticated with your key.
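A direct browser-to-provider call of this kind might be prepared as in the sketch below. The helper names (`buildScoringRequest`, `parseScores`), the prompt wording, and the `max_tokens` value are assumptions made for illustration; the endpoints and headers are the providers' public ones, and the Anthropic header that opts in to direct browser access is needed for calls made without a proxy.

```typescript
interface ScoredSentence { sentence: string; confidence: number; }
type Provider = "openai" | "anthropic";
interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the single scoring request. Nothing is sent here; the caller passes
// the result to fetch(url, init). The prompt text is a hypothetical example.
function buildScoringRequest(
  provider: Provider, apiKey: string, model: string, sentences: string[],
): PreparedRequest {
  const prompt =
    "Rate how confident you are that each sentence in this JSON array is " +
    "factually correct, from 0 to 100. Reply with only a JSON array of " +
    '{ "sentence", "confidence" } objects.\n\n' + JSON.stringify(sentences);
  const messages = [{ role: "user", content: prompt }];
  if (provider === "openai") {
    return {
      url: "https://api.openai.com/v1/chat/completions",
      init: {
        method: "POST",
        headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
        body: JSON.stringify({ model, messages }),
      },
    };
  }
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        // Opt-in header for requests made straight from a browser.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({ model, max_tokens: 1024, messages }),
    },
  };
}

// Validate the model's reply text and clamp scores into the 0-100 range.
function parseScores(raw: string): ScoredSentence[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) throw new Error("expected a JSON array");
  return data.map((item) => {
    const rec = (item ?? {}) as Record<string, unknown>;
    if (typeof rec.sentence !== "string" || typeof rec.confidence !== "number") {
      throw new Error("expected { sentence, confidence } objects");
    }
    return { sentence: rec.sentence, confidence: Math.min(100, Math.max(0, rec.confidence)) };
  });
}
```

In use, the reply text would be read from `choices[0].message.content` for OpenAI or `content[0].text` for Anthropic and then handed to `parseScores`. Models sometimes wrap JSON in extra prose, so a real implementation would likely need a more forgiving parser than this one.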
Tips and notes
- Calibration is imperfect. Treat a low score as “go check this”, not as proof of error — and never treat a high score as proof of truth.
- Use a cheap model (gpt-4o-mini or claude-3-5-haiku) for scoring; the task is simple and the cost per run stays tiny.
- This pairs well with a real source check: the annotator tells you where to look; you still confirm the facts against a primary source.