LLM Citation URL Validator

Extract and validate URLs cited by an LLM — check format and flag hallucinations.

Language models love to cite sources — and sometimes invent them. A fabricated URL looks completely plausible until someone clicks it and lands on a 404. This tool extracts every link from an LLM’s output and runs a battery of static checks that flag the URLs most likely to be hallucinated, so you can triage your references before they ship.

How it works

The extractor scans for http and https URLs, including those inside markdown [text](url) syntax and parentheses, and trims trailing punctuation. Each URL is then parsed with the browser’s URL constructor to confirm it is well formed, its top-level domain is checked against a list of common TLDs, and a set of heuristics looks for hallucination tells: placeholder hosts like example.com, suspiciously long random-looking path segments, far-future dates, and paths duplicated across multiple citations. Each URL is sorted into one of three groups (OK, malformed, or suspicious) with a short reason. Because the browser’s same-origin policy stops a page script from reading responses from arbitrary origins unless the server opts in through CORS, the tool deliberately does not fetch the links; it tells you which ones to open yourself.
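
The pipeline described above could be sketched roughly as follows. This is a minimal illustration and not the tool's actual source: the names (extractUrls, trimTrailing, classify, Finding), the short TLD and placeholder lists, the 24-character threshold for a "random-looking" segment, and the one-year margin for "far-future" are all assumptions chosen for the example.

```typescript
// Illustrative sketch only: every name, list, and threshold here is an assumption.
type Verdict = "ok" | "malformed" | "suspicious";

interface Finding {
  url: string;
  verdict: Verdict;
  reason: string;
}

// Deliberately short, assumed lists; a real checker would carry longer ones.
const COMMON_TLDS = new Set(["com", "org", "net", "edu", "gov", "io"]);
const PLACEHOLDER_HOSTS = new Set(["example.com", "example.org", "example.net"]);

// Count occurrences of a single character in a string.
function countChar(s: string, ch: string): number {
  return s.split(ch).length - 1;
}

// Strip trailing punctuation, plus any ")" that has no matching "(" in the URL.
function trimTrailing(raw: string): string {
  let url = raw;
  let changed = true;
  while (changed) {
    changed = false;
    const stripped = url.replace(/[.,;:!?\]]+$/, "");
    if (stripped !== url) {
      url = stripped;
      changed = true;
    }
    if (url.endsWith(")") && countChar(url, ")") > countChar(url, "(")) {
      url = url.slice(0, -1);
      changed = true;
    }
  }
  return url;
}

// Pull out every http/https URL, including ones inside markdown links.
function extractUrls(text: string): string[] {
  const matches: string[] = text.match(/https?:\/\/[^\s<>"']+/g) || [];
  return matches.map(trimTrailing);
}

// Run the static checks on one URL. The year is a parameter so the
// far-future check stays deterministic.
function classify(raw: string, currentYear: number): Finding {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return { url: raw, verdict: "malformed", reason: "rejected by the URL constructor" };
  }
  const host = parsed.hostname.replace(/^www\./, "");
  const tld = host.split(".").pop() || "";
  const segments = parsed.pathname.split("/");
  const suspicious = (reason: string): Finding => ({ url: raw, verdict: "suspicious", reason });

  if (PLACEHOLDER_HOSTS.has(host)) return suspicious("placeholder host");
  if (!COMMON_TLDS.has(tld)) return suspicious("uncommon TLD ." + tld);

  // Assumed tell: 24+ alphanumeric characters mixing letters and digits.
  const looksRandom = (s: string): boolean =>
    s.length >= 24 && /^[A-Za-z0-9]+$/.test(s) && /[0-9]/.test(s) && /[A-Za-z]/.test(s);
  if (segments.some(looksRandom)) return suspicious("long random-looking path segment");

  // Assumed tell: a four-digit path segment more than one year ahead.
  const futureYear = segments.some((s) => /^\d{4}$/.test(s) && Number(s) > currentYear + 1);
  if (futureYear) return suspicious("far-future date in path");

  return { url: raw, verdict: "ok", reason: "passed all static checks" };
}
```

Under these assumptions, extractUrls(text).map((u) => classify(u, new Date().getFullYear())) would yield one finding per URL. The unbalanced-parenthesis rule is there so that a markdown link loses its closing ")" while a path such as /wiki/Foo_(bar) keeps its own.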

Tips and notes

  • Suspicious is not the same as wrong. A long path can be a real permalink. The flags are a triage queue, not a verdict — open the flagged ones first.
  • Watch for repeated paths. When several citations share an identical trailing path, the model often pasted a template it filled with fake IDs.
  • Bare domains need a scheme. A reference written as acme.com/report without https:// is flagged so you can normalise it before linking.
  • Everything is local. No network calls, so confidential drafts stay private.
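
Two of the tips above, repeated paths and bare domains, lend themselves to a short sketch. The function names (repeatedPaths, findBareDomains) and the regular expression are hypothetical, and the bare-domain pattern is a rough heuristic that will also catch some non-URL text containing a dot and a slash.

```typescript
// Illustrative sketch only: names and patterns are assumptions, not the tool's code.

// Group URLs by pathname and keep the paths shared by more than one citation.
function repeatedPaths(urls: string[]): Map<string, string[]> {
  const byPath = new Map<string, string[]>();
  for (const u of urls) {
    let path: string;
    try {
      // Ignore trailing slashes so "/a/b" and "/a/b/" count as the same path.
      path = new URL(u).pathname.replace(/\/+$/, "");
    } catch {
      continue; // unparseable input is handled by the format check instead
    }
    if (path === "") continue; // a bare site root is not a meaningful repeat
    const group = byPath.get(path) || [];
    group.push(u);
    byPath.set(path, group);
  }
  const repeated = new Map<string, string[]>();
  byPath.forEach((group, path) => {
    if (group.length > 1) repeated.set(path, group);
  });
  return repeated;
}

// Find references such as "acme.com/report" that lack an http(s):// scheme.
// A candidate must start the text or follow whitespace, a bracket, or a quote,
// which keeps the host part of a full URL from being reported twice.
function findBareDomains(text: string): string[] {
  const pattern = /(^|[\s(\[<"'])((?:[a-z0-9-]+\.)+[a-z]{2,}\/[^\s<>"')\]]*)/gi;
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    found.push((m[2] || "").replace(/[.,;:!?]+$/, ""));
  }
  return found;
}
```

Each bare reference found this way can be normalised by prefixing https:// before it goes through the same static checks as the other URLs.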