URL & Link Extractor from LLM Output

Extract all URLs from LLM-generated text and check their format.

Language models confidently cite URLs that look real but do not exist. This tool pulls every link out of a block of LLM output, removes duplicates, checks that each one is well-formed, and flags the patterns that usually signal a hallucinated or unusable link, so you can verify before you publish or click.

How it works

A regular expression scans the text for both http/https URLs and bare www. domains, trims trailing punctuation, and deduplicates the results. Each link is then parsed with the browser’s URL API and checked against several heuristics: uncommon or missing top-level domains, example.* placeholder domains, localhost and loopback hosts, non-ASCII characters that can hide homograph attacks, unusually long hostnames, and opaque path segments LLMs tend to invent. Valid, unremarkable links are marked clean; anything questionable is flagged with the reason. Everything runs client-side.
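The pipeline described above can be sketched in TypeScript as follows. This is a minimal illustration and not the tool's actual source: the names URL_PATTERN, COMMON_TLDS, extractLinks, and inspect, the short TLD allow-list, and the 60-character hostname threshold are all assumptions made for the example. The opaque-path heuristic is left out because the description does not say how it is defined.

```typescript
// Hypothetical sketch; the real tool's regex and rule list may differ.
type LinkReport = { url: string; flags: string[] };

// Matches http(s) URLs and bare www. domains, up to whitespace or a quote/bracket.
const URL_PATTERN = /\b(?:https?:\/\/|www\.)[^\s<>"'`]+/gi;

// Illustrative allow-list (assumption); a real checker would need a far longer one.
const COMMON_TLDS = new Set(["com", "org", "net", "edu", "gov", "io", "dev"]);

function extractLinks(text: string): string[] {
  const seen = new Set<string>();
  for (const match of text.match(URL_PATTERN) ?? []) {
    // Trim trailing punctuation that usually belongs to the sentence, not the URL.
    // Note: this also trims a closing parenthesis that is genuinely part of a URL.
    const trimmed = match.replace(/[.,;:!?)\]}]+$/, "");
    // Give bare www. domains a scheme so the URL API can parse them.
    seen.add(/^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`);
  }
  return Array.from(seen); // a Set keeps first-seen order and drops duplicates
}

function inspect(link: string): LinkReport {
  let parsed: URL;
  try {
    parsed = new URL(link);
  } catch {
    return { url: link, flags: ["malformed"] };
  }
  const host = parsed.hostname;
  const labels = host.split(".");
  const tld = labels.length > 1 ? labels[labels.length - 1] : "";
  const flags: string[] = [];

  if (host === "localhost" || host === "[::1]" || /^127\./.test(host)) {
    flags.push("loopback host");
  } else if (!COMMON_TLDS.has(tld)) {
    flags.push("uncommon or missing TLD");
  }
  if (/^(?:www\.)?example\./.test(host)) flags.push("placeholder domain");
  // The URL API converts non-ASCII hostnames to punycode, so test the raw string too.
  if (/[^\x00-\x7F]/.test(link) || labels.some((l) => l.startsWith("xn--"))) {
    flags.push("non-ASCII characters");
  }
  if (host.length > 60) flags.push("long hostname"); // threshold is an assumption
  return { url: link, flags };
}

// Usage: an empty flags array corresponds to a "clean" link.
const reports = extractLinks(
  "See https://example.com/docs, and www.python.org. Dev server: http://localhost:3000/api!"
).map(inspect);
console.log(reports);
```

As in the tool itself, an empty flag list here only means the string parsed and matched none of the rules; nothing in the sketch contacts the network.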

Tips and notes

A green check means the URL is well-formed, not that the page exists. The tool cannot fetch links, so always open flagged ones (and ideally the clean ones too) before trusting a model’s citations. The uncommon-TLD flag is intentionally strict and will sometimes flag legitimate but rare domains; treat it as a prompt to look closer. Use the copy button to pull the deduplicated list of links into a link checker or your notes. Because nothing leaves your browser, it is safe to run on confidential drafts.
