Catch lookalike-character spoofing
Attackers register domains and create usernames that look identical to a trusted
name but use a different Unicode character: a Cyrillic а (U+0430) instead of a
Latin a, a Greek ο (U+03BF) instead of an o, or a fullwidth ａ (U+FF41). This is the basis of the
IDN homograph attack. This tool scans your text and flags every non-ASCII
character that imitates an ASCII letter or digit, showing which character it is
pretending to be and its code point.
How it works
The detector holds a mapping table of well-known confusable characters keyed
by the ASCII letter or digit they resemble (for example Cyrillic а U+0430,
е U+0435, о U+043E; Greek ο U+03BF, ρ U+03C1; fullwidth forms
U+FF21–U+FF5A; and letterlike/math symbols). It walks the input by code point;
any character that is not in the ASCII range U+0000–U+007F is looked up
in the table. A hit is reported as “looks like x”, along with the character’s
U+XXXX value so you can confirm exactly what it is. Pure ASCII text produces no
flags.
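The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the table holds only the examples named above, and the names BY_ASCII, LOOKUP, ascii_lookalike and find_confusables are assumptions made for this sketch.

```python
# Hypothetical, deliberately tiny table keyed by the ASCII character imitated.
# A real table would be far larger.
BY_ASCII = {
    "a": "\u0430",        # Cyrillic а
    "e": "\u0435",        # Cyrillic е
    "o": "\u043e\u03bf",  # Cyrillic о, Greek ο
    "p": "\u03c1",        # Greek ρ
}

# Invert the table once so each confusable maps to its ASCII lookalike.
LOOKUP = {ch: target for target, chars in BY_ASCII.items() for ch in chars}


def ascii_lookalike(ch):
    """Return the ASCII character that ch imitates, or None."""
    if ch in LOOKUP:
        return LOOKUP[ch]
    cp = ord(ch)
    # Fullwidth forms U+FF21-U+FF5A sit at a fixed offset (0xFEE0) from ASCII.
    if 0xFF21 <= cp <= 0xFF5A:
        candidate = chr(cp - 0xFEE0)
        # The range also contains six fullwidth punctuation marks; skip those.
        if candidate.isalpha():
            return candidate
    return None


def find_confusables(text):
    """Walk text by code point and collect (index, char, 'U+XXXX', lookalike)."""
    hits = []
    for index, ch in enumerate(text):
        if ord(ch) <= 0x7F:  # ASCII is never flagged
            continue
        target = ascii_lookalike(ch)
        if target is not None:
            hits.append((index, ch, f"U+{ord(ch):04X}", target))
    return hits


# Report every hit in a spoofed domain (first character is Cyrillic а).
for index, ch, code_point, target in find_confusables("\u0430pple.com"):
    print(f"position {index}: {ch} ({code_point}) looks like {target}")
```

Iterating over a Python string yields one code point at a time, which matches the "walks the input by code point" step. Non-ASCII characters that are not in the table, such as é, pass through without a flag.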
Example
The string аpple.com looks like apple.com, but the detector flags the first
character as Cyrillic а (U+0430) imitating Latin a. That single substitution
is enough to fool a quick glance and route a victim to an attacker-controlled
site.
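To confirm such a substitution by hand, one option is to print each character's code point and official Unicode name. The sketch below uses Python's standard unicodedata module; the variable name domain is only illustrative.

```python
import unicodedata

domain = "\u0430pple.com"  # first character is Cyrillic а, not Latin a

# List every character with its code point and Unicode name.
for ch in domain:
    name = unicodedata.name(ch, "<unnamed>")
    print(f"U+{ord(ch):04X}  {ch}  {name}")
```

The first line of output names the character as CYRILLIC SMALL LETTER A, while the remaining letters are reported as Latin.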
Tips and notes
- When a domain is flagged, check whether your browser shows it as
  xn--… Punycode; that confirms it is an internationalised (non-ASCII) name.
- Whole-script confusables (an entire word written in Cyrillic that spells a
  Latin word) are the hardest to spot by eye and the most dangerous; this tool
  surfaces every non-ASCII letter so a mixed-script string stands out.
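Both tips can be checked outside the browser. The sketch below is a rough aid under stated assumptions, not part of the tool: to_punycode and scripts_used are hypothetical helper names, Python's built-in idna codec implements the older IDNA 2003 rules (browsers may apply different ones), and taking the first word of a character's Unicode name is only a crude stand-in for a real script property lookup.

```python
import unicodedata


def to_punycode(domain):
    """Encode a domain with Python's built-in IDNA (2003) codec."""
    return domain.encode("idna").decode("ascii")


def scripts_used(text):
    """Rough script guess: first word of each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


spoofed = "\u0430pple.com"  # Cyrillic а followed by Latin letters
print(to_punycode(spoofed))            # an xn-- form, since the name is non-ASCII
print(sorted(scripts_used(spoofed)))   # more than one script: mixed-script name

# A whole-script word (here Cyrillic с, о, с, о) uses a single script.
print(sorted(scripts_used("\u0441\u043e\u0441\u043e")))
```

A mixed-script name reports more than one script, which is easy to flag. A whole-script confusable reports exactly one non-Latin script, so a mixed-script test alone will not catch it; that is why per-character flagging still matters.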