IRI Encoder (RFC 3987)

Percent-encode an IRI into an ASCII URI, preserving its URL structure


An IRI (Internationalized Resource Identifier, RFC 3987) is a generalization of the URI that may contain non-ASCII Unicode characters. That is handy for humans but not always accepted by strict clients. This tool converts an IRI to a plain ASCII URI by percent-encoding the non-ASCII parts, entirely in your browser.

How it works

The conversion walks the IRI character by character:

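  1. Characters in the allowed ASCII set are left as-is so the URL structure stays intact: unreserved characters (A-Z a-z 0-9 - . _ ~), reserved delimiters (: / ? # [ ] @ ! $ & ' ( ) * + , ; =), and %, which is kept so that existing percent-escapes are not encoded a second time.
  2. Every other character (any non-ASCII character, plus ASCII space, control characters, and the remaining ASCII characters outside the allowed set, such as " < > \ ^ ` { | }) is encoded to its UTF-8 bytes, and each byte becomes %XX.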
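The two steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's actual browser code; the names KEEP and iri_to_uri are hypothetical and chosen for this example.

```python
# Illustrative sketch of the two-step walk; names here are hypothetical.
UNRESERVED = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
RESERVED = ":/?#[]@!$&'()*+,;="
# "%" is kept so that existing percent-escapes pass through unchanged.
KEEP = frozenset(UNRESERVED + RESERVED + "%")


def iri_to_uri(iri: str) -> str:
    """Percent-encode every character outside KEEP as its UTF-8 bytes."""
    out = []
    for ch in iri:
        if ch in KEEP:
            # Step 1: allowed ASCII character, copied as-is.
            out.append(ch)
        else:
            # Step 2: one %XX triplet per UTF-8 byte of the character.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


print(iri_to_uri("https://example.com/café?q=naïve"))
# https://example.com/caf%C3%A9?q=na%C3%AFve
```

Because % is in the kept set, input that is already encoded (for example /caf%C3%A9) comes out unchanged, at the cost of also letting a stray literal % through. In Python, urllib.parse.quote called with the same characters in its safe argument should give an equivalent result.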

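This follows the IRI-to-URI mapping in Section 3.1 of RFC 3987, which percent-encodes non-ASCII characters as their UTF-8 bytes. Encoding spaces and other disallowed ASCII characters goes slightly beyond that mapping, as a leniency for input that is not strictly a valid IRI. The result preserves the scheme, host separators, and query structure while making the identifier safe for systems that only accept ASCII URIs.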

Tips and examples

The IRI https://example.com/café?q=naïve becomes https://example.com/caf%C3%A9?q=na%C3%AFve: the é and ï are each encoded as a two-byte UTF-8 sequence (%C3%A9 and %C3%AF), while the scheme, slashes, and the ? and = delimiters are untouched. Note that internationalized host names (the domain itself) are converted to Punycode (xn-- labels) under IDNA rather than percent-encoded, so this tool focuses on the path, query, and fragment.
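For the host name, a separate Punycode step can be sketched as follows. This is a rough illustration that assumes Python's built-in "idna" codec (which implements the older IDNA 2003 rules) is acceptable; the function name host_to_punycode and the domain bücher.example are hypothetical, and IPv6 literals and edge cases are not handled.

```python
from urllib.parse import urlsplit, urlunsplit


def host_to_punycode(iri: str) -> str:
    """Convert a non-ASCII host to xn-- labels; leave everything else alone."""
    parts = urlsplit(iri)
    host = parts.hostname
    if not host or host.isascii():
        # Nothing to convert (also skips IPv6 literals, which are ASCII).
        return iri
    # Python's built-in codec applies IDNA 2003 ToASCII to each label.
    ascii_host = host.encode("idna").decode("ascii")
    # Rebuild the authority, keeping any userinfo and port.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = userinfo + at + ascii_host
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(host_to_punycode("https://bücher.example/café"))
# https://xn--bcher-kva.example/café
```

Only the host changes here; the path still contains é and would then go through the percent-encoding step described above.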
