What is a Unicode block?

A Unicode block is a contiguous range of code points reserved for a related set of characters, such as Basic Latin, Cyrillic or Emoticons. Each block has a name and a fixed hexadecimal range.

How do I read a code point like U+1F600?

The U+ prefix denotes a Unicode code point written in hexadecimal. U+1F600 is the grinning-face emoji, which falls inside the Emoticons block U+1F600 to U+1F64F.
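As a minimal sketch in Python, the U+ notation can be converted to a character and back. The helper names parse_code_point and format_code_point are illustrative assumptions, not part of any library or of this tool.

```python
def parse_code_point(label: str) -> str:
    """Turn a label such as 'U+1F600' into the character it names."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"expected a U+ prefix, got {label!r}")
    # Everything after the prefix is a hexadecimal number.
    return chr(int(label[2:], 16))


def format_code_point(ch: str) -> str:
    """Turn a single character into its U+ label (at least four hex digits)."""
    return f"U+{ord(ch):04X}"


grinning = parse_code_point("U+1F600")
print(format_code_point(grinning))  # U+1F600
print(format_code_point("A"))       # U+0041
```

The four-digit minimum follows the usual convention: code points below U+10000 are padded with leading zeros, while larger ones simply use as many digits as they need.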

Is this list exhaustive?

No. It covers the most commonly used blocks for everyday text and emoji. The full Unicode standard defines several hundred blocks.

Which block are emoji in?

There is no single emoji block. Face emoji live in Emoticons (U+1F600–U+1F64F), many objects and symbols in Miscellaneous Symbols and Pictographs (U+1F300–U+1F5FF), travel symbols in Transport and Map Symbols (U+1F680–U+1F6FF), and newer emoji in Supplemental Symbols and Pictographs (U+1F900–U+1F9FF).

What is the difference between a Unicode block and a script?

A block is a fixed contiguous range of code points, while a script (like Latin or Cyrillic) is a writing system whose characters may be spread across several blocks. The two often overlap but are not the same concept.
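A small Python sketch can make the distinction concrete. Python's standard library does not expose the Unicode Script property, so this example assumes the first word of each character's official name is a good enough stand-in for the script. That is a rough proxy for illustration only, and script_hint is a hypothetical helper.

```python
import unicodedata

# Three Latin letters paired with the block each one sits in
# (block ranges as listed in this reference).
SAMPLES = {
    "A": "Basic Latin",              # U+0041
    "\u00E9": "Latin-1 Supplement",  # U+00E9, e with acute
    "\u017E": "Latin Extended-A",    # U+017E, z with caron
}


def script_hint(ch):
    """First word of the character's Unicode name.

    A rough proxy for the script, not the real Script property.
    """
    return unicodedata.name(ch).split()[0]


for ch, block in SAMPLES.items():
    print(f"U+{ord(ch):04X}", script_hint(ch), "|", block)
```

All three characters report LATIN as the hint, yet each belongs to a different block, which is the one-script, several-blocks situation described above.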

Why are Chinese, Japanese and Korean characters in one block?

CJK Unified Ideographs (U+4E00–U+9FFF) holds the Han characters shared across Chinese, Japanese and Korean. Because the scripts share thousands of common ideographs, Unicode unified them into one block rather than duplicating each.

How do I find which block a character belongs to?

Find the character's code point, then locate the block whose range contains it. For example, U+20AC (the euro sign) lies between U+20A0 and U+20CF, so it is in the Currency Symbols block.
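The lookup can be sketched in Python as a binary search over block start points. The table below is only a sample of rows from this reference, and find_block is an illustrative name; a real lookup would need the complete block list.

```python
from bisect import bisect_right

# Sample rows from this reference, sorted by starting code point.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x2000, 0x206F, "General Punctuation"),
    (0x20A0, 0x20CF, "Currency Symbols"),
    (0x2200, 0x22FF, "Mathematical Operators"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [start for start, _, _ in BLOCKS]


def find_block(ch):
    """Return the block name for one character, or None if it is outside the sample table."""
    cp = ord(ch)
    # Index of the last block that starts at or before this code point.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0:
        start, end, name = BLOCKS[i]
        if cp <= end:
            return name
    return None


print(find_block("\u20AC"))  # Currency Symbols
print(find_block("\u2603"))  # None (U+2603 is not covered by the sample table)
```

Because blocks never overlap, checking the single candidate found by the binary search is sufficient; there is no need to scan the rest of the table.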

What is the Unicode Blocks Reference?

A developer reference of common Unicode blocks, listing each block's name, its hexadecimal code-point range and the characters it contains. It is searchable and runs free in your browser on Gera Tools, fully client-side, with nothing uploaded.

Unicode Blocks Reference

Name: Unicode Blocks Reference
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Common Unicode blocks reference

Unicode organises its 149,000+ assigned code points into named blocks — fixed, non-overlapping ranges each reserved for a related family of characters, such as Basic Latin, Cyrillic, Mathematical Operators or Emoticons. This searchable reference lists the blocks developers and writers encounter most often, each with its hexadecimal code-point range and a short note on what it contains.

Why blocks matter for developers

Regex and input validation. When you need to accept only Latin text, or only CJK characters, or reject control characters, you need to know which ranges to include or exclude. A pattern like [\x00-\x7F] matches only ASCII; [\u0400-\u04FF] matches Cyrillic.
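A minimal Python sketch of block-based validation, using the Basic Latin and Cyrillic ranges. The function names are illustrative, and whether an empty string should pass is a design choice; here it is rejected.

```python
import re

ASCII_ONLY = re.compile(r"[\x00-\x7F]+")         # Basic Latin block
CYRILLIC_ONLY = re.compile(r"[\u0400-\u04FF]+")  # Cyrillic block


def is_ascii(text):
    """True if every character is in Basic Latin (empty strings are rejected)."""
    return ASCII_ONLY.fullmatch(text) is not None


def is_cyrillic(text):
    """True if every character is in the Cyrillic block (empty strings are rejected)."""
    return CYRILLIC_ONLY.fullmatch(text) is not None


print(is_ascii("hello"))       # True
print(is_ascii("h\u00E9llo"))  # False: U+00E9 is in Latin-1 Supplement
print(is_cyrillic("\u043F\u0440\u0438\u0432\u0435\u0442"))  # True
```

Note that a block check is stricter than it may look: spaces and ASCII punctuation are outside the Cyrillic block, so real Cyrillic sentences would need those ranges added to the character class.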

Font coverage. Rendering engines fall back to system fonts when a code point falls outside what the current font covers. Knowing the block helps you diagnose missing glyphs and choose fallback stacks.

Encoding bug diagnosis. A character that “looks like” another but behaves differently is often a near-identical code point from a different block — for example, a full-width A (U+FF21) from the Halfwidth and Fullwidth Forms block versus a standard A (U+0041). NFKC normalisation collapses these; knowing the blocks explains why.
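The full-width example can be checked with Python's standard unicodedata module. This is a small sketch; same_after_nfkc is a hypothetical helper name.

```python
import unicodedata

FULLWIDTH_A = "\uFF21"  # from Halfwidth and Fullwidth Forms
PLAIN_A = "\u0041"      # from Basic Latin


def same_after_nfkc(a, b):
    """Compare two strings after NFKC (compatibility) normalisation."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


print(FULLWIDTH_A == PLAIN_A)                  # False: different code points
print(same_after_nfkc(FULLWIDTH_A, PLAIN_A))   # True: NFKC folds the full-width form
print(unicodedata.normalize("NFC", FULLWIDTH_A) == PLAIN_A)  # False: NFC leaves it alone
print(unicodedata.name(FULLWIDTH_A))           # FULLWIDTH LATIN CAPITAL LETTER A
```

Only the compatibility forms (NFKC and NFKD) fold full-width letters into their ordinary counterparts; the canonical forms (NFC and NFD) keep them distinct.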

How to use this reference

Every Unicode character has a single code point, written with a U+ prefix followed by its hexadecimal value — for example U+0041 is the letter A and U+20AC is the euro sign. To find a character’s block, look for the range that contains its code point.

The table filters live as you type. Search by:

Block name (e.g., “arrows”, “cyrillic”)
Hex range (e.g., “U+2200” highlights Mathematical Operators)
Content keyword (e.g., “emoji”, “superscript”)

Frequently needed blocks

Block	Range	Typical use
Basic Latin (ASCII)	U+0000–U+007F	English, standard punctuation, control chars
Latin-1 Supplement	U+0080–U+00FF	Accented letters: é ü ñ
Latin Extended-A	U+0100–U+017F	Czech, Polish, Turkish: č ž ş
General Punctuation	U+2000–U+206F	Em dash, en dash, typographic quotes
Currency Symbols	U+20A0–U+20CF	₽ ₩ ₪ ₴ ₿
Letterlike Symbols	U+2100–U+214F	™ ℅ ℃ ℉
Mathematical Operators	U+2200–U+22FF	∑ ∫ ≤ √ ∞ ≠
Box Drawing	U+2500–U+257F	Terminal UI frames
Dingbats	U+2700–U+27BF	✓ ✗ ❄
CJK Unified Ideographs	U+4E00–U+9FFF	Chinese, Japanese, Korean Han
Hangul Syllables	U+AC00–U+D7AF	Korean
Private Use Area	U+E000–U+F8FF	Font-specific icons (e.g., Font Awesome)
Emoticons	U+1F600–U+1F64F	😀 😂 😍 face emoji
Misc. Symbols and Pictographs	U+1F300–U+1F5FF	🌍 🎃 🏠
Supplemental Symbols and Pictographs	U+1F900–U+1F9FF	🤔 🦊 newer emoji

The emoji block myth

Emoji are not in one block. When people say “emoji block,” they usually mean the Emoticons block (U+1F600–U+1F64F), but objects, animals, flags, and many other emoji are scattered across Miscellaneous Symbols (U+2600–U+26FF), Dingbats (U+2700–U+27BF), Misc. Symbols and Pictographs (U+1F300–U+1F5FF), Transport and Map Symbols (U+1F680–U+1F6FF), and Supplemental Symbols and Pictographs (U+1F900–U+1F9FF). There is no single emoji range.
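A short Python sketch shows what goes wrong when a pattern assumes that emoji means the Emoticons block. The second pattern combines only the ranges named above, so it is an illustration and not a complete emoji matcher; the sample names are assumptions chosen for the demo.

```python
import re

EMOTICONS_ONLY = re.compile("[\U0001F600-\U0001F64F]")

# Only the ranges named above. Not a complete emoji matcher.
LISTED_RANGES = re.compile(
    "["
    "\u2600-\u26FF"          # Miscellaneous Symbols
    "\u2700-\u27BF"          # Dingbats
    "\U0001F300-\U0001F5FF"  # Miscellaneous Symbols and Pictographs
    "\U0001F600-\U0001F64F"  # Emoticons
    "\U0001F680-\U0001F6FF"  # Transport and Map Symbols
    "\U0001F900-\U0001F9FF"  # Supplemental Symbols and Pictographs
    "]"
)

SAMPLES = {
    "grinning face": "\U0001F600",  # Emoticons
    "rocket": "\U0001F680",         # Transport and Map Symbols
    "sun": "\u2600",                # Miscellaneous Symbols
    "fox": "\U0001F98A",            # Supplemental Symbols and Pictographs
}

for label, ch in SAMPLES.items():
    in_emoticons = EMOTICONS_ONLY.fullmatch(ch) is not None
    in_listed = LISTED_RANGES.fullmatch(ch) is not None
    print(f"{label}: emoticons_only={in_emoticons} listed_ranges={in_listed}")
```

Only the grinning face matches the Emoticons-only pattern, while all four samples match the combined ranges.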

This reference covers the most-used blocks and runs entirely client-side — no data is uploaded.