Big5 Encoder/Decoder

Traditional Chinese Big5 byte encoding and decoding

Big5 is the established legacy encoding for Traditional Chinese, long used across Taiwan, Hong Kong, and Macau. It keeps ASCII as single bytes and stores each Traditional Chinese character as a two-byte pair. This tool encodes Traditional Chinese text into Big5 hex bytes and decodes Big5 bytes back into text.

How it works

ASCII characters 0x00–0x7F are stored as one byte. Every Chinese character is two bytes: a lead byte in 0x81–0xFE followed by a trail byte in 0x40–0x7E or 0xA1–0xFE. The gap between 0x7F and 0xA0 in the trail range is what keeps Big5 distinguishable from other double-byte schemes.
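The byte ranges above are enough to split a byte stream into units without any character table. The following is a minimal sketch in Python, not the tool's own code; the function name `split_big5` is hypothetical, and it checks only the ranges, not whether a given pair is actually assigned to a character.

```python
def split_big5(data: bytes) -> list[bytes]:
    """Split bytes into 1-byte ASCII units and 2-byte Big5 pairs (range check only)."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b <= 0x7F:
            # ASCII: stored as a single byte
            units.append(data[i:i + 1])
            i += 1
        elif 0x81 <= b <= 0xFE:
            # Lead byte: a trail byte must follow
            if i + 1 >= len(data):
                raise ValueError(f"truncated pair at offset {i}")
            t = data[i + 1]
            if not (0x40 <= t <= 0x7E or 0xA1 <= t <= 0xFE):
                raise ValueError(f"invalid trail byte 0x{t:02X} at offset {i + 1}")
            units.append(data[i:i + 2])
            i += 2
        else:
            # 0x80 and 0xFF are not valid lead bytes
            raise ValueError(f"invalid lead byte 0x{b:02X} at offset {i}")
    return units
```

For instance, `split_big5(bytes.fromhex("41a4a4a4e5"))` yields one single-byte unit followed by two pairs, while a trail byte inside the 0x7F–0xA0 gap raises `ValueError`.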

To use the exact Big5 mapping, the tool enumerates the single-byte range and all valid lead/trail pairs, decodes each pair with the browser’s native Big5 decoder, and builds a character-to-bytes map. Encoding looks each input character up in that map; decoding passes the hex bytes through the native decoder.
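The enumerate-and-decode approach can be sketched as follows. This is an illustration under stated assumptions rather than the tool's implementation: Python's built-in `big5` codec stands in for the browser's native decoder (the two tables may not agree on every code point), and the names `build_big5_map` and `encode_big5` are hypothetical.

```python
def build_big5_map(codec: str = "big5") -> dict[str, bytes]:
    """Build a character-to-bytes map by decoding every candidate byte sequence."""
    table: dict[str, bytes] = {}
    # Single-byte range: ASCII maps to itself
    for b in range(0x00, 0x80):
        table[chr(b)] = bytes([b])
    # Valid trail bytes: 0x40-0x7E and 0xA1-0xFE
    trails = [*range(0x40, 0x7F), *range(0xA1, 0xFF)]
    for lead in range(0x81, 0xFF):
        for trail in trails:
            pair = bytes([lead, trail])
            try:
                ch = pair.decode(codec)
            except UnicodeDecodeError:
                continue  # pair is not assigned in this decoder's table
            if len(ch) == 1:
                # Keep the first pair seen if a character appears twice
                table.setdefault(ch, pair)
    return table


def encode_big5(text: str, table: dict[str, bytes]) -> tuple[bytes, list[str]]:
    """Encode by table lookup; return the bytes plus any unmapped characters."""
    out = bytearray()
    unmapped = []
    for ch in text:
        pair = table.get(ch)
        if pair is None:
            unmapped.append(ch)
        else:
            out += pair
    return bytes(out), unmapped
```

Building the map once and reusing it keeps encoding a plain dictionary lookup per character, and characters missing from the table are collected instead of raising an error.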

Example and notes

  • "中文" encodes to a4 a4 a4 e5: two characters, each a two-byte pair, with 中 as A4 A4 and 文 as A4 E5.
  • Big5 is for Traditional Chinese; Simplified-only characters and many rare ideographs are not in the table and will be flagged as unmapped.
  • Several Big5 extensions exist (Big5-HKSCS adds Hong Kong characters). This tool follows the encoding your browser’s standard Big5 decoder implements; for full Unicode coverage, use UTF-8.
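
The example and the unmapped-character behavior in the notes above can be reproduced with a short round trip. This sketch assumes Python's built-in `big5` codec, which may not match a browser's Big5 table exactly; the helper names `to_big5_hex` and `from_big5_hex` are hypothetical.

```python
def to_big5_hex(text: str, codec: str = "big5") -> tuple[str, list[str]]:
    """Return space-separated hex bytes plus a list of characters with no mapping."""
    parts = []
    unmapped = []
    for ch in text:
        try:
            parts.append(ch.encode(codec).hex(" "))
        except UnicodeEncodeError:
            unmapped.append(ch)  # flag instead of failing the whole input
    return " ".join(parts), unmapped


def from_big5_hex(hex_text: str, codec: str = "big5") -> str:
    """Decode hex bytes (whitespace between bytes is allowed) back into text."""
    return bytes.fromhex(hex_text).decode(codec)


hex_bytes, unmapped = to_big5_hex("中文")
print(hex_bytes)                 # a4 a4 a4 e5
print(from_big5_hex(hex_bytes))  # 中文
```

Passing a different codec name, such as `big5hkscs`, is one way to compare how an extension changes which characters count as unmapped.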