Complete Simplified Chinese character set — 7,445 characters across 94 zones. Browse by zone, search by character, and inspect EUC-CN encoding.
How to Browse the GB2312 Table
Four simple steps to navigate the encoding standard.
1. Select a Zone
Use the color-coded zone tabs to navigate 94 zones. Yellow = symbols (01-09), Green = Level 1 Chinese (16-55), Blue = Level 2 Chinese (56-87), Gray = reserved zones (10-15, 88-94).
2. Browse the Character Grid
Each zone displays up to 94 characters in a 10-column grid. Every cell shows the character and its 2-digit position code within that zone.
3. Click for Encoding Details
Click any character to see its GB2312 zone-position code (区位码), EUC-CN hexadecimal encoding, and Unicode code point, all in a detail popup.
4. Search by Character
Type a Chinese character into the search box to instantly find and navigate to the zone containing it. Copy characters with one click.
What is GB2312?
GB2312 (GB0, 国家标准 2312) is the People's Republic of China's national standard for Simplified Chinese character encoding, published by the China National Bureau of Standards in 1980 and effective since May 1, 1981. It was mainland China's first national standard for Chinese character encoding. It is the foundation of the later GBK and GB18030 standards, and it served as one of the source character sets for Unicode's CJK Unified Ideographs.
GB2312 encodes 6,763 Chinese characters (3,755 Level 1 + 3,008 Level 2) plus 682 full-width symbols — including Latin letters, Greek letters, Cyrillic letters, Japanese hiragana and katakana, Chinese punctuation, and typographic symbols — for a total of 7,445 assigned code points out of 8,836 possible positions.
GB2312 Zone-Position Structure (区位码)
GB2312 uses a two-dimensional coordinate system: 94 zones (区, qū) numbered 01–94, each containing 94 positions (位, wèi) numbered 01–94. This gives 94 × 94 = 8,836 possible code points. A character's location is expressed as its 区位码 (qūwèi mǎ): the zone number followed by the position number, e.g., 啊 is at zone 16, position 01, written as 16-01.
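The coordinate system above can be sketched in a few lines of Python. This is a minimal illustration, not part of the standard: the helper names `parse_quwei` and `linear_index` are made up for this example, and the `ZZ-PP` string format follows the 16-01 notation used in this section.

```python
ZONES = 94      # zones (区) per grid
POSITIONS = 94  # positions (位) per zone


def parse_quwei(code: str) -> tuple[int, int]:
    """Split a 'ZZ-PP' zone-position string into (zone, position)."""
    zone_text, position_text = code.split("-")
    zone, position = int(zone_text), int(position_text)
    if not (1 <= zone <= ZONES and 1 <= position <= POSITIONS):
        raise ValueError(f"outside the 94 x 94 grid: {code!r}")
    return zone, position


def linear_index(zone: int, position: int) -> int:
    """Zero-based index of a cell when the grid is read zone by zone."""
    return (zone - 1) * POSITIONS + (position - 1)


print(ZONES * POSITIONS)      # 8836
print(parse_quwei("16-01"))   # (16, 1)
print(linear_index(16, 1))    # 1410
```

The linear index is only a convenience for storing the grid in a flat array; GB2312 itself defines nothing beyond the zone and position numbers.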
| Zones | Content | Count | Sort Order |
|---|---|---|---|
| 01–09 | Full-width symbols & punctuation | 682 assigned (of 846 positions) | By character type |
| 10–15 | Reserved (unassigned in GB2312) | n/a | n/a |
| 16–55 | Level 1 Chinese (一级汉字) | 3,755 | By Pinyin (拼音) |
| 56–87 | Level 2 Chinese (二级汉字) | 3,008 | By Radical + Stroke (部首/笔画) |
| 88–94 | Reserved (unassigned in GB2312) | n/a | n/a |
EUC-CN Encoding: How GB2312 Works in Computers
GB2312 defines a logical coordinate system (zone, position), but computers need a byte-level encoding. The standard computer representation is EUC-CN (Extended Unix Code for China), which encodes each GB2312 character as two bytes:
Byte 1 = Zone + 0xA0 (0xA1 – 0xFE)
Byte 2 = Position + 0xA0 (0xA1 – 0xFE)
Adding 0xA0 (160 in decimal) to each coordinate ensures that both bytes are always ≥ 0xA1, placing them safely above the ASCII range (0x00–0x7F) and allowing GB2312 text to coexist with ASCII in the same document without ambiguity. For example, the character 啊 at zone 16, position 01 is encoded as bytes 0xB0 0xA1 (16+160=176=0xB0, 1+160=161=0xA1). Each byte is in the range 0xA1–0xFE (161–254), giving 94 possible values each and enabling the full 94×94 coordinate space.
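The offset rule can be checked directly in Python. The sketch below assumes the helper name `quwei_to_euc_cn` (hypothetical, chosen for this example) and uses Python's built-in `gb2312` codec only to confirm the result for 啊.

```python
OFFSET = 0xA0  # 160, added to both the zone and the position


def quwei_to_euc_cn(zone: int, position: int) -> bytes:
    """Return the two EUC-CN bytes for a GB2312 zone-position pair."""
    if not (1 <= zone <= 94 and 1 <= position <= 94):
        raise ValueError("zone and position must both be in 1..94")
    return bytes([zone + OFFSET, position + OFFSET])


encoded = quwei_to_euc_cn(16, 1)
print(encoded.hex(" ").upper())   # B0 A1
print(encoded.decode("gb2312"))   # 啊
```

The function only applies the arithmetic; it does not check whether the cell is actually assigned a character in GB2312.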
GB2312 vs GBK vs GB18030
| Standard | Year | Characters | Encoding | Notes |
|---|---|---|---|---|
| GB2312 | 1980 | 7,445 | 2-byte EUC-CN | Original standard; Simplified Chinese only |
| GBK | 1995 | 21,886 | 2-byte (extended range) | Backward-compatible superset; includes all GB2312 characters plus Traditional Chinese and more symbols; widens the byte ranges to lead byte 0x81–0xFE and trail byte 0x40–0xFE (excluding 0x7F) |
| GB18030-2005 | 2005 | 70,244+ | 1/2/4-byte variable | Mandatory in China since 2006; covers all of Unicode; fully backward-compatible with GB2312 and GBK |
| GB18030-2022 | 2022 | 87,887+ | 1/2/4-byte variable | Latest revision; includes minority scripts |
Key point: Any valid GB2312 text is also valid GBK and GB18030. The EUC-CN byte pairs for the original 7,445 GB2312 characters are identical across all three standards — making GB2312 fully forward-compatible.
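The compatibility claim is easy to spot-check with Python's built-in `gb2312`, `gbk`, and `gb18030` codecs. This is a small sketch rather than an exhaustive proof: the helper names and the sample strings are chosen for illustration, and 镕 is used as an example of a character that GBK has but GB2312 lacks.

```python
GB_FAMILY = ("gb2312", "gbk", "gb18030")


def encodes_identically(text: str) -> bool:
    """True if `text` yields the same bytes under all three codecs."""
    raw = text.encode("gb2312")  # raises UnicodeEncodeError outside GB2312
    return all(
        text.encode(codec) == raw and raw.decode(codec) == text
        for codec in GB_FAMILY
    )


def in_gb2312(char: str) -> bool:
    """True if the character can be encoded with the gb2312 codec."""
    try:
        char.encode("gb2312")
    except UnicodeEncodeError:
        return False
    return True


print(encodes_identically("中文编码 GB2312"))  # True
print(in_gb2312("镕"))                          # False
print(len("镕".encode("gbk")))                  # 2
```

The relationship only runs one way: text that uses GBK or GB18030 additions, such as 镕, cannot be written back as GB2312.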
Level 1 vs Level 2 Chinese Characters
GB2312 splits Chinese characters into two levels based on frequency of use:
Level 1 (一级汉字, zones 16–55): the 3,755 most commonly used Chinese characters, sorted in Hanyu Pinyin (汉语拼音) alphabetical order. Characters sharing the same pronunciation are sorted by stroke count. These make up the overwhelming majority of characters in everyday Chinese text such as newspapers, books, websites, and daily communication (the full 6,763-character set is commonly cited as covering about 99.75% of usage).
Level 2 (二级汉字, zones 56–87): 3,008 less common characters, sorted by radical (部首) and then by stroke count. These include characters used in names, place names, classical literature, and specialized terminology. While individually rare, they are essential for comprehensive Chinese text processing.
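Because the two levels occupy separate zone ranges, a character's level can be read off its first EUC-CN byte. The sketch below assumes Python's built-in `gb2312` codec; the function name `hanzi_level` and its return labels are invented for this example.

```python
def hanzi_level(char: str) -> str:
    """Classify a single character by the GB2312 zone it falls in."""
    try:
        raw = char.encode("gb2312")
    except UnicodeEncodeError:
        return "not in GB2312"
    if len(raw) != 2:
        return "ASCII"  # single-byte characters are outside the zone grid
    zone = raw[0] - 0xA0
    if 16 <= zone <= 55:
        return "Level 1"
    if 56 <= zone <= 87:
        return "Level 2"
    return "symbol"


# The character counts follow from the zone ranges:
# zones 16-54 are full (39 x 94), so zone 55 holds the remaining Level 1 cells.
print(3755 - 39 * 94)    # 89
print(32 * 94)           # 3008

print(hanzi_level("啊"))  # Level 1
print(hanzi_level("亍"))  # Level 2
```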
Historical Significance
GB2312 was developed at a pivotal moment in Chinese computing history. Before its adoption there was no unified standard, and systems relied on mutually incompatible encoding schemes, so Chinese text could not be exchanged reliably between them. GB2312 unified Simplified Chinese encoding under a single national standard, enabling the first generation of Chinese-language software, operating systems (CCDOS, early Windows Chinese editions), email, and eventually the Chinese internet. Its structure (the 94×94 zone grid, the 0xA0 offset for ASCII coexistence, and the pinyin and radical sort orders) carries over unchanged into the GB2312-compatible portions of GBK and GB18030, and GB2312 was one of the source standards for Unicode's CJK Unified Ideographs. While modern systems now mostly use Unicode (UTF-8), understanding GB2312 remains essential for working with legacy Chinese systems, embedded devices, mainframe terminals, and historical digital archives.
Frequently Asked Questions About GB2312
What is GB2312 and why was it created?
GB2312 (国家标准 2312, National Standard 2312) is China's original Simplified Chinese character encoding standard, published in 1980 and effective May 1, 1981. It was created to solve the fundamental problem of representing Chinese characters in digital computer systems, which were originally designed for Latin alphabets. Before GB2312 there was no unified standard: different systems used incompatible encodings, which made data exchange unreliable. GB2312 defined a single, authoritative encoding for 6,763 Chinese characters and 682 symbols, laying the foundation for Chinese-language computing for the next two decades.
How do GB2312, GBK, and GB18030 differ?
GB2312 (1980) is the original standard with 7,445 characters in a 94×94 zone grid. GBK (1995) is a backward-compatible superset that extends the encoding to 21,886 characters by widening the two-byte ranges (lead byte 0x81–0xFE, trail byte 0x40–0xFE excluding 0x7F) beyond the 0xA1–0xFE grid. GB18030 (2000, revised 2005 and 2022) is the current mandatory Chinese national standard, supporting over 87,000 characters using a variable-length 1/2/4-byte encoding. Crucially, GBK and GB18030 are backward-compatible with GB2312: the EUC-CN byte pairs for the original GB2312 characters are identical in both, so a GB2312 document remains valid in either newer standard.
How does EUC-CN encoding work mathematically?
EUC-CN encodes each GB2312 character as exactly 2 bytes. Given a zone number Z (01–94) and position number P (01–94), the two bytes are calculated as: Byte1 = Z + 160 (0xA0), Byte2 = P + 160 (0xA0). This produces byte values in the range 0xA1–0xFE (161–254). To decode, subtract 160 from each byte: Z = Byte1 − 160, P = Byte2 − 160. For example, the character 电 (diàn) is at zone 21, position 71, so its EUC-CN encoding is 0xB5 0xE7 (21+160=181=0xB5, 71+160=231=0xE7).
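The decoding direction described above (subtract 160 from each byte) can be sketched as follows. The helper name `euc_cn_to_quwei` is hypothetical, and Python's built-in `gb2312` codec is used only to obtain the bytes for 电.

```python
def euc_cn_to_quwei(raw: bytes) -> tuple[int, int]:
    """Recover (zone, position) from a two-byte EUC-CN sequence."""
    if len(raw) != 2 or not all(0xA1 <= byte <= 0xFE for byte in raw):
        raise ValueError("expected two bytes, each in 0xA1..0xFE")
    return raw[0] - 0xA0, raw[1] - 0xA0


print(euc_cn_to_quwei(bytes([0xB0, 0xA1])))      # (16, 1)
print(euc_cn_to_quwei("电".encode("gb2312")))    # (21, 71)
```

The range check rejects ASCII bytes and GBK-only byte values, neither of which corresponds to a cell in the 94×94 grid.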
What are the 94 zones and 94 positions?
The GB2312 standard organizes all characters into a 94×94 grid: 94 zones (区, qū) numbered 01–94, which form the rows, and 94 positions (位, wèi) numbered 01–94 within each zone, which form the columns, for a total of 94×94 = 8,836 theoretical code points. Zones 01–09 contain symbols and punctuation (full-width Latin, Greek, Cyrillic, Japanese kana, box-drawing characters). Zones 10–15 and 88–94 are reserved. Zones 16–55 contain the 3,755 most common Chinese characters (Level 1), sorted by pinyin. Zones 56–87 contain 3,008 less common Chinese characters (Level 2), sorted by radical and stroke count.
How many Chinese characters are in GB2312?
GB2312 contains 6,763 Chinese characters, divided into Level 1 (3,755 characters, zones 16–55) and Level 2 (3,008 characters, zones 56–87). In addition, it includes 682 full-width symbols and punctuation marks, for a total of 7,445 assigned code points. The 6,763 characters were selected to cover the vast majority of modern Chinese text: the set as a whole is commonly cited as covering about 99.75% of character usage, with Level 1 accounting for most of that.
Is GB2312 still used today?
While GB2312 is no longer the primary encoding for new systems — having been superseded by GBK (1995) and GB18030 (2000, mandatory since 2006), and increasingly by Unicode (UTF-8) — it remains important for several reasons: legacy system compatibility (many older Chinese databases, embedded systems, and industrial equipment still use GB2312), historical digital archives (Chinese government documents and early Chinese internet content), educational contexts (teaching how Chinese encoding evolved), and as a subset guarantee (any GB2312 text is valid GBK, GB18030, and can be losslessly converted to Unicode). Understanding GB2312 remains essential for engineers working with Chinese legacy systems.
How do I convert GB2312 to Unicode?
Converting GB2312-encoded text to Unicode is straightforward because there is a fixed mapping from GB2312's EUC-CN byte pairs to Unicode code points. In modern programming languages: in Python, use bytes.decode('gb2312') or bytes.decode('gbk'); in JavaScript, use new TextDecoder('gbk').decode(bytes) or new TextDecoder('gb2312').decode(bytes) (the WHATWG Encoding Standard treats the 'gb2312' label as GBK); on the command line, use iconv -f GB2312 -t UTF-8 input.txt > output.txt. All 7,445 GB2312 characters have Unicode equivalents: the Chinese characters fall in the CJK Unified Ideographs block (U+4E00–U+9FFF) and the symbols in other blocks. Decoding GB2312 to Unicode is lossless, and text that came from GB2312 converts back without loss. The reverse conversion fails only for Unicode characters that GB2312 does not contain.
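As a concrete version of the Python route, here is a minimal sketch. The function names `gb2312_to_utf8` and `convert_file` are illustrative, as is the choice to fail loudly on invalid bytes instead of substituting replacement characters.

```python
from pathlib import Path


def gb2312_to_utf8(raw: bytes, source_encoding: str = "gb2312") -> bytes:
    """Decode GB2312 bytes to text, then re-encode the text as UTF-8."""
    text = raw.decode(source_encoding)  # raises UnicodeDecodeError on bad input
    return text.encode("utf-8")


def convert_file(src: Path, dst: Path, source_encoding: str = "gb2312") -> None:
    """Convert a whole file; bytes are read and written without newline changes."""
    dst.write_bytes(gb2312_to_utf8(src.read_bytes(), source_encoding))


legacy = "中文".encode("gb2312")
converted = gb2312_to_utf8(legacy)
print(len(legacy), len(converted))   # 4 6
print(converted.decode("utf-8"))     # 中文
```

If a file labeled GB2312 fails to decode, passing `source_encoding="gbk"` or `"gb18030"` is a reasonable next attempt, since both accept every valid GB2312 byte pair.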
What does 区位码 (zone-position code) mean?
区位码 (qūwèi mǎ, zone-position code) is the coordinate system used by GB2312 to locate characters. Every character is identified by a 4-digit number: the first two digits are the zone (区, 01–94) and the last two digits are the position within that zone (位, 01–94). For example, the character 中 (zhōng, "middle") is at zone 54, position 48, so its location code is written as 5448. This system makes it easy to manually look up characters, type characters on numeric keypads (common on early Chinese typewriters and phones), and reference characters in technical documentation without needing to display the actual glyph. The zone-position code directly maps to EUC-CN bytes: add 160 (0xA0) to each component.
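A 4-digit zone-position code can be turned into a character, and back, with the add-160 rule. This sketch assumes Python's built-in `gb2312` codec for the final lookup; the function names are hypothetical.

```python
def quwei_code_to_char(code: str) -> str:
    """Look up the character for a 4-digit zone-position code such as '5448'."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected exactly four ASCII digits")
    zone, position = int(code[:2]), int(code[2:])
    if not (1 <= zone <= 94 and 1 <= position <= 94):
        raise ValueError("zone and position must both be in 01..94")
    return bytes([zone + 0xA0, position + 0xA0]).decode("gb2312")


def char_to_quwei_code(char: str) -> str:
    """Return the 4-digit zone-position code of a GB2312 character."""
    raw = char.encode("gb2312")
    if len(raw) != 2:
        raise ValueError("not a two-byte GB2312 character")
    return f"{raw[0] - 0xA0:02d}{raw[1] - 0xA0:02d}"


print(quwei_code_to_char("5448"))   # 中
print(char_to_quwei_code("啊"))     # 1601
```

Codes that point at reserved or unassigned cells have no mapping in the codec, so the lookup is expected to raise UnicodeDecodeError for them.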