# Code page 936 (IBM)

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Code_page_936_(IBM)
> Markdown URL: https://mediated.wiki/source/Code_page_936_(IBM).md
> Source: https://en.wikipedia.org/wiki/Code_page_936_(IBM)
> Source revision: 1247817584
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

Superseded Simplified Chinese character encoding, structured similarly to Shift JIS

| IBM-936 | |
| --- | --- |
| Alias(es) | SHIFTGB[1] |
| Language | Simplified Chinese |
| Created by | IBM |
| Current status | Deprecated |
| Transforms / Encodes | GB 2312 |
| Succeeded by | IBM-1381 |
| Other related encoding | Shift JIS |

**IBM code page 936** is a character encoding for [Simplified Chinese](/source/Simplified_Chinese) that includes 1,880 [user-defined characters](/source/Private_Use_Areas#Private-use_characters_in_other_character_sets) (UDC). It was superseded in 1993. It combines the single-byte [Code page 903](/source/Code_page_903) with the double-byte **Code page 928**.[2][3] **Code page 946** uses the same double-byte component, but with an extended single-byte component ([Code page 1042](/source/Code_page_1042)).[2][4]

IBM code page 936 should not be confused with [the identically numbered Windows code page](/source/Windows-936), which is a variant of the [GBK](/source/GBK_(character_encoding)) encoding;[2] GBK is called [Code page 1386](/source/Code_page_1386) by IBM. While GBK is a superset of the [EUC-CN](/source/EUC-CN) encoding of [GB 2312](/source/GB_2312), IBM-936 uses a different coded form of GB 2312, more closely resembling the relationship of [Shift JIS](/source/Shift_JIS) to [JIS X 0208](/source/JIS_X_0208).

## History

Except for [Shift JIS](/source/Shift_JIS) itself, the similarly structured code pages for other [CJK](/source/CJK_characters) locales were phased out between 1992 and 2016.

The encoding was in use mainly during the 1980s and early 1990s. While the original IBM PC ([IBM 5150](/source/IBM_5150)) lacked functionality for processing data in [CJK](/source/CJK_characters) languages, the [IBM 5550](/source/IBM_5550) possessed such functionality, and was available in models supporting Japanese, [Korean](/source/Korean_language), [Traditional Chinese](/source/Traditional_Chinese) or [Simplified Chinese](/source/Simplified_Chinese). Code page 936 for Simplified Chinese accompanied [code page 932](/source/Code_page_932_(IBM)) ([Shift JIS](/source/Shift_JIS)) for Japanese, [code page 934](/source/Code_page_934) for Korean and code page 938 for Traditional Chinese.

The last revision of IBM-928/936/946 was documented in 1992, and it was superseded in 1993 by the [EUC-CN](/source/EUC-CN)-based [code pages 1380 through 1383](/source/Code_page_1380); code page 1380 encodes the same characters as code page 928, but in a different layout.[5] As of 1998, "some older Chinese packages" still included an algorithm for converting between IBM-936 and other encodings of GB 2312.[1]

## Status

Although chart definitions for Code page 1380 (the document C-H 3-3220-130 1993-11) are provided online by IBM, IBM does not similarly provide the chart definition for the older Code page 928 (the document C-H 3-3220-130 1992-11, i.e. an earlier revision of the same specification).[5][6] [International Components for Unicode](/source/International_Components_for_Unicode) (ICU) does not include an IBM-936 or IBM-946 codec, and uses the Windows code page for the "cp936" label.[7] The ICU project does possess mapping data for IBM-946, which it makes publicly available,[8] but does not ship it with ICU.
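The same labelling convention can be observed outside ICU. As an illustration only, Python's standard library also resolves the "cp936" label to GBK rather than to IBM-936. The short sketch below checks this and shows that, for characters within GB 2312, the GBK bytes match the EUC-CN bytes. The sample string is an arbitrary choice, and the variable names are illustrative.

```python
import codecs

# Python treats "cp936" as an alias of its GBK codec (the Windows code page),
# so this label never selects the older IBM-936 (SHIFTGB) layout.
canonical_name = codecs.lookup("cp936").name

# Arbitrary sample text made of GB 2312 level 1 hanzi.
text = "中文"

# Python's "gb2312" codec is the EUC-CN coded form of GB 2312.
gbk_bytes = text.encode("cp936")
euc_cn_bytes = text.encode("gb2312")

# GBK is a superset of EUC-CN, so both codecs emit the same bytes here.
same_bytes = gbk_bytes == euc_cn_bytes

print(canonical_name, gbk_bytes.hex(), same_bytes)
```

Data labelled "cp936" by such tools should therefore be assumed to be GBK unless its provenance indicates an IBM system from before 1993.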

## Structure

Code page 928, the double-byte component, includes 9,355 characters as double-byte sequences starting with 0x81 through 0xAC and 0xF0 through 0xFA.[9]

The 0x81–AC lead byte range is used for GB 2312 characters: lead bytes 0x81–87 are used for non-hanzi, 0x88–9C for level 1 hanzi and 0x9C–AC for level 2 hanzi (the lead byte 0x9C is shared between the two levels).[1][5][8] Like [Shift JIS](/source/Shift_JIS), trail (second) bytes are in the range 0x40–FC excluding 0x7F, which gives 188 trail bytes and so allows two 94-cell GB 2312 rows to be encoded per lead byte;[8] unlike Shift JIS, the bytes 0xA0–AC are not excluded from the lead byte range,[5][8] since [JIS X 0201](/source/JIS_X_0201) compatibility was not required. The 0xF0–FA lead byte range is used for IBM extensions: 0xF0 through 0xF9 are used for user-defined characters, and 0xFA is used for additional non-hanzi.[5]
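The byte ranges above can be checked with a little arithmetic. The following Python sketch is not a codec: it only classifies lead bytes, validates trail bytes and counts code points, using the ranges quoted in this section. The function `lead_byte_for_row` is a hypothetical helper that assumes GB 2312 rows are packed two per lead byte starting at 0x81; this packing is an inference that happens to reproduce the quoted range boundaries, not an algorithm taken from IBM documentation. All names are illustrative.

```python
# Inclusive byte ranges as quoted in the Structure section.
TRAIL_MIN, TRAIL_MAX, TRAIL_EXCLUDED = 0x40, 0xFC, 0x7F

LEAD_RANGES = [
    (0x81, 0x87, "GB 2312 non-hanzi"),
    (0x88, 0x9C, "GB 2312 level 1 hanzi"),
    (0x9C, 0xAC, "GB 2312 level 2 hanzi"),  # 0x9C overlaps with level 1
    (0xF0, 0xF9, "user-defined characters"),
    (0xFA, 0xFA, "IBM additional non-hanzi"),
]


def classify_lead_byte(byte: int) -> list[str]:
    """Return every category whose lead byte range contains `byte`."""
    return [name for low, high, name in LEAD_RANGES if low <= byte <= high]


def is_trail_byte(byte: int) -> bool:
    """True for 0x40 through 0xFC, except 0x7F."""
    return TRAIL_MIN <= byte <= TRAIL_MAX and byte != TRAIL_EXCLUDED


def lead_byte_for_row(row: int) -> int:
    """Hypothetical packing (an assumption): two GB 2312 rows per lead byte."""
    if not 1 <= row <= 88:
        raise ValueError("only rows 1-88 fit in lead bytes 0x81-0xAC")
    return 0x81 + (row - 1) // 2


# 189 values in 0x40-0xFC, minus 0x7F, leaves 188 = 2 * 94 cells per lead byte.
TRAILS_PER_LEAD = sum(is_trail_byte(b) for b in range(256))

# Ten lead bytes (0xF0-0xF9) are reserved for user-defined characters.
UDC_SLOTS = (0xF9 - 0xF0 + 1) * TRAILS_PER_LEAD

print(TRAILS_PER_LEAD, UDC_SLOTS, classify_lead_byte(0x9C))
```

Under the assumed packing, row 16 (the first level 1 row) falls on lead byte 0x88, rows 55 and 56 (the last level 1 row and the first level 2 row) both fall on 0x9C, and row 87 falls on 0xAC, which agrees with the ranges given above. The user-defined area works out to 10 × 188 = 1,880 code points, matching the UDC count stated in the introduction.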

## References

1. Leisher, Mark (2008) [1998-03-06]. ["SHIFTGB.TXT: Shifted GB2312.1980. Generated from an algorithm provided with some older Chinese packages"](https://web.archive.org/web/20230120125054/http://sofia.nmsu.edu/~mleisher/Software/csets/SHIFTGB.TXT). Department of Mathematical Sciences, [New Mexico State University](/source/New_Mexico_State_University). Archived from [the original](http://sofia.nmsu.edu/~mleisher/Software/csets/SHIFTGB.TXT) on January 20, 2023.
2. [Lunde, Ken](/source/Ken_Lunde) (2009). "Chapter 4: Encoding Methods (§ Code Pages)". *CJKV Information Processing* (2nd ed.). [Sebastopol, California](/source/Sebastopol%2C_California): [O'Reilly Media](/source/O'Reilly_Media). pp. 278–282. [ISBN](/source/ISBN_(identifier)) [978-0-596-51447-1](https://en.wikipedia.org/wiki/Special:BookSources/978-0-596-51447-1).
3. ["CCSID 936"](https://web.archive.org/web/20160327035758/http://www-01.ibm.com/software/globalization/ccsid/ccsid936.html). [IBM](/source/IBM). Archived from [the original](http://www-01.ibm.com/software/globalization/ccsid/ccsid936.html) on March 27, 2016.
4. ["CCSID 946"](https://web.archive.org/web/20160326215526/http://www-01.ibm.com/software/globalization/ccsid/ccsid946.html). [IBM](/source/IBM). Archived from [the original](http://www-01.ibm.com/software/globalization/ccsid/ccsid946.html) on March 26, 2016.
5. "Table 1: Registration of GCSGID and CPGID for the IBM CH-S Graphic Character Set". [*C-H 3-3220-130 1993-11: IBM Simplified Chinese Graphic Character Set*](https://public.dhe.ibm.com/software/globalization/gcoc/attachments/CP01380.pdf) (PDF). 1993. p. 6.
6. ["Code page 928 information document"](https://web.archive.org/web/20160317015802/http://www-01.ibm.com/software/globalization/cp/cp00928.html). Archived from [the original](https://www-01.ibm.com/software/globalization/cp/cp00928.html) on March 17, 2016.
7. ["windows-936-2000 (alias cp936)"](https://ssl.icu-project.org/icu-bin/convexp?conv=cp936). *ICU Demonstration – Converter Explorer*. International Components for Unicode.
8. ["ibm-946_P100-1995"](https://github.com/unicode-org/icu-data/blob/main/charset/data/ucm/ibm-946_P100-1995.ucm). *[International Components for Unicode](/source/International_Components_for_Unicode) Data Repository*. [Unicode Consortium](/source/Unicode_Consortium), [IBM](/source/IBM).
9. ["CCSID 928 information document"](https://web.archive.org/web/20160326215312/http://www-01.ibm.com/software/globalization/ccsid/ccsid928.html). Archived from [the original](http://www-01.ibm.com/software/globalization/ccsid/ccsid928.html) on March 26, 2016.


---
Adapted from the Wikipedia article [Code page 936 (IBM)](https://en.wikipedia.org/wiki/Code_page_936_(IBM)) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Code_page_936_(IBM)?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
