{{Short description|Multi-byte graphic character set for Chinese}}
{{Infobox character encoding
| name = CCITT Chinese set (ISO-IR 165)
| mime = iso-ir-165
| alias = {{code|CN-GB-ISOIR165}} ([[Extended Unix Code|EUC]] form)<ref name="rfc1922" />
| definitions = [[ISO-IR]] 165
| standard = [[Videotex character set|ITU T.101]], annex C
| lang = [[Simplified Chinese]], [[English language|English]], [[Russian language|Russian]]<br/>'''Partial support:'''<br/>[[Greek language|Greek]], [[Japanese language|Japanese]]
| status =
| extends = [[GB 2312]]
| prev =
| next = [[GB 18030]]
| otherrelated =
| encodings = [[ISO-2022-CN|ISO-2022-CN-EXT]], [[Videotex character set|Videotex Data Syntax 2]]
}}
The '''CCITT Chinese Primary Set'''<ref name="chung" /> is a multi-byte graphic [[character set]] for [[Chinese language|Chinese]] communications created for the [[ITU-T|Consultative Committee on International Telephone and Telegraph (CCITT)]] in 1992.<ref name="lunde2009"> {{cite book |title=CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing |last=Lunde |first=Ken |authorlink=Ken Lunde |year=2009 |edition=2nd |publisher=[[O'Reilly Media|O'Reilly]] |location=[[Sebastopol, CA]] |isbn=978-0-596-51447-1 |pages=94–111 }}</ref> It is defined in [[Videotex character set|ITU T.101]], annex C, which codifies Data Syntax 2 [[Videotex]].<ref name="chung">{{cite web |url=https://appsrv.cse.cuhk.edu.hk/~irg/irg/irg50/IRGN2276.pdf |title=Pseudo-G8 characters |last=Chung |first=Jaemin |date=2018-01-24 |id=[[ISO/IEC JTC 1/SC 2]]/WG 2/[[Ideographic Research Group|IRG]] N2276}}</ref> It is registered with the [[ISO-IR]] registry for use with [[ISO/IEC 2022]] as '''ISO-IR-165''',<ref name="iso-ir"/> and encodable in the [[ISO-2022-CN|ISO-2022-CN-EXT]] code version.<ref name="rfc1922">{{citation|mode=cs1 |id=<nowiki>RFC 1922</nowiki> |title=Chinese Character Encoding for Internet Messages |url=https://tools.ietf.org/html/rfc1922 |first1=HF. |last1=Zhu |first2=DY. |last2=Hu |first3=ZG. |last3=Wang |first4=TC. |last4=Kao |first5=WCH. |last5=Chang |first6=M. |last6=Crispin |publisher=[[IETF]] |work=Requests for Comments |date=1996 |doi=10.17487/rfc1922|doi-access= |url-access=subscription }}</ref>

It is an extended modification of [[GB 2312|GB/T&nbsp;2312-80]], and corresponds to the union of the mainland Chinese [[GB standards]] '''GB 6345.1'''-86 and '''GB 8565.2'''-88, with some further modifications and extensions. A subset of the GB 6345.1 extensions is incorporated into [[GB 18030]], while GB 8565.2 serves as the mainland Chinese source reference for certain [[CJK Unified Ideographs]].
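As a rough illustration of how the set is carried in ISO-2022-CN-EXT, the following Python sketch frames a run of row-cell positions for a single line of text. It assumes the RFC 1922 convention of designating ISO-IR-165 to G1 with the sequence <code>ESC $ ) E</code> and invoking it with the shift-out control; the names <code>kuten_to_7bit</code> and <code>frame_line</code> are hypothetical helpers, not part of any library, and the sketch ignores mixed ASCII text and multi-line input.

```python
ESC = b"\x1b"
SO = b"\x0e"  # shift out: invoke G1 (the designated two-byte set)
SI = b"\x0f"  # shift in: return to ASCII
DESIGNATE_ISO_IR_165 = ESC + b"$)E"  # assumed designation to G1, per RFC 1922


def kuten_to_7bit(row: int, cell: int) -> bytes:
    """Return the 7-bit two-byte code for a row-cell position (1..94 each)."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    return bytes([row + 0x20, cell + 0x20])


def frame_line(positions) -> bytes:
    """Hypothetical helper: wrap row-cell positions for one line of text."""
    body = b"".join(kuten_to_7bit(row, cell) for row, cell in positions)
    return DESIGNATE_ISO_IR_165 + SO + body + SI


print(frame_line([(79, 81)]).hex(" "))  # 1b 24 29 45 0e 6f 71 0f
```

A fuller encoder would also need to repeat the designation on each new line and interleave ASCII runs, which this sketch leaves out.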

==GB 6345.1==
GB 6345.1-86 (''32 × 32 Dot Matrix Font Set of Chinese Ideographs for Information Interchange'') includes both a [[corrigendum]] and an extension for GB 2312.<ref name="lunde2009" /> The corrigendum alters the following two characters:

{|class="wikitable"
|+ Alterations made to existing GB 2312 characters by GB 6345.1
!scope="col"|[[Kuten|Row-cell]]!![[EUC-CN|EUC]]!!GB 2312 (unamended)<ref name="ir58">{{cite iso-ir |number=58 |title=Coded Chinese Graphic Character Set for Information Interchange |sponsor=China Association for Standardization}}</ref>!!GB 6345.1!!Notes
|-
!03-71
|0xA3E7||[[File:Looptail g.svg|10px|class=skin-invert|text-bottom|{{not a typo|g}}]]||ɡ||<!-- This is the way around the GB 2312 ( https://www.itscj-ipsj.jp/ir/058.pdf ) versus GB 6345.1 charts ( https://imgur.com/a/sghgJcL ) show them; Lunde has them the wrong way around. --> {{efn|Corresponds to {{unichar|FF47}} in Unicode; however, the amended reference glyph can also correspond to {{unichar|0261}}. See below for how {{tt|U+0261}} is typically mapped to/from GB/T&nbsp;6345.1, versus how it is mapped to/from ISO-IR-165. GB 18030 swaps this one back to the original<ref name="ir58"/> looped glyph.<ref name="gb18030"/>}}
|-
!79-81
|0xEFF1||[[wikt:鍾|鍾]]||[[wikt:锺|锺]]||{{efn|The unamended reference glyph is a Traditional Chinese character corresponding to {{tt|U+937E}}. The character in question is usually replaced with [[wikt:钟|钟]] ({{tt|U+949F}}, also the simplification of [[wikt:鐘|鐘]]) in Simplified Chinese except in names of persons; the amended glyph is an alternate simplified form corresponding to {{tt|U+953A}}.}}
|}
{{notelist}}

Deployed implementations incorporating GB 2312, such as [[code page 936 (Microsoft Windows)|Windows code page 936]], generally follow these corrections in mapping 79-81 to U+953A.<ref name="ms936">{{cite web |url=https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT |title=cp936 to Unicode table |year=2000 |last=Steele |first=Shawn |publisher=[[Microsoft]], [[Unicode Consortium]]}}</ref>
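The relationship between row-cell numbers and EUC bytes used in the table above is simple arithmetic: each of the row and the cell is offset by 0xA0. The Python sketch below shows this for 79-81 and then decodes the result with Python's built-in <code>gbk</code> codec, on the assumption that this codec follows the Windows code page 936 table for this code point; <code>kuten_to_euc</code> is a hypothetical helper name.

```python
def kuten_to_euc(row: int, cell: int) -> bytes:
    """Return the two EUC-CN bytes for a row-cell position (1..94 each)."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    return bytes([row + 0xA0, cell + 0xA0])


euc = kuten_to_euc(79, 81)
print(euc.hex())  # eff1

# Decode with Python's "gbk" codec (assumed here to track code page 936).
decoded = euc.decode("gbk")
print(f"U+{ord(decoded):04X}")
```

If the codec follows the corrected mapping, the second line printed is <code>U+953A</code> rather than <code>U+937E</code>.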

The extension adds half-width [[ISO 646-CN]] characters in row 10 (in addition to the existing full-width characters in row 3) and extends the set of 26 non-ASCII [[pinyin]] characters in row 8 with six additional such characters. These GB 6345.1 extensions are also incorporated into [[GB/T 12345]], the [[Traditional Chinese]] counterpart to GB 2312, which additionally includes 29 vertical presentation forms in row 6.<ref name="lunde2009" /><ref name="cjkv-12345">{{cite book |title=CJKV Information Processing |last=Lunde |first=Ken |author-link=Ken Lunde |isbn=9781565922242 |year=1998 |chapter-url=https://resources.oreilly.com/examples/9781565922242/blob/master/AppF/gbt12345.pdf |publisher=[[O'Reilly Media]]|chapter=Appendix F: GB/T 12345}}</ref>

The later GB/T 6345.1-2010, published in 2011, officially adds half-width forms of the 32 pinyin characters from row 8 (including the six new additions) in row 11.<ref name=":0">{{Cite book |last=Standardization Administration of China (SAC) |url=http://archive.org/details/gb-6345.1-2010 |title=GB/T 6345.1-2010 信息技术 汉字编码字符集(基本集) 32点阵字型 第1部分宋体 |date=2011-01-10 |location=China |language=zh-CN}}</ref> This addition is not featured in GB 18030.<ref name="gb18030" />

The six additional pinyin characters from GB 6345.1 and the vertical presentation forms from GB 12345 — but not the half-width forms — are included in the [[classic Mac OS]] encoding for Simplified Chinese (a modification of [[EUC-CN]]),<ref name="macsimpchinese"/> and also as two-byte codes in [[GB 18030]].<ref name="gb18030">{{Cite book|url=https://archive.org/details/GB18030-2005|title=GB 18030-2005: Information Technology—Chinese coded character set|last=Standardization Administration of China (SAC)|date=2005-11-18}}</ref> The additional pinyin characters are as follows:<ref name="macsimpchinese">{{cite web|url=https://unicode.org/Public/MAPPINGS/VENDORS/APPLE/CHINSIMP.TXT|title=Map (external version) from Mac OS Chinese Simplified encoding to Unicode 3.0 and later.|publisher=[[Apple, Inc]]}}</ref>

{|class="wikitable"
|+ Extensions made by GB 6345.1 to GB 2312 row 8
!scope="col"|[[Kuten|Row-cell]]!![[EUC-CN|EUC]]!!Character<ref name="macsimpchinese"/><ref name="gb18030"/>!!Notes
|-
!08-27
|0xA8BB||{{unichar|0251}}||
|-
!08-28
|0xA8BC||{{unichar|1E3F}}||{{efn|Mapped to the [[Private Use Area]] {{tt|U+E7C7}} by [[Windows-936|Windows code page 936]]<ref name="ms936-with-pua">{{cite web |url=https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit936.txt |title=CODEPAGE 936: PRC GBK (XGB) - ANSI, OEM |author=Microsoft |author-link=Microsoft |publisher=[[Unicode Consortium]]}}</ref> and the first (2000) edition of [[GB 18030]]; this was amended by the 2005 edition.<ref name="gb18030"/>}}
|-
!08-29
|0xA8BD||{{unichar|0144}}||
|-
!08-30
|0xA8BE||{{unichar|0148}}||
|-
!08-31
|0xA8BF||{{unichar|01F9}}||{{efn|This composed character was added in Unicode 3.0. Prior to this, this character was mapped to its composition sequence (i.e. {{tt|U+006E U+0300}}) by Apple.<ref name="macsimpchinese"/> This change predates the stabilisation of [[Unicode normalisation]] forms, which was introduced in Unicode 3.1.<ref>{{cite web | url=https://www.unicode.org/policies/stability_policy.html | title=Unicode Character Encoding Stability Policies | publisher=Unicode Consortium | date=2017-06-23 }}</ref> It is mapped to {{tt|U+E7C8}} by [[Windows-936|Windows code page 936]].<ref name="ms936-with-pua"/>}}
|-
!08-32
|0xA8C0||{{unichar|0261||image=Looptail g.svg{{!}}class=skin-invert}}||{{efn|Matches the unamended reference glyph for 03-71 (see above) in being a looped g, in spite of being typically mapped to U+0261. Mappings used for ISO-IR-165 differ (see below). GB 18030 swaps 03-71 back to the looped g, and makes this one the open g.<ref name="gb18030"/>}}
|}
{{notelist}}

These extensions and modifications to GB 2312 were first introduced in GB 5007.1-85 in 1985.

==GB 8565.2==
GB 8565.2-88 (''Information Processing - Coded Character Sets for Text Communication - Part 2: Graphic Characters'') defines an extension for GB 2312, adding 705 characters in rows 13–15 and 90–94, of which 69 (all in row 15) are non-hanzi. It includes the GB 2312 corrections from GB 6345.1, but not its extensions.<ref name="lunde2009" />

The [[Unihan]] database references GB 8565.2 as the mainland Chinese source of several hanzi included in [[Unicode]]. Its Unihan source abbreviation is {{code|G8}}.<ref name="chung" />

==CCITT changes==
ISO-IR-165 incorporates the GB 2312 extensions from both GB 6345.1-86 and GB 8565.2-88.<ref name="lunde2009" /> Additionally, it adds 161 further characters (including 139 hanzi, identified as “general Chinese characters and variants”).<ref name="lunde2009" /><ref name="iso-ir">{{cite iso-ir |number=165 |title=Codes of the Chinese graphic character set for communication |date=1992-07-13 |sponsor=CCITT |sponsor-link=ITU-T}}</ref> These CCITT hanzi extensions have on occasion been mistaken for standard GB 8565.2 characters, including in previous revisions of the [[Unihan]] database.<ref name="chung" /> In total the set contains 8446 characters.

A number of patterned [[semigraphics|semigraphic]] characters are included in row 6.<ref name="iso-ir"/> This collides with the vertical presentation forms included in other extensions such as Mac OS Simplified Chinese<ref name="macsimpchinese"/> and GB 18030.<ref name="gb18030"/>

The GB 6345.1 corrections to GB 2312 are applied, but two Unicode mappings are reversed compared to other encodings which include GB 2312 with GB 6345.1 extensions. The table below compares the reference glyphs and Unicode mappings for the affected code points, with [[GB 18030]] included for comparison:

{|class="wikitable"
!scope="col"|[[Kuten|Row-cell]]!![[EUC-CN|EUC]]!!GB 2312 (unamended)<ref name="ir58"/>!!GB 6345.1<ref name=":0" />!!GB 6345.1 mapping<ref name="macsimpchinese"/>!!ISO-IR-165<ref name="iso-ir"/>!!ISO-IR-165 mapping<ref>{{cite web |url=https://raw.githubusercontent.com/unicode-org/icu/refs/tags/icu-milestone-3-9-3/source/data/mappings/iso-ir-165.ucm |title=Unicode to ISO-IR-165 table |first=Raghuram |last=Viswanadha |date=2000-08-30 |publisher=[[IBM]] |work=[[International Components for Unicode]] }} (Note: codes are listed in the source in 7-bit form: add 0x80 to each byte for EUC form, or subtract 0x20 for kuten form)</ref>
!GB 18030<ref name="gb18030" />
!GB 18030 mapping<ref name="gb18030" />
|-
!03-71
|0xA3E7||[[File:Looptail g.svg|10px|class=skin-invert|text-bottom|{{not a typo|g}}]]||ɡ||{{tt|U+FF47}}||ɡ||{{tt|U+0261}}
|[[File:Looptail g.svg|10px|class=skin-invert|text-bottom|{{not a typo|g}}]]
|{{tt|U+FF47}}
|-
!08-32
|0xA8C0||(absent)||[[File:Looptail g.svg|10px|class=skin-invert|text-bottom|{{not a typo|g}}]]||{{tt|U+0261}}||[[File:Looptail g.svg|10px|class=skin-invert|text-bottom|{{not a typo|g}}]]||{{tt|U+FF47}}
|ɡ
|{{tt|U+0261}}
|-
!79-81
|0xEFF1||鍾||锺||{{tt|U+953A}}||锺||{{tt|U+953A}}
|锺
|{{tt|U+953A}}
|}
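The following Python sketch restates the table above as data and applies the conversion described in the note on the ISO-IR-165 mapping source (add 0x80 to each byte of a 7-bit code for the EUC form, subtract 0x20 for the row-cell form). The two dictionaries are copied from the table and cover only these three code points; <code>split_7bit</code> is a hypothetical helper name.

```python
def split_7bit(code: int):
    """Convert a 7-bit two-byte code into (EUC code, (row, cell))."""
    hi, lo = code >> 8, code & 0xFF
    if not (0x21 <= hi <= 0x7E and 0x21 <= lo <= 0x7E):
        raise ValueError(f"not a 7-bit two-byte code: {code:#06x}")
    euc = ((hi + 0x80) << 8) | (lo + 0x80)  # add 0x80 to each byte
    return euc, (hi - 0x20, lo - 0x20)      # subtract 0x20 from each byte


# Unicode mappings copied from the table; only these three positions are listed.
GB_6345_1_MAPPING = {(3, 71): 0xFF47, (8, 32): 0x0261, (79, 81): 0x953A}
ISO_IR_165_MAPPING = {(3, 71): 0x0261, (8, 32): 0xFF47, (79, 81): 0x953A}

for code in (0x2367, 0x2840, 0x6F71):
    euc, kuten = split_7bit(code)
    usual = GB_6345_1_MAPPING[kuten]
    ccitt = ISO_IR_165_MAPPING[kuten]
    note = "same" if usual == ccitt else "differs"
    print(f"{kuten[0]:02d}-{kuten[1]:02d} 0x{euc:04X} "
          f"U+{usual:04X} U+{ccitt:04X} {note}")
```

Run as written, the loop reports 03-71 and 08-32 as differing and 79-81 as the same, matching the reversal described above.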

==References==
<references />

==External links==
*[https://itscj.ipsj.or.jp/ir/165.pdf ISO-IR-165: Code of the Chinese graphic character set for communication] (registered 1992, amended 1994)
*[https://raw.githubusercontent.com/unicode-org/icu/refs/tags/icu-milestone-3-9-3/source/data/mappings/iso-ir-165.ucm Unicode mappings for ISO-IR-165]

{{CJK computing|collapsed=false}}

{{DEFAULTSORT:Iso-Ir-165}}
[[Category:Chinese character encodings]]