HZ (character encoding)






 


HZ encoding
MIME / IANA: HZ-GB-2312
Language(s): Simplified Chinese, English, Russian
Created by: Fung Fung Lee
Standard: RFC 1843
Classification: CJK encoding, ASCII armor, variable-width encoding, stateful encoding
Transforms / Encodes: GB 2312
Preceded by: zW
Succeeded by: Quoted-printable, UTF-7, 8BITMIME

    The HZ character encoding[1] is an encoding of GB 2312 that was formerly in common use in e-mail and USENET postings. It was designed in 1989 by Fung Fung Lee (Chinese: 李楓峰) of Stanford University, and subsequently codified in 1995 as RFC 1843.[2]

    HZ, short for Hanzi (simplified Chinese: 汉字; traditional Chinese: 漢字; lit. 'Chinese Characters'), was invented to facilitate the use of Chinese characters in e-mail, which at that time only allowed 7-bit characters. Therefore, instead of standard ISO 2022 escape sequences (as in the case of ISO-2022-JP) or 8-bit characters (as in the case of EUC), HZ uses only printable 7-bit characters to represent Chinese characters.

    It was also popular on USENET, which in the late 1980s and early 1990s generally did not allow transmission of 8-bit characters or escape characters.

    History


    HZ superseded the earlier "zW" encoding, which marked entire lines as being GB 2312 text by beginning them with the characters zW.[3]

    Structure and use


    In the HZ encoding system, the character sequences "~{" and "~}" act as escape sequences; anything between them is interpreted as two-byte GB 2312 characters (the most significant bit of each byte is ignored). Outside the escape sequences, characters are assumed to be ASCII.
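
    The mode switching described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference decoder: the function name hz_decode_basic is hypothetical, it handles only the "~{" and "~}" sequences, and it relies on Python's gb2312 codec (which expects the EUC-CN byte form) to map byte pairs to characters.

```python
def hz_decode_basic(data: bytes) -> str:
    """Decode HZ text, handling only the "~{" and "~}" escape sequences.

    Hypothetical helper for illustration; other escapes are not handled.
    """
    out = []
    gb_mode = False  # False: ASCII mode, True: inside "~{" ... "~}"
    i = 0
    while i < len(data):
        pair = data[i:i + 2]
        if not gb_mode and pair == b"~{":
            gb_mode = True
            i += 2
        elif gb_mode and pair == b"~}":
            gb_mode = False
            i += 2
        elif gb_mode:
            if len(pair) < 2:
                raise ValueError("truncated two-byte character")
            # Ignore the most significant bit by forcing it on in both bytes.
            # The result is the EUC-CN form, which Python's gb2312 codec reads.
            out.append(bytes(b | 0x80 for b in pair).decode("gb2312"))
            i += 2
        else:
            # Outside the escape sequences the input is assumed to be ASCII.
            out.append(data[i:i + 1].decode("ascii"))
            i += 1
    return "".join(out)


print(hz_decode_basic(b"Hello ~{R;~}!"))
```

    Because the high bit is forced on before lookup, the same function accepts both 7-bit and 8-bit byte pairs between the delimiters.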

    An example will help illustrate the relationship between GB 2312, EUC-CN, and the HZ code:

    Various forms of the GB 2312 code (0xD2BB) for the character『一』(one)

    Form                      | Code      | With escape sequences         | Remarks
    Kuten / Qūwèi / 区位 form | 50 27     | n/a                           | Zone/ward/row (ku/qū/区) 50, point (ten/wèi/位) 27
    ISO 2022 form             | 0x52 0x3B | 0x0E 0x52 0x3B 0x0F           | 50 + 32 = 82 = 0x52
    EUC-CN form               | 0xD2 0xBB | 0xD2 0xBB                     | 0x52 ∨ 0x80 = 0xD2
    HZ form (standard)        | 0x52 0x3B | 0x7E 0x7B 0x52 0x3B 0x7E 0x7D | Appears as ~{R;~} without HZ decoder
    HZ form (alternate)       | 0xD2 0xBB | 0x7E 0x7B 0xD2 0xBB 0x7E 0x7D | EUC form acceptable to at least some decoders
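
    The arithmetic behind these forms can be checked with a short Python sketch. The function name kuten_to_forms and the dictionary keys are made up for this illustration; the only library behaviour assumed is that Python's gb2312 codec decodes the EUC-CN byte form.

```python
def kuten_to_forms(row: int, cell: int) -> dict:
    """Derive the ISO 2022, EUC-CN and HZ byte forms from a row/cell pair.

    Hypothetical helper for illustration only.
    """
    # ISO 2022 form: add 32 (0x20) to the row and to the cell number.
    iso = bytes([row + 0x20, cell + 0x20])
    # EUC-CN form: set the most significant bit of each byte.
    euc = bytes(b | 0x80 for b in iso)
    # Standard HZ form: the 7-bit bytes wrapped in "~{" and "~}".
    hz = b"~{" + iso + b"~}"
    return {
        "iso2022": iso,
        "euc_cn": euc,
        "hz": hz,
        "char": euc.decode("gb2312"),
    }


forms = kuten_to_forms(50, 27)
print(forms["iso2022"].hex(), forms["euc_cn"].hex(), forms["hz"], forms["char"])
```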

    HZ was originally designed to be used purely as a 7-bit code. However, when situations allow, the escape sequences "~{" and "~}" sometimes surround characters represented in EUC-CN; this alternative use allows Chinese to be readable either with the help of HZ decoder software, or with a system that understands EUC-CN.

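    Additionally, the specification defines that:

      • "~~" is interpreted as a single literal "~";
      • "~" followed by a newline is a line-continuation marker, which the decoder consumes without producing any output.

    However, not all HZ decoders follow these two rules.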
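
    RFC 1843 treats "~~" as a literal tilde and "~" followed by a newline as a line continuation, so an encoder has to double any tilde that appears in ASCII text. The following Python sketch shows one way to do this. The function name hz_encode is hypothetical, the sketch does not wrap long lines, and it assumes every non-ASCII character in the input exists in GB 2312 (otherwise Python raises UnicodeEncodeError).

```python
def hz_encode(text: str) -> bytes:
    """Encode text as 7-bit HZ, doubling literal tildes.

    Hypothetical helper for illustration; no line wrapping is performed.
    """
    out = bytearray()
    gb_mode = False  # True while between "~{" and "~}"
    for ch in text:
        if ord(ch) < 0x80:
            if gb_mode:
                out += b"~}"  # leave GB mode before any ASCII character
                gb_mode = False
            # A literal "~" must be written as "~~" in ASCII mode.
            out += b"~~" if ch == "~" else ch.encode("ascii")
        else:
            euc = ch.encode("gb2312")  # EUC-CN form, two bytes
            if not gb_mode:
                out += b"~{"
                gb_mode = True
            # Clear the high bit of each byte to obtain printable 7-bit bytes.
            out += bytes(b & 0x7F for b in euc)
    if gb_mode:
        out += b"~}"  # always finish in ASCII mode
    return bytes(out)


encoded = hz_encode("Hi 一! a~b")
print(encoded)
# Python ships an "hz" codec that can be used to cross-check the result.
print(encoded.decode("hz"))
```

    Closing GB mode before every ASCII character, including newlines, keeps each line self-contained, at the cost of extra escape sequences in text that alternates frequently between the two character sets.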

    HZ encoders and decoders


    The first HZ encoder and decoder were written in 1989 by the code's inventor for the Unix operating system.[4]

    The hztty program, also written for Unix, was among the first HZ decoders and one of the most popular. It deviates from the specification in that it displays the escape sequences (i.e., "~{" and "~}") and does not treat "~~" or "~" followed by a newline specially. This was probably done so that software which assumes one character occupies one screen position (on a text screen) could function correctly without modification.

    Support on Microsoft Windows came later, and a number of third-party "Chinese systems" support HZ. These systems may provide an option to hide the escape sequences.

    Disadvantages


    Because HZ relies on escape sequences, and because its escape delimiters are printable ASCII characters, it is fairly easy to construct attack byte sequences that round-trip from HZ to Unicode and back. Use of HZ encoding is thus treated as suspicious by malware protection suites.[5]

    References

  • RFC 1843
  • Lunde, Ken (1995-12-18). "CJK.INF Version 1.9".
  • "HZ package 2.0 — HZ spec, reference encoder and decoder source code".
  • "935453 - Gather telemetry about HZ and other encodings we might try to remove". Archived from the original on 2017-05-19. Retrieved 2018-06-18.




    This page was last edited on 1 March 2024, at 05:31 (UTC).
