Z-variant










In Unicode, two glyphs are said to be Z-variants (often spelled zVariants) if they share the same etymology but have slightly different appearances and different Unicode code points. For example, the Unicode characters U+8AAA 說 and U+8AAC 説 are Z-variants. The notion of Z-variance applies only to the "CJKV scripts" (Chinese, Japanese, Korean and Vietnamese) and is a subtopic of Han unification.

Differences on the Z-axis


The Unicode philosophy of code point allocation for CJK languages is organized along three "axes." The X-axis represents differences in semantics; for example, the Latin capital A (U+0041 A) and the Greek capital alpha (U+0391 Α) are represented by two distinct code points in Unicode, and might be termed "X-variants" (though this term is not common). The Y-axis represents significant differences in appearance though not in semantics; for example, the traditional Chinese character māo "cat" (U+8C93 貓) and the simplified Chinese character (U+732B 猫) are Y-variants.[1]
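The X-axis and Y-axis pairs named above can be inspected with Python's standard `unicodedata` module. This is a minimal sketch: the `PAIRS` list and the `describe` helper are illustrative names, not part of any Unicode API, and the axis labels are simply copied from the text.

```python
import unicodedata

# Pairs named in the text: (axis label, first character, second character).
PAIRS = [
    ("X", "\u0041", "\u0391"),  # Latin capital A vs Greek capital alpha
    ("Y", "\u8C93", "\u732B"),  # traditional vs simplified "cat"
]

def describe(ch: str) -> str:
    """Return the code point and the Unicode character name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for axis, first, second in PAIRS:
    # Each pair occupies two distinct code points.
    print(f"{axis}-axis: {describe(first)} / {describe(second)}")
```

In both cases the two characters have separate code points; the axis only describes why they were kept apart.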

The Z-axis represents minor typographical differences. For example, the Chinese characters (U+838A 莊) and (U+8358 荘) are Z-variants, as are (U+8AAA 說) and (U+8AAC 説). The glossary at Unicode.org defines "Z-variant" as "Two CJK unified ideographs with identical semantics and unifiable shapes,"[1] where "unifiable" is taken in the sense of Han unification.
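Because Z-variants are separate code points, a plain string comparison treats them as different characters, and standard Unicode normalization does not merge unified ideographs. The sketch below shows one way an application might fold them for comparison. The `Z_FOLD` table and both helper functions are hypothetical; the table contains only the two pairs mentioned in the text, and the choice of which member to fold toward is an arbitrary assumption.

```python
import unicodedata

# Illustrative table holding only the pairs named in the text.
# Which member counts as the "preferred" form is an assumption.
Z_FOLD = {
    "\u8358": "\u838A",  # U+8358 folds to U+838A
    "\u8AAC": "\u8AAA",  # U+8AAC folds to U+8AAA
}

def fold_z_variants(text: str) -> str:
    """Replace each listed Z-variant with its chosen counterpart."""
    return "".join(Z_FOLD.get(ch, ch) for ch in text)

def equal_ignoring_z_variants(a: str, b: str) -> bool:
    """Compare two strings after folding the listed Z-variants."""
    return fold_z_variants(a) == fold_z_variants(b)

# Normalization leaves both unified ideographs unchanged.
print(unicodedata.normalize("NFKC", "\u8AAC") == "\u8AAC")  # True
print("\u8AAA" == "\u8AAC")                                 # False
print(equal_ignoring_z_variants("\u8AAA", "\u8AAC"))        # True
```

A real application would more likely build such a table from the Unihan database's variant data than write it by hand.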

Thus, were Han unification perfectly successful, Z-variants would not exist. They exist in Unicode because it was deemed useful to be able to "round-trip" documents between Unicode and other CJK encodings such as Big5 and CCCII. For example, the character 莊 has CCCII encoding 21552D, while its Z-variant 荘 has CCCII encoding 2D552D. Therefore, these two variants were given distinct Unicode code points, so that converting a CCCII document to Unicode and back would be a lossless operation.
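The round-trip argument can be made concrete with a toy conversion table. This sketch uses only the two CCCII codes quoted above. The function names and the "fully unified" tables are hypothetical, and Python's standard library has no CCCII codec, so plain dictionaries stand in for one.

```python
# Mapping as Unicode actually assigns it: one code point per CCCII code.
CCCII_TO_UNICODE = {"21552D": "\u838A", "2D552D": "\u8358"}
UNICODE_TO_CCCII = {uni: code for code, uni in CCCII_TO_UNICODE.items()}

# Hypothetical alternative in which both CCCII codes share one code point.
UNIFIED_TO_UNICODE = {"21552D": "\u838A", "2D552D": "\u838A"}
UNIFIED_TO_CCCII = {"\u838A": "21552D"}

def to_unicode(codes, table):
    """Convert a list of CCCII hex codes to a Unicode string."""
    return "".join(table[code] for code in codes)

def to_cccii(text, table):
    """Convert a Unicode string back to a list of CCCII hex codes."""
    return [table[ch] for ch in text]

document = ["21552D", "2D552D"]

lossless = to_cccii(to_unicode(document, CCCII_TO_UNICODE), UNICODE_TO_CCCII)
lossy = to_cccii(to_unicode(document, UNIFIED_TO_UNICODE), UNIFIED_TO_CCCII)

print(lossless == document)  # True: distinct code points preserve the original
print(lossy == document)     # False: the second code collapsed into the first
```

With the unified tables, the distinction between the two source codes is lost on the way in and cannot be recovered on the way out.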

Confusion


There is some confusion over the exact definition of "Z-variant." For example, in an Internet Draft (of RFC 3743) dated 2002,[2] one finds "no" (U+4E0D 不) and (U+F967 不) described as "font variants," the term "Z-variant" being apparently reserved for interlanguage pairs such as the Mandarin Chinese "rabbit" (U+5154 兔) and the Japanese to "rabbit" (U+514E 兎). However, the Unicode Consortium's Unihan database[3][failed verification] treats both pairs as Z-variants.
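One observable difference between the two pairs is how Unicode normalization treats them. U+F967 is a CJK compatibility ideograph with a canonical decomposition to U+4E0D, whereas U+5154 and U+514E are both unified ideographs with no decomposition. The sketch below checks this with the standard `unicodedata` module. The helper name `nfc_equal` is hypothetical, and this is only a normalization test, not a statement about how either source classifies the pairs.

```python
import unicodedata

def nfc_equal(a: str, b: str) -> bool:
    """Return True if two strings are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Compatibility ideograph vs its unified counterpart: merged by NFC.
print(nfc_equal("\uF967", "\u4E0D"))  # True

# Two unified ideographs: NFC leaves them distinct.
print(nfc_equal("\u5154", "\u514E"))  # False
```

Text that passes through a normalizing system may therefore lose the first distinction while keeping the second.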


References

  1. "Glossary". www.unicode.org.
  2. Huang, K.; Ko, Y.; Konishi, K.; Qian, H. (April 2004). "Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean". tools.ietf.org.
  3. "Unihan Database Lookup". www.unicode.org.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Z-variant&oldid=1181195342"



    Last edited on 21 October 2023, at 13:45  






    Content is available under CC BY-SA 4.0 unless otherwise noted.


