Plane (Unicode)







In the Unicode standard, a plane is a contiguous group of 65,536 (2^16) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00 to 10 (hexadecimal) of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains the most commonly used characters. The higher planes, 1 through 16, are called "supplementary planes".[1] The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 15.1, five of the planes have assigned code points (characters), and seven are named.
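Because every plane holds exactly 65,536 code points, the plane number of a code point is its value with the low 16 bits dropped. The following Python sketch illustrates this; the names `plane_of` and `plane_bounds` are illustrative helpers assumed here, not part of any standard library.

```python
PLANE_SIZE = 0x10000        # 65,536 code points per plane
MAX_CODE_POINT = 0x10FFFF   # last code point of plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that contains the code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point >> 16  # equivalent to code_point // PLANE_SIZE


def plane_bounds(plane: int) -> tuple[int, int]:
    """Return the first and last code point of the given plane."""
    if not 0 <= plane <= 16:
        raise ValueError(f"not a Unicode plane: {plane}")
    first = plane * PLANE_SIZE
    return first, first + PLANE_SIZE - 1


print(plane_of(ord("A")))   # a BMP character, so plane 0
print(plane_of(0x1F600))    # a supplementary character in plane 1
print([f"U+{cp:06X}" for cp in plane_bounds(16)])
```

In the six-position form U+hhhhhh, the result of `plane_of` is the value of the first two hexadecimal digits.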

The limit of 17 planes is due to UTF-16, which can encode 2^20 code points (16 planes) as pairs of words, plus the BMP as a single word.[2] UTF-8 was designed with a much larger limit of 2^31 (2,147,483,648) code points (32,768 planes), and would still be able to encode 2^21 (2,097,152) code points (32 planes) even under the current limit of 4 bytes.[3]
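The capacities quoted above follow from the number of payload bits each encoding form carries. The sketch below redoes that arithmetic in Python. The bit counts are taken from the UTF-16 and UTF-8 bit distribution layouts (10 bits per surrogate, 6 bits per UTF-8 continuation byte); the variable names are illustrative.

```python
PLANE_SIZE = 2 ** 16

# UTF-16: each word of a surrogate pair carries 10 payload bits.
utf16_pair_code_points = 2 ** (10 + 10)
# 16 planes reachable through pairs, plus the BMP as single words.
utf16_planes = utf16_pair_code_points // PLANE_SIZE + 1

# UTF-8 as originally designed: up to 6 bytes, with 1 payload bit in the
# lead byte and 6 in each of the 5 continuation bytes (31 bits in total).
utf8_original_code_points = 2 ** (1 + 5 * 6)

# UTF-8 limited to 4 bytes: 3 payload bits in the lead byte and 6 in each
# of the 3 continuation bytes (21 bits in total).
utf8_four_byte_code_points = 2 ** (3 + 3 * 6)

print(utf16_planes)
print(utf8_original_code_points // PLANE_SIZE)
print(utf8_four_byte_code_points // PLANE_SIZE)
```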

The 17 planes can accommodate 1,114,112 code points. Of these, 2,048 are surrogates (used to make the pairs in UTF-16), 66 are non-characters, and 137,468 are reserved for private use, leaving 974,530 for public assignment.
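The figures in this paragraph can be reproduced from the code point ranges involved. The Python sketch below is an illustration only. The breakdown is not spelled out in the paragraph itself and is stated here as an assumption: the 66 noncharacters are taken to be U+FDD0 to U+FDEF plus the last two code points of every plane, and the private-use count is the BMP Private Use Area plus planes 15 and 16 without their two trailing noncharacters.

```python
TOTAL = 17 * 2 ** 16                      # all code points in 17 planes

SURROGATES = 0xDFFF - 0xD800 + 1          # U+D800 to U+DFFF

# 32 noncharacters in the BMP range U+FDD0 to U+FDEF, plus U+nFFFE and
# U+nFFFF at the end of each of the 17 planes.
NONCHARACTERS = (0xFDEF - 0xFDD0 + 1) + 2 * 17

# BMP Private Use Area (U+E000 to U+F8FF) plus planes 15 and 16, each
# minus its two trailing noncharacters.
PRIVATE_USE = (0xF8FF - 0xE000 + 1) + 2 * (2 ** 16 - 2)

PUBLIC = TOTAL - SURROGATES - NONCHARACTERS - PRIVATE_USE

print(TOTAL, SURROGATES, NONCHARACTERS, PRIVATE_USE, PUBLIC)
```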

Planes are further subdivided into Unicode blocks, which, unlike planes, do not have a fixed size. The 328 blocks defined in Unicode 15.1 cover 26% of the possible code point space, and range in size from a minimum of 16 code points (sixteen blocks) to a maximum of 65,536 code points (Supplementary Private Use Area-A and -B, which constitute the entirety of planes 15 and 16). For future usage, ranges of characters have been tentatively mapped out for most known current and ancient writing systems.[4]

Overview

Assigned characters

Version 15.0

Plane  Abbreviation  Allocated code points[note 1]  Assigned characters
0      BMP           65,520                         55,639
1      SMP           26,160                         23,276
2      SIP           61,536                         61,495
3      TIP           9,136                          9,131
14     SSP           368                            337
15     SPUA-A        65,536                         0 (by definition)
16     SPUA-B        65,536                         0 (by definition)
Totals               293,792                        149,878

Note 1: Code points which have been allocated to a Unicode block.
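As a quick consistency check, the totals row and the 26% coverage figure quoted earlier can be recomputed from the individual rows. The Python sketch below copies the numbers from the table; the dictionary layout and variable names are illustrative.

```python
# (allocated code points, assigned characters) per plane, from the table.
table = {
    "BMP": (65_520, 55_639),
    "SMP": (26_160, 23_276),
    "SIP": (61_536, 61_495),
    "TIP": (9_136, 9_131),
    "SSP": (368, 337),
    "SPUA-A": (65_536, 0),
    "SPUA-B": (65_536, 0),
}

allocated_total = sum(allocated for allocated, _ in table.values())
assigned_total = sum(assigned for _, assigned in table.values())

# Share of the whole 17-plane code space that is allocated to blocks.
coverage = allocated_total / (17 * 2 ** 16)

print(allocated_total, assigned_total, f"{coverage:.1%}")
```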

Basic Multilingual Plane

A map of the Basic Multilingual Plane. Each numbered box represents 256 code points.

The first plane, plane 0, the Basic Multilingual Plane (BMP), contains characters for almost all modern languages, and a large number of symbols. A primary objective for the BMP is to support the unification of prior character sets as well as characters for writing. Most of the assigned code points in the BMP are used to encode Chinese, Japanese, and Korean (CJK) characters.

The High Surrogate (U+D800–U+DBFF) and Low Surrogate (U+DC00–U+DFFF) codes are reserved for encoding non-BMP characters in UTF-16 by using a pair of 16-bit codes: one High Surrogate and one Low Surrogate. A single surrogate code point will never be assigned a character.
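The pairing of one High Surrogate with one Low Surrogate can be sketched in Python as follows. The function names `to_surrogates` and `from_surrogates` are illustrative assumptions; the arithmetic follows the UTF-16 scheme in which the code point, reduced by 0x10000, is split into two 10-bit halves.

```python
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a non-BMP code point into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points use surrogate pairs")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into a code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F600)
print(f"{high:04X} {low:04X}")  # D83D DE00
```

The result can be compared against Python's own encoder: `"\U0001F600".encode("utf-16-be")` yields the same two 16-bit words.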

65,520 of the 65,536 code points in this plane have been allocated to a Unicode block, leaving just 16 code points in a single unallocated range (2FE0..2FEF).

As of Unicode 15.1, the BMP comprises the following 164 blocks:

  • Latin-1 Supplement (Upper half of ISO/IEC 8859-1) (0080–00FF)
  • Latin Extended-A (0100–017F)
  • Latin Extended-B (0180–024F)
  • IPA Extensions (0250–02AF)
  • Spacing Modifier Letters (02B0–02FF)
  • Combining Diacritical Marks (0300–036F)
  • Greek and Coptic (0370–03FF)
  • Cyrillic (0400–04FF)
  • Cyrillic Supplement (0500–052F)
  • Armenian (0530–058F)
  • Semitic abjads and other right-to-left scripts:
  • Brahmic scripts:
  • Other alphabetic or syllabic left-to-right scripts:
  • Philippine scripts:
  • Khmer (1780–17FF)
  • Mongolian (1800–18AF)
  • Unified Canadian Aboriginal Syllabics Extended (18B0–18FF)
  • Brahmic scripts:
  • Tai scripts:
  • Combining Diacritical Marks Extended (1AB0–1AFF)
  • Indonesian scripts:
  • Lepcha (1C00–1C4F)
  • Ol Chiki (1C50–1C7F)
  • Other left-to-right alphabetic or syllabic supplements:
  • Sundanese Supplement (1CC0–1CCF)
  • Vedic Extensions (1CD0–1CFF)
  • Other left-to-right alphabetic supplements:
  • Symbols:
  • Other left-to-right alphabetic scripts or supplements:
  • African scripts:
  • Other left-to-right alphabetic supplements:
  • CJK scripts and symbols:
  • Yi Syllables (A000–A48F)
  • Yi Radicals (A490–A4CF)
  • Lisu (A4D0–A4FF)
  • African scripts:
    • Vai (A500–A63F)
  • Other left-to-right alphabetic supplements:
  • African scripts:
  • Other left-to-right alphabetic supplements:
  • Brahmic scripts:
  • Hangul Jamo Extended-A (A960–A97F)
  • Brahmic scripts:
  • Ethiopic Extended-A (AB00–AB2F)
  • Latin Extended-E (AB30–AB6F)
  • Cherokee Supplement (AB70–ABBF)
  • Meetei Mayek (ABC0–ABFF)
  • Hangul Syllables (AC00–D7AF)
  • Hangul Jamo Extended-B (D7B0–D7FF)
  • Surrogates:
  • Private Use Area (E000–F8FF)
  • CJK Compatibility Ideographs (F900–FAFF)
  • Alphabetic Presentation Forms (FB00–FB4F)
  • Arabic Presentation Forms-A (FB50–FDFF)
  • Variation Selectors (FE00–FE0F)
  • Vertical Forms (FE10–FE1F)
  • Combining Half Marks (FE20–FE2F)
  • CJK Compatibility Forms (FE30–FE4F)
  • Small Form Variants (FE50–FE6F)
  • Arabic Presentation Forms-B (FE70–FEFF)
  • Halfwidth and Fullwidth Forms (FF00–FFEF)
  • Specials (FFF0–FFFF)

Supplementary Multilingual Plane

A map of the Supplementary Multilingual Plane. Each numbered box represents 256 code points.

Plane 1, the Supplementary Multilingual Plane (SMP), contains historic scripts (except CJK ideographic), and symbols and notation used within certain fields. Scripts include Linear B, Egyptian hieroglyphs, and cuneiform scripts. It also includes English reform orthographies like Shavian and Deseret, and some modern scripts like Osage, Warang Citi, Adlam, Wancho and Toto. Symbols and notations include historic and modern musical notation; mathematical alphanumerics; shorthands; emoji and other pictographic sets; and game symbols for playing cards, mahjong, and dominoes.

As of Unicode 15.1, the SMP comprises the following 151 blocks:

  • Linear B Ideograms (10080–100FF)
  • Aegean Numbers (10100–1013F)
  • Ancient Greek Numbers (10140–1018F)
  • Ancient Symbols (10190–101CF)
  • Phaistos Disc (101D0–101FF)
  • Lycian (10280–1029F)
  • Carian (102A0–102DF)
  • Coptic Epact Numbers (102E0–102FF)
  • Old Italic (10300–1032F)
  • Gothic (10330–1034F)
  • Old Permic (10350–1037F)
  • Ugaritic (10380–1039F)
  • Old Persian (103A0–103DF)
  • Deseret (10400–1044F)
  • Shavian (10450–1047F)
  • Osmanya (10480–104AF)
  • Osage (104B0–104FF)
  • Elbasan (10500–1052F)
  • Caucasian Albanian (10530–1056F)
  • Vithkuqi (10570–105BF)
  • Linear A (10600–1077F)
  • Latin Extended-F (10780–107BF)
  • Right-to-left scripts:
  • Brahmic scripts:
  • Unified Canadian Aboriginal Syllabics Extended-A (11AB0–11ABF)
  • Brahmic scripts:
  • Lisu Supplement (11FB0–11FBF)
  • Tamil Supplement (11FC0–11FFF)
  • Cuneiform scripts:
  • Cypro-Minoan (12F90–12FFF)
  • Hieroglyphic scripts:
  • Bamum Supplement (16800–16A3F)
  • Mro (16A40–16A6F)
  • Tangsa (16A70–16ACF)
  • Bassa Vah (16AD0–16AFF)
  • Pahawh Hmong (16B00–16B8F)
  • Medefaidrin (16E40–16E9F)
  • Miao (16F00–16F9F)
  • East Asian scripts:
  • Notational writing systems:
  • Symbols and numerals:
  • Notational writing systems:
  • Other left-to-right scripts:
  • Nyiakeng Puachue Hmong (1E100–1E14F)
  • Toto (1E290–1E2BF)
  • Wancho (1E2C0–1E2FF)
  • Nag Mundari (1E4D0–1E4FF)
  • African scripts:
  • Symbols and numerals:

Supplementary Ideographic Plane

A map of the Supplementary Ideographic Plane. Each numbered box represents 256 code points.

Plane 2, the Supplementary Ideographic Plane (SIP), is used for CJK Ideographs, mostly CJK Unified Ideographs, that were not included in earlier character encoding standards.

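As of Unicode 15.1, the SIP comprises the following seven blocks:

  • CJK Unified Ideographs Extension B (20000–2A6DF)
  • CJK Unified Ideographs Extension C (2A700–2B73F)
  • CJK Unified Ideographs Extension D (2B740–2B81F)
  • CJK Unified Ideographs Extension E (2B820–2CEAF)
  • CJK Unified Ideographs Extension F (2CEB0–2EBEF)
  • CJK Unified Ideographs Extension I (2EBF0–2EE5F)
  • CJK Compatibility Ideographs Supplement (2F800–2FA1F)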

Tertiary Ideographic Plane

A map of the Tertiary Ideographic Plane. Each numbered box represents 256 code points.

Plane 3 is the Tertiary Ideographic Plane (TIP). CJK Unified Ideographs Extension G was added to the TIP in Unicode 13.0, released in March 2020.[5] The plane is also tentatively allocated for Oracle Bone script and Small Seal Script.[6]

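As of Unicode 15.1, the TIP comprises the following two blocks:

  • CJK Unified Ideographs Extension G (30000–3134F)
  • CJK Unified Ideographs Extension H (31350–323AF)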

Unassigned planes

No characters have yet been assigned, or proposed for assignment, to planes 4 through 13 (planes 4 to D in hexadecimal).

Supplementary Special-purpose Plane

A map of the Supplementary Special-purpose Plane. Each numbered box represents 256 code points.

Plane 14 (E in hexadecimal) is designated as the Supplementary Special-purpose Plane (SSP). As of Unicode 15.1, it comprises the following two blocks:

  • Tags (E0000–E007F)
  • Variation Selectors Supplement (E0100–E01EF)

Private Use Area Planes

The two planes 15 and 16 (planes F and 10 in hexadecimal) each contain a "Private Use Area". They contain blocks named Supplementary Private Use Area-A (PUA-A) and -B (PUA-B). The Private Use Areas are available for use by parties outside ISO and Unicode (private character encoding).
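The plane layout described in the preceding sections can be summarised in a small lookup. The Python sketch below is illustrative: `PLANE_NAMES`, `plane_name`, and `is_private_use` are assumed helper names, and the private-use test assumes that the last two code points of planes 15 and 16 are noncharacters rather than private-use code points.

```python
PLANE_NAMES = {
    0: "Basic Multilingual Plane",
    1: "Supplementary Multilingual Plane",
    2: "Supplementary Ideographic Plane",
    3: "Tertiary Ideographic Plane",
    14: "Supplementary Special-purpose Plane",
    15: "Supplementary Private Use Area-A",
    16: "Supplementary Private Use Area-B",
}


def plane_name(code_point: int) -> str:
    """Name the plane of a code point; planes 4 to 13 are unassigned."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return PLANE_NAMES.get(code_point >> 16, "unassigned plane")


def is_private_use(code_point: int) -> bool:
    """True for the BMP Private Use Area and for planes 15 and 16."""
    if 0xE000 <= code_point <= 0xF8FF:
        return True
    # Assumption: U+nFFFE and U+nFFFF are noncharacters, so they are excluded.
    return (code_point >> 16) in (15, 16) and (code_point & 0xFFFF) < 0xFFFE


print(plane_name(0x30000))
print(plane_name(0x50000))
print(is_private_use(0xF0000))
```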

References

  1. "Glossary". www.unicode.org. Retrieved 2021-09-27.
  2. See Table 3.5 "UTF-16 Bit Distribution" in the Unicode Standard https://www.unicode.org/versions/Unicode6.0.0/UnicodeStandard-6.0.pdf
  3. See Table 3.6 "UTF-8 Bit Distribution" in the Unicode Standard https://www.unicode.org/versions/Unicode6.0.0/UnicodeStandard-6.0.pdf
  4. "Roadmaps to Unicode". www.unicode.org. Retrieved 2021-09-27.
  5. "Announcing The Unicode Standard, Version 13.0".
  6. "Proposed New Characters: The Pipeline". www.unicode.org.




    This page was last edited on 8 June 2024, at 11:17 (UTC).
