The Basic Latin block was included in its present form from version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire.[5] Its block name in Unicode 1.0 was ASCII.[6]
The character U+005C (\) may be displayed as a yen sign (¥) or won sign (₩) by Japanese or Korean fonts that treat Unicode text (especially UTF-8) as a legacy character set in which the backslash was replaced by these signs.[7]
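As a small illustration of the note above, the following Python sketch shows that the backslash, yen sign, and won sign are three separate code points in Unicode, so the substitution is a matter of font rendering rather than of the encoded text. The helper name describe is hypothetical and used only for this example.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point, Unicode name, and UTF-8 bytes of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} UTF-8={ch.encode('utf-8').hex()}"

# Backslash, yen sign, and won sign are distinct characters in Unicode.
for ch in ("\u005C", "\u00A5", "\u20A9"):
    print(describe(ch))
```

Only U+005C lies in the Basic Latin block and is encoded as a single byte in UTF-8; the other two signs are encoded elsewhere.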
Subheadings
The C0 Controls and Basic Latin block contains six subheadings.[8]
C0 controls
The C0 controls, referred to as C0 ASCII control codes in version 1.0, are inherited from ASCII and other 7-bit and 8-bit encoding schemes. The alias names for the C0 controls are taken from the ISO/IEC 6429:1992 standard.[8]
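A minimal Python sketch of how the C0 controls look to a program, assuming Python 3.3 or later, where unicodedata.lookup resolves name aliases. The variable names are illustrative only.

```python
import unicodedata

# The 32 C0 controls occupy U+0000 through U+001F.
c0_controls = [chr(cp) for cp in range(0x00, 0x20)]

# Every C0 control has the general category Cc (control).
all_are_controls = all(unicodedata.category(ch) == "Cc" for ch in c0_controls)

# Control characters have no formal character name, so name() falls back
# to the supplied default value.
formal_name = unicodedata.name("\u000A", None)

# They can still be found through an alias name such as LINE FEED.
line_feed = unicodedata.lookup("LINE FEED")

print(all_are_controls, formal_name, repr(line_feed))
```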
ASCII punctuation and symbols
This subheading refers to standard punctuation characters, simple mathematical operators, and symbols like the dollar sign, percent, ampersand, underscore, and pipe.[8]
ASCII digits
The ASCII digits subheading contains the standard European digit characters 0–9.[8]
Uppercase Latin alphabet
The Uppercase Latin alphabet subheading contains the standard 26-letter unaccented Latin alphabet in the majuscule.[8]
Lowercase Latin alphabet
The Lowercase Latin alphabet subheading contains the standard 26-letter unaccented Latin alphabet in the minuscule.[8]
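The subheadings described above can be sketched as a lookup from code point to subheading name. This is an illustrative Python sketch, not an official mapping: the function name subheading is hypothetical, and the label used for U+007F DELETE, which is not covered by the five descriptions above, is an assumption based on the code chart.

```python
def subheading(cp: int) -> str:
    """Return the chart subheading for a Basic Latin code point (0x00-0x7F)."""
    if not 0x00 <= cp <= 0x7F:
        raise ValueError("code point is outside the Basic Latin block")
    if cp <= 0x1F:
        return "C0 controls"
    if cp == 0x7F:
        # Assumption: U+007F DELETE sits under its own heading in the chart.
        return "Control character"
    if 0x30 <= cp <= 0x39:
        return "ASCII digits"
    if 0x41 <= cp <= 0x5A:
        return "Uppercase Latin alphabet"
    if 0x61 <= cp <= 0x7A:
        return "Lowercase Latin alphabet"
    # Everything else, including the space, is punctuation or a symbol.
    return "ASCII punctuation and symbols"

for ch in "\t$7Qz":
    print(f"U+{ord(ch):04X} -> {subheading(ord(ch))}")
```

The punctuation and symbols are not contiguous: they fill the gaps around the digits and the two alphabets, which is why that case is handled last as the fallback.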
Several characters in the block have standardized variation sequences: they render as a specific variant when followed by a variation selector.
A variant is defined for a zero with a short diagonal stroke: U+0030 DIGIT ZERO, U+FE00 VS1 (0︀).[9][10]
Twelve characters (#, *, and the digits) can be followed by U+FE0E VS15 or U+FE0F VS16 to create emoji variants.[11][12][13][14]
They are keycap base characters, for example #️⃣ (U+0023 NUMBER SIGN U+FE0F VS16 U+20E3 COMBINING ENCLOSING KEYCAP). The VS15 version is "text presentation" while the VS16 version is "emoji-style".[10]
Emoji variation sequences
U+ | 0023 | 002A | 0030 | 0031 | 0032 | 0033 | 0034 | 0035 | 0036 | 0037 | 0038 | 0039
base | # | * | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
base+VS15+keycap | #︎⃣ | *︎⃣ | 0︎⃣ | 1︎⃣ | 2︎⃣ | 3︎⃣ | 4︎⃣ | 5︎⃣ | 6︎⃣ | 7︎⃣ | 8︎⃣ | 9︎⃣
base+VS16+keycap | #️⃣ | *️⃣ | 0️⃣ | 1️⃣ | 2️⃣ | 3️⃣ | 4️⃣ | 5️⃣ | 6️⃣ | 7️⃣ | 8️⃣ | 9️⃣
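The keycap sequences listed above can be assembled programmatically. The following Python sketch builds them from the twelve base characters; the function and constant names are hypothetical, and whether a sequence is drawn as a single keycap glyph depends on the font and renderer.

```python
VS15 = "\uFE0E"    # VARIATION SELECTOR-15: text presentation
VS16 = "\uFE0F"    # VARIATION SELECTOR-16: emoji presentation
KEYCAP = "\u20E3"  # COMBINING ENCLOSING KEYCAP

# The twelve keycap base characters: #, *, and the digits 0-9.
KEYCAP_BASES = frozenset("#*0123456789")

def keycap(base: str, emoji: bool = True) -> str:
    """Build base + variation selector + keycap for one base character."""
    if base not in KEYCAP_BASES:
        raise ValueError(f"{base!r} is not a keycap base character")
    return base + (VS16 if emoji else VS15) + KEYCAP

def code_points(text: str) -> list[str]:
    """List the code points of a string as four-digit hexadecimal values."""
    return [f"{ord(ch):04X}" for ch in text]

print(code_points(keycap("#")))               # emoji-style sequence
print(code_points(keycap("#", emoji=False)))  # text-presentation sequence
```

Each sequence is three code points long, so string length and indexing operate on the parts rather than on the visible keycap.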
History
The following Unicode-related documents record the purpose and process of defining specific characters in the Basic Latin block:
Freytag, Asmus; Karlsson, Kent (2011-02-02), Proposal to correct mistakes and inconsistencies in certain property assignments for super and subscripted letters
Moore, Lisa (2011-08-16), "Consensus 128-C3", UTC #128 / L2 #225 Minutes, Accept Ken Whistler's recommendations in L2/11-281 on name aliases for control characters with the addition of the abbreviations BEL and NUL.
Moore, Lisa (2015-05-12), "Consensus 143-C5", UTC #143 Minutes, Add the 12 keycap sequences in emoji-data.txt as provisional named sequences in Unicode 8.0.
Scherer, Markus; et al. (2022-01-19), "F.2 F4: U+0019 in ISO vs. NameAliases.txt vs. chart/NamesList.txt", UTC #170 properties feedback & recommendations
Constable, Peter (2022-04-21), "Consensus 170-C24", UTC #170 Minutes, For U+0019, add a Name alias "EM" of type abbreviation, for Unicode version 15.0.
Proposed code points and character names may differ from the final code points and names.