Dotted and dotless I in computing
The dotted İ i and the dotless I ı are distinct letters in the alphabets of several Turkic languages, unlike in English and most other languages written in the Latin script. This distinction has caused a number of issues in computing.
Difficulties
Unicode does not encode the uppercase form of dotless ı or the lowercase form of dotted İ as separate characters. Instead, it unifies them with the uppercase and lowercase forms of the Latin letter I, respectively. John Cowan proposed disunifying plain Ii by encoding a separate capital letter dotless I and small letter i with dot above, to make casing more consistent.[1] The Unicode Technical Committee had previously rejected a similar proposal[2] because it would break mappings from character sets that contain dotted and dotless I and would corrupt existing data in these languages.[citation needed]
Most Unicode software uppercases ı to I but, unless specifically configured for Turkish, lowercases I to i, so uppercasing and then lowercasing ı yields i rather than the original letter. Likewise, most Unicode software uppercases i to I rather than İ, changing the letter in the process. In the Microsoft Windows SDK, beginning with Windows Vista, several relevant functions accept a NORM_LINGUISTIC_CASING flag to indicate that, for Turkish and Azerbaijani locales, I should map to ı.
In the LaTeX typesetting language, the dotless ı can be written with the backslash-i command, \i.
Dotted İ and dotless ı are problematic in the Turkish locales of several software packages, including Oracle DBMS, PHP, Java,[3][4] and UnixWare 7, where implicit capitalization of the names of keywords, variables, and tables has effects not foreseen by the application developers. The C and US English locales do not have these problems. The .NET Framework has special provisions to handle the "Turkish i".[5]
As of 2008, many cellphones available in Turkey lacked proper localization, which led to ı being replaced by i in SMS messages, sometimes severely distorting the meaning of a text. In one instance, such a miscommunication played a role in the deaths of Emine and Ramazan Çalçoban in 2008.[6][7] A common substitution is to use the digit 1 for dotless ı. This is also common in Azerbaijan (see also translit), but the meaning of words is generally understood.
In some Ectaco translators, the letter İ was also treated as I (e.g. TRAFIK ⟨traffic⟩, when it is normally TRAFİK).
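The case-mapping round trip and the implicit-capitalization problem described above can be reproduced with a short Python sketch. Python's built-in str.upper and str.lower apply the default, locale-independent Unicode mappings. The helpers turkish_upper and turkish_lower are hypothetical names introduced here for illustration only; they are not part of any library, and they are a minimal approximation of Turkish and Azerbaijani casing rather than a complete implementation.

```python
import unicodedata


def turkish_upper(text: str) -> str:
    """Uppercase text using Turkish/Azerbaijani rules (hypothetical helper)."""
    # Map the two locale-specific letters first, then let the default
    # Unicode mapping handle every other character.
    text = unicodedata.normalize("NFC", text)
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish/Azerbaijani rules (hypothetical helper)."""
    # NFC first, so that "I" followed by U+0307 is composed into "İ".
    text = unicodedata.normalize("NFC", text)
    return text.replace("İ", "i").replace("I", "ı").lower()


# Default mappings: the round trip loses the dotless ı.
print("ı".upper())          # I
print("ı".upper().lower())  # i
# By default, dotted İ lowercases to "i" plus U+0307 COMBINING DOT ABOVE.
print([hex(ord(c)) for c in "İ".lower()])  # ['0x69', '0x307']

# Turkish-aware mappings keep the two letters apart.
print(turkish_lower(turkish_upper("ısırgan")))  # ısırgan

# Implicit capitalization under Turkish rules no longer matches an
# ASCII keyword such as INSERT.
print(turkish_upper("insert"))              # İNSERT
print(turkish_upper("insert") == "INSERT")  # False
```

The last two lines suggest why software that uppercases identifiers or keywords according to the user's locale can misbehave under a Turkish locale: the result contains İ, which does not compare equal to the ASCII I that the developers expected.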
References