Unicode
A character set standard covering scripts such as Latin, Chinese and Cyrillic, which attempts to include all the world's written languages while making as few linguistic assumptions as possible. Some of the assumptions it does embed include:

* The classification of every character into exactly one of about 30 general categories (upper case letter, lower case letter, decimal digit, etc). This has the interesting side effect of requiring a duplication of some letters in several categories: __M__ appears twice, once as an upper case letter (U+004D) and once as a letter-like number (U+216F, for roman numerals).
* That glyphs are drawn from a countably infinite set of characters (or a finite set, if you disallow composition with combining characters).
* That written text has a single reading order.
* And probably others.

Contrast [ASCII].

! See also:
* [A Quick Primer On Unicode and Software Internationalization Under Linux and UNIX | http://eyegene.ophthy.med.umich.edu/unicode/]
* [Alan Wood's Unicode Resources | http://www.alanwood.net/unicode/]
* UnicodeNotes
* unicode(7)

----
CategoryStandards
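
The duplication of __M__ mentioned above can be checked against the character database that ships with Python. This is only a minimal sketch: the names LATIN_M, ROMAN_M and describe are made up for illustration, and the result depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Illustrative names, not part of any standard API.
LATIN_M = "\u004D"   # the ordinary capital letter M
ROMAN_M = "\u216F"   # the roman numeral for one thousand


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in (LATIN_M, ROMAN_M):
    print(describe(ch))

# Compatibility normalization (NFKC) folds the numeral back onto the letter,
# which is one way software treats the two code points as "the same" M.
folded = unicodedata.normalize("NFKC", ROMAN_M)
print(folded == LATIN_M)
```

The two characters look alike but carry different general categories, which is why code that tests "is this a letter?" can give different answers for them unless the text is normalized first.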