Diff: Unicode - Waikato Linux Users Group

Differences between version 7 and previous revision of Unicode.


Newer page:	version 7	Last edited on Tuesday, April 20, 2004 4:55:25 am	by AristotlePagaltzis
Older page:	version 6	Last edited on Tuesday, April 20, 2004 3:26:59 am	by StuartYeates

@@ -1,9 +1,10 @@

An encoding for character sets (such as Latin, Chinese and Cyrillic) that attempts to include all the world's languages while making as few linguistic assumptions as possible. Some of the assumptions it does embed include:

-# The classification of all characters into exactly one of 29 classes (upper case letter, lower case letter, digit, etc). This has the interesting side effect of requiring a duplication of some letters in several classes: __M__ appears twice, once as an upper case letter and once as a digit (for roman numerals).

-# That glyphs are drawn from a countably infinite set of characters (or finite set if you disallow composition).

-# That written text has a single reading order.

-~~...~~ .

+

+* The classification of all characters into exactly one of 29 classes (upper case letter, lower case letter, digit, etc). This has the interesting side effect of requiring a duplication of some letters in several classes: __M__ appears twice, once as an upper case letter and once as a digit (for roman numerals).

+* That glyphs are drawn from a countably infinite set of characters (or finite set if you disallow composition).

+* That written text has a single reading order.

+* And probably others.

Contrast [ASCII].
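
The point about __M__ appearing in more than one class can be checked against the Unicode character database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and is not part of the page. Note that the database files the roman numeral under the "letter number" class (`Nl`) rather than under decimal digits, so "digit" above is best read loosely as "numeric".

```python
import unicodedata

def describe(ch):
    """Return (character name, general category) for one character.

    Hypothetical helper for illustration only.
    """
    return unicodedata.name(ch), unicodedata.category(ch)

latin_m = "M"        # U+004D, the ordinary capital letter
roman_m = "\u216f"   # U+216F, the roman numeral for one thousand

# Two code points that look alike but are classified differently.
print(describe(latin_m))   # name and category of the letter
print(describe(roman_m))   # name and category of the numeral

# Only the numeral carries a numeric value.
print(unicodedata.numeric(roman_m))

# Compatibility normalization folds the numeral back onto the letter,
# which shows that the duplication exists for classification purposes.
print(unicodedata.normalize("NFKC", roman_m) == latin_m)
```

Each character has exactly one general category, so the only way for the "same" letter to be both a letter and a number is to encode it twice, which is what the two code points above do.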