a12n-collaboration Mailing List Archive: Re: [A12n-Collab] Re: [africa] 5 categories of African orthographies (Latin)
Tunde Adegbola wrote:

> In my own work in language technology, however, I do have problems with the lack of unique code points for high/low tone subdotted vowels. This presents ambiguity because they can be achieved in more than one way: by subdotting a tone-marked vowel, or by tone-marking a subdotted vowel. Both look exactly the same to a human reader but require extra lines of code for a computer to see both as the same. It starts getting distracting when you consider that this has to be catered for in both lower and upper cases.

As Andrew Cunningham pointed out, handling different character sequences for the same typeform is not very difficult, and it is for precisely this reason that normalisation exists and is well defined in the Unicode standard.

This is an issue that affects any situation in which more than one mark is applied to a base letter, not just some African orthographies. Since it is a given that any combining mark characters may be combined in any quantity with any base characters, encoding precomposed combinations is not a viable option. It would also simply shift the normalisation issue into a comparison of precomposed and decomposed strings instead of a comparison of variant decomposed strings.

In any case, this point must be clearly understood: it is not possible to add any more precomposed diacritic combinations with canonical decompositions to Unicode, due to stability agreements with other international standards that rely on this aspect of Unicode remaining stable.

Personally, I would be happy if Unicode did not include any precomposed characters. That was not possible from the beginning, due to the principle of providing one-to-one backwards compatibility with pre-existing encodings. Had it been possible, the software for seamlessly handling normalisation and display of combining mark text would have matured many years ago, and African and other non-European languages would enjoy much better support than they have.

John Hudson
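To make the normalisation point concrete, here is a minimal sketch in Python using the standard library's `unicodedata` module. It is an illustration only, not code from anyone in this thread. The helper name `same_text` and the choice of a subdotted, high-tone "e" are assumptions made for the example. It builds the same typeform from four different character sequences and shows that they compare equal once normalised.

```python
import unicodedata

# Combining marks used in this example.
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW (canonical combining class 220)
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT (canonical combining class 230)

# Four ways to encode a subdotted "e" with a high-tone (acute) mark.
variants = [
    "e" + DOT_BELOW + ACUTE,  # subdot first, then tone mark
    "e" + ACUTE + DOT_BELOW,  # tone mark first, then subdot
    "\u00e9" + DOT_BELOW,     # precomposed e-acute, then subdot
    "\u1eb9" + ACUTE,         # precomposed e-dot-below, then tone mark
]


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalisation."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw sequences differ, so a naive comparison fails.
raw_all_equal = all(v == variants[0] for v in variants)

# After normalisation, every variant is the same string.
normalised_all_equal = all(same_text(v, variants[0]) for v in variants)

# Upper case needs no extra handling: case-convert, then normalise as before.
upper_all_equal = all(
    same_text(v.upper(), variants[0].upper()) for v in variants
)

if __name__ == "__main__":
    print("raw sequences equal:       ", raw_all_equal)
    print("normalised sequences equal:", normalised_all_equal)
    print("upper-case sequences equal:", upper_all_equal)
```

Normalising once at input time, and again on any search or comparison key, is usually enough to make the ordering of marks invisible to the rest of a program. NFD would serve equally well for comparison; what matters is that both sides use the same form.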
Last Updated: Sat Dec 22 06:26:19 2007
a12n-collaboration is hosted on Kabissa - Space for Change in Africa