a12n-collaboration Mailing List Archive: RE: [A12n-Collab] Re: [africa] 5 categories of African orthographies (Latin)
- Subject: RE: [A12n-Collab] Re: [africa] 5 categories of African orthographies (Latin)
- From: Tunde Adegbola <taintransit@xxxxxxxxxxx>
- Date: Sat, 22 Dec 2007 08:35:14 +0100
- Importance: Normal
Thanks for your comments, Andrew. I will certainly keep in touch with you on this point.
Tunde
-----------------------------------------------------------------------------------------------
Tunde Adegbola (Ph.D.)
Executive Director
African Languages Technology Initiative
(Alt-I ... Inserting African issues into the agenda of the knowledge age)
President
Tiwa Systems Ltd.
11 Oluyole Way, New Bodija Ibadan, Nigeria.
+234 8034019398
------------------------------------------------------------------------------------------------
> Date: Fri, 21 Dec 2007 13:21:39 +1100
> From: andrewc@xxxxxxxxxxxxx
> To: a12n-collaboration@xxxxxxxxxxxx
> Subject: Re: [A12n-Collab] Re: [africa] 5 categories of African orthographies (Latin)
>
> Hi all
>
> Tunde Adegbola wrote:
> > Hi everybody,
> > Another interesting perspective on categorizing African orthographies
> > is offered by Conrad Taylor. See
> > http://www.ideography.co.uk/library/afrolingua.html
> > In my own work in language technology, however, I do have problems with
> > the lack of unique code points for high/low tone subdotted vowels.
> > This presents ambiguity because they can be achieved in more than one
> > way: by subdotting a tone-marked vowel, or by tone-marking a subdotted
> > vowel. Both look exactly the same to a human reader but require extra
> > lines of code for a computer to see both as the same. It starts getting
> > distracting when you consider that this has to be catered for in both
> > lower and upper cases.
>
> Not really. From the point of view of web applications and web services,
> you're talking about one extra line of code to handle Unicode
> normalization to your preferred normalization form. It would also be
> simple in various scripting languages to include either generic Unicode
> case folding, or language-specific case folding.
>
> Fairly straightforward.
>
> The issue isn't whether it is more complicated or relatively simple to
> process. The issue is whether developers actually do the right thing.
>
> >
> > The response to requests for these code points is that such English
> > digraphs as `sh` do not have code points. This totally misses the
> > point because `sh` and `hs` do not look alike in any way. If we accept
> > that Africans need to do more than read texts produced on a computer, if
> > we accept that Africans need to take full advantage of developments in
> > language technology, then UNICODE should concede these code points to
> > the relevant languages.
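The point about normalization and case folding discussed above can be illustrated with a short Python sketch. It uses only the standard `unicodedata` module and `str.casefold()`. The helper name `canonical_key` and the sample strings are assumptions made for illustration and do not come from the thread. The approach is also a simplification of the full canonical caseless matching defined by the Unicode Standard.

```python
import unicodedata


def canonical_key(text: str) -> str:
    """Build a comparison key (hypothetical helper, not from the thread).

    Normalize to NFC, apply generic Unicode case folding, then normalize
    again because case folding can in principle produce unnormalized text.
    """
    folded = unicodedata.normalize("NFC", text).casefold()
    return unicodedata.normalize("NFC", folded)


# Two encodings of the same visible letter: e with dot below and acute accent.
dot_then_tone = "\u1EB9\u0301"  # subdotted e followed by a combining acute
tone_then_dot = "\u00E9\u0323"  # acute e followed by a combining dot below
upper_variant = "\u00C9\u0323"  # uppercase acute E followed by a combining dot below

# The raw strings differ, but their comparison keys are identical.
print(dot_then_tone == tone_then_dot)
print(canonical_key(dot_then_tone) == canonical_key(tone_then_dot))
print(canonical_key(upper_variant) == canonical_key(dot_then_tone))
```

Comparing keys rather than raw strings means the order in which the tone mark and the subdot were typed, and the letter case, no longer affect matching. Language-specific case folding, which is also mentioned above, is outside the scope of this sketch.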
>
> I'd actually argue that it's not a Unicode issue. It's an application
> developer's issue. Unicode provides both normalization and case folding.
>
> The issue for so long has been:
>
> 1) very few font developers have created OpenType or Graphite fonts that
> support Yoruba and other languages.
> 2) few vendors have implemented support for rendering with these fonts.
> But thankfully that's changing.
>
> Although others on this list like John and Peter are in a better
> position to comment.
>
> These two points are even more of an issue with languages that need to
> stack combining diacritics. In that sense,
>
> 3) applications and web services need to support input/rendering of
> combining diacritics, cursor movement and selection behaviour, Unicode
> normalization and case folding, among other things.
>
> Most European languages don't need this type of support, so many
> developers either don't bother, or don't realise they need to bother.
>
> When a developer says their application supports Unicode I always ask
> which bits they support (and which bits they don't support).
>
> 4) use a "smart" input mechanism, instead of a simple input mechanism.
> There are solutions on both Linux and Windows for developing keyboard
> layouts that will produce NFC or NFD output. Obviously this doesn't work
> with all input mechanisms, i.e. the obvious ones.
>
> My approach tends to be to prefer NFC; few fonts perform well when
> thrown NFD data, especially in web browsers. I've observed some interesting
> anomalies where characters do not visually appear although they are actually
> present in the data. As soon as you convert the NFD data to NFC data you
> see the characters. When I have time I'll compile a list of problem
> character sequences, the fonts and applications these appear in, and
> whether the problem is reproducible on other computers.
>
> And it is possible to create a layout for Yoruba and other languages
> that produces NFC output.
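The NFD-to-NFC conversion described above can be sketched in Python with the standard `unicodedata` module. The helper names `to_nfc` and `code_points` and the sample word are assumptions chosen for illustration. The sketch only converts the data and makes no claim about how any particular font or browser renders it.

```python
import unicodedata
from typing import List


def to_nfc(text: str) -> str:
    """Convert text in any normalization form to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


def code_points(text: str) -> List[str]:
    """List the code points of a string in U+XXXX notation for inspection."""
    return [f"U+{ord(ch):04X}" for ch in text]


# A Yoruba word written in fully decomposed (NFD) form:
# o + combining dot below + combining grave, r, then the same vowel again.
nfd_word = "o\u0323\u0300ro\u0323\u0300"
nfc_word = to_nfc(nfd_word)

print(code_points(nfd_word))
print(code_points(nfc_word))

# The NFC form is shorter, but each vowel still takes two code points,
# because no precomposed "o with dot below and grave" exists in Unicode.
print(len(nfd_word), len(nfc_word))
```

Even after conversion to NFC, the tone mark remains a separate combining character on the subdotted vowel, so fonts and applications still need to handle combining diacritics.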
>
> I suppose I'm spoilt, I'm used to Vietnamese input systems that allow
> you to control output, i.e. NFC or NFD, NCRs and other formats.
>
> --
> Andrew Cunningham
> Research and Development Coordinator (Vicnet)
> State Library of Victoria
> 328 Swanston Street
> Melbourne VIC 3000
> Australia
>
> Email: andrewc+AEA-vicnet.net.au
> Alt. email: lang.support+AEA-gmail.com
>
> Ph: +613-8664-7430 Fax:+613-9639-2175
> Mob: 0421-450-816
>
> http://www.slv.vic.gov.au/ http://www.vicnet.net.au/
> http://www.openroad.net.au/ http://www.mylanguage.gov.au/
> http://home.vicnet.net.au/~andrewc/
Last Updated: Sat Dec 22 05:32:29 2007