From: a12n-collaboration-bounces@xxxxxxxxxxxx
[mailto:a12n-collaboration-bounces@xxxxxxxxxxxx] On Behalf Of Peter Constable
Sent: Friday, December 21, 2007 11:39 AM
To: A12n tech support
Subject: RE: [A12n-Collab] Re: 5 categories of African orthographies (Latin)
[PC] I see many problems with Taylor’s five levels. I don’t see a
significant difference between levels 1 and 2. Maybe learning to enter é is a
slight challenge for an English speaker, but anyone living in France or Spain
(e.g.) has probably known how to do that from day one.
[DO] The differences are becoming less and less important, I agree, but they
are still there. Input of accents is one issue, not only on an English QWERTY
keyboard but probably also on keyboards designed for a language that uses
different accents.
Display is another issue that keeps cropping up for category 2 languages,
even though it shouldn't. Hardly a week goes by when I don't encounter some
problem or other with simple accented characters in French. Technically this
shouldn't happen, one may argue, and maybe on fr locales it doesn't. But with
different people using different systems and encodings and so on, it
definitely does.
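A minimal sketch of how this kind of display problem typically arises, assuming text written as UTF-8 by one system and read as Latin-1 by another. The sample word and the pair of encodings are illustrative assumptions, not taken from any particular report:

```python
# Illustrative only: a French word written as UTF-8 but read as Latin-1.
text = "été"

# In UTF-8 each "é" is stored as two bytes (0xC3 0xA9).
utf8_bytes = text.encode("utf-8")

# A reader that assumes Latin-1 treats every byte as its own character.
misread = utf8_bytes.decode("latin-1")
print(misread)

# Decoding with the encoding that was actually used recovers the word.
recovered = utf8_bytes.decode("utf-8")
print(recovered)
```

Nothing is lost in the bytes themselves; the garbling comes only from the two sides disagreeing about the encoding.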
Correct me if I'm wrong, but are there not also programming contexts where
only ASCII (= category 1 but not 2) can be used?
[PC] Level 3 is too simplistic and out of date:
LEVEL 3
— The next step up in difficulty is those languages
which use ‘ordinary’ letterforms but in some non-standard combinations – such
as a dot under a vowel, or an acute accent over a consonant. These languages
cannot be set with standard applications and fonts.
[PC] There are certainly “standard” applications and fonts that can be used
to set these. Maybe not *all* “standard” applications and fonts – depending
on what “standard” means. Then there’s the confusion around the suggestion
that complexity, or lack of software and font support, increases with higher
levels. In particular, consider level 4 in relation to level 3:
[DO] I agree this is simplistic and out of date. In my schema this is
category 4, the orthographies that use combining diacritics.
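A rough sketch of why combining diacritics form their own category, using Python's standard unicodedata module. The two letter-plus-accent pairs are my own illustrative choices: some combinations have a precomposed code point that normalization can fold them into, while others exist only as a base letter followed by a combining mark, so display depends on the font and renderer positioning the mark:

```python
import unicodedata

# "e" + COMBINING DOT BELOW (U+0323): a precomposed form exists (U+1EB9).
e_dot = "e\u0323"
e_dot_nfc = unicodedata.normalize("NFC", e_dot)

# Open e (U+025B) + COMBINING ACUTE ACCENT (U+0301): no precomposed form,
# so NFC leaves the two-code-point sequence unchanged.
open_e_acute = "\u025b\u0301"
open_e_acute_nfc = unicodedata.normalize("NFC", open_e_acute)

print(len(e_dot_nfc), len(open_e_acute_nfc))
```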
LEVEL 4
— These are the languages which clearly require
a number of special letterforms that do not exist in the standard fonts
oriented towards Western European language typesetting, for example the
‘hooked consonants’ of Hausa. Here, a special font is definitely required,
but no other modification of the system is needed.
[PC] Word 97 could handle the level-4 scenario even though it couldn’t handle
the level-3 scenario. I suspect there are several products from the past 10
years that are like that. (InDesign is another that comes to mind.)
[DO] This is my category 3 because, as you indicate, it was simpler to handle
than the combining diacritics. You are of course right that some older
applications and systems were able to display extended Latin characters. (I
think that quite a few people who were relying on 8-bit "special fonts" for
these characters earlier this decade in fact had Unicode fonts with the same
characters on their computers but did not know it.)
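To illustrate the point about extended Latin characters, here is a small sketch showing that the Hausa hooked consonants are each a single Unicode code point with no slot in an 8-bit Western European encoding. The helper name describe and the choice of Latin-1 as the 8-bit encoding are assumptions made for the example:

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, fits in Latin-1?) for one character."""
    try:
        ch.encode("latin-1")
        fits_latin1 = True
    except UnicodeEncodeError:
        fits_latin1 = False
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), fits_latin1)

# Lowercase hooked consonants used in Hausa orthography.
for ch in "\u0253\u0257\u0199":
    print(describe(ch))
```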
[PC] Then, level 5 is a bit of a muddle:
LEVEL 5
— The most problematic languages have a non-latin
character set which is so large in its required repertoire that a single
standard font cannot contain them all – or perhaps they have unusual
behaviours, such as requiring different forms of letter depending on where
they occur in a word. This level of problem requires more than just a special
font: some other modifications will be needed, such as special software or
operating system extensions.
[PC] The size of the repertoire and the need to support “unusual behaviours”
are two very different issues. Arabic versions of Word 95 could handle the
latter but not the former; English Word 97 could handle the former but (IIRC)
not the latter. And obviously the claim that “a single standard font cannot
contain [all characters for large-repertoire languages]” is completely wrong:
ever since there have been TrueType fonts, they have been able to support
tens of thousands of characters in a standards-conformant way. The real issue
behind what he’s saying is one of encoding: 8-bit encodings, indeed, cannot
handle a script like Ethiopic (except by some kind of escaping mechanism or
indirect representation scheme).
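The encoding point can be checked with a quick sketch: an 8-bit encoding has at most 256 slots, and the basic Ethiopic block alone (U+1200 to U+137F) holds more named characters than that. The helper assigned_in_range is a hypothetical name for this example, and the exact count depends on the Unicode version bundled with the Python in use, so only the comparison with 256 is shown:

```python
import unicodedata

def assigned_in_range(start, end):
    """Count code points in [start, end] that have a Unicode character name."""
    return sum(1 for cp in range(start, end + 1)
               if unicodedata.name(chr(cp), ""))

# Basic Ethiopic block only; the supplement and extended blocks add more.
ethiopic_count = assigned_in_range(0x1200, 0x137F)
print(ethiopic_count > 256)

# A direct 8-bit encoding has no slot for an Ethiopic syllable.
try:
    "\u1200".encode("latin-1")
    fits = True
except UnicodeEncodeError:
    fits = False
print(fits)
```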
[DO] You are correct of course. I'm reminded in reading this again of a
distinguished scholar who wrote 2-3 years ago that a computer couldn't handle
the 3 alphabets that could be used to transcribe Berber languages. Public
education about what Unicode does, even among language experts, has never
been very strong. Works such as Taylor's, which don't take this into account
but remain online without update, may keep confusing people.
I'll see if I can contact Conrad Taylor about whether something can be added
to at least give readers an updated context.
Don