There are several points in this thread that I'd like to
clarify. I'd also like to say right off that I am glad that there is discussion
again about the issue of combining diacritics. It's an issue that seems to
float out there among some experts even as the technology improves. So - to
take a neutral point of view (NPOV; learned that on Wikipedia) - it needs to be
dealt with one way or another so we can move forward. More below in the fourth
point.
The points in this thread now include:
1) The original question: whether the system of
categorization - 5 categories of orthographies according to how well Unicode etc.
supports them - is workable. Am I correct in concluding that no one has a problem with this?
The reason I ask is that I want to use this in some writing and would rather get
criticism now than later.
Thanks, Tunde, for mentioning Conrad Taylor's site. For
those not familiar with it, it's actually a book and a nice piece of work, but
a bit dated. I'd seen it some years ago but had forgotten he had a list of
levels of difficulty on page 6. His Levels 1 & 2 are the same as the
Categories 1 & 2 I suggested. His 4 is basically my 3, and his 3 my 4. Partly
because we are now using Unicode, I see the extended characters themselves as
less of an issue than they were when he wrote his book - hence they are now
only a step above #2 in difficulty (these include the Latin Extended Additional
range that has the subdot letters). Also, my Category 4 includes Category 3
characters that need combining diacritics, which is a bit different from Taylor's Level
3. I add a category (my #5) he doesn't have because he wasn't writing from the
viewpoint of Unicode support. Also, while I'm very aware of the importance of
non-Latin scripts, I did not include them in my schema (or else his #5 would be
my #6 and so on).
FYI, there was mention of Taylor's book on this list about 5
years ago. See:
http://lists.kabissa.org/lists/archives/public/a12n-collaboration/msg00120.html
http://lists.kabissa.org/lists/archives/public/a12n-collaboration/msg00121.html
2) Existing information on Latin-based orthographies of
African languages.
2a) Hartell's 1993 book. Yes, this is one we refer to often.
I used it for a series of charts on http://www.bisharat.net/A12N/#countrytables
(Lee Pearce also did some work there), and more recently Christian Chanard set
up a database using Hartell's data at http://sumale.vjf.cnrs.fr/phono/
The problem is that there is no update to this, and that expanding
it would be a challenge given that some orthographies are not yet set. Some
are apparently even changing.
2b) Documents like the one by Jim Agenbroad that Charles
referred to, and the oft-discussed research John did (time to bring that
up again), would be great to get online for wider access.
3) With regard to missing characters in Unicode (which
define the Category 5 orthographies), more work could certainly be done to
identify these, but this may not be as big a priority now - given that
many outstanding needs have been addressed in recent years - as getting full
support for Category 4 orthographies.
4) With regard to support for Category 4 orthographies (if
we agree on that terminology), that is, orthographies that need combining
diacritics and hence support for those, the questions of how good that support
is, and indeed how good the concept is, have been around for a while. The
suggestion that more precomposed characters be added to Unicode has been
discussed on this list - see for instance the thread beginning with
http://lists.kabissa.org/lists/archives/public/a12n-collaboration/msg00182.html
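For anyone who wants to see the precomposed vs. combining distinction concretely, here is a minimal sketch using Python's standard unicodedata module. It assumes Unicode's NFC normalization as the test of whether a precomposed character exists; the helper names compose and leftover_marks are my own, chosen for illustration. The example letter is e with dot below and acute accent, as used in Yoruba.

```python
import unicodedata


def compose(text):
    """Return the NFC (canonically composed) form of text."""
    return unicodedata.normalize("NFC", text)


def leftover_marks(text):
    """Names of combining marks still present after NFC composition.

    A non-empty result means the text cannot be written with
    precomposed characters alone (a "Category 4" situation).
    """
    return [
        unicodedata.name(ch)
        for ch in compose(text)
        if unicodedata.category(ch).startswith("M")
    ]


# e + COMBINING ACUTE ACCENT: a precomposed code point exists (U+00E9).
e_acute = "e\u0301"
# e + COMBINING DOT BELOW + COMBINING ACUTE ACCENT:
# only e with dot below (U+1EB9) is precomposed, so the acute stays separate.
e_dot_acute = "e\u0323\u0301"

for label, s in [("e + acute", e_acute), ("e + dot below + acute", e_dot_acute)]:
    code_points = [f"U+{ord(ch):04X}" for ch in compose(s)]
    print(label, code_points, leftover_marks(s))
```

The first case composes to a single code point; the second keeps a separate combining acute, which is the situation where rendering and input support for combining diacritics really matters.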
The fact that the question keeps getting raised (I hear it
from others occasionally) is sign enough that there is a need to either clarify
the support issues and how those are being addressed, or clarify how the system
doesn't work. Continued doubts about dynamic composition need to be
addressed either with better explanations (and real support) or alternatively - again from
an NPOV - with a real proposal that makes the justification and proposes
specific precomposed characters. That way we can move forward one way or
another rather than recycling debates.
That said (now I'm no longer NPOV), the system apparently
works but the support is not yet there for African languages. Maybe the
problem, and also the key to the concerns raised by Tunde and Samuel, is
that there is still work to do to support input and display of Yoruba diacritics
(and other "Category 4 orthographies") - so naturally it doesn't seem
to work.
In any event, I think this discussion is very timely and
would like to encourage people with any experience or expertise with Category
4 orthographies (i.e., ones that require use of combining diacritics or even
stacking of diacritics) to let us know what they think.
Don Osborn
Bisharat.net
PanAfriL10n.org
* For mention of Taylor's site see: