Mailing List Hosted on Kabissa - Space for Change in Africa

a12n-forum Mailing List Archive: [A12n-forum] Re: Fwd: open-source machine translation toolbox

  • Subject: [A12n-forum] Re: Fwd: open-source machine translation toolbox
  • From: "Don Osborn" <dzo@xxxxxxxxxxxx>
  • Date: Sun, 20 Aug 2006 22:28:41 -0400

Last year I forwarded this item, and I want to return to it briefly because it came up again on another list (AfrophoneWikis). The project Mikel mentions, OpenTrad, actually works with two open source engines: Apertium, mentioned in the original mail, which is useful for similar languages; and Matxin, for languages that are not closely related. Links to all of these are on the updated MT in Africa page, at http://www.bisharat.net/Trans/#res .

As we know, on the one hand there are many groups of closely related languages, but on the other there are wide variations in structure, etc., among languages across the continent. Something like Apertium might be useful within groups of very similar languages.

It occurs to me that the situation in South Africa, and southern Africa more broadly, where there are widely spoken languages that fall into at least a couple of major groupings (thinking here of Zulu/Xhosa/Ndebele and Sotho/Tswana), might be ideal for something like Apertium. The idea is that where materials for, say, schools, public awareness campaigns, extension, etc. need to be produced in each official language, such translation software might reduce costs and speed things up. I don't know how the results of the OpenTrad project have been used in Spain, but perhaps the objective was similar?

In any event, I would hasten to add that I don't see such MT software putting human translators out of work. Since MT is not perfect, and because its use would hopefully increase the amount of material getting translated, there would still be lots of work, though with an emphasis on perfecting translations rather than doing them from scratch.

Eventually, such translation efforts will probably also make use of translation memory programs, which in effect reuse records of previous translations in new tasks. (This is already common in Europe.)

(On the other hand, some research suggests that the level of computer skills and cost of translation memory might be obstacles. See: http://www.localisation.ie/reader/localisation_in_south_africa.php . One might hope that an open source MT would be cost effective even in the face of need for training and assuring connectivity for translators involved in cleaning up MT products.)
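For anyone unfamiliar with the translation memory idea mentioned above, here is a rough sketch in Python of what "using records of previous translations in new tasks" can look like. This is only an illustration under my own assumptions, not how any particular TM product works: the stored segments, the placeholder translations, the function names and the 0.75 threshold are all made up for the example, and the matching uses the standard library's difflib rather than whatever a real tool would use.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segments paired with stored
# translations. The entries are placeholders, not real project data.
MEMORY = {
    "Wash your hands before eating.": "<stored translation 1>",
    "Boil water before drinking it.": "<stored translation 2>",
}


def similarity(a, b):
    """Return a 0..1 ratio of how alike two segments are, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def lookup(segment, memory, threshold=0.75):
    """Return (stored_source, stored_translation, score) for the closest
    stored segment, or None when nothing reaches the threshold
    (the threshold value is an arbitrary assumption)."""
    best = None
    for source, target in memory.items():
        score = similarity(segment, source)
        if score >= threshold and (best is None or score > best[2]):
            best = (source, target, score)
    return best


if __name__ == "__main__":
    # A new segment that is close to, but not the same as, a stored one.
    print(lookup("Wash your hands before you eat.", MEMORY))
    # A segment with no useful match in the memory.
    print(lookup("The market opens at noon.", MEMORY))
```

The point is simply that a near match hands the translator an earlier translation to revise, while a segment with no match still has to be translated from scratch (or passed to MT first).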

It may be that there are folks in South Africa already well ahead on all of this, and if so it would be great to hear more about it and what they are learning. Their results could inform similar efforts to come elsewhere in the continent.

Don

Don Osborn
Bisharat.net
PanAfrican Localisation project


----- Original Message -----
From: "Donald Z. Osborn" <dzo@xxxxxxxxxxxx>
To: <a12n-forum@xxxxxxxxxxxx>; <loc-dev@xxxxxxxxxxxxxxxxxxxxxxxxx>
Sent: Monday, September 19, 2005 2:23 AM
Subject: [A12n-forum] Fwd: open-source machine translation toolbox


This item may be of interest, although for the moment it is apparently useful
only for Romance languages and does not yet support Unicode.

Don Osborn
Bisharat.net



Dear Bisharat representative,
this letter is just to let you know that very recently my group has released an
open-source machine translation toolbox called Apertium through
http://apertium.sourceforge.net. Documentation is still scarce, but the main
use of this toolbox would be to produce machine translation systems for
languages that are not syntactically very different (we offer data for
Spanish-Catalan). You might be interested in posting a link to Apertium on your
page at http://www.bisharat.net/Trans/ , just in case some of the people working in
language technologies for African languages find it interesting. The kit still
uses only the Latin-1 or Latin-9 encodings, but we are planning to extend it so
that it can work with Unicode.

Sincerely,

Mikel Forcada

--
Mikel L. Forcada                    E-mail: mlf@xxxxxxxxxx
Departament de Llenguatges          Phone: +34-96-590-9776
i Sistemes Informàtics                also +34-96-590-3772.
UNIVERSITAT D'ALACANT               Fax:   +34-96-590-9326, -3464
E-03071 ALACANT, Spain.
.................................
URL: http://www.dlsi.ua.es/~mlf/
---------------------------------------------------------------
[!] Please avoid sending me Word, Excel or PowerPoint attachments.
Send them to me as plain text, HTML, CSV, Postscript or PDF instead.
...............................................................
[!] Per favor, no m'envieu fitxers Word, Excel o PowerPoint adjunts.
Envieu-me'ls com text pelat, HTML, CSV, Postscript o PDF.
..............................................................
[Info? http://www.fsf.org/philosophy/no-word-attachments.html]



----- End forwarded message -----


_______________________________________________
A12n-forum mailing list
A12n-forum@xxxxxxxxxxxx
http://lists.kabissa.org/mailman/listinfo/a12n-forum


Last Updated: Wed Mar 14 23:48:29 2007