The first multilingual translator
The ATAMIRI MT system was developed by a small language engineering research group (IGRAL) under the direction of its creator, Iván Guzmán de Rojas. The system consists of a set of natural language processing programs and a lexical and grammatical database, designed as a support tool for a translation center: it assists the professional translator rather than replacing them. ATAMIRI can also be easily adapted to operate in a communications network as an interactive translator with immediate response.
In Washington, in March 1985, under the auspices of the OAS General Secretary, Iván Guzmán de Rojas presented the first system prototype capable of translating simultaneously from English into Spanish, French and German and, in the reverse direction, from Spanish into the other three languages. It is the first machine translation system with a genuinely multilingual capacity whose language base can easily be extended: it can “learn” a language simply by enriching its lexical and grammatical database, with a minimum of additional programming effort. Once a language has been “learned”, ATAMIRI can use it as either a source or a target language in the implemented multilingual environment.
This achievement in language engineering is highly unusual because its linguistic model is based on a formal representation of the ancient Andean language Aymara. Owing to its algorithmic matrix structure, a formal representation of Aymara is useful as a syntactic bridge, or interlingua, making it possible to translate simultaneously from one source language into several target languages.
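The pivot idea can be illustrated with a toy sketch. Everything in it is hypothetical (the concept ids, the tiny lexicon, the function names), and it handles words only: it ignores syntax entirely, which is where the Aymara-based representation is said to do its work. It shows only the general shape of interlingua translation, in which the source text is analyzed once and the result is rendered into each target language.

```python
# Toy illustration of pivot (interlingua) translation.
# All data and names are hypothetical; this is NOT ATAMIRI's representation.

# One shared lexicon: each language-neutral concept id maps to a word per language.
LEXICON = {
    "C_HOUSE": {"en": "house", "es": "casa", "fr": "maison", "de": "Haus"},
    "C_WATER": {"en": "water", "es": "agua", "fr": "eau", "de": "Wasser"},
}

def analyze(text, source):
    """Map source-language words to concept ids; unknown words pass through."""
    reverse = {
        entry[source].lower(): concept
        for concept, entry in LEXICON.items()
        if source in entry
    }
    return [reverse.get(word.lower(), word) for word in text.split()]

def generate(concepts, target):
    """Render concept ids as target-language words; unknown tokens pass through."""
    return " ".join(
        LEXICON[c][target] if c in LEXICON else c for c in concepts
    )

def translate_all(text, source, targets):
    """Analyze the source once, then generate every requested target."""
    concepts = analyze(text, source)
    return {target: generate(concepts, target) for target in targets}

result = translate_all("house water", "en", ["es", "fr", "de"])
print(result)
```

Because the analysis step is shared, adding another target language in this sketch means adding one more column to the lexicon rather than writing a new translator for each source language.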
For the Spanish-speaking community as well as for the ancient Aymara culture, ATAMIRI represents a singular language engineering result, since it emerges precisely from the integrated treatment of linguistic structures as different as Spanish and Aymara.
From 1985 until 1988, the system operated in a production environment at the Panama Canal Translation Office (translating from English into Spanish), as well as at the Wang Translation Center of Panama, translating technical manuals. In 1987 the system ran in a test operation translating technical texts from English into German, Dutch, French and Italian at the Wang Translation Centers in Europe. In 1988 the system was tested for translation from English into Swedish at the Wang Translation Center in Stockholm. When the Wang Translation Centers in Europe closed at the end of 1989, the version for the Wang VS machine was discontinued.
In 1990, the new PC version for Novell networks was installed as a test in the internationalization office of BULL products in Paris. Based on the results, the firm proposed buying the technology; however, no agreement was reached and the operation was cancelled.
At the same time, this version was also installed for testing at the consulting office of David Stanton Associates in Bournemouth, England. The limited coverage of technical terminology for specialized fields in ATAMIRI’s dictionary did not allow this consulting office to use the system commercially.
In 1993, invited by CompuServe, the author installed ATAMIRI at Data Technologies Office in Cambridge in order to run an evaluation comparing ATAMIRI with ten other translation systems. The evaluation was based on translating, from English into French, typical texts of the kind exchanged as messages on the CompuServe network. The results were positive, even though French was not yet a strong language in ATAMIRI and the tests were especially difficult. ATAMIRI proved to be the only genuinely multilingual system able to operate in a real-world working environment. It is a very fast system that allows real-time response and has considerable room for improvement.
The revival: ATAMIRI 2001
ATAMIRI remained dormant for almost a decade, even though considerable programming work was done to save the system from obsolescence caused by its operating platform. It now runs under either Windows or UNIX. Thanks to an initiative of Unión Latina, an EU organization, a pilot operation was run in La Paz during January and February 2001 to test ATAMIRI’s multilingual capabilities by implementing Romanian in the system. The author presented the resulting prototype in Paris, at UNESCO. This event relaunched ATAMIRI in the arena of multilingual MT technology.
With further sponsorship from Unión Latina, Paris, and NEOTEC, La Paz, the current version provides a messenger service with simultaneous translation in 30 language directions at www.atamiri.cc/qopuchawi
Meanwhile, IGRAL operates ATAMIRI on a very small scale, offering translation services by e-mail from La Paz. This work enriches the system’s lexical database and improves its grammatical capability.
ATAMIRI: at the service of the Aymara language
ATAMIRI’s origin goes back to the study Logical and Linguistic Problems of Social Communication with the Aymara People (Problemática Lógico-Lingüística de la Comunicación Social con el Pueblo Aymara), which the author of the system wrote in 1979. It was published in 1982 and 1985 under the auspices of the IDRC of Canada and distributed in three languages: Spanish, English and French. The study demonstrates the algorithmic character of Aymara syntax, which rests on an underlying trivalent logic.
Although the system uses a formal representation of Aymara as an interlingua only at levels deeper than the morphemic surface, texts can also be translated into Aymara, provided there are enough Aymara lexemes in the system’s lexicographic database. Moreover, if a morphosyntactic analyzer for Aymara were implemented, the language could also be used as a source language.
In 1984, with the initial support of UNESCO (Montevideo office), IGRAL started entering Aymara lexemes into the system, testing and evaluating its capability to translate into Aymara in order to produce educational texts with ATAMIRI. The purpose was to develop school texts for bilingual education and thus revitalize this ancient Andean language. Despite successful initial results, the project was discontinued for lack of funds and because IGRAL gave priority to translation from English into Spanish in order to fulfill contracts signed with the Panama Canal Commission and with Wang.
Ten years later (1994), thanks to a donation of hardware by Mr. Sean Morriss of Vancouver, Canada, IGRAL resumed enriching the system’s lexicographic database with Aymara lexemes.
This time, a scanner was used to input selected school- and university-level texts in English, with the aim of running a lexical analysis that would help identify missing Aymara terminology. Based on this analysis, the lexemes requested by the system were entered selectively, drawing on terms from old dictionaries, especially Bertonio’s dictionary (compiled between 1585 and 1608 in the Lake Titicaca region), and in many cases generating suitable neologisms from Aymara roots and suffixes. To do this, Iván Guzmán de Rojas designed language-engineering techniques that allow effective control of the lexical consistency and syntagmatic adaptation required to obtain a functional standardization of the language.
To accomplish the large-scale translation of educational texts into Aymara, at different levels and in various thematic fields, IGRAL requires financial resources to improve and accelerate the operation already under way, which is a promising service to the Aymara culture, whose language is ATAMIRI’s mother tongue.
ATAMIRI’s competitive advantage
Comparative tests have demonstrated that ATAMIRI is one of the fastest translation systems in the world. On an 800 MHz Pentium processor it translates 700 words per second, which makes it well suited to real-time translation of e-mail and web pages.
Due to its multilingual capability, ATAMIRI can distinguish between hypertext syntax and the normal syntax of the language in which the embedded text is written. In this way hypertext pages delivered by Web sites can be immediately translated on-line.
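The text does not describe how ATAMIRI separates markup from language content. As a rough illustration of the general idea only, the sketch below uses Python’s standard html.parser module as a stand-in: tags are re-emitted unchanged and only text nodes are handed to a translation function. The class name, the toy word table and the `toy_translate` function are assumptions made for this example.

```python
from html.parser import HTMLParser

class TextOnlyTranslator(HTMLParser):
    """Re-emits markup unchanged and passes only text nodes to `translate`."""

    def __init__(self, translate):
        super().__init__(convert_charrefs=False)
        self.translate = translate
        self.out = []
        self.raw = False  # True while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # original tag text
        if tag in ("script", "style"):
            self.raw = True

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")
        if tag in ("script", "style"):
            self.raw = False

    def handle_data(self, data):
        # Leave whitespace-only runs and script/style bodies untouched.
        if self.raw or not data.strip():
            self.out.append(data)
        else:
            self.out.append(self.translate(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

# Hypothetical stand-in for a real translation engine.
TOY_WORDS = {"house": "casa", "water": "agua"}

def toy_translate(text):
    return " ".join(TOY_WORDS.get(word, word) for word in text.split(" "))

parser = TextOnlyTranslator(toy_translate)
parser.feed('<p class="x">house <b>water</b></p>')
parser.close()
result = "".join(parser.out)
print(result)
```

A real system would also need to deal with sentences that span inline tags, since here each text node is translated in isolation.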
ATAMIRI is the only genuinely multilingual machine translation system available, and its design pioneered the use of an interlingua based on a matrix representation of natural language, which allows the simultaneous translation of a source language into several target languages. The current Atamiri 2001 version uses Spanish, French, Portuguese, Italian, Romanian or English as source and target languages. German, Dutch, Swedish and Aymara can also be used, at a lower raw translation quality that reflects the system’s current state of development. Other preliminary and promising tests have also been undertaken with Hungarian and Russian, using the Roman alphabet for syntactic tests.
Implementing a new target language with 30,000 lexemes requires eight man-months. Making that language work as a source language as well requires four additional man-months. The new language can then immediately be combined into translation pairs with all the languages previously introduced. These tasks can be carried out simultaneously for more than one language, since expansion to other languages does not take place in pairs. Instead of maintaining several dictionaries, only a single multilingual lexical database is managed, which makes it easy to enter the user’s own terminology.
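The figures above lend themselves to a quick back-of-the-envelope calculation. The sketch below takes the eight and four man-month figures from the text; the function names are made up for this example, and it assumes that every previously implemented language already works as both source and target.

```python
TARGET_MONTHS = 8  # from the text: 30,000-lexeme target-language implementation
SOURCE_MONTHS = 4  # from the text: extra work to make it a source language too

def effort_man_months(new_languages, as_source=True):
    """Total effort for adding `new_languages` languages."""
    per_language = TARGET_MONTHS + (SOURCE_MONTHS if as_source else 0)
    return new_languages * per_language

def new_directions(existing, added, as_source=True):
    """Translation directions gained when `added` languages join `existing`
    languages (assumed here to be both source and target already)."""
    if as_source:
        total = existing + added
        return total * (total - 1) - existing * (existing - 1)
    # Target-only: each existing language can now translate into each new one.
    return existing * added

# Adding one language to six existing ones:
full = (effort_man_months(1), new_directions(6, 1))
target_only = (effort_man_months(1, as_source=False),
               new_directions(6, 1, as_source=False))
print(full, target_only)
```

Under these assumptions, a seventh language added as both source and target costs 12 man-months and opens 12 new directions, while a target-only implementation costs 8 man-months and opens 6.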
ATAMIRI’s multilingual capabilities allow significant savings in both money and time when new languages are implemented. In a multilingual working environment with N languages, implementation costs for pairwise MT technologies are proportional to the N(N-1) translation directions, while for ATAMIRI’s interlingua technology costs are proportional to N.
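A small sketch makes the difference in growth concrete. It counts units of work only, under the simplifying assumption that one unit means one direction-specific translator in the pairwise case and one language module in the interlingua case; the text says costs are proportional to these counts and gives no constants, so the two kinds of unit need not cost the same. The function names are illustrative.

```python
def pairwise_units(n):
    """Direction-specific translators needed for every ordered language pair."""
    return n * (n - 1)

def interlingua_units(n):
    """Language modules needed when each language connects to the interlingua."""
    return n

rows = [(n, pairwise_units(n), interlingua_units(n)) for n in (2, 4, 6, 10)]
for n, pairwise, interlingua in rows:
    print(f"N={n:2d}  pairwise={pairwise:3d}  interlingua={interlingua:2d}  "
          f"ratio={pairwise // interlingua}")
```

For N = 6 this gives 30 directions, which matches the 30 language directions offered by the six fully supported languages of the Atamiri 2001 version, against 6 language modules.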
ATAMIRI is a complete, user-friendly translation system. The translated draft it generates is of good quality, and any anomalies that appear are easy to correct, allowing a significant increase in production. The translated text can be generated in Unicode as needed.
Since ATAMIRI’s origin in March 1979, various machine translation projects have been born and have died very quickly. Millions of dollars have been invested in such projects; the EUROTRA project, for instance, set out to develop a multilingual system, and the European Community spent 50 million dollars on it without achieving that goal. Low-cost single-user packages and a few multi-user systems developed for individual language pairs are also on sale. But multilingualism remains the challenge. The multilingual translation market on the web is enormous and still untapped.
ATAMIRI’s R&D is still advancing, though too slowly. Its technological potential is waiting to be exploited on a large scale. Achieving this requires capital investment under equitable conditions that recognize the value of this unique technology.
La Paz, Bolivia
Calle 4 N°90, Los Pinos
Teléfono: +591 (2) 2770060
Fax: +591 (2) 2799692
Cajón Postal: 14396
La Paz, Bolivia