In this issue…
The IDOL Project
Localization into and out of Arabic
Until now there has been no localization product providing direct support for translation to and from the Arabic language. David Mowatt of UMIST (UK) explains how the IDOL (Information Retrieval System-based DOcument Localisation) project will fill the gap. The project partners intend to present the finished product in May 1999.

In today's increasingly multilingual open-market environment, the importance of document localization has never been greater. Many international organizations, small and medium enterprises (SMEs) and large corporations are attempting to address this issue. Even so, much important information of potentially widespread interest exists throughout Europe but remains unused by European users, or inaccessible to users in developing countries (DCs), because of linguistic barriers. Localization products already exist to help speakers of the more common European languages, but none yet provide direct support for those translating into or out of Arabic. The IDOL project aims to redress this imbalance.

The objectives

The central aim of the project is to develop software for a localization workstation supporting two of the main European languages (English and French) as well as Arabic. This software will include not only sophisticated document management designed for the translation environment, but also a translation memory engine that operates (initially) in the three languages of the project, with a view to expansion to other languages later. What makes the project unique is the partners' high level of linguistic expertise in Arabic, a language that continues to receive relatively little attention in the research community as a whole.
Combining the Arabic linguistic modules with the European language ones, and then applying modern information retrieval techniques and translation memory methods, will ensure high-quality translation memory performance while maintaining the speed and efficiency offered by other products. The translation checking module, the document management system and the data conferencing abilities will further increase the functionality of the software to create a unique localization product.

More global objectives include:
The consortium

The project started in February 1998 and is due to finish in May 1999, when the IDOL partners will present their software either at a conference or in a workshop to be held in Tunis. The IDOL (IRS-based DOcument Localisation) project is funded by the European Union and by the Swiss Federal Office for Education and Science. It comprises partners from five countries. Headed by Rafik Belhadj Kacem of EPOS (France), the project also includes the Swiss research institution ISSCO, the Centre for Computational Linguistics at UMIST (United Kingdom), UNIVERSAL (Tunisia) and IME (Lebanon). The total budget for the project is 783k ECU, of which 420k ECU is contributed by the European Community.

EPOS ("Etudes et Programmation en Optimisation et Software") originally specialized in optimization (operational research, linear programming, virtual programming languages) but has rapidly expanded into the field of information management, investigating documentation research, advanced software development and information retrieval systems.

CCL ("Centre for Computational Linguistics") at UMIST, Manchester, has undertaken a significant amount of research in machine translation in its broadest interpretation, with projects covering various approaches to the general problem of text translation, terminology and dictionary development, and foreign language text generation.

IME ("Integro Middle East") has a long history in arabization, localization and multilingual domains, in addition to its experience in terminal emulation and communication protocols for a variety of computers and platforms, and in user interfaces, ergonomics, re-engineering, systems integration and consultancy.
ISSCO ("Istituto Dalle Molle per gli Studi Semantici e Cognitivi") is the oldest and best-known European center for computational linguistics (computational semantics, parsing and machine translation), focusing recently on multilingual linguistic descriptions and grammar formalisms, corpus-based computational linguistics and NLP evaluation methodologies.

UNIVERSAL is a Tunisian SME software company whose expertise lies in management applications software, human-machine interfaces and applications for the automatic processing of Arabic, English and French. It develops software for the multilingual indexing of multilingual documents, multilingual database querying and the maintenance of Arabic and French dictionaries.

Deliverables

The project’s deliverables can be broadly divided into five areas, according to the partner responsible for each. A basic prototype of the system is expected towards the end of 1998, with the final system ready by May 1999.

EPOS is in charge of overall project management and is also primarily responsible for general integration and demonstration as well as the exploitation of the project’s work. It will also develop an information retrieval system (IRS) with support for Arabic, which will form part of a document management system for translators.

UMIST’s central role is to develop the translation memory (TM) engine of the program, which will function in English, French and Arabic. It will develop the basic non-Arabic linguistic resources and bilingual lexicons used by the TM and document alignment modules. UMIST is also in charge of publicizing the IDOL project.

IME is responsible for the document indexing module of the program as well as for the data conferencing capabilities. These capabilities will be merged into the final program to allow translators in different cities, or even different countries, to discuss translations simultaneously.
ISSCO is responsible for creating TRACER, a translation checker that will enable translators to improve the accuracy and consistency of translations. It will also develop the document alignment module, which will prepare previously translated documents so that they can be entered directly into the memory of the TM module.

UNIVERSAL is developing the Arabic linguistic resources needed for both the translation memory and document alignment modules. It will also be working closely with ISSCO and UMIST to assist with the implementation of their modules for the Arabic language.

Further information

For more information, visit the IDOL Web site at the address below, where periodic progress updates will be posted. Alternatively, contact the project leader, Rafik Belhadj Kacem, or (for more general information) the project liaison, David Mowatt, at one of the e-mail addresses below.

Web site: