In this issue…
TMX has been implemented!
In this article, Prof. Alan Melby, Technical Secretary of LISA's OSCAR SIG, reports on the latest activities by the group.

Success in implementing TMX

The big news from OSCAR, for those of you who missed the OSCAR session at the LISA Forum in Boston and its follow-up report in Shanghai, is that the TMX standard for Translation Memory eXchange has been implemented to one degree or another by seven major localization companies (Alpnet, International Communications, Intertrans, SDL, STAR, Sykes and Trados). Commitments made at the last OSCAR meeting in Madrid were kept and implementation has begun.

This is great news, but what, exactly, does it mean? At the Boston meeting of the OSCAR Steering Committee, it became clear that implementation is not a simple binary question. There are two aspects of TMX implementation, import and export, and one does not necessarily imply the other. That is, a tool could import from TMX files but not be able to export to them. In addition, the OSCAR Steering Committee defined three levels of implementation:
Assuming that development and alpha testing continue at the rapid pace they have followed since Madrid, some tool developers could announce usable commercial implementations of TMX this year. This is very good news for end users of translation technology.

TMX and TBX: plans for the future

Another development within OSCAR is the beginning of work on TBX, the companion standard to TMX for termbase exchange. After considerable discussion in the formal meetings, halls and restaurants at the Boston Forum, a consensus emerged that TBX should incorporate ideas from at least the following three sources: TMX, MARTIF, and OLIF. Good features of TMX that should be maintained include XML compliance, meta-markup tags, and the "ude" method of documenting the use of non-Unicode characters. MARTIF is an international standard (ISO 12200) for human-oriented terminology interchange; OLIF is a format developed within the Otelo project for machine-translation dictionary interchange.

A proposal is currently being circulated within the OSCAR group that combines elements of TMX, MARTIF, and OLIF into a proposed TBX format, so that both human-oriented and machine-translation dictionary information on the same terms can be held in the same exchange file. OSCAR members will soon be taking time from their busy TMX implementation schedules to comment on the current TBX proposal.

Another OSCAR activity planned for the next few months is further testing of TMX by importing into one tool a TMX file produced by another tool. In contrast, most testing up to now has involved exporting from one tool and importing back into the same tool. Yet another activity planned for the same period is the gathering of sample termbase entries and machine-translation dictionary entries for the testing of TBX proposals.
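To make the difference between same-tool and cross-tool testing concrete, here is a minimal sketch of a cross-tool check in Python. It is an illustration only, not any vendor's implementation. The two functions stand in for two hypothetical tools ("ToolA" exports, a separate importer reads), and all header values, the version number, and the use of a plain `lang` attribute on `tuv` elements are assumptions made for the example; the TMX specification itself is the authority on required elements and attributes.

```python
import xml.etree.ElementTree as ET


def export_tmx(pairs, srclang, tgtlang):
    """Hypothetical 'Tool A': write (source, target) pairs as a TMX-style string."""
    tmx = ET.Element("tmx", version="1.1")  # version number is an assumption
    ET.SubElement(tmx, "header", {
        "creationtool": "ToolA",            # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "ToolA-native",            # hypothetical native format label
        "adminlang": "en",
        "srclang": srclang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")      # one translation unit per pair
        for lang, text in ((srclang, src), (tgtlang, tgt)):
            tuv = ET.SubElement(tu, "tuv", lang=lang)
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def import_tmx(xml_text):
    """Hypothetical 'Tool B': read a TMX-style string into {lang: segment} dicts."""
    root = ET.fromstring(xml_text)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get("lang"): tuv.findtext("seg")
                      for tuv in tu.iter("tuv")})
    return units


# Cross-tool check: what Tool B reads should match what Tool A was given.
pairs = [("Open the file.", "Ouvrez le fichier."), ("Save", "Enregistrer")]
exchanged = import_tmx(export_tmx(pairs, "en", "fr"))
assert [(u["en"], u["fr"]) for u in exchanged] == pairs
```

The exporter and the importer share no code and communicate only through the exchange file. A round trip through a single tool can hide errors that the same code makes symmetrically on export and import, which is why testing between tools is the more demanding check.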
The next sit-down meeting of OSCAR is scheduled for Innsbruck, Austria, on Tuesday, August 24, 1999, in conjunction with the international conference on Terminology and Knowledge Engineering. By then, TMX should be starting to benefit end users within the LISA community, and TBX will hopefully not be far behind. The fact that TMX has been implemented at various levels for both import and export, that it is being used internally, and that it will soon be tested between tools is a real tribute to the dedication and spirit of cooperation of OSCAR members, for the common good of the localization industry.

Late-breaking news: SALT

SALT (Standards-based Access service to multilingual Lexicons and Terminologies) is a consortium of academic, government, association, and commercial groups in the US and Europe that are working together on the task of testing, refining, and implementing a universal format for the interchange of terminology databases and machine-translation lexicons. This universal "lex/term" format is based on the MARTIF standard (ISO 12200, which is in turn based on ISO 12620) for human-oriented terminology database exchange (for further info see http://www.ttt.org) and on OLIF for machine-translation dictionary and other NLP lexicon exchange (see http://www.otelo.lu). The two types of exchange are known as "term" and "lex". In addition, the format will include some Unicode and meta-markup features of TMX and, finally, will be coordinated with results from other related projects, such as Transterm and Geneter.

OSCAR, in a letter by its chair, Franz Rau, has announced its support for the SALT project. TBX and SALT will be developed in conjunction, so that TBX will be a LISA-specific subset of the SALT format. LISA has also announced its support for the SALT project.
The SALT project itself involves (a) testing and refining an XML-based lex/term data interchange format, called XLT, that combines MARTIF and OLIF, (b) development of a website where people can try out various XLT utilities, and (c) development of an XLT toolkit for developers of lex/term-related products. The utilities will include conversion routines between OLIF and XLT, between Geneter and XLT, and between several other formats and XLT, as well as guidelines for those who want to develop their own conversion routines.

For many people in the language industries, the benefits of having one widely used term data interchange format are obvious. Indeed, the following typical comment was made at the LISA Forum in Boston (February 1999) in response to the idea of combining MARTIF and OLIF: "This is what we have been waiting twenty years for!" The projected benefits of work on SALT are:
It became clear at the OSCAR meeting in Boston that the previous OSCAR plan to look at termbase and MT-lexicon data exchange separately was unacceptable to the localization industry (and probably to the wider language industries). An integrated standard is needed now.

SALT has been submitted for possible funding to both the National Science Foundation in the United States and the European Union's Fifth Framework, in response to a joint call for proposals issued by the two bodies. The US side of the consortium will be headed by the author, and the EU side will be chaired by Gerhard Budin, University of Vienna. With the support of OSCAR for SALT, and given OSCAR's proven record of introducing needed standards and having them accepted by the localization industry, the SALT standard is in an unprecedented position to provide a common exchange format for terminological data. Following in the footsteps of TMX, SALT and TBX promise to help bring about what the industry has spent years waiting for.