Translating technical materials with PC MT software
My personal challenge and suggestions
Profile

I am not an MT specialist but a technical expert in CASE (Computer-Aided Software Engineering), and it was in that capacity that I took on the challenge of translating technical materials with PC MT software. One of my job requirements is to promote CASE in Japan. I have been submitting papers to journals on CASE and CASE-related fields, and lecturing at user seminars and workshops. Promotion cannot be effective without useful reading material for future CASE users. Because most CASE-related techniques and most software, such as the C language and UNIX, were developed in Europe and America, there were scarcely any translations available in this country when I started translating. Preparing reference materials in Japanese is naturally part of my job, so I have been translating as many of the well-known English books on CASE as possible. I am not, however, a professional translator.

Recently I translated "Obujekuto shiko sekkei no hyojun HOOD3.1" ("HOOD3.1 as the standard of object-oriented design"), published by Kaibundo Shuppan. Since 1987 I have translated a number of technical documents before this title, including five or six CASE-related technical papers, and I have published two technical books on CASE, "Analysis of structure of the real time system" and "Practicing the introduction of the CASE tools", both from Nikkei BP. MT was used in part for those two books.

I would not have continued translating technical books for years without the advice of a publisher: "What an English-to-Japanese technical translator needs is, first, the ability to write correct Japanese; second, expertise in the technical field; and only third, ability in English." He added that the general public expects a translator's abilities in the opposite order.
Certainly, that publisher supplied books in easy-to-read Japanese. From a business standpoint, however, translating technical books is not expected to be profitable, because they do not sell well. Either the publisher expects some benefit other than sales, or the translator is a spirited volunteer. The same holds in the field of CASE, where readership is quite small. My recent title is priced at 3,000 yen, with the first edition limited to only 1,500 copies. Moreover, no reprint or revision is planned, since the first edition is not expected to sell out. Usually, when a technical book is translated, the first edition runs to only about 3,000 copies, and only titles that sell well are reprinted or revised. Even for the ever-popular C and UNIX, translated user manuals sell only tens of thousands of copies. It is thus widely accepted that technical books do not sell well, and so most of them remain untranslated. My challenge is to break this vicious circle by making it simpler to translate many kinds of titles in small quantities. In that sense, my recent book translation served as a benchmark for improving MT systems specialised in technical document translation.

Conditions essential for MT

Translating non-native author's book

A book written by someone who is not a native English speaker is best suited to MT. The author of the English book I translated is French. English is not his mother tongue, and that showed in the easy English expressions of the original. There are no complicated literary or idiomatic expressions in its pages, which makes the content easy for a Japanese reader to understand. Moreover, the book uses many fixed sentence patterns, and this kind of patterned sentence composition is ideal for translation.
For example, the following sentence has a syntax that is comparatively easy to understand: "the ODS of an ACTIVE OBJECT is an object descriptor composed of the following sections". Many sentences scattered throughout the book differ from one another only in certain words or phrases, such as those capitalised in the sample sentence. The MT work on such sentences was quite easy.

Shortcomings

Many spelling mistakes occur. As the author's native language is not English, he mistyped many words. Most of these are caught by spelling-check functions and the like. However, spelling checkers overlook errors such as the confusion or misplacement of 'an' and 'on'. A spelling checker does not distinguish 'an' from 'on' as long as each is spelled correctly. In some cases, if the word's position in the sentence is grammatically acceptable, even a commercial grammar checker such as "Grammatic Mac" does not flag it as a mistake.

Object of personal pronoun unidentified

Neither the relation to the immediately preceding sentence nor its syntactic analysis reveals the referent of a personal pronoun. The original author is not entirely fluent in English grammar, and specialist knowledge of the field alone is not enough to determine what every personal pronoun refers to. The original author and the publisher should be responsible for such spelling and grammar errors. Since I was translating the original in order to popularise CASE in Japan, it would not have been reasonable to leave such mistakes unreported. I therefore queried the original author and confirmed these points with him, although this cost a considerable amount of time and energy. In any case, an error left uncorrected in the original results in a translation error by the translator.
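The spelling-checker limitation described under "Shortcomings" can be illustrated with a minimal Python sketch. The word list, the confusion set and the function names are all hypothetical, chosen only for this illustration; they are not taken from any tool mentioned in the article. A word-list check cannot report 'on' typed for 'an', so the second function simply lists such words for a human to review.

```python
import re

# Hypothetical mini word list; a real checker would load a full dictionary.
VOCABULARY = {"the", "ods", "of", "an", "on", "active", "object", "is",
              "descriptor", "composed", "following", "sections"}

# Hypothetical confusion set: short words easily mistyped for one another.
CONFUSABLE = {"an", "on"}

def misspelled(sentence, vocabulary=VOCABULARY):
    """Return the words that are not in the vocabulary (classic spell check)."""
    words = re.findall(r"[A-Za-z]+", sentence)
    return [w for w in words if w.lower() not in vocabulary]

def needs_review(sentence, confusable=CONFUSABLE):
    """Return (position, word) for correctly spelled but confusable words."""
    words = re.findall(r"[A-Za-z]+", sentence)
    return [(i, w) for i, w in enumerate(words) if w.lower() in confusable]

typo = "the ODS of on ACTIVE OBJECT is an objcet descriptor"
print(misspelled(typo))    # only the non-word is reported
print(needs_review(typo))  # 'on' and 'an' are listed for a human to check
```

The second function does not decide which word is wrong; it only narrows down where a reader with knowledge of the field should look.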
MT house and the cost-effectiveness

In this translation project I tried to carry out every step of the work myself, including the challenge of translating with MT. As mentioned above, I had partially used MT twice in the past, and on those occasions I ordered the MT portion from a company specialising in MT, which I call an MT house here. However, I suffered a great deal from this method in several ways, even though I asked for no more than "pre-edit, process with MT and post-edit." In the end I literally rewrote the whole text myself, since:
Recently, the idea of groupware has been promoted aggressively, and sharing translation work among several people is becoming excessively popular. A translation could be divided by job stage, such as pre-editing and translation. However, I am afraid it is pointless for two or more people to share one source text, or for the source text to be cut up by paragraph among several translators, leaving the person in charge of rewriting to struggle with harmonising the style. As long as my goal for this project is "to realise the maximum means possible for English-to-Japanese translation of technical books", the present situation rules out involving human labour such as an MT house.

Besides the points above, the more serious problem with engaging an MT house is cost. In each of my past two translation projects, I paid the MT house more than 1 million yen. This time, the payment contracted with the publisher for the translation was far too small to cover the cost of an MT house. In other words, most English-to-Japanese translation projects for technical books can pay only half of an MT house's fee.

Media of the original text

When you accept a translation job, it is indispensable to obtain the original as a digital file. At the start of this project, I expected that an electronic file of the published original existed somewhere. Even if I were to translate the original manually, I would need the text as an electronic file to use in a word processor. Replacing a given term with its translation equivalent by cut-and-paste would then save considerable time and energy. Digitised files are, of course, even more advantageous when applying MT. Obtaining the digitised file is therefore an absolute condition.
In addition, I tried to get a file in the format best suited to my working environment. When the original text cannot be obtained in digitised form, you may input it by OCR to make up for that. However, I avoided OCR, reasoning that its use is still pointless for the following reasons. A single OCR system is more expensive than the MT software itself. OCR cannot be expected to be 100% accurate, and it is quite doubtful that it will correctly extract only the text while excluding charts and figures. Finding and correcting recognition errors in the input data also costs a tremendous amount of time. These points are enough to conclude that using OCR to input the original text is inefficient in this case: it creates unnecessary work and costs a great deal of labour and time. There is also a significant risk that the correction work itself will introduce further mistakes.

Take the construction of a building as an example. You put all the materials together correctly and finish the building. Using OCR input is like deliberately dismantling the completed building on site just to check the number and type of the materials one by one, and then assembling them again as before. In the process, parts may break off or be joined incorrectly. Hence the digitised file is the easiest to handle, and the translator should make every effort to obtain the electronic file.

Once I had obtained the original text, I did the entire translation process on the computer. I tried to centralise the whole job around MT within the computer environment, in order to use the data to the full. I also avoided manual operations as far as possible, to reduce errors and lost time.
These arrangements were based on my view of MT: "The time for being helpless in the face of an inconvenient MT is over." Some way should be sought to use MT's capability to the full. The rest is decided by budget: comparing cost and effect, I choose the MT software that suits best.

MT for the personal computer

For this translation project, I considered only MT for the personal computer. I did not consider the very expensive MT software for workstations and UNIX machines, and I also disregarded translation services offered over personal computer communication (PCC) networks. Those services are expensive too, since they charge for the service itself and for message forwarding in addition to the access fee for the PCC network. The communication costs for a very long file are unacceptable, especially in Japan, where you pay more than in other countries such as the United States. In the end I chose the T system from K company from among the three Macintosh MT software packages on the market, although I had little technical knowledge of MT on which to base the choice. The main reasons for picking that particular software were:
Translation procedures

Some publishers charge a high price for providing the original text as digitised data. When I translated a technical book in the past, I was charged 500 U.S. dollars for a magnetic tape in the SYVision format. Format conversion to an IBM-PC floppy disk cost another US$5,000. Fortunately, the publisher in France this time provided the text in FrameMaker 3.0 format free of charge. However popular that format may be in Europe, I did not have that software, and it was necessary to convert the file into plain text. I surveyed the software market for the cheapest option and bought Conversions Plus from DATA VIZ for 20,000 yen. As it turned out, the purchase was wasted: after I had bought the conversion tool, the publisher in France converted the FrameMaker file to plain text and sent it to me at no charge. In any case, I was spared the extra task of data conversion. Even so, when using MT it is very important to do your best to obtain the digitised text in the handiest format at the lowest cost, free of charge if possible.

At the next stage, I began setting up the MT environment. Since I had never used this MT system before, I tested the general compatibility between the MT system and the source text. I ran the whole text through MT in a single pass, without any pre-editing. As a result I discovered various problems, large and small, and fed them back to the system developer. They answered my enquiries very kindly each time, so I was able to deal with the higher-risk problems well in advance.

Next came the maintenance of the dictionary. I decided to build a user dictionary myself.
My past experience with MT had shown that applying a conventional domain-specific dictionary, such as the computer dictionaries on the market, is very impractical, especially in the field of CASE, so I never considered buying such a terminology dictionary. Economic reasons were secondary in this case. First, I collected the necessary terms from the index at the back of the original book and registered them in the user dictionary. This work would have been unnecessary if the terminology or specialist dictionaries on the market were applicable to the field of CASE, or if publishers did not insist on asserting their identity through their own terminology. If the translator had encyclopaedic command of the domain-specific terms, he would not need to build a user dictionary at all.

There are other cases of problems involving terminology. To take examples from my own experience: in one project, the publisher handed me a term list beforehand, and that list was treated as the equivalent of a user dictionary whose use was required. On another occasion, a publisher stipulated that I should use the customary or historical expressions and terminology of the CASE industry. And for the most recent project, a thesaurus of information processing was used as the main reference. In future, publishers who are strongly conscious of MT culture should apply the domain-specific dictionaries on the market in their own ways, or such terminology will never play an active part in the field of MT.

Priority of user dictionary

As mentioned above, I test-translated the whole original text with MT in a single pass, without any pre-editing. Among the various problems I discovered, the most serious concerned the relationship, or priority, between the basic and user dictionaries.
In the T system I used, translation equivalents from the basic dictionary took priority over those in the user dictionary, so words registered in the user dictionary were not translated as registered. From the trial translation I was able to obtain a list of the English words whose translation equivalents had been taken from the basic dictionary.

Countermeasure

Fortunately, the MT system I used recognised Japanese words mixed into English text as symbols, and passed English text with embedded Japanese words into the translation process as it was. Taking advantage of this behaviour, at the pre-editing stage I replaced the English words whose equivalents had been taken from the basic dictionary with Japanese, so that the Japanese translation equivalents were embedded in the English text. This had an unexpected benefit as well: I could read the original as English text containing the right amount of Japanese words, which made the English remarkably easy to grasp.

However, to confirm where the basic dictionary takes priority over the user dictionary, you have to look up each word in the MT system's interactive mode, and that job almost killed me. If only I had known which commercially available dictionary had been used as the basic dictionary, I could have referred to it to check efficiently which words to register in the user dictionary. In some cases the same dictionary may be available on CD-ROM. Or does the system use its own basic dictionary? If the basic dictionary is not published, the system should provide a function for consulting and checking it simply. A user dictionary cannot be built efficiently without such a reference function.

After checking the dictionary, I looked for the parts of the text that needed pre-editing. For this I processed the whole text at once and had the MT system point out the parts where pre-editing was necessary.
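The countermeasure described above, embedding Japanese equivalents in the English source during pre-editing, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `embed_terms`, the dictionary entries and the Japanese renderings are hypothetical, and nothing here reflects how the T system itself works.

```python
import re

def embed_terms(text, user_dict):
    """Replace registered English terms with their Japanese equivalents.

    user_dict maps an English term to the Japanese string that should be
    embedded in the source text before it is passed to the MT system.
    """
    if not user_dict:
        return text
    # Try longer terms first so that a multi-word term is replaced as a
    # whole before any shorter term it contains.
    terms = sorted(user_dict, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): ja for term, ja in user_dict.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)

# Hypothetical user dictionary entries, for illustration only.
user_dict = {
    "object": "オブジェクト",
    "object descriptor": "オブジェクト記述子",
}
source = "The ODS is an object descriptor of the object."
print(embed_terms(source, user_dict))
```

Matching on word boundaries keeps a term such as "object" from being replaced inside a longer word like "objective". The substitution runs in a single pass, so Japanese text that has already been embedded is never matched again.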
In English-to-Japanese translation, many people do not pre-process the original text at all. In my project, however, I valued pre-editing as much as possible, because I accepted that translation quality is traded off against system price: I did not expect a cheaper MT system to deliver high translation precision. It was quite noticeable that even slightly complicated English input produced Japanese output with obscure meaning, and I wanted to avoid the stress of stalling while I checked what the obscure Japanese meant. In the output of the trial translation, the modification relations of words were unclear in many places, around words such as 'and' and 'or'. The patterns were limited, however. MT needs to be given a function for correcting these mistakes simply. Although there were various other minor problems, I became convinced that, with various devices, MT can be made handy for technical document translation.

Long-range challenges for MT

Information in figures and tables

The digitised text obtained this time covered only the body text of the original; the information in the figures and tables was not included. A human being must therefore translate the text in the charts. In fact, all the charts were redrawn by the publisher, duplicating the original. In future, a means must be realised to use all the information in the original, including charts and figures, and it is indispensable to maintain a system for that purpose.

Reasons for the success of my project

One reason for the success of my translation project is probably that the original author is not a native English speaker. In other words, English resembling a controlled (restricted) language was effective. A technical document is not a work of literature, so sophistication of expression is not the first priority.
This suggests that in future, authors of original texts will be encouraged to write in simple English for the benefit of MT. Paying incentives to cooperating authors should be considered. For that purpose, the specification of the controlled language should be standardised worldwide.

More effective ways of checking translated text for mistakes also need to be discussed. At present, to detect incorrect translation, one may process the Japanese with proofreading or revision software and try to pick out any misspelled Japanese words in the translated text. However, a translator with sufficient knowledge of Japanese will generally find faulty Japanese expressions simply by reading. The serious problem in detecting translation mistakes is whether the meaning of the original is correctly expressed in the translated text. More than once in my project I felt the need for a system that checks this correspondence. When you read only the Japanese translation for a long time, you do not notice where part of the English text has been omitted. Mistranslation or misunderstanding of part of a word can be found by comparing the Japanese text with the English original very carefully, but even that will not prevent mistakes entirely. I therefore applied two MT systems to one original text, processing the translation twice between the systems, and compared the original text with the twice-processed text to check the performance of the MT software, as shown below:

Original text (English) -> English-Japanese MT Software ->
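The comparison step of this check can be sketched in Python. The sketch assumes that the second pass is a Japanese-to-English MT system, so that each original sentence can be compared with its round-trip English counterpart; the diagram above is cut off, so that is an assumption rather than something the source confirms. The function names and the threshold are hypothetical, and the MT systems themselves are not called: both lists of sentences are taken as given.

```python
import re
from difflib import SequenceMatcher

def tokens(sentence):
    """Lower-case a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def flag_suspects(originals, round_trips, threshold=0.6):
    """Return (index, similarity) for sentences whose round-trip English
    differs strongly from the original, as candidates for omission or
    mistranslation. The threshold is an arbitrary illustrative value."""
    suspects = []
    for i, (src, back) in enumerate(zip(originals, round_trips)):
        ratio = SequenceMatcher(None, tokens(src), tokens(back)).ratio()
        if ratio < threshold:
            suspects.append((i, round(ratio, 2)))
    return suspects

originals = [
    "the ODS of an ACTIVE OBJECT is an object descriptor",
    "it is composed of the following sections",
]
# Hypothetical round-trip output in which the second sentence was dropped.
round_trips = [
    "the ODS of an active object is an object descriptor",
    "",
]
print(flag_suspects(originals, round_trips))
```

A low similarity score does not prove a mistranslation, since a correct translation can come back in different words. It only tells the translator which sentences to compare against the original first.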
With this new method, you can successfully find mistranslations in the Japanese text. The development of a system to detect improper Japanese translations from English is earnestly awaited.

This article was reprinted with the kind permission of the AAMT in Japan.

Asia Pacific Association for Machine Translation (AAMT)