© 2010 SMP Marketing • ISSN 1420-3693 • www.localization.org

Software Conversion for the Chinese Market
Introducing Code Scanning and AI for Translation

Dr. Lloyd Yam, Timeless Software Limited

Timeless Software has developed tools that simplify localization into Chinese. They feature code scanning for displayed strings and artificial intelligence to aid in the editing of Chinese sentences. Timeless also gives the user an easier way to input Chinese than the standard input methods.


Background

Timeless Software's globalization project started over a year ago. For practical reasons, Chinese and English were initially used as the major links to other languages: expertise and existing databases for English-to-Chinese and English-to-French were relatively easy to obtain, while the same was not true for French-to-Chinese. Timeless therefore started developing a bilingual database using English as a bridge to the other European languages and Chinese as the link with the double-byte world.

At the start of 1997, following a joint venture agreement with the Beijing-Guangdong Computer Center, Timeless stepped up its R&D on bilingual translation as if on a two-way street. At the time, it seemed that one side of the street was a highway while the other was a bicycle lane—single byte technology was as old as the computer keyboard, but the technology of reading and interpreting its double byte counterpart had hardly begun. Timeless had to reclassify the entire set of Chinese characters to facilitate subsequent applications.

Why Reclassification?

In the Western world, kids start learning to read with the letters a to z. When daddy comes home with a PC, the keyboard has the same letters on it. Chinese children learn their characters as combinations of strokes; the keyboard looks strange to them, and when you stick pictures of Chinese radicals all over the place, you scare the daylights out of them. Timeless had to develop a new double byte technology, because we wanted to make things easy not only for the computer, but also for the user. The proprietary Timeless database consists of words and phrases which can be accessed by a large number of methods.

While R&D is still progressing, Timeless has now completed a set of translation tools and applied them to a major software package to provide a turnkey solution. The tools are suitable for PC and mid-range applications (such as IBM's AS/400) and for large projects (over one million words). The following summarizes the process of localization and illustrates two unique features of the Timeless tools—code scanning and artificial intelligence.

From the customer's point of view, the main components of a software application are the source code, text files, graphics files and documentation.

Paper documentation usually has file equivalents; where it does not, expensive optical scanning and/or manual translation has to be undertaken. Together with text files, documentation can be treated as text translation, which will be discussed shortly.

Graphics file conversion is a labor-intensive exercise. In most cases, when text and graphics are in a single file, editing and superimposition have to be done on the picture file manually. If they are in separate files, Chinese words will be typed on the picture. In either case, the Timeless auto-sizing utility speeds up work significantly.

Code conversion handles displayed strings, currency units and thousands separators. It is performed effectively by a sophisticated automatic scanner recently developed by Timeless. Developing such a code scanner specifically for localization would have been a significant investment, but fortunately Timeless's tools for Year 2000 compliance conversion could be reused with only minor modifications.

The Timeless scanner removes a string, puts a string variable in its place at the same location in the code, and stores the removed string in a file. The converted program is thus language-independent. When the program is run, the appropriate language strings are read from the files, and the variables are set before the strings are displayed on screen. This makes globalization very convenient. To support another language, only the string file has to be translated, and it is not necessary for the client to provide the vendor with the source code.
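The extraction step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Timeless scanner itself: the regular expression, the `STR_n` variable naming, and the `externalize` and `load_strings` helpers are all assumptions made for the example, and only simple double-quoted literals are handled.

```python
import re

# Matches a simple double-quoted string literal, allowing backslash escapes.
# Assumption: a real scanner would need a proper lexer for each source language.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')


def externalize(source):
    """Replace each string literal with a variable name and collect the strings.

    Returns the converted source and a table mapping the (hypothetical)
    variable names STR_1, STR_2, ... to the strings that were removed.
    """
    table = {}

    def swap(match):
        name = "STR_%d" % (len(table) + 1)
        table[name] = match.group(1)
        return name

    return STRING_LITERAL.sub(swap, source), table


def load_strings(table, translations):
    """Build the run-time string table for one language.

    Falls back to the original string when no translation is stored.
    """
    return {name: translations.get(text, text) for name, text in table.items()}


# Illustrative input: two display statements in an unspecified language.
source = 'PRINT "Total amount"\nPRINT "Press Enter to continue"'
converted, table = externalize(source)
print(converted)

# Illustrative translation file containing a single entry.
chinese = load_strings(table, {"Total amount": "总金额"})
print(chinese)
```

In this sketch the translator only ever sees the contents of `table`, never `source`, which is the property that lets the client keep the source code in-house.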

Depending on the programming language and the user interface, menu conversion can be performed on the source code or as menu editing using the programmer's tool kit. Here, the Timeless graphics editor comes in handy, as it prints Chinese characters from a customized database on a picture or menu. The user can print words and phrases by simple mouse clicks instead of the cumbersome Chinese input methods.

Here is a step-by-step description of the translation procedure as illustrated by the Timeless software:

First of all, the computer reads a text file or HTML file from the Internet and translates words and phrases that are found in the customized vocabulary. A customized word is defined as one that has only one acceptable translation as far as the client is concerned. The mixed text is then printed on screen. The user then has two options: to translate the rest without inspection, in which case the computer will select the most common definition; or to scan all possible translations of each word and make the most appropriate selection.
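The two passes described above might look something like the following sketch. The function names, the dictionary layout (candidates listed most common first) and the sample vocabulary are assumptions made for illustration; the article does not describe the actual Timeless database format.

```python
def apply_customized(words, customized):
    """First pass: translate only words that have one client-approved rendering.

    Returns (text, done) pairs; done is False for words left untranslated,
    which gives the "mixed text" shown to the user.
    """
    mixed = []
    for word in words:
        key = word.lower()
        if key in customized:
            mixed.append((customized[key], True))
        else:
            mixed.append((word, False))
    return mixed


def translate_rest(mixed, general, choose=None):
    """Second pass: translate the remaining words.

    general maps a word to its candidate translations, most common first.
    With choose=None the most common candidate is taken without inspection;
    otherwise choose(word, candidates) stands in for the user's selection.
    """
    result = []
    for text, done in mixed:
        candidates = general.get(text.lower(), [])
        if done or not candidates:
            result.append(text)  # already translated, or unknown word
        elif choose is None:
            result.append(candidates[0])
        else:
            result.append(choose(text, candidates))
    return result


# Illustrative vocabularies, not taken from the Timeless database.
customized = {"invoice": "发票"}
general = {"bank": ["银行", "河岸"], "account": ["账户", "说明"]}

mixed = apply_customized(["Invoice", "bank", "account"], customized)
automatic = translate_rest(mixed, general)
manual = translate_rest(mixed, general, choose=lambda word, options: options[-1])
print(automatic)
print(manual)
```

Keeping the customized vocabulary in a separate first pass means a client-mandated term can never be overridden by a more common general definition.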

After this, sentence editing with artificial intelligence comes into play. Sentences are displayed one by one, and the computer will suggest, in a graphical way, rearrangement or deletion of words and groups of words to make the Chinese sentence read better. The user presses Enter to accept and Esc to skip. This continues until no more suggestions are made. If further editing is required, the following options are available, with graphical animation being used to make the editing more effective and hence speed up the work: a) two groups of words can be transposed, b) words can be deleted, c) words can be inserted, d) alternative definitions can be used, and e) sentences can be rearranged by pressing numbers next to words.
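The manual editing operations listed above can be modeled as simple list operations on a sentence held as a sequence of words. The sketch below covers transposition, deletion, insertion and numbered rearrangement only; how the software decides which rearrangements to suggest is not described in the article and is not modeled here. All function names and the sample sentence are assumptions.

```python
def transpose(words, first, second):
    """Swap two non-overlapping word groups, each given as a (start, end) slice.

    The first group must come before the second.
    """
    (a, b), (c, d) = first, second
    if not (0 <= a <= b <= c <= d <= len(words)):
        raise ValueError("groups must be ordered and non-overlapping")
    return words[:a] + words[c:d] + words[b:c] + words[a:b] + words[d:]


def delete(words, start, end):
    """Remove the words in the slice start:end."""
    return words[:start] + words[end:]


def insert(words, position, new_words):
    """Insert new_words before the given position."""
    return words[:position] + list(new_words) + words[position:]


def rearrange(words, order):
    """Reorder a sentence using the 1-based numbers shown next to each word."""
    if sorted(order) != list(range(1, len(words) + 1)):
        raise ValueError("order must use each word number exactly once")
    return [words[i - 1] for i in order]


# Word-for-word rendering of "I eat lunch at home", still in English word order.
draft = ["我", "吃", "午饭", "在", "家"]

# Chinese places the location before the verb, so swap the two groups.
swapped = transpose(draft, (1, 3), (3, 5))
numbered = rearrange(draft, [1, 4, 5, 2, 3])
print("".join(swapped))
```

Both calls produce the same sentence here; the numbered form is more general, while transposition needs only two selections from the user.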


Dr. Lloyd Yam
Research & Development Director
Timeless Software Limited
23rd & 24th Floors, CRE Building
303 Hennessy Road
Wanchai, Hong Kong

Tel +852-2594-9613
Fax +852-2519-7186
E-mail: dryam@timeless.com.hk



