Answer Archives
This page features answers to questions that have been submitted to LISA and answered by LISA staff and discussion forum participants. To ask a question of LISA, send an email to questions@lisa.org and we’ll try to help you out. If we feel your question would help others, we will post the question and our response here.
- What are the top ten languages translated or localized into, with some metrics behind the answer? (Ross Mayfield, CEO of SocialText)
This is a very good question, but one that is surprisingly difficult to answer: no one really knows the size of this industry, and companies are often reluctant to share their data because it is closely tied to their global strategies.
In addition, the answer depends on whether one is interested in translation volume or strategic importance. The two differ substantially, and in important ways.
Let's start with translation volume, using figures from 2000 (the latest year for which I have reliable statistics). These figures show, among companies that localize at all, the percentage that localize into each language:
- English → French (~30%)
- English → German (~25%)
- English → Spanish (~25%)
- English → Japanese (~22%)
- English → Italian (~20%)
- English → Chinese (Simplified) (~15%)
- English → Portuguese (~12%)
- English → Swedish (~10%)
- English → Dutch (~8%)
- English → Korean (~7%)
These figures are several years old, and priorities have since shifted away from European languages and towards Asian languages, which leads us to the concept of "strategic" importance.
French, Italian, German and Spanish (a.k.a. FIGS) and other European languages are "maintenance" languages for many companies: i.e., companies already have a market in Europe, and have to maintain and serve it, but the market is not one that is seen as part of a strategic plan to gain global market share. This does not mean that these languages are unimportant, but rather that they are unlikely to represent new growth areas.
In contrast are "strategic" languages, i.e., those that represent new market areas with a potential for new revenue streams. In this view, Chinese seems to be the number one language at present (I write this based on a number of LISA presentations and the general "buzz" in the industry).
While we don't have any hard data at present on strategic languages (for obvious reasons, companies tend to keep strategic information quite close), you can get a picture of them by looking at the countries where U.S. and European businesses see large new markets and are trying to establish a foothold for consumer-oriented products (and where the market can be accessed easily with a single language). I suspect that the list would look something like the following:
- Chinese
- Japanese
- Spanish (for U.S.-based companies that see Latin America as a market)
In conclusion, the most important language question does not have an easy answer, but there are some general trends we can identify.
[Answer provided by Arle Lommel]
- What Is the Difference Between TMX and TBX? (from a site visitor)
We are often asked to explain the difference between TMX (Translation Memory eXchange) and TBX (Term Base eXchange).
LISA has two different standards for dealing with two different aspects of localization workflows. The first, Translation Memory eXchange (or TMX), is an XML format for the representation of translation memory (TM) data. The second, Term Base eXchange (or TBX), is also an XML format, but is used to represent concept-oriented terminological data (usually multilingual) for exchange purposes. Although there are some similarities between the two formats, they are used for very different purposes.
A relatively simple format, TMX stores texts and their translations in a vendor-neutral, tool-independent form. It serves as a pivot format to allow users to move TM data between tools with little or no loss of data, and can also be used for archive purposes. Because it is a publicly defined format, tool developers need only create TMX import and export routines to share memories stored in their proprietary formats with other translation tools. Without TMX, this would be very difficult, as developers would have to create import and export routines for every other tool on the market with which they might need to exchange data.
TBX is relatively complex by comparison. A powerful format that can represent complex terminological databases for exchange purposes, TBX serves as a vendor-neutral format for representation of terminological data. Terminological data, however, is not as easily standardized as TM data, so TBX actually is a language for describing terminology databases so that they can be interpreted by other tools. Not all aspects of these term-bases may be transferable, since different term-bases have different features, but TBX allows term-bases to be represented in as generic a manner as possible to better allow reuse of data. TBX has also been moved to an ISO framework (as ISO 30042) and we hope to have it accepted as an international standard in 2008.
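To make the TMX side of this comparison more concrete, here is a minimal sketch in Python that reads a tiny TMX-style document using only the standard library. The sample text, the header attribute values, and the function name read_translation_units are illustrative assumptions and do not come from any particular tool; the TMX specification remains the reference for the full format.

```python
import xml.etree.ElementTree as ET

# A tiny TMX-style sample. The header values are placeholders for illustration.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example-tool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en-US"
          srclang="en-US" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Save file</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Enregistrer le fichier</seg></tuv>
    </tu>
  </body>
</tmx>"""

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_translation_units(tmx_text):
    """Return one dict per translation unit, mapping language code to segment text."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        variants = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            variants[tuv.get(XML_LANG)] = seg.text if seg is not None else ""
        units.append(variants)
    return units


if __name__ == "__main__":
    for unit in read_translation_units(TMX_SAMPLE):
        print(unit)
```

Each translation unit simply pairs a segment with its translations, which is why a single import and export routine is enough for a tool to exchange memories. A terminology entry usually carries much more per-concept information, which is part of why TBX is the more complex of the two.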
[Answer provided by Arle Lommel]
- I am managing translation of a document from English into Mexican Spanish using Adobe InDesign, but InDesign's spell checkers don't include Mexican Spanish, just "Spanish: Castilian". Where can I find Mexican spell checkers for InDesign? (from a site visitor)
In general, you don't need to worry about whether it's Mexican or continental Spanish as far as spell checking is concerned. Although there are substantial spoken differences, the InDesign dictionaries will work fine for both. The reason the InDesign language list says "Spanish: Castilian" is to distinguish between Castilian Spanish, Galego (very similar to Portuguese but with Castilian-style orthography) and Catalan, all of which are spoken in Spain, not to exclude Mexican Spanish. New World Spanish and Castilian share a common orthography, and hyphenation is the same for the two. You might find some individual words not showing up in the dictionary, depending on how dialectal the Mexican Spanish is, but most New World Spanish types are considered dialects of Castilian.
However, this question raises workflow issues. Why, in most workflows, would you be waiting until the final DTP stage for spell checking? If you are outsourcing the translation, spell checking and review should be part of the service you are paying for. Since most translators work in Microsoft Word or a translation workbench's editing environment, and not InDesign or any other DTP application, you should probably not be doing major spell checking in InDesign, especially if you don't know Spanish. Also, you should be sending the finished project back to the translator (probably as a PDF) so he or she can review it and make sure you haven't introduced any errors. (If you are doing the DTP, expect to pay extra for this review phase, since the translator's responsibility ended when he or she sent you clean translated text. Expect also to pay for fixing any mistakes you introduced, while the translator should cover any mistakes that he or she made. Paying a little extra for review of the DTPed project beats having to junk thousands of printed copies of a complete project.)
[Answer provided by Arle Lommel]
- How can I type in various languages on my computer? (from a site visitor)
The answer depends on what platform you are on, and moving back and forth between Mac and Windows can cause serious frustration in this area. In general, I personally find multilingual typing easier on the Macintosh, but the gap has narrowed considerably in recent years. It used to be that you had to memorize the ALT+ code on Windows to get anything not seen on your keyboard, but that, fortunately, is long past.
Here are some tips for Mac and Windows:
- Macintosh: For Mac OS X, you'll need to enable the keyboard layouts you want to use. To do this, launch System Preferences and select the International preference pane. From there, go to the Input Menu pane, where you can select which layouts will appear in the menu and be available to you. I would highly recommend selecting “Show input menu in menu bar,” which pops up a handy menu at the right end of the menu bar for easy access to various keyboard layouts. If you need to be able to see what the keys correspond to, make sure you turn on the Keyboard Viewer, a nice little utility that pops up an image of your keyboard with the current key assignments displayed on it. You might also want to turn on the Character Palette, another utility that allows you to browse Unicode characters without needing to know how to type them. It’s very handy for when you need an occasional character that you can’t type. Note that most Apple keyboard layouts use a “dead key” typing system where, to type an accented character, you press the option key plus another key to select an accent and then you type the base character it goes on. For example, to type é with the standard U.S. keyboard layout, you type OPTION+e (for the accent) and then e (for the base character).
- Windows: For Windows, you will need to go to the Regional and Language Options control panel, go to the Languages tab and then click on Details. There you can add individual keyboards. Make sure you click on “Language Bar…” in the Preferences section and turn on “Show additional Language bar icons in the taskbar,” which will add the language options to your taskbar. There is no built-in keyboard viewer, and Windows keyboards do not usually make extensive use of dead keys, so most of them are more limited than their Macintosh counterparts. If you want the functionality of the Apple Character Palette (plus some extra goodies), you may want to consider purchasing PopChar, a Unicode-capable character-insertion utility (a Macintosh version is also available).
[Answer provided by Arle Lommel]
- How do you localize computer code in [language X]? (from a site visitor)
While the details of localizing particular computer languages are beyond the scope of what we can answer here, the most important thing to remember is that code, even more than documentation, must be properly internationalized if it is to work. In particular:
- Text strings and other localizable resources should always be externalized rather than embedded. This practice refers to placing strings and other resources in separate files so that localizers do not need to see the computer code in order to localize content. These resources are then referenced at compilation or run time to produce the interface the user sees.
- Do not embed any assumptions about what text or resources will look like in the computer code. For example, do not assume that dates will always appear in month-day-year format or that addresses will appear in street-city-state-zip code format. If you have any questions about whether a particular data element or item needs special attention, consult your localization services provider.
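The two points above can be sketched roughly as follows in Python. The RESOURCES table, its keys, and the helper names get_string and format_date are hypothetical; in a real project the per-locale resources would live in separate files that localizers edit, and a mature internationalization library would normally handle date formatting.

```python
from datetime import date

# Hypothetical externalized resources, keyed by locale. They are shown inline
# only to keep the sketch self-contained; normally each locale has its own file.
RESOURCES = {
    "en-US": {
        "greeting": "Welcome, {name}!",
        "date_format": "{month:02d}/{day:02d}/{year}",
    },
    "de-DE": {
        "greeting": "Willkommen, {name}!",
        "date_format": "{day:02d}.{month:02d}.{year}",
    },
}


def get_string(locale, key, **params):
    """Look up a user-visible string by key instead of embedding it in code."""
    return RESOURCES[locale][key].format(**params)


def format_date(locale, d):
    """Format a date using the locale's own pattern, not a hard-coded order."""
    pattern = RESOURCES[locale]["date_format"]
    return pattern.format(year=d.year, month=d.month, day=d.day)


if __name__ == "__main__":
    for loc in RESOURCES:
        print(get_string(loc, "greeting", name="Ana"), format_date(loc, date(2008, 3, 14)))
```

Because the code refers to strings and patterns only by key, adding a language means adding a resource entry rather than touching the program logic.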
- How do you localize websites?
The answer to this question depends in part on the nature of the website. Simple websites may require only the translation of HTML files and any graphics, while data-driven or large websites will involve much more complex processes. One major difficulty faced in localizing websites is that graphics displayed on the web are seldom editable and so cannot be translated directly. As a result, source-format files (e.g., Adobe Photoshop, Illustrator) should be saved and provided for localization.
For large-scale websites special tools are often used to detect changes to the source and automate decision-making processes about what to translate and when. For more information on these tools, please read the LISA Best Practice Guide on Managing Global Content. You may also want to consult LISA’s Best International Web Support Sites publication for more information on the practices of industry leaders. Please note as well that localization of websites often involves extensive server-side preparation to support multiple language versions.
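As a rough illustration of what detecting changes to the source can mean, the Python sketch below compares a fingerprint of each current source segment with the fingerprint recorded when that segment was last translated. The segment identifiers and function names are hypothetical, and this is only one simple approach, not a description of how any particular tool works.

```python
import hashlib


def fingerprint(text):
    """Return a stable fingerprint of a source segment."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_segments_to_translate(source_segments, translated_fingerprints):
    """List the ids of segments that are new or changed since last translation.

    source_segments: dict mapping segment id to the current source text.
    translated_fingerprints: dict mapping segment id to the fingerprint of the
        source text at the time it was last translated.
    """
    return [
        seg_id
        for seg_id, text in source_segments.items()
        if translated_fingerprints.get(seg_id) != fingerprint(text)
    ]


if __name__ == "__main__":
    last_translated = {
        "title": fingerprint("Welcome"),
        "intro": fingerprint("Old intro"),
    }
    current = {"title": "Welcome", "intro": "New intro", "footer": "Contact us"}
    print(find_segments_to_translate(current, last_translated))
```

Only the changed and new segments are reported, so unchanged content does not need to be sent out for translation again.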
- How many errors are acceptable in a translation?
We are asked this question a lot, but there is no simple answer. The answer depends in part upon the severity and nature of the errors, but also upon your business requirements. If you are in a heavily regulated industry, for instance, errors may have major ramifications, so you will not tolerate many of them and will need to invest in extensive quality assurance and quality control processes. In other cases, however, quality will be less important and you may tolerate higher error rates.
To determine what error rates you will tolerate, take samples of previous translations you consider good, barely tolerable, and bad. Review them, identify the sorts of errors that appear in each one so that you understand why it falls into its category, and count up the various types of errors. With the profiles for each type of document in hand, you can determine what sorts of errors matter in your case and how many you can tolerate. You can then develop an appropriate error profile in the LISA QA Model.
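The counting step described above might look something like this in Python. The error categories, the severity weights, and the function names are hypothetical values chosen for illustration; they are not taken from the LISA QA Model, and you would substitute the categories and weights that reflect your own requirements.

```python
from collections import Counter

# Hypothetical severity weights; adjust to your own business requirements.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}


def error_profile(errors):
    """Count errors by (category, severity) pair."""
    return Counter(errors)


def weighted_errors_per_1000_words(errors, word_count):
    """Return the severity-weighted error score, normalized per 1,000 words."""
    total = sum(SEVERITY_WEIGHTS[severity] for _category, severity in errors)
    return 1000 * total / word_count


if __name__ == "__main__":
    # Errors found while reviewing one 2,000-word sample translation.
    sample_errors = [
        ("terminology", "minor"),
        ("mistranslation", "major"),
        ("terminology", "minor"),
    ]
    print(error_profile(sample_errors))
    print(weighted_errors_per_1000_words(sample_errors, 2000))
```

Running the same calculation on your good, barely tolerable, and bad samples gives you scores to compare, from which you can pick a threshold that matches your own tolerance.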
- Do you have any information on automatic translation from Simplified Chinese to Traditional Chinese?
While there are products to assist in conversion from Simplified Chinese to Traditional Chinese (and vice versa), the process cannot be entirely automated and requires careful review. Very often the terminology used in mainland China, Hong Kong, Taiwan, and other Chinese-speaking areas differs, and a simple change from one character set to another may result in incorrect terminology or even nonsense. While automated tools can assist in the conversion, they cannot handle all aspects of moving from one version of the Chinese writing system to another.
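The short Python sketch below shows why a character-by-character conversion needs review. The mapping tables are tiny illustrative samples and the function name to_traditional is hypothetical; real conversion products use far larger tables and context-sensitive rules.

```python
# Simplified characters with a single Traditional counterpart (tiny sample).
ONE_TO_ONE = {"头": "頭", "这": "這", "们": "們"}

# Simplified characters that correspond to more than one Traditional
# character; the right choice depends on the word and its meaning.
ONE_TO_MANY = {"发": ["發", "髮"], "后": ["後", "后"]}

# Terms that differ by region, not just by character form (tiny sample,
# mainland usage mapped to Taiwan usage).
REGIONAL_TERMS = {"软件": "軟體"}


def to_traditional(text):
    """Convert naively and report every position a reviewer must check."""
    for mainland_term, taiwan_term in REGIONAL_TERMS.items():
        text = text.replace(mainland_term, taiwan_term)
    converted, needs_review = [], []
    for ch in text:
        if ch in ONE_TO_MANY:
            candidates = ONE_TO_MANY[ch]
            # Record the position, the source character, and the candidates.
            needs_review.append((len(converted), ch, candidates))
            converted.append(candidates[0])  # a guess that may be wrong
        else:
            converted.append(ONE_TO_ONE.get(ch, ch))
    return "".join(converted), needs_review


if __name__ == "__main__":
    print(to_traditional("头发"))
    print(to_traditional("这软件"))
```

For the word 头发 ("hair"), the naive first-candidate choice yields 頭發, whereas the correct Traditional form is 頭髮. The flagged position is exactly the kind of case that requires human review.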
- I need information about localization issues for a specific language or country. Where can I find this information?
At present there is no centralized repository for such data. You should discuss issues with a localization service provider or with local staff. If you collect service data for your international markets, you can also see what issues consistently cause problems for particular markets. If you are looking for technical data about locales (such as information on character sets, sort orders, date formats, etc.), much of this information is available in the Unicode Consortium’s CLDR project.
- I need to know the language code for [language X] and the country code for [country Y]. Where can I find this information?
Country and language codes are defined by the International Organization for Standardization. However, there are a number of free online sources for this information, including the CLDR.
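Once you have looked up the two codes, they are commonly combined into a single locale identifier such as es-MX (Spanish as used in Mexico). The Python sketch below assumes a two- or three-letter ISO 639 language code and a two-letter ISO 3166-1 country code; the function name make_locale_tag is hypothetical, and the check covers only the shape of the codes, not whether they are actually assigned.

```python
import re


def make_locale_tag(language, country):
    """Combine a language code and a country code into a tag like 'es-MX'.

    Only the form of each code is validated here; confirming that a code is
    really assigned requires one of the published code lists.
    """
    if not re.fullmatch(r"[A-Za-z]{2,3}", language):
        raise ValueError(f"not a 2- or 3-letter language code: {language!r}")
    if not re.fullmatch(r"[A-Za-z]{2}", country):
        raise ValueError(f"not a 2-letter country code: {country!r}")
    # By convention the language is written in lowercase, the country in uppercase.
    return f"{language.lower()}-{country.upper()}"


if __name__ == "__main__":
    print(make_locale_tag("es", "MX"))
    print(make_locale_tag("ZH", "tw"))
```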
- What are the legal standards companies involved in [industry X] need to comply with in [country Y]?
This question is a complex one and there is often a lack of information about it. To determine the answer to this question, you should consult relevant industry associations, local international chambers of commerce, and local partners.
- What guidelines are there for preparing terminology resources?
The LISA Terminology SIG is active in developing these resources and their page has information on this topic. If you are a LISA member and want greater access to information, consider joining the SIG.
- What is the best practice for localizing Cascading Style Sheets (CSS) for web content? Should I use one style sheet for all languages or separate style sheets for each language?
Many components of style sheets will not change on a per-language basis, so it does not make sense to use completely separate style sheets for each language. However, some components will change. In general, the best practice is to use multiple <link> tags in the head of your HTML in order to link to a common style sheet for styles not dependent on a specific locale, plus links to locale-specific style sheets. In many cases, if you set the xml:lang and/or lang attributes in your HTML declaration, you can use the same style sheet for multiple languages. In this case you might have a common style sheet for all language versions of the site, plus additional style sheets for Roman-script languages, Cyrillic-script languages, Japanese, Korean, etc. (Note as well that in most instances it is best practice to link to style sheets rather than embed them in the head of your HTML files.)
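As a minimal sketch of the common-plus-specific arrangement, the Python function below produces the style sheet links for a page in a given language. The file paths, the language-to-style-sheet table, and the function name stylesheet_links are hypothetical and would be replaced by whatever your site actually uses.

```python
from html import escape

# Hypothetical paths: one style sheet shared by every language version...
COMMON_STYLESHEET = "/css/common.css"

# ...plus script- or language-specific style sheets, keyed by language code.
SPECIFIC_STYLESHEETS = {
    "en": "/css/roman.css",
    "fr": "/css/roman.css",
    "ru": "/css/cyrillic.css",
    "ja": "/css/japanese.css",
    "ko": "/css/korean.css",
}


def stylesheet_links(lang):
    """Return the link tags for a page whose language tag is lang (e.g. 'ja-JP')."""
    primary = lang.split("-")[0].lower()
    hrefs = [COMMON_STYLESHEET]
    specific = SPECIFIC_STYLESHEETS.get(primary)
    if specific:
        hrefs.append(specific)
    return [
        f'<link rel="stylesheet" type="text/css" href="{escape(href, quote=True)}">'
        for href in hrefs
    ]


if __name__ == "__main__":
    for tag in stylesheet_links("ja-JP"):
        print(tag)
```

The common style sheet is always linked first, so the language-specific sheet only needs to override the handful of rules, such as fonts and line heights, that differ for that script.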
- What particular concerns are there for localizing XML?
XML offers tremendous advantages for localization in that almost any localization tool can work with XML data. That said, XML can pose particular challenges. For more information on localization of XML, please see “Coping with Babel: How to Localize XML,” part I and part II, or http://www.opentag.com/xmli18nfaq.htm.
- Where can I find out more about localization testing?
A great resource for learning more about localization and internationalization testing is the book Software Testing and Internationalization, available as a free download from LISA.
In general, localization testing can be broken down into a number of areas, including review of localized content, functional testing of localized versions, and usability testing of localized versions. Of these areas, review of content is most likely to be offered as part of localization services, although you may wish to initiate a separate independent review process. The LISA QA Model is particularly suited to this type of testing. Functional testing is usually conducted in specialized in-country laboratories and can be contracted for with many larger localization service providers. Usability testing requires a panel of typical users in a lab setting and is very labor intensive but can offer great insight into the language- and culture-specific issues users may face. For more information on testing you can learn more about the LISA QA Model or consult with a services provider about functional and usability testing.
- Where can I find the full text of a particular localization standard?
If the standard is a LISA standard or a joint LISA/ISO standard, you can find the text free of charge on the LISA website’s standards page at http://www.lisa.org/Standards.30.0.html. In the case of OASIS standards, please visit http://www.oasis-open.org for more information. For the W3C’s ITS initiative, visit http://www.w3.org/International/its/.