|
In this issue…
Customizing Your Applications for China: General Considerations
Many of us who are hands-on in the language industry will be familiar with what Xiao Hui Zhu has to say about customizing software applications for various Chinese-speaking markets. However, if you find yourself putting together a business case for upper managers who do not deal with these issues on a regular basis, then Zhu provides in one place all of the general data that you will need to present. (If you are a LISA Member, you can also access details on various business models for Chinese web sites in Zhu’s presentation from the LISA Forum Asia: China Focus at http://www.lisa.org/archive/forums/2006shanghai/presentations.html.)
A Bit of History
The Chinese writing system was developed more than 4,000 years ago and consists of individual characters called ideograms. The written form for all spoken Chinese dialects is now standardized. There are more than 80,000 different characters, with literacy considered to be reached with the knowledge of 1,000 characters.

Mandarin is the most widespread form of Chinese and is regarded as the modern standard for the language. It serves as the official language of the People’s Republic of China (PRC) and Taiwan, and is one of the official languages of the United Nations. About 70% of China’s population speaks Mandarin, with around 800 million people in central and northern China using it as their first language and another 100 million speaking it as their second language.

After the establishment of the PRC in 1949, the government made a concerted effort to simplify the written forms, creating what is commonly known as Simplified Chinese. In the process of simplification, some characters were left unchanged since they were already quite simple. Others were changed slightly and some significantly. Taiwan and the Chinese Special Administrative Region of Hong Kong, however, continue to maintain the traditional forms.

In the 1950s, Pinyin was developed in the PRC. It is an alphabet based on Roman letters that serves as a phonetic transcription of Chinese characters rather than a replacement for them. Alphabetic writing required a standardized spoken language, but Mandarin had several dialects, which formed a big obstacle to the development of the Pinyin standard. The Chinese government has made a great effort to standardize the pronunciation of Mandarin, with the Beijing dialect selected due to its popularity.

Chinese National Standard
GB stands for Guojia Biaozhun, or National Standard, and is defined and maintained by the Standardization Administration of China. The first version was created in 1980 and is known as GB-2312-80.
At that time, the standard included just over 6,000 Chinese characters plus hiragana, katakana and Cyrillic characters. Under this definition, both the first and second bytes were in the range 0xA1 - 0xFE, giving it a total of 8,836 code points (94 × 94). Many of the mini-computers at the time treated the code points between 0x80 and 0x9F as control characters equivalent to the code points between 0x00 and 0x1F, so this code page (like the Japanese and Korean standards at the time) avoided those values.

However, the 6,000 characters proved to be insufficient, so a new standard was announced in 1994. Commonly known as GBK, it contains over 20,000 characters. It is a backward-compatible superset of GB-2312-80, so all characters in GB-2312-80 keep their original code points. The new characters included more simplified characters, all the traditional characters in the Big-5 character set (see below) that were not already in GB-2312-80, along with some characters unique to Hong Kong. To have sufficient code points available for all the characters, GBK changed two basic rules of the GB-2312-80 encoding scheme. The values between 0x81 and 0xA0 were added to the possible values of the first byte. The second byte could now take the values between 0x40 and 0x7E, plus the values between 0x80 and 0xFE.

A third standard, known as GB-18030, was announced in March 2000. It was designed to be backward compatible with GBK and, at the same time, to support all the characters defined in Unicode. All of the code points that were not defined in GBK were defined in GB-18030. This code page standard is important for the software industry because China has mandated that any software application released for the PRC market after September 1, 2001 must support GB-18030.

Big-5 (or Big5) is the standard for encoding Traditional Chinese characters and is used mostly in Taiwan and Hong Kong. The name Big-5 refers to the five major PC manufacturers in Taiwan that worked together to define the standard.
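As a rough illustration of how the GB-2312-80, GBK and GB-18030 encodings described above relate to one another, the following Python sketch encodes a few sample characters with the codecs that ship in Python's standard library (`gb2312`, `gbk`, `gb18030`). The use of Python, the sample characters and the helper names `encode_or_none` and `describe` are assumptions made for this example; the article itself does not prescribe any particular tool.

```python
# Sketch only: compare how sample characters are encoded under the three
# GB code pages, using Python's built-in codecs. The samples and helper
# names are illustrative choices, not part of any standard.

SAMPLES = {
    "\u4e2d": "common character, present in GB-2312-80",
    "\u4e02": "character added in GBK, absent from GB-2312-80",
    "\U00010000": "character outside the Basic Multilingual Plane",
}
CODECS = ["gb2312", "gbk", "gb18030"]


def encode_or_none(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


def describe(text):
    """Map each codec name to a hex dump of the encoded bytes (or None)."""
    result = {}
    for codec in CODECS:
        data = encode_or_none(text, codec)
        result[codec] = data.hex(" ") if data is not None else None
    return result


if __name__ == "__main__":
    for ch, note in SAMPLES.items():
        # Print the code point rather than the character itself so the
        # output does not depend on the console encoding.
        print(f"U+{ord(ch):04X} ({note})")
        for codec, hexed in describe(ch).items():
            print(f"  {codec:8} -> {hexed or 'not encodable'}")
```

Run as a script, this should show the first sample encoding to the same two high bytes under all three codecs, the second sample being rejected by `gb2312` but accepted by `gbk` with a first byte below 0xA0, and the third sample being encodable only in `gb18030`, where it takes four bytes.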
Just over 13,000 characters are defined in the Big-5 standard. Although both Taiwan and Hong Kong use the Traditional Chinese writing system, they do not speak the same language. Mandarin is the common language of Taiwan, while Cantonese is the common language of Hong Kong. Due to differences between Mandarin and Cantonese, the Big-5 character set does not fully meet the needs of Hong Kong. The government of Hong Kong published an extension to Big-5 in 1994 and called it the Government Common Character Set (GCCS). In 1999, it revised the GCCS and renamed it the Hong Kong Supplementary Character Set (HKSCS). The latest revision of HKSCS was published in 2001, with 4,818 characters. Since it is a "supplementary" character set, Hong Kong requires the Big-5 characters plus HKSCS-2001.

Chinese Cultural Nuances
Locale data (generally including language and country/region) specifies cultural preferences for a given group of people. Cultural differences usually show up in time/date format, calendar format, address format, currency format, units of measurement, etc. Chinese-related locales include (1) Simplified Chinese for the PRC and Singapore and (2) Traditional Chinese for Taiwan and Hong Kong. Locales can be customized, so new ones may appear in the future to define other cultural preferences based on existing locales. For example, Simplified Chinese for Canada may prove to be a good locale setting for new immigrants from the PRC, since they can integrate their original cultural formats (such as date/time format) with certain cultural norms of their new country (such as the Canadian address format).

Chinese Translation Requirements
There are also some translation-related government requirements (laws) in the PRC that you must be aware of in order to be successful in that market. For example, the PRC is very sensitive to any references to Taiwan, Hong Kong or Macao that suggest that these are countries.
It is also sensitive to the use of the term “Republic of China” or “R.O.C.” in any references to Taiwan. The term “Taiwan” is the only acceptable form to be used in the PRC. Hong Kong must be referred to as “Hong Kong S.A.R. of the PRC”. Macao must be referred to as “Macao S.A.R. of the PRC.” (“S.A.R.” stands for “Special Administrative Region.”)

According to Chinese law, all of the following types of product content must be available in Simplified Chinese for the PRC: (1) product packaging, including product description and manuals, (2) product labeling, (3) safety and warning notices and (4) product specifications, grades and content.

Taiwanese law also requires the following content to be written in Traditional Chinese for use in Taiwan: (1) product name (product names can be in English when the English name is registered with a trademark), version and languages, (2) system requirements, (3) product features, usage and content, (4) safety and warning notices and (5) information about production and the manufacturer.

Editor’s Note: Here are links to additional relevant information on doing business in China and on Chinese localization:
http://www.lisa.org/globalizationinsider/2006/04/lisa_forum_asia.html
http://www.lisa.org/globalizationinsider/2006/03/china_to_build.html

Xiao (Catherine) Hui Zhu joined IBM China eleven years ago. She has always focused on globalization and has performed many different roles, including Globalization Tester, Globalization Architect, Project Manager, Consultant and Technical Writer. Zhu is now an Advisory Software Engineer at IBM’s China Software Development Lab and can be reached at zhuxiaoh@cn.ibm.com. |
||