© 2008 SMP Marketing • ISSN 1420-3693 • www.localization.org

Some Issues Associated with Handling Double-Byte Character Sets

Jan Pfefferkorn, Director, Language Automation Inc.

Part of the challenge of localizing for a writing system depends on whether its characters are encoded as single-byte or multi-byte characters. One question to consider is whether the target uses the single-byte (8-bit) character format typical of the ASCII-based environments of European and English-speaking locales, or a multi-byte format, in which a single character, European or Asian, is represented by one, two, or more bytes in a code set.
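The difference between single-byte and multi-byte formats can be seen by counting the bytes each character occupies once encoded. The following is a minimal sketch, not taken from the article: the helper name `byte_lengths` and the choice of encodings (Latin-1, Shift_JIS, UTF-8) are illustrative assumptions.

```python
def byte_lengths(text, encoding):
    """Return the number of bytes each character of `text` occupies in `encoding`.

    Hypothetical helper for illustration. Encoding one character at a time
    is only meaningful for stateless encodings without a byte-order mark,
    such as the ones used below.
    """
    return [len(ch.encode(encoding)) for ch in text]


# Single-byte code set: every character is exactly one byte.
print(byte_lengths("Aé", "latin-1"))    # [1, 1]

# Double-byte code set: ASCII stays one byte, hiragana "あ" takes two.
print(byte_lengths("Aあ", "shift_jis"))  # [1, 2]

# Multi-byte code set: one, two, or three bytes for these characters.
print(byte_lengths("Aéあ", "utf-8"))     # [1, 2, 3]
```

One practical consequence is that the byte length of a string and its character count can differ, so code that assumes one byte per character may split a character in the middle.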

To further complicate matters, writing systems differ in complexity. A writing system may be considered "simple," e.g., Roman, Greek, or Cyrillic. Alternatively, it might be large but "non-complex": the Japanese, PRC Chinese, Taiwanese, and Korean systems, with approximately 6,900, 7,500, 13,800, and 8,200 official characters, respectively, fall into this class. Hebrew, Arabic, and Southeast Asian languages are considered complex because, although their character sets may be relatively small, their writing systems are bidirectional and/or contextual. For instance, although Hebrew is written right to left, numbers are written left to right, so a sentence containing numbers has script running in both directions.
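The mixed directionality of Hebrew text with numbers can be made visible by looking at the bidirectional class that Unicode assigns to each character. The sketch below is an illustration only: the sample string and the helper name `bidi_runs` are assumptions, and the code reports character classes rather than implementing the full Unicode Bidirectional Algorithm.

```python
import unicodedata
from itertools import groupby


def bidi_runs(text):
    """Group `text` into runs of consecutive characters sharing a bidi class.

    Hypothetical helper for illustration. Returns (class, run_length) pairs,
    where "R" is a right-to-left letter, "EN" a European number
    (laid out left to right), and "WS" whitespace.
    """
    classes = (unicodedata.bidirectional(ch) for ch in text)
    return [(cls, len(list(group))) for cls, group in groupby(classes)]


# Three Hebrew letters, a space, then the number 2008.
sample = "\u05e9\u05e0\u05ea 2008"

print(bidi_runs(sample))  # [('R', 3), ('WS', 1), ('EN', 4)]
```

The string is stored in logical (reading) order; a rendering engine uses these classes to lay out the Hebrew letters right to left while keeping the digits in left-to-right order.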
