ISO/IEC JTC1/SC22/WG20 or Internationalization Standards for Computer Programming Languages
A number of global organizations promote standards for software internationalization, including the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC), the World Wide Web Consortium, the Unicode Consortium, and the Free Standards Group Linux Internationalization Initiative. In this article we look at the activities of one group, ISO/IEC JTC1/SC22/WG20 Internationalization, which has been preparing software internationalization technical reports and standards for more than a decade.
The long sequence of acronyms in the name ISO/IEC JTC1/SC22/WG20 describes the position of WG 20 within the hierarchy of ISO/IEC technical committees. ISO and IEC are global organizations that prepare and publish international standards. The two groups collaborate in areas of mutual interest, and about fifteen years ago they established a Joint Technical Committee called JTC 1, which works on Information Technology standards. SC 22 is a subcommittee of JTC 1 that focuses on programming languages, their environments, and system software interfaces. WG 20 is a working group within JTC1/SC22 that develops internationalization standards targeted at the programming language standards community. The key technical reports and standards prepared by WG 20 are discussed below.

Programming Languages

Historically, software internationalization solutions for different platforms and applications were developed on an ad hoc basis and there were few common objectives. WG 20's initial project was to establish an organized approach to internationalization. The work resulted in the technical report ISO/IEC TR 11017, which provides a framework for internationalization of programming languages and applications. The report describes software internationalization requirements, lists conventions that differ by culture, and proposes a uniform approach to supporting a wide range of cultural conventions in an internationalized application. TR 11017 does not present specific technical solutions; it was intended to set the direction for software internationalization and to serve as a reference for future internationalization standards.

During the last several decades international standards have been developed for a number of computer programming languages including COBOL, Fortran, C, C++, and others. While each programming language is unique, many common features are shared among languages.
ISO/IEC TR 10176 is a technical report that provides guidelines for the preparation of programming language standards. The objective of the report is to gather past design experience into a set of guidelines that can be used to simplify the development of new programming language standards and promote standardization across languages. WG 20 updated ISO/IEC TR 10176 to include guidelines on how programming languages should handle internationalization issues related to character sets, character data, and character processing. Special consideration was given to the extended repertoire of characters that are allowed for identifiers in programming languages. Recent editions of the technical report have included updates to align the table of characters with the repertoire of the Unicode character set (ISO/IEC 10646).

Sorting

As information technology expands globally, it is important for software applications to be able to sort multilingual text data in a culturally correct manner. While the conventions for sorting character strings in a single language and script may be well known, the rules for ordering text in all the world's scripts are not widely understood by information technology companies. ISO/IEC 14651 is an international standard for string ordering and comparison. It provides a method for collating multilingual text data that can be used to produce culturally appropriate results for a given language while retaining a reasonable sort order for other scripts. The algorithm uses multiple levels to properly handle characters from different scripts, upper/lower case characters, diacritics, and special symbols. The standard includes a Common Template Table, which describes a default ordering for all the characters in the Unicode character set (ISO/IEC 10646). Portions of the table can be tailored to meet the requirements of a given language and country, and the default ordering can be used for all the other scripts that are in Unicode.
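The idea of comparing strings on multiple levels can be illustrated with a small Python sketch. It is not the ISO/IEC 14651 algorithm and does not use the Common Template Table: the function names are hypothetical, and the weights are a crude stand-in derived from Unicode decomposition (base letter first, then diacritics, then case). It only shows why a difference in accent or case is considered after all base letters have been compared.

```python
import unicodedata


def multilevel_key(text):
    """Build a three-level sort key (illustrative weights only).

    Level 1: base letters, ignoring case and diacritics.
    Level 2: diacritics (combining marks).
    Level 3: upper/lower case.
    These weights are an assumption for this sketch; a real
    implementation would take them from a collation table.
    """
    primary, secondary, tertiary = [], [], []
    # NFD splits an accented letter into base letter + combining marks.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            secondary.append(ord(ch))   # a diacritic: level 2 only
        else:
            primary.append(ch.lower())  # base letter: level 1
            secondary.append(0)         # "no diacritic" placeholder
            tertiary.append(1 if ch.isupper() else 0)
    # Tuples compare element by element, so level 2 is consulted only
    # when level 1 is equal, and level 3 only when levels 1 and 2 are.
    return (tuple(primary), tuple(secondary), tuple(tertiary))


def multilevel_sorted(words):
    return sorted(words, key=multilevel_key)


words = ["résumé", "Resume", "rope", "resume", "Résumé"]
print(multilevel_sorted(words))
```

With this key, the four spellings of "resume" stay together ahead of "rope", and within that group the unaccented forms precede the accented ones and lower case precedes upper case. A plain code point sort would instead separate the capitalized and accented forms from the rest.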
One of WG 20's recent efforts has been to align the Common Template Table with the quickly growing repertoire of characters in ISO/IEC 10646. An additional design objective for ISO/IEC 14651 is to ensure that the resulting sort order matches the order from other sort algorithms, such as the Unicode Collation Algorithm, ISO/IEC 12199, and the European Ordering Rules CEN/ENV 13710 EOR.

Cultural Conventions

One of the key goals of software internationalization is to create programs that are culturally neutral and can adjust their behavior based on the language and cultural conventions of the user. Cultural conventions include things like sort order, character classification, time/date format, number format, monetary format, postal address format, and telephone number format. ISO/IEC TR 14652 is a technical report that describes a formal method for specifying cultural conventions. It is based on earlier POSIX locale standards and includes enhancements to the earlier model. Participants in WG 20 could not agree on many portions of ISO/IEC TR 14652, so it was changed from a proposed standard to a technical report. Due to the large number of controversial sections and the fact that TR 14652 is a technical report rather than a standard, it appears that it will not be widely implemented.

Once cultural conventions have been proposed for a given locale they need to be reviewed, identified, and made available to information technology companies. ISO/IEC 15897 is an international standard that defines registration procedures for cultural conventions. It started as a European standard, was fast-tracked to become the JTC 1 standard ISO/IEC 15897, and has been assigned to WG 20 for maintenance. The standard allows registration of cultural elements in narrative text or machine-readable formats.
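To make the notion of a machine-readable set of cultural conventions more concrete, here is a minimal Python sketch. It does not use the TR 14652 notation or the ISO/IEC 15897 registry format. The `Conventions` class, its field names, and the two sample entries are assumptions made for illustration. The point is only that a program can stay culturally neutral by reading its number and date formats from data rather than hard-coding them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Conventions:
    """A hypothetical record of a few cultural conventions."""
    decimal_point: str
    thousands_sep: str
    date_format: str  # a strftime pattern


# Illustrative entries only; not taken from any registry.
US_ENGLISH = Conventions(".", ",", "%m/%d/%Y")
GERMAN = Conventions(",", ".", "%d.%m.%Y")


def format_number(value, conv):
    """Format a number with two decimals using the given conventions."""
    text = f"{value:,.2f}"  # neutral form, e.g. 1,234,567.89
    # Swap the separators through a placeholder so that the two
    # replacements do not interfere with each other.
    text = text.replace(",", "\x00")
    text = text.replace(".", conv.decimal_point)
    return text.replace("\x00", conv.thousands_sep)


def format_date(day, conv):
    return day.strftime(conv.date_format)


for conv in (US_ENGLISH, GERMAN):
    print(format_number(1234567.891, conv), format_date(date(2008, 12, 8), conv))
```

The same calls produce `1,234,567.89` and `12/08/2008` for the first entry and `1.234.567,89` and `08.12.2008` for the second. The application code does not change; only the data does.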
The objective is to create a global registry of cultural conventions that will make it easier for information technology companies to access the information and use it to adapt their products and services for different languages and countries. WG 20 is currently updating the standard to improve the registration and review process and ensure the correctness and quality of registered data.

Challenges

Internationalization continues to be an important market requirement for software applications. While WG 20 has made some good contributions over the years, it faces a number of challenges. The group suffers from a shortage of resources, lack of consensus, and little interest in new projects. As information technology companies reduce expenses they are less willing to allow their employees to devote time to standards activities and to travel internationally to attend meetings. About four national bodies regularly participate in WG 20 and 5-8 people attend the meetings, which is far less than the participation in other ISO groups such as the one working on character set standards. As new technologies have emerged, interests have shifted, and much of the internationalization activity today is driven by the World Wide Web Consortium, Unicode, Linux, Java, and Microsoft.

Acknowledgement

The author would like to thank Arnold F. Winkler (Unisys), convener of WG 20, for informative discussions about international standards and the activities of WG 20.

David is an independent internationalization consultant. He has an academic background in Computer Science and twenty years of experience developing software in the US and Japan. David managed internationalization and localization at two software companies and has developed many multi-tier web, wireless, and desktop products for Europe, the Middle East, and Asia. He can be reached at david.fine2@verizon.net.