The three-day conference will feature a full day of tutorials, followed by two days of presentations, panels and discussions. There will also be technology exhibits and demonstrations. Sessions will cover a range of topics including internationalization, globalization, the web, security and localization, to name a few. There will be a mix of case studies, panel discussions and technical discussions geared towards beginner, intermediate and advanced practitioners. The full conference program is available at http://www.unicodeconference.org/agenda.
We interviewed Mark Davis, President of the Unicode Consortium, about what’s new at IUC 29. Editor’s Note: Check out Globalization: Resistance Is Futile, a recent presentation by Mark Davis.
Why should someone who has never before attended an IUC conference come to IUC 29?
Mark Davis: A lot of companies have software products that need to be globalized; they just don't know the best way to go about it. For example, they may be expanding into China, so they have an immediate need to enable their English-only application to work in Chinese. IUC 29 has excellent tutorials and presentations, including case studies, to get people started on a successful path to globalization.
Why should past attendees go to IUC 29?
Davis: The program has been significantly revamped to make it more valuable to participants. We’ve restructured the format to offer longer and more detailed presentations that allow for a greater depth of coverage. There’s an increased focus on tutorials, and we have great keynote presenters lined up (see below). We are also going to have new birds-of-a-feather discussions, in addition to the great networking opportunities that have always been available. We’re really excited about the program we've put together because we know that attendees will receive a lot of value.
IUC 29 will be the first conference to cover the new Unicode 5.0 standard; what’s significant about it?
Davis: Unicode has now become pervasive in the world of software. As products and protocols (such as internationalized domain names) are extended to handle it, serious issues are raised – such as security. It is vital for developers to understand these issues to avoid making serious mistakes. Unicode 5.0 and the other globalization standards available through the Unicode Consortium are being updated to address these issues. The conference gives people the opportunity to hear about the latest developments in these standards, and the most important issues facing software developers when handling globalization.
Of course, new characters are being added to Unicode 5.0. It also includes enhancements and corrections of data and specifications that affect the use of older characters.
New Standards, New Products, Hot Topics
As mentioned above, IUC 29 will be the first conference to cover the content of Unicode 5.0, which supersedes all previous versions of Unicode and is synchronized with the latest version of ISO 10646. It will also be the first to cover LDML 1.4 (the Locale Data Markup Language, used for interchanging locale data) and CLDR 1.4 (the latest version of the Common Locale Data Repository), along with the latest information on Unicode security. The conference includes presentations on Microsoft Office 12 and Windows Vista, several talks on ICU topics, and an overview and case study on CLDR.
The conference features keynote presentations by Tuoc Luong of Ask Jeeves, Inc. on Going Global with a Search Engine; Col. Daniel Scott of the Defense Language Institute Foreign Language Center on Unicode as a ‘Unifying Force’ in Language Education; and Charles Bigelow of Bigelow & Holmes Inc. on The Effect of Unicode on Type Design. Presenters represent Apple Computer, Ask Jeeves, ASMUS, Basis Technology, Google, IBM, Microsoft, ModernGigabyte, Nagaoka University of Technology, Oracle, PayPal, Quest Software, Sun Microsystems, Tavultesoft Pty. Ltd., U.C. Berkeley, VeriSign, Inc., W3C and Yahoo!
Tutorials: Get Ready for Tomorrow’s Internationalization Challenges
Tutorials at IUC 29 are aimed at people with multiple levels of technical knowledge. These sessions will provide a deep understanding of the issues and technologies involved in internationalization, globalization and localization, helping to prepare you for future challenges. Details are provided below.
Unicode 5.0 Tutorial: Part 1 - Characters in Action
Part I of the Unicode 5.0 Tutorial is a uniquely accessible and entertaining way of visualizing the core concepts of the Unicode standard. In this part you will find answers to these questions: What is a Unicode character and how are Unicode characters represented and used in a modern computing environment? How are Unicode characters entered into and displayed on a computer? How are Unicode characters interchanged? What is the interaction between Unicode and rich text (markup)? How do end users experience Unicode? With the help of concrete scenarios for the use of Unicode characters, you will become familiar with the role the Unicode Standard plays and the benefits of supporting it.
Unicode 5.0 Tutorial: Part 2 - Fundamental Specifications
Part II of the Unicode 5.0 Tutorial systematically presents the details of the fundamental specifications that are part of the Unicode Standard. Topics include: organization of the Unicode code space; principles used to allocate and unify characters; encoding forms, including the definitions of UTF-8, UTF-16 and UTF-32 and when to use each; how to use the byte order mark; how to combine characters and handle equivalent code sequences; format characters and other special characters and code points; and organization of the Unicode Standard.
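As a rough illustration of three of the topics listed above (encoding forms, the byte order mark, and equivalent code sequences), the following Python sketch may help. It is not taken from the tutorial; the sample string and the helper name encoded_lengths are assumptions chosen for this example, and it uses only Python's standard codecs and unicodedata modules.

```python
import codecs
import unicodedata

# Sample text chosen for illustration: A, é, 中 and 𝄞 (U+1D11E, outside the BMP).
sample = "A\u00e9\u4e2d\U0001d11e"


def encoded_lengths(text):
    """Return the number of bytes `text` occupies in each encoding form."""
    return {
        "UTF-8": len(text.encode("utf-8")),
        "UTF-16": len(text.encode("utf-16-le")),  # explicit byte order, so no BOM
        "UTF-32": len(text.encode("utf-32-le")),  # explicit byte order, so no BOM
    }


lengths = encoded_lengths(sample)

# Python's generic "utf-16" codec writes a byte order mark in native byte order.
with_bom = "A".encode("utf-16")
has_bom = with_bom.startswith(codecs.BOM_UTF16)

# The "utf-8-sig" codec strips a leading UTF-8 signature when decoding.
decoded = (codecs.BOM_UTF8 + b"A").decode("utf-8-sig")

# Precomposed é and e + COMBINING ACUTE ACCENT are canonically equivalent:
# they differ as code point sequences but normalize to the same string.
precomposed = "\u00e9"
decomposed = "e\u0301"
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

print(lengths, has_bom, repr(decoded), same_after_nfc)
```

The byte counts show why no single encoding form is the most compact for every text, and the last comparison shows why strings are normally normalized before they are compared.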
Unicode 5.0 Tutorial: Part 3 - Unicode Algorithms
The Unicode Standard and related specifications by the Unicode Consortium specify a number of algorithms that depend on Unicode Character Properties. Part III of the Unicode 5.0 Tutorial surveys the algorithms specified in the Unicode Standard, and extends the discussion of Unicode character properties as they relate to each algorithm.
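To make the link between character properties and algorithms concrete, here is a small Python sketch. It is not part of the tutorial material; the helper name describe and the sample characters are assumptions for illustration. It reads a few properties from Python's unicodedata module, then shows canonical ordering (one step of normalization), which sorts adjacent combining marks by their canonical combining class.

```python
import unicodedata


def describe(ch):
    """Collect a few Unicode character properties for a single character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),          # General_Category
        "bidi_class": unicodedata.bidirectional(ch),   # consulted by the bidirectional algorithm
        "combining_class": unicodedata.combining(ch),  # consulted by normalization
    }


alef = describe("\u05d0")       # HEBREW LETTER ALEF
acute = describe("\u0301")      # COMBINING ACUTE ACCENT
dot_below = describe("\u0323")  # COMBINING DOT BELOW

# The marks are typed in the order acute (class 230), dot below (class 220).
# Canonical ordering during NFD puts the lower combining class first.
unordered = "a\u0301\u0323"
ordered = unicodedata.normalize("NFD", unordered)

for props in (alef, acute, dot_below):
    print(props)
print([hex(ord(c)) for c in ordered])
```

Because the reordering is driven entirely by the combining class values, two strings that differ only in the order of such marks compare equal once both are normalized.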
Internationalization: An Introduction
What is internationalization? What do developers, product managers and quality engineers need to know about it? How does a software development organization incorporate internationalization into the design, implementation and delivery of an application? Attendees will be introduced to the overall concepts and approach necessary to analyze a product for internationalization issues, how to develop a design/approach, and how to deliver a global-ready solution. The focus is on architectural approaches and general concepts, but will include specific examples and exercises. Some of the topics covered will include: character encodings and Unicode; processing text in different languages; preparing for the localization (translation) of user interfaces; making applications “locale-aware,” including format and display differences; as well as approaches to delivering multilingual and multi-locale software or content.
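The idea of making an application "locale-aware" can be sketched in a few lines of Python. Everything in this example is hypothetical: the RESOURCES table, its two locales, and the helper functions are invented for illustration and are not drawn from the tutorial. A real product would more likely take its patterns from a locale data source such as CLDR, through a library such as ICU, instead of a hand-written table.

```python
from datetime import date

# Hypothetical per-locale resources, kept outside the program logic so that
# translators can change them without touching code.
RESOURCES = {
    "en-US": {
        "greeting": "Hello, {name}!",
        "date": "{m:02d}/{d:02d}/{y}",
        "decimal": ".",
        "group": ",",
    },
    "de-DE": {
        "greeting": "Hallo, {name}!",
        "date": "{d:02d}.{m:02d}.{y}",
        "decimal": ",",
        "group": ".",
    },
}


def greet(name, locale):
    """Look up the translated message and fill in the name."""
    return RESOURCES[locale]["greeting"].format(name=name)


def format_date(value, locale):
    """Format a date using the locale's field order and separators."""
    return RESOURCES[locale]["date"].format(d=value.day, m=value.month, y=value.year)


def format_number(value, locale):
    """Format a number with two decimals and locale-specific separators."""
    res = RESOURCES[locale]
    text = f"{value:,.2f}"  # e.g. "1,234,567.89" with fixed separators
    # Swap separators through a placeholder so they do not overwrite each other.
    return (
        text.replace(",", "\x00")
        .replace(".", res["decimal"])
        .replace("\x00", res["group"])
    )


for loc in ("en-US", "de-DE"):
    print(greet("Anna", loc), format_date(date(2006, 3, 6), loc), format_number(1234567.891, loc))
```

The program logic never mentions a language or a separator directly, so supporting another locale means adding one more entry to the table.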
Web Internationalization - Standards and Best Practices
This tutorial is an introduction to internationalization on the World Wide Web. The audience will learn about the standards that provide for global interoperability and how to work with multilingual data on the web. Character representation and the Unicode-based Reference Processing Model are described in detail. HTML, XHTML, XML (eXtensible Markup Language; for general markup), and CSS (Cascading Style Sheets; for styling information) are given particular emphasis. The tutorial addresses language identification and selection, character encoding models and negotiation, text presentation features, and more. The design and implementation of multilingual web sites and localization considerations are also introduced.
An Introduction to Writing Systems & Unicode
This tutorial provides a good understanding of the many unique characteristics of non-Latin writing systems, and illustrates the problems involved in implementing such scripts in products. It does not provide detailed coding advice, but does provide the essential background information needed to understand the fundamental issues related to Unicode deployment, across a wide range of scripts. The tutorial goes beyond encoding issues to discuss characteristics related to the input of ideographs, combining characters, context-dependent shape variation, text direction, vowel signs, ligatures, punctuation, wrapping and editing, font issues, sorting and indexing, keyboards, and more. The concepts are introduced through the use of examples from Chinese, Japanese, Korean, Arabic, Hebrew, Thai, Hindi/Tamil, Russian and Greek. While the tutorial is perfectly accessible to beginners and has proven to be an excellent orientation for newcomers to the conference (often providing the background needed for understanding the other presentations!), it has also attracted very good reviews from people at an intermediate and advanced level, due to the breadth of scripts discussed. No prior knowledge is required.
The Dao of Unihan
Over half of the characters in the Unicode Standard are ideographs. This ideographic repertoire, termed Unihan, is intended to provide complete coverage for all the characters in current or past use in all varieties of Chinese, Japanese, Korean and Vietnamese. This presentation provides an overview of the structure of the current repertoire of Unihan and its organization, along with a discussion of some practical implementation issues and how to deal with them. We will also provide an overview of the Unihan database, a large body of normative and informative data, which is maintained by the Unicode Consortium and included among the data files that are a part of each release of the standard.
Internationalization Features in XPath, XQuery and XSLT
In recent years, the W3C has worked on 17 (!) documents that deal with the XML query language XQuery and the transformation language for XML documents, XSLT 2.0, henceforth referred to as QT. The newest QT working drafts include several features for Unicode processing and for general software internationalization. This tutorial discusses the use of QT technology in the context of global (web) application development. Since XQuery is built on top of the features available in XPath, the tutorial starts with an introduction to XPath 2.0. We introduce the basics of XQuery to the audience by showing how to use the FLWOR expression to query an XML document. We then focus on the regular expression and collation facilities in XPath/XQuery. We will also briefly describe some current work and open issues in the development of the XQuery Full Text extension. We describe the common properties of XSLT 2.0 and XQuery 1.0, focusing on XML Schema datatypes for dates and time zones. XSLT-specific features for internationalization are also covered.
Advanced ICU Topics
ICU is a mature, widely used set of C/C++ and Java libraries for Unicode support and software internationalization and globalization. It grew out of the JDK 1.1 internationalization APIs (which the ICU team contributed) and continues to be developed for the most advanced Unicode/internationalization support. ICU is widely portable and enables applications to produce the same results on all platforms and between C/C++ and Java software. This tutorial walks the audience through the core concepts of using the ICU library (character conversion, collation, message formatting and text boundary analysis) through the presentation of an internationalization task. The tutorial walks through code snippets to solve this task, followed by demonstration applications and a discussion of core features and conventions, advanced techniques and how to obtain further information.
The early-bird registration discount is available until February 1, 2006. To register, visit http://www.unicodeconference.org/registration.htm. LISA Members qualify for a US$ 250 discount, so they should register at http://www.unicodeconference.org/lisa. For information on how to exhibit, contact Sandy Burke at sandy@omg.org or +1-781-444 0404. For sponsorship opportunities, contact Nicole Rikkinen at nicole@omg.org. For all other conference-related questions, contact Kevin Loughry at loughry@omg.org or +1-781-444 0404.
The conference is produced by The Object Management Group™ (OMG™) and is sponsored by: Gold Sponsor Translations.com; Silver Sponsors IBM Corporation, Basis Technology; Media Sponsors MultiLingual Computing Inc., Localisation Research Centre; and Organizational Sponsors LISA and GALA.
About The Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. For more information, visit http://www.unicode.org/.
About Object Management Group
The Object Management Group™ (OMG™) is the new Event Producer for the Internationalization & Unicode Conferences. The OMG is an open membership, not-for-profit consortium that produces and maintains computer industry specifications for interoperable enterprise applications. Our specifications include MDA®, UML®, CORBA®, MOF™, XMI® and CWM™. OMG’s specifications are all available for download without charge. For more information, visit http://www.omg.org.