© 2010 SMP Marketing • ISSN 1420-3693 • www.localization.org

Globalizing Painlessly with Automated Content Enrichment

Marc Bookman, CEO, Sentius Corporation

If translating is too expensive, and not translating loses too many customers, don’t despair; there is an intermediate solution!


Globalization Pain

“Language is the number one obstacle of globalization.”
eMarketer

An eMarketer study says 96 percent of all e-commerce is conducted in English. English may be the common language of content on the web, but according to Global Reach, an estimated 60 percent of users on the web are non-native English speakers. That means that the information English speakers take for granted, understand and act upon quickly is an obstacle for non-native speakers.

This is particularly problematic in an era in which information is so quickly disseminated over the Internet. A company’s success depends on how quickly it can make its information understandable and usable across the extended global enterprise, so it’s no wonder that globalization is at the top of the agenda for the world’s leading CEOs.

Online Globalization

There are two commonly used approaches to making online information universally comprehensible: language translation and a new method called automated content enrichment.

Language translation has a number of benefits. Assuming a capable localization firm is employed, translation can impart local nuances to content, thereby building rapport with users in their own language. However, translation has notable drawbacks: it is not scalable and it is expensive. Given the increasing volume of documents that corporations and publishers are generating, full translation can be cost-prohibitive and simply not manageable within tight publishing cycles.

Translation can also radically alter content, especially in specialized fields such as publishing, finance, technology and medicine. Precise meaning is also at risk when using machine translation in its current form. Overwhelmed by these issues, many organizations just give up and do nothing.

Figure 1. Japan’s largest news publisher, Asahi Shimbun, uses automated content enrichment to instantly globalize its English language news from the International Herald Tribune and San Jose Mercury News.

Figure 2. Apple Computer uses automated content enrichment for globalization in multiple European languages.

Automated Content Enrichment

Pioneered by Sentius, the first automated content enrichment (ACE) solution on the market is called RichLink®. As shown in Figure 1, ACE automatically adds relevant, in-context language annotations to online documents. The additional content pops up when the user clicks on a word or phrase. For example, a reader who doesn’t understand the phrase “public assistance” in the Asahi Shimbun news article can click directly on the phrase and get a pop-up with a Japanese-language definition.

Users access the rich content without leaving the page or losing their place, thus enhancing user satisfaction and page ‘stickiness.’ ACE greatly improves the user’s ability to understand and use online documents. In addition, the documents remain available to users in their original English language form. This is critical to support business professionals who require rich, precise information. Because they are scalable, automated content enrichment solutions are also highly cost-effective.

In addition to single-language annotations, automated content enrichment can also enable multi-language annotations. Apple Computer will be using ACE to instantly globalize its sales extranet across Europe. Figure 2 is a demonstration of how it will appear. In this sample, a European reader who doesn’t understand the word “revolutionary” can click on the word and see the definition in their local language. As an added-value feature, Apple’s site will automatically determine the reader’s language at login and display the appropriate annotation for that reader.
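The language-selection step described above can be pictured with a small sketch. It is only an illustration: the `ANNOTATIONS` table, the `annotation_for` function and the two-letter language codes are assumptions made for this example, not part of RichLink or of Apple’s site.

```python
from typing import Optional

# Hypothetical multi-language annotation store: one English word mapped to
# a definition per reader language (keys are assumed two-letter codes).
ANNOTATIONS = {
    "revolutionary": {
        "fr": "révolutionnaire",
        "de": "revolutionär",
        "es": "revolucionario",
    },
}


def annotation_for(word: str, user_language: str,
                   default: Optional[str] = None) -> Optional[str]:
    """Return the annotation for `word` in the reader's language.

    `user_language` is assumed to be known from the reader's login profile.
    If no annotation exists for that word or language, `default` is returned
    and the page would simply show the original English text.
    """
    return ANNOTATIONS.get(word.lower(), {}).get(user_language, default)


# A reader whose profile language is French clicks on "revolutionary".
print(annotation_for("revolutionary", "fr"))
```

Keeping every language in one record per word means the same enriched document can serve all readers, with the choice of annotation made only at display time.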

“When Apple announces a new product, we typically have several dozen documents communicating items related to that product in the areas of sales, support, training, and marketing. By using automated content enrichment to provide native language support in pop-up annotations, we’ve dramatically reduced the lead-time, cost, and headcount issues,” said Henry Kim, Senior Sales Communication Manager with Apple.

Based on extensive feedback from non-native speaking users of English web content, Sentius recommends translating only top-line marketing messages, positioning statements and website navigation items such as menus, buttons and headings. This usually amounts to about ten percent of business content. For the remainder of the business critical content, ACE should be sufficient. See Figure 3.

Figure 3. Translating the top 10% of website content combined with enriching the remaining 90% is the most cost-effective way to globalize 100% of online content.

ACE vs. MT

Machine translation may be satisfactory for gisting or some low-end consumer use, but according to iXL, in their 1999 article Creating a Global Internet Presence, “Machine translations lack the… sophistication to provide translations effectively mirroring human communication.” Automated content enrichment avoids this problem by preserving the original document.

ACE vs. MT

Criterion                                             Automated Content Enrichment  Machine Translation
End User
  Accuracy                                            High                          Moderate
  Globalization for non-technical consumer use        Yes                           Yes
  Globalization for highly technical documents        Yes                           No
  Point of impact information delivery capabilities   Yes                           No
  Some ESL required                                   Yes                           No
Enterprise
  Maintains integrity of original documents           Yes                           No
  Brand integrity preserved                           Yes                           Not always
  Price                                               About the same                About the same
  Post processing expenses                            Minimal                       Expensive
  Special document preparation                        Not necessary                 Preparation needed

Figure 4. A comparison of automated content enrichment and machine translation as globalization solutions.

Automated content enrichment takes advantage of the combination of computer technology and human judgment. Rather than determining a particular meaning of a term in context, which machine translation frequently does erroneously, automated content enrichment pop-ups provide multiple language definitions and let the reader intelligently select the appropriate contextual definition.

While the preceding automated content enrichment examples presuppose some level of English language comprehension, this is not a significant hindrance because, according to eMarketer, “Nearly half of the population of the 15-nation member European Union can converse in a language other than their native tongue.” For most of those people, as well as their Asian counterparts, their second language is English.

Research Confirms Value of ACE

Research by IDC, CAP Ventures and Asahi Shimbun verifies ACE’s effectiveness as a globalization solution.

“RichLink’s unique ability to embed multiple layers of links in documents opens new opportunities for companies to communicate and compete more effectively in the global market,” said Joshua Duhl, Analyst with IDC in a July 2000 IDC white paper entitled Sentius: Enabling Global Understanding Through Automated Content Enrichment.

“Content enriched documents appeal to readers because they deepen understanding while maintaining context,” said Duhl. “For companies, the appeal can be far reaching, from increasing the value and use of existing content to increased reader satisfaction to significant top- or bottom-line impact.”

In an April 2001 survey of online readers, Asahi discovered that the average reader clicked on a word or phrase six times in each article—and 33% clicked over 10 times. “88% of Asahi readers say they find RichLink helpful,” said Jun Ohmae, Media Strategy Officer of Asahi Shimbun.

Most recently, in April 2001, industry analysts CAP Ventures published the white paper Improving the Productivity of Web Content: The Value of ACE (Automated Content Enrichment). The white paper analyzed preliminary research in which two groups of subjects were given a six-question test. The control group was instructed to find the answers online in a standard hyperlink-driven web page environment. The study group was instructed to find the answers on the same web page supplemented with automated content enrichment. All subjects were required to answer all questions correctly.

On average, the control group took 20.4 minutes to complete the test and the study group took 13.5 minutes to complete the test. “Test subjects who used RichLink-enabled content correctly answered the questions 34 percent faster than the control group without access to the RichLink-driven system,” said Becky Barclay, Senior Consultant with CAP Ventures. “Automated content enrichment capability enables businesses to deliver information quickly and effectively to global audiences.” See Figure 5.

Figure 5. Preliminary research shows that automated content enrichment enables online readers to increase comprehension and decrease the time needed to read foreign language documents.
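The 34 percent figure quoted from the CAP Ventures study follows directly from the two average completion times. A minimal worked check, using only the 20.4 and 13.5 minute averages reported above (the function name is ours, chosen for illustration):

```python
# Average completion times reported in the CAP Ventures study, in minutes.
control_minutes = 20.4   # standard hyperlink-driven web page
enriched_minutes = 13.5  # same page with automated content enrichment


def percent_time_saved(baseline: float, improved: float) -> float:
    """Return the reduction in time as a percentage of the baseline."""
    return (baseline - improved) / baseline * 100


saved = percent_time_saved(control_minutes, enriched_minutes)
print(f"Time saved: {saved:.0f}%")  # prints "Time saved: 34%"
```

The study group saved 6.9 of 20.4 minutes, which rounds to the 34 percent cited.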

“Automated content enrichment is about 20 times better than hot linking because it’s instant and it keeps your readers on the page,” said Aaron Goldberg, VP and Principal Analyst with Ziff-Davis Media. “It also costs much less to implement.”

Figure 6. ACE technology takes language libraries and other content and automatically globalizes websites.

ACE Core Technology

At the core of the automated content enrichment technology is an intelligent parsing and query engine that automatically interprets English language documents through intelligent natural language processing. The engine parses the document’s text and links the content to selected databases. The result of the process is an associated annotation file which contains rich layers of in-context information which pops up with a single user click. In addition to language definitions, each annotation is capable of containing an almost limitless range of multimedia data, including graphics, audio, video, hyperlinks and advertising.
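To make the parse-and-link step more concrete, here is a deliberately simplified sketch. It is not the Sentius engine, which the article describes as using natural language processing; it substitutes a plain longest-match glossary lookup, and the glossary entries, the `Annotation` record and the `annotate` function are all hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical licensed glossary: lowercase English term -> Japanese glosses.
GLOSSARY: Dict[str, List[str]] = {
    "public assistance": ["公的扶助", "生活保護"],
    "public": ["公共の"],
    "assistance": ["援助"],
}


@dataclass
class Annotation:
    start: int           # character offset where the term begins
    end: int             # offset just past the term
    term: str            # matched text as it appears in the document
    glosses: List[str]   # definitions offered to the reader in the pop-up


def annotate(text: str, glossary: Dict[str, List[str]]) -> List[Annotation]:
    """Link glossary terms found in `text` to their glosses.

    Glossary keys are assumed to be lowercase. Terms are tried longest
    first, so a phrase is preferred over a shorter term with the same prefix.
    """
    if not glossary:
        return []
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        Annotation(m.start(), m.end(), m.group(0), glossary[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


# The resulting list plays the role of the "annotation file" for a document.
for note in annotate("Many families rely on public assistance.", GLOSSARY):
    print(note.term, note.glosses)
```

Each record carries every gloss for the matched term rather than a single guess, which mirrors the idea of letting the reader pick the meaning that fits the context.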

The entire ACE process, as well as the database libraries, can be hosted with an application services provider, or can be hosted on the customer’s site. An enriched web page is only slightly larger than the original so download time is typically of no concern.

When a reader wants to view an annotation, a single click instantly displays a popup window that shows all of the annotations associated with the selected text. The reader can then view each annotation without ever leaving the original document. In the Asahi and Apple examples, this would be a language annotation provided as part of a content enrichment eGlobalization Solution. (In addition to globalization, ACE also has applications in publishing, training, and marketing.)

Because annotations are added as layers to the document, the original content remains unchanged, eliminating misinterpretations created through the translation process. As a result, ACE lets companies quickly serve diverse global audiences by increasing document understanding without changing the original content.

The additional layers of content are embedded in the web page. This means that the full database does not need to remain on the Web server, travel with the enriched document, or be downloaded by the user. And since the additional content is part of the document, the pop up information appears instantly.
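One way to picture annotations that travel inside the page is the sketch below. The markup is an assumption made for this example (a `span` with a hypothetical `ace` class and `data-note` attribute), not RichLink’s actual output format; the point is only that the pop-up text sits in the document itself, so no database lookup is needed at click time.

```python
import html
from typing import List, Tuple


def embed(text: str, annotations: List[Tuple[int, int, str]]) -> str:
    """Return HTML in which each (start, end, note) span carries its note.

    Annotations are assumed not to overlap. The class name "ace" and the
    "data-note" attribute are illustrative choices only.
    """
    parts: List[str] = []
    cursor = 0
    for start, end, note in sorted(annotations):
        # Copy the untouched text before this annotation, escaped for HTML.
        parts.append(html.escape(text[cursor:start]))
        parts.append(
            f'<span class="ace" data-note="{html.escape(note, quote=True)}">'
            f"{html.escape(text[start:end])}</span>"
        )
        cursor = end
    parts.append(html.escape(text[cursor:]))
    return "".join(parts)


# Attach a French gloss to the word "revolutionary" (characters 2 to 15).
print(embed("a revolutionary design", [(2, 15, "révolutionnaire")]))
```

A small script on the page could then read the attribute and show it in a pop-up when the reader clicks the word.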

There are a wide variety of language and industry-specific database libraries which a website may license for use with automated content enrichment applications. If you already have a corporate knowledge base, it can be automatically transformed into pop ups which enrich your documents. Your glossaries can become the basis for automatically embedded, contextually relevant information that pops up at the reader’s request. You can also create custom databases from scratch using your own relevant information, including terminology definitions, product descriptions, interactive surveys and other digital assets.

Painless ACE Solution

Combining ACE with translation gives businesses the fastest, most cost-effective way to extend the reach and impact of their online content. Taking a two-tiered approach, one should expect to increase the volume of globalized information by a factor of at least 1000 to 1—and deliver it simultaneously around the globe, instantly crossing barriers of language and expertise.

About the author

Marc Bookman, CEO of the Sentius Corporation
Marc Bookman founded the company in 1993 after spending seven years developing the electronic publishing business for Sony Corp. in both the US and Japan. Mr. Bookman is an advisor to several Silicon Valley start-ups, a guest lecturer at the Stanford Business School, a regular speaker at industry conferences, and a published author. He holds two patents, including the basic patent for RichLink®, the first scalable, automated content enrichment solution. He holds a B.S. in Accounting, and an MBA and a master’s degree in East Asian Studies from the University of Chicago.



