In this issue…
Symantec Houdini: How Symantec addresses some of its help localisation issues
Introduction

Localising Windows Help should be a straightforward process. After all, it's simply a matter of translating the Rich Text Format (RTF) files used to build the help text, and making sure they compile without errors, isn't it? Well, it is if you have been given accurate counts for words and graphics, receive the help files when they are completely stable, and have complete confidence that your translators will be consistent in their cross-referencing (even if the US writers were not) and are unlikely to destroy any of the internal links in the help.

And then there's the real world. Help files are more likely to be delivered piecemeal, either as individual RTF files or, worse, as individual topics. Then, towards the end, the writers decide to change their browse sequences, alter the context strings and improve their keyword indexing. New topics also get added at the last minute to take account of feature creep, existing topics have their cross-referencing improved, and topics (already in translation) get deleted because the writers do some last-minute restructuring. How do you keep track of it all?

Translation memory systems may offer a solution, but there are alternatives. Symantec has developed an internal tool - Houdini - to cut down the time spent testing help and comparing it against the original files. Houdini is distributed to our current vendor base and has attracted a lot of positive comment, to the extent that we are considering releasing it to the market with the next release of Symantec C++. This article looks at the general issues associated with localising help, and how Houdini tries to address those issues.

Help Localisation Issues

Help localisation issues can be grouped into five categories: generating project statistics, building the help, ensuring the consistency of the help, maintaining the integrity of the help, and managing updates.
Generating project statistics

Estimating the word count for a help project is particularly troublesome. You need to open each RTF file in a word processor and generate a word count. A product like Norton Utilities for Windows 95 has approximately 10 help projects, using about 35 RTF files. Getting a word count for each RTF file in a project like this is slow, tedious work. So imagine the headache you'd have trying to get help writers to supply you with this information on a regular basis.

Another problem is identifying the number of bitmaps used in a project. Ideally you are working with a clean help project, where the only bitmaps in the help directory are those used by the project. However, it's not unusual for writers to leave redundant bitmaps in the directory as they build and test their help. Traditionally, one way to find out which bitmaps are used by the project is to remove all the bitmaps from the directory, compile the help file and note the errors reported for missing bitmaps. Once again, this is time-consuming and not particularly efficient, either for the publisher or the vendor.

Generally it's useful to know where topics are located within the help system, how they refer to each other and what sort of attributes are associated with each topic (such as context strings, titles, browse sequences and so on). Trying to "map" out a help system like this is difficult, but if you have such a map it is a lot easier to track down and sort out problems.

Building the Help

Building the help involves organising and capturing any new bitmaps that need to be translated, including segmented bitmaps (bitmaps containing jumps), translating various options in the help project file, and finally compiling the help project. Most of this is fairly straightforward.

Ensuring the consistency of the help

Maintaining consistency covers two areas: cross-referencing and formatting.
Checking the consistency of cross-referencing usually looks for inconsistencies between:
Checking the consistency of formatting looks for inconsistencies in:
Maintaining the integrity of the help

If you are translating a help file, particularly on a topic-by-topic basis during simultaneous translation, there's always a huge risk that your files will get out of step with the US team's. Integrity inspections try to identify:
Managing Updates

Managing updates can involve tracking weekly changes between help projects during a simultaneous ship, or identifying changes between two significant product releases. Tracking is easy if writers use tracking sheets to identify their changes, but tracking sheets that are maintained manually are difficult to keep up, especially as deadlines draw close. If you are working on a simultaneous ship, this sort of information becomes even more crucial; significantly, you need to get it on a regular basis so that you can monitor the progress of the help system. From a localisation perspective, the crucial task is to determine:
Once you've identified the changes, you'll probably find that the biggest problem is modifying the topics common to both releases: changing browse sequences, updating the keyword list, and (an absolute nightmare) modifying the topic's unique identifier (the context string) because writers renamed it in the new project. (On one Symantec project, a simple one-hour change in the US required about 96 hours of work on the translated edition.)

Symantec Houdini

Clearly, there's a lot involved in localising help. Traditionally, the main tools would have been a word processor to translate the material and the help compiler to check that everything worked. However, to check consistency between jumps and to find formatting issues, you would have had to work your way through the file manually. This is time-consuming and, depending on the size of the help file and the type of checks you want to make, can take at least 5 days.

Symantec developed Houdini to try to address many of these problems. The tool works on the project file (the HPJ) and the source files (the RTF files). The first version simply reported on the consistency between page and footnote titles and jumps. This feature alone reduced consistency testing to about a day, compared with the more usual 5 days of checking.

The next version introduced a statistics feature, providing figures such as topic counts, word counts, bitmap counts and so on. This version was given to the US writers so that they could keep track of their word counts without having to go through the tedious process of counting everything in a word processor. Additional reports were subsequently added to identify formatting errors and keyword issues. All Houdini reports can be saved as text files; to make the reports easy to read, the files are tab-delimited, so they can be opened in a spreadsheet.

The next edition of Houdini focused on keyword translation.
Typically, keywords are duplicated throughout the help project, but there is no easy way to translate them once and have the translation replicated across the help project. Houdini now extracts the unique keywords into a table. Translators can edit this table and have the changes updated within the RTF files. Again, this saves time and ensures consistency.

The latest edition of Houdini focuses on comparing the structure of the English help project with the translated project. The comparison reports on topics missing from the translated file, translated topics which no longer appear in the English file, hotspot differences between English and translated topics, and differences in footnotes such as browse sequences, macros, build tags and so on (footnotes which frequently get changed at the last minute).

The next edition of Houdini will focus on inserting and extracting individual topics to and from RTF files. This feature will make it easier to provide updates to translators during a simultaneous ship, rather than supplying them with complete RTF files containing half-written topics. It will also have a footnote updating feature, so that the latest English footnotes (such as browse sequences) can be applied to the translation build.

Like all tools, Houdini's main success comes from being integrated into the localisation process. US writers use the tool to obtain word counts and improve the consistency of the text before passing it on for translation; localisation uses the tool to compare the differences between US builds, so that we can keep vendors informed of progress; and vendors are given the tool so that they can check their work before returning the project to Symantec. The quality of work being returned by vendors has been raised significantly, reducing the amount of inspection work we have to do internally. We hope to release a version of Houdini with the next release of Symantec C++, sometime in the December quarter.

John Rowley
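
As an illustration only, the structural comparison described in this article (missing topics, obsolete topics, and hotspot or footnote differences) can be sketched in a few lines of Python. This is not Houdini's implementation. The `Topic` class and the `compare_projects` function are hypothetical names invented for this sketch. It assumes that each project has already been parsed from its RTF files into topics keyed by context string, and that context strings are left untranslated.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """One help topic, reduced to the attributes compared here (hypothetical model)."""
    context: str                        # context string: the topic's unique identifier
    browse: str = ""                    # browse sequence footnote
    hotspots: frozenset = frozenset()   # context strings this topic jumps to


def compare_projects(english, translated):
    """Compare two {context string: Topic} mappings and return a report dict."""
    # Topics present in English but not yet in the translation.
    missing = sorted(english.keys() - translated.keys())
    # Translated topics that no longer exist in the English project.
    obsolete = sorted(translated.keys() - english.keys())
    # For topics common to both, list which attributes differ.
    changed = {}
    for ctx in sorted(english.keys() & translated.keys()):
        en, tr = english[ctx], translated[ctx]
        diffs = []
        if en.browse != tr.browse:
            diffs.append("browse")
        if en.hotspots != tr.hotspots:
            diffs.append("hotspots")
        if diffs:
            changed[ctx] = diffs
    return {"missing": missing, "obsolete": obsolete, "changed": changed}
```

Keying both projects on the context string is what makes the comparison cheap. It is also why a renamed context string is so costly: in a comparison like this, a rename shows up as one missing topic and one obsolete topic rather than as a simple change.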