Term Link 1.0 Specification

Initial Draft 0.1.2, 4 December 2007

This version has been updated to reflect the change in name from TBX Link to Term Link, but in other respects it is identical to version 0.1.1.

This version:
http://www.www.lisa.org/standards/tbxlink/tbxlink.html
Editors:
Alan K. Melby <akm@byu.edu>
Andrzej Zydroń <azydron@xml-intl.com>
Copyright © The Localization Industry Standards Association [LISA] 2007. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to LISA.

The limited permissions granted above are perpetual and will not be revoked by LISA or its successors or assigns.


Abstract

This document defines the LISA Term Link Specification. The purpose of this vocabulary is to define a link between a term that is embedded in an XML document and its entry in a corresponding TermBase eXchange (TBX) format document or repository.

Status of this Document

This document constitutes an initial draft for discussion.

This document and the information contained herein is provided on an "AS IS" basis and LISA DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Table of Contents

1. Introduction
2. Key Concepts
2.1. TBX Document
2.2. Termbase
3. General Structure
3.1. The Main Term Link Element
3.2. The Term Element
4. Detailed Specifications
4.1. Term Link Namespace Declaration
4.2. Elements
4.2.1. Term Link
4.2.2. Term
4.3. Attributes
4.3.1. Term Link Attributes

Appendices

A. Term Link Tree Structure
B. Term Link Document Type Definition and Schema
C. References
D. Glossary

1. Introduction

Term Link is a namespace-based XML notation that enables specifically identified terms within an XML document to be linked to a specific TermBase eXchange (TBX) format XML document.

The purpose of the Term Link specification is to provide a rigorous notation for linking embedded terms in an XML document to their entries in a TBX document or a TBX database repository.

2. Key Concepts

The use of Term Link is predicated upon the existence of a TBX Document or database repository that contains the TBX term entries that are being linked to. Term Link allows individual terms to be linked to such a repository.

2.1. TBX Document

The TBX document is the object that contains the term entries that are linked to by the 'termid' attributes of individual <term> elements.

2.2. Termbase

The Termbase identifies the main TBX document or repository that the individual terms link to. It allows a Term Link compliant application to resolve the actual identifier and location of the TBX dataset.
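The resolution rule implied here, and illustrated by the usage example in Section 3, can be sketched in Python as follows. This is only an illustration: the function name effective_termbase and the use of plain dictionaries to hold attributes are assumptions, since this specification does not define an API.

```python
from typing import Mapping, Optional


def effective_termbase(term_attrs: Mapping[str, str],
                       tbx_attrs: Mapping[str, str]) -> Optional[str]:
    """Return the termbase a term links to (hypothetical helper).

    A termbase given on the term itself takes precedence; otherwise the
    default termbase declared on the enclosing tbx element is used.
    """
    return term_attrs.get("termbase") or tbx_attrs.get("termbase")


# Attribute values taken from the usage example in Section 3.
default = {"termbase": "http://purl.org/xml-intl.com/tbx-link:8700"}
print(effective_termbase({"termid": "fde12a"}, default))
print(effective_termbase(
    {"termid": "a125fg", "termbase": "http://purl.org/xml-intl.com/tbx.xml"},
    default,
))
```

The first call falls back to the default repository; the second uses the repository named on the term itself.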

3. General Structure

Term Link provides a very simple namespace-based XML notation for linking terms to a TBX document.

The hierarchical structure of the Term Link document model comprises the following elements:

tbx
This is the top level element for Term Link.
term
The individual term elements.

An example of Term Link usage:


<?xml version='1.0' encoding='UTF-8'?>
<doc xmlns:tbx="urn:lisa-tbxlink-tags">
    <tbx:tbx termbase="http://purl.org/xml-intl.com/tbx-link:8700" version="1.0"
        date="2004-12-18T13:06:52Z" tool-name="XYZ Term Finder" tool-version="1.32" language="en-US">
        <p>An example paragraph with an embedded
            <tbx:term termid="a125fg" termbase="http://purl.org/xml-intl.com/tbx.xml">
                term
            </tbx:term>
            that is linked to a non-default TBX repository.
        </p>
        <p>A second
            <tbx:term termid="fde12a">
                example
            </tbx:term>
            that uses the default TBX repository as specified in the "termbase"
            attribute of the main "tbx:tbx" element.
        </p>
    </tbx:tbx>
</doc>

3.1. The Main Term Link Element

The <tbx> element is the top level of the Term Link hierarchy. It signals the start of the Term Link namespace DOM tree. Its direct children are zero or more <term> elements.

3.2. The Term Element

The <term> element is used to link the encompassed term to its entry in the TBX repository.
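As an illustration of how a consuming application might read these links, the following Python sketch collects the identifier and text of each term with the standard library's xml.etree.ElementTree module. The function name linked_terms, the sample document and the example.org termbase URL are hypothetical; whitespace around the term text is collapsed on the assumption that it is only layout.

```python
import xml.etree.ElementTree as ET

# Namespace URI as used in the usage example in Section 3.
TERM_LINK_NS = "urn:lisa-tbxlink-tags"

# Hypothetical sample document; the termbase URL is a placeholder.
SAMPLE = """<doc xmlns:tbx="urn:lisa-tbxlink-tags">
  <tbx:tbx termbase="http://example.org/default-tbx" version="1.0"
           date="2004-12-18T13:06:52Z" tool-name="XYZ Term Finder"
           tool-version="1.32" language="en-US">
    <p>A second <tbx:term termid="fde12a">
        example
    </tbx:term> paragraph.</p>
  </tbx:tbx>
</doc>"""


def linked_terms(xml_text):
    """Yield (termid, term text) for every term element in the document."""
    root = ET.fromstring(xml_text)
    for term in root.iter(f"{{{TERM_LINK_NS}}}term"):
        # Collapse the whitespace that pretty-printing leaves around the term.
        text = " ".join("".join(term.itertext()).split())
        yield term.get("termid"), text


print(list(linked_terms(SAMPLE)))
```

The search covers all descendants rather than direct children only, so that terms nested inside host-document elements such as <p> are also found.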

4. Detailed Specifications

4.1. Term Link Namespace Declaration

The Term Link document structure is designed to exist as a namespace so that it can be embedded into any document.

The Term Link namespace declaration will have the following form:

  xmlns:tbx="urn:lisa-tbxlink-tags"
  

All Term Link elements will normally be prefixed with the Term Link namespace identifier tbx:.
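Since the prefix is only "normally" tbx:, a processor may prefer to match elements on the namespace URI rather than on the literal prefix, as is usual for namespaced XML. The Python sketch below shows this with two equivalent fragments; the alternative prefix "tl" and the function name term_ids are hypothetical.

```python
import xml.etree.ElementTree as ET

TERM_LINK_NS = "urn:lisa-tbxlink-tags"

# The same markup, once with the usual "tbx" prefix and once with a
# hypothetical "tl" prefix bound to the same namespace URI.
WITH_TBX = ('<doc xmlns:tbx="urn:lisa-tbxlink-tags">'
            '<tbx:term termid="a125fg">term</tbx:term></doc>')
WITH_TL = ('<doc xmlns:tl="urn:lisa-tbxlink-tags">'
           '<tl:term termid="a125fg">term</tl:term></doc>')


def term_ids(xml_text):
    """Collect termid values, matching elements by namespace URI."""
    root = ET.fromstring(xml_text)
    return [t.get("termid") for t in root.iter(f"{{{TERM_LINK_NS}}}term")]


# Both fragments yield the same identifiers.
print(term_ids(WITH_TBX) == term_ids(WITH_TL))
```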

4.2. Elements

Term Link elements: <tbx>, <term>.

4.2.1. Term Link

The main Term Link element has the following format:

<tbx>

Term Link Element - The <tbx> element.

Required attributes:

termbase - the Termbase identifier for the default TBX Repository that is being linked to.

version - the Term Link version identifier; the value is fixed at "1.0".

date - the date that the Term Link namespace was created for this document.

language - the language for the terms being linked to.

tool-name - the tool used to identify the Term Link terms.

tool-version - the version identifier of the tool used to identify the Term Link terms.

Optional attributes:

None.

Contents:

Zero or more <term> elements.
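A conformance check for the attributes listed above could look like the following Python sketch. The function name check_tbx_attrs, the message wording and the example.org URL are assumptions made for illustration; the specification itself does not prescribe how violations are reported.

```python
# Required attributes of the tbx element, as listed in this section.
REQUIRED_TBX_ATTRS = ("termbase", "version", "date", "language",
                      "tool-name", "tool-version")


def check_tbx_attrs(attrs):
    """Return a list of problems found in a tbx element's attributes."""
    problems = [f"missing required attribute: {name}"
                for name in REQUIRED_TBX_ATTRS if name not in attrs]
    # The version attribute has the fixed value "1.0".
    if "version" in attrs and attrs["version"] != "1.0":
        problems.append(f"unsupported version: {attrs['version']}")
    return problems


# Hypothetical, incomplete attribute set: four required attributes are absent.
for problem in check_tbx_attrs({"termbase": "http://example.org/tbx",
                                "version": "1.0"}):
    print(problem)
```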

4.2.2. Term

The Term element has the following format:

<term>

The Term Link Term Element.

Required attributes:

termid - the term identifier in the TBX repository.

Optional attributes:

termbase - the Termbase identifier for a non-default TBX Repository that is being linked to.

date - the date that the term element was created.

language - the language of the term.

Contents:

The PCDATA contents of the term.
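For a tool that inserts Term Link markup, building a term element might be sketched in Python as below. The helper name make_term and its keyword parameters are hypothetical; only the attribute names and the namespace URI come from this specification.

```python
import xml.etree.ElementTree as ET

TERM_LINK_NS = "urn:lisa-tbxlink-tags"
ET.register_namespace("tbx", TERM_LINK_NS)  # serialize with the usual prefix


def make_term(text, termid, termbase=None, date=None, language=None):
    """Build a term element; termid is required, the rest are optional."""
    term = ET.Element(f"{{{TERM_LINK_NS}}}term", {"termid": termid})
    optional = {"termbase": termbase, "date": date, "language": language}
    for name, value in optional.items():
        if value is not None:  # omit optional attributes that were not given
            term.set(name, value)
    term.text = text  # the PCDATA contents of the term
    return term


print(ET.tostring(make_term("example", "fde12a"), encoding="unicode"))
```

Leaving out termbase means the term relies on the default repository declared on the enclosing tbx element.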

4.3. Attributes

This section lists the attributes used in the Term Link elements. An attribute is never specified more than once for each element.

Term Link attributes: date, language, version, termbase, termid, tool-name, tool-version.

4.3.1. Term Link Attributes

date

Date - The date attribute indicates when a given element was created or modified.

Value description:

Date in [ISO 8601] format. The recommended pattern to use is: CCYY-MM-DDThh:mm:ssZ
Where: CCYY is the year (4 digits), MM is the month (2 digits), DD is the day (2 digits), hh is the hours (2 digits), mm is the minutes (2 digits), ss is the seconds (2 digits), and Z indicates that the time is UTC. For example:

date="2002-01-25T21:06:00Z"
is January 25, 2002 at 9:06pm GMT
is January 25, 2002 at 2:06pm US Mountain Time
is January 26, 2002 at 6:06am Japan time

Default value:

Undefined.

Used in:

<tbx>, <term>
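The recommended pattern and the time zone conversions in the example above can be checked with a short Python sketch. The function name parse_term_link_date is hypothetical, and the fixed offsets (UTC-7 for US Mountain Standard Time, UTC+9 for Japan) are assumptions that hold for a January date.

```python
from datetime import datetime, timedelta, timezone

PATTERN = "%Y-%m-%dT%H:%M:%SZ"  # CCYY-MM-DDThh:mm:ssZ


def parse_term_link_date(value):
    """Parse the recommended date pattern into an aware UTC datetime."""
    return datetime.strptime(value, PATTERN).replace(tzinfo=timezone.utc)


stamp = parse_term_link_date("2002-01-25T21:06:00Z")

# Fixed offsets are an assumption: UTC-7 for US Mountain Standard Time
# and UTC+9 for Japan (no daylight saving applies in January).
mountain = stamp.astimezone(timezone(timedelta(hours=-7)))
japan = stamp.astimezone(timezone(timedelta(hours=9)))

print(mountain.strftime("%Y-%m-%d %H:%M"))  # 2002-01-25 14:06
print(japan.strftime("%Y-%m-%d %H:%M"))     # 2002-01-26 06:06
```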

language

Language - The language for the main tbx or individual term elements.

Value description:

A language code as described in [RFC 3066]. For more information see the section on xml:lang in the XML specification, and erratum E11 (which replaces RFC 1766 with RFC 3066).

Default value:

Undefined.

Used in:

<tbx>, <term>
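A purely syntactic check of a language value against the tag grammar of [RFC 3066] might be sketched in Python as follows. The function name is_rfc3066_tag is hypothetical, and the check does not verify that the subtags are registered codes.

```python
import re

# Language-Tag grammar from RFC 3066: a primary subtag of 1 to 8 letters,
# followed by any number of "-"-separated subtags of 1 to 8 letters or digits.
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def is_rfc3066_tag(value):
    """Check the syntax only; subtags are not looked up in any registry."""
    return RFC3066_TAG.fullmatch(value) is not None


for candidate in ("en", "en-US", "en_US"):
    print(candidate, is_rfc3066_tag(candidate))
```

Note that the underscore form en_US, common in POSIX locale names, does not match this grammar; the hyphenated form en-US does.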

version

Version - The current Term Link version number.

Value description:

The version number of the Term Link notation used in this document.

Fixed value:

1.0

Used in:

<tbx>.

termbase

Termbase - The identifier for the TBX repository. This should be a URL or some other system identifier that allows the TBX repository to be resolved automatically.

Value description:

A URL or other system identifier of the TBX repository.

Default value:

Undefined for <tbx>. For <term>, the termbase value of the enclosing <tbx> element.

Used in:

<tbx>, <term>.

termid

Term identifier - The identifier for the term in the TBX repository.

Value description:

The term key in the TBX repository.

Default value:

Undefined

Used in:

<term>.

tool-name

Name - The identifier of the tool used to insert the Term Link elements.

Value description:

The name of the Term Link tool.

Default value:

Undefined

Used in:

<tbx>.

tool-version

Tool version - The version identifier of the tool used to insert the Term Link elements.

Value description:

The version identifier of the Term Link tool.

Default value:

Undefined

Used in:

<tbx>.

A. Term Link Tree Structure

The following figure shows the possible structure as a tree. Each element is followed by notation indicating its possible occurrence according to the corresponding legend.

(legend: 1 = one
         + = one or more
         ? = zero or one
         * = zero, one or more)

<tbx>1
|
+--- <term>*

B. Term Link Document Type Definition and Schema

C. References

Normative

[IANA Charsets]
IANA Names for Character Sets. IANA (Internet Assigned Numbers Authority), Aug 2001.
[ISO 639]
Codes for the Representation of Names of Languages. ISO (International Organization for Standardization), Nov 2001.
[ISO 3166]
Codes for the representation of names of countries and their subdivisions. ISO (International Organization for Standardization), Jun 2000.
[ISO 8601]
Representation of dates and times. ISO (International Organization for Standardization), Dec 2000.
[RFC 3066]
RFC 3066 Tags for the Identification of Languages. IETF (Internet Engineering Task Force), Jan 2001.
[TBX 1.0]
TBX 1.0 Specification. LISA (Localization Industry Standards Association), May 2002.
[XML 1.0]
Extensible Markup Language (XML) 1.0 Second Edition. W3C (World Wide Web Consortium), Oct 2000.
[XML Names]
Namespaces in XML. W3C (World Wide Web Consortium), Jan 1999.

Non-Normative

[ISO]
International Organization for Standardization Web site.
[LISA]
Localization Industry Standards Association Web site.
[OSCAR]
OSCAR (Open Standards for Container/Content Allowing Re-use) Web site.
[OASIS]
Organization for the Advancement of Structured Information Standards Web site.
[Unicode]
Unicode Consortium Web site.
[W3C]
World Wide Web Consortium Web site.

D. Glossary

DTD
An XML document can have an associated Document Type Definition (DTD) that specifies the rules for the structure of the document. Several industries have standardized on various DTDs for the different types of documents that they share.
OSCAR
LISA special interest group (Open Standards for Container/Content Allowing Re-use).
UTC
UTC stands for Coordinated Universal Time.
XML
eXtensible Markup Language.