Title: Fach
1 Terminology Exchange without Loss? Feasibilities and Limitations of Terminology Management Systems (TMS)
Uta Seewald-Heeg, Hochschule Anhalt (FH)
GLDV Workshop, Köthen, June 17, 2005
2 Terminology Exchange Without Loss?
- Workflow and Interchange Scenarios
- Interchanging Terminological Data - Standards
- Terminology Management Systems (TMS)
- Conceptual Features of TMS
- Commercial Products
- Supported Formats
- Overview
- Exchange Scenarios
- Conclusion
3 Workflow and Interchange Scenarios
[Diagram: terminology and glossaries exchanged between Freelancer, Company and Translation Service Provider; the exchange path is marked "?"]
4 Workflow and Interchange Scenarios: The SAP Example
SAP System Translation Tool (SE63)
SAP System Documentation Tool (SE61)
Maintenance/ reference
Reference
Download
In development
SAPterm Terminology Database
Realtime feedback tool for software development
SAP System Translation Proposal Pool (STMP)
SAP Knowledge Warehouse (Glossary)
TRADOS Translators Workbench / MultiTerm
SAPterm Web Access
Machine Translation and Controlled Language Tools
SAPterm CD
Docu CD (Glossary)
Source: Mark D. Childress, SAP AG
5 Mapping Different Data Formats by an Interchange Format
Interchange format: field identification
Termbase B: subject
Termbase A: domain
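The mapping idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names "domain" (Termbase A) and "subject" (Termbase B) come from the slide, while the pivot name `subjectField`, the `term` and `note` fields, and all function names are hypothetical.

```python
# Each termbase declares how its own field names map to the interchange
# (pivot) format. "subjectField" is an assumed pivot name, not taken
# from any particular standard.
A_TO_PIVOT = {"term": "term", "domain": "subjectField"}
B_TO_PIVOT = {"term": "term", "subject": "subjectField"}


def to_pivot(entry, mapping):
    """Rename termbase-specific fields to pivot fields.

    Returns the pivot entry and the fields that have no mapping,
    i.e. the information that would be lost in the exchange.
    """
    pivot = {mapping[k]: v for k, v in entry.items() if k in mapping}
    unmapped = {k: v for k, v in entry.items() if k not in mapping}
    return pivot, unmapped


def from_pivot(pivot, mapping):
    """Rename pivot fields back to the field names of a target termbase."""
    inverse = {pivot_name: local for local, pivot_name in mapping.items()}
    return {inverse[k]: v for k, v in pivot.items() if k in inverse}


def convert_a_to_b(entry_a):
    """Convert an entry of Termbase A into the field layout of Termbase B."""
    pivot, lost = to_pivot(entry_a, A_TO_PIVOT)
    return from_pivot(pivot, B_TO_PIVOT), lost


entry_b, lost = convert_a_to_b(
    {"term": "termbase", "domain": "terminology", "note": "draft"}
)
print(entry_b)  # {'term': 'termbase', 'subject': 'terminology'}
print(lost)     # {'note': 'draft'}
```

With n termbases, each system needs only one mapping to the pivot format instead of one mapping per partner system; fields without a pivot equivalent are reported rather than silently dropped.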
6 CLS Framework (Concept-oriented with Links and Shared references)
Source: http://www.ttt.org/clsframe/reltef.html
7 CLS Framework: Termbase Structure
Source: http://www.ttt.org/clsframe/graphic.html
8 Interchanging Terminological Data: Standards
- OLIF: Open Lexicon Interchange Format
  - since 1990s
  - Interchange format for lexicons of different NLP systems (originally only MT)
- MARTIF: MAchine-Readable Terminology Interchange Format (ISO 12200)
  - since 1999
  - SGML based
  - Easily adaptable to XML
  - supported by very few TMS
- XLT: XML Representation of Lexicons and Terminologies
  - since 1999
  - Meta-standard for interchange of terminology and lexical data
  - deliverable of the SALT Project
- TBX: TermBase eXchange Format
  - since 2002
  - XML-based standard format for terminological data
9 Terminology Databases (TMS): Conceptual Features
- Language concept
  - monolingual
  - bilingual
  - multilingual
- Entry structure
  - free
  - definable
  - predefined
- Entry model
  - lexicographically oriented
  - concept-oriented
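The difference between the two entry models can be illustrated with two small data structures. This is a minimal sketch, not the data model of any of the products discussed; all class names, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LexEntry:
    """Lexicographically oriented: one entry per headword.

    Translations are attached to the word, so synonyms of the same
    concept end up in separate entries.
    """
    headword: str
    language: str
    translations: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Term:
    text: str
    language: str


@dataclass
class ConceptEntry:
    """Concept-oriented: one entry per concept.

    All terms of all languages, including synonyms, are attached
    to the same concept.
    """
    concept_id: str
    definition: str
    terms: List[Term] = field(default_factory=list)

    def languages(self):
        """Return the sorted list of languages covered by this entry."""
        return sorted({t.language for t in self.terms})

    def synonyms(self, language):
        """Return all terms of this concept in the given language."""
        return [t.text for t in self.terms if t.language == language]


concept = ConceptEntry(
    concept_id="c1",
    definition="database of terminological entries",
    terms=[
        Term("termbase", "en"),
        Term("term bank", "en"),
        Term("Terminologiedatenbank", "de"),
    ],
)
print(concept.languages())     # ['de', 'en']
print(concept.synonyms("en"))  # ['termbase', 'term bank']
```

A multilingual, concept-oriented entry of this kind cannot be moved into a bilingual, lexicographically oriented system without splitting it into several word entries, which is one source of loss in the exchange scenarios discussed later.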
10 Commercial Terminology Management Systems
- GFT DataTerm (GFT)
- UniTerm (Acolada)
- Déjà Vu Terminology (Atril)
- SDL TermBase (SDL)
- MultiTerm (Trados)
- TermStar (Star)
- crossTerm (Nero)
11 GFT DataTerm
- Lexicographically oriented
- Bilingual
- Predefined entry structure
Export
Import
12 UniTerm: Term Import and Export
- Supporting conceptual information
- Multilingual
- Definable entry structure
Export of XML file with DTD (Term2003.dtd)
13 Atril Déjà Vu
- Lexicographically oriented
- Supporting conceptual information
- Multilingual
- Predefined entry structure
14 Terminology Database (TBX Template)
- Templates
  - XML files (.dvtdt) (Déjà Vu terminology database template)
  - TBX
15 Déjà Vu: Terminology Import
16 Terminology Import: From Excel to Déjà Vu
17 Terminology Export
18 SDL TermBase: Definition of Termbank Structure
- Concept oriented
- Multilingual
- Predefined entry structure
Adding fields
Deleting selected field
Deleting all fields
Multiple selection is only possible with fields of type Attribute.
19 Import / Export Support in SDLX TermBase
- Déjà Vu terminology files
- Import / Export wizard
- SDL format (.tdb)
- Tab-delimited text file
- Trados MultiTerm format (MultiTerm 5)
20 Trados MultiTerm iX
- TMS allowing concept-oriented data management
- Definable entry structure
- Separation of data structure and terminological data
  - Termbank definition (.xdt)
  - Terminology (.xml)
21 Import Support in MultiTerm
- MultiTerm format (XML)
- Excel (.xls, .csv)
- Text file
Support of external data formats through MultiTerm Convert
22 MultiTerm: Converting Excel Data
23 Export Support in MultiTerm
- MultiTerm format (XML)
- MultiTerm 5 format (txt)
- Text file
24 Star TermStar
- TMS allowing concept-oriented data management
- Definable entry structure
25 Import Support in TermStar
- MARTIF
- TermStar XV lexicon
- Request file
- User defined formats
26 Export Support in TermStar
- MARTIF
- TermStar XV dictionary
27 Nero crossTerm 3
- across terminology component
- Concept oriented
28 Import / Export Support in crossTerm
- .csv
- Langenscheidt
- Trados MultiTerm 5 format
- Star MARTIF
- TBX
29 Supported Formats: Overview
Trados XML, HTML, RTF
TBX
across crossTerm
XML, HTML, RTF
Trados MultiTerm
Trados 5
Acolada UniTerm
MultiTerm Convert
MARTIF
MARTIF, TermStar XV, Request file, User defined formats
txt
Trados 5 (user defined)
xls, txt, csv
txt
csv
Excel
Star TermStar
MARTIF
csv, txt
xls
Trados 5
GFT DataTerm
txt (synonyms are not exported)
SDLX TermBase
Déjà Vu Terminology
tdb
txt (tab delimited)
30 Terminology Exchange Scenarios
TMS1
TMS2
Excel
. . .
TMSn
31 Excel Data
- Starting point
- Multilingual glossary (3 terms) with additional information (3 fields)
32 From Excel to MultiTerm iX (through MultiTerm Convert)
Allocation of column headers to MultiTerm fields
33 From Excel to MultiTerm (through MultiTerm Convert)
Specification of entry structure with MultiTerm Convert
- Result of the conversion
- Termbank definition file
- XML file
Defining additional fields when creating a new termbase
Default import definition
34 Excel Data in MultiTerm iX
35 Exported MultiTerm Data in Excel (through Tab-delimited Text File)
5 additional categories: entry number, user type, creation date, modifier type, modification date
No column headers
36 Exporting MultiTerm Data to Excel (through Tab-delimited Text File)
Unsynchronized column content
37 Exporting MultiTerm Data to Excel (through Tab-delimited Text File)
MultiTerm export problem: line breaks in definition texts cause hard line breaks in the export file, so that an import of the type "1 line = 1 record (entry)" is no longer possible.
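The line-break problem can be reproduced with Python's standard `csv` module. This is a sketch with hypothetical sample entries and field names, not a description of how MultiTerm writes its export. It contrasts a naive tab-delimited export with one that quotes fields containing line breaks and writes a header row.

```python
import csv
import io

# Hypothetical glossary entries; the first definition contains a line break.
fields = ["en", "de", "definition"]
entries = [
    {"en": "termbase", "de": "Termbank",
     "definition": "Database of terms.\nHolds one entry per concept."},
    {"en": "glossary", "de": "Glossar", "definition": "List of terms."},
]

# Naive export: fields joined by tabs, one record per line, no header.
naive = "\n".join("\t".join(e[f] for f in fields) for e in entries)
naive_rows = [line.split("\t") for line in naive.split("\n")]
print(len(naive_rows))  # 3 lines for 2 entries: the record boundary is lost

# Quoted export: fields containing line breaks are wrapped in quotes,
# and a header row names the columns.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fields, delimiter="\t",
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writeheader()
writer.writerows(entries)

# A reader that understands the quoting restores both entries unchanged.
reader = csv.DictReader(io.StringIO(buf.getvalue(), newline=""),
                        delimiter="\t")
restored = [dict(row) for row in reader]
print(restored == entries)  # True
```

Quoting only helps if the importing system also understands quoted fields; where it does not, line breaks would have to be replaced by a placeholder before export, which is itself a loss of information.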
38 Exporting MultiTerm Data in MultiTerm 5 Format
Only bilingual export
39 Exporting MultiTerm Data in MultiTerm 5 Format
Only bilingual export
40 From Excel to TermStar XV (Build 333): Tab-delimited
41 From Excel to TermStar XV (Build 333): Tab-delimited
42 Excel Data in TermStar XV
Excel column headers imported as first entry!
43 From MultiTerm 5 to TermStar XV (Build 333)
Assigning predefined fields to tags
44 MultiTerm 5 Data in TermStar XV (Build 333)
45 TermStar Export as MARTIF File
Header
Body (extract)
46 From Excel to crossTerm 3 (CSV)
Assigning across field names to CSV fields
47 Excel Data in crossTerm
48 From Star MARTIF to crossTerm 3
Assigning across field names to MARTIF fields
49 Star MARTIF in crossTerm 3
50 From Excel to SDL TermBase: Tab-delimited
Assigning column headers to SDL TermBase field names
51 Exporting SDL TermBase to Excel
Tab-delimited export without field labels
Synchronized field export
52 Standards' Role in an Automated Localization Workflow Model
Source: http://www.lisa.org/standards/
53 TermBase Exchange Format (TBX)
- Initiated by LISA (Localization Industry Standards Association)
- Worked out by LISA/OSCAR as standard for the localisation industry
- XML terminology markup format
- TBX is consistent with ISO 12200 (MARTIF)
- TBX format specifications: working draft, May 2002
- TBX-relevant information: http://www.lisa.org/tbx/
54 TBX Document
<?xml version='1.0'?>
<!DOCTYPE martif SYSTEM "./TBXcoreStructureDTD-v-1-0.DTD">
<martif type='TBX' xml:lang='en'>
  <martifHeader>
    <fileDesc><sourceDesc><p>noteText</p></sourceDesc></fileDesc>
    <encodingDesc><p type='DCSName'>TBXdefaultXCS-v-1-0.XML</p></encodingDesc>
  </martifHeader>
  <text> <body>
  ...
After Reineke (2004): Wissenseinheiten in der Softwarelokalisierung (und deren Verwaltung in TVS). http://www.iim.fh-koeln.de/dtt/DTT2004-HTMLs/Reineke-Dateien/frame.htm, 20.09.2004
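A TBX file with the header shown on this slide can be read with Python's standard XML parser. The sketch below is an illustration under assumptions: the header is taken from the slide, but the body (one `termEntry` with `langSet`, `tig` and `term` elements) and its sample terms are invented for the example, the DOCTYPE line is left out so that no DTD file is needed, and `read_terms` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Header as on the slide; the termEntry in the body is a made-up example.
SAMPLE = """<?xml version='1.0'?>
<martif type='TBX' xml:lang='en'>
  <martifHeader>
    <fileDesc><sourceDesc><p>noteText</p></sourceDesc></fileDesc>
    <encodingDesc><p type='DCSName'>TBXdefaultXCS-v-1-0.XML</p></encodingDesc>
  </martifHeader>
  <text><body>
    <termEntry id='c1'>
      <langSet xml:lang='en'><tig><term>termbase</term></tig></langSet>
      <langSet xml:lang='de'><tig><term>Termbank</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>"""


def read_terms(xml_text):
    """Return {entry id: {language: [terms]}} for every termEntry."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("termEntry"):
        by_lang = {}
        for lang_set in entry.iter("langSet"):
            lang = lang_set.get(XML_LANG)
            by_lang[lang] = [t.text for t in lang_set.iter("term")]
        result[entry.get("id")] = by_lang
    return result


print(read_terms(SAMPLE))
# {'c1': {'en': ['termbase'], 'de': ['Termbank']}}
```

Because each `termEntry` groups all languages of one concept, the structure maps directly onto concept-oriented termbases; descriptive fields beyond the terms themselves are not handled in this sketch.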
55 Conclusion
- Standardized interchange formats for platform-independent terminology interchange are seldom supported by commercial systems
- CSV is a quasi-standard
- Export of glossaries to Excel files / in CSV format is problematic (line breaks, number of fields, as in MultiTerm)
- Type and number of information items describing an entry must be homogeneous across all entries
- System- and platform-independent terminology management requires a high degree of manual and programming work
- Thus, terminology exchange without loss in many cases still is an unattained ideal.
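The homogeneity requirement named in the conclusion can be checked before a CSV export. The following is a minimal sketch with hypothetical entries and function names: it reports which entries lack which fields and pads them so that every row has the same columns.

```python
def field_report(entries):
    """Return all field names in use and, per entry index, the missing ones."""
    all_fields = sorted({name for e in entries for name in e})
    missing = {}
    for i, e in enumerate(entries):
        absent = [f for f in all_fields if f not in e]
        if absent:
            missing[i] = absent
    return all_fields, missing


def homogenize(entries, filler=""):
    """Give every entry the same fields, padding absent ones with a filler."""
    all_fields, _ = field_report(entries)
    return [{f: e.get(f, filler) for f in all_fields} for e in entries]


# Hypothetical entries: the second one has no definition.
entries = [
    {"en": "termbase", "de": "Termbank", "definition": "Database of terms."},
    {"en": "glossary", "de": "Glossar"},
]

all_fields, missing = field_report(entries)
print(all_fields)  # ['de', 'definition', 'en']
print(missing)     # {1: ['definition']}
print(homogenize(entries)[1])
# {'de': 'Glossar', 'definition': '', 'en': 'glossary'}
```

Padding keeps the columns synchronized, but it cannot represent entries with a varying number of terms or repeated fields, which is why a flat CSV exchange remains lossy for richer entry structures.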