Title:
1 SALT: XLT Markup and Mapping in Termbases - Empirical Experience
- Klaus-Dirk Schmitz
- University of Applied Sciences Cologne
- Institute for Information Science
- klaus.schmitz_at_fh-koeln.de
2 SALT - Work Package 4
- Analyse in detail the data categories and format structures of existing terminological data collections and formats in order to develop conceptual mapping tables and procedures to and from DXLT.
- This will serve as the technical specification for the development and implementation of filters (converters) from specific databases and formats to DXLT and vice versa.
- Based on Deliverable 2 / 3.1, which describes existing terminological data formats and the structures of concrete sample data.
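
As a rough illustration of what a conceptual mapping table and a two-way conversion procedure might look like, here is a minimal Python sketch. The source codes (CF, CM, VE, NT) are Eurodicautom field codes that appear later in these slides; the target names are placeholders chosen for readability and are not taken from the DXLT specification. The function name `convert` is likewise hypothetical.

```python
# Hypothetical mapping table: Eurodicautom field codes -> placeholder target names.
# The target names are illustrative only and are NOT taken from the DXLT specification.
TO_TARGET = {
    "CF": "reliabilityCode",
    "CM": "subjectField",
    "VE": "term",
    "NT": "note",
}

# The reverse table is derived from the forward one, so both directions stay consistent.
FROM_TARGET = {target: code for code, target in TO_TARGET.items()}


def convert(fields, table):
    """Rename the keys of `fields` using `table`; collect keys that have no mapping."""
    mapped, unmapped = {}, []
    for key, value in fields.items():
        if key in table:
            mapped[table[key]] = value
        else:
            unmapped.append(key)
    return mapped, unmapped


source = {"CF": "3", "CM": "AG4 CH6 GO6", "VE": "C-N ratio", "TY": "DAG77"}
mapped, unmapped = convert(source, TO_TARGET)   # forward direction
back, _ = convert(mapped, FROM_TARGET)          # and back again
```

Reporting the unmapped codes separately, rather than dropping them silently, makes the gaps in a mapping table visible while it is still being developed.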
3 One of the formats: Eurodicautom
- Eurodicautom is the terminological databank of the EU Commission, developed and filled with data since the end of the 1960s.
- Eurodicautom is a mainframe application with more than 1 million records; for some years the data have been available to the public via a web interface.
4 Eurodicautom Sample Data
- BE BTB
- TY DAG77
- NI 612
- CF 3
- CM AG4 CH6 GO6
- DA
- VE C/N kvotient(1) kulstof-kvælstofforhold(2)
- RF A.Klougart (VE1,VE2)
- EN
- VE C-N ratio
- RF CILF,Dict.Agriculture,ACCT,1977
5 Eurodicautom Sample Data
- NL
- VE C/N-quotient(1) koolstof-stikstofverhouding(2)
- DF (in bodem) verhouding vh totale koolstofgehalte tot het totale stikstofgehalte van organische stoffen...
- RF Agr.WP (VE1,VE2) Huitenga,Landbouwwdbk N-E (VE1,VE2)
- NT NTE (in plant) verhouding van koolstof en stikstof (koolhydraten en eiwitten)... (VE1,VE2)
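
The sample record can be read as a run of general fields followed by one block per language code. The Python sketch below splits such a record accordingly. It assumes a plain-text export with one "CODE value" pair per line and a bare language code on its own line; that layout, the function name `parse_record`, and the short language list are assumptions made for illustration, not a description of the actual Eurodicautom export format.

```python
LANGUAGE_CODES = {"DA", "EN", "NL"}  # assumption: only the codes seen in the sample


def parse_record(text):
    """Split a tagged record into a general block and per-language blocks."""
    general, languages = {}, {}
    current = general
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        code, _, value = line.partition(" ")
        # A bare language code opens a new language block.
        if code in LANGUAGE_CODES and not value:
            current = languages.setdefault(code, {})
            continue
        # Assumption: a code occurs at most once per block, as in the sample.
        if code in current:
            raise ValueError(f"data category {code} repeated within one block")
        current[code] = value.strip()
    return general, languages


SAMPLE = """\
BE BTB
TY DAG77
NI 612
CF 3
CM AG4 CH6 GO6
EN
VE C-N ratio
RF CILF,Dict.Agriculture,ACCT,1977
"""

general, languages = parse_record(SAMPLE)
```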
6 Eurodicautom Data Structure
- After a general block of entry-related (concept-related) information, language blocks are repeated for each of the EU languages.
- Every data category can appear only once within a language block, i.e. there is only one instance of each data category for all terms in one language.
- The Note field can be structured by unique starting tags, which can be seen as virtual data categories.
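
The idea of a virtual data category inside the Note field can be sketched in a few lines of Python. Only the starting tag NTE is visible in the sample data, so the tag list below contains just that one entry; the function name `split_note` and the convention of returning `None` for an untagged note are assumptions made for this illustration.

```python
# Only NTE appears in the sample data; further starting tags would be added here.
VIRTUAL_NOTE_TAGS = ("NTE",)


def split_note(note):
    """Return (virtual_category, text); the category is None for untagged notes."""
    for tag in VIRTUAL_NOTE_TAGS:
        if note.startswith(tag):
            return tag, note[len(tag):].strip()
    return None, note.strip()


tagged = split_note("NTE(in plant)verhouding van koolstof en stikstof")
untagged = split_note("plain remark")
```

A converter could then map each recognised starting tag to its own target data category and fall back to a generic note for everything else.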
7 Eurodicautom Data Categories (part)
- BE (EU) terminology service responsible for the entry (M)
- TY collection code (M)
- NI entry number (M)
- NX entry number for updating (R)
- NZ entry number for deleting (R)
- CF reliability code (1 lowest, 5 highest)
- AU author, originator
- DATE date (of last modification)
- CM subject field (Lenoch Code) (M)
- .........
M = Mandatory / R = Rare or old
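
The (M) and (R) markings and the reliability range listed above lend themselves to a simple pre-conversion check. The following Python sketch is one possible form of such a check; the function names are hypothetical, and treating an empty value as missing is an assumption.

```python
MANDATORY = ("BE", "TY", "NI", "CM")  # marked (M) in the list above
RARE = ("NX", "NZ")                   # marked (R) in the list above
RELIABILITY_CODES = {"1", "2", "3", "4", "5"}  # 1 lowest, 5 highest


def check_general_block(fields):
    """Return (missing mandatory codes, rare codes that are present)."""
    # Assumption: an empty value counts as missing.
    missing = [code for code in MANDATORY if not fields.get(code)]
    rare_present = [code for code in RARE if code in fields]
    return missing, rare_present


def reliability_is_valid(value):
    """True if the CF value is one of the five reliability codes."""
    return value in RELIABILITY_CODES


complete = check_general_block(
    {"BE": "BTB", "TY": "DAG77", "NI": "612", "CF": "3", "CM": "AG4 CH6 GO6"}
)
incomplete = check_general_block({"BE": "BTB", "TY": "DAG77", "NI": "612", "NX": "613"})
```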
8 Eurodicautom: A first DTD (part)
- <!-- DTD for EURODICAUTOM KDS 30.8.2000 -->
- <!ELEMENT EURODICAUTOM (entry+)>
- <!ELEMENT entry (BE, TY, (NI | NX | NZ), CF, AU?, DATE?, CM, langSet+)>
- <!ELEMENT BE (#PCDATA)>
- <!ELEMENT TY (#PCDATA)>
- ...
- <!ELEMENT langSet (VE?, AB?, PH?, DF?, RF?, MC?, MC?, NT?)>
- <!ATTLIST langSet lang CDATA #IMPLIED>
- <!ELEMENT VE (term, termID?)>
- <!ELEMENT AB (term, termID?)>
- <!ELEMENT PH (term, termID?)>
- <!ELEMENT RF (text, refID)>
- <!ELEMENT term (#PCDATA)>
- ...
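
To show how a record might be serialised along the lines of this DTD fragment, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It emits the general fields in the order of the entry content model (assuming the first alternative in the entry-number group is NI) and a single langSet with one VE term. The function name `build_entry` is hypothetical, and the sketch does not validate its output against the DTD, since ElementTree performs no DTD validation.

```python
import xml.etree.ElementTree as ET

# Order of the general fields, following the entry content model in the DTD fragment.
GENERAL_ORDER = ("BE", "TY", "NI", "CF", "AU", "DATE", "CM")


def build_entry(general, lang, term):
    """Build one <entry> with a single langSet holding a single VE term."""
    entry = ET.Element("entry")
    for code in GENERAL_ORDER:
        if code in general:
            ET.SubElement(entry, code).text = general[code]
    lang_set = ET.SubElement(entry, "langSet", {"lang": lang})
    ve = ET.SubElement(lang_set, "VE")
    ET.SubElement(ve, "term").text = term
    return entry


root = ET.Element("EURODICAUTOM")
root.append(
    build_entry(
        {"BE": "BTB", "TY": "DAG77", "NI": "612", "CF": "3", "CM": "AG4 CH6 GO6"},
        "EN",
        "C-N ratio",
    )
)
xml_text = ET.tostring(root, encoding="unicode")
```

Fixing the element order in one tuple keeps the output aligned with the sequence the content model requires, independent of the order in which the fields were read.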
9 Eurodicautom Graphical Representation
10 Eurodicautom Graphical Representation
11 Eurodicautom Mapping Procedure (Part)
12 Eurodicautom Mapping Procedure (Part)
13 Eurodicautom Mapping Procedure (Part)
14 Eurodicautom Mapping Procedure (Part)
15 Eurodicautom Hand-coded XML (Part)
16 Eurodicautom Hand-coded XML (Part)
17 Eurodicautom Hand-coded XML (Part)
18 Eurodicautom Hand-coded XML (Part)
19 Eurodicautom Hand-coded XML (Part)