1. GEMET, the General Environmental Multilingual Thesaurus: Development, User Perspectives and Plans for a Thesaurus System
Part 1 (overhead presentation): Bruno Felluga, CNR - Consiglio Nazionale delle Ricerche, Rome, Italy
Part 2 (slide show): Stefan Jensen, project leader ETC/CDS, Lower Saxony Ministry of the Environment, Hannover, Germany
Open Forum on Metadata Registries, Santa Fe, NM, January 20, 2000
EUROPEAN TOPIC CENTRE ON CATALOGUE OF DATA
SOURCES (ETC/CDS) EUROPEAN ENVIRONMENT AGENCY
2. Outline GEMET presentation - part 2: linking terminology and applications
GEMET activities performed by the ETC/CDS:
- co-ordination of the thesaurus development
- GEMET usage for indexing and retrieving environmental metadata
- development of applications around GEMET
- assessing 3rd party user needs to incorporate into future developments
3. Co-ordination of the development
- Encouraging and co-ordinating the translation of the core terminology into 12 languages
- Contracting application development around GEMET
- Implementing shared coding lists (value domains)
- Promoting the use of GEMET through marketing activities
- Distributing GEMET and supplying a technical helpdesk
4. GEMET - usage for indexing metainformation
GEMET has been used in the work of the EEA to index metadata from the following resources:
- The Directory of Information Resources (DIR)
- The Reporting Obligation Database (ROD)
To do this, 2 applications were developed:
- MS-Access based tool for metadata registry (WinCDS)
- Web-based JAVA tool for online registration (prototype)
5. Thesaurus part of WinCDS
6. JAVA based online registration - the indexing
7. GEMET - usage for indexing metainformation
Directory of Information Resources
- Total datasets: 931
- Controlled terms in use: 655 of 5300
- Total descriptors (GEMET terms used for indexing): 4714
- Term ranking: 121 of 655 terms have been used more than 10 times
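The DIR figures above imply a few ratios that are not stated on the slide itself. A minimal sketch in Python, using only the numbers given; the constant and function names are illustrative assumptions and do not come from any GEMET tool:

```python
# Figures taken from the DIR slide; all names here are illustrative only.
DATASETS = 931           # total datasets in the DIR
TERMS_IN_USE = 655       # distinct GEMET terms used for indexing
TERMS_AVAILABLE = 5300   # size of the controlled vocabulary as given on the slide
DESCRIPTORS = 4714       # total term assignments
TERMS_OVER_10 = 121      # terms used more than 10 times


def ratio(part: float, whole: float) -> float:
    """Return part / whole, rejecting a zero denominator."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return part / whole


def dir_summary() -> dict:
    """Derive coverage and density ratios from the slide figures."""
    return {
        "vocabulary_coverage": ratio(TERMS_IN_USE, TERMS_AVAILABLE),
        "descriptors_per_dataset": ratio(DESCRIPTORS, DATASETS),
        "share_of_used_terms_over_10": ratio(TERMS_OVER_10, TERMS_IN_USE),
    }


if __name__ == "__main__":
    for name, value in dir_summary().items():
        print(f"{name}: {value:.3f}")
```

On these figures, roughly 12% of the vocabulary is in use, each dataset carries about five descriptors, and a little under a fifth of the terms in use have been assigned more than 10 times.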
8. Terms used more than 60 times
9. DIR term ranking
10. DIR term ranking
11. GEMET - usage for indexing metainformation
Reporting Obligations Database (ROD) prototype
Questions
- Datasets total: 22844
- Controlled terms in use: 323 of 5300, used 86628 times
- Top-ranking term: "atmospheric emissions", used 14443 times
Sources
- Datasets total: 42
- Controlled terms in use: 65 of 5300, used 256 times
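A term ranking of this kind can be derived by counting how often each controlled term is assigned across all indexed records. The following is a minimal sketch, assuming each record is simply a list of assigned terms; the sample records and the function name `rank_terms` are hypothetical and are not ROD data or part of any ETC/CDS tool.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_terms(records: Iterable[List[str]], min_uses: int = 1) -> List[Tuple[str, int]]:
    """Count term assignments over all records, most frequent first.

    Terms assigned fewer than `min_uses` times are dropped.
    """
    counts = Counter(term for terms in records for term in terms)
    return [(term, n) for term, n in counts.most_common() if n >= min_uses]


# Hypothetical sample records (not taken from ROD).
sample = [
    ["atmospheric emissions", "air quality"],
    ["atmospheric emissions", "waste"],
    ["atmospheric emissions"],
    ["waste", "water quality"],
]

if __name__ == "__main__":
    total = sum(len(terms) for terms in sample)
    for term, uses in rank_terms(sample, min_uses=2):
        print(f"{term}: {uses} uses ({uses / total:.0%} of all assignments)")
```

Applied to the figures on the slide, the top-ranking term accounts for 14443 of 86628 assignments in the questions, which is roughly one in six.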
12. ROD questions term ranking
13. ROD questions term ranking
14. ROD questions term ranking
15. ROD sources term ranking
between 10 and 33
16. ROD sources term ranking
17. GEMET - usage to browse and retrieve metadata
GEMET is used to browse or retrieve metadata within 5 applications:
- The ThesShow GEMET browser
- The WebCDS accessing the DIR via HTML
- The WebCDS accessing the DIR via JAVA applets
- The multilingual search service (MSS)
- The Reporting Obligation Database (ROD)
18. JAVA based thesaurus browser for WebCDS
19. The Multilingual Search Service (MSS)
- Motivation
  - distributed multilingual document collections of European institutions (European Environment Agency, EEA)
  - support query formulation in the user's native language
  - search and retrieve documents in all understandable languages
- Approach of the EEA's Multilingual Search Service (MSS)
  - thesaurus support for query formulation (domain-specific thesaurus required, e.g. GEMET)
  - translation by making use of the multilinguality of the thesaurus (GEMET is available in 12 languages)
  - use translations as input for an off-the-shelf Web search engine (e.g., Netscape Compass Server)
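The approach above can be sketched as a lookup in a multilingual thesaurus followed by query expansion. This is a minimal illustration and not the MSS implementation: the concept identifiers, the three-language sample entries and the function names are assumptions, and the OR-query syntax is generic rather than that of any particular search engine.

```python
from typing import Dict, List, Optional

# Hypothetical mini-thesaurus: concept id -> {language code -> preferred term}.
# The ids and entries are illustrative, not actual GEMET content.
THESAURUS: Dict[str, Dict[str, str]] = {
    "c1": {"en": "waste", "de": "Abfall", "fr": "déchet"},
    "c2": {"en": "air quality", "de": "Luftqualität", "fr": "qualité de l'air"},
}


def find_concept(term: str, lang: str) -> Optional[str]:
    """Return the id of the concept whose term in `lang` matches, ignoring case."""
    wanted = term.strip().lower()
    for concept_id, labels in THESAURUS.items():
        if labels.get(lang, "").lower() == wanted:
            return concept_id
    return None


def translations(concept_id: str, target_langs: List[str]) -> List[str]:
    """Return the concept's terms in the requested languages, in the order given."""
    labels = THESAURUS[concept_id]
    return [labels[lang] for lang in target_langs if lang in labels]


def build_query(term: str, lang: str, target_langs: List[str]) -> str:
    """Expand a term in the user's language into an OR query over its translations."""
    concept_id = find_concept(term, lang)
    phrases = translations(concept_id, target_langs) if concept_id else []
    if not phrases:
        return f'"{term}"'  # no thesaurus match: fall back to the user's own wording
    return " OR ".join(f'"{p}"' for p in phrases)


if __name__ == "__main__":
    print(build_query("Abfall", "de", ["en", "de", "fr"]))
```

The resulting string would then be handed to the search engine unchanged, so the engine itself needs no knowledge of the thesaurus.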
20. Using a term for searching metadata
21. Search results from websites within the EERC
22. Questionnaire about GEMET usage
Goals
- learn more about current users
- get guidance from usage for future development
Process
- the current 200 GEMET users from all over the world have been addressed by e-mail
- the 2-page questionnaire (annexes) was made available digitally and as a form on the internet
- the survey was performed in November and December 1999
23. Areas and frequency of GEMET usage
- 100 translation
- 56 translation, 22 indexing/retrieval
- 42 translation, 33 indexing, 22 retrieval
24. Current usage of languages in GEMET
25. Evaluation of the thesaurus content
26. Usage of the GEMET browser ThesShow
27. GEMET - conclusions from our own indexing experience and the questionnaire
General guidelines
- The GEMET content should remain stable; minor improvements are justified
- There is a need to add new functionalities to the tools to allow users to customise their own thesaurus system
28. Contact information
URL: http://www.mu.niedersachsen.de or http://etc-cds.eionet.eu.int
E-mail: etc/cds@mu.niedersachsen.de