Title: Overview of technological solutions to terminology services
1 Overview of technological solutions to terminology services
- Doug Tudhope
- Hypermedia Research Unit
- University of Glamorgan
JISC Terminology Workshop, London, February 2004
2 Presentation
- Networked Knowledge Organisation Systems/Services
- Broad review of technological approaches
- NKOS Lifecycle
- Introduce Workshop Demonstrations
- Critical Issues and possible gaps
- References
3 Taxonomy of Knowledge Organisation Systems
- Term Lists
- Authority Files, Glossaries, Gazetteers, Dictionaries
- Classification and Categorization
- Subject Headings
- Classification Schemes and Taxonomies
- eg DDC, scientific taxonomies
- Relationship Schemes
- Thesauri
- Semantic Networks (eg WordNet)
- (Ontologies)
- Hodg00, http://www.clir.org/pubs/abstract/pub91abst.html
4 KOS ctd.
- Thesauri
- 3 Standard Relationships between concepts (Aitc00)
- Equivalence, Hierarchical, Associative
- Inherent domain lexicon (lead-in vocabulary)
- Concept definitions and warrant (Scope Notes)
- Ontologies
- Higher level conceptualisation (McGu02, Noy)
- formal definition of relationships
- inference rules and definition of roles (sometimes)
- KOS an element of ontologies and schemas
- Jaco03, Ontologies and the Semantic Web, ASIST Bulletin, April/May 2003, Special Issue on Semantic Web
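The three standard thesaurus relationships and the lead-in vocabulary can be sketched as a small data structure. This is a minimal illustration only: the class names (`Concept`, `Thesaurus`), the method names and the sample terms are all hypothetical and do not come from any particular thesaurus standard or tool.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Concept:
    """One thesaurus concept (hypothetical structure for illustration)."""
    preferred: str                                    # preferred term
    use_for: Set[str] = field(default_factory=set)    # equivalence (UF)
    broader: Set[str] = field(default_factory=set)    # hierarchical (BT)
    narrower: Set[str] = field(default_factory=set)   # hierarchical (NT)
    related: Set[str] = field(default_factory=set)    # associative (RT)
    scope_note: str = ""                              # concept definition


class Thesaurus:
    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self.lead_in: Dict[str, str] = {}  # any known term -> preferred term

    def add(self, preferred, use_for=(), scope_note=""):
        concept = Concept(preferred, set(use_for), scope_note=scope_note)
        self.concepts[preferred] = concept
        # Both the preferred term and its synonyms lead in to the concept.
        for term in (preferred, *use_for):
            self.lead_in[term.lower()] = preferred
        return concept

    def link_hierarchy(self, broader, narrower):
        # Hierarchical links are stored in both directions (BT/NT).
        self.concepts[broader].narrower.add(narrower)
        self.concepts[narrower].broader.add(broader)

    def link_related(self, a, b):
        # Associative links are symmetric.
        self.concepts[a].related.add(b)
        self.concepts[b].related.add(a)

    def resolve(self, term) -> Optional[str]:
        """Map a searcher's term to the preferred term, if known."""
        return self.lead_in.get(term.lower())


# Made-up sample vocabulary.
demo = Thesaurus()
demo.add("Vehicles")
demo.add("Cars", use_for=["Automobiles", "Motor cars"],
         scope_note="Passenger road vehicles.")
demo.add("Roads")
demo.link_hierarchy("Vehicles", "Cars")
demo.link_related("Cars", "Roads")
print(demo.resolve("automobiles"))
```

An ontology would go further than this sketch by typing the relationships formally and attaching inference rules, which a plain thesaurus structure like the one above does not attempt.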
5 Recent Sources
- NKOS Networked Knowledge Organization Systems/Services
- http://jodi.ecs.soton.ac.uk/?vol4iss4 NKOS JoDI Special Issue
- http://www.multites.com/conference03.htm MultiTes Conference
- http://nkos.slis.kent.edu/ JCDL and ECDL Workshops 2003
- http://www.lub.lu.se/SEMKOS/ SEMKOS IP Proposal Resources
- Semantic Web - RDF/XML, RDF Schema, Metalog, OWL
- http://www.w3.org/2001/sw/ W3C Semantic Web Activity
- http://www.semanticweb.org/
- http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb
- http://www.w3c.rl.ac.uk/SWAD/thesaurus.html SWAD-Europe Thesaurus index
- Semantic Grid - Semantic Web, Web service, eScience, GRID links
- http://www.semanticgrid.org/
- http://www.w3.org/2002/ws/ W3C Web Services Activity
- http://www.ariadne.ac.uk/issue29/gardner/intro.html Gardner's Intro to Web Services
6 JISC Application Area
- Search/retrieval for educational purposes(?)
- students, teachers, researchers
- possibly
- Generalised search
- possibly integrated into applications
- triggered to take account of context (eg Brow02)
- link eScience applications?
- Current operational systems (eg RDN)
- lack terminology services
- some browsing categories
- but not integrated into search
7 Technologies
- Information Science
- Controlled Terminology
- Information Retrieval (probabilistic, full text)
- Intellectual/Automatic Indexing
- search/browse, user interfaces
- Facet Analysis
- Ontology Engineering (AI Knowledge Representation)
- formal (finer grained) representation, description logics
- automated reasoning, Semantic Web
- Distributed Systems
- Z39.50, Web Services, Semantic Grid
- Language Engineering
- Social Engineering
8 Enriching / Formalising KOS
- KOS Legacy - large (multilingual) vocabularies,
- indexed multimedia (and print) collections
- Products of peer review, following standards
- However
- Not utilised to full potential in some applications
- Designed for human inspection, semantic structure not explicitly represented
- May have evolved inconsistently from various sources
- Opportunity to formalise / enrich
- Partly a matter of representation in RDF/XML
- but there may be inconsistencies in logical structure
- -> deconstruction and ontological formalisation
- -> mutually exclusive concept structures
9 Facet Analysis (a link between technologies)
- Fundamental categories / foundational concepts
- eg CRG: Entity, Part, Property, Material, Process, Operation, Product, Agent, Space, Time, ...
- Mapped to facets for particular KOS
- Basis of several scientific and industrial KOS
- Synthesis rules for principled combination of concepts
- rules for combining base concepts when indexing/querying
- Browsing and Searching applications
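A synthesis rule can be illustrated as a function that combines concepts from different facets in a fixed order. This is a rough sketch under stated assumptions: the facet list is copied from the CRG categories named above, treating that listing as the combination order is an assumption made here for illustration, and the function name `synthesise` and the sample concepts are hypothetical.

```python
# Facet names as listed on the slide; using this listing as the
# combination (citation) order is an assumption for this sketch.
FACET_ORDER = ["Entity", "Part", "Property", "Material", "Process",
               "Operation", "Product", "Agent", "Space", "Time"]


def synthesise(parts):
    """Combine one concept per facet into a compound index string.

    `parts` maps a facet name to a concept label. Facets are emitted in
    FACET_ORDER regardless of the order in which they were supplied.
    """
    unknown = set(parts) - set(FACET_ORDER)
    if unknown:
        raise ValueError(f"unknown facet(s): {sorted(unknown)}")
    ordered = [(facet, parts[facet]) for facet in FACET_ORDER if facet in parts]
    return " : ".join(f"{label} [{facet}]" for facet, label in ordered)


# Made-up example: the same rule serves indexing and query formulation.
print(synthesise({"Material": "oak", "Entity": "chairs", "Time": "18th century"}))
```

Because the order is fixed by the rule rather than by the indexer, two people combining the same base concepts arrive at the same compound string, which is what makes the result usable for matching and browsing.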
10 KOS integration into DL services, from Hill02 Research Agenda KOS/DL
- Taxonomy of KOS
- KOS types linked to DL service protocols
- Registries of KOS and KOS-level metadata to represent them
- XML/RDF KOS representations - customisable
- Core set of relationship types across all KOS
- General KOS service protocol
- from which protocols for specific types of KOS can be derived
- Robust linking model in which DL entities (collections, objects, and services) can refer to KOS entities (concepts, labels, and relationships)
- Visualization tools that fully use and display the rich semantics embedded in KOS
- -> move towards a model of search service flow?
- how semantic search services combine
11 Terminology Services, from Koch04 Structured Overview
- Activities to advance the powerful use of vocabularies
- Searching for concepts
- schemes in registries
- concepts/terms in taxonomy servers
- Search support for queries
- collection finding
- cross-searching, cross-browsing, mapping services
- KOS browsing and user interface/visualisation
- query expansion, disambiguation
- automatic indexing and classification
- extraction/mining of terms
- translation support using vocabularies
12 Workshop Demonstrations
- in context of NKOS Information Lifecycle
13 NKOS Information Lifecycle
- KOS creation and maintenance
- Mapping, merging vocabularies
- Document creation and maintenance
- Indexing, classification, annotation
- intellectual, automatic
- Discovery of services and databases/collections
- Searching for concepts -> controlled terminology, auto-disambiguation
- Querying and result display
- Cross-searching, cross-browsing, mapping services
- KOS browsing and user interface/visualisation
- Query expansion
- Extraction/mining of terms
- Translation support using vocabularies
- Content integration and mediation
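As an illustration of the query expansion step in the lifecycle, the sketch below maps a searcher's term to a controlled term and then adds narrower terms down to a chosen depth. It is one simple way to do this, not the method of any system shown at the workshop. The function name `expand` and the sample vocabulary are hypothetical.

```python
from collections import deque

# Made-up vocabulary: narrower-term links and a lead-in (synonym) table.
NARROWER = {
    "vehicles": ["cars", "bicycles"],
    "cars": ["sports cars"],
}
USE = {"automobiles": "cars"}


def expand(term, narrower, use, max_depth=1):
    """Return the controlled term plus narrower terms up to max_depth levels."""
    start = use.get(term.lower(), term.lower())  # entry term -> controlled term
    seen = [start]
    queue = deque([(start, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend below the requested depth
        for child in narrower.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append((child, depth + 1))
    return seen


print(expand("Automobiles", NARROWER, USE))
print(expand("vehicles", NARROWER, USE, max_depth=2))
```

The depth limit is the main design choice: expanding further raises recall but pulls in concepts that are progressively more distant from what the searcher asked for.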
14 High Level Thesaurus (HILT) - Information Science
- Pilot Terminology Service
- HILT team, Wordmap s/w, OCLC
- discovery of collections
- cross-searching JISC collections
- mapping from Terminologies to DDC spine
- DDC, LCSH, UNESCO, MeSH, AAT
15 geoXwalk - Geographic Information Science
- Geo-spatial Gazetteer Service
- Edina, Data Archive, CIE
- feature (concept) searching
- geographic searching, spatial operators
- spatial result visualisation, flexible footprint
- geoparser - automated geographic indexing
16 Renardus - Information Science
- cross-browsing service
- NetLab, UKOLN, ILRT, SUB
- classification mapping via DDC
- cross-searching EU subject gateways (multilingual)
- user interface for browsing in large classifications
17 Learning and Teaching Portal SSL - Information Science
- Systems Simulations Ltd, Index
- Learning and Teaching Support Network
- Web-based thesaurus service
- vocabulary management - Suggest a Term
- data entry
- browse and search
18 CIE Health Demonstrator - Information Science, facet analysis
- Adiuri Systems Ltd (from IDEA Project)
- Waypoint Health Info search demonstrator
- faceted, multi-concept query via browsing
- non-zero match, postings displayed
- faceted browsing user interface
19 COHSE Conceptual Open Hypermedia - Ontology, description logic, hypertext navigation
- Link Navigation Using Ontologies
- Manchester, Southampton University
- Open Hypermedia System (Soton DLS)
- open-source downloadable tools for Ontology and Annotation Services
- eg OilEd lightweight ontology editor for DAML+OIL
20 OpenGALEN - Ontology, GRAIL logic, facet analysis
- Open GALEN Common Reference Model - Medical coding and classification systems
- Manchester University
- faceted compositional rather than traditional enumerative medical codes
- multilingual
- GALEN-in-use Project
- OpenKnoME, GALEN Case Env toolsets
21 Co-ODE Collaborative Open Ontology Development Env - Ontology management
- Manchester University
- new project
- develop Ontology management tools as plugins for Protégé (Stanford)
- building on earlier experience with OilEd
- concern with usability
22 FACET faceted knowledge organisation for semantic retrieval - Information Science, facet analysis
- University of Glamorgan, Science Museum
- faceted, multi-concept bestmatch search
- semantic expansion as browsing service
- faceted thesaurus search interface
- standalone and Web demonstrators
23 E-Biosci EC platform e-publishing and info integration in Life Sciences - Information Science
- European Molecular Biology Organisation
- Collexis B. V. technology
- semantic matching, conceptual fingerprints
- link genomic data and life sciences research lit
- multilingual integrated search: full text/data/researchers
- peer-reviewed, different publishing models
24 SKOS Simple knowledge organisation for the semantic web - Information Science, Ontology
- CCLRC, SWAD-EUROPE project
- Migrate existing KOS to SemWeb via common RDF schema for thesauri and for inter-thesaurus mapping (formal OWL spec planned)
- use cases for thesaurus services
- lightweight RDF service demonstrators using Jena RDF API toolkit
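To give a feel for what migrating a thesaurus entry to RDF involves, the sketch below writes one concept as N-Triples using SKOS Core property names (`prefLabel`, `altLabel`, `broader`, `related`). It uses plain Python string formatting rather than the Jena toolkit mentioned above. The base URI `http://example.org/thesaurus/`, the helper names and the sample terms are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thesaurus/"  # hypothetical base URI for concepts


def concept_uri(label):
    """Derive a concept URI from its label (a naive scheme, for illustration)."""
    return BASE + label.strip().lower().replace(" ", "-")


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(label, alt_labels=(), broader=(), related=(), lang="en"):
    """Return the N-Triples lines describing one thesaurus concept."""
    subject = f"<{concept_uri(label)}>"
    lines = [f"{subject} <{RDF_TYPE}> <{SKOS}Concept> ."]
    lines.append(f'{subject} <{SKOS}prefLabel> "{escape(label)}"@{lang} .')
    for alt in alt_labels:    # equivalence (UF) -> altLabel
        lines.append(f'{subject} <{SKOS}altLabel> "{escape(alt)}"@{lang} .')
    for term in broader:      # hierarchical (BT) -> broader
        lines.append(f"{subject} <{SKOS}broader> <{concept_uri(term)}> .")
    for term in related:      # associative (RT) -> related
        lines.append(f"{subject} <{SKOS}related> <{concept_uri(term)}> .")
    return lines


# Made-up entry: Cars, UF Automobiles, BT Vehicles, RT Roads.
for line in to_ntriples("Cars", alt_labels=["Automobiles"],
                        broader=["Vehicles"], related=["Roads"]):
    print(line)
```

The point of a shared schema is that the same few properties serve every thesaurus, so a mapping between two vocabularies becomes a set of links between concept URIs rather than a bespoke conversion.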
25 Some critical issues
- Standards
- User Interface
- Gaps?
26 Critical issues (1) Standards
- Ongoing initiatives to revise thesaurus standards
- ANSI/NISO Z39.19
- BS 5723 and BS 6723 - Dext03
- BSI public draft soon, extended scope, interoperability
- Thesaurus Representations
- RDF - SWAD03; Topic Map - Ligh03; various XML
- Possibilities to extend current relationships by specialisation,
- enriching standards but maintaining compatibility
- KOS Service Protocols - Bind04
- service oriented approach with composite service provision
- not based on atomic elements of data structures and relationships
- expansion service provision
- NKOS Registry - Vizi01, MEG Registry Project
27 Cost/benefit issues
- Thesaurus long-lived, pragmatic and useful tool
- cost-effective granularity of relationships
- for some search apps
- Domain lexicon (UF/ALTs, Scope Notes)
- Cost/benefit issues in KOS formalisation
- Application dependent level of precision in concept use
- Some apps very precise use of concepts (medical?)
- Other apps may vary in concept application (humanities?)
- Indexer - Searcher variation
- Results based on probable relevance judgements
28 Critical issues (2) User interface
- User interface critical
- given controlled terminology demands
- Offer different options
- Move beyond minimal assumptions
- of current web search engines on
- users, query structure, collections
- Link with service protocol issues
- kind of interfaces easily afforded
- Accessibility issues
29 Critical issues (3) Gaps?
- Language Engineering
- Related standards - Shre03
- POS tagging tools
- large statistical corpora
- -> source of context data
- for disambiguation, annotation, proactive search
- JISC-specific corpora?
- Collect portal use data -> taxonomies, synonyms
- Time-varying synonyms - BBCi04
- Probabilistic IR
- term frequency information, automatic weighting
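Automatic weighting from term frequency information can be illustrated with classic tf-idf. The slide does not name a specific scheme, so tf-idf is swapped in here as one common example and is not a claim about what any JISC service uses. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Weight each term in each document by tf * idf.

    `docs` is a list of token lists. tf is the term's share of the
    document's tokens; idf is log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up mini collection.
docs = [["thesaurus", "search"], ["search", "engine"]]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

A term that occurs in every document gets weight zero, which is the sense in which frequency data lets the system discount uninformative terms without any manual indexing.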
30 Social Engineering?
- What do users really want?
- Problems of introducing new technologies
- Sometimes a matter of both reflecting and shaping user needs
- Done implicitly by successful projects
- but also extant literature on sociology/philosophy of innovation
- Lessons from
- Participatory Design, Rapid Application Development - Tudh00
- evolving network prototypes, user expectations, requirements and working practices
- Lead / Ambassador Users
- training, tailoring and advocacy / motivation.
31 Contact Information
- Doug Tudhope
- School of Computing
- University of Glamorgan
- Pontypridd CF37 1DL
- Wales, UK
- dstudhope_at_glam.ac.uk
- http://www.comp.glam.ac.uk/pages/staff/dstudhope
32 References
- Aitchison J., Gilchrist A., Bawden D. 2000. Thesaurus construction and use: a practical manual (4th edition). London: ASLIB.
- BBCi. A day in the life of BBCi search. http://www.currybet.net/articles/day_in_the_life/index.shtml
- Binding C., Tudhope D. 2004. KOS at your Service: Programmatic Access to Knowledge Organisation Systems. JoDI 4(4), http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/
- Brown P. 2002. From information retrieval to hypertext linking. New Review of Hypermedia and Multimedia, 8, 231-255.
- Dextre Clarke S. 2003. BS 8723: a new British Standard for structured vocabularies. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop20Folder/dextre_clarke.ppt
- Hill et al. 2002. Integration of Knowledge Organization Systems into Digital Library Architectures. ASIST SigCR. http://www.lub.lu.se/SEMKOS/docs/Hill_KOSpaper7-2-final.doc
- Hodge Gail. 2000. Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. CLIR Pub91. April 2000. http://www.clir.org/pubs/abstract/pub91abst.html
- Jacob Elin. 2003. Ontologies and the Semantic Web. ASIST Bulletin, April/May 2003, Special Issue on Semantic Web. http://www.asis.org/Bulletin/Apr-03/BulletinAprMay03.pdf
- Koch T. Activities to advance the powerful use of vocabularies in the digital environment - Structured overview. http://www.lub.lu.se/traugott/drafts/seattlespec-vocab.html
- Light R. 2003. XML (and Topic Maps). http://www.richardlight.org.uk/thesauri/thesauri.htm
- McGuinness D. 2002. Ontologies Come of Age. In (Fensel et al eds.) Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press.
- MultiTes 2003. Conference on Thesauri and Taxonomies. http://www.multites.com/conference03.htm
33 References ctd.
- NKOS Networked Knowledge Organization Systems/Services, http://nkos.slis.kent.edu/
- NKOS 2003. Workshop ECDL. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-Workshop.php
- NKOS 2004. New Applications of Knowledge Organization Systems. NKOS Special Issue, JoDI. http://jodi.ecs.soton.ac.uk/?vol4iss4
- Noy N., McGuinness D. Ontology Development 101: A Guide to Creating Your First Ontology. http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
- Shreve G. 2003. Terminology Standards. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop20Folder/Shreve.ppt
- Soergel D. The representation of Knowledge Organization Structure (KOS) data: a multiplicity of standards. http://www.glam.ac.uk/soc/research/hypermedia/publications/SoergelNKOS2001KOSStandards.PDF
- SWAD-Europe Thesaurus Activity. http://www.w3c.rl.ac.uk/SWAD/thesaurus.html
- Tudhope D, Beynon-Davies P, Mackay H. 2000. Prototyping praxis: Constructing computer systems and building belief. Human Computer Interaction, 15(4), 353-383. http://www.glam.ac.uk/soc/research/hypermedia/publications/tudhope-2000.pdf
- Vizine-Goetz D. 2001. NKOS Registry - draft proposal for KOS-level metadata. http://staff.oclc.org/vizine/NKOS/Thesaurus_Registry_version3_rev.htm