1
Overview of technological solutions to
terminology services
  • Doug Tudhope
  • Hypermedia Research Unit
  • University of Glamorgan

JISC Terminology Workshop, London, February 2004
2
Presentation
  • Networked Knowledge Organisation Systems/Services
  • Broad review of technological approaches
  • NKOS Lifecycle
  • Introduce Workshop Demonstrations
  • Critical Issues and possible gaps
  • References

3
Taxonomy of Knowledge Organisation Systems
  • Term Lists
  • Authority Files, Glossaries, Gazetteers,
    Dictionaries
  • Classification and Categorization
  • Subject Headings
  • Classification Schemes and Taxonomies
  • eg DDC, scientific taxonomies
  • Relationship Schemes
  • Thesauri
  • Semantic Networks (eg WordNet)
  • (Ontologies)
  • Hodg00, http://www.clir.org/pubs/abstract/pub91abst.html

4
KOS ctd.
  • Thesauri
  • 3 Standard Relationships between concepts
    (Aitc00)
  • Equivalence, Hierarchical, Associative
  • Inherent domain lexicon (lead-in vocabulary)
  • Concept definitions and warrant (Scope Notes)
  • Ontologies
  • Higher level conceptualisation (McGu02, Noy)
  • formal definition of relationships
  • inference rules and definition of roles
    (sometimes)
  • KOS an element of ontologies and schemas
  • Jaco03, Ontologies and the Semantic Web.
    ASIST Bulletin, April/May 2003, Special Issue on
    Semantic Web
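
To make the three standard thesaurus relationships on this slide concrete, here is a minimal Python sketch of one possible in-memory representation. The class name, field names, helper functions, and sample terms are all hypothetical, chosen only for illustration; they are not taken from a thesaurus standard or from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus concept; every name here is an illustrative assumption."""
    preferred: str                              # preferred term
    use_for: set = field(default_factory=set)   # equivalence: lead-in vocabulary
    broader: set = field(default_factory=set)   # hierarchical: broader terms
    narrower: set = field(default_factory=set)  # hierarchical: narrower terms
    related: set = field(default_factory=set)   # associative: related terms
    scope_note: str = ""                        # concept definition and warrant


def add_hierarchy(thesaurus, broad, narrow):
    """Record a broader/narrower pair so the hierarchy stays reciprocal."""
    thesaurus[broad].narrower.add(narrow)
    thesaurus[narrow].broader.add(broad)


def resolve(thesaurus, term):
    """Map a user's term to a preferred term via the lead-in vocabulary."""
    wanted = term.lower()
    for concept in thesaurus.values():
        entry_terms = {u.lower() for u in concept.use_for}
        if wanted == concept.preferred.lower() or wanted in entry_terms:
            return concept.preferred
    return None


# Hypothetical sample data.
thesaurus = {
    "vehicles": Concept("vehicles", scope_note="Hypothetical sample concept."),
    "cars": Concept("cars", use_for={"automobiles", "motor cars"}),
}
add_hierarchy(thesaurus, "vehicles", "cars")
thesaurus["cars"].related.add("roads")
```

In this sketch a searcher's term such as "Automobiles" resolves to the preferred term "cars", which is the role the slide assigns to the lead-in vocabulary.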

5
Recent Sources
  • NKOS: Networked Knowledge Organization
    Systems/Services
  • http://jodi.ecs.soton.ac.uk/?vol4iss4 NKOS
    JoDI Special Issue
  • http://www.multites.com/conference03.htm
    MultiTes Conference
  • http://nkos.slis.kent.edu/ JCDL and ECDL
    Workshops 2003
  • http://www.lub.lu.se/SEMKOS/ SEMKOS IP
    Proposal Resources
  • Semantic Web - RDF/XML, RDF Schema, Metalog, OWL
  • http://www.w3.org/2001/sw/ W3C Semantic Web
    Activity
  • http://www.semanticweb.org/
  • http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb
  • http://www.w3c.rl.ac.uk/SWAD/thesaurus.html
    SWAD-Europe Thesaurus index
  • Semantic Grid - Semantic Web, Web service,
    eScience, GRID links
  • http://www.semanticgrid.org/
  • http://www.w3.org/2002/ws/ W3C Web Services
    Activity
  • http://www.ariadne.ac.uk/issue29/gardner/intro.html
    Gardner's Intro to Web Services

6
JISC Application Area
  • Search/retrieval for educational purposes(?)
  • students, teachers, researchers
  • possibly
  • Generalised search
  • possibly integrated into applications
  • triggered to take account of context (eg Brow02)
  • link eScience applications?
  • Current operational systems (eg RDN)
  • lack terminology services
  • some browsing categories
  • but not integrated into search

7
Technologies
  • Information Science
  • Controlled Terminology
  • Information Retrieval (probabilistic, full text)
  • Intellectual/Automatic Indexing
  • search/browse, user interfaces
  • Facet Analysis
  • Ontology Engineering (AI Knowledge
    Representation)
  • formal (finer grained) representation,
    description logics
  • automated reasoning, Semantic Web
  • Distributed Systems
  • Z39.50, Web Services, Semantic Grid
  • Language Engineering
  • Social Engineering

8
Enriching / Formalising KOS
  • KOS Legacy - large (multilingual) vocabularies,
    indexed multimedia (and print) collections
  • Products of peer review that follow standards
  • However
  • Not utilised to full potential in some
    applications
  • Designed for human inspection, semantic structure
    not explicitly represented
  • May have evolved inconsistently from various
    sources
  • Opportunity to formalise / enrich
  • Partly a matter of representation in RDF/XML
  • but there may be inconsistencies in logical structure
  • --> deconstruction and ontological formalisation
  • --> mutually exclusive concept structures

9
Facet Analysis (a link between technologies)
  • Fundamental categories / foundational concepts
  • eg CRG: Entity, Part, Property, Material,
    Process, Operation, Product, Agent, Space, Time,
    ...
  • Mapped to facets for particular KOS
  • Basis of several scientific and industrial KOS
  • Synthesis rules for principled combination of
    concepts
  • rules for combining base concepts when
    indexing/querying
  • Browsing and Searching applications
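
One way to picture a synthesis rule is as a function that orders base concepts by facet before combining them. The Python sketch below assumes a citation order that simply follows the order in which the slide lists the CRG categories, and it allows at most one concept per facet; both are simplifying assumptions for this example, and the concept-to-facet table is hypothetical sample data.

```python
# Fundamental categories in the order the slide lists them
# (used here as an assumed citation order).
FACET_ORDER = ["Entity", "Part", "Property", "Material", "Process",
               "Operation", "Product", "Agent", "Space", "Time"]

# Hypothetical assignment of concepts to facets for one particular KOS.
CONCEPT_FACET = {
    "chairs": "Entity",
    "legs": "Part",
    "oak": "Material",
    "carving": "Process",
    "19th century": "Time",
}


def synthesise(concepts):
    """Combine base concepts into one compound, ordered by facet."""
    unknown = [c for c in concepts if c not in CONCEPT_FACET]
    if unknown:
        raise ValueError(f"concepts not in the vocabulary: {unknown}")
    facets = [CONCEPT_FACET[c] for c in concepts]
    if len(set(facets)) != len(facets):
        raise ValueError("this sketch allows only one concept per facet")
    ordered = sorted(concepts, key=lambda c: FACET_ORDER.index(CONCEPT_FACET[c]))
    return [(CONCEPT_FACET[c], c) for c in ordered]
```

Because the same rule can be applied when indexing and when querying, a compound built by an indexer and one built by a searcher from the same concepts come out in the same order, whatever order the concepts were entered in.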

10
KOS integration into DL services (from Hill02):
Research Agenda KOS/DL
  • Taxonomy of KOS - KOS types linked to DL service
    protocols
  • Registries of KOS and KOS-level metadata to
    represent them
  • XML/RDF KOS representations - customisable
  • Core set of relationship types across all KOS
  • General KOS service protocol
  • from which protocols for specific types of KOS
    can be derived
  • Robust linking model in which DL entities
    (collections, objects, and services) can refer to
    KOS entities (concepts, labels, and
    relationships)
  • Visualization tools that fully use and display
    the rich semantics embedded in KOS
  • --> move towards a model of search service flow?
  • - how semantic search services combine

11
Terminology Services (from Koch04, Structured
Overview - Activities to advance the powerful use
of vocabularies)
  • Searching for concepts
  • schemes in registries
  • concepts/terms in taxonomy servers
  • Search support for queries
  • collection finding
  • cross-searching, cross-browsing, mapping services
  • KOS browsing and user interface/visualisation
  • query expansion, disambiguation
  • automatic indexing and classification
  • extraction/mining of terms
  • translation support using vocabularies

12
Workshop Demonstrations
  • in context of NKOS Information Lifecycle

13
NKOS Information Lifecycle
  • KOS creation and maintenance
  • Mapping, merging vocabularies
  • Document creation and maintenance
  • Indexing, classification, annotation
  • intellectual, automatic
  • Discovery of services and databases/collections
  • Searching for concepts --> controlled
    terminology, auto-disambiguation
  • Querying and result display
  • Cross-searching, cross-browsing, mapping services
  • KOS browsing and user interface/visualisation
  • Query expansion
  • Extraction/mining of terms
  • Translation support using vocabularies
  • Content integration and mediation

14
High Level Thesaurus (HILT) - Information Science
Pilot Terminology Service
HILT team, Wordmap s/w, OCLC
discovery of collections
cross-searching JISC collections
mapping from Terminologies to DDC spine
DDC, LCSH, UNESCO, MeSH, AAT

15
geoXwalk - Geographic Information Science
Geo-spatial Gazetteer Service
Edina, Data Archive, CIE
feature (concept) searching
geographic searching, spatial operators
spatial result visualisation, flexible footprint
geoparser - automated geographic indexing

16
Renardus - Information Science
cross-browsing service
NetLab, UKOLN, ILRT, SUB
classification mapping via DDC
cross-searching EU subject gateways (multilingual)
user interface for browsing in large classifications

17
Learning and Teaching Portal SSL - Information
Science
Systems Simulations Ltd, Index
Learning and Teaching Support Network
Web-based thesaurus service
vocabulary management - Suggest a Term
data entry, browse and search

18
CIE Health Demonstrator - Information Science,
facet analysis
Adiuri Systems Ltd (from IDEA Project), Waypoint
Health Info search demonstrator
faceted, multi-concept query via browsing
non-zero match, postings displayed
faceted browsing user interface

19
COHSE Conceptual Open Hypermedia - Ontology,
description logic, hypertext navigation
Link Navigation Using Ontologies
Manchester, Southampton University
Open Hypermedia System (Soton DLS)
open-source downloadable tools for Ontology and Annotation Services
eg OilEd, lightweight ontology editor for DAML+OIL

20
OpenGALEN - Ontology, GRAIL logic, facet analysis
Open GALEN Common Reference Model - Medical
coding and classification systems
Manchester University
faceted compositional rather than traditional enumerative medical codes
multilingual
GALEN-in-use Project
OpenKnoME, GALEN Case Env toolsets

21
Co-ODE Collaborative Open Ontology Development
Env - Ontology management
Manchester University, new project
develop Ontology management tools as plugins for Protégé (Stanford)
building on earlier experience with OilEd
concern with usability

22
FACET: faceted knowledge organisation for
semantic retrieval - Information Science, facet
analysis
University of Glamorgan, Science Museum
faceted, multi-concept best-match search
semantic expansion as browsing service
faceted thesaurus search interface
standalone and Web demonstrators

23
E-Biosci: EC platform, e-publishing and info
integration in Life Sciences - Information Science
European Molecular Biology Organisation
Collexis B. V. technology
semantic matching, conceptual fingerprints
link genomic data, life sciences research lit
multilingual integrated search: full text/data/researchers
peer-reviewed, different publishing models

24
SKOS: Simple knowledge organisation for the
semantic web - Information Science, Ontology
CCLRC, SWAD-EUROPE project
Migrate existing KOS to SemWeb via common RDF schema for thesauri and
for inter-thesaurus mapping (formal OWL spec planned)
use cases for thesaurus services
lightweight RDF service demonstrators using Jena RDF API toolkit
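
To give a rough idea of what migrating a thesaurus concept to RDF can look like, the Python sketch below writes one concept as N-Triples using only the standard library. It is not the SWAD-Europe demonstrator code (which the slide says used the Jena toolkit). The property names (prefLabel, altLabel, broader) and the namespace are those of the SKOS vocabulary as later published by W3C and may differ from the schema drafts available at the time of the talk; the example.org namespace, the function names, and the sample concept are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thesaurus/"  # hypothetical concept namespace


def uri(local):
    """Build a bracketed URI reference for a local concept identifier."""
    return f"<{BASE}{local}>"


def concept_triples(local, pref_label, alt_labels=(), broader=()):
    """Yield N-Triples lines describing one thesaurus concept.

    Labels are assumed to contain no quotes or backslashes,
    so no escaping is performed in this sketch.
    """
    subject = uri(local)
    yield f"{subject} <{RDF_TYPE}> <{SKOS}Concept> ."
    yield f'{subject} <{SKOS}prefLabel> "{pref_label}"@en .'
    for label in alt_labels:
        yield f'{subject} <{SKOS}altLabel> "{label}"@en .'
    for parent in broader:
        yield f"{subject} <{SKOS}broader> {uri(parent)} ."


# Hypothetical sample concept.
lines = list(concept_triples("cars", "cars",
                             alt_labels=["automobiles"],
                             broader=["vehicles"]))
```

Each thesaurus relationship becomes one triple, which is why a common RDF schema also lends itself to inter-thesaurus mapping: a mapping is simply another triple whose subject and object come from different vocabularies.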

25
Some critical issues
  • Standards
  • User Interface
  • Gaps?

26
Critical issues (1) Standards
  • Ongoing initiatives to revise thesaurus standards
  • ANSI/NISO Z39.19
  • BS 5723 and BS 6723 - Dext03
  • BSI public draft soon, extended scope,
    interoperability
  • Thesaurus Representations
  • RDF - SWAD03; Topic Map - Ligh03; various XML
  • Possibilities to extend current relationships by
    specialisation,
  • enriching standards but maintaining compatibility
  • KOS Service Protocols - Bind04
  • service oriented approach with composite service
    provision
  • not based on atomic elements of data structures
    and relationships
  • expansion service provision
  • NKOS Registry - Vizi01 MEG Registry Project

27
Cost/benefit issues
  • Thesaurus long-lived, pragmatic and useful tool
  • cost-effective granularity of relationships
  • for some search apps
  • Domain lexicon (UF/ALTs, Scope Notes)
  • Cost/benefit issues in KOS formalisation
  • Application dependent level of precision in
    concept use
  • Some apps very precise use of concepts (medical?)
  • Other apps may vary in concept application
    (humanities?)
  • Indexer - Searcher variation
  • Results based on probable relevance judgements

28
Critical issues (2) User interface
  • User interface critical
  • given controlled terminology demands
  • Offer different options
  • Move beyond minimal assumptions
  • of current web search engines on
  • users, query structure, collections
  • Link with service protocol issues
  • kind of interfaces easily afforded
  • Accessibility issues

29
Critical issues (3) Gaps?
  • Language Engineering
  • Related standards - Shre03
  • POS tagging tools
  • large statistical corpora
  • --> source of context data
  • for disambiguation, annotation, proactive search
  • JISC-specific corpora?
  • Collect portal use data --> taxonomies, synonyms
  • Time-varying synonyms - BBCi04
  • Probabilistic IR
  • term frequency information, automatic weighting

30
Social Engineering?
  • What do users really want?
  • Problems of introducing new technologies
  • Sometimes a matter of both reflecting and shaping
    user needs
  • Done implicitly by successful projects
  • but also extant literature on sociology/philosophy
    of innovation
  • Lessons from
  • Participatory Design, Rapid Application
    Development - Tudh00
  • evolving network prototypes, user expectations,
    requirements and working practices
  • Lead / Ambassador Users
  • training, tailoring and advocacy / motivation.

31
Contact Information
  • Doug Tudhope
  • School of Computing
  • University of Glamorgan
  • Pontypridd CF37 1DL
  • Wales, UK
  • dstudhope_at_glam.ac.uk
  • http://www.comp.glam.ac.uk/pages/staff/dstudhope

32
References
  • Aitchison J., Gilchrist A., Bawden D. 2000.
    Thesaurus construction and use: a practical
    manual (4th edition). London: ASLIB.
  • BBCi. A day in the life of BBCi search.
    http://www.currybet.net/articles/day_in_the_life/index.shtml
  • Binding C., Tudhope D. 2004. KOS at your Service:
    Programmatic Access to Knowledge Organisation
    Systems. JoDI 4(4).
    http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/
  • Brown P. 2002. From information retrieval to
    hypertext linking. New Review of Hypermedia and
    Multimedia, 8, 231-255.
  • Dextre Clarke S. 2003. BS 8723: a new British
    Standard for structured vocabularies.
    http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/dextre_clarke.ppt
  • Hill et al. 2002. Integration of Knowledge
    Organization Systems into Digital Library
    Architectures. ASIST SigCR.
    http://www.lub.lu.se/SEMKOS/docs/Hill_KOSpaper7-2-final.doc
  • Hodge G. 2000. Systems of Knowledge
    Organization for Digital Libraries: Beyond
    Traditional Authority Files. CLIR Pub91. April
    2000. http://www.clir.org/pubs/abstract/pub91abst.html
  • Jacob E. 2003. Ontologies and the Semantic
    Web. ASIST Bulletin, April/May 2003, Special
    Issue on Semantic Web.
    http://www.asis.org/Bulletin/Apr-03/BulletinAprMay03.pdf
  • Koch T. Activities to advance the powerful use of
    vocabularies in the digital environment -
    Structured overview.
    http://www.lub.lu.se/traugott/drafts/seattlespec-vocab.html
  • Light R. 2003. XML (and Topic Maps).
    http://www.richardlight.org.uk/thesauri/thesauri.htm
  • McGuinness D. 2002. Ontologies Come of Age. In
    (Fensel et al eds.) Spinning the Semantic Web:
    Bringing the World Wide Web to Its Full
    Potential. MIT Press.
  • MultiTes 2003. Conference on Thesauri and
    Taxonomies. http://www.multites.com/conference03.htm

33
References ctd.
  • NKOS: Networked Knowledge Organization
    Systems/Services. http://nkos.slis.kent.edu/
  • NKOS 2003. Workshop ECDL.
    http://www.glam.ac.uk/soc/research/hypermedia/NKOS-Workshop.php
  • NKOS 2004. New Applications of Knowledge
    Organization Systems. NKOS Special Issue, JoDI.
    http://jodi.ecs.soton.ac.uk/?vol4iss4
  • Noy N., McGuinness D. Ontology Development 101: A
    Guide to Creating Your First Ontology.
    http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
  • Shreve G. 2003. Terminology Standards.
    http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/Shreve.ppt
  • Soergel D. The representation of Knowledge
    Organization Structure (KOS) data: a multiplicity
    of standards.
    http://www.glam.ac.uk/soc/research/hypermedia/publications/SoergelNKOS2001KOSStandards.PDF
  • SWAD-Europe Thesaurus Activity.
    http://www.w3c.rl.ac.uk/SWAD/thesaurus.html
  • Tudhope D., Beynon-Davies P., Mackay H. 2000.
    Prototyping praxis: Constructing computer systems
    and building belief. Human Computer Interaction,
    15(4), 353-383.
    http://www.glam.ac.uk/soc/research/hypermedia/publications/tudhope-2000.pdf
  • Vizine-Goetz D. 2001. NKOS Registry - draft
    proposal for KOS-level metadata.
    http://staff.oclc.org/vizine/NKOS/Thesaurus_Registry_version3_rev.htm