Topic Maps - PowerPoint PPT Presentation

Slides: 36
Provided by: lincwebCa
Transcript and Presenter's Notes

Title: Topic Maps


1
Topic Maps
  • Vijay Raghavan
  • Distinguished Professor
  • University of Louisiana at Lafayette (ULL)
  • The Center for Advanced Computer Studies (CACS)

2
What are Topic Maps?
  • An ontology technology
  • based on a model of concepts and relations
  • not hierarchical: a network model
  • starts from what the information is about
  • Designed for information and knowledge
    organization
  • based on the concepts from back-of-book indexes
  • used to organize large sets of information
    resources
  • Also used for information integration
  • identity model used for automated merging
  • information can be gathered from many different
    sources and integrated
  • very powerful for portal integration

3
The 2-Layer Topic Map Model
  • The core concepts of Topic Maps are based on
    those of the back-of-book index
  • The same basic concepts have been extended and
    generalized for use with digital information
  • Envisage a 2-layer data model consisting of
  • a set of information resources (below), and a
    knowledge map (above)
  • This is like the division of a book into content
    and index

4
(1) The Information Layer
  • The lower layer contains the content
  • usually digital, but need not be
  • can be in any format or notation
  • can be text, graphics, video, audio, etc.
  • This is like the content of the book to which the
    back-of-book index refers

5
(2) The Knowledge Layer
  • The upper layer consists of topics and
    associations
  • Topics represent the subjects that the
    information is about
  • Like the list of topics that forms a back-of-book
    index
  • Associations represent relationships between
    those subjects
  • Like see also relationships in a back-of-book
    index

6
(2) The Knowledge Layer: Example

7
Linking the Layers Through Occurrences
  • The two layers are linked together
  • Occurrences are relationships with information
    resources that are pertinent to a given subject
  • The links (or locators) are like page numbers in
    a back-of-book index
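
The two-layer model on slides 3 to 7 can be sketched as a small data structure. This is only an illustrative sketch: the class names, the Tosca/Puccini topics, and the example.org locator are assumptions made for the example and are not taken from the presentation or from any topic map engine.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Occurrence:
    # A locator into the information layer, like a page number in an index.
    locator: str
    kind: str = "mention"


@dataclass
class Topic:
    # A subject in the knowledge layer.
    name: str
    occurrences: List[Occurrence] = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    # A relationship between subjects, like a "see also" entry.
    kind: str
    members: Tuple[str, ...]


puccini = Topic("Puccini")
tosca = Topic("Tosca")
# Hypothetical resource in the information layer.
tosca.occurrences.append(
    Occurrence("http://example.org/tosca-synopsis.html", "synopsis")
)
composed_by = Association("composed by", ("Tosca", "Puccini"))


def locators_for(topic: Topic) -> List[str]:
    """Return the resource locators attached to a topic."""
    return [occ.locator for occ in topic.occurrences]
```

Here the topics and the association live in the knowledge layer, while the locator string is the only link down to the information layer.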

8
Types
  • In topic maps topics can be typed, which provides
    considerable power for describing the world from
    which the topics are taken.
  • This capability is missing from traditional
    classification techniques.
  • Using this, one could create types and assign
    them to topics, and thus say that "topic maps"
    is-a "technology", "Norwegian" is-a "language",
    and "TMQL" is-a "query language", and so on.
  • This is a very simple capability, but it is also
    very powerful.

9
Types
  • Once explicit types have been provided it is
    possible to let the user perform searches such as
    "find 'paris', but show only 'places'", or to
    show lists of all cities, separate from other
    kinds of subjects.
  • Without types there is no way to do this, since
    the necessary information will be missing.
  • Since the types are themselves topics the creator
    of the topic map can choose which types to use
  • As a result, the model is infinitely extensible
    and adaptable and can capture just about any kind
    of information.
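
A minimal sketch of the typed search described above ("find 'paris', but show only 'places'"). The topic identifiers and the dictionary layout are assumptions chosen for illustration. Note that the types ("city", "person") are themselves entries in the same table, mirroring the point that types are topics.

```python
# Topic id -> topic data. Types are ordinary topics referenced by id.
topics = {
    "city": {"name": "City", "type": None},
    "person": {"name": "Person", "type": None},
    "paris-fr": {"name": "Paris", "type": "city"},
    "paris-myth": {"name": "Paris", "type": "person"},
    "oslo": {"name": "Oslo", "type": "city"},
}


def find(name, type_id=None):
    """Find topic ids by name, optionally restricted to one type."""
    return sorted(
        tid
        for tid, topic in topics.items()
        if topic["name"].lower() == name.lower()
        and (type_id is None or topic["type"] == type_id)
    )


def instances_of(type_id):
    """List all topic ids that are instances of the given type."""
    return sorted(tid for tid, t in topics.items() if t["type"] == type_id)
```

Calling `find("paris")` returns both topics named Paris, while `find("paris", "city")` keeps only the place. Without the `type` field there would be nothing to filter on.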

10
Additional features: Scope
  • Scope can be attached to any name, occurrence, or
    association in a topic map. Basically, scope can
    be attached to anything you can say in a topic
    map. Scope allows you to qualify a statement, but
    still express it.
  • Users can then choose to see all information in
    all scopes, or only those in particular scopes,
    basically tailoring their view of the world as
    they want to see it.

11
Additional features: Scope
  • Example:
  • Suppose we have a topic map about languages,
    based on the ISO 639 and Ethnologue lists of
    language codes.
  • We might want to record that ISO 639 assigns
    English to the Germanic language group, while
    Ethnologue considers it a West Germanic language.
  • This can be done by scoping the association
    between English and Germanic with a topic
    representing ISO 639, and the association between
    English and West Germanic with a topic
    representing Ethnologue.
  • Similarly, one might use scope to record that
    what Ethnologue calls Maldivian, ISO 639 calls
    Divehi.
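
The language example above could be modelled roughly as follows. This is a sketch under assumptions: the `Statement` tuple, the `lang-div` identifier, and the rule that unscoped statements are always visible are illustrative choices, not part of the presentation.

```python
from typing import FrozenSet, NamedTuple


class Statement(NamedTuple):
    kind: str                # "name" or an association type
    subject: str             # topic the statement is about
    value: str
    scope: FrozenSet[str]    # topics that qualify the statement


statements = [
    Statement("language-group", "english", "Germanic",
              frozenset({"ISO 639"})),
    Statement("language-group", "english", "West Germanic",
              frozenset({"Ethnologue"})),
    Statement("name", "lang-div", "Maldivian", frozenset({"Ethnologue"})),
    Statement("name", "lang-div", "Divehi", frozenset({"ISO 639"})),
]


def visible(all_statements, scopes=None):
    """Return statements valid in any of the chosen scopes.

    With scopes=None every statement is shown. Statements with an
    empty scope are treated as always valid (an assumption).
    """
    if scopes is None:
        return list(all_statements)
    wanted = set(scopes)
    return [s for s in all_statements if not s.scope or s.scope & wanted]
```

A user who only trusts ISO 639 would call `visible(statements, {"ISO 639"})` and see "Germanic" and "Divehi", while both views remain recorded in the map.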

12
Additional features: URI
  • URIs are used to identify subjects.
  • A topic may have any number of subject
    identifiers (URIs) which identify the subject the
    topic is about.
  • These URIs should point to resources which
    describe the subject to a human; the resources
    are known as subject indicators.
  • This allows subjects to be uniquely identified
    across topic maps and the entire web. For
    example, the URI
    http://www.topicmaps.org/xtm/1.0/core.xtm#superclass-subclass
    uniquely identifies the subclassing association
    type.

13
Additional features: Merge
  • This unambiguous identification of subjects is
    used in topic maps to merge topics that, through
    these identifiers, are known to have the same
    subject.
  • Two topics with the same subject are replaced by
    a new topic that has the union of the
    characteristics (names, occurrences, and
    associations) of the two originals. There is in
    fact a well-defined procedure for automatically
    merging topic maps based on this rule.
  • The combination of globally unique identifiers
    and the merging procedure makes integration of
    diverse information sources and reuse of
    information very much easier.
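
The merging rule can be sketched as below. This is a simplified illustration, not the standard's full procedure: topics are plain dictionaries, associations are left out, and the example.org subject identifier is a made-up URI.

```python
def merge_topics(topics):
    """Merge topics that share at least one subject identifier.

    Each topic is a dict with set-valued "identifiers", "names" and
    "occurrences". The merged topic carries the union of each.
    """
    merged = []
    for topic in topics:
        new = {key: set(topic[key])
               for key in ("identifiers", "names", "occurrences")}
        kept = []
        for existing in merged:
            if existing["identifiers"] & new["identifiers"]:
                # Same subject: absorb the existing topic's characteristics.
                for key in new:
                    new[key] |= existing[key]
            else:
                kept.append(existing)
        merged = kept + [new]
    return merged


# Two topic maps describing the same country under different names.
NORWAY = "http://example.org/country/NO"   # hypothetical identifier
map_a = [{"identifiers": {NORWAY}, "names": {"Norway"},
          "occurrences": {"http://example.org/a/norway.html"}}]
map_b = [{"identifiers": {NORWAY}, "names": {"Norge"},
          "occurrences": {"http://example.org/b/norge.html"}},
         {"identifiers": {"http://example.org/country/SE"},
          "names": {"Sverige"}, "occurrences": set()}]

result = merge_topics(map_a + map_b)
```

After the call, `result` holds two topics: one for Sweden and one that carries both "Norway" and "Norge" together with both occurrences.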

14
OASIS
  • OASIS has a published subjects activity, which is
    developing guidelines for how to create, publish,
    and maintain subject indicators intended for wide
    usage.
  • One example of this is well-known URIs for all
    the countries in the world (based on the ISO 3166
    country codes), which will allow us to tell that
    when you say 'Norway' in one topic map and I say
    'Norge' in another, we mean the same thing.

15
Use case: forskning.no
  • Norwegian government portal to popular science
    and research information
  • basically an online popular science journal
  • owned by the Norwegian Research Council
  • the name means "research.no"
  • Purpose:
  • present science and research information to young
    adults
  • intended to raise interest and recruitment

16
The Dual Classification

17
Topic maps: Standards
  • Topic maps are an ISO standard, published as
    ISO/IEC 13250 in 2000.
  • That standard defines the basic model and an
    SGML-based syntax for it, which uses HyTime for
    linking, and is therefore known as HyTM.
  • TopicMaps.Org is an organization that was formed
    to create a more web-optimized topic map syntax
    based on XML and URIs.

18
Topic maps: Standards
  • TopicMaps.Org published its XML Topic Maps (XTM)
    1.0 specification in early 2001, and in October
    of the same year that syntax was accepted into
    the second edition of ISO 13250 as an annex.
  • Today, XTM is the main topic map syntax and is
    supported by nearly all topic map tools.
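
To give a feel for the XTM 1.0 syntax, the sketch below builds a tiny topic map document with Python's standard `xml.etree.ElementTree`. The element names (`topicMap`, `topic`, `instanceOf`, `topicRef`, `baseName`, `baseNameString`) follow XTM 1.0 as best recalled here and should be checked against the specification. The helper names and the two topics are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("", XTM)
ET.register_namespace("xlink", XLINK)


def q(namespace, tag):
    """Build a namespace-qualified name for ElementTree."""
    return "{%s}%s" % (namespace, tag)


def add_topic(topic_map, topic_id, name, type_ref=None):
    """Append a <topic> with one base name and an optional type."""
    topic = ET.SubElement(topic_map, q(XTM, "topic"), {"id": topic_id})
    if type_ref is not None:
        instance_of = ET.SubElement(topic, q(XTM, "instanceOf"))
        ET.SubElement(instance_of, q(XTM, "topicRef"),
                      {q(XLINK, "href"): "#" + type_ref})
    base_name = ET.SubElement(topic, q(XTM, "baseName"))
    ET.SubElement(base_name, q(XTM, "baseNameString")).text = name
    return topic


root = ET.Element(q(XTM, "topicMap"))
add_topic(root, "language", "Language")
add_topic(root, "norwegian", "Norwegian", type_ref="language")
xtm_text = ET.tostring(root, encoding="unicode")
```

`xtm_text` then holds a small XML document in which the topic "norwegian" points to the topic "language" as its type.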

19
How to create a topic map?
  • For creating a topic map there are four main
    approaches:
  • Have humans author the topic maps manually. This
    usually gives very high-quality and rich topic
    maps, but at the cost of human labor. This is
    appropriate for some projects, while
    prohibitively expensive for others.
  • Automatically generate the topic map from
    existing source data. This can give very good
    results if the existing data are well-structured;
    if not, there are various natural-language
    processing tools that might help.

20
Topic Maps Construction Approaches
  • Automatically produce the topic map from
    structured source data like XML, RDBMSs, LDAP
    servers, and more specialized applications.
  • To write a topic map by hand, all that is needed
    is a text editor, and for automatic generation
    XSLT stylesheets work perfectly well. This won't
    be enough for all uses, of course, and therefore
    there is specialized software for topic map
    editing and automatic generation of topic maps.
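
As a rough illustration of generating a topic map from well-structured source data, the sketch below reads a small CSV table and derives topics and associations from it. Python is used here in place of XSLT purely for illustration. The column names, identifier scheme, and sample rows are assumptions.

```python
import csv
import io

# Illustrative source data; a real source might be an RDBMS or XML export.
SOURCE = """code,name,group
en,English,Germanic
no,Norwegian,Germanic
fr,French,Romance
"""


def generate(rows):
    """Turn table rows into typed topics plus 'belongs-to' associations."""
    topics, associations = {}, []
    for row in rows:
        lang_id = "lang-" + row["code"]
        group_id = "group-" + row["group"].lower()
        topics[lang_id] = {"name": row["name"], "type": "language"}
        # A group appears in many rows but becomes a single topic.
        topics.setdefault(
            group_id, {"name": row["group"], "type": "language-group"}
        )
        associations.append(("belongs-to", lang_id, group_id))
    return topics, associations


topics, associations = generate(csv.DictReader(io.StringIO(SOURCE)))
```

Each row yields one language topic and one association. The repeated "Germanic" value collapses into a single group topic, which is the kind of structure a flat table hides.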

21
Topic Maps Construction

22
Functionalities of the Construction Process
  • A TM building approach must include the following
    functionalities:
  • Defining resources: identifying resource types,
    adding, deleting, modifying, and merging
    resources.
  • Identifying and maintaining concepts/topics.
  • Identifying and maintaining
    relationships/associations between topics and
    relationship instances.

23
Functionalities of the Construction Process (continued)
  • Defining different views on a Topic Map including
    selected topics, relationships, and/or resources.
  • Storing Topic Maps persistently either in
    standard XTM files or in databases.
  • Merging Topic Maps.
  • Importing/exporting Topic Maps.
  • Including external resources.
  • Providing a user interface for search and
    navigation in the TM.
  • Evaluating and validating the resulting Topic
    Map.

24
TM construction approaches
  • Topic Maps building from existing data sources:
  • from structured document content such as XML
    documents and web pages
  • from document metadata (RDF documents)
  • from structured knowledge such as ontologies,
    existing database schemas, learning repositories
  • from unstructured documents

25
TM construction from structured document content
  • This approach intends to extract knowledge from
    web sites to help users find relevant information
    on the Web using clustering (unsupervised
    learning) techniques.
  • The process starts by defining the profile of a
    TM, which is later applied to Web sites. Profiles
    characterize Topic Maps and help evaluate their
    relevance to users' information needs.
  • Second, the analysis identifies topics that are
    of no interest.

26
from document metadata
  • This approach aims to develop a framework and
    toolkit for auto-generating topic maps, called
    MapMaker. It consists of a set of configurable
    processing modules, which are chained together
    according to the needs of each individual
    auto-generation application.
  • The different modules have access to an RDF model
    that is constructed during processing.
  • The RDF model is cleaned and extended, and finally
    converted to a topic map.
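
The idea of chained modules working on a shared model can be sketched as follows. This is not MapMaker's actual API: the function names are hypothetical, and the RDF model is stood in for by a plain set of (subject, property, value) triples.

```python
def harvest_metadata(model):
    # Stand-in for a module that extracts metadata from documents.
    return set(model) | {
        ("doc1", "title", "Topic Maps "),
        ("doc1", "creator", "Vijay Raghavan"),
        ("doc1", "creator", ""),
    }


def clean(model):
    # Drop empty values and trim stray whitespace.
    return {(s, p, o.strip()) for s, p, o in model if o.strip()}


def run_pipeline(modules, model=frozenset()):
    """Apply each module in order to the shared triple model."""
    for module in modules:
        model = module(model)
    return model


def to_topic_map(model):
    """Final step: group triples into one topic per subject."""
    topics = {}
    for subject, prop, value in sorted(model):
        topics.setdefault(subject, {}).setdefault(prop, []).append(value)
    return topics


topic_map = to_topic_map(run_pipeline([harvest_metadata, clean]))
```

Because every module takes and returns the same kind of model, an application can reorder, drop, or add modules without touching the others.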

27
From structured knowledge
  • The architecture of the repository includes
    wrappers created to convert dispersed knowledge
    structures into an integrated XML schema used in
    the repository.
  • The repository is implemented as a relational
    database (using MySQL), an XML-enabled
    application server, a customized XML schema for
    Topic Maps, a set of XML stylesheets for
    transforming and displaying topic maps, and a set
    of Java servlets and JSP programs to generate XML
    files dynamically from the database.

28
from unstructured documents
  • The proposed approaches are based on different
    extraction techniques, namely learning techniques
    and Natural Language Processing techniques.
  • This approach has focused on Inductive Natural
    Language Processing techniques to construct a
    Navigable Topic Map adapted to different users'
    viewpoints from free textual documents.
  • In fact, by keeping track of words' association
    patterns, the system detects fluctuations in
    words' meanings which can reveal different points
    of view.
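
The slide does not spell out the technique, so the sketch below substitutes a very simple one: counting which words co-occur with a target word in two small corpora and comparing the results. The sample sentences and function names are assumptions, and a real system would use far richer statistics.

```python
from collections import Counter


def cooccurrences(sentences, target):
    """Count words appearing in the same sentence as the target word."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        if target in words:
            counts.update(w for w in words if w != target)
    return counts


def distinctive(first, second, target):
    """Words associated with the target in `first` but not in `second`."""
    return set(cooccurrences(first, target)) - set(
        cooccurrences(second, target)
    )


# Two tiny corpora written from different viewpoints.
finance = ["the bank approved the loan", "the bank raised interest rates"]
geography = ["we walked along the river bank",
             "the bank of the river flooded"]

finance_only = distinctive(finance, geography, "bank")
geography_only = distinctive(geography, finance, "bank")
```

The two result sets show "bank" keeping different company in each corpus, which is the kind of fluctuation that could be turned into separate, viewpoint-specific topics.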

29
Putting Topic Maps in Context
  • Topic maps can be seen as an add-on to XML,
    something that adds extra value beyond what XML
    itself can do.
  • However, topic maps can also be used without
    using XML at all.
  • The two standards are similar without competing.
    They both have data models, interchange syntaxes,
    query languages, schema languages, and so on.
  • Being developed for different purposes and doing
    different things, they can peacefully coexist and
    complement one another.

30
Putting Topic Maps in Context
  • The relationship between RDF and topic maps is
    less obvious, however. Structurally, they are
    very similar, and their semantics are very close,
    although the distinctions in topic maps between
    base names, occurrences, and associations do not
    exist in RDF.
  • At first glance it may appear that they are
    nearly the same, but on closer inspection it
    turns out that their respective communities think
    of the technologies in very different ways, and
    that features such as scope and merging actually
    make them rather different after all. Again, the
    conclusion seems to be that they are good for
    different things, and that there is room for
    both.

31
What should be used where?
  • Generally, use XML for interchange and document
    contents, RDF for fine-grained metadata, and
    topic maps for making information findable and
    anything that is mostly about relationships

32
Benefits of topic maps
  • A simple, intuitive model
  • easy to teach to people, easy to apply
  • focus on findability: subjects and their
    relations
  • Supports information architecture patterns
  • taxonomies, thesauri, faceted classification
  • synonym rings, best bets
  • Formal structure
  • supports advanced searching and querying
  • can be exploited in many different ways, once
    created
  • Easy to create web sites from
  • the site structure flows from the Topic Maps
    ontology
  • a simple natural model means you get an
    understandable site
  • advanced search capabilities can be added easily

33
Summary
  • Topic maps are not so much an extension of the
    traditional schemes as a model on a higher level.
    Thesauri, by contrast, extend taxonomies by
    adding more built-in relationships and
    properties.
  • Topic maps do not add to a fixed vocabulary, but
    provide a more flexible model with an open
    vocabulary.
  • A consequence of this is that topic maps can
    actually represent taxonomies, thesauri, faceted
    classification, synonym rings, and authority
    files, simply by using the fixed vocabularies of
    these classifications as a topic map vocabulary.
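
As a sketch of that last point, a thesaurus's fixed relations can be used directly as association types. The terms, the relation names "broader-narrower" and "use-for", and the tuple layout are illustrative assumptions.

```python
# (association type, first role player, second role player)
associations = [
    ("broader-narrower", "language", "germanic language"),
    ("broader-narrower", "germanic language", "english"),
    ("broader-narrower", "germanic language", "norwegian"),
    ("use-for", "divehi", "maldivian"),   # preferred term, variant
]


def narrower_terms(term):
    """Direct narrower terms of a term."""
    return sorted(
        narrow
        for kind, broad, narrow in associations
        if kind == "broader-narrower" and broad == term
    )


def all_narrower(term):
    """All narrower terms, depth first, walking the hierarchy."""
    found = []
    for child in narrower_terms(term):
        found.append(child)
        found.extend(all_narrower(child))
    return found


def preferred(term):
    """Map a variant term to its preferred term, if one is recorded."""
    for kind, pref, variant in associations:
        if kind == "use-for" and variant == term:
            return pref
    return term
```

Nothing in the structure is specific to thesauri: adding a new association type, such as a faceted classification's facet, only means adding rows with a new first element.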

34
Tools and References
  • The TAO of topic maps, the classic introduction
    to topic maps.
  • TM4J is an open source topic map engine project
    in Java.
  • Perl XTM is an open source topic map engine in
    Perl.
  • tmproc is an open source topic map engine in
    Python.
  • The Omnigator is a free (as in beer) topic map
    browser that can display any topic map. There's
    also an online demo.
  • easytopicmaps.com is a wiki site about topic
    maps.

35
Tools and References
  • topicmap.com is a useful site about topic maps.
  • XTM 1.0 is currently the most important
    specification. It has now been incorporated in
    the second edition of the topic map ISO standard.
  • isotopicmaps.com tells you where the topic map
    standards are headed next.
  • LTM, the Linear Topic Map notation, is a
    text-based syntax for topic maps that is easier
    to read and write for humans than XTM.
  • Jan Algermissen maintains a registry of publicly
    available topic maps.