Title: Real-Time Generation of Topic Maps from Speech Streams
1Real-Time Generation of Topic Maps from Speech
Streams
- TMRA'05
- International Workshop on Topic Maps Research and Applications
- 06.10.2005
- Karsten Böhm, Lutz Maicher
- University of Leipzig
- boehmmaicher_at_informatik.uni-leipzig.de
2Introduction
- Topic Maps are a means for
- representing (powerful) indexes
- of any information collection
- for semantic information integration
- Our goal
- real-time generation of conceptual indexes of speech streams,
- represented as Topic Maps
- for integration with other information systems
3How to create Topic Maps
- Topic Maps are a semantic technology ...
- ... only from the perspective of information integration
- holding the Co-location Objective always true
- Subject Proxies indicating identical Subjects have to be viewed as merged ones
- Subject Equality Decision Approach
- Subject Viewing Approach
- We have to represent the created indexes so that the Co-location Objective holds true from the perspective of the creator ...
- ... and therefore we need a theoretical foundation.
4Subject Equality Decision Chain
From the child's perspective ("Elk are sweet."): I always caught the same Subject, an elk.
From the zoologist's perspective ("Elk are loners."): I caught two deer and three elk.
From the ranger's perspective ("Bernd needs a cow."): I caught Lisa, Ud (fighting), and Bernd (in summer, in winter, and as a calf).
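The three perspectives can be read as three different rules for deciding Subject Equality over the same five catches. The following is a minimal Python sketch of that idea, not the authors' method. It assumes (the slide does not say so) that Lisa and Ud are the two deer and that the three Bernd sightings are the three elk; the `Stage` record, the `PERSPECTIVES` table, and `group_sizes` are illustrative names only.

```python
from collections import defaultdict
from typing import NamedTuple


class Stage(NamedTuple):
    """One observed Subject Stage (one catch)."""
    individual: str
    species: str   # assumed assignment, not given on the slide
    occasion: str


STAGES = [
    Stage("Lisa", "deer", "unspecified"),
    Stage("Ud", "deer", "fighting"),
    Stage("Bernd", "elk", "summer"),
    Stage("Bernd", "elk", "winter"),
    Stage("Bernd", "elk", "calf"),
]

# Each perspective is a key function: two stages belong to the same
# Subject exactly when the key function returns the same value for both.
PERSPECTIVES = {
    "child": lambda stage: "elk",               # everything is "an elk"
    "zoologist": lambda stage: stage.species,   # equality by species
    "ranger": lambda stage: stage.individual,   # equality by individual
}


def group_sizes(stages, key):
    """Count how many stages fall onto each Subject under one perspective."""
    sizes = defaultdict(int)
    for stage in stages:
        sizes[key(stage)] += 1
    return dict(sizes)


sizes = {name: group_sizes(STAGES, key) for name, key in PERSPECTIVES.items()}
for name, result in sizes.items():
    print(name, result)
```

Under these assumptions the child ends up with one Subject, the zoologist with two groups (two deer stages, three elk stages), and the ranger with three individuals, one of them observed three times.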
5Subject Equality Decision Chain
2. Sensory systems come on stage, catching Subject Stages
3. Documenting the impressions (from the ranger's perspective)
4. Subject Equality is decided according to the governing SMD
(1) Subjectness: I'm only interested in Lisa, Ud, and Bernd, not in snow or trees.
(2) Creating Subject Proxies for the current Subject Stages of Lisa, Ud, and Bernd.
(3) Try to document the decision about the Subject Identity of the current Subject Stage by the given means of the governing SMD ontology, TMV ontology, and TMV vocabulary. The Subject Identity of Subject Stages is mapped to the Subject Indication of the Subject Proxy.
(4) Document all further information observed about the Subject Stage. (Documenting is modelling, and modelling means losing information.)
6Subject Equality Decision Chain
- Co-location Objective: Subject Proxies indicating identical Subjects
- World without any sensory system
- How to make a qualified assertion about the very nature of Subjects?
- Sensory systems come on stage, catching Subject Stages
- Never Subjects, only Subject Stages (see Quine) are observed
- Subject Identity: Subject Stages caught on different occasions belong to the same Subject (see Vatant's hubjects)
- perspective dependent (see Biezunski)
- decision process under uncertainty
- Documenting the impressions from a perspective
- Subjectness in the current perspective
- observations are documented restricted by the available vocabulary (SMD ontology, TMV ontology, TMV vocabulary)
- Decision about Subject Identity is documented according to the governing Subject Indication Approach
- Subject Equality is decided according to an SMD
7The Observation Principle
... or how to create Topic Maps from digital domains?
(1.) Observe the information collections of interest (texts, video streams, etc.) and detect Subject Stages of Subjects of interest from the current perspective.
(2.) Decide about the Subject Identity of the observed Subject Stages.
(3.) Create a Subject Proxy for each Subject Stage of interest.
(4.) Document the decision about the Subject Identity of the current Subject Stage by the given means of the governing SMD ontology, TMV ontology, and TMV vocabulary (... and with respect to all Subject Equality Decision Approaches expected to be applied later to this Subject Proxy).
(5.) Document all further information observed about the Subject Stage by the given means of the governing SMD ontology, TMV ontology, and TMV vocabulary.
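The five steps can be sketched for a plain token stream. This is only an illustration under strong simplifying assumptions: "observation" is reduced to spotting terms from a fixed list, Subject Identity is decided purely by the term itself, and the indicator URIs use a made-up `http://example.org/term/` base. `SubjectProxy`, `observe`, `decide_identity`, and `create_proxies` are hypothetical names, not part of Semantic Talk.

```python
from dataclasses import dataclass, field


@dataclass
class SubjectProxy:
    """Minimal stand-in for a Subject Proxy (hypothetical structure)."""
    subject_indicators: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def observe(tokens, terms_of_interest):
    """Step 1: detect Subject Stages, here occurrences of terms of interest."""
    for position, token in enumerate(tokens):
        if token.lower() in terms_of_interest:
            yield {"term": token.lower(), "position": position}


def decide_identity(stage, base_uri):
    """Step 2: decide Subject Identity. Assumption: same term, same Subject."""
    return base_uri + stage["term"]


def create_proxies(tokens, terms_of_interest,
                   base_uri="http://example.org/term/"):
    proxies = []
    for stage in observe(tokens, terms_of_interest):
        identity = decide_identity(stage, base_uri)
        proxy = SubjectProxy()                             # step 3
        proxy.subject_indicators.add(identity)             # step 4
        proxy.properties["position"] = stage["position"]   # step 5
        proxies.append(proxy)
    return proxies


tokens = "Fisichella overtakes Alonso before Fisichella pits".split()
proxies = create_proxies(tokens, {"fisichella", "alonso"})
for proxy in proxies:
    print(sorted(proxy.subject_indicators), proxy.properties)
```

Each Subject Stage gets its own proxy; the two "Fisichella" stages carry the same subject indicator, so any later Subject Equality decision based on indicators can treat them as one Subject.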
8The Semantic Talk System
- Focuses on the support of group-oriented conversation
- Implementation of a minimally invasive IT solution
- Application for interviewing scenarios, innovation processes, and early stages of product development
- Semantic Talk creates powerful, conceptual indexes of speech streams in real time
- Combines speech recognition (LinguaTec's VoicePro) with text mining algorithms
- Provides dynamic visualization (extended version of TouchGraph)
- Networked application with multiple clients
- Provides a generic RDF export
- Cooperation with University Duisburg-Essen, ISA Informationssysteme GmbH
9Semantic Talk: Speech Recognition and Text Mining
- Overview window (bird's-eye view)
- Window for additional information (documents, pictures)
- Sliders for configuration parameters (zoom)
- Local context window
10The Semantic Talk System
11Semantic Talk creates indexes of speech streams ...
... we have to represent them as Topic Maps ...
... and use them for semantic information integration
12From RDF-output to LTM
Semantic Talk (ST) observed a noticeable usage of the term "Fisichella" in the speech stream ...
  <st:node rdf:ID="node_Fisichella">
    <st:ID>160615</st:ID>
    <st:label>Fisichella</st:label>
    <st:nodelevel>1</st:nodelevel>
    <st:ref_wort_nr rdf:resource="http://www.tt.de/dtd/st/papnode_160615"/>
    <st:variant st:index="3" st:type="4" st:weight="0.3176"/>
  </st:node>
Semantic Mapping between RDF-output and Topic Map
using the Omnigator ...
  [id7406 : id7276 = "Fisichella"
    @"http://www.texttech.de/dtd/st/papnode_160615"
    @"http://www.texttech.de/dtd/st/papnode_Fisichella"]
  {id7406, id3670, [[1]]}
  {id7406, id7650, [[160615]]}
  id7549( id7406 : id463, id464 : id2195 )
  [id464]
  {id464, id1636, [[0.31766722453166335]]}
  {id464, id4378, [[3]]}
  {id464, id787, [[4]]}
... and this 'noticeable usage of the term Fisichella' becomes the Subject in the Topic Map. (Subject Identity: the same algorithm observes the 'noticeable usage' twice.)
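The step from an RDF node to LTM statements can also be sketched by hand. The slide's mapping was done with the Omnigator and produced generated ids; the code below is a separate, simplified illustration that only covers the label, `ID`, and `nodelevel` fields and ignores the `variant` element. The `st` namespace URI, the base URI for the subject indicator, and the type names `st-node`, `st-id`, and `st-nodelevel` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ST_NS = "http://example.org/st#"   # hypothetical namespace for the st: prefix

SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:st="{ST_NS}">
  <st:node rdf:ID="node_Fisichella">
    <st:ID>160615</st:ID>
    <st:label>Fisichella</st:label>
    <st:nodelevel>1</st:nodelevel>
    <st:variant st:index="3" st:type="4" st:weight="0.3176"/>
  </st:node>
</rdf:RDF>"""


def node_to_ltm(node, base_uri):
    """Turn one st:node element into a list of LTM statements."""
    node_id = node.get(f"{{{RDF_NS}}}ID")
    label = node.findtext(f"{{{ST_NS}}}label")
    # Topic with type, base name, and one subject indicator (@"...").
    lines = [f'[{node_id} : st-node = "{label}" @"{base_uri}{node_id}"]']
    # Simple literal properties become internal occurrences.
    for prop in ("ID", "nodelevel"):
        value = node.findtext(f"{{{ST_NS}}}{prop}")
        if value is not None:
            lines.append(f"{{{node_id}, st-{prop.lower()}, [[{value}]]}}")
    return lines


root = ET.fromstring(SAMPLE)
ltm_lines = []
for node in root.iter(f"{{{ST_NS}}}node"):
    ltm_lines.extend(node_to_ltm(node, "http://example.org/st/"))
print("\n".join(ltm_lines))
```

Using the RDF identifier as the basis for the subject indicator mirrors the point made on the slide: if the same algorithm observes the same 'noticeable usage' again, it produces the same indicator, and the two topics merge.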
13Integration with other Topic Maps ...
Starting point: integration with another Topic Map created by the observation principle (for example a motor-sport Topic Map)
- a mapping Topic Map is needed (which should be created under the observation principle, too)
... to allow more accurate mapping decisions, it seems necessary that the creation process of a Topic Map is documented, too.
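A rough sketch of such an integration, under heavy simplification: each Topic Map is reduced to a dictionary from topic id to a set of subject indicators, and the mapping Topic Map is reduced to a plain dictionary that rewrites indicators of one map into indicators of the other. All identifiers and URIs below are invented for illustration, and `merge_topic_maps` is a hypothetical helper, not an existing API.

```python
def merge_topic_maps(map_a, map_b, mapping):
    """Merge topics that share a subject indicator after applying the mapping.

    map_a, map_b: dict of topic id -> set of subject indicators
    mapping:      dict of indicator -> equivalent indicator
    Returns a list of (topic ids, indicators) groups, one per merged topic.
    """
    merged = []
    for source, topics in (("a", map_a), ("b", map_b)):
        for topic_id, indicators in topics.items():
            ids = {f"{source}:{topic_id}"}
            inds = {mapping.get(ind, ind) for ind in indicators}
            remaining = []
            # Absorb every existing group that shares an indicator.
            for group_ids, group_inds in merged:
                if group_inds & inds:
                    ids |= group_ids
                    inds |= group_inds
                else:
                    remaining.append((group_ids, group_inds))
            remaining.append((ids, inds))
            merged = remaining
    return merged


speech_map = {"node_Fisichella": {"http://example.org/st/node_Fisichella"}}
motorsport_map = {
    "driver-fisichella": {"http://example.org/motorsport/fisichella"},
    "driver-alonso": {"http://example.org/motorsport/alonso"},
}
mapping = {
    "http://example.org/st/node_Fisichella":
        "http://example.org/motorsport/fisichella",
}

groups = merge_topic_maps(speech_map, motorsport_map, mapping)
for topic_ids, indicators in groups:
    print(sorted(topic_ids), sorted(indicators))
```

The mapping entry encodes exactly one decision: that the term node observed in the speech stream and the driver topic indicate the same Subject. How reliable that decision is depends on knowing how each map was created, which is why the creation process needs to be documented as well.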
14Discussion