Graz - PowerPoint PPT Presentation

About This Presentation
Title:

Graz

Description:

About the use of UNL in key words, key images and key concepts transcultural analysis – PowerPoint PPT presentation

Slides: 25
Provided by: chollet

Transcript and Presenter's Notes

1
About the use of UNL in key words, key images
and key concepts transcultural analysis
2
Preparing for the 2008 Beijing Olympics: The
LingTour and KNOWLISTICS projects
MAO Yuhang, DING Xiao-Qing, NI Yang, LIN
Shiuan-Sung, Laurence LIKFORMAN, Gérard
CHOLLET
Presented here by Gérard CHOLLET
chollet@tsi.enst.fr, ENST/CNRS-LTCI
http://www.tsi.enst.fr/chollet
3
Outline
  • Rationale of the proposal
  • Objectives
  • The Beijing 2008 Olympics
  • Approaches
  • Multimedia, multilingual information server
  • Intelligent Camera
  • Bilingual Voice Communicator
  • Needs and relevance
  • A PDA for tourists and travelling businessmen
  • Conclusions and Perspectives

4
Rationale for the IP-KNOWLISTICS
  • Logistics for knowledge in a specific domain (OG)
  • Language independent knowledge representation and
    management
  • Multimedia (text, speech, image, video)
  • Multimodal access (text, speech, visual I/O)
  • Distributed multimedia server accessible from
    mobile terminals (phone, PDA, PC, ...)
  • Initially targeted primarily at tourist
    applications
  • 2008 Beijing Olympics as a field trial

5
Technical developments
  • Language independent knowledge representation
    (using conceptual graphs and an Intermediate
    Representation Language like Universal
    Networking Language)
  • Summarisation and reformulation of texts
  • Generation in 12 target languages
  • Speech synthesis and recognition
  • VoiceXML-based interactive dialog agent
  • Intelligent camera with Chinese character
    recognition
  • Cross-language Multimodal communicator on a PDA
  • Cross-language lexical access

6
Chinese character recognition
7
Intelligent camera from Tsinghua Univ.
8
Extracting text in scene images
  • Complex color images
  • Uncontrolled illumination
  • Variations in size, font, orientation, texture
  • Complex backgrounds, shadows

9
Text extraction
  • Searching for character regions (text has uniform
    color)
  • Multi-channel decomposition
  • Connected components analysis
  • Grouping of components
  • Alignment analysis (number of horizontally or
    vertically aligned components)
  • Text identification (language-independent
    features: size, alignment, ...)
  • Detection rate: 84%
  • False alarm rate: 5.6%

10
Cross-language Multimodal Communicator
  • Use of a visual display (e.g. on a PDA) to
    mediate the dialogue between 2 persons speaking
    different languages.
  • Recognition of short utterances, display of a
    word graph, selection of keywords, visualisation
    (and synthesis) of the translation of key words
    and groups of words.
  • Specialised lexicon for dialog acts in typical
    tourist situations (in a restaurant, at the
    hotel, in the street, on public transport, about
    the Olympic Games, ...)
  • UMTS access to an information server offering
    maps, photographs, video sequences, ...
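
The keyword selection and translation steps of the communicator can be sketched as follows. This is only an illustration under stated assumptions: the word graph is flattened to a list of (word, confidence) hypotheses, and the lexicon, its entries and the function names are hypothetical, not taken from the project.

```python
# Hypothetical situation lexicon (English -> Chinese); entries are illustrative only.
LEXICON = {
    "restaurant": {"menu": "菜单", "bill": "账单", "water": "水"},
}


def candidate_keywords(word_graph, situation, lexicon=LEXICON):
    """Return recognised words covered by the situation lexicon, best score first."""
    known = lexicon[situation]
    best = {}
    for word, score in word_graph:
        # Keep the highest confidence seen for each in-lexicon word.
        if word in known and score > best.get(word, 0.0):
            best[word] = score
    return sorted(best, key=best.get, reverse=True)


def translate_keywords(selected, situation, lexicon=LEXICON):
    """Pair each keyword the user selected with its lexicon translation."""
    return [(word, lexicon[situation][word]) for word in selected]


# Flattened word graph: (hypothesised word, recogniser confidence).
word_graph = [("the", 0.9), ("bill", 0.8), ("pill", 0.4), ("water", 0.6), ("bill", 0.5)]

keywords = candidate_keywords(word_graph, "restaurant")
# Simulate the user tapping the first keyword shown on the display.
pairs = translate_keywords(keywords[:1], "restaurant")
```

Restricting candidates to a situation lexicon is one way to discard recognition errors such as "pill" before anything is shown to the user.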

11
Generation in target languages
  • Sharing of acoustic models between languages to
    simplify extensibility to other languages.
  • Combination of phone models with small amounts of
    data.
  • Model adaptation to user and environmental
    conditions.

[Figure labels: Chinese, French, language-specific models]
12
Knowledge representation
  • A formal language for representing the meaning
    of natural language sentences.
  • UNL (Universal Networking Language) introduced to
    describe natural language semantics.
  • Language-independent context indexing, which
    makes cross-language information retrieval possible.
  • Use of conceptual hierarchy of UNL to address the
    inherent ambiguity of natural languages.
  • A set of semantic relations (linking concepts
    together) for a structured information pattern.

13
UNL representation
The cat drank the milk
can be encoded by
agt(drink(icl>do, agt>thing, obj>liquid).@past.@entry,
    cat(icl>mammal>animal).@def)
obj(drink(icl>do, agt>thing, obj>liquid).@past.@entry,
    milk(icl>beverage>food).@def)
agt, obj are binary semantic relations
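
To make the structure of such an expression concrete, here is a minimal parsing sketch. It writes the example in standard UNL notation (`>` for concept restrictions, `@` for attributes) and covers only the flat form shown on this slide; scopes, node identifiers and the rest of the UNL specification are not handled. The class and function names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    headword: str        # e.g. "drink"
    constraints: tuple   # e.g. ("icl>do", "agt>thing", "obj>liquid")
    attributes: tuple    # e.g. ("past", "entry")


@dataclass(frozen=True)
class Relation:
    label: str           # e.g. "agt"
    source: Node
    target: Node


NODE_RE = re.compile(r"\s*(\w+)\(([^()]*)\)((?:\.@\w+)*)\s*")


def parse_node(text):
    """Parse 'headword(constraints).@attr.@attr' into a Node."""
    match = NODE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a UNL node: {text!r}")
    headword, constraints, attributes = match.groups()
    return Node(
        headword,
        tuple(part.strip() for part in constraints.split(",")),
        tuple(re.findall(r"@(\w+)", attributes)),
    )


def split_top_level(text):
    """Split on the first comma that is not nested inside parentheses."""
    depth = 0
    for index, char in enumerate(text):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        elif char == "," and depth == 0:
            return text[:index], text[index + 1:]
    raise ValueError("no top-level comma found")


def parse_relation(text):
    """Parse 'label(node, node)' into a Relation."""
    label, _, rest = text.strip().partition("(")
    if not rest.endswith(")"):
        raise ValueError(f"not a UNL relation: {text!r}")
    left, right = split_top_level(rest[:-1])
    return Relation(label, parse_node(left), parse_node(right))


rel = parse_relation(
    "agt(drink(icl>do, agt>thing, obj>liquid).@past.@entry, "
    "cat(icl>mammal>animal).@def)"
)
```

Each parsed relation is one labelled edge of the semantic graph: here `agt` links the node for "drink" to the node for "cat".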
14
Role of semantic content representation in
indexing
[Diagram labels: User's request; Audio/Video, Textual, Digital; UNL encoding; Cross-lingual multimedia platform; User-specific information; UNL decoding]
15
Application architecture
[Diagram labels: UMTS server; Access information; a word graph, a list of keywords; Translation; Speech synthesis]
16
Digital Olympic: Multi-Language Information
Network Service System Project
17
From VoiceXML to VoiceUNL
Presented here by Gérard CHOLLET
chollet@tsi.enst.fr, ENST/CNRS-LTCI
http://www.tsi.enst.fr/chollet
With the contribution of Christian BOITET
18
Outline
  • Rationale of the proposition
  • Objectives
  • Promotion of a new standard, demonstrations
  • Approaches
  • An extra layer of VoiceXML
  • Need and relevance
  • Multilingual Vocal Servers
  • Integration and structuring effect
  • Conclusions and Perspectives

19
Rationale for VoiceUNL
  • Need for language-independent vocal servers,
  • hence a need for a language-independent knowledge
    representation and management formalism
  • Principle of proposed solution
  • Start from UNL graphs augmented with
    voice-oriented semantic marks (special UWs,
    attributes),
  • Generate in the target language,
  • Voice-oriented marks become prosodic markers,
  • Final conversion to VoiceXML
  • 2008 Beijing Olympics as a field trial
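
The last two steps of the proposed solution (voice-oriented marks become prosodic markers, then conversion to VoiceXML) can be sketched as below. This is an assumption-laden illustration, not the VoiceUNL design: it supposes that target-language generation has already produced words that still carry their UNL-style attributes, and it maps an `emphasis` mark onto the `<emphasis>` element that VoiceXML 2.0 prompts accept. The function name and the input format are hypothetical.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialise the VoiceXML namespace as the default namespace.
ET.register_namespace("", VXML_NS)


def to_voicexml(words, lang):
    """Build a one-prompt VoiceXML document.

    `words` is a list of (surface form, set of attribute names) pairs,
    already in target-language order.
    """
    vxml = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, f"{{{VXML_NS}}}form")
    block = ET.SubElement(form, f"{{{VXML_NS}}}block")
    prompt = ET.SubElement(block, f"{{{VXML_NS}}}prompt", {f"{{{XML_NS}}}lang": lang})

    last = None  # most recent <emphasis> child, if any
    for surface, attributes in words:
        if "emphasis" in attributes:
            # The voice-oriented mark becomes a prosodic marker.
            last = ET.SubElement(prompt, f"{{{VXML_NS}}}emphasis")
            last.text = surface
            last.tail = " "
        elif last is None:
            prompt.text = (prompt.text or "") + surface + " "
        else:
            last.tail = (last.tail or "") + surface + " "
    return ET.tostring(vxml, encoding="unicode")


# Hypothetical output of French generation, with one emphasised word.
document = to_voicexml(
    [("La", set()), ("cérémonie", {"emphasis"}), ("commence", set())],
    "fr-FR",
)
```

Because the marks travel with the words through generation, the same source graph could in principle yield a differently ordered but equally annotated prompt in another target language.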

20
What is VoiceXML ?
  • A W3C (World Wide Web Consortium) recommendation
  • An XML-based markup language for vocal
    information servers,
  • A set of standardised markup tags,
  • Current tags cover language identification,
    voice prompting, speech synthesis, form filling,
    barge-in, echo cancelling, ...
  • No provision to access a semantically encoded
    data base,
  • Need for a UNL-type front-end
  • Compatibility with MPEG4-SNHC (talking head)

21
Applications
22
Prosodic information in UNL
  • Attributes that can influence the grammatical and
    the prosodic structure of a sentence already
    exist
  • @emphasis
  • @qfocus
  • Representations should be defined, concerning:
  • Emotion: @angry, @bored, @relaxed?
  • Focus: grouping words to emphasize in a scope?
  • Passivity: @passive?
  • Speaker: @age, @sex, special UWs for voice
    characteristics?
  • Expression (for face and gesture animation):
    special UWs/constructs?

23
Conclusions and Perspectives
  • Demonstrations to be prepared within the
    LingTour, Normalangue and KNOWLISTICS projects
  • First target is the Beijing 2008 Olympics
  • Some concept-oriented formalism (such as Sowa's
    conceptual graphs) may be used to store knowledge
    before building the "interlingual, prelinguistic,
    communicative content" in UNL

24
Conclusions and Perspectives
  • UNL representation of meaning of natural language
    sentences directly available for retrieval,
    indexing and knowledge extraction.
  • UNL with multimedia contents (text, speech,
    image, video) and multimodal access (text,
    speech, visual I/O) to enrich the service for
    communication.
  • Comprehensive and extensive information service
    on PDAs with access to UMTS and wireless LAN.