CONFUCIUS: an Intelligent MultiMedia storytelling interpretation & presentation system

1
CONFUCIUS: an Intelligent MultiMedia
storytelling interpretation & presentation system

Minhua Eunice Ma
Supervisor: Prof. Paul Mc Kevitt
School of Computing and Intelligent Systems
Faculty of Informatics
University of Ulster, Magee

2
Objectives of CONFUCIUS

To interpret natural language story and movie
(drama) script input and to extract conceptual
semantics from the natural language
To generate 3D animation and virtual worlds
automatically, with speech and non-speech audio
To integrate the above components to form an
intelligent multimedia storytelling system for
presenting multimodal stories

3
CONFUCIUS context diagram
4
Literature review
5
Previous systems

Schank's CD (Conceptual Dependency) Theory (1972)
Primitive scripts
SAM & PAM
Automatic Text-to-Graphics Systems
WordsEye (Coyne & Sproat, 2001)
Micons and CD-based language animation
(Narayanan et al., 1995)
Spoken Image (Ó Nualláin & Smith, 1994) and its
successor SONAS (Kelleher et al., 2000)

MultiModal interactive storytelling
AesopWorld
KidsRoom
Larsen & Petersen's Interactive Storytelling
Oz
Computer games

Embodied intelligent agents
Divergence on agents' behavior production
BEAT (Cassell et al., 2000)
Gandalf
PPP persona

7
Architecture of CONFUCIUS
Natural language stories
Script writer
Script parser
Prefabricated objects (knowledge base)
Lexicon, grammar, etc.
Natural Language Processing
Text To Speech
Sound effects
Language knowledge
Semantic representations
3D authoring tools
Mapping
Animation generation
Visual knowledge (3D graphic library)
Synchronizing & fusion
3D world with audio in VRML
8
MultiModal semantic representation
Multimodal semantics
High-level multimodal semantic representation (XML/frame-based)
Media-independent representation
Visual media-dependent representation
Intermediate level
Audio media-dependent representation
Non-speech audio modality
Language modality
Visual modality
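As a rough illustration of what an XML/frame-based, media-independent representation might look like, the sketch below builds one event frame and derives a stub for each of the three modalities named on this slide. The element names (`event`, `agent`, `theme`) and both helper functions are hypothetical; the slide does not give the actual CONFUCIUS schema.

```python
import xml.etree.ElementTree as ET


def build_event_frame(predicate, agent, theme=None):
    """Build a media-independent event frame (hypothetical schema)."""
    frame = ET.Element("event", {"predicate": predicate})
    ET.SubElement(frame, "agent").text = agent
    if theme is not None:
        ET.SubElement(frame, "theme").text = theme
    return frame


def to_modalities(frame):
    """Derive one placeholder instruction per modality from a single frame."""
    predicate = frame.get("predicate")
    agent = frame.findtext("agent")
    return {
        "visual": f"animate {agent}: {predicate}",
        "language": f"narrate: {agent} {predicate}",
        "non_speech_audio": f"sound effect for: {predicate}",
    }


frame = build_event_frame("kick", "John", "ball")
xml_text = ET.tostring(frame, encoding="unicode")
modalities = to_modalities(frame)
print(xml_text)
print(modalities)
```

The point of the sketch is only the layering: one shared frame at the top, with media-dependent forms derived from it further down.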
9
Knowledge base of CONFUCIUS
knowledge base
Language knowledge
Semantic knowledge: lexicons (e.g. WordNet)
Syntactic knowledge: grammars
Statistical models of language
Associations between words
Visual knowledge
Object model (nouns)
Functional information
Internal coordinate axes (for spatial reasoning)
Associations between objects
Event model (event verbs; describes the motion of objects)
World knowledge
Spatial qualitative reasoning knowledge
10
Categories of events

Atomic entities
Change physical location such as position and
orientation, e.g. bounce, turn
Change intrinsic attributes such as shape, size,
color, and texture, e.g. bend, and even
visibility, e.g. disappear, fade (in/out)
Non-atomic entities
Non-character events
Two or more individual objects fuse together,
e.g. melt (in)
One object divides into two or more individual
parts, e.g. break (into pieces)
Change sub-components (their position, size,
color), e.g. blossom
Environment events (weather verbs), e.g. snow,
rain
Character events
Action verbs
Intransitive verbs
Transitive verbs
Non-action verbs (stative, emotion, possession,
mental activities, cognition & perception)
Idioms & metaphor verbs

11
Categories of action verbs

Intransitive verbs
Biped kinematics, e.g. walk, swim, and other
motion models like fly
Facial expressions, e.g. laugh, anger
Lip movement, e.g. speak, say
Transitive verbs
single object, e.g. throw, push, kick
multiple objects
direct and indirect objects, e.g. give, pass,
show
indirect object & the instrument, e.g. cut,
hammer
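The action-verb categories above can be held in a small lookup table. The sketch below is only an illustration: the enum, the table, and the function names are hypothetical, and the table contains only the example verbs listed on this slide.

```python
from enum import Enum


class VerbCategory(Enum):
    BIPED_KINEMATICS = "intransitive: biped kinematics"
    OTHER_MOTION = "intransitive: other motion models"
    FACIAL_EXPRESSION = "intransitive: facial expression"
    LIP_MOVEMENT = "intransitive: lip movement"
    SINGLE_OBJECT = "transitive: single object"
    DIRECT_AND_INDIRECT = "transitive: direct and indirect objects"
    INSTRUMENT = "transitive: instrument"


# Example verbs copied from the slide; a real lexicon would be far larger.
EXAMPLES = {
    VerbCategory.BIPED_KINEMATICS: ("walk", "swim"),
    VerbCategory.OTHER_MOTION: ("fly",),
    VerbCategory.FACIAL_EXPRESSION: ("laugh", "anger"),
    VerbCategory.LIP_MOVEMENT: ("speak", "say"),
    VerbCategory.SINGLE_OBJECT: ("throw", "push", "kick"),
    VerbCategory.DIRECT_AND_INDIRECT: ("give", "pass", "show"),
    VerbCategory.INSTRUMENT: ("cut", "hammer"),
}

VERB_TO_CATEGORY = {
    verb: category for category, verbs in EXAMPLES.items() for verb in verbs
}


def categorize(verb):
    """Return the category of a known verb, or None if it is not listed."""
    return VERB_TO_CATEGORY.get(verb.lower())


def is_transitive(category):
    """True for the categories that take at least one object."""
    return category.value.startswith("transitive")


print(categorize("kick"))
print(categorize("Walk"))
```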

12
Basic predicate-arguments

1) move(obj, xInc, yInc, zInc)
2) moveTo(obj, loc)
3) moveToward(obj,loc,displacement)
4) rotate(obj,xAngle,yAngle,zAngle)
5) faceTo(obj1, obj2)
6) alignMiddle(obj1, obj2, axis)
7) alignMax(obj1, obj2, axis)

8) alignMin(obj1, obj2, axis)
9) alignTouch(obj1, obj2, axis)
10) touch(obj1, obj2, axis)
11) scale(obj, rate)
12) squash(obj, rate, axis)
13) group(x, y_, newObj)
14) ungroup(xyList, x, yList)
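To make a few of these predicates concrete, here is a minimal Python sketch of `move`, `moveTo`, `moveToward`, `rotate`, `scale`, and `squash` acting on a toy object. The `Obj3D` class and the exact semantics (angles in degrees, `axis` as an index 0/1/2, `moveToward` stopping at the target) are assumptions made for illustration; the slide gives only the predicate signatures.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Obj3D:
    """Toy scene object (hypothetical; not the CONFUCIUS object model)."""
    name: str
    position: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    rotation: list = field(default_factory=lambda: [0.0, 0.0, 0.0])  # degrees
    size: list = field(default_factory=lambda: [1.0, 1.0, 1.0])


def move(obj, x_inc, y_inc, z_inc):
    """Translate the object by an increment on each axis."""
    obj.position = [p + d for p, d in zip(obj.position, (x_inc, y_inc, z_inc))]


def move_to(obj, loc):
    """Place the object at an absolute location."""
    obj.position = [float(c) for c in loc]


def move_toward(obj, loc, displacement):
    """Step `displacement` units toward `loc`, stopping once it is reached."""
    delta = [t - p for p, t in zip(obj.position, loc)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist <= displacement:
        move_to(obj, loc)
        return
    obj.position = [p + d / dist * displacement
                    for p, d in zip(obj.position, delta)]


def rotate(obj, x_angle, y_angle, z_angle):
    """Add to the current rotation, keeping each angle in [0, 360)."""
    obj.rotation = [(r + a) % 360
                    for r, a in zip(obj.rotation, (x_angle, y_angle, z_angle))]


def scale(obj, rate):
    """Scale uniformly on all axes."""
    obj.size = [s * rate for s in obj.size]


def squash(obj, rate, axis):
    """Scale along a single axis (0 = x, 1 = y, 2 = z)."""
    obj.size[axis] *= rate


ball = Obj3D("ball")
move(ball, 1, 2, 3)
move_toward(ball, [1, 2, 13], 4)
rotate(ball, 0, 370, 0)
scale(ball, 2)
squash(ball, 0.5, 1)
print(ball)
```

The alignment and grouping predicates are left out because they need bounding-box and scene-graph information that the slide does not specify.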
13
Visual definition & word sense
polysemy
verb
word sense
visual definition entry
mapping
synonymy

a normal door (rotation on y axis)
a sliding door (moving on x axis)
a rolling shutter door (a combination of rotation
on x axis and moving on y axis)

Example: close (a door)
word sense -- minimal complete unit of meaning in
the language modality
visual definition entry -- minimal complete unit
of meaning in the visual modality
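The door example suggests a table keyed by word sense and object type, where one word sense maps to several visual definition entries. The sketch below is one possible encoding; the dictionary layout, the `(primitive, axis)` tuples, and the synonym entry `shut` are illustrative assumptions, not the system's actual data.

```python
# One word sense ("close") maps to a different visual definition entry
# for each kind of door, following the three cases listed on the slide.
VISUAL_DEFINITIONS = {
    ("close", "normal door"): [("rotate", "y")],
    ("close", "sliding door"): [("move", "x")],
    ("close", "rolling shutter door"): [("rotate", "x"), ("move", "y")],
}

# Synonymy: several verbs can share one word sense (hypothetical entry).
SYNONYMS = {"shut": "close"}


def visual_definition(verb, object_type):
    """Resolve a verb to its word sense, then to a visual definition entry."""
    sense = SYNONYMS.get(verb, verb)
    return VISUAL_DEFINITIONS[(sense, object_type)]


print(visual_definition("close", "sliding door"))
print(visual_definition("shut", "rolling shutter door"))
```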
14
Implementation: semantics → VRML
Example: A ball is bouncing
bounce(ball) :- moveTo(ball, 0,0,0),
moveTo(ball, 0,20,0)L.
(a) visual definition of bounce
DEF ball Transform {
  translation 0 0 0
  children [ Shape {
    appearance Appearance { material Material {} }
    geometry Sphere { radius 5 }
  } ]
}
(b) VRML code of a static ball
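One way the static ball could be turned into an animated one is to emit a VRML 97 `TimeSensor` and `PositionInterpolator` and route them to the ball's translation. The Python sketch below does this for the two `moveTo` positions in the bounce definition. The function name, the node names `clock` and `path`, the two-second cycle, and the added return to the start position are assumptions; the slides say only that Java code modifies the VRML.

```python
def bounce_to_vrml(name, waypoints, cycle_seconds=2.0, radius=5):
    """Emit VRML 97 text that loops an object through the given waypoints."""
    # Assumption: return to the first waypoint so the loop has no jump.
    points = list(waypoints) + [waypoints[0]]
    n = len(points)
    keys = ", ".join(f"{i / (n - 1):g}" for i in range(n))
    values = ", ".join(" ".join(f"{c:g}" for c in p) for p in points)
    start = " ".join(f"{c:g}" for c in points[0])
    return "\n".join([
        "#VRML V2.0 utf8",
        f"DEF {name} Transform {{",
        f"  translation {start}",
        "  children [ Shape {",
        "    appearance Appearance { material Material {} }",
        f"    geometry Sphere {{ radius {radius:g} }}",
        "  } ]",
        "}",
        f"DEF clock TimeSensor {{ cycleInterval {cycle_seconds:g} loop TRUE }}",
        f"DEF path PositionInterpolator {{ key [ {keys} ] keyValue [ {values} ] }}",
        "ROUTE clock.fraction_changed TO path.set_fraction",
        f"ROUTE path.value_changed TO {name}.set_translation",
    ])


vrml = bounce_to_vrml("ball", [(0, 0, 0), (0, 20, 0)])
print(vrml)
```

A `PositionInterpolator` moves linearly between keyframes, so this gives an up-and-down motion rather than a physically accurate bounce.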
15
Comparison of intelligent multimedia systems
16
Software Analysis

Java programming language
parsing intermediate representation
changing VRML code to create/modify animation
integrating modules
Natural language processing tools
GATE (pre-processing)
PC-PARSE (morphological and syntactic analysis)
WordNet (lexicon, semantic inference)
3D graphic modelling
existing 3D models on the Internet
3D Studio Max (props & stage)
VRML (Virtual Reality Modelling Language) 97,
H-anim 2001 spec.
The actors: using embodied agents
Microsoft Agent (the narrator and minor actors)
Character Studio, Internet Character Animator
(protagonists)

17
Reuse NLP toolkits
GATE 2.0
PC-PARSE
FEATURES
Semantic inference
WordNet 1.6
18
Contribution & prospective applications

Contribution
multimodal semantic representation of natural
language
automatic animation generation
multimodal fusion and coordination

Prospective practical applications
Children's education
Multimedia presentation,
Movie/drama production,
Script writing,
Computer games,
Virtual Reality

19
Conclusion

The objectives of CONFUCIUS address challenging
problems in language visualisation:
formalizing the meaning of action verbs and states
mapping language primitives to visual
primitives
a reusable common-sense knowledge base for
other systems
sophisticated spatial and temporal reasoning
representing stories by temporal multimedia
requires significant coordination

20
Project schedule
