A Presentation Manager Developed with the Communications-Oriented Programming and Routing Environment (CORE)
- Larry Rudolph
- Oxygen Research Group
- Laboratory for Computer Science
Goals
- Integrate Many Oxygen Technologies
- Application driven, one that:
- We understand
- We personally use often
- Should be more human-centric
- Develop Architectural Infrastructure
- Exposes new requirements
Application Scenario
Integration of Technologies
- Speech
- Vision
- Handhelds (H21)
- Search Engine
- Location Manager
- Intentional Names
- Ad-hoc networks
- SFS File System
- Microphone Array
- Camera Array
- Projector Array
- Meta-Glue
- Collaboration
Used in non-standard, challenging ways
Vision / Gesture Recognition
- Laser Pointer
- Great for drawing attention to content
- Audience is primary consumer
- Secondary use to control presentation
- But it is not a mouse
- Semantics are tied to slide context
- Differs from Intelligent-room use
- Small number of identified gestures
- Gestures easily punctuated
- Low computational overhead
- Soon will be handled with an H21 (a gesture-dispatch sketch follows this list)
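As a rough illustration of gesture semantics tied to slide context, here is a minimal Python sketch; the gesture names, slide kinds, and actions are assumptions for illustration, not the system's actual vocabulary.

```python
from typing import Optional

# A minimal sketch of slide-context gesture dispatch, assuming a small,
# easily punctuated gesture vocabulary. Gesture and action names are
# illustrative, not the actual system's.
SLIDE_GESTURE_MAP = {
    "slide-with-video": {"circle": "play_video", "flick_right": "next_slide"},
    "plain-slide":      {"flick_right": "next_slide", "flick_left": "prev_slide"},
}

def dispatch_gesture(slide_kind: str, gesture: str) -> Optional[str]:
    """Map a recognized laser gesture to an action for the current slide.

    Unrecognized gestures are simply ignored: the laser's primary job is
    drawing the audience's attention; control is only a secondary use.
    """
    return SLIDE_GESTURE_MAP.get(slide_kind, {}).get(gesture)

if __name__ == "__main__":
    print(dispatch_gesture("slide-with-video", "circle"))  # play_video
    print(dispatch_gesture("plain-slide", "circle"))       # None -> ignored
```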
Speech Recognition
- Galaxy is geared towards Dialog
- Dialog does not suit a presentation
- A prompt is an alienating distraction
- Navigation commands primarily for audience
- Different Use of Galaxy
- No need for audio feedback
- There is natural feedback.
- No false positives (a thresholding sketch follows this list)
- For dialog, it is better to guess than to ignore
- For us, an incorrect guess has a high cost
- Most words are not relevant to the speech system
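A minimal sketch of the no-false-positives policy, assuming the recognizer reports a confidence score per hypothesis; the threshold, command set, and function are illustrative assumptions rather than Galaxy's actual interface.

```python
from typing import Optional

# Accept an utterance only if it is a known command heard with high
# confidence; otherwise ignore it, since most speech is for the audience.
COMMANDS = {"next slide", "previous slide", "play video"}
CONFIDENCE_THRESHOLD = 0.9   # deliberately high: a wrong guess disrupts the talk

def gate_command(hypothesis: str, confidence: float) -> Optional[str]:
    phrase = hypothesis.lower().strip()
    if phrase in COMMANDS and confidence >= CONFIDENCE_THRESHOLD:
        return phrase
    return None

# A low-confidence "next slide" is ignored rather than guessed at.
assert gate_command("Next slide", 0.95) == "next slide"
assert gate_command("next slide", 0.60) is None
```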
Slide Tracking
- Vary sensitivity, e.g. "Next slide" (sketched after this list)
- Less likely to be meaningful at the start of a slide
- More likely after many words have been said
- Play video only relevant on slides with video
- Recognizer will follow along
- Keep track of what has been said
- Slide-dependent recognition (one domain per slide)
- Multi-level Commands
- Individual Slide Navigation
- Presentation Manager
- Command Manager
- Implemented via multiple recognizers
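A rough sketch of varying command sensitivity by slide context, assuming a per-slide description and a simple word-count heuristic; the field names and weights are assumptions, not the recognizer's real model.

```python
from dataclasses import dataclass

@dataclass
class SlideContext:
    has_video: bool
    expected_word_count: int   # rough length of the spoken commentary
    words_spoken: int = 0      # updated as the recognizer follows along

def command_weight(command: str, ctx: SlideContext) -> float:
    """Scale how willing we are to accept a command on this slide."""
    if command == "play video":
        return 1.0 if ctx.has_video else 0.0        # irrelevant without video
    if command == "next slide":
        # Unlikely right after the slide appears, more likely once most
        # of the slide's material has been said.
        return min(1.0, ctx.words_spoken / ctx.expected_word_count)
    return 0.5                                      # neutral default

ctx = SlideContext(has_video=False, expected_word_count=120, words_spoken=10)
print(command_weight("next slide", ctx))   # low: the slide just started
```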
Three Output Modes
- Speaker view (notes)
- Projection
- Also gets archived
- Just slides shown
- Associate audio/laser/questions with each slide (a record sketch follows this slide)
- Merge personal notes with slides
- Collaboration
- Projector
- Multiple projectors (no shadows!), coming this summer
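A hypothetical shape for the per-slide archive record implied above; the field names are assumptions based on the bullets (audio, laser strokes, questions, notes), not the actual archive format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SlideArchive:
    slide_id: int
    audio_segments: List[str] = field(default_factory=list)   # recorded audio, e.g. file paths
    laser_strokes: List[List[Tuple[float, float, float]]] = field(default_factory=list)  # (x, y, t) points
    questions: List[str] = field(default_factory=list)        # transcribed audience questions
    speaker_notes: str = ""                                    # merged in later

archive = [SlideArchive(slide_id=i) for i in range(1, 25)]
archive[0].questions.append("How is the laser pointer tracked?")
```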
System Architecture Requirements
- Something simple and composable
- Communications-oriented
- Dynamic, rule-based
- Can add commands during run time
- Using ordinary speech
- Compatible with other Oxygen Techs.
- Easy to debug (even by naïve users)
CORE: Communication-Oriented Routing Environment
Larry Bear
CORE Essentials
- Nodes are specified via INS (a matching sketch follows this slide), e.g.:
- Cam: [device=web-cam][location=518]
- PTRvision: [device=process][OS=Linux][File=LaserVision, ...]
[Diagram: Laser and Vision nodes registered with CORE]
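A minimal sketch of INS-style attribute/value specifiers in the spirit of the specs above; representing them as Python dicts and the matches() helper are assumptions, not INS's or CORE's actual interface.

```python
# Node specs as attribute/value pairs, illustrative only.
Cam = {"device": "web-cam", "location": "518"}
PTRvision = {"device": "process", "OS": "Linux", "File": "LaserVision"}

def matches(request: dict, advertisement: dict) -> bool:
    """An advertised node satisfies a request if it carries every
    requested attribute with the requested value."""
    return all(advertisement.get(k) == v for k, v in request.items())

# Any web camera in room 518 satisfies the Cam request.
print(matches({"device": "web-cam"}, Cam))       # True
print(matches({"device": "microphone"}, Cam))    # False
```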
CORE Essentials
- Links are specified with nicknames (a sketch follows this slide), e.g.:
- L_camera,vision = (Cam, PTRvision)
[Diagram: Slide Speech, Presentation Speech, Command Speech, Laser, and Vision nodes connected to CORE]
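A small sketch of a nicknamed link tying two node specs together, mirroring L_camera,vision = (Cam, PTRvision) above; the Link class and its fields are assumptions, not CORE's real API.

```python
from dataclasses import dataclass

@dataclass
class Link:
    nickname: str
    source: dict     # spec of the node that produces messages
    sink: dict       # spec of the node that consumes them

Cam = {"device": "web-cam", "location": "518"}
PTRvision = {"device": "process", "OS": "Linux", "File": "LaserVision"}

L_camera_vision = Link("camera,vision", Cam, PTRvision)
```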
CORE Essentials
- Messages flow over the links
[Diagram: a "Next Slide!" message travels from a speech node toward CORE]
CORE Essentials
- Messages flow over the links
[Diagram: CORE forwards the "Next Slide!" message over the outgoing links]
CORE Essentials
- Messages flow over the links
[Diagram: the command arrives at the receiving nodes as "ADVANCE"]
CORE Essentials
- How do we change the output for questions?
[Diagram: the same configuration of speech, laser, and vision nodes around CORE]
CORE Essentials
[Diagram: a "Question?" message flows from a speech node into CORE]
CORE Essentials
- RULES are (trigger, action) pairs (a sketch follows this slide), e.g.:
- ( MESS=Question , L_slide,lcd → L_slide,qlcd )
[Diagram: a "Questions" message triggers the rule, rerouting slide output to the question display]
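A rough sketch of a (trigger, action) rule that reroutes a link when a Question message arrives, in the spirit of the rule above; all names and the dispatch loop are illustrative assumptions, not CORE's actual rule syntax.

```python
links = {("slide", "lcd")}                 # currently active links

def reroute_for_question() -> None:
    """Action for ( MESS=Question, L_slide,lcd -> L_slide,qlcd ): swap the
    plain projector link for the question-display link."""
    links.discard(("slide", "lcd"))
    links.add(("slide", "qlcd"))

RULES = [("Question", reroute_for_question)]   # (trigger, action) pairs

def deliver(message: str) -> None:
    """Fire every rule whose trigger matches an incoming message."""
    for trigger, action in RULES:
        if message == trigger:
            action()

deliver("Question")
print(links)   # {('slide', 'qlcd')}
```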
Deep Issue 1: Extract I/O Validation
- With numerous input modes, how do we ensure valid input?
- Pull the input specification out of the application
- WEB forms
- Galaxy grammar
- Vision ?
- General form?
- Similar issues for multi-modal output
Deep Issue 2: Extract Connectivity
- How does stuff get connected?
- Automatically
- Within code via URLs or URNs
- By agents
- By some higher-level program
- What is the right language style?
- Where to draw the line?
Deep Issue 3: Debugging
- Eternal Applications
- Environment is constantly changing
- Transient errors are the hardest to find
- Trace logs are not sufficient
- Run backwards as well as forwards (a logging sketch follows this list)
- Modify something and continue
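Reversible CORE appears under Status rather than being described here; purely as an illustration of the idea, this sketch logs every routed message so a run can be replayed forwards or inspected backwards past a transient error. Everything in it is an assumption.

```python
import time
from typing import Any, List, Tuple

LOG: List[Tuple[float, str, Any]] = []

def route(link: str, message: Any) -> None:
    """Record the message before delivering it over the link."""
    LOG.append((time.time(), link, message))
    # ... actual delivery over the link would happen here ...

def history(last_n: int = 10) -> List[Tuple[float, str, Any]]:
    """Step 'backwards': inspect the most recently routed messages."""
    return LOG[-last_n:]
```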
Status
- In Progress
- Reversible CORE
- Slide Tracking
- Dynamic Domains
- Output Modes
- Full Archive
- Fault Tolerance
- Optimizations
- Working
- CORE system
- Multiple Speech Recognizers
- Laser Tracking
- H21 Control
- Dynamic Reconfiguration
Conclusions
- Application interesting on its own
- Speakers → content delivery and control
- Students → archiving and questioning
- CORE language interesting
- Simple language
- Add rules via speech, vision, touch
- Not an all-Java, self-contained world
- Make progress by going backwards
- undo the damage caused by technology!