Title: An Overview of QuickSet, from OGI
1. An Overview of QuickSet, from OGI
- Cohen, P. R., Johnston, M., McGee, D., Oviatt, S., Pittman, J., Smith, I., Chen, L., and Clow, J. (1997). QuickSet: Multimodal Interaction for Distributed Applications. Proceedings of the Fifth International Multimedia Conference (Multimedia '97), ACM Press, pp. 31-40.
- Pittman, J., Smith, I., Cohen, P., Oviatt, S., and Yang, T. (1996). QuickSet: A Multimodal Interface for Military Simulation. Proceedings of the 6th Conference on Computer-Generated Forces and Behavioral Representation, University of Central Florida, pp. 217-224.
- Pittman, J. (1991). Recognizing Handwritten Text. Proceedings of CHI 1991, Human Factors in Computing Systems, ACM/SIGCHI, NY, pp. 271-275.
- Oviatt, S. L., Cohen, P. R., Wu, L., Vergo, J., Duncan, L., Suhm, B., Bers, J., Holzman, T., Winograd, T., Landay, J., Larson, J., and Ferro, D. (2000). Designing the User Interface for Multimodal Speech and Pen-based Gesture Applications: State-of-the-Art Systems and Future Research Directions. Human Computer Interaction, vol. 15, no. 4, pp. 263-322.
2. What is QuickSet?
A pen-and-voice front-end interface, scalable from a hand-held to a wall-sized format
3. Used for Map-Based Applications
The user sketches on top of an existing map, e.g., to drive a training simulator for the U.S. Marines
4. Medical Informatics
Find a doctor within a sketched region
5. Open Agent Architecture
Uses the Interagent Communication Language (ICL)
6. Some Symbols Used for Pen Gestures
7. How is this different from handwriting recognition?
Sketched Routes
Sketched Regions
Other Sketched Symbols
8. Two Recognizers for Pen Gestures
- Neural Network
  - Pre-processing
    - size normalized, centered in a 2D image
  - pixels are fed into the NN
- Hidden Markov Model
  - Pre-processing
    - smoothed, re-sampled, converted to deltas
- Combine probability estimates from the two recognizers to compute a probability for each possible gesture
- Recognized 68 pen gestures (1997) and 190 gestures (2000)
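
The slide names the HMM pre-processing steps (smooth, re-sample, convert to deltas) and says the two recognizers' estimates are combined, but it does not give the formulas. The following is a minimal sketch under stated assumptions: a 3-point moving average for smoothing, even arc-length re-sampling, and a normalized product as the combination rule. All function names, the `floor` parameter, and the choice of combination rule are illustrative, not taken from QuickSet.

```python
import math


def smooth(points):
    """3-point moving average; endpoints are kept as-is (assumed smoothing scheme)."""
    if len(points) < 3:
        return list(points)
    out = [points[0]]
    for a, b, c in zip(points, points[1:], points[2:]):
        out.append(((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3))
    out.append(points[-1])
    return out


def resample(points, n):
    """Re-sample a stroke to n points evenly spaced along its arc length."""
    if len(points) < 2 or n < 2:
        return list(points)
    # Cumulative arc length at each input point.
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    if total == 0.0:
        return [points[0]] * n
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while j < len(points) - 2 and dists[j + 1] < target:
            j += 1
        seg = dists[j + 1] - dists[j]
        t = 0.0 if seg == 0.0 else (target - dists[j]) / seg
        x = points[j][0] + t * (points[j + 1][0] - points[j][0])
        y = points[j][1] + t * (points[j + 1][1] - points[j][1])
        out.append((x, y))
    return out


def to_deltas(points):
    """Convert absolute positions to point-to-point differences."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


def combine_estimates(nn_probs, hmm_probs, floor=1e-6):
    """Normalized product of two per-gesture estimates (assumed combination rule).

    A gesture missing from one recognizer's output gets the small `floor` value.
    """
    labels = set(nn_probs) | set(hmm_probs)
    scores = {g: nn_probs.get(g, floor) * hmm_probs.get(g, floor) for g in labels}
    z = sum(scores.values())
    return {g: s / z for g, s in scores.items()}


# Hypothetical stroke and recognizer outputs, for illustration only.
stroke = [(0.0, 0.0), (1.0, 0.2), (2.0, -0.1), (3.0, 0.0)]
features = to_deltas(resample(smooth(stroke), 8))
combined = combine_estimates({"route": 0.4, "area": 0.6}, {"route": 0.8, "area": 0.2})
```

With a product rule, a gesture must score reasonably well in both recognizers to rank highly. A weighted average would be a more forgiving alternative.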
9. From Pittman's 1991 paper on handwriting recognition (preliminary work on the neural net only)
- Standard back-propagation network with 2 hidden layers
- His conclusion: architecture doesn't matter; size of the training set does matter
- Collecting and labeling a training corpus of 10,000-20,000 letters
- Pre-processing: use the center point of the character for normalization and re-sizing
- Using context (adjacent characters) to help recognition (no help for symbol recognition in the map)
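
The neural-net pre-processing (center the character, normalize its size, feed pixels to the network) can be sketched as follows. This is an illustration under assumptions, not Pittman's procedure: the center point is taken to be the bounding-box center, the grid size of 16 is arbitrary, and only the sampled pen points are plotted (no line drawing between them). The names `rasterize` and `to_input_vector` are hypothetical.

```python
def rasterize(points, size=16):
    """Center a stroke on its bounding-box center and scale it into a size x size grid."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    # Use the larger dimension so the aspect ratio is preserved.
    extent = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    scale = (size - 1) / extent
    half = (size - 1) / 2
    grid = [[0] * size for _ in range(size)]
    for x, y in points:
        col = round((x - cx) * scale + half)
        row = round((y - cy) * scale + half)
        # Clamp to guard against floating-point rounding at the edges.
        col = min(size - 1, max(0, col))
        row = min(size - 1, max(0, row))
        grid[row][col] = 1
    return grid


def to_input_vector(grid):
    """Flatten the grid row by row into the pixel vector fed to the network."""
    return [pixel for row in grid for pixel in row]


# Hypothetical stroke: a diagonal from (0, 0) to (10, 10).
vector = to_input_vector(rasterize([(0, 0), (5, 5), (10, 10)], size=5))
```

Because the stroke is rescaled to fill the grid, the same shape drawn small or large produces the same input vector, which is the point of size normalization.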
10. Example: Probabilities are estimated for each gesture
The route is misrecognized as an area
11. Typed Feature Structure Unification
From Speech Recognizer
From Pen Gesture Recognizer
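
Unification merges the partial structure from the speech recognizer with the partial structure from the pen gesture recognizer, and fails when the two conflict. The sketch below is a simplification: feature structures are plain nested dictionaries, types are atomic values that must match exactly (there is no type hierarchy), and the spoken command, gesture hypotheses, and scores are invented for illustration. The names `unify` and `integrate` are hypothetical.

```python
class UnificationFailure(Exception):
    """Raised when two feature structures carry conflicting values."""


def unify(a, b):
    """Recursively merge two feature structures; atomic values must be equal."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            result[key] = unify(a[key], b_val) if key in a else b_val
        return result
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} does not unify with {b!r}")


def integrate(speech, gesture_nbest):
    """Return the highest-scored gesture hypothesis that unifies with the speech."""
    for _score, gesture in sorted(gesture_nbest, key=lambda h: h[0], reverse=True):
        try:
            return unify(speech, gesture)
        except UnificationFailure:
            continue
    return None


# Hypothetical spoken command: it needs a line-shaped location from the pen.
speech = {
    "type": "create_line",
    "object": {"type": "barbed_wire"},
    "location": {"type": "line"},
}

# Hypothetical gesture n-best list in which the top hypothesis is wrong.
gesture_nbest = [
    (0.6, {"location": {"type": "area", "coords": [(0, 0), (4, 0), (4, 4)]}}),
    (0.4, {"location": {"type": "line", "coords": [(0, 0), (5, 5)]}}),
]

command = integrate(speech, gesture_nbest)
```

In this toy example the area hypothesis fails to unify with the spoken request for a line, so the lower-scored line hypothesis is used and its coordinates fill in the location.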
12. The Take-away
- Multi-agent architecture
- Pen gesture recognizer
- Pre-processing
- Neural Net might work fine for our purposes
- Doesn't help with telling an object/route from a symbol
- Find out more about the HMM
- Semantic representation