Title: Intelligent Flight Control
1Sub Vocal Speech Recognition Control
Chuck Jorgensen
Neuro Engineering Laboratory
NASA Ames Research Center
June 29, 2004
2Overview
- Neuro electronic interfacing with machines
- Sub Vocal Speech
- Background and method
- Recent results (short demonstration video)
- Future directions
3 EMG Neuro Flight Control
Dry Electrode Arm Band
F-15 Active No Joystick Control
747-400 Flight Director Console
737 Bioelectric Landing
4EMG Virtual Keypad
POC - Kevin Wheeler - ARC
5Sub Vocal Speech Recognition
Sub Vocal Speech Recognition: the use of
non-audible electromyographic signals from the
surface of the larynx and lingual areas of the
throat to control devices and communicate
silently
6Previous Research
- Licklider and Engelbart, ARPA HCI/BCI Interface Program with SRI, 1968-1974. Noted connection between EEG and EMG activity.
- Partridge et al., Speech Through Myo Electric Signal Recognition, Auburn University, 1990. Proposed using EMG for speech recognition.
- Kingsbury, Dual Tree Complex Wavelet Transforms (Cambridge), 1997. Developed minimal shift invariant form of wavelet transform.
- Englehart, Hudgins, Parker, and Stevenson, Classification of Myoelectric Signals using Time-Frequency Based Representations, 1999. Proposed using wavelets as features for EMG understanding.
- Chan, Englehart, Hudgins, and Lovely, Myoelectric Signals to Augment Speech Recognition, Medical and Biological Engineering and Computing, vol. 39(4), pp. 500-506, 1999. Demonstrated EMG signals for speech recognition augmentation for aircraft pilot masks.
7Sub Vocal Speech Application
SVS Technology
- Augment Vocal Speech Recognition: high noise environments, occluded lip movement, low pressure suits, breathing gas mixtures, micro gravity
- Private Communication: data base query, silent control, security, military commo
- Speech Information Enrichment: emotion identification, workload, safety, fatigue
- Medical Monitoring: stress, illness, drug interactions, biometrics
- Device Control / Design Enhancement: vehicle control, computer input, multi-modal interfaces
8Sub Acoustic Speech Method
- Data acquisition: EMG data sampled and filtered
- Signal segmentation
- Feature transform: FFT, DTWT, LPC, others
- Classifiers / models: SCGNN, SVM, HMM, others
- Categorization, generation, action, or voice coding
- Recognition, playback, and control
9Signal Acquisition
Signal Acquisition Segmentation
Signal capture of Sub-vocal Alpha
Electrode Placement
Ground
10Non-contact Sensor Development
Latest EMG Sensor
Design Goals
Eight-channel EEG plus EOG and QUASAR Sensor
Recordings
- Near-term
  - Refine non-contact technology
  - E-field sensor (normal to scalp)
  - Shielded room
- Mid-term
  - Differential sensor (tangential to scalp)
  - Mini sensors (2-3X smaller, thinner, with manufactured cover)
- Long-term
  - Unshielded room
  - Multichannel
Mini-differential Sensor
QUASAR Quantum Applied Science and Research Inc.
11EMG Signal Characteristics
Signal characteristic: handling approach
- Non-stationary: short time windows and transforms
- Multimodal distribution of feature values: feature choice and time handling
- Dependence between features and channels: eliminate via mutual information
- Artifacts, e.g. swallowing, fatigue: manual removal for training
- Electrode placement sensitivity: vary as little as possible
- Circadian metabolic changes (guzzle coffee): rich training samples
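The slide does not say how mutual information is applied. One possible reading is that it measures how much two features or channels share, so that a redundant one can be dropped. The sketch below estimates mutual information between two quantized feature sequences. The function names `mutual_information` and `quantize`, and the equal-width binning, are illustrative assumptions, not the presenter's method.

```python
import math
from collections import Counter

def quantize(values, bins=4):
    """Map real values to integer bin indices using equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(a, b):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in pab.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (pa[x] * pb[y]))
    return mi
```

A value near zero suggests the two features carry independent information. A value near the entropy of either feature suggests one of them is redundant.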
12Signal Thresholding
13EMG Signal Segmentation
Auto Threshold Point
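The slide names an automatic threshold point but gives no rule for choosing it. As a minimal sketch, assume the recording starts with a quiet baseline and that a word begins when the smoothed, rectified signal rises above the baseline mean plus `k` standard deviations. The parameters `baseline_len`, `k`, and `width` are hypothetical choices for illustration only.

```python
from statistics import mean, stdev

def moving_average(x, width):
    """Trailing moving average; early samples average over what is available."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]
        out.append(acc / min(i + 1, width))
    return out

def segment(signal, baseline_len=50, k=3.0, width=5):
    """Return (threshold, [(start, end), ...]) for spans above the threshold.

    Assumes the first `baseline_len` samples contain no speech activity.
    """
    envelope = moving_average([abs(v) for v in signal], width)
    base = envelope[:baseline_len]
    threshold = mean(base) + k * stdev(base)
    segments, start = [], None
    for i, v in enumerate(envelope):
        if v > threshold and start is None:
            start = i                      # rising edge: segment begins
        elif v <= threshold and start is not None:
            segments.append((start, i))    # falling edge: segment ends
            start = None
    if start is not None:
        segments.append((start, len(envelope)))
    return threshold, segments
```

Each returned span can then be passed to the feature transform as one candidate word.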
14Feature Transform
Discrete Wavelet Transform
- Sampled data, regularly spaced
- Maps a vector onto a grid of discrete wavelet coefficients (translations and dilations)
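To make the idea of mapping a vector onto a grid of wavelet coefficients concrete, here is a small sketch using the Haar wavelet. This is a deliberate simplification: the talk uses a dual-tree complex wavelet transform, which is not what this code implements. The function names are illustrative.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_dwt(x, levels):
    """Multi-level decomposition, returned as [a_L, d_L, ..., d_1].

    Each detail list is one dilation (scale); positions within a list
    are the translations at that scale.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels")
    details, approx = [], list(x)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return [approx] + details[::-1]

coeffs = haar_dwt([4, 6, 10, 12, 8, 6, 5, 5], levels=3)
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared samples, which makes per-scale energies directly comparable.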
15Signal Spectral Energy
[Figure: DTWT signal energy differences for the words Stop, Go, Left, and Right]
16Classifier Training
SCGNN: Scaled Conjugate Gradient Neural Network
Training and validation of sub-vocal patterns (e.g., 50
inputs, 100 hidden units, 16 outputs)
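The example network size (50 inputs, 100 hidden units, 16 outputs) can be sketched as a forward pass. This does not implement scaled conjugate gradient training, and the weights are random rather than trained. The tanh hidden activation and softmax output are assumptions, since the slide does not state the activation functions.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def dense(w, b, x):
    """Affine map: one output per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def softmax(z):
    m = max(z)                              # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def forward(params, x):
    (w1, b1), (w2, b2) = params
    hidden = [math.tanh(v) for v in dense(w1, b1, x)]
    return softmax(dense(w2, b2, hidden))

rng = random.Random(0)
params = [init_layer(50, 100, rng), init_layer(100, 16, rng)]
features = [rng.gauss(0.0, 1.0) for _ in range(50)]    # stand-in feature vector
probs = forward(params, features)
predicted = max(range(16), key=probs.__getitem__)      # index of most likely word
```

In a trained system, the 50 inputs would be wavelet-derived features for one segmented word and the 16 outputs would correspond to the word vocabulary.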
17Results for 6 Control Words
18SVM Classification 16 Words
Overall (percent): Linear 73.12, Tanh 73, Radial 73.13,
Poly 72.7, Raw 72.5
134 Samples Trained, 40 Samples Tested, 5000
complex DTWT Coefficients, No Tessellation
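For reference, the four named kernel families have standard textbook forms, sketched below. The hyperparameters (`gamma`, `coef0`, `degree`) are placeholder values, not the settings used in the reported experiment, and the "Raw" condition is not defined on the slide, so it is not modeled here.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def tanh_kernel(x, y, gamma=0.1, coef0=0.0):
    """Sigmoid (tanh) kernel."""
    return math.tanh(gamma * dot(x, y) + coef0)

def rbf_kernel(x, y, gamma=0.1):
    """Radial basis function kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def poly_kernel(x, y, degree=3, gamma=0.1, coef0=1.0):
    """Polynomial kernel."""
    return (gamma * dot(x, y) + coef0) ** degree
```

The near-identical scores across kernels on the slide suggest that, for this data, the choice of kernel mattered less than the features themselves.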
19Results Consonants w/o some alveolars
- Success rate 72 percent (2500 iterations, dual tree wavelet transform, one subject)
- Previous results suggested that alveolars (tip of the tongue touches the alveolar ridge) are problematic for this method, so six (t, d, s, z, ch, j) were removed. Removing the remaining alveolars (n, l, and maybe r) would probably further improve the results, as they often seem to be garbage can categories.
20Biometric Recognition
Three speakers: two male, one female
Two keywords: Left and Right
Number of training examples per word: 50
Number of generated training samples used: 1720
Number of untrained test examples: 600
Results:
- Female: 100 percent on both keywords
- Male (young): 99 percent for Right, 65 percent for Left
- Male (old): 95 percent for Right, 97 percent for Left
[Figure: per-speaker results for Female, Young Male, and Old Male; panels F-R, F-L, M-R, M-L, M-R, M-L]
21Sub Acoustic Device Control
Recognition, playback, and control
Machine Annunciation of Sub-Acoustic Command
STOP!
22Real Time Sub Acoustic Control - Vehicle Movement
Navigation Test Area
Drivers View
EMG signal processing display: raw input data on lower
left, segmented word pattern on lower right,
operation of voting scheme on upper left, and
final control word sent to vehicle simulation on
upper right
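The slide mentions a voting scheme but does not describe it. A common approach, assumed here purely for illustration, is a sliding-window majority vote: a control word is sent to the vehicle only once it has won enough of the most recent classifier outputs. The class name `WordVoter` and the `window` and `min_votes` values are hypothetical.

```python
from collections import Counter, deque

class WordVoter:
    """Sliding-window majority vote over recent classifier outputs."""

    def __init__(self, window=5, min_votes=3):
        self.recent = deque(maxlen=window)   # oldest outputs fall off automatically
        self.min_votes = min_votes

    def update(self, label):
        """Record one classifier output; return a control word or None."""
        self.recent.append(label)
        word, count = Counter(self.recent).most_common(1)[0]
        return word if count >= self.min_votes else None

voter = WordVoter()
commands = [voter.update(w) for w in ["left", "left", "stop", "left", "right"]]
# commands == [None, None, None, "left", "left"]
```

Requiring several agreeing outputs trades a short delay for fewer spurious commands, which matters when a single misclassification would steer the vehicle.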
23Sub Acoustic Web Browsing