Title: Automatic Speech Recognition in Noisy Environments
1Automatic Speech Recognition in Noisy Environments
Human Factors Approach
Te-Won Lee
2Overview
- Motivation
- State of the Art in Automatic Speech Recognition
- Approaches to Speech Enhancement
- Blind Source Separation
- Independent Component Analysis (ICA)
- Sound Separation and Voice Recognition (live demo)
- Proposed ASR System for Noisy Shuttle Environments
- ICA software filter embedded in ASR DSK
- Vocabulary optimization
- Other Applications of ICA relevant to NASA and Human Factors
- Image Processing
- Biomedical Signal Processing
- Conclusions
3Team
- SoftMax, Inc.
- Te-Won Lee
- Erik Visser
- Steven Chan
- Oh-Wook Kwon
- Jeff Elman (UCSD)
- NASA, JSC
- Mihriban Whitmore
- Cindy Hudy
- George Salazar
- Robonaut Team
- Nancy Niedzielski (Rice)
4Motivation
- Methods are required to facilitate hands-free control of spacecraft systems
- Experiments in the glove box require automated assistance via voice command and control
5Motivation
Robonaut
Teleoperator
NASA Shuttle Robonaut: currently command-operated through IBM's ViaVoice
6Automatic Speech Recognition
- State of the Art Voice Recognition
- ASR systems have improved significantly over the last decade
- Large-vocabulary recognition exists in both speaker-dependent and speaker-independent systems
- Problems with Current State of the Art Systems
- Accuracy drops significantly in realistic noisy environments
- Initial space shuttle experiments with speaker-dependent voice recognition showed significant drops in accuracy
- Source of Error
- Space shuttle noise interferes with the spoken words
- Noise frequency spectrum overlaps with speech signal spectrum
- Possible Solutions
- Training with noise signals
- Removing noise signals
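The first possible solution, training with noise signals, is commonly approached by mixing recorded noise into clean speech at a chosen signal-to-noise ratio (SNR). The slides do not say how this was done, so the following is only a minimal sketch under assumptions: the function names `power` and `mix_at_snr`, the sine wave standing in for an utterance, and the Gaussian samples standing in for cabin noise are all hypothetical.

```python
import math
import random

def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches snr_db, then add it."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]

rng = random.Random(0)
# Stand-in signals (assumptions, not data from the presentation)
speech = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]

# A noisy training example at 10 dB SNR
noisy = mix_at_snr(speech, noise, snr_db=10)
```

Sweeping `snr_db` over a range of values would give training material that covers both mild and severe noise conditions.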
7Approaches to Speech Enhancement
- There are several techniques to enhance the speech signal
- Spectral Subtraction
- Due to spectral overlap between noise and speech, this technique creates artifacts known as musical tones
- Beamforming
- Many microphones (4 to 8) are required and performance is still poor
- Acoustic Model Adaptation
- Requires pre-training with clean speech signals and a stationary noise model
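To make the spectral subtraction item concrete, here is a simplified sketch of magnitude spectral subtraction using NumPy. It is illustrative only: it uses non-overlapping rectangular frames rather than the overlapping windowed frames a practical system would use, and the function name `spectral_subtract`, the frame size, the spectral floor, and the synthetic tone-plus-white-noise signal are all assumptions rather than anything taken from the presentation.

```python
import numpy as np

def spectral_subtract(noisy, noise_mag, frame=256, floor=0.02):
    """Subtract an estimated noise magnitude spectrum from each frame.

    noise_mag has length frame // 2 + 1. The noisy phase is reused unchanged.
    """
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame + 1, frame):
        spec = np.fft.rfft(noisy[start:start + frame])
        mag = np.abs(spec) - noise_mag               # remove the noise estimate
        mag = np.maximum(mag, floor * np.abs(spec))  # floor keeps magnitudes positive
        cleaned = mag * np.exp(1j * np.angle(spec))
        out[start:start + frame] = np.fft.irfft(cleaned, n=frame)
    return out

rng = np.random.default_rng(0)
n, frame = 4096, 256
t = np.arange(n)
clean = np.sin(2 * np.pi * 8 * t / frame)   # stand-in "speech": a single tone
noisy = clean + 0.5 * rng.standard_normal(n)

# Noise magnitude estimated from a separate noise-only recording
noise_only = 0.5 * rng.standard_normal(n)
noise_mag = np.mean(
    [np.abs(np.fft.rfft(noise_only[i:i + frame])) for i in range(0, n, frame)],
    axis=0,
)

enhanced = spectral_subtract(noisy, noise_mag, frame)
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((enhanced - clean) ** 2)
```

The bins that survive the subtraction in each frame are isolated and change from frame to frame, which is what produces the musical tones mentioned above. The sketch also relies on a fixed noise estimate, so it shares the stationary-noise limitation.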
8SoftMax Overview
SoftMax, Inc. develops advanced signal processing
solutions to enhance the interface between humans
and machines. SoftMax is commercializing the
development of its platform technology to create
new standards for understanding and processing
information.
ICA
9Independent Component Analysis
The SoftMax platform is a state-of-the-art signal
processing algorithm designed to mimic how the
human brain processes signals, solving the
cocktail party problem by pulling out a single
desired speaker.
10Blind Source Separation
- Example of density modeling: the cocktail party problem, Blind Source Separation (BSS)
- It requires unsupervised learning of the probability density function of the sources and finding non-orthogonal directions.
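The slides do not write the model out, but the problem described here is usually formalized with the standard instantaneous linear mixing model. The symbols below are introduced only for illustration and are not taken from the presentation.

```latex
\begin{aligned}
\mathbf{x}(t) &= \mathbf{A}\,\mathbf{s}(t), & p(\mathbf{s}) &= \prod_{i=1}^{N} p_i(s_i), \\
\mathbf{y}(t) &= \mathbf{W}\,\mathbf{x}(t), & \mathbf{W}\mathbf{A} &\approx \mathbf{P}\mathbf{D}.
\end{aligned}
```

Here s(t) holds the N unknown independent sources, A is the unknown mixing matrix whose columns need not be orthogonal, x(t) is what the microphones record, and W is the unmixing matrix to be learned. Separation succeeds when WA is close to a permutation P times a diagonal scaling D, so the sources are recovered only up to order and amplitude. Real rooms add delays and reverberation, which this instantaneous model ignores.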
11ICA Versus PCA
- Independent Component Analysis (ICA) finds
directions of maximal independence in
non-Gaussian data (higher-order statistics).
- Principal Component Analysis (PCA) finds
directions of maximal variance in Gaussian data
(second-order statistics).
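The contrast between the two methods can be illustrated numerically. The sketch below mixes two uniform (non-Gaussian) sources with a non-orthogonal matrix, whitens the mixtures with PCA, and then applies a textbook symmetric FastICA iteration with a cubic (kurtosis) nonlinearity. FastICA is used here only as a convenient stand-in; it is not claimed to be the algorithm used by SoftMax, and the mixing matrix, sample count, and helper names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
s = rng.uniform(-1, 1, size=(2, n))        # two independent non-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])     # hypothetical non-orthogonal mixing matrix
x = A @ s
x = x - x.mean(axis=1, keepdims=True)

# PCA: eigenvectors of the covariance matrix (second-order statistics only)
eigval, eigvec = np.linalg.eigh(np.cov(x))
z = (eigvec / np.sqrt(eigval)).T @ x       # whitened principal components

def fastica_kurtosis(z, iters=200, seed=1):
    """Symmetric FastICA with the cubic nonlinearity, applied to whitened data."""
    d, m = z.shape
    w = np.linalg.qr(np.random.default_rng(seed).standard_normal((d, d)))[0]
    for _ in range(iters):
        y = w @ z
        w = (y ** 3) @ z.T / m - 3 * w     # fixed-point update using 4th-order moments
        u, _, vt = np.linalg.svd(w)
        w = u @ vt                         # keep the rows orthonormal
    return w

y = fastica_kurtosis(z) @ z                # ICA estimates of the sources

def match(est, src):
    """Best absolute correlation of each true source with any estimated component."""
    c = np.abs(np.corrcoef(np.vstack([est, src]))[:2, 2:])
    return c.max(axis=0)

pca_match = match(z, s)
ica_match = match(y, s)
```

In this setup the PCA components stay mixtures of both sources, while the ICA components line up with the individual sources up to sign and order, because the extra rotation is found from higher-order statistics.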
14Real-Time Speech Separation
Our ICA algorithm is unique in that it utilizes
blind source separation to identify each
independent component of a signal data set and
separate unwanted noise from the desired
signal.
ICA Learning
Noise can be identified as sounds, wave
artifacts, or irrelevant data.
15Live Demonstration ICA and ASR
- Blind Source Separation
- Integration of SoftMax Signal Processing
Technology and Voice Navigator
16Live Demonstration of Speech Recognition Technology
- Integration of SoftMax Signal Processing Technology and Voice Navigator
- Voice Navigator is a state-of-the-art off-the-shelf speech recognition system
Without ICA
With ICA
17Distinguishing Characteristics of this Unique Technology Platform
- Reduces noise in real time
- Utilizes blind source separation, requiring no pre-training
- Separates impulsive and non-stationary white noise from clear signal
- Utilizes higher-order statistics
- No distortion of processed signal through
18Proposed ASR System for Noisy Shuttle Environments
- ICA as software patch to separate noise from speech signal
- Commercial Off-The-Shelf Voice Recognition System
- Software Development Kit (SDK)
- Vocabulary Optimization
- Confusion matrix analysis
- Node structure definition
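Confusion matrix analysis for vocabulary optimization can be sketched as follows: count how often each spoken command is recognized as a different one, then flag the pairs that are confused most often as candidates for replacement. The command words, the recognition log, and the 25% threshold are invented for illustration and do not come from the presentation.

```python
from collections import Counter

# Hypothetical recognition log: (word spoken, word recognized)
trials = [
    ("stop", "stop"), ("stop", "stop"), ("stop", "top"),
    ("go", "go"), ("go", "no"), ("go", "no"),
    ("no", "no"), ("no", "go"), ("no", "no"),
    ("left", "left"), ("left", "left"), ("left", "left"),
]

confusion = Counter(trials)                            # confusion[(spoken, heard)] = count
spoken_totals = Counter(spoken for spoken, _ in trials)

def confusable_pairs(confusion, spoken_totals, threshold=0.25):
    """Return (spoken, heard, rate) for misrecognitions at or above the threshold."""
    pairs = [
        (spoken, heard, count / spoken_totals[spoken])
        for (spoken, heard), count in confusion.items()
        if spoken != heard and count / spoken_totals[spoken] >= threshold
    ]
    return sorted(pairs, key=lambda p: -p[2])          # most confused first

flagged = confusable_pairs(confusion, spoken_totals)
for spoken, heard, rate in flagged:
    print(f"{spoken!r} recognized as {heard!r} in {rate:.0%} of trials")
```

With this made-up log, "go" and "no" are confused in both directions, which would suggest swapping one of them for an acoustically more distinct command.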
19Work in Progress
- Demonstration of ICA-based voice recognition with IBM's ViaVoice in highly noisy environments
- Robonaut evaluation Summer 2004
- Speaker independent voice recognition
- Vocabulary optimization
- Evaluation
20Other ICA Applications
- Applications relevant to NASA
- Image Processing
- Image enhancement
- Object identification
- Biomedical Signal Processing
- Cardiac diagnosis
- EEG and MRI analysis
- High-Dimensional Data Understanding
- Unsupervised clustering
21Summary
- Proposed speech recognition system for deployment in noisy environments
- ICA filter patch
- COTS speech recognition
- Evaluation in Robonaut application
- Other applications in NASA
- Biomedical signal processing
- Image enhancement and pattern recognition
- Complex data understanding
22TE-WON LEE, PH.D.
SOFTMAX, INC.
4180 LA JOLLA VILLAGE DRIVE, SUITE 455
LA JOLLA, CA 92037
PHONE (858) 452-7477
FAX (858) 452-7373
WWW.SOFTMAX.COM
TLEE_at_SOFTMAX.COM