Automatic Speech Recognition in Noisy Environments - PowerPoint PPT Presentation

Provided by: margecun

Transcript and Presenter's Notes

Title: Automatic Speech Recognition in Noisy Environments

1
Automatic Speech Recognition in Noisy
Environments: Human Factors Approach
Te-Won Lee
2
Overview

Motivation
State of the Art in Automatic Speech Recognition
Approaches to Speech Enhancement
Blind Source Separation
Independent Component Analysis (ICA)
Sound Separation and Voice Recognition (live demo)
Proposed ASR System for Noisy Shuttle
Environments
ICA software filter embedded in ASR SDK
Vocabulary optimization
Other Applications of ICA relevant to NASA and
Human Factors
Image Processing
Biomedical Signal Processing
Conclusions

3
Team

SoftMax, Inc.
Te-Won Lee
Erik Visser
Steven Chan
Oh-Wook Kwon
Jeff Elman (UCSD)

NASA, JSC
Mihriban Whitmore
Cindy Hudy
George Salazar
Robonaut Team
Nancy Niedzielski (Rice)

4
Motivation

Methods are required to facilitate hands-free
control of spacecraft systems
Experiments in the glove box require automated
assistance via voice command and control

5
Motivation
Robonaut
Teleoperator
NASA Shuttle Robonaut: currently operated by voice
command using IBM's ViaVoice
6
Automatic Speech Recognition

State of the Art Voice Recognition
ASR systems have improved significantly over the
last decade
Large-vocabulary recognition exists in both
speaker-dependent and speaker-independent systems
Problems with Current State of the Art Systems
Accuracy drops significantly in realistic noisy
environments
Initial space shuttle experiments with
speaker-dependent voice recognition showed
significant drops in accuracy
Source of Error
Space shuttle noise interferes with the spoken
words
Noise frequency spectrum overlaps with speech
signal spectrum
Possible Solutions
Training with noise signals
Removing noise signals
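
The first option, training with noise signals, is commonly done by mixing recorded noise into clean training speech at a chosen signal-to-noise ratio. The following is a minimal sketch of that mixing step only, not the procedure used in the shuttle experiments; the function names, the synthetic stand-in for speech, and the 5 dB target are all assumptions made for illustration.

```python
import numpy as np

def mix_at_snr(speech, noise, target_snr_db):
    """Scale noise so the speech-to-noise power ratio equals target_snr_db."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Solve p_speech / (gain^2 * p_noise) = 10^(snr/10) for gain.
    gain = np.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    return speech + gain * noise, gain

def measure_snr_db(signal, noise):
    """Power ratio of signal to noise, in decibels."""
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
# Hypothetical stand-in for a speech recording: an amplitude-modulated tone.
speech = np.sin(2 * np.pi * 220 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
# Hypothetical stand-in for recorded cabin noise: white Gaussian noise.
noise = rng.normal(size=t.size)

noisy, gain = mix_at_snr(speech, noise, target_snr_db=5.0)
achieved = measure_snr_db(speech, gain * noise)
print(f"achieved SNR: {achieved:.2f} dB")
```

In practice the same utterance would be mixed at several SNR levels so the recognizer sees a range of conditions.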

7
Approaches to Speech Enhancement

There are several techniques to enhance the
speech signal
Spectral Subtraction
Due to spectral overlap between noise and speech,
this technique creates artifacts known as musical
tones
Beamforming
Many microphones (4 to 8) are required and
performance is still poor
Acoustic Model Adaptation
Requires pre-training with clean speech signals
and stationary noise model
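
To make the first technique concrete, here is a minimal sketch of magnitude spectral subtraction with overlap-add, assuming stationary noise and a separate noise-only segment for the estimate. The frame size, spectral floor, and test signals are illustrative assumptions. The residual peaks left wherever a noisy bin happens to exceed the average noise estimate are what is heard as musical tones.

```python
import numpy as np

FRAME = 256
HOP = FRAME // 2
# Periodic Hann window: copies overlapped at 50% sum to exactly 1.
WINDOW = np.hanning(FRAME + 1)[:-1]

def frames(signal):
    """Yield (start, windowed frame) pairs with 50% overlap."""
    for start in range(0, len(signal) - FRAME + 1, HOP):
        yield start, signal[start:start + FRAME] * WINDOW

def estimate_noise_magnitude(noise_only):
    """Average magnitude spectrum over a noise-only segment."""
    return np.mean([np.abs(np.fft.rfft(f)) for _, f in frames(noise_only)], axis=0)

def spectral_subtract(noisy, noise_mag, floor=0.01):
    """Subtract the noise magnitude per bin, keep the noisy phase, overlap-add."""
    out = np.zeros(len(noisy))
    for start, f in frames(noisy):
        spec = np.fft.rfft(f)
        mag = np.abs(spec)
        # The floor keeps magnitudes non-negative after subtraction.
        cleaned = np.maximum(mag - noise_mag, floor * mag)
        gain = cleaned / np.maximum(mag, 1e-12)
        out[start:start + FRAME] += np.fft.irfft(spec * gain, n=FRAME)
    return out

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)            # stand-in for speech
noisy = clean + 0.3 * rng.normal(size=t.size)  # additive white noise
noise_mag = estimate_noise_magnitude(0.3 * rng.normal(size=t.size))
enhanced = spectral_subtract(noisy, noise_mag)

interior = slice(FRAME, -FRAME)                # skip partially covered edges
mse_before = np.mean((noisy - clean)[interior] ** 2)
mse_after = np.mean((enhanced - clean)[interior] ** 2)
print(f"MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```

A pure tone is an easy case because its spectrum barely overlaps the noise. Real speech overlaps far more, which is where the artifacts become audible.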

8
SoftMax Overview
SoftMax, Inc. develops advanced signal processing
solutions to enhance the interface between humans
and machines. SoftMax is commercializing the
development of its platform technology to create
new standards for understanding and processing
information.
9
Independent Component Analysis
The SoftMax platform is a state-of-the-art signal
processing algorithm designed to mimic how the
human brain processes signals, solving the
cocktail party problem by pulling out a single
desired speaker.
10
Blind Source Separation

Example of density modeling: the cocktail party
problem, or Blind Source Separation (BSS)
BSS requires unsupervised learning of the
probability density function of the sources and
finding non-orthogonal directions.
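
The standard way to write this problem down is the linear mixing model below. It is a sketch under the common textbook assumption of instantaneous (non-convolutive) mixing; the symbols are not taken from the slides, and real room acoustics add delays and reverberation.

```latex
% s(t): vector of unknown, statistically independent sources
% A:    unknown mixing matrix (its columns need not be orthogonal)
% x(t): vector of observed microphone signals
\begin{align}
  \mathbf{x}(t) &= A\,\mathbf{s}(t), &
  p(\mathbf{s}) &= \prod_{i=1}^{N} p_i(s_i), \\
  \mathbf{u}(t) &= W\,\mathbf{x}(t), &
  W &\approx P\,D\,A^{-1}.
\end{align}
% W is learned from x(t) alone; P is a permutation matrix and D a diagonal
% scaling, so the sources are recovered only up to order and scale.
```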

11
ICA Versus PCA

Independent Component Analysis (ICA) finds
directions of maximal independence in
non-Gaussian data (higher-order statistics).

Principal Component Analysis (PCA) finds
directions of maximal variance in Gaussian data
(second-order statistics).
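
The contrast can be illustrated with a small two-source experiment. This is a toy sketch using only NumPy, not the SoftMax algorithm: the mixing matrix, the uniform sources, and the grid search over rotation angles are assumptions chosen to keep the example short. PCA whitening uses only the covariance, so it leaves an unknown rotation that a higher-order statistic (here, excess kurtosis) can resolve.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
s = rng.uniform(-1, 1, size=(2, n))        # independent, non-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])     # non-orthogonal mixing (assumed)
x = A @ s
x = x - x.mean(axis=1, keepdims=True)

# PCA (second-order statistics): decorrelate and scale to unit variance.
eigval, eigvec = np.linalg.eigh(np.cov(x))
z = np.diag(eigval ** -0.5) @ eigvec.T @ x

def excess_kurtosis(u):
    u = (u - u.mean()) / u.std()
    return np.mean(u ** 4) - 3.0

def rotation(theta):
    c, sn = np.cos(theta), np.sin(theta)
    return np.array([[c, -sn], [sn, c]])

# ICA step (higher-order statistics): choose the rotation of the whitened
# data that makes the outputs as non-Gaussian as possible.
angles = np.linspace(0, np.pi / 2, 181)
scores = [sum(abs(excess_kurtosis(row)) for row in rotation(a) @ z)
          for a in angles]
u = rotation(angles[int(np.argmax(scores))]) @ z

def recovery_score(est, true):
    """Worst-case |correlation| between each true source and its best match."""
    c = np.abs(np.corrcoef(np.vstack([est, true]))[:2, 2:])
    return c.max(axis=0).min()

pca_score = recovery_score(z, s)
ica_score = recovery_score(u, s)
print(f"PCA recovery: {pca_score:.3f}, ICA recovery: {ica_score:.3f}")
```

A score near 1 means each source was recovered up to sign and order. The grid search only works for two sources; practical ICA algorithms use gradient or fixed-point updates instead.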

14
Real-Time Speech Separation
Our ICA algorithm is unique in that it utilizes
blind source separation to identify each
independent component of a signal data set and
separate unwanted noise from the desired
signal.
ICA Learning
Noise can be identified as sounds, wave
artifacts, or irrelevant data.
15
Live Demonstration ICA and ASR

Blind Source Separation
Integration of SoftMax Signal Processing
Technology and Voice Navigator

16
Live Demonstration of Speech Recognition
Technology

Integration of SoftMax Signal Processing
Technology and Voice Navigator
Voice Navigator is a state-of-the-art
off-the-shelf speech recognition system

Without ICA
With ICA
17
Distinguishing Characteristics of this Unique
Technology Platform

Reduces noise in real time
Utilizes blind source separation, requiring no
pre-training
Separates impulsive and non-stationary white
noise from clear signal
Utilizes higher-order statistics
No distortion of processed signal through

18
Proposed ASR System for Noisy Shuttle Environments

ICA as software patch to separate noise from
speech signal
Commercial Off-The-Shelf Voice Recognition System
Software Development Kit (SDK)
Vocabulary Optimization
Confusion matrix analysis
Node structure definition
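
One way to read the confusion matrix analysis step: count how often each spoken command is recognized as a different one, then replace or rephrase the word pairs that collide most often. The sketch below assumes trial data is available as (spoken, recognized) pairs; the command words and counts are hypothetical and not taken from the presentation.

```python
from collections import Counter

def most_confused(trials, top=3):
    """Return the command pairs most often mistaken for each other.

    trials: iterable of (spoken, recognized) command pairs.
    Confusions in both directions are merged into one unordered pair.
    """
    pairs = Counter()
    for spoken, recognized in trials:
        if spoken != recognized:
            pairs[tuple(sorted((spoken, recognized)))] += 1
    return pairs.most_common(top)

def accuracy(trials):
    """Fraction of trials in which the recognized word matched the spoken one."""
    trials = list(trials)
    return sum(spoken == recognized for spoken, recognized in trials) / len(trials)

# Hypothetical recognition log.
trials = [
    ("stop", "stop"), ("stop", "top"), ("top", "stop"), ("stop", "top"),
    ("go", "go"), ("go", "no"), ("left", "left"), ("right", "right"),
]

worst = most_confused(trials, top=2)
print(worst)
print(f"accuracy: {accuracy(trials):.3f}")
```

Here "stop" and "top" would be the first candidates for replacement with acoustically more distinct commands.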

19
Work in Progress

Demonstration of ICA-based voice recognition with
IBM's ViaVoice in highly noisy environments
Robonaut evaluation, Summer 2004
Speaker-independent voice recognition
Vocabulary optimization
Evaluation

20
Other ICA Applications

Applications relevant to NASA
Image Processing
Image enhancement
Object identification
Biomedical Signal Processing
Cardiac diagnosis
EEG and MRI analysis
High-Dimensional Data Understanding
Unsupervised clustering

21
Summary

Proposed speech recognition system for deployment
in noisy environments
ICA filter patch
COTS speech recognition
Evaluation in Robonaut application
Other applications in NASA
Biomedical signal processing
Image enhancement and pattern recognition
Complex data understanding

22
TE-WON LEE, PH.D.
SOFTMAX, INC.
4180 LA JOLLA VILLAGE DRIVE, SUITE 455
LA JOLLA, CA 92037
PHONE (858) 452-7477
FAX (858) 452-7373
WWW.SOFTMAX.COM
TLEE_at_SOFTMAX.COM
