Title: Visual Aid For the Hearing Impaired


1
Visual Aid For the Hearing Impaired
  • Katherine Andrade
  • Carlos Castillo
  • Frank Taranto, Jr.
  • Jason Vieira

2
How The Ear Works
  • Sounds create vibrations
  • Outer ear collects the vibrations
  • Sound waves strike the eardrum
  • Travel through the middle ear
  • Inner ear fluid is set in motion
  • Nerve cells are excited and the impulses reach
    the brain

3
What Causes Hearing Loss?
  • Anything that completely blocks the ear canal can
    cause hearing loss
  • Blockage with earwax (also called cerumen) is
    common
  • Infections with swelling that shuts the ear canal
  • Foreign bodies in the ear
  • Neural Problems
  • Noise
  • Birth defects
  • A growth in the ear canal
  • Ear Infections
  • Tumors

4
Who Is Affected?
  • Hearing loss affects more American families than
    any other chronic health condition
  • Ironically, hearing loss is also the most
    preventable chronic health condition
  • More than 40 million Americans have hearing loss
  • About 3 out of every 1,000 children in the
    United States are born deaf or hard-of-hearing
  • Hearing loss affects approximately 17 in 1,000
    children under age 18
  • About 15% of college graduates have some level of
    hearing loss

5
Existing Solutions
  • The first step in treating hearing loss is an
    accurate diagnosis - finding out exactly what's
    causing the hearing loss
  • Treating the underlying disease, such as
    hypothyroidism, or prescribing antibiotics if an
    infection is found
  • A hearing aid to provide amplification of sound
    (for most people with sensorineural hearing loss,
    amplification is the best or only option)
  • Surgery for mechanical causes such as chronic ear
    infections, or a cochlear implant
  • Videophone offers combined video and audio across
    phone lines

6
Problems With These Solutions
  • Prescription Drugs
      • Routine
      • Side Effects
      • Costly
  • Hearing Aid
      • Background Noise
      • Evaluation Cost
      • Maintenance
  • Surgery
      • Invasive
      • Risk
      • Costly
  • Videophone
      • Video and Audio appear out of sync
      • Expensive
      • Both parties must have similar technology

7
Applications of our Design
  • Help a person who is hearing impaired better
    understand speech over the phone by looking at a
    pair of artificial lips on a screen
  • On television, whenever a speaker's lips are not
    visible, a hearing-impaired person can look at a
    generated pair
  • By having the sound processed at the receiver,
    delay between video and audio is eliminated
  • Lip information complemented with speech improves
    intelligibility in environments with low
    signal-to-noise ratio (SNR)

8
Intelligibility Pattern of Integrated A/V
9
Procedure
  • Record all 42 English phonemes from several
    subjects
  • Photograph different subjects as they pronounce
    all the phonemes and measure lip shape parameters
  • Use the COLEA toolbox to obtain LPC coefficients,
    which describe the state of the vocal tract
  • Use neural networks to associate LPC
    coefficients with their respective lip shapes
  • Fit parabolas to approximate phoneme lip
    shapes and animate coherent speech using MATLAB

10
Phoneme Acquisition
11
Measuring Lip Parameters
  • Measure inner width
  • Measure inner height of upper lip
  • Measure inner height of lower lip
  • Measure outer width
  • Measure outer height of upper lip
  • Measure outer height of lower lip

12
Organs of Speech and Linear Predictive Coding (LPC)
  • A technique for modeling the vocal tract
  • Ideal for pitch and formant detection; determines
    area functions of the vocal tract
  • Two types of methods for LPC analysis
      • Autocorrelation method
      • Covariance method
  • Another way of representing a sound is by
    Cepstral Coefficients, which are more stable than
    LPC coefficients

13
Extracting LPC coefficients
  • Linear Predictive Coding is a means to compress
    a continuous signal
  • LPC coefficients are derived from previous values
  • $a_0 s(j) + a_1 s(j-1) + \dots + a_n s(j-n)$
      • $a_k$ = coefficients
      • $s(j)$ = present sample
  • Procedure is repeated over set of n samples
  • Optimum number of coefficients between 10 and 20
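
The autocorrelation route to these coefficients can be sketched in a few lines. This is a minimal illustration, not the COLEA toolbox code used in the project: it assumes the leading coefficient is fixed at 1 (so the weighted sum on this slide is the prediction error) and solves for the rest with the Levinson-Durbin recursion. The function names `autocorrelation` and `lpc` are chosen for this example only.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[k] = sum over j of frame[j] * frame[j - k]."""
    n = len(frame)
    return [sum(frame[j] * frame[j - k] for j in range(k, n))
            for k in range(max_lag + 1)]


def lpc(frame, order):
    """Autocorrelation-method LPC solved with the Levinson-Durbin recursion.

    Returns (a, err): a = [1, a_1, ..., a_order] are prediction-error filter
    coefficients (assumption: a_0 is fixed at 1), and err is the energy of
    the prediction error that remains.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Correlation left unexplained by the current lower-order predictor.
        acc = r[i] + sum(a[k] * r[i - k] for k in range(1, i))
        reflection = -acc / err
        updated = a[:]
        for k in range(1, i):
            updated[k] = a[k] + reflection * a[i - k]
        updated[i] = reflection
        a = updated
        err *= 1.0 - reflection * reflection
    return a, err
```

For a frame of samples `frame`, a call such as `lpc(frame, 12)` returns 13 coefficients, which falls inside the 10 to 20 range quoted above.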

14
Using Parabolas To Simulate Lip Shapes
  • We begin with a parabola, because its properties
    resemble a lip
  • By varying d1 and d2 we obtain various
    configurations of a parabola $y = ax^2 + c$
  • Four parabolas are used to give a 2-D effect

[Figs. 1-3: parabola configurations labeled with dimensions d1 and d2]
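
As a rough sketch of how four parabolas of the form y = ax^2 + c could outline a mouth, the code below derives (a, c) for each curve from a width and a height. It assumes, hypothetically, that d1 and d2 play the role of width and height and that the six lip measurements (inner/outer width, upper/lower heights) drive the four curves; the slides do not spell out this mapping. All function names are illustrative.

```python
def parabola_from_size(width, height):
    """Coefficients (a, c) of y = a*x**2 + c with vertex at (0, height)
    and zeros at x = -width/2 and x = +width/2."""
    if width <= 0:
        raise ValueError("width must be positive")
    a = -4.0 * height / (width * width)
    return a, height


def lip_outline(width, height, points=9):
    """Sample (x, y) points along one parabola between its two zeros."""
    a, c = parabola_from_size(width, height)
    half = width / 2.0
    xs = [-half + width * i / (points - 1) for i in range(points)]
    return [(x, a * x * x + c) for x in xs]


def lip_shape(inner_w, inner_up, inner_low, outer_w, outer_up, outer_low):
    """Eight coefficients in total: one (a, c) pair per parabola.

    Lower-lip heights are negated so those parabolas open upward.
    """
    return {
        "outer_upper": parabola_from_size(outer_w, outer_up),
        "inner_upper": parabola_from_size(inner_w, inner_up),
        "inner_lower": parabola_from_size(inner_w, -inner_low),
        "outer_lower": parabola_from_size(outer_w, -outer_low),
    }
```

Two coefficients per parabola times four parabolas gives eight numbers per lip shape, which is a compact target for a learned mapping.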
15
Long a, /A/, as in Fonzie's greeting
16
Other Phonemes
Short a, /a/, as in flat
b sound, /b/, as in ball
17
Using Neural Networks
  • Neural Networks complement the measurement of LPC
    coefficients
  • Operate by supplying the network with a training
    set of data (LPC inputs and parabolic coefficient
    outputs) and performing a least-squares fit for
    varying sets
  • Robust method for determining lip shapes for
    people with different pitches and vocal tract
    elasticity

18
Fundamentals of Neural Networks
  • Units known as perceptrons contain a weight for
    each of their inputs, plus a bias
  • Weights are adjusted during training, thereby
    decreasing error over several epochs
  • Three options for the transfer function
      • Linear
      • Logarithmic sigmoidal
      • Tangent sigmoidal
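
A single unit and the three listed function shapes can be written out directly. This is a minimal sketch using the standard mathematical definitions; the names `linear`, `log_sigmoid`, `tan_sigmoid`, and `unit_output` are chosen for this example and are not taken from the project's code.

```python
import math


def linear(x):
    """Identity: output equals the weighted sum."""
    return x


def log_sigmoid(x):
    """Logistic function, output in (0, 1); written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tan_sigmoid(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)


def unit_output(inputs, weights, bias, transfer):
    """One unit: weighted sum of the inputs plus a bias, then the transfer function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(net)
```

The linear option leaves the output unbounded, while the two sigmoidal options squash it into a fixed range, which is one reason target values are often rescaled before training.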

19
Variability in LPC Values
20
Variability in Cepstral Values
21
Neural Network Training
  • For a system that matches a tangent network
    function, error will decrease to zero
  • Actual systems will have variability
  • In our case, we used a normalized training set to
    reduce the effect of anomalies
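
One common way to normalize a training set is to shift each column to zero mean and scale it to unit standard deviation, keeping the mean and standard deviation so outputs can be converted back later. The sketch below assumes that scheme and uses the population standard deviation; the slides do not state which normalization was applied, and the helper names are illustrative.

```python
import math


def fit_normalizer(rows):
    """Per-column mean and standard deviation of a list of equal-length rows."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(col) / n for col in cols]
    stds = []
    for col, mean in zip(cols, means):
        var = sum((v - mean) ** 2 for v in col) / n
        # A constant column has zero spread; use 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    return means, stds


def normalize(row, means, stds):
    """Map raw values to zero-mean, unit-spread values."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def denormalize(row, means, stds):
    """Invert normalize() using the stored statistics."""
    return [v * s + m for v, m, s in zip(row, means, stds)]
```

Because every column ends up on the same scale, a single outlying measurement has less influence on the weight updates than it would in raw units.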

22
Architecture of Neural Networks
  • LPC coefficients serve as the input layer
  • Eight parabola coefficients serve as the output
    layer
  • As a rule of thumb, we selected 22 hidden units
    (50% more units than the input layer)

Architecture for a 2-layer feed-forward backpropagation network
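
A network of this shape can be sketched in plain Python. The example assumes a tangent-sigmoid hidden layer, a linear output layer, and plain gradient descent on the squared error; the project's actual toolbox settings are not given in the slides. The input size of 15 used in the usage note is also an assumption, picked because 22 is roughly 50% more than 15.

```python
import math
import random


class TwoLayerNet:
    """Feed-forward net: inputs -> tanh hidden layer -> linear output layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w1, self.b1 = init(n_hidden, n_in), [0.0] * n_hidden
        self.w2, self.b2 = init(n_out, n_hidden), [0.0] * n_out

    def forward(self, x):
        """Return (hidden activations, outputs) for one input vector."""
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr=0.05):
        """One gradient step on 0.5 * sum((y - target)**2); returns that error."""
        h, y = self.forward(x)
        dy = [yi - ti for yi, ti in zip(y, target)]
        # Back-propagate the output error through the tanh hidden layer
        # (computed before the output weights are changed).
        dh = [sum(dy[o] * self.w2[o][j] for o in range(len(y))) * (1.0 - h[j] ** 2)
              for j in range(len(h))]
        for o in range(len(y)):
            for j in range(len(h)):
                self.w2[o][j] -= lr * dy[o] * h[j]
            self.b2[o] -= lr * dy[o]
        for j in range(len(h)):
            for i in range(len(x)):
                self.w1[j][i] -= lr * dh[j] * x[i]
            self.b1[j] -= lr * dh[j]
        return 0.5 * sum(d * d for d in dy)
```

With `net = TwoLayerNet(15, 22, 8)`, calling `net.train_step(x, target)` repeatedly over the training pairs for several epochs should drive the returned error down.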
23
Simulating Neural Network
  • Tested the accuracy of the trained network by
    inputting a random phoneme and comparing to
    actual lip shape
  • Trained the network with 2 training sets of inputs
    and targets; combined the mean and standard
    deviation of both
  • Consonants were the most difficult to mimic:
    acute sounds intermixed with vowels

24
Implementation of Animation
  • Sound sampled at 22050 Hz was partitioned into
    sections to achieve a frame rate of 15-30 fps
  • LPC coefficients obtained directly through
    autocorrelation of the samples; constant noise
    will not substantially change LPCC
  • An input of LPC coefficients results in an
    output of 8 normalized parabola coefficients that
    are converted back using the stored mean and
    standard deviation
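
The partitioning step comes down to simple arithmetic: at 22050 Hz, one animation frame spans 1470 samples at 15 fps and 735 samples at 30 fps. A minimal sketch follows, assuming non-overlapping sections and dropping any trailing partial section; the slides do not say whether the project overlapped or windowed its sections, and the function names are illustrative.

```python
def samples_per_frame(sample_rate, fps):
    """Number of audio samples that span one animation frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return sample_rate // fps


def split_into_frames(samples, frame_len):
    """Non-overlapping sections of frame_len samples; a trailing partial one is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]
```

Each section would then yield one set of LPC coefficients and therefore one lip shape per displayed frame.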

25
Shortcomings of Animation
  • No apparent parallel processing feature in MATLAB
    allowing for simultaneous sound playback and lip
    animation
  • Higher frame rate sacrificed for smoother audio
    playback
  • MATLAB only does a hard pixel redrawing of lip
    shapes, yielding a rough, though effective,
    animation

26
Final Design
  • Now, let's look at our animation system.

27
Future Considerations
  • Train neural network with more subjects (male and
    female)
  • Test cepstral coefficients in place of LPC
    coefficients for greater stability
  • Use consonant sounds in different vowel contexts
    (e.g., Ka, Ke, Ki, Ko, Ku)
  • Use open source toolbox (Netlab) for greater
    availability and less restrictive delivery to
    market
  • Test teeth and tongue formations for each phoneme

28
Acknowledgement
  • The authors wish to thank Dr. Richard Foulds,
    Dr. Joel Schesser, Dr. Tara Alvarez, Dr. Sergei
    Adamovich, Mr. Michael Bergen, and Mr. John
    Hoinkowski for their advising support