Title: Meena Ramani
1 EEL 6586 Automatic Speech Processing
2 Topics to be covered
- Lecture 1: The incredible sense of hearing 1
- Anatomy
- Perception of Sound
- Lecture 2: The incredible sense of hearing 2
- Psychoacoustics
- Hearing aids and cochlear implants
3 Lecture 1: The incredible sense of hearing
"Behind these unprepossessing flaps lie
structures of such delicacy that they shame the
most skillful craftsman" - Stevens, S.S.
Professor of Psychophysics, Harvard University
4 Why study hearing?
- Best example of speech recognition
- Mimic human speech processing
- Hearing aids/ Cochlear implants
- Speech coding
5 Interesting facts
- The stapes, or stirrup, is the smallest bone in our body.
- It is roughly the size of a grain of rice (about 2.5 mm)
- The eardrum moves less than the diameter of a hydrogen atom
- For minimum audible sounds
- The inner ear reaches its full adult size when the fetus is 20-22 weeks old.
- The ears are responsible for keeping the body in balance
- Hearing loss is the number one disability in the world.
- 76.3% of people lose their hearing at age 19 and over
6 Specifications
- Frequency range: 20 Hz-20 kHz
- Dynamic range: 0-130 dB
- JND frequency: 5 cents
- JND intensity: 1 dB
- Size of cochlea: smaller than a dime
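The figures above are easier to compare once converted to linear units. The following is a minimal sketch, assuming the standard definitions of the cent (1200 cents per octave) and the decibel; the 1 kHz reference tone is a hypothetical choice for illustration and is not taken from the slides.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for an interval in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


def db_to_intensity_ratio(db: float) -> float:
    """Intensity (power) ratio for a level difference in dB."""
    return 10.0 ** (db / 10.0)


def db_to_pressure_ratio(db: float) -> float:
    """Sound-pressure (amplitude) ratio for a level difference in dB."""
    return 10.0 ** (db / 20.0)


ref_hz = 1000.0  # hypothetical reference tone, chosen only for illustration
jnd_hz = ref_hz * (cents_to_ratio(5.0) - 1.0)
print(f"5 cents at {ref_hz:.0f} Hz is a step of about {jnd_hz:.2f} Hz")
print(f"130 dB spans an intensity ratio of {db_to_intensity_ratio(130.0):.1e}")
print(f"130 dB spans a pressure ratio of {db_to_pressure_ratio(130.0):.1e}")
```

Because a cent is a ratio, the same 5-cent step corresponds to a larger number of hertz at higher frequencies.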
7 A N A T O M Y
8 Outer ear
Pinna /Auricle
Auditory Canal
- Focuses sound waves (variations in pressure) into the ear canal
- Pinna size
- Inverse Square Law
- Larger pinna captures more of the wave
- Elephants hear low frequency sound from up to 5 miles away
- Human Pinna structure
- Pointed forward, has a number of curves
- Helps in sound localization
- More sensitive to sounds in front
- Dogs/cats: movable pinna focuses on sounds from a particular direction
9 Pinna /Auricle
Outer ear
Auditory Canal
Horizontal localization
Sound Localization
Vertical localization
Is sound on your right or left side?
Interaural Time Difference (ITD)
Interaural Intensity Difference (IID)
10 Interaural differences
- The signal needs to travel further to the more distant ear
- The more distant ear is partially occluded by the head
Two types of interaural difference will emerge:
- Interaural time difference (ITD)
- Interaural intensity difference (IID)
11-14 Illustration of interaural differences
[Figure: left-ear and right-ear signals plotted against time from the sound onset, showing the arrival time difference and the intensity difference between the two ears.]
15 Thresholds
- Interaural time differences (ITDs)
- Threshold ITD ≈ 10-20 µs (≈ 0.7 cm)
- Interaural intensity differences (IIDs)
- Threshold IID ≈ 1 dB
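The 0.7 cm figure quoted with the ITD threshold is the extra path length that sound covers in that time. A minimal sketch of the conversion, assuming a speed of sound of 343 m/s (a typical room-temperature value, not stated in the slides):

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def itd_to_path_cm(itd_s: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Extra path length (cm) travelled by sound during an interaural delay."""
    return itd_s * c * 100.0


for itd_us in (10, 20):
    path_cm = itd_to_path_cm(itd_us * 1e-6)
    print(f"ITD of {itd_us} microseconds -> path difference of {path_cm:.2f} cm")
```

Under this assumption, a path difference of about 0.7 cm corresponds to a delay of roughly 20 microseconds.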
16 D U P L E X   T H E O R Y
- Interaural time differences (ITDs) → low frequencies
- Up to around 1500 Hz; sensitivity declines rapidly above 1000 Hz
- Smallest phase difference corresponds to the true ITD
- Interaural intensity differences (IIDs) → high frequencies
- The amount of attenuation varies across frequency
- Below 500 Hz, IIDs are negligible (due to diffraction)
- IIDs can reach up to 20 dB at high frequencies
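One way to see why time (phase) cues are tied to low frequencies: for a pure tone, an interaural delay appears as a phase shift, and once that shift exceeds half a cycle the smallest phase difference no longer identifies the true delay. The sketch below illustrates this under an assumed maximum ITD of 0.65 ms for a human head; that value and the function names are illustrative assumptions, not figures from the slides.

```python
import math

MAX_ITD_S = 0.00065  # assumed largest ITD for a human head (illustrative)


def interaural_phase_rad(freq_hz: float, itd_s: float) -> float:
    """Phase shift (radians) that a delay of itd_s produces at freq_hz."""
    return 2.0 * math.pi * freq_hz * itd_s


def phase_is_ambiguous(freq_hz: float, itd_s: float = MAX_ITD_S) -> bool:
    """True when the delay exceeds half a period of the tone."""
    return interaural_phase_rad(freq_hz, itd_s) > math.pi


for f in (250.0, 500.0, 1000.0, 1500.0, 3000.0):
    label = "ambiguous" if phase_is_ambiguous(f) else "unambiguous"
    print(f"{f:6.0f} Hz: phase cue {label} at the assumed maximum ITD")
```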
17 Outer ear
Pinna /Auricle
Auditory Canal
Horizontal localization
Sound Localization
Vertical localization
Is sound above or below?
Pinna Directional Filtering
- Pinna amplifies sounds from above and below differently
- Curves in its structure selectively amplify certain
parts of the sound spectrum
18 Outer ear
Pinna /Auricle
Auditory Canal
- Closed tube resonance: ¼ wave resonator
- Auditory canal length: 2.7 cm
- Resonance frequency: 3 kHz
- Boosts energy between 2-5 kHz by up to 15 dB
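The 3 kHz figure follows from treating the canal as a tube closed at one end (by the eardrum), which resonates when its length equals a quarter wavelength. A minimal sketch, assuming a speed of sound of 343 m/s and an idealized uniform tube:

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def quarter_wave_resonance_hz(length_m: float,
                              c: float = SPEED_OF_SOUND_M_S) -> float:
    """First resonance of a tube closed at one end: f = c / (4 * L)."""
    return c / (4.0 * length_m)


canal_length_m = 0.027  # 2.7 cm, as quoted above
f0 = quarter_wave_resonance_hz(canal_length_m)
print(f"Quarter-wave resonance of a 2.7 cm canal: about {f0:.0f} Hz")
```

The idealized result lands close to 3 kHz; a real canal is neither uniform nor rigidly terminated, so the boost is spread over a band rather than a single frequency.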
19 A N A T O M Y
20 Middle Ear
Eardrum
Ossicles
Oval window
Pressure variations are converted to mechanical motion
Eardrum → Ossicles → Oval Window
Ossicles: Malleus, Incus, Stapes
- Impedance matching
- Acoustic impedance of the fluid is 4000 x that of air
- All but 0.1% would be reflected back
- Amplification
- By lever action
- Area amplification: 55 mm² → 3.2 mm² (≈ 17x)
- Stapedius reflex
- Protection against low frequency loud sounds
- Tenses muscles → stiffens vibration of the ossicles
- Reduces sound transmitted (20 dB)
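The reflection figure can be checked with the standard formula for the fraction of intensity transmitted at a boundary between two media at normal incidence, T = 4r / (1 + r)^2, where r is the impedance ratio. This is a sketch under that textbook idealization; it is not necessarily how the slide's number was derived. The area ratio is computed from the two areas quoted above.

```python
import math


def intensity_transmission(z_ratio: float) -> float:
    """Fraction of incident intensity transmitted across a boundary
    whose two sides differ in acoustic impedance by z_ratio."""
    return 4.0 * z_ratio / (1.0 + z_ratio) ** 2


t = intensity_transmission(4000.0)   # fluid vs. air, ratio quoted above
loss_db = -10.0 * math.log10(t)      # the same loss expressed in dB
area_ratio = 55.0 / 3.2              # eardrum area / oval window area (mm^2)

print(f"Transmitted fraction without the middle ear: {t:.4%}")
print(f"Equivalent loss: {loss_db:.1f} dB")
print(f"Pressure gain from the area ratio alone: {area_ratio:.1f}x")
```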
21 A N A T O M Y
22 Inner Ear
Semicircular Canals
Cochlea
- Body's balance organs
- Accelerometers in 3 perpendicular planes
- Hair cells detect fluid movements
- Connected to the auditory nerve
23 Semicircular Canals
Inner Ear
Cochlea
- Cochlea is a snail-shell-like structure with 2.5 turns
- 3 fluid-filled parts
- Scala tympani
- Scala Vestibuli
- Cochlear duct (Organ of Corti)
- Organ of Corti
- Scala tympani
- Scala vestibuli
- Spiral ganglion
- auditory nerve fibres
24 Inner Ear
Semicircular Canals
Cochlea
- Organ of Corti
- Basilar membrane
- Inner hair cells and outer hair cells (16,000-20,000)
- IHC: 100 tiny stereocilia
- The body's microphone
- Vibrations of the oval window cause the cochlear fluid to vibrate
- Basilar membrane vibration produces a traveling wave
- Bending of the IHC cilia produces action potentials
- The outer hair cells amplify vibrations of the basilar membrane
25
- The cochlea works as a frequency analyzer
- It operates on the incoming sound's frequencies
26 Place Theory
- Each position along the basilar membrane (BM) has a characteristic frequency for maximum vibration
- Frequency of vibration depends on the place along the BM
- At the base, the BM is stiff and thin (more responsive to high frequencies)
- At the apex, the BM is wide and floppy (more responsive to low frequencies)
27 Tuning curves of auditory nerve fibers
- To determine the tonotopic map on the cochlea
- Apply 50 ms tone bursts every 100 ms
- Increase sound level until discharge rate increases by 1 spike
- Repeat for all frequencies
Response curve is a band-pass filter (BPF) with almost constant Q (f0/BW)
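A constant Q means that bandwidth grows in proportion to centre frequency. A minimal sketch of that relationship; the Q value of 8 is a hypothetical number chosen for illustration, not a measured figure from the slides.

```python
def bandwidth_hz(f0_hz: float, q: float) -> float:
    """Bandwidth of a band-pass filter from Q = f0 / BW."""
    return f0_hz / q


Q = 8.0  # hypothetical quality factor, for illustration only

for f0 in (250.0, 1000.0, 4000.0):
    print(f"f0 = {f0:6.0f} Hz -> BW = {bandwidth_hz(f0, Q):6.1f} Hz")
```

With Q held fixed, a fiber tuned to 4 kHz has sixteen times the bandwidth of one tuned to 250 Hz, so frequency resolution is roughly constant on a logarithmic rather than a linear scale.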
28 Auditory Neuron
Auditory Area of Brain
- Carries impulses from both the cochlea and the semicircular canals
- Connections with both auditory areas of the brain
- Neurons encode
- Steady state sounds
- Onsets or rapidly changing frequencies
29 Auditory Neurons Adaptation
- At onset, the auditory nerve fiber's firing rate increases rapidly
- If the stimulus remains (e.g. a steady tone), the rate decreases exponentially
- Spontaneous rate: neuron firings in the absence of a stimulus
Neuron is more responsive to changes than to
steady inputs
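The exponential decrease described above can be sketched as a single decaying exponential from an onset rate toward a steady-state rate. All three parameter values below (onset rate, steady rate, time constant) are hypothetical and chosen only to show the shape of the curve; real fibers vary and may show more than one time constant.

```python
import math


def firing_rate(t_s: float,
                onset_rate: float = 300.0,   # spikes/s at onset (hypothetical)
                steady_rate: float = 100.0,  # adapted rate (hypothetical)
                tau_s: float = 0.02) -> float:  # time constant (hypothetical)
    """Firing rate t_s seconds after the onset of a steady tone."""
    return steady_rate + (onset_rate - steady_rate) * math.exp(-t_s / tau_s)


for t_ms in (0, 10, 20, 50, 100):
    rate = firing_rate(t_ms / 1000.0)
    print(f"t = {t_ms:3d} ms -> {rate:6.1f} spikes/s")
```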
30 Perception of Sound
- Threshold of hearing
- How it is measured
- Age effects
- Equal Loudness curves
- Bass loss problem
- Critical bands
- Frequency Masking
- Temporal Masking
31 Threshold of Hearing
- Hearing area is the area between the threshold
in quiet and the threshold of pain
32 Bekesy Tracking
- STEPS
- Play a tone
- Vary its amplitude until it is audible
- Then the tone's amplitude is reduced to definitely inaudible and the frequency is slowly changed
- Continue
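The steps above can be sketched as an up-down tracking loop at a single frequency: raise the level until the listener reports hearing the tone, lower it until they stop, and average the turning points. This is a simplified illustration with a simulated listener; the function names, step size, and the 25 dB threshold are hypothetical, and the slow frequency sweep of the real procedure is left out.

```python
def track_threshold(is_audible, start_db=0.0, step_db=2.0,
                    n_reversals=6, max_steps=1000):
    """Estimate a threshold as the mean level at the direction reversals."""
    level = start_db
    going_up = True
    reversals = []
    for _ in range(max_steps):  # guard against a listener who never reverses
        heard = is_audible(level)
        if going_up and heard:
            reversals.append(level)      # just became audible: turn down
            going_up = False
        elif not going_up and not heard:
            reversals.append(level)      # just became inaudible: turn up
            going_up = True
        if len(reversals) >= n_reversals:
            break
        level += step_db if going_up else -step_db
    if not reversals:
        raise ValueError("no reversals observed")
    return sum(reversals) / len(reversals)


def simulated_listener(level_db):
    """Hypothetical listener with a fixed 25 dB threshold."""
    return level_db >= 25.0


print(f"Estimated threshold: {track_threshold(simulated_listener):.1f} dB")
```

Repeating the loop while stepping through frequencies would trace out a threshold curve, with a resolution limited by the step size.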
33 Threshold variation with age
- Presbycusis
- Hearing sensitivity decreases with age, especially at high frequencies
- Threshold of pain remains the same
- Reduced dynamic range
34 Equal Loudness Curves
Loudness is not simply sound intensity!
A factor of ten increase in intensity is needed
for the sound to be perceived as twice as loud.
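A tenfold intensity increase is a 10 dB step, so the rule of thumb above amounts to "each 10 dB doubles perceived loudness". The sketch below applies that rule uniformly, which is an approximation: it holds best for mid-level, mid-frequency sounds and is assumed here for all inputs.

```python
import math


def loudness_ratio(intensity_ratio: float) -> float:
    """Perceived loudness ratio, assuming loudness doubles per 10 dB."""
    delta_db = 10.0 * math.log10(intensity_ratio)
    return 2.0 ** (delta_db / 10.0)


for ratio in (2.0, 10.0, 100.0, 1000.0):
    print(f"intensity x{ratio:6.0f} -> loudness x{loudness_ratio(ratio):.2f}")
```

Under this rule, doubling the intensity makes a sound only about 1.2 times as loud.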
35 The Bass Loss Problem
For very soft sounds, near the threshold of
hearing, the ear strongly discriminates against
low frequencies. For mid-range sounds around 60
phons, the discrimination is not so pronounced.
For very loud sounds in the neighborhood of 120
phons, the hearing response is more nearly flat.
E.g. rock music: volume too low → no bass; volume too high → too much bass
36 Elephants
- Sound Production
- A typical male elephant's rumble is around an average minimum of 12 Hz, a female's rumble around 13 Hz and a calf's around 22 Hz.
- Produce sounds ranging over more than 10 octaves, from 5 Hz to over 9,000 Hz
- Produce very gentle, soft sounds as well as extremely powerful sounds (112 dB recorded a meter away)
- Hearing
- Wider tympanic membranes
- Longer ear canals (20 cm)
- Spacious middle ears
Low frequency detection
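The octave count quoted for elephant vocalizations can be checked directly, since an octave is a doubling of frequency. A minimal sketch using the figures from the slides (5 Hz to 9,000 Hz for elephants, 20 Hz to 20 kHz for human hearing):

```python
import math


def octaves_between(f_low_hz: float, f_high_hz: float) -> float:
    """Number of octaves (frequency doublings) between two frequencies."""
    return math.log2(f_high_hz / f_low_hz)


elephant_octaves = octaves_between(5.0, 9000.0)
human_octaves = octaves_between(20.0, 20000.0)

print(f"Elephant sound production: {elephant_octaves:.1f} octaves")
print(f"Human hearing range:       {human_octaves:.1f} octaves")
```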