Hallucinations in Auditory Perception

Slides: 82
Provided by: malcolm99
Transcript and Presenter's Notes

1
Hallucinations in Auditory Perception!!!
  • Malcolm Slaney
  • Yahoo! Research
  • Stanford CCRMA

2
Hadoop
3
(No Transcript)
4
(No Transcript)
5
One Dimensional (waveform)
  • Axes: pressure, time
Cochlear Processing
Two Dimensional (not a spectrogram)
  • Axes: cochlear place, time
Correlogram Processing
Three Dimensional (neural movie)
  • Axes: cochlear place, time, autocorrelation lag
6
Correlogram
  • Axes: distance down cochlea (center frequency) vs. time interval in seconds (autocorrelation lag)
With help from Richard O. Duda
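
One frame of a correlogram with the axes labeled above can be sketched as a short-time autocorrelation of each cochlear channel. This is a minimal sketch, assuming the cochlear filterbank output is already available as a channels-by-time array; the function name, window length, and lag count are illustrative choices, not details from the slides.

```python
import numpy as np

def correlogram_frame(cochleagram, start, win_len, max_lag):
    """Short-time autocorrelation of every channel for one frame.

    cochleagram: array of shape (channels, samples), e.g. filterbank output.
    Returns an array of shape (channels, max_lag): place vs. autocorrelation lag.
    """
    n_channels, n_samples = cochleagram.shape
    if start + win_len + max_lag > n_samples:
        raise ValueError("frame plus maximum lag runs past the end of the signal")
    frame = np.empty((n_channels, max_lag))
    window = cochleagram[:, start:start + win_len]
    for lag in range(max_lag):
        shifted = cochleagram[:, start + lag:start + lag + win_len]
        frame[:, lag] = np.sum(window * shifted, axis=1)
    return frame

# Toy input: two "channels" carrying pulse trains with periods of 20 and 25 samples.
t = np.arange(400)
toy = np.stack([(t % 20 == 0).astype(float), (t % 25 == 0).astype(float)])
frame = correlogram_frame(toy, start=0, win_len=200, max_lag=60)
```

Each row of `frame` peaks at lags equal to multiples of that channel's period. Computing one such frame per time step gives the three-dimensional "neural movie" (place, lag, time).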
7
Correlogram
8
Success
  • Reconstructing from correlogram
  • NIPS Keynote

9
Problems
  • Continuation
  • Tone and Noise
  • Parliament Cough
  • Hear two voices?
  • What do you hear?
  • Waveforms?
  • Ideas?

10
Pressure vs. time
Cochlear Processing
Cochlear place vs. time
Correlogram Processing
Cochlear place vs. time vs. autocorrelation lag
11
Speech Examples
Wedding
Sine
Natural
12
What Vowel is This?
Word 1
Word 2
Peter Ladefoged
Word 3
13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
McGurk
17
Sinewave
18
(No Transcript)
19
(No Transcript)
20
ASR
  • Language model for the words one, two, three
  • Word model showing phonemes for the word one: /w/ /ʌ/ /n/
  • Acoustic (phoneme) model for the phoneme /ʌ/: states S1, S2, S3
21
Conventional Scene Analysis
Slide by Dan Ellis (Columbia)
22
Barker: ASR
23
Goto: CASA with MIDI
MIDI Sequence
24
Old plus New Principle
Slide by Dan Ellis (Columbia)
25
Ellis: Prediction Driven
26
Saliency
27
Saliency Example
  • Time-frequency display
  • Saliency map shows high-interest locations

28
Saliency Maps
  • Longer tones better
  • Missing parts salient
  • Modulation more salient
  • Forward masking works

29
Sound Examples
  • Birds
  • Calls
  • Cows
  • Horse
  • Waterfall

30
Saliency Comparison
  • Details of saliency comparison
  • Model predictions

31
Relational Network (Simple)
  • Patches of neurons
  • Each measures one quantity
  • Bidirectional relations for feedback/feedforward
  • Thanks to Rodney Douglas

32
Relational Network (example)
33
ASR Relational Network
Bidirectional links enforce phoneme/word constraints
Phone Recognizer
Cochlea
Word Recognizer
Phone Recognizer
Delay
A patch of neurons (one of N output)
Note: We don't know how to represent delays
34
Desired Results
Relational Feedback
With
/A/ Phoneme Patch
/I/ Phoneme Patch
AI Word Patch
IA Word Patch
Phoneme Input
35
Simulation
36
Simulation 2
37
Simulation 3
38
Grossberg: ART
39
Statistical Means
  • ICA
  • Different distributions
  • One Microphone
  • GMM models of distribution

40
Conventional
41
Better?
42
Thanks
  • malcolm_at_ieee.org

43
(No Transcript)
44
Pitch
45
Silicon Frequency Response
  • Tone ramps into two cochleas

46
Cochlear Best Frequency
47
Cochlear Rate Profiles
Spikes per utterance
Left Cochlea
Right Cochlea
48
Hardware Overview
  • Diagram blocks: Cochlea (two), Learning (three), PCI-AER (for remapping), Phoneme, Word
  • Giacomo Indiveri, Shih-Chii Liu
  • Implemented in MATLAB
49
(No Transcript)
50
LSH Movie
51
Auditory Map
By Lloyd Watts
52
Please do more Neurophysiology!
David
Jerry
Prabhakar
53-60
(No Transcript)
61
Timbre definition
  • Sound color
  • Instruments
  • Vowels

All sound
Static
Timbre
Pitch
Dynamic
Loudness
62
Multi-Dimensional Scaling of Timbre
  • Measure distances
  • Estimate positions
  • Art: label the axes

Axes: decay, spectral centroid, spectral flux (McAdams et al., 1995)
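
The "measure distances, estimate positions" steps can be illustrated with classical (Torgerson) multidimensional scaling. This is a sketch under assumptions: classical MDS is chosen for brevity and is not necessarily the variant used in the cited study, and the four-point data set is made up.

```python
import numpy as np

def classical_mds(distances, n_dims):
    """Classical MDS: positions whose Euclidean distances approximate `distances`."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering   # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(gram)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]         # keep the largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Made-up example: four sounds placed on a 2-D grid, then recovered from distances alone.
true_pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 2.0]])
dist = np.linalg.norm(true_pos[:, None, :] - true_pos[None, :, :], axis=-1)
est_pos = classical_mds(dist, n_dims=2)
est_dist = np.linalg.norm(est_pos[:, None, :] - est_pos[None, :, :], axis=-1)
```

The recovered positions match the originals only up to rotation, reflection, and translation, which is why labeling the axes afterward is described as an art.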
63
Desired perception model
  • Compact (parsimonious)
  • Three Properties
  • Predictive
  • Explain distance perception
  • Simple model
  • Orthogonal axis
  • Linear model
  • Interpolate sounds

Test Euclidean distance
Assumption
64
Experimental Contrast
Guess a model that fits the data
  • Old Way
  • New Way

Sound
Perception
Model
Parameter space
Perception
65
Spectral shape using MFCC
Time (frames)
  • A huge tapestry hung in her hallway.

66
MFCC and LFC
67
Kernel function of DCT
  • Spectrum: superposition of DCT kernels
  • Cepstrum coefficients: coefficients for the superposition
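
The relationship between a spectrum and its cepstrum coefficients can be sketched with an orthonormal DCT-II basis. The spectrum below is an illustrative smooth curve, not data from the talk, and the choice of 32 bins and 13 retained coefficients is an assumption. In MFCC the DCT is applied to the log of mel-filterbank energies; the filterbank step is omitted here.

```python
import numpy as np

def dct_kernels(n_bins):
    """Orthonormal DCT-II kernels; row k is the k-th cosine kernel over n_bins points."""
    n = np.arange(n_bins)
    k = n[:, None]
    kernels = np.cos(np.pi * (n[None, :] + 0.5) * k / n_bins)
    kernels[0] *= np.sqrt(1.0 / n_bins)
    kernels[1:] *= np.sqrt(2.0 / n_bins)
    return kernels

n_bins = 32
kernels = dct_kernels(n_bins)
# Illustrative smooth log-spectrum: a constant plus one cosine ripple (kernel 3).
log_spectrum = 1.0 + 0.5 * np.cos(np.pi * (np.arange(n_bins) + 0.5) * 3 / n_bins)
cepstrum = kernels @ log_spectrum            # coefficients for the superposition
rebuilt = kernels.T @ cepstrum               # spectrum as a weighted sum of kernels
smooth = kernels[:13].T @ cepstrum[:13]      # keep only the low-order coefficients
```

Because this spectrum contains only a constant and one ripple, only `cepstrum[0]` and `cepstrum[3]` are non-zero, and truncating to low-order coefficients loses nothing. A spectrum with fine detail would be smoothed by the same truncation.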

68
Parameter space MFCC
  • Axes: C6 and C3, each with values 0, 0.25, 0.5, 0.75
69
Parameter space LFC
  • Axes: C6 and C3, each with values 0, 0.25, 0.5, 0.75
70
Synthesize stimuli
  • Harmonics pitch and vibrato
  • Amplitude weighted by the spectral shape

Panels: flat, weighted
Desired spectral shape (vertical: frequency, horizontal: amplitude)
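
The stimulus recipe above (harmonics with pitch and vibrato, amplitudes weighted by a spectral shape) can be sketched as additive synthesis. All parameter values here (220 Hz fundamental, 5 Hz vibrato, 1% depth, 16 kHz sample rate, exponential weighting) are assumptions for illustration and are not taken from the experiment.

```python
import numpy as np

def synthesize_tone(f0, spectral_shape, duration=1.0, sample_rate=16000,
                    vibrato_rate=5.0, vibrato_depth=0.01):
    """Additive synthesis: harmonics of f0 with vibrato, weighted by spectral_shape(freq)."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    # Instantaneous fundamental with sinusoidal vibrato; phase is its running integral.
    inst_f0 = f0 * (1.0 + vibrato_depth * np.sin(2 * np.pi * vibrato_rate * t))
    phase = 2 * np.pi * np.cumsum(inst_f0) / sample_rate
    # Keep every harmonic below the Nyquist frequency, even at the vibrato peak.
    n_harmonics = int((sample_rate / 2) // (f0 * (1.0 + vibrato_depth)))
    tone = np.zeros_like(t)
    for h in range(1, n_harmonics + 1):
        tone += spectral_shape(h * f0) * np.sin(h * phase)
    peak = np.max(np.abs(tone))
    return tone / peak if peak > 0 else tone

flat = synthesize_tone(220.0, lambda f: 1.0)                      # all harmonics equal
weighted = synthesize_tone(220.0, lambda f: np.exp(-f / 1000.0))  # high harmonics attenuated
```

In the experiment described on the following slides, the weighting function would presumably be the spectral envelope implied by a chosen set of cepstrum coefficients rather than the exponential used here.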
71
Experiment procedures
  • Paired stimuli (AB, AG, AD, )
  • Rate dissimilarities on a 0-9 scale
  • 10 subjects
  • Quiet office
  • Individual sessions (headphone)

72
Euclidean Fitting
  • 2D linear regression
  • Known values x, y, d - estimate a and b
  • Residual from Euclidean model
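
One plausible reading of the fitting step is a least-squares fit of a weighted Euclidean model, d² = a·x² + b·y², where x and y are the parameter differences between two stimuli and d is the rated dissimilarity. That model form is an assumption (the slide names only x, y, d, a, and b), and the data below are synthetic.

```python
import numpy as np

def fit_euclidean(x, y, d):
    """Least-squares fit of d**2 = a*x**2 + b*y**2 (assumed model form).

    Returns (a, b, residuals), where residuals = d**2 - (a*x**2 + b*y**2).
    """
    x, y, d = (np.asarray(v, dtype=float) for v in (x, y, d))
    design = np.column_stack([x ** 2, y ** 2])
    target = d ** 2
    (a, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ np.array([a, b])
    return a, b, residuals

# Synthetic check with made-up values: distances generated with a = 4, b = 9.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 0.75, size=50)
y = rng.uniform(0.0, 0.75, size=50)
d = np.sqrt(4.0 * x ** 2 + 9.0 * y ** 2)
a, b, residuals = fit_euclidean(x, y, d)
```

With real ratings the residuals would not vanish; their size indicates how far perception departs from the Euclidean model.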

73
Results summary
Tristimulus model
LFC
MFCC
74
Experiment results
  • MFCC most successful timbre model
  • Less linearity for high coeffs

75
(No Transcript)
76
Remix Examples
Abba Gimme Gimme
Madonna Hung Up
Tracy Young Remix of Hung Up
Tracy Young Remix 2 of Hung Up
77
Specificity Spectrum
Cover songs
Remixes
Fingerprinting
Genre
Look for specific exact matches
Bag of Features model
Our work (nearest neighbor)
78
Cross-Correlation
72 Billion
  • 2M songs
  • 3 minutes
  • 10 frames/second

72 Billion
79
Curse of Dimensionality
  • Histogram of distances between Gaussian data
  • Normalized to the mean
  • Nearest neighbor: ill-posed?
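
The effect behind this slide can be reproduced with a small simulation: as dimensionality grows, pairwise distances between Gaussian points, normalized to their mean, concentrate around 1, so the nearest neighbor is barely closer than any other point. The point count and the dimensions tried are arbitrary choices for illustration.

```python
import numpy as np

def normalized_distances(n_points, n_dims, seed=0):
    """Pairwise Euclidean distances between Gaussian points, divided by their mean."""
    rng = np.random.default_rng(seed)
    points = rng.standard_normal((n_points, n_dims))
    sq_norms = np.sum(points ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    upper = np.triu_indices(n_points, k=1)            # each pair once, no self-pairs
    dists = np.sqrt(np.clip(sq_dists[upper], 0.0, None))
    return dists / dists.mean()

for n_dims in (2, 20, 200, 2000):
    nd = normalized_distances(200, n_dims)
    counts, edges = np.histogram(nd, bins=20, range=(0.0, 2.0))  # histogram as on the slide
    print(f"dims={n_dims:5d}  std of normalized distance={nd.std():.3f}")
```

The printed spread shrinks steadily with dimension, which is the sense in which nearest-neighbor search can become ill-posed for high-dimensional data.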

80
Distractors
81
Correlogram
  • Axes: distance down cochlea (center frequency) vs. time interval in seconds (autocorrelation lag)