Title: Pitch Recognition with Wavelets
1Pitch Recognition with Wavelets
- 1.130 Final Presentation
- by Stephen Geiger
2What is pitch recognition?
- Well, what is pitch? . . .
- How HIGH or LOW a sound is
- Which note?
- Perceived Frequency
3Relationship Between Pitch and Frequency
Pitch
Fundamental Frequency
4For Example
Frequency 262 Hz
5For an A Scale
A   220 × 2^(0/12)  = 220 Hz
A#  220 × 2^(1/12)  = 233 Hz
B   220 × 2^(2/12)  = 247 Hz
C   220 × 2^(3/12)  = 262 Hz
C#  220 × 2^(4/12)  = 277 Hz
D   220 × 2^(5/12)  = 294 Hz
D#  220 × 2^(6/12)  = 311 Hz
E   220 × 2^(7/12)  = 330 Hz
F   220 × 2^(8/12)  = 349 Hz
F#  220 × 2^(9/12)  = 370 Hz
G   220 × 2^(10/12) = 392 Hz
G#  220 × 2^(11/12) = 415 Hz
A   220 × 2^(12/12) = 440 Hz
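The frequencies above follow the equal-temperament rule: each semitone multiplies the frequency by 2^(1/12). A minimal Python sketch that reproduces them is given here; the function name and the sharp-based note spelling are illustrative choices, not taken from the slides.

```python
# Equal-tempered chromatic scale starting from A = 220 Hz.
# NOTE_NAMES and equal_tempered_scale are illustrative names.
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#",
              "E", "F", "F#", "G", "G#", "A"]

def equal_tempered_scale(base_hz=220.0, steps=12):
    """Return base_hz * 2**(n/12) for n = 0..steps (inclusive)."""
    return [base_hz * 2 ** (n / 12) for n in range(steps + 1)]

scale = equal_tempered_scale()
for name, freq in zip(NOTE_NAMES, scale):
    print(f"{name:<2} {freq:7.2f} Hz")
```

Rounding each value to the nearest hertz gives the numbers listed on the slide.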
6An Octave Up
Frequency 524 Hz
7A Sum with 2 Frequencies
Frequency 262 Hz and Frequency 524 Hz
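A sum like this can be generated and inspected numerically. The sketch below assumes an 8 kHz sample rate and a one-second signal (neither is stated on the slides), and it evaluates single DFT components directly instead of calling an FFT library, so it needs only the standard library.

```python
import math

SAMPLE_RATE = 8000  # Hz; an assumed value

def tone_sum(freqs_hz, n_samples=SAMPLE_RATE):
    """Sum of unit-amplitude sinusoids, one second long by default."""
    return [sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in freqs_hz)
            for n in range(n_samples)]

def dft_magnitude(signal, freq_hz):
    """Magnitude of one DFT component, scaled so a unit sinusoid gives about 1."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

signal = tone_sum([262, 524])
# Probe the two components plus a few frequencies that are not present.
spectrum = {f: dft_magnitude(signal, f) for f in (131, 262, 393, 524, 786)}
for f, mag in spectrum.items():
    print(f"{f:4d} Hz  {mag:.3f}")
```

Both components appear with magnitude close to 1 while the other probe frequencies stay near zero. The spectrum alone does not say whether this is one note with a strong second harmonic or two notes an octave apart.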
8Freq in a Piano - Middle C
Frequency, Hz
9FFT of an Oboe Middle C
Frequency, Hz
10Mono vs. Poly
- Monophonic
- one note at a time
- (e.g. trumpet)
- Polyphonic
- multiple notes at a time
- (e.g. piano, orchestra)
Creates a problem for pitch recognition (especially octaves!).
11Some Existing Methods
- Time Domain: Pitch Period Estimation
- With wavelets.
- With auto-correlation function.
- Freq. Domain: Find Fundamental
- Auditory Scene Analysis
- Blackboard Systems
- Neural Networks
- Perceptual Models
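As an illustration of the time-domain methods listed above, pitch period estimation with the autocorrelation function can be sketched as follows. This is a generic textbook-style version, not the method used in this project. The sample rate, the 200 to 1000 Hz search range, and the test tone (262 Hz plus a half-amplitude 524 Hz component) are all assumptions chosen for the example.

```python
import math

def autocorr_pitch(signal, sample_rate, min_hz=200.0, max_hz=1000.0):
    """Return (lag, pitch_hz) for the lag with the largest autocorrelation."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    window = len(signal) - max_lag  # same number of terms for every lag
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[i] * signal[i + lag] for i in range(window))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag, sample_rate / best_lag

SAMPLE_RATE = 8000  # Hz; an assumed value
tone = [math.sin(2 * math.pi * 262 * n / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 524 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE + 40)]

lag, pitch_hz = autocorr_pitch(tone, SAMPLE_RATE)
print(f"best lag = {lag} samples, estimated pitch = {pitch_hz:.1f} Hz")
```

Because only whole-sample lags are tested, the estimate lands near 262 Hz rather than exactly on it. Widening the search range downward also lets multiples of the period compete, which is one way octave errors arise.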
12What applications are there?
- Transcription of Music
- Modeling of Musical Instruments
- Speech Analysis
- Besides, it's an interesting problem
13
14A Novel Wavelet Approach
Based on an observation made by Jeremy Todd: for a piano playing these notes, a CWT could be used to identify a G with certain scale/wavelet combinations, even with some polyphony!
15Finding a G in a C Scale
Original Signal
CWT at Specific Scale
16The Continuous Wavelet Transform
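A commonly used form of the CWT, written with the symbols explained on this slide, is shown here. The 1/√a normalization and the restriction to a > 0 are assumptions (conventions vary), and the complex conjugate on the wavelet is omitted because the Gaussian wavelet used in this work is real-valued.

```latex
C(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty}
          f(t)\, \Psi\!\left(\frac{t - b}{a}\right) dt , \qquad a > 0
```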
Where:
a = scaling factor
b = shift factor
f(t) = function we start with
Ψ(t) = mother wavelet
17What is Scale?
LOW SCALE: compressed wavelet, lots of detail, high frequency
HIGH SCALE: stretched wavelet, coarse features, low frequency
(You are here)
(And here)
18Gaussian 2nd Order Wavelet
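A rough sketch of this wavelet and of a CWT evaluated at a single scale is given below. It assumes the shape (1 - 2t²)·exp(-t²), which is the second derivative of exp(-t²) up to sign and a constant factor; the exact normalization used by any particular wavelet toolbox may differ. The function names, the truncation at ±5 scale units, and the direct-sum discretization with the scale measured in samples are illustrative choices.

```python
import math

def gaus2(t):
    """Second-derivative-of-Gaussian shape; sign and scaling are assumptions."""
    return (1.0 - 2.0 * t * t) * math.exp(-t * t)

def cwt_at_scale(signal, scale, half_width=5.0):
    """Direct-sum CWT at one scale (in samples), one coefficient per shift b."""
    reach = int(half_width * scale)   # wavelet is negligible beyond this
    norm = 1.0 / math.sqrt(scale)
    coeffs = []
    for b in range(len(signal)):
        lo = max(0, b - reach)
        hi = min(len(signal), b + reach + 1)
        acc = sum(signal[n] * gaus2((n - b) / scale) for n in range(lo, hi))
        coeffs.append(norm * acc)
    return coeffs

scale = 8
tone = [math.sin(2.0 * n / scale) for n in range(400)]  # 2/scale rad per sample
flat = [1.0] * 400                                      # constant signal

tone_peak = max(abs(c) for c in cwt_at_scale(tone, scale))
flat_mid = abs(cwt_at_scale(flat, scale)[200])
print(f"peak response to tone: {tone_peak:.2f}")
print(f"response to constant:  {flat_mid:.2e}")
```

The wavelet has zero mean, so a constant signal gives essentially no response away from the edges, while an oscillation at a frequency matched to the scale gives a large one.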
19Initial Work
- Took an empirical approach.
- Ran a number of CWTs at varying scale, and looked at the results.
- Picked out a CWT scale for each note in the C scale.
20Finding Notes in a C Scale
Original
Scale 594 530 472 446 394 722 642 606
21Finding Notes w/ Polyphony
Original
Scale 594 530 472 446 394 722 642 606
22More Complex Polyphony
Original
Scale 594 530 472 446 394 722 642 606
23Testing with different timbre
Original
Scale 594 530 472 446 394 722 642 606
24Why does this work?
- The scale parameter in the CWT affects frequency response.
- However, our scales that work don't seem to follow a clear pattern.
25Training Algorithm
- Again, took an empirical approach.
- Ran CWTs at varying scales on sample files containing one note.
- Picked out scales where the maximum of the CWT for one note >> other notes (and collected results).
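The selection step described above can be sketched as follows. This is one possible reading, not the presenter's actual code: it scores each candidate scale by the ratio of a note's peak |CWT| to the largest peak among the other notes and keeps the best-scoring scale. The wavelet shape, the ratio criterion, the synthetic sinusoidal "notes", and all names are assumptions for illustration.

```python
import math

def gaus2_kernel(scale, half_width=5.0):
    """Sampled (1 - 2t^2) exp(-t^2) wavelet at the given scale, with 1/sqrt(a)."""
    reach = int(half_width * scale)
    return [(1 - 2 * (k / scale) ** 2) * math.exp(-(k / scale) ** 2)
            / math.sqrt(scale) for k in range(-reach, reach + 1)]

def max_abs_cwt(signal, scale):
    """Largest |CWT| over shifts where the whole kernel fits in the signal."""
    kernel = gaus2_kernel(scale)
    best = 0.0
    for start in range(len(signal) - len(kernel) + 1):
        acc = sum(k * signal[start + i] for i, k in enumerate(kernel))
        best = max(best, abs(acc))
    return best

def train_scales(note_samples, candidate_scales):
    """For each note, pick the scale where its peak most exceeds all other notes."""
    peaks = {s: {note: max_abs_cwt(sig, s) for note, sig in note_samples.items()}
             for s in candidate_scales}
    chosen = {}
    for note in note_samples:
        best_scale, best_ratio = None, 0.0
        for s in candidate_scales:
            rival = max(p for other, p in peaks[s].items() if other != note)
            ratio = peaks[s][note] / max(rival, 1e-12)
            if ratio > best_ratio:
                best_scale, best_ratio = s, ratio
        chosen[note] = (best_scale, best_ratio)
    return chosen

# Three synthetic single-note "files": pure tones an octave apart.
notes = {name: [math.sin(w * n) for n in range(500)]
         for name, w in (("low", 0.1), ("mid", 0.2), ("high", 0.4))}
trained = train_scales(notes, [4, 6, 8, 12, 16, 24])
for note, (s, ratio) in trained.items():
    print(f"{note:>4}: scale {s}, ratio {ratio:.2f}")
```

A threshold on the ratio could be added to reject notes for which no scale separates them clearly from the rest.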
26Results of Training Algorithm . . .
27Longer C Scale Trained on 3 Octaves of Notes
28A Fragment by Chopin
From Right Hand of Prelude in C, Op. 28 No. 1
29Training on a Real Guitar
- Only able to find 5 of 8 pitches for C Scale training case (with limited attempt).
- Results on a test file were not completely accurate.
- Expected to be a more difficult case than a piano.
- Could merit a more thorough try.
30Entire 88 K on a P
- Work in progress.
- It takes a long time to run many CWTs on 88 different sound files.
- Initial results able to identify notes 70-88.
31Frequency Response Revisited
Frequency Response of a 2nd Order Gaussian Wavelet
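For reference, the frequency response can be worked out in closed form under some assumptions: the unnormalized wavelet shape ψ(t) = (1 - 2t²)·exp(-t²), the Fourier convention shown in the first line, and a scaled wavelet ψ_a(t) = ψ(t/a)/√a with the scale a measured in samples at sample rate f_s. A different normalization changes the height of the curve but not the location of its peak.

```latex
\psi(t) = (1 - 2t^2)\, e^{-t^2}
\quad\Longrightarrow\quad
\hat{\psi}(\omega) = \int_{-\infty}^{\infty} \psi(t)\, e^{-i\omega t}\, dt
                   = \frac{\sqrt{\pi}}{2}\, \omega^2 e^{-\omega^2/4}

\frac{d}{d\omega}\left( \omega^2 e^{-\omega^2/4} \right) = 0
\quad\Longrightarrow\quad \omega_{\text{peak}} = 2

\left| \widehat{\psi_a}(\omega) \right| = \sqrt{a}\; \hat{\psi}(a\omega),
\qquad \omega_{\text{peak}}(a) = \frac{2}{a}\ \text{rad/sample},
\qquad f_{\text{peak}}(a) = \frac{f_s}{\pi a}
```

Under these assumptions each scale acts as a band-pass filter whose center frequency falls as the scale grows.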
32Resulting Scales for 22 Piano Notes
SCALE
NOTE NUMBER
33Resulting Scales for 8 Sinusoidal Notes
SCALE
NOTE NUMBER
34Conclusions
- The novel wavelet approach isn't perfect.
- Requiring training is a handicap.
- Most likely not suited to sources with varying timbre (e.g. guitar, voice).
- Some interesting results.
- The mechanism of detection could be further investigated and better understood.