Title: Applied Psychoacoustics Lecture 4: Pitch
1 Applied Psychoacoustics, Lecture 4: Pitch and Timbre Perception
2 Homework 2: Raw Data
3 Homework 2: Mean
4 Homework 2: Mean and STD
5 Mean and Standard Deviation
(Figure labels: mean, standard deviation)
6 Homework 2: Mean vs. Median
(Figure legend: mean in blue, median in red)
7 Homework 2: Median
8 Statistics for non-Gaussian distributions
- The median is the number that separates the higher half of a sample, a population, or a probability distribution from the lower half.
- Quartiles
  - First quartile (designated Q1): lower quartile, cuts off the lowest 25% of the data (25th percentile)
  - Second quartile (designated Q2): median, cuts the data set in half (50th percentile)
  - Third quartile (designated Q3): upper quartile, cuts off the highest 25% of the data, or the lowest 75% (75th percentile)
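The quartile definitions above can be tried out with a few lines of Python. This is only a sketch: the data values are hypothetical (they are not the Homework 2 results), and the `inclusive` interpolation method is one of several common conventions for computing quartiles.

```python
import statistics

# Hypothetical skewed data with one outlier (illustrative values only).
data = [2, 3, 3, 4, 5, 6, 7, 8, 40]

mean = statistics.mean(data)      # pulled upward by the outlier
median = statistics.median(data)  # robust against the outlier

# quantiles(n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

print(f"mean = {mean:.2f}, median = {median}")
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
```

For a distribution like this one, the mean lies above the third quartile, which illustrates why the median and quartiles are the safer summary for non-Gaussian data.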
9 Homework 2: Median
10 Homework 2: Median and Quartiles
11 Homework 2: Mean vs. Median
(Figure legend: mean in blue, median in red)
12 Homework 2: Median and Quartiles
13 HW 2: Median, Quartiles, and Range
14 HW 2: Left Ear vs. Right Ear
(Figure legend: left ear in blue, right ear in red)
15 "Some sounds are higher pitched, being composed of more frequent and more numerous motions."
Euclid (330-275 BC)
16 Contents
- Pitch perception
- Pure Tones
- Place and Rate Theory
- Complex Tones
- Timbre
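20 Equal Temperament
- One semitone: $\sqrt[12]{2} \approx 1.0595$ (a frequency increase of 5.9463%)
- One cent: $\sqrt[1200]{2} \approx 1.0006$ (a frequency increase of about 0.058%)
- Perfect fifth: $1.5 \approx 700$ cents (50%)
- Perfect octave: $2 = 1200$ cents (100%)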
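The interval sizes above follow from the definition of the cent as 1/1200 of an octave on a logarithmic frequency scale. A minimal sketch of the conversion (the function names are chosen here for illustration):

```python
import math

def ratio_to_cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)

def cents_to_ratio(cents):
    """Frequency ratio that corresponds to an interval in cents."""
    return 2.0 ** (cents / 1200.0)

semitone = cents_to_ratio(100)           # equal-tempered semitone
tempered_fifth = cents_to_ratio(700)     # equal-tempered fifth
just_fifth_cents = ratio_to_cents(1.5)   # pure 3:2 fifth

print(f"semitone ratio: {semitone:.6f}")
print(f"tempered fifth ratio: {tempered_fifth:.6f}")
print(f"3:2 fifth in cents: {just_fifth_cents:.3f}")
```

The pure 3:2 fifth comes out slightly larger than 700 cents, which is why the slide relates the ratio 1.5 to 700 cents only approximately.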
21 Psychometric Function
Describes the relationship between a physical parameter and its psychological correlate.
Example: phon-sone conversion
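As an illustration of the phon-sone example, the sketch below assumes the commonly used rule that 40 phon corresponds to 1 sone and that every increase of 10 phon doubles the loudness in sones. This rule is not stated on the slide and is usually considered valid only above roughly 40 phon.

```python
import math

def phon_to_sone(phon):
    """Loudness in sones, assuming 40 phon = 1 sone and doubling per 10 phon."""
    return 2.0 ** ((phon - 40.0) / 10.0)

def sone_to_phon(sone):
    """Inverse of phon_to_sone under the same assumption."""
    return 40.0 + 10.0 * math.log2(sone)

for level in (40, 50, 60, 70):
    print(f"{level} phon -> {phon_to_sone(level):.1f} sone")
```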
22 Weber-Fechner Law
- The earliest scientific approach to measuring a psychometric function.
- Ernst H. Weber (1795-1878) investigated just noticeable differences (JNDs) for lifting weights with the hand.
- The subjects were blindfolded, and the weight was gradually increased until they were able to detect a difference.
- He noticed that the JNDs were proportional to the overall weight (e.g., if the JND for a 100 g weight was 10 g, the JND for a 1000 g weight was 100 g). If the mass is doubled, the threshold is also doubled.
23 Weber-Fechner Law
- Gustav T. Fechner (1801-1887) later developed the Weber-Fechner Law from Weber's findings:
  $S = k \log(I / I_0)$
- Here I is the physical parameter (intensity), S its psychophysical correlate, k a constant, and $I_0$ the detection threshold of I.
- One JND then corresponds to a constant sensation increment, $dS = k \, dI / I$ (for the natural logarithm; another base only rescales k).
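The step from Weber's observation to Fechner's law can be written out as a short derivation. This is a sketch under two assumptions: the Weber fraction c (a symbol introduced here) is constant, and every JND adds the same increment of sensation. The natural logarithm is used; a different base only changes the constant k.

```latex
\begin{align}
\frac{\Delta I}{I} &= c
  && \text{(Weber: the JND is a constant fraction of } I\text{)} \\
dS &= k \, \frac{dI}{I}
  && \text{(each JND adds the same sensation increment)} \\
S &= k \int_{I_0}^{I} \frac{dI'}{I'} = k \ln\frac{I}{I_0}
  && \text{(Fechner: } S = 0 \text{ at the threshold } I_0\text{)}
\end{align}
```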
24 Fechner's indirect scales
- 0 sensation units (0 JND of sensation): stimulus intensity at absolute detection threshold
- 1 sensation unit (1 JND of sensation): stimulus intensity that is 1 difference threshold above absolute threshold
- 2 sensation units (2 JND of sensation): stimulus intensity that is 1 difference threshold above the 1-unit stimulus
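The construction of such an indirect scale can be sketched by stepping upward from the absolute threshold one difference threshold at a time. The example assumes a constant Weber fraction (here a hypothetical 10%), so that each step multiplies the intensity by the same factor; the function names and values are illustrative only.

```python
import math

def count_jnds(intensity, threshold, weber_fraction):
    """Count the whole JND steps between the absolute threshold and intensity."""
    steps = 0
    level = threshold
    # Each sensation unit lies one difference threshold above the previous one.
    while level * (1.0 + weber_fraction) <= intensity:
        level *= 1.0 + weber_fraction
        steps += 1
    return steps

def fechner_units(intensity, threshold, weber_fraction):
    """Closed-form (fractional) number of JNDs: a logarithm of I / I0."""
    return math.log(intensity / threshold) / math.log(1.0 + weber_fraction)

steps = count_jnds(2.0, 1.0, 0.1)
units = fechner_units(2.0, 1.0, 0.1)
print(steps, round(units, 2))
```

The step count agrees with the integer part of the closed-form value, which is the sense in which counting JNDs leads to a logarithmic scale.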
25 Fechner's Law
26 Pitch
- Pitch is often thought to be perceived logarithmically.
- But for other psychophysical correlates, this logarithmic relationship does not hold true.
27 Stevens' Power Law
- Stevens was able to provide a general formula to relate sensation magnitudes to stimulus intensity:
  $S = a I^m$
- Here, the exponent m denotes to what extent the sensation is an expansive or compressive function of stimulus intensity.
- The purpose of the coefficient a is to adjust for the size of the unit of measurement.
- In logarithmic form, with the intensity measured above the threshold $I_0$ (I replaced by $I - I_0$):
  $\log S = m \log(I - I_0) + \log a$
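Because the power law becomes a straight line in log-log coordinates, the exponent m can be estimated as the slope of a linear fit. The sketch below uses synthetic, noise-free data generated from hypothetical values (a = 2, m = 0.5) and assumes $I_0 = 0$; it is not taken from any of the experiments cited in the lecture.

```python
import math

def fit_power_law(intensities, sensations):
    """Least-squares line through (log I, log S); returns (a, m)."""
    xs = [math.log10(i) for i in intensities]
    ys = [math.log10(s) for s in sensations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx                   # slope = exponent
    log_a = mean_y - m * mean_x     # intercept = log of the coefficient
    return 10.0 ** log_a, m

# Synthetic data from hypothetical parameters a = 2, m = 0.5.
intensities = [1.0, 10.0, 100.0, 1000.0]
sensations = [2.0 * i ** 0.5 for i in intensities]

a, m = fit_power_law(intensities, sensations)
print(f"a = {a:.3f}, m = {m:.3f}")
```

An exponent below 1 indicates a compressive function, an exponent above 1 an expansive one.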
28 Examples for Stevens' Power Law
29 Examples for Stevens' Power Law Exponents
30 ... and now in log-log space
31 Definition of Pitch
- "Pitch is that attribute of auditory sensation in terms of which sounds may be ordered on a scale extending from low to high. Pitch depends mainly on the frequency content of the sound stimulus, but it also depends on the sound pressure and the waveform of the stimulus."
ANSI standard, 1994
32 The mel scale
- Five listeners were asked to adjust the frequency of a second sinusoidal tone generator until its pitch was perceived as half the magnitude of that of the first oscillator, which had a constant frequency (method of adjustment).
- The sound was switched between both oscillators (2-s interval).
- Level: 60 dB SPL
Stevens, Volkmann, and Newman, 1937
33 Mel Scale: Raw Data
Geometric means for five observers, and average error for 2 listeners
Stevens, Volkmann, and Newman, 1937
34 Definition: 1000 mels = 1000 Hz at 40 dB
Stevens, Volkmann, and Newman, 1937
35 (Figure legend) Solid line: mel scale / 2.83. Black squares: integrated difference limens. Open circles: relative location of the resonant positions on the basilar membrane.
Stevens, Volkmann, and Newman, 1937
36 Size of musical interval in terms of mels
Stevens, Volkmann, and Newman, 1937
37 Hz/mel conversion
- To convert f hertz into m mel, use
  $m = 1127.01048 \, \ln(1 + f / 700)$
- And the inverse:
  $f = 700 \, (e^{m / 1127.01048} - 1)$
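The two conversion formulas translate directly into code. A minimal sketch (function and constant names are chosen here for illustration):

```python
import math

MEL_FACTOR = 1127.01048   # scale factor from the formula above
MEL_CORNER_HZ = 700.0     # corner frequency from the formula above

def hz_to_mel(f_hz):
    """Convert a frequency in hertz to mel."""
    return MEL_FACTOR * math.log(1.0 + f_hz / MEL_CORNER_HZ)

def mel_to_hz(mel):
    """Convert a pitch in mel back to hertz."""
    return MEL_CORNER_HZ * (math.exp(mel / MEL_FACTOR) - 1.0)

for f in (100.0, 1000.0, 4000.0):
    print(f"{f:7.1f} Hz -> {hz_to_mel(f):7.1f} mel")
```

With these constants, 1000 Hz maps to approximately 1000 mel, consistent with the reference point of the scale, while higher frequencies are strongly compressed.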
38 Frequency JNDs
Different symbols show different studies (Fig.: Terhardt, 1998)
39 Frequency Difference Limens
Wier et al., 1977
40 Frequency Difference Limens
- At low sound pressure levels (below 10 dB SPL), the JND for pitch increases.
- The hump at 800 Hz was not confirmed in follow-up studies.
- At 1 kHz, the difference limen is about 3 cents (0.2%).
- At high and low frequencies, we are less sensitive to pitch (e.g., 0.5% at 200 Hz and 1% at 8 kHz).
- Melody recognition disappears for frequencies above 4-5 kHz.
41 DL compared to a semitone
42 Pure Tone Frequency Discrimination
from de Cheveigné, 2004
43 Effect of signal duration
Large improvement in F0 discrimination with duration for unresolved harmonics (White and Plack, 1998)
(Figure axes: d' relative to 20 ms vs. duration in ms)
44 SPINC vs. Bark
based on JNDs (Fig.: Terhardt, 1998)
SPINC = Spectral Pitch Increment
45 Theories on Pitch Perception
- Place Theory
  - Pitch is determined by the location of the firing inner hair cell population on the basilar membrane.
- Rate Theory
  - Pitch is determined by the rate code of the inner hair cells (phase locking).
46 Place Theory
Excitation on the basilar membrane
from Hartmann, 1996
47 Place Theory
Excitation on the basilar membrane for two sinusoids of the same frequency f but with a 30 dB level difference
from Hartmann, 1996
48 Pitch shift for level variation
Pitch variation of a sinusoid as a function of SPL (according to Terhardt, 1982)
49 Autocorrelation / Cross-Correlation Models
$R_{YY}(\tau) = \frac{1}{t_1 - t_0} \sum_{t = t_0}^{t_1} Y(t) \, Y(t + \tau)$
Licklider (1951)
50 Rate model (Sinusoid analysis)
(from de Cheveigné, 2004)
51 Rate Pitch Model
$f = 1/\tau$, so $\log(f) = \log(1/\tau)$
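The idea behind an autocorrelation-based rate model can be sketched in a few lines: find the lag at which the signal is most similar to a delayed copy of itself and take the reciprocal of that lag as the pitch. This is a deliberately simplified illustration, not Licklider's full model. The sampling rate, the test signal (harmonics 2 to 4 of 250 Hz, so the fundamental itself is absent), and the search range are assumptions made here; the range is kept narrow because the autocorrelation has equally high peaks at every multiple of the period.

```python
import math

def autocorrelation(y, lag):
    """Mean product of the signal with a copy delayed by `lag` samples."""
    n = len(y) - lag
    return sum(y[t] * y[t + lag] for t in range(n)) / n

def estimate_pitch(y, fs, f_min=150.0, f_max=1000.0):
    """Return f = 1/tau for the lag tau with the largest autocorrelation."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda lag: autocorrelation(y, lag))
    return fs / best_lag

fs = 8000      # sampling rate in Hz (assumption)
f0 = 250.0     # fundamental that is NOT present in the signal
n_samples = 800

# Harmonic complex with harmonics 2, 3, and 4 only (missing fundamental).
signal = [sum(math.sin(2.0 * math.pi * h * f0 * t / fs) for h in (2, 3, 4))
          for t in range(n_samples)]

pitch = estimate_pitch(signal, fs)
print(f"estimated pitch: {pitch:.1f} Hz")
```

Although the signal contains no energy at 250 Hz, the strongest autocorrelation peak within the search range lies at the period of 250 Hz.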
52 1-kHz sine tone
(Figure: FFT)
53 1-kHz harmonic complex with equally strong harmonics
54 250-Hz harmonic complex with equally strong harmonics
55 250-Hz harmonic complex with missing fundamental
56 250-Hz harmonic complex with decreasing harmonic strength
57 Cochlear Implants
- Research on cochlear implant users suggests that our auditory system makes use of both rate and place when determining pitch.
- The analysis of the rate code is not possible for high frequencies.
58 Cochlear Implants
Illustration from "Functional Replacement of the Ear," by Gerald E. Loeb, 1985
59 Cochlear Implants
60 from Plack and Oxenham, 2002
61 Which harmonic determines pitch?
from Chris Darwin
62 Pitch remains the same without fundamental (Licklider, 1956)
from Chris Darwin
63 Pitch perception of complex sounds
(from de Cheveigné, 2004)
64 Figure Explanation
- All spectra (A-E) produce the same magnitude of pitch.
- Solution: For each harmonic, produce subharmonics (F/n) and plot these into a frequency histogram (bottom figure). The perceived pitch typically corresponds to the highest value.
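The subharmonic histogram described above can be sketched as follows. The partial frequencies, the number of subharmonics per partial, and the bin width of 0.1 Hz are assumptions made for this illustration. Because every subharmonic of the winning bin collects the same number of entries, the sketch also assumes that ties are resolved in favor of the highest frequency among the tallest bins.

```python
from collections import Counter

def subharmonic_pitch(partials_hz, max_divisor=10, bins_per_hz=10):
    """Estimate pitch from the histogram of subharmonics F/n of each partial."""
    histogram = Counter()
    for partial in partials_hz:
        for n in range(1, max_divisor + 1):
            # Quantize F/n so that coinciding subharmonics share one bin.
            histogram[round(partial / n * bins_per_hz)] += 1
    tallest = max(histogram.values())
    # Assumption: among equally tall bins, take the highest frequency.
    best_bin = max(b for b, count in histogram.items() if count == tallest)
    return best_bin / bins_per_hz

with_fundamental = subharmonic_pitch([250.0, 500.0, 750.0, 1000.0])
missing_fundamental = subharmonic_pitch([500.0, 750.0, 1000.0])
print(with_fundamental, missing_fundamental)
```

Both spectra yield the same estimate, in line with the observation that the pitch remains the same when the fundamental is removed.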
65 Pitch perception of complex sounds
(from de Cheveigné, 2004)
66 Explanation of Figure
- Landmarks do not work well in the time domain.
- The pitch does not match the period of the envelope if the ratio between carrier frequency and modulation frequency is less than 10.
(from de Cheveigné, 2004)
67 Pitch perception of formant-like sounds
- Can evoke two pitches
- Diagonal line: sinusoids
(from de Cheveigné, 2004)
68 Another Pitch Definition
"The perceptual correlate of the repetition rate of a sound."
instead of
"... that attribute of auditory sensation in terms of which sounds may be ordered on a scale extending from low to high" (ANSI, 1994).
69 F0 discrimination for unresolved harmonics
(Figure axes: F0DL (%) vs. lowest harmonic number; F0 = 200 Hz)
Low-numbered harmonics (below the 10th) dominate pitch perception.
Houtsma and Smurzynski, 1990
70 Absolute Pitch
- "Passive" absolute pitch
  - Persons who are able to identify individual notes that they hear.
  - They can typically identify the key of a composition.
- "Active" absolute pitch
  - Persons with active absolute pitch are able to sing any given note when asked.
  - Usually, people with active absolute pitch are not only able to identify a note, but also to recognize when that note is slightly sharp or flat.
- About 1 in every 10,000 people in the US possesses active absolute pitch (1 in 20 in some other locations).
71 Motoric Absolute Pitch
- Persons who can reproduce an absolute reference tone to determine the pitch of other tones (e.g., professional singers who know their range, or persons who speak a tone language).
72 Timbre
"When we hear notes of the same force and same pitch sounded successively on a piano-forte, a violin, clarinet, oboe, or trumpet, or by the human voice, the character of the musical tone of each of these instruments, notwithstanding the identity of force and pitch, is so different that by means of it we recognise with the greatest of ease which of these instruments was used." (p. 19)
Helmholtz, 1877
73 Timbre
Textbooks customarily believe that loudness, pitch and timbre correlate directly with sound intensity, fundamental frequency and overtone structure... "but these experiments show that a simple one-to-one relationship does not exist." (p. 59)
Fletcher, 1934
74 Timbre
"...harmonics manifest themselves in the specific quality or timbre of the complex tone. ... Timbre is multidimensional. ...we do not have a unidimensional scale for comparing the timbres of various sounds."
Plomp, 1976
75 Timbre
"Timbre is that attribute of auditory sensation in terms of which a listener can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar."
ANSI, 1960
76 Comment: Bregman (1990)
On the ASA definition: "This is, of course, no definition at all. For example, it implies that there are some sounds for which we cannot decide whether they possess the quality of timbre or not. In order for the definition to apply, two sounds need to be able to be presented at the same pitch, but there are some sounds, such as the scraping of a shovel in a pile of gravel, that have no pitch at all. We obviously have a problem. Either we must assert that only sounds with pitch can have timbre, meaning that we cannot discuss the timbre of a tambourine or of the musical sounds of many African cultures, or there is something terribly wrong with the definition." (p. 92)
77 Elusive attributes of timbre
- The range between tonal and noiselike character.
- The spectral envelope.
- The time envelope in terms of rise, duration, and decay.
- The changes both of spectral envelope (formant-glide) and fundamental frequency (micro-intonation).
- The prefix, an onset of a sound quite dissimilar to the ensuing lasting vibration.
Schouten, 1968
78 Grey (1977)
Multidimensional scaling of orchestral instruments
79 Grey (1977)
- Stimuli: 16 different orchestral instruments
- Listeners had to judge similarity
- Multidimensional scale analysis
- Three scale axes:
  - Spectral energy distribution
  - Synchronicity of attack transients for different harmonics
  - Presence of low-amplitude, high-frequency energy in the initial attack segment
80 Harmonic Spectra
(Figure panels: flue pipe (open); reed pipe with long cylindrical resonator; flue pipe (closed); reed pipe with short cylindrical resonator)
81 Harmonic Spectra
- Relationship of low harmonics
- Spectral centroid
- Dominance of fundamental sound
- Existence of even partial tones
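One of the descriptors listed above, the spectral centroid, is commonly computed as the amplitude-weighted mean frequency of the partials. The sketch below uses that common definition, which may differ in detail from the measure used in the lecture figures; the two example spectra are hypothetical and are not measurements of organ pipes.

```python
def spectral_centroid(frequencies_hz, amplitudes):
    """Amplitude-weighted mean frequency of a set of partials."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * a for f, a in zip(frequencies_hz, amplitudes)) / total

f0 = 250.0
harmonics = [f0 * n for n in range(1, 5)]   # partials 1 to 4

# Hypothetical spectrum with four equally strong harmonics.
flat = spectral_centroid(harmonics, [1.0, 1.0, 1.0, 1.0])

# Hypothetical spectrum with a dominant fundamental and no even partials.
odd_only = spectral_centroid(harmonics, [1.0, 0.0, 0.5, 0.0])

print(f"flat: {flat:.1f} Hz, odd-only: {odd_only:.1f} Hz")
```

A spectrum dominated by the fundamental has a lower centroid than one with strong upper harmonics.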
82 Comparison
HM = harmonic centroid
(Figure panels: reed pipe stops; flue pipe stops; axis label: HM)
83 Comparison
(Figure panels: reed pipe; flue pipe)
Important: synchronicity of attack transients among partial tones, initial frequency changes
84 Literature
- William M. Hartmann (1996). Pitch, periodicity, and auditory organization. J. Acoust. Soc. Am. 100, 3491-3503.
- Alain de Cheveigné (2004). Pitch perception models. In Pitch (Plack, Oxenham, eds.), Springer, New York.