Title: Modelling the perception and cognition of musical structure
1. Modelling the perception and cognition of musical structure
- David Meredith
- <dave_at_titanmusic.com>
- Centre for Cognition, Computation and Culture
- Goldsmiths College, University of London
2. Algorithmic models of music cognition
3. Longuet-Higgins model
- Longuet-Higgins, H. C. (1976). The perception of melodies. Nature, 263(5579), 646-653.
- Longuet-Higgins, H. C. (1987). The perception of melodies. In H. C. Longuet-Higgins (ed.), Mental Processes: Studies in Cognitive Science, pp. 105-129. British Psychological Society/MIT Press, London/Cambridge, MA.
- OUTPUT
- 24 C STC -5 G STC 0 G STC 1 AB -1
G TEN REST 4 B STC 1 C TEN
4. Longuet-Higgins model of rhythm
- Assumes the listener initially expects a pure binary metre
- But willing to change mind at any metrical level
- Evidence for change in metre
  - Current metre implies syncopation
  - No note onset at beginning of next higher metrical unit
  - Current metre implies excessively large change in tempo
- Metre changed if
  - there is evidence for change and
  - the other division does not imply syncopation or excessive tempo change
5. Longuet-Higgins model of tonality
- For each note, estimates value of sharpness (position of pitch name on line of fifths)
- Theory of tonality consists of six rules
- First ensures each note is spelt so that its name is as close as possible to the local tonic on the line of fifths
- Other rules control how the algorithm deals with chromatic intervals and modulations
- e.g., the second rule states that if the current key implies two consecutive chromatic intervals, then the key should be changed so that both become diatonic
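The first rule can be illustrated with a minimal sketch. It assumes the common convention that C sits at position 0 on the line of fifths, so that each sharp adds 7 and each flat subtracts 7. The function names (`sharpness`, `pitch_class`, `spell`) and the restriction to single accidentals are illustrative assumptions, not Longuet-Higgins's own code, and the other five rules are not modelled.

```python
# Line-of-fifths positions of the natural letter names (C = 0 is an assumed origin).
NATURAL_POSITION = {"F": -1, "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5}
# Pitch class (semitones above C) of each natural letter name.
NATURAL_PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def accidental_shift(name: str) -> int:
    """Number of sharps minus number of flats in a name such as 'F#' or 'Bb'."""
    accidentals = name[1:]
    return accidentals.count("#") - accidentals.count("b")


def sharpness(name: str) -> int:
    """Position of a pitch name on the line of fifths."""
    return NATURAL_POSITION[name[0]] + 7 * accidental_shift(name)


def pitch_class(name: str) -> int:
    """Pitch class (0-11) of a pitch name."""
    return (NATURAL_PITCH_CLASS[name[0]] + accidental_shift(name)) % 12


def spell(pc: int, tonic: str) -> str:
    """Spell pitch class pc with the name closest to the tonic on the line of fifths.

    Only single accidentals are considered, and ties are broken by list order;
    both choices are simplifying assumptions of this sketch.
    """
    candidates = [letter + acc for letter in "FCGDAEB" for acc in ("b", "", "#")]
    matching = [name for name in candidates if pitch_class(name) == pc]
    return min(matching, key=lambda name: abs(sharpness(name) - sharpness(tonic)))
```

For example, pitch class 8 is spelt `Ab` when the local tonic is C but `G#` when the local tonic is E, because those are the names nearest the tonic on the line of fifths.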
6. Longuet-Higgins model: output
- Section of cor anglais solo from Act III of Wagner's Tristan und Isolde
- Change from binary to ternary in first beat of fifth bar (triplets)
- Grace note correctly identified in seventh bar
- Agrees fully with original score in tonal and rhythmic indications
- Wagner marked all triplets as staccato: fault with performance, not program!
- 98.21% of notes spelt correctly (3508 errors) in a 195972-note corpus of classical and baroque music
- Cf. 99.44% spelt correctly (1100 errors) by Meredith's PS13s1 algorithm
- Meredith, D. (2006). The ps13 pitch spelling algorithm. Journal of New Music Research, 35(2), pp. 121-159.
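The two percentages follow directly from the error counts and the corpus size quoted above. A small check, with `percent_correct` as a hypothetical helper name:

```python
CORPUS_SIZE = 195972  # number of notes in the test corpus, as quoted above


def percent_correct(errors: int, total: int = CORPUS_SIZE) -> float:
    """Percentage of notes spelt correctly, given the number of spelling errors."""
    return 100.0 * (total - errors) / total


longuet_higgins = round(percent_correct(3508), 2)  # 3508 errors
ps13s1 = round(percent_correct(1100), 2)           # 1100 errors
```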
7. Lerdahl and Jackendoff's Generative Theory of Tonal Music (GTTM)
- Lerdahl, F. and Jackendoff, R. (1983). A Generative Theory of Tonal Music. MIT Press, Cambridge, MA.
- WELL-FORMEDNESS RULES define CLASS of POSSIBLE structural descriptions
- PREFERENCE RULES used to find BEST structural descriptions
8. Lerdahl and Jackendoff's theory of grouping structure
- Listener automatically segments music into structural units of various sizes called groups
- Grouping structure of a passage is the way that it is perceived to be segmented into groups
- "Grouping can be viewed as the most basic component of musical understanding" (Lerdahl and Jackendoff, 1983, p. 13)
9. Lerdahl and Jackendoff's grouping well-formedness rules
- GWFR 1: Any contiguous sequence of pitch-events, drum beats, or the like can constitute a group, and only contiguous sequences can constitute a group.
- GWFR 2: A piece constitutes a group.
- GWFR 3: A group may contain smaller groups.
- GWFR 4: If a group G1 contains part of a group G2, then it must contain all of G2.
- GWFR 5: If a group G1 contains a smaller group G2 then G1 must be exhaustively partitioned into smaller groups.
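The five rules can be read as a checker for candidate grouping structures. The sketch below assumes a representation that is not in the original: each group is a half-open span `(start, end)` of event indices, which makes contiguity automatic. The function name `is_well_formed` is hypothetical.

```python
from itertools import combinations

Group = tuple[int, int]  # half-open span [start, end) of event indices (assumed encoding)


def is_well_formed(groups: set[Group], n_events: int) -> bool:
    """Check a set of groups against GWFRs 1-5 for a piece of n_events events."""
    # GWFR 1: every group is a non-empty contiguous span inside the piece.
    if any(not (0 <= s < e <= n_events) for s, e in groups):
        return False
    # GWFR 2: the piece as a whole constitutes a group.
    if (0, n_events) not in groups:
        return False
    # GWFR 3 only permits nesting, so there is nothing to check.
    # GWFR 4: two groups are either disjoint or one contains the other.
    for (s1, e1), (s2, e2) in combinations(groups, 2):
        overlap = max(s1, s2) < min(e1, e2)
        nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
        if overlap and not nested:
            return False
    # GWFR 5: a group that contains smaller groups must be tiled exactly by them.
    for s, e in groups:
        inside = [g for g in groups if g != (s, e) and s <= g[0] and g[1] <= e]
        if not inside:
            continue
        # Keep only the largest subgroups; after GWFR 4 these are pairwise disjoint.
        children = [g for g in inside
                    if not any(h != g and h[0] <= g[0] and g[1] <= h[1] for h in inside)]
        if sum(ce - cs for cs, ce in children) != e - s:
            return False
    return True
```

For an eight-event piece, `{(0, 8), (0, 4), (4, 8)}` passes, whereas `{(0, 8), (0, 5), (3, 8)}` fails GWFR 4 (partial overlap) and `{(0, 8), (0, 4)}` fails GWFR 5 (events 4 to 7 are left outside any subgroup).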
10. The Gestalt principles of proximity and similarity in vision and in music
11. Lerdahl and Jackendoff's second grouping preference rule
- GPR 2 (Proximity): Consider a sequence of four notes n1, n2, n3, n4. All else being equal, the transition n2-n3 may be heard as a group boundary if
  - a. (Slur/Rest) the interval of time from the end of n2 to the beginning of n3 is greater than that from the end of n1 to the beginning of n2 and that from the end of n3 to the beginning of n4, or if
  - b. (Attack-Point) the interval of time between the attack points of n2 and n3 is greater than that between the attack points of n1 and n2 and that between the attack points of n3 and n4.
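GPR 2 translates almost directly into comparisons over a sliding window of four notes. The following is a minimal sketch, assuming each note is described only by an onset and an offset time; the `Note` class and the function names are illustrative, and the "all else being equal" clause is ignored.

```python
from dataclasses import dataclass


@dataclass
class Note:
    onset: float   # attack point, in beats or seconds
    offset: float  # time at which the note ends


def gpr2a(n1: Note, n2: Note, n3: Note, n4: Note) -> bool:
    """Slur/Rest: the silence between n2 and n3 exceeds both neighbouring silences."""
    gap = lambda a, b: b.onset - a.offset
    return gap(n2, n3) > gap(n1, n2) and gap(n2, n3) > gap(n3, n4)


def gpr2b(n1: Note, n2: Note, n3: Note, n4: Note) -> bool:
    """Attack-Point: the n2-n3 inter-onset interval exceeds both neighbouring ones."""
    ioi = lambda a, b: b.onset - a.onset
    return ioi(n2, n3) > ioi(n1, n2) and ioi(n2, n3) > ioi(n3, n4)


def gpr2_boundaries(notes: list[Note]) -> list[int]:
    """Indices i such that a boundary may be heard between notes[i] and notes[i + 1]."""
    found = []
    for i in range(1, len(notes) - 2):
        window = notes[i - 1:i + 3]  # n1, n2, n3, n4 with n2 = notes[i]
        if gpr2a(*window) or gpr2b(*window):
            found.append(i)
    return found
```

With six one-beat notes starting at times 0, 1, 2, 5, 6 and 7, the only candidate boundary is after the third note (index 2), where both the rest and the inter-onset interval are larger than their neighbours.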
12. Lerdahl and Jackendoff's third preference rule
- GPR 3 (Change): Consider a sequence of four notes n1, n2, n3, n4. All else being equal, the transition n2-n3 may be heard as a group boundary if
  - a. (Register) the transition n2-n3 involves a greater intervallic distance than both n1-n2 and n3-n4, or if
  - b. (Dynamics) the transition n2-n3 involves a change in dynamics and n1-n2 and n3-n4 do not, or if
  - c. (Articulation) the transition n2-n3 involves a change in articulation and n1-n2 and n3-n4 do not, or if
  - d. (Length) n2 and n3 are of different lengths and both pairs n1, n2 and n3, n4 do not differ in length.
- (One might add further cases to deal with such things as change in timbre or instrumentation.)
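The four cases of GPR 3 can be sketched in the same way. The `Note` fields below (MIDI pitch, duration, and string labels for dynamics and articulation) are assumptions made for illustration; the rule itself does not prescribe a representation.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int                    # MIDI note number (assumed encoding)
    duration: float               # notated length, e.g. in beats
    dynamic: str = "mf"           # dynamic marking in force for this note
    articulation: str = "legato"  # articulation marking for this note


def gpr3_cases(n1: Note, n2: Note, n3: Note, n4: Note) -> set[str]:
    """Return the GPR 3 cases that mark n2-n3 as a possible group boundary."""
    cases = set()
    interval = lambda a, b: abs(b.pitch - a.pitch)
    # a. Register: n2-n3 is a larger interval than both n1-n2 and n3-n4.
    if interval(n2, n3) > interval(n1, n2) and interval(n2, n3) > interval(n3, n4):
        cases.add("register")
    # b. Dynamics: a change at n2-n3 but none at n1-n2 or n3-n4.
    if n2.dynamic != n3.dynamic and n1.dynamic == n2.dynamic and n3.dynamic == n4.dynamic:
        cases.add("dynamics")
    # c. Articulation: a change at n2-n3 but none at n1-n2 or n3-n4.
    if (n2.articulation != n3.articulation and n1.articulation == n2.articulation
            and n3.articulation == n4.articulation):
        cases.add("articulation")
    # d. Length: n2 and n3 differ in length while n1, n2 and n3, n4 do not.
    if n2.duration != n3.duration and n1.duration == n2.duration and n3.duration == n4.duration:
        cases.add("length")
    return cases
```

Returning the set of matching cases, rather than a single boolean, keeps open the possibility of weighting the cases differently when several preference rules are combined.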
13. Temperley and Sleator's Melisma music analyser
- Temperley, D. (2001). The Cognition of Basic Musical Structures. MIT Press, Cambridge, MA.
- Meredith, D. (2002). Review of David Temperley's The Cognition of Basic Musical Structures (Cambridge, MA: MIT Press, 2001). Musicae Scientiae, 6(2), pp. 287-302.
14. Temperley's theory of contrapuntal structure: input representation
15. Temperley's contrapuntal well-formedness rules (CWFRs)
- CWFR 1: A stream must consist of a set of temporally contiguous squares on the plane.
- CWFR 2: A stream may be only one square wide in the pitch dimension.
- CWFR 3: Streams may not cross in pitch.
- CWFR 4: Each note must be entirely included in a single stream.
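A rough sketch of how these four rules might be checked. It assumes a piano-roll plane in which time is cut into numbered slices, a stream is a mapping from slice index to the pitch of its square in that slice, and a note is a `(pitch, first_slice, last_slice)` triple. This encoding, the function name, and the reading of "cross" as a change in the above/below ordering of two streams are all assumptions of the sketch rather than details taken from Temperley.

```python
Stream = dict[int, int]      # time-slice index -> pitch of the stream's square there
Note = tuple[int, int, int]  # (pitch, first slice, last slice), both slices inclusive


def streams_well_formed(streams: list[Stream], notes: list[Note]) -> bool:
    """Check a set of streams against CWFRs 1-4 under the assumed encoding."""
    # CWFR 1: each stream occupies an unbroken run of time slices.
    for stream in streams:
        times = sorted(stream)
        if not times or times != list(range(times[0], times[-1] + 1)):
            return False
    # CWFR 2 holds by construction: a dict stores one pitch per time slice.
    # CWFR 3: over their shared slices, one stream never moves from above to below another.
    for i, a in enumerate(streams):
        for b in streams[i + 1:]:
            diffs = [a[t] - b[t] for t in a.keys() & b.keys()]
            if any(d > 0 for d in diffs) and any(d < 0 for d in diffs):
                return False
    # CWFR 4: every square of a note lies within one and the same stream.
    for pitch, first, last in notes:
        if not any(all(s.get(t) == pitch for t in range(first, last + 1)) for s in streams):
            return False
    return True
```

Two streams that share a square are accepted here; avoiding such collisions is a matter for the preference rules, not for well-formedness.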
16. Temperley's contrapuntal preference rules (CPRs)
- CPR 1 (Pitch Proximity Rule): Prefer to avoid large leaps within streams.
- CPR 2 (New Stream Rule): Prefer to minimize the number of streams.
- CPR 3 (White Square Rule): Prefer to minimize the number of white squares in streams.
- CPR 4 (Collision Rule): Prefer to avoid cases where a single square is included in more than one stream.
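One way to make these preferences comparable is to turn each rule into a penalty and add them up, so that the preferred analysis is the one with the lowest total. The sketch below does this under assumptions that are not in the original: streams map time-slice indices to pitches, notes are `(pitch, first_slice, last_slice)` triples, leaps are penalised in proportion to their size in semitones, and the four weights are arbitrary illustrative values.

```python
from collections import Counter

Stream = dict[int, int]      # time-slice index -> pitch of the stream's square there
Note = tuple[int, int, int]  # (pitch, first slice, last slice), both slices inclusive


def cpr_penalty(streams: list[Stream], notes: list[Note],
                w_leap: float = 1.0, w_stream: float = 10.0,
                w_white: float = 5.0, w_collision: float = 10.0) -> float:
    """Total penalty of a stream analysis; lower is better. Weights are hypothetical."""
    # Black squares are the (slice, pitch) cells covered by a note.
    black = {(t, pitch) for pitch, first, last in notes for t in range(first, last + 1)}
    # CPR 1: sum of pitch leaps between consecutive slices of each stream.
    leaps = sum(abs(s[t + 1] - s[t]) for s in streams for t in s if t + 1 in s)
    # CPR 2: one penalty per stream.
    n_streams = len(streams)
    # CPR 3: squares that a stream passes through without a note sounding there.
    white = sum((t, p) not in black for s in streams for t, p in s.items())
    # CPR 4: squares claimed by more than one stream.
    usage = Counter((t, p) for s in streams for t, p in s.items())
    collisions = sum(count - 1 for count in usage.values() if count > 1)
    return w_leap * leaps + w_stream * n_streams + w_white * white + w_collision * collisions
```

For four single-slice notes alternating between a low and a high register (pitches 60, 72, 62, 74), a single zig-zag stream scores 44.0 with these weights, while splitting the passage into a low and a high stream scores 34.0, so the two-stream analysis is preferred.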
17. Using Temperley's theory to model listening, composition, performance and style
- Temperley and Sleator's programs scan the music from left to right, keeping note of the analyses that best satisfy the preference rules so far at each point.
- Ambiguity: Two or more best analyses at a given point in the music.
- Revision: The best analysis at some point in the music does not form part of the best analysis at some later point.
- Expectation: The most expected events are those that will lead to an analysis that best satisfies the preference rules.
- Style: A passage is in the style defined by a set of preference rules if the analysis that best satisfies the preference rules achieves a score that is not too high (boring) and not too low (incomprehensible).
- Composition: Choices guided by goal to produce piece that satisfies preference rules to just the right extent.
- Performance: Temporal and dynamic expression geared towards conveying structure in accordance with analysis that best satisfies the preference rules.
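The left-to-right scan and the notion of revision can be illustrated with a generic dynamic-programming sketch. This is not the Melisma code: the abstract states, the `local_score` and `transition_score` callbacks, and the function names are all assumptions chosen to show how the best analysis so far can be tracked at each point and how a later event can overturn it.

```python
def scan(events, states, local_score, transition_score):
    """Return the best-scoring state sequence for each successive prefix of events."""
    # best[s] = (score, path) of the best analysis so far that ends in state s.
    best = {s: (local_score(s, events[0]), [s]) for s in states}
    best_so_far = [max(best.values(), key=lambda entry: entry[0])[1]]
    for event in events[1:]:
        new_best = {}
        for s in states:
            # Choose the predecessor state that gives the highest score on entering s.
            prev = max(states, key=lambda p: best[p][0] + transition_score(p, s))
            score = best[prev][0] + transition_score(prev, s) + local_score(s, event)
            new_best[s] = (score, best[prev][1] + [s])
        best = new_best
        best_so_far.append(max(best.values(), key=lambda entry: entry[0])[1])
    return best_so_far


def revisions(best_so_far):
    """Points at which the preferred analysis no longer extends the previous one."""
    return [i for i in range(1, len(best_so_far))
            if best_so_far[i][:-1] != best_so_far[i - 1]]


# Toy example: state "A" is rewarded by positive events, "B" by negative ones,
# and switching state costs 5.
def local(state, event):
    return event if state == "A" else -event


def switch_cost(previous, current):
    return 0 if previous == current else -5


analyses = scan([1, -2, -2], ["A", "B"], local, switch_cost)
```

In the toy example the first event favours `["A"]`, but after the second event the best analysis is `["B", "B"]`: the earlier choice is revised rather than extended, so `revisions(analyses)` reports index 1.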
18. Summing up
- We can attempt to model the perception and cognition of musical structure by constructing algorithms that take representations of musical passages as input and generate structural descriptions of those passages as output
- We can evaluate such algorithms by comparing their output with expert human analyses and authoritative scores
- We can express a theory of musical structure as a preference rule system consisting of
  - Well-formedness rules that define the class of legal structural descriptions
  - Preference rules: the legal structural descriptions that best satisfy the preference rules are predicted to be the ones that listeners are most likely to hear