Title: Discovering Non-Trivial Repeating Patterns in Music Data
1. Discovering Non-Trivial Repeating Patterns in Music Data
- Jia-Lien Hsu, Chih-Chin Liu, and Arbee L.P. Chen, Member, IEEE
2. Organization
- Introduction
- Repeating patterns
  - Correlative matrix
  - String-join
  - Suffix tree
- Discussion
3. Music data retrieval
- Initial stage
  - Extract information from raw data
    - Loudness, pitch, brightness
  - Classify
    - Speech, music, silence, ...
- Middle stage
  - Music data is transformed into a string
    - e.g., a string over the symbols U, D, S
  - String matching
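As a rough illustration of the string transformation step, the sketch below encodes a pitch sequence as a contour string and then searches it with plain substring matching. It assumes that U, D and S stand for up, down and same relative to the previous note, and that pitches are given as MIDI note numbers. Neither assumption is stated on the slide, and `contour` is a hypothetical helper.

```python
def contour(pitches):
    """Encode each step between consecutive pitches as U (up), D (down) or S (same)."""
    symbols = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            symbols.append("U")
        elif cur < prev:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


# Assumed input: MIDI note numbers (60 = middle C).
melody = contour([60, 62, 64, 64, 62, 60])
query = contour([62, 64, 64, 62])

print(melody)           # contour string of the melody
print(query in melody)  # plain substring search as the matching step
```

Substring search over the whole string is what becomes expensive for long music objects, which is the motivation for indexing repeating patterns instead.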
4. Recent research
- Recent work
  - String matching time versus the length of the music objects to be matched
  - If the music objects are large, the execution time for query processing may become unacceptable.
- Repeating patterns
  - A sequence of notes which appears more than once in a music object
  - Repetition is a universal characteristic in music structure modeling
5. Repeating patterns
- Melody string: C-D-E-F-C-D-E-C-D-E-F
- Repeating patterns (RP) and their repeating frequencies (RPF):

RP        RPF
C-D-E-F   2
C-D-E     3
D-E-F     2
C-D       3
D-E       3
E-F       2
C         3
D         3
E         3
F         2
6. Non-trivial repeating patterns
- A repeating pattern X is non-trivial if and only if there does not exist another repeating pattern Y such that freq(X) = freq(Y) and X is a substring of Y.
- freq(C-D-E-F) = freq(D-E-F) = freq(E-F) = freq(F) = 2
- freq(C-D-E) = freq(C-D) = freq(D-E) = freq(C) = freq(D) = freq(E) = 3
- C-D-E-F and C-D-E are non-trivial.
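The definition can be applied mechanically. The following is a minimal sketch that filters a table of repeating patterns down to the non-trivial ones; the frequencies are the ones listed for the melody string C-D-E-F-C-D-E-C-D-E-F, and the helper names `is_substring` and `non_trivial` are illustrative, not taken from the paper.

```python
def is_substring(x, y):
    """True if note tuple x occurs contiguously inside note tuple y."""
    return any(y[i:i + len(x)] == x for i in range(len(y) - len(x) + 1))


def non_trivial(freq):
    """Keep X unless another pattern Y has the same frequency and contains X."""
    return {
        x: fx
        for x, fx in freq.items()
        if not any(
            y != x and fy == fx and is_substring(x, y) for y, fy in freq.items()
        )
    }


# Repeating patterns and frequencies of the example melody string.
table = {
    "C-D-E-F": 2, "C-D-E": 3, "D-E-F": 2, "C-D": 3, "D-E": 3,
    "E-F": 2, "C": 3, "D": 3, "E": 3, "F": 2,
}
freq = {tuple(p.split("-")): f for p, f in table.items()}

result = non_trivial(freq)
print(sorted("-".join(p) for p in result))  # ['C-D-E', 'C-D-E-F']
```

Patterns are stored as tuples of note names so that a multi-character note such as Ab5 is compared as a single symbol.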
7. Finding all repeating patterns
- Straightforward approach: generate all substrings of S, then compare each substring P of S with S to count the number of times P appears in S.
- Correlative matrix
- String-join
- Suffix tree
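The straightforward approach can be sketched in a few lines. This version counts every substring by its start position (so overlapping occurrences are counted) and keeps those that appear more than once; `repeating_patterns` is an illustrative name, and the quadratic number of substrings is exactly why the three approaches listed above are of interest.

```python
from collections import Counter


def repeating_patterns(notes):
    """Count every contiguous substring of notes and keep those seen more than once."""
    n = len(notes)
    counts = Counter(
        tuple(notes[i:j]) for i in range(n) for j in range(i + 1, n + 1)
    )
    return {pattern: f for pattern, f in counts.items() if f > 1}


melody = "C-D-E-F-C-D-E-C-D-E-F".split("-")
patterns = repeating_patterns(melody)
for pattern, f in sorted(patterns.items(), key=lambda item: -len(item[0])):
    print("-".join(pattern), f)
```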
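8. Correlative matrix
Correlative matrix T for the string C6 Ab5 Ab5 C6 C6 Ab5 Ab5 C6 (upper triangle only; "." marks a 0 or unused cell):

      C6  Ab5 Ab5 C6  C6  Ab5 Ab5 C6
C6    .   .   .   1   1   .   .   1
Ab5   .   .   1   .   .   2   1   .
Ab5   .   .   .   .   .   1   3   .
C6    .   .   .   .   1   .   .   4
C6    .   .   .   .   .   .   .   1
Ab5   .   .   .   .   .   .   1   .
Ab5   .   .   .   .   .   .   .   .
C6    .   .   .   .   .   .   .   .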
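A minimal sketch of how such a matrix can be filled in, assuming the usual rule that a cell above the diagonal extends the match on its upper-left neighbor by one when the two notes are equal, and is 0 otherwise. Indices are 0-based in the code, and `correlative_matrix` is an illustrative name.

```python
def correlative_matrix(notes):
    """Upper-triangular matrix where T[i][j] is the length of the match ending at i and j."""
    n = len(notes)
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if notes[i] == notes[j]:
                upper_left = T[i - 1][j - 1] if i > 0 else 0
                T[i][j] = upper_left + 1
    return T


notes = ["C6", "Ab5", "Ab5", "C6", "C6", "Ab5", "Ab5", "C6"]
T = correlative_matrix(notes)
for label, row in zip(notes, T):
    print(f"{label:4}", " ".join(str(v) if v else "." for v in row))
```

The value 4 in the fourth row means that the four notes ending at position 8 match the four notes ending at position 4, that is, C6-Ab5-Ab5-C6 occurs twice.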
9. Candidate set
- Used to find all repeating patterns and their repeating frequencies
- The candidate set is denoted CS
- Each element of CS has the form (pattern, rep_count, sub_count)
  - pattern: a repeating pattern
  - rep_count: the number of matches counted for the repeating pattern
  - sub_count: the number of times the repeating pattern is a proper substring of other repeating patterns
10. Case 1
- Condition: $T_{i,j} = 1$ and $T_{i+1,j+1} = 0$
- CS: (C6, 1, 0)
[Figure: partial correlative matrix for C6 Ab5 Ab5 C6 C6; the cells filled in so far are all 1]
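11. Case 2
- Condition: $T_{i,j} = 1$ and $T_{i+1,j+1} \neq 0$
- CS: (C6, 1, 0) is updated to (C6, 2, 1)
[Figure: partial correlative matrix for C6 Ab5 Ab5 C6 C6 Ab5; the cells filled in so far are 1, 1, 1 and 2]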
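12. Case 3
- Condition: $T_{i,j} > 1$ and $T_{i+1,j+1} \neq 0$
- CS: (C6, 2, 1)
- CS entries shown: (C6-Ab5, 1, 0) and (Ab5, 1, 0), followed by (C6-Ab5, 1, 1) and (Ab5, 1, 1)
[Figure: partial correlative matrix for C6 Ab5 Ab5 C6 C6 Ab5 Ab5, with the values 2 and 3 highlighted]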
13. Case 4
- Condition: $T_{i,j} > 1$ and $T_{i+1,j+1} = 0$
- CS entries shown:
  - (C6, 6, 1), followed by (C6, 7, 2)
  - (C6-Ab5-Ab5-C6, 1, 0)
  - (Ab5-Ab5-C6, 1, 1), (Ab5-C6, 1, 1)
[Figure: partial correlative matrix for C6 Ab5 Ab5 C6 C6 Ab5 Ab5 C6 Db5, with the value 4 highlighted]
14. Calculating the repeating frequency
- For a repeating pattern whose repeating frequency is $f$, there will be $C^f_2 = f(f-1)/2$ matchings associated with this repeating pattern when constructing the correlative matrix.
- The repeating frequency $f$ is therefore
  $f = \left(1 + \sqrt{1 + 8 \cdot \mathit{rep\_count}}\right) / 2$
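The step from the number of matchings to the frequency is a quadratic equation in $f$. Written out, assuming rep_count equals the number of matching pairs and taking the positive root:

```latex
\begin{align*}
\mathit{rep\_count} &= \binom{f}{2} = \frac{f(f-1)}{2} \\
f^2 - f - 2\,\mathit{rep\_count} &= 0 \\
f &= \frac{1 + \sqrt{1 + 8\,\mathit{rep\_count}}}{2}
\end{align*}
```

For example, rep_count = 6 gives $f = (1 + \sqrt{49})/2 = 4$, and rep_count = 1 gives $f = (1 + \sqrt{9})/2 = 2$.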
15. String-join
- Melody string: C-D-E-F-C-D-E-C-D-E-F
- Find all repeating patterns of length one
- Each is written in the form {X, freq(X), (position1, position2, ...)}
- {C, 3, (1, 5, 8)}, {D, 3, (2, 6, 9)}, {E, 3, (3, 7, 10)}, and {F, 2, (4, 11)}
16. Length-two repeating patterns
- C-D-E-F-C-D-E-C-D-E-F
- A repeating pattern of length two can be found by joining (denoted ⋈) two repeating patterns of length one
- {C, 3, (1, 5, 8)} ⋈ {D, 3, (2, 6, 9)} = {C-D, 3, (1, 5, 8)}
- {D, 3, (2, 6, 9)} ⋈ {E, 3, (3, 7, 10)} = {D-E, 3, (2, 6, 9)}
- {E, 3, (3, 7, 10)} ⋈ {F, 2, (4, 11)} = {E-F, 2, (3, 10)}
17. Length-four repeating patterns
- C-D-E-F-C-D-E-C-D-E-F
- {C-D, 3, (1, 5, 8)} ⋈ {E-F, 2, (3, 10)} = {C-D-E-F, 2, (1, 8)}
- freq(C-D-E-F) = freq(E-F) = 2
- E-F and D-E-F are trivial
- Check C-D-E: join C-D and D-E
- {C-D-E, 3, (1, 5, 8)}
- Therefore, the non-trivial repeating patterns are C-D-E-F and C-D-E
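The joins in this example can be reproduced with a small position-based sketch. Each entry is a triple (pattern, frequency, 1-based start positions). The `overlap` parameter is an assumption introduced here so that C-D and D-E, which share the note D, can be joined into C-D-E; the slides do not spell out how that join is parameterized.

```python
def join(x, y, overlap=0):
    """Join two entries; the last `overlap` notes of x must equal the first of y."""
    px, _, pos_x = x
    py, _, pos_y = y
    if px[len(px) - overlap:] != py[:overlap]:
        raise ValueError("patterns do not overlap as requested")
    shift = len(px) - overlap          # where y must start relative to x
    starts = set(pos_y)
    positions = tuple(p for p in pos_x if p + shift in starts)
    return px + py[overlap:], len(positions), positions


C = (("C",), 3, (1, 5, 8))
D = (("D",), 3, (2, 6, 9))
E = (("E",), 3, (3, 7, 10))
F = (("F",), 2, (4, 11))

c_d = join(C, D)
d_e = join(D, E)
e_f = join(E, F)
c_d_e_f = join(c_d, e_f)             # length two joined with length two
c_d_e = join(c_d, d_e, overlap=1)    # the check for C-D-E

for entry in (c_d, d_e, e_f, c_d_e_f, c_d_e):
    print("-".join(entry[0]), entry[1], entry[2])
```

Only position lists are intersected, so the melody string itself is never rescanned once the length-one patterns are known.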
18. Suffix tree (music feature string S = abbabb)
[Figure: suffix tree of S = abbabb, with edges labeled a and b and nodes labeled with numbers]
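To illustrate the idea behind the suffix tree approach, the sketch below swaps in a simpler structure: an uncompressed suffix trie built naively in quadratic time, not a true suffix tree. Every suffix is inserted, each node counts how many suffixes pass through it, and that count is the frequency of the substring spelled by the path from the root. The function names are illustrative.

```python
def suffix_trie(s):
    """Insert every suffix of s; each node counts the suffixes passing through it."""
    root = {"count": 0, "children": {}}
    for start in range(len(s)):
        node = root
        for ch in s[start:]:
            node = node["children"].setdefault(ch, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def repeated_substrings(s):
    """Collect every substring whose trie node was reached by more than one suffix."""
    result = {}

    def walk(node, prefix):
        for ch, child in node["children"].items():
            if child["count"] > 1:
                result[prefix + ch] = child["count"]
                walk(child, prefix + ch)   # descendants of a count-1 node cannot repeat

    walk(suffix_trie(s), "")
    return result


print(repeated_substrings("abbabb"))
```

A real suffix tree compresses chains of single-child nodes and can be built in linear time, but the counting principle is the same.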
19. Performance
- Factors which dominate the performance
- the object size of music objects
- the note count of music objects
- the length of the longest repeating patterns
- the number of non-trivial repeating patterns
20. Discussion
- The suffix tree approach is the worst since it enumerates all suffixes.
- The correlative-matrix approach is inefficient since most substrings of the refrain are trivial.
- Example: the song Five Hundred Miles
  - Its longest repeating pattern is its refrain
  - C-C-C-E-E-D-C-E-E-D-C-D-E-D-C-A-A-C-D-E-D-C-A-G-G-A-C-C
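21. aaaa (correlative matrix)
Correlative matrix ("." marks a 0 or unused cell):

    a  a  a  a
a   .  1  1  1
a   .  .  2  2
a   .  .  .  3
a   .  .  .  .

CS entries shown, in order of increasing rep_count:
- a: (a, 1, 1), (a, 2, 2), (a, 3, 2), (a, 4, 3), (a, 5, 3), (a, 6, 3)
- aa: (aa, 1, 0), (aa, 2, 0), (aa, 3, 1)
- aaa: (aaa, 1, 0)

$f = \left(1 + \sqrt{1 + 8 \cdot \mathit{rep\_count}}\right) / 2$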
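The final rep_count values for aaaa can be cross-checked with a simplified sketch. It does not implement the four-case update or track sub_count; it assumes instead that a cell with value k contributes one match to each of the k patterns ending at that cell, which yields the same final rep_count values here. The helper names are illustrative.

```python
import math
from collections import Counter


def correlative_matrix(s):
    """T[i][j] (i < j) is the length of the match ending at positions i and j."""
    n = len(s)
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if s[i] == s[j]:
                T[i][j] = (T[i - 1][j - 1] if i > 0 else 0) + 1
    return T


def rep_counts(s):
    """Count, for each pattern, the matrix cells whose match covers that pattern."""
    T = correlative_matrix(s)
    counts = Counter()
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            for length in range(1, T[i][j] + 1):
                counts[s[j - length + 1:j + 1]] += 1
    return counts


def frequency(rep_count):
    """Repeating frequency f recovered from rep_count = f(f-1)/2."""
    return (1 + math.isqrt(1 + 8 * rep_count)) // 2


counts = rep_counts("aaaa")
for pattern, rc in sorted(counts.items()):
    print(pattern, rc, frequency(rc))
```

For aaaa this gives rep_count 6, 3 and 1 for a, aa and aaa, and therefore frequencies 4, 3 and 2.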
22. aaaa (string-join)
- Repeating patterns of length one
  - {a, 4, (1, 2, 3, 4)}
- Repeating patterns of length two
  - {aa, 3, (1, 2, 3)}
- Repeating patterns of length three
  - {aaa, 2, (1, 2)}
23. aaaa (suffix tree)
[Figure: suffix tree of aaaa, with edges labeled a and nodes labeled with numbers]
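24. acdacda (correlative matrix)
$f = \left(1 + \sqrt{1 + 8 \cdot \mathit{rep\_count}}\right) / 2$

Correlative matrix ("." marks a 0 or unused cell):

    a  c  d  a  c  d  a
a   .  .  .  1  .  .  1
c   .  .  .  .  2  .  .
d   .  .  .  .  .  3  .
a   .  .  .  .  .  .  4
c   .  .  .  .  .  .  .
d   .  .  .  .  .  .  .
a   .  .  .  .  .  .  .

CS entries shown:
- (a, 1, 1), (a, 2, 1), (a, 3, 2)
- (ac, 1, 1), (c, 1, 1)
- (acd, 1, 1), (cd, 1, 1), (d, 1, 1)
- (acda, 1, 0), (cda, 1, 1), (da, 1, 1)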
25. acdacda (string-join)
- Repeating patterns of length one
  - {a, 3, (1, 4, 7)}, {c, 2, (2, 5)}, {d, 2, (3, 6)}
- Repeating patterns of length two
  - {ac, 2, (1, 4)}, {cd, 2, (2, 5)}, {da, 2, (3, 6)}
- Repeating patterns of length four
  - {acda, 2, (1, 4)}
- Non-trivial repeating patterns: acda
26. acdacda (suffix tree)
[Figure: suffix tree of acdacda, with edges labeled a, c and d and nodes labeled with numbers]