1
CUIDADO UPF status report
  • M2 meeting
  • March 2002
  • Perfecto Herrera
  • Universitat Pompeu Fabra

2
Outline
  • Music Description Schemes (Deliverable 2.2.1)
  • Modules Developed (Deliverable 2.2.2)
  • Melody description extraction module
  • Rhythm description extraction module
  • Interesting segments extraction module
  • Timescaling module
  • Other contributions
  • What's next?

3
Music Description Schemes (Deliverable 2.2.1)
  • Segment description schemes
  • Melody description schemes
  • Rhythm description schemes
  • Instrument description schemes
  • Some useful DSs currently in MPEG-7

4
Music Description Schemes Goals
  • Develop music description schemes that can be
    used for the different CUIDADO prototypes and
    partners
  • Keep MPEG-7 XML-schema compatibility
  • Address different musical layers
  • Address different abstraction levels (from lower
    to higher levels)

5
Segment description
  • An AudioSegment is the basic unit for audio
    description.
  • Segments can be decomposed into other segments.
  • Descriptors and other audio DSs can be included
    in segment descriptions (see the sketch below)
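
The hierarchical structure described above (segments decomposed
into child segments, each carrying descriptors) can be pictured
with a minimal sketch. The class and field names below are
illustrative only; they are not the actual MPEG-7 XML schema.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AudioSegment:
    # Illustrative stand-in for an MPEG-7 AudioSegment: a time span
    # that carries descriptors and may be decomposed into children.
    start: float                                   # seconds
    end: float                                     # seconds
    descriptors: Dict[str, float] = field(default_factory=dict)
    children: List["AudioSegment"] = field(default_factory=list)

    def decompose(self, boundaries: List[float]) -> None:
        # Split at the given boundary times (sorted, strictly inside
        # the segment) into contiguous child segments.
        edges = [self.start] + list(boundaries) + [self.end]
        self.children = [AudioSegment(a, b)
                         for a, b in zip(edges[:-1], edges[1:])]

# Example: a 30 s piece split into three sections, one of which gets
# a (hypothetical) boundary-reliability descriptor attached.
piece = AudioSegment(0.0, 30.0)
piece.decompose([12.0, 21.5])
piece.children[1].descriptors["boundaryReliability"] = 0.8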

6
Segment description
Some segment boundaries can be clearer than
others; therefore we need a low-level descriptor
to indicate their reliability.
7
Melody description
8
Melody description pending issues
  • Mid-level descriptors derived from the pitch
    contour: scale, tonality, interval distributions,
    tessitura...
  • Mid-level descriptors derived from score or grid
    deviations
  • Higher-level subjective terms (energetic, solemn,
    etc.)

Input from other partners is needed
9
Rhythm description
10
Rhythm description pending issues
  • Tempo variation
  • Meter extraction
  • Deviations from structure
  • Connections with instrument labelling
  • Rhythm pattern taxonomies
  • Higher-level subjective terms

Input from other partners is needed
11
Instrument description
ClassificationScheme DS can be used for building
conceptual taxonomies (e.g. genre, subjective
ratings, etc.). SoundCategory DS can be used
specifically for instrument taxonomies.
12
Instrument description
The current SoundModel DS assumes a SpectrumBasis
decomposition; other possible modeling
strategies should be included.
13
Auxiliary DSs (no new proposals here)
  • Creation and Production DSs are the right
    structures for describing information about the
    creation and production of the multimedia
    content (examples: who was the producer, where
    it was recorded, how it is subjectively rated)
  • Media DSs are the structures for describing the
    media-specific information of the multimedia data
    (examples: physical format, profiles of coding
    parameters)
  • Usage DSs are intended for describing the
    possible ways of using, or having used, the
    content (examples: owners of the rights, usage
    permissions (where, when, how, by whom), cost of
    the creation, user settings...)

14
Music Description Extractors and Transformers
(Deliverable 2.2.2)
  • Interesting segments extractor
  • Melody description extractor
  • Rhythm description extractor
  • Timescaling module

15
Interesting segments extractor
  • Implementation of Foote's (2000) algorithm
    (see the sketch after this list)
  • (low self-similarity means novelty)
  • Self-similarity scale is user-controlled
  • Interestingness is defined by choosing a set of
    low-level descriptors
  • Best run in supervised mode (the user gets graphs,
    then selects better parameter values)
  • Output: segment boundary points and a novelty
    measure
  • Functional dependency on low-level descriptors
    (D2.1.1); integration not yet solved
  • Enhanced functionalities when combined with some
    post-processing (i.e. solo location)
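
A minimal sketch of Foote's novelty measure, assuming feature
frames (a matrix of low-level descriptors per frame) have
already been extracted; the function name and kernel size are
illustrative, and the actual module may differ in detail.

import numpy as np

def novelty_curve(features: np.ndarray, kernel_size: int = 64) -> np.ndarray:
    # features: (n_frames, n_descriptors) matrix of low-level descriptors.
    # Returns a per-frame novelty score (high where self-similarity is low).
    unit = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    sim = unit @ unit.T                      # cosine self-similarity matrix

    # Checkerboard kernel: rewards similarity within the past and future
    # halves and penalizes similarity across the candidate boundary.
    half = kernel_size // 2
    sign = np.ones((kernel_size, kernel_size))
    sign[:half, half:] = -1.0
    sign[half:, :half] = -1.0

    novelty = np.zeros(len(features))
    for i in range(half, len(features) - half):
        window = sim[i - half:i + half, i - half:i + half]
        novelty[i] = np.sum(window * sign)
    return novelty

The kernel size plays the role of the user-controlled
self-similarity scale; peaks of the returned curve are the
segment boundary candidates, and the peak height is the novelty
measure attached to each boundary point.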

16
Interesting segments extractor
17
Interesting segments extractor
  • Note segmentation using energy, kurtosis, and F0

18
Interesting segments extractor
  • Section segmentation using energy, centroid and
    kurtosis

19
Melody description extractor
  • Currently it is only intended for monophonic
    phrases
  • Limitations:
  • Note segmentation needs improvements (better
    transient detection)

20
Melody description extractor
21
Rhythm description extraction module
  • Extracts timing and rhythmic data from drum loops
  • Consists of sub-modules:
  • onset detector
  • pulses (i.e. rhythmic levels) detector
  • Tick
  • Tempo (see the sketch after this list)
  • Instrument labeler (yet to be implemented)
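
As a rough illustration of how the tick/tempo sub-modules can
follow the onset detector, the sketch below estimates a tempo
from detected onset times by scoring candidate BPM values
against the inter-onset intervals. This is an assumed approach
for illustration, not the module's actual algorithm.

import numpy as np

def estimate_tempo(onset_times: np.ndarray, bpm_range=(60, 200)) -> float:
    # Score each candidate BPM by how well its beat period explains the
    # inter-onset intervals, and return the best-scoring one.
    iois = np.diff(np.sort(onset_times))              # seconds between onsets
    bpms = np.arange(bpm_range[0], bpm_range[1] + 1)
    scores = []
    for bpm in bpms:
        period = 60.0 / bpm
        ratios = iois / period
        deviation = np.abs(ratios - np.round(ratios)) # 0 = on the grid
        scores.append(np.mean(1.0 - 2.0 * deviation))
    return float(bpms[int(np.argmax(scores))])

# Example: onsets every 0.5 s -> 120 BPM.
print(estimate_tempo(np.array([0.0, 0.5, 1.0, 1.5, 2.0])))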

22
Rhythm description extraction module
23
Instrument description: instrument labelling
  • Towards automatic labelling of percussive slices
  • First step: automatic labelling of isolated
    percussive sounds
  • Next: proceed with mixtures of 2 and 3 sounds

24
Rhythm description: instrument labelling
(submitted to ICMC 2002)
Kick-Snare-OpenHH-ClosedHH-HiTom-MidTom-LoTom-Crash-Ride
Membranes-Plates
Kick-Snare-Hihat-Tom-Cymbals

SKEWNESS > 4.619122 AND B40HZ70HZ > 7.784892 AND MFCC3 < 1.213368 → Kick (105.0/0.0)
KURTOSIS > 26.140138 AND TEMPORALCE < 0.361035 AND ATTZCR > 1.478743 → Tom (103.0/0.0)
B710KHZ < 0.948147 AND KURTOSIS < 26.140138 AND ATTZCR < 22.661397 → Snare (133.0/0.0)
SPECCENTROID > 11.491498 AND B1015KHZ > 0.791702 → HH (100.0/2.0)
SKEWNESS < 4.485531 AND B160HZ190HZ < 5.446338 AND MFCC3VAR > 0.212043 AND MFCC4 > -0.435871 → Cymbal (110.0/3.0)
  • Database with > 600 rock drum sounds
  • Initial set of 50 descriptors reduced to 20-30
    after selection
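
The rules above translate almost directly into code. The sketch
below is a literal transcription, assuming the named descriptors
(SKEWNESS, KURTOSIS, band energies, MFCCs, etc.) have already
been computed for a sound; it returns None when no rule fires.

from typing import Dict, Optional

def label_drum_sound(d: Dict[str, float]) -> Optional[str]:
    # Apply the induced decision rules to a dict of descriptor values.
    if d["SKEWNESS"] > 4.619122 and d["B40HZ70HZ"] > 7.784892 and d["MFCC3"] < 1.213368:
        return "Kick"
    if d["KURTOSIS"] > 26.140138 and d["TEMPORALCE"] < 0.361035 and d["ATTZCR"] > 1.478743:
        return "Tom"
    if d["B710KHZ"] < 0.948147 and d["KURTOSIS"] < 26.140138 and d["ATTZCR"] < 22.661397:
        return "Snare"
    if d["SPECCENTROID"] > 11.491498 and d["B1015KHZ"] > 0.791702:
        return "HH"
    if (d["SKEWNESS"] < 4.485531 and d["B160HZ190HZ"] < 5.446338
            and d["MFCC3VAR"] > 0.212043 and d["MFCC4"] > -0.435871):
        return "Cymbal"
    return None  # no rule fired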

25
Timescaling module
  • Time compression and expansion of audio content
  • Very high quality (without timbre or pitch
    alteration), transient and stereo image
    preservation
  • Usable for content-based transformation (provided
    content descriptions are available), not only
    as-is; a conceptual sketch follows this list
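
As background for what time compression and expansion without
pitch alteration involves, here is a bare-bones overlap-add time
stretcher. It is only a conceptual sketch, nothing like the
high-quality, transient- and stereo-preserving algorithm
described above.

import numpy as np

def ola_timescale(x: np.ndarray, factor: float,
                  frame: int = 2048, hop: int = 512) -> np.ndarray:
    # Stretch (factor > 1) or compress (factor < 1) mono audio in time,
    # keeping pitch roughly unchanged, by overlap-adding analysis frames
    # at a synthesis hop different from the analysis hop.
    # Assumes len(x) >= frame.
    syn_hop = int(round(hop * factor))
    window = np.hanning(frame)
    n_frames = (len(x) - frame) // hop + 1
    out = np.zeros((n_frames - 1) * syn_hop + frame)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        out[i * syn_hop:i * syn_hop + frame] += x[i * hop:i * hop + frame] * window
        norm[i * syn_hop:i * syn_hop + frame] += window
    norm[norm < 1e-8] = 1.0                 # avoid division by zero at edges
    return out / norm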

26
Timescaling module
  • Some examples
  • Vocal
  • Orchestral
  • Jazz
  • Funk

27
Other contributions: F0 estimation
  • Monophonic F0 detector based on Two-Way Mismatch
    (Maher & Beauchamp, 1993); a simplified sketch
    follows this list
  • Integration into 2.1.1 modules is going to use
    single-frame estimations; this may not be
    optimal, as context (previous F0, next F0,
    instrument, etc.) is not considered
  • Polyphonic F0 detector (Klapuri, 2000) using
    bandwise processing
  • Intended mainly for polyphonic-monotimbral
    instruments or small ensembles, not for dense
    mixtures of sounds
  • Estimation of candidates is performed for each
    analysis frame; several candidates are obtained
  • The tracking of candidates is not yet implemented
  • Our main interest is in deriving a predominant F0
    only
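
For reference, a simplified version of the Two-Way Mismatch
error is sketched below. The parameter values are the commonly
cited defaults from Maher & Beauchamp (1993), but the weighting
is condensed, so this is an illustration rather than the
module's implementation.

import numpy as np

def twm_error(f0: float, peak_freqs: np.ndarray, peak_mags: np.ndarray,
              n_harm: int = 10, p: float = 0.5, q: float = 1.4,
              r: float = 0.5, rho: float = 0.33) -> float:
    # Simplified two-way mismatch error for one F0 candidate, given the
    # frequencies and magnitudes of the measured spectral peaks.
    harmonics = f0 * np.arange(1, n_harm + 1)
    a_max = peak_mags.max()

    # Predicted-to-measured: each predicted harmonic vs. its closest peak.
    err_pm = 0.0
    for f in harmonics:
        k = int(np.argmin(np.abs(peak_freqs - f)))
        w = abs(peak_freqs[k] - f) * f ** -p
        err_pm += w + (peak_mags[k] / a_max) * (q * w - r)

    # Measured-to-predicted: each measured peak vs. its closest harmonic.
    err_mp = 0.0
    for f, m in zip(peak_freqs, peak_mags):
        k = int(np.argmin(np.abs(harmonics - f)))
        w = abs(harmonics[k] - f) * f ** -p
        err_mp += w + (m / a_max) * (q * w - r)

    return err_pm / n_harm + rho * err_mp / len(peak_freqs)

def estimate_f0(peak_freqs: np.ndarray, peak_mags: np.ndarray,
                candidates: np.ndarray) -> float:
    # Pick the candidate with the lowest total mismatch error.
    errors = [twm_error(c, peak_freqs, peak_mags) for c in candidates]
    return float(candidates[int(np.argmin(errors))])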

28
Other contributions: polyphonic F0 detection
29
UPF CUIDADO team
  • Xavier Amatriain
  • Lars Fabig
  • Emilia Gómez
  • Günter Geiger
  • Fabien Gouyon
  • Gilles Peterschmitt
  • Julien Ricard
  • Perfecto Herrera

30
What's next? (beyond retrieval)
  • Examples of achievable functionalities
  • Music structure visualization
  • Melody description visualization and manipulation
    (content-based timescaling)
  • Rhythm loops processing
  • Matching songs in playlists by tempo

31
What's next? Music structure visualization
32
What's next? Melody description visualization
and manipulation (content-based timescaling)
33
What's next? Visualizing, navigating and editing
with rhythm marks
  • Pulse marks visualization
  • Pulse-based editing (duplicating parts, snap to
    pulse mark) and navigation (skip to next tick);
    see the sketch below
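
The pulse-based editing and navigation operations reduce to
simple arithmetic on the list of tick marks; a minimal sketch
with hypothetical function names:

import bisect
from typing import List

def snap_to_pulse(t: float, ticks: List[float]) -> float:
    # Return the tick mark closest to time t (ticks sorted, in seconds).
    i = bisect.bisect_left(ticks, t)
    candidates = ticks[max(0, i - 1):i + 1]
    return min(candidates, key=lambda m: abs(m - t))

def skip_to_next_tick(t: float, ticks: List[float]) -> float:
    # Return the first tick mark strictly after t (or the last one).
    i = bisect.bisect_right(ticks, t)
    return ticks[i] if i < len(ticks) else ticks[-1]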

34
What's next? Combining rhythm and instrument
descriptions
  • Navigation by instrument occurrence (skip to next
    snare); see the sketch below
  • Muting an instrument
  • Processing (applying an effect to) an instrument
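
Given per-onset instrument labels, these functionalities are
again simple list and signal operations. A minimal sketch with
hypothetical names, assuming each onset is stored as a
(time, label) pair and that muting can be approximated by
zeroing a short window after each labelled onset:

import numpy as np
from typing import List, Tuple

Onset = Tuple[float, str]   # (time in seconds, instrument label)

def skip_to_next(t: float, onsets: List[Onset], label: str) -> float:
    # Time of the next occurrence of `label` after t (e.g. the next snare).
    times = [time for time, lab in onsets if lab == label and time > t]
    return min(times) if times else t

def mute_instrument(x: np.ndarray, sr: int, onsets: List[Onset],
                    label: str, dur: float = 0.15) -> np.ndarray:
    # Crude mute: zero out a short window after each onset of `label`.
    # (The 0.15 s window is an arbitrary illustrative value.)
    y = x.copy()
    for time, lab in onsets:
        if lab == label:
            a = int(time * sr)
            y[a:a + int(dur * sr)] = 0.0
    return y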

35
What's next? Combining rhythm and instrument
descriptions
  • Building MIDI maps, reconstructing the loop,
    and generating timbral variations

(Flow diagram: original audio file, tick
extraction, labeling, MIDI map, MIDI file, sound
database, instrument category and/or timbre
similarity query, reconstruction, new audio file)
36
What's next? Combining rhythm descriptions and
timescaling
Transition from one pattern to another that does
not match in tempo (see the sketch below)
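
The tempo-matching step amounts to time-scaling one pattern by
the ratio of the two tempi; a minimal sketch:

def tempo_match_factor(source_bpm: float, target_bpm: float) -> float:
    # Time-scale factor that makes the source pattern play at the target
    # tempo: stretching by this factor divides the tempo by the same
    # amount, so new_bpm = source_bpm / factor = target_bpm.
    return source_bpm / target_bpm

# Example: a 100 BPM loop scaled by 100/120 (shortened) plays at 120 BPM.
print(tempo_match_factor(100.0, 120.0))
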
37
What's next? Matching songs by tempo for
playlist generation