Title: Extending Audacity for Audio Annotation
Beinan Li, John Ashley Burgoyne, Ichiro Fujinaga
Music Technology Area, Schulich School of Music, McGill University, and CIRMMT, Montreal, Canada
Manual Audio Annotation
Audio classification systems call for custom manual annotation software.
Existing tools lack features needed for audio classification and are not customizable, e.g. Project Pad: no waveform viewer.
Choice of Software Framework
- Need open source base software with
  - Full playback control
  - Support for creating text notes
  - Audio signal visualization
  - Audio format compatibility
  - Cross-platform
Region Selection
In the original Audacity
Figure: a region labeled "Chorus" with opening boundary O and closing boundary C.
The previously selected opening boundary O is lost when the user selects C to start a trial listening for locating the closing boundary.
In our extended Audacity
Figure: candidate opening boundary O and candidate closing boundary C.
- Cache a location (e.g. O) as a candidate opening boundary.
- Cache a location (e.g. C) as a candidate closing boundary.
- Finalize both candidate boundaries and create a label for the region.
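The cache-then-finalize workflow for region selection can be sketched in a few lines. This is only an illustration of the idea, not the code of the Audacity extension: the names `RegionSelector`, `cache_opening`, `cache_closing`, and `finalize` are hypothetical, and positions are assumed to be in seconds.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Label:
    start: float  # seconds (assumed unit)
    end: float    # seconds (assumed unit)
    text: str


@dataclass
class RegionSelector:
    """Hypothetical helper: keeps unfinalized boundaries while the user listens."""
    opening: Optional[float] = None
    closing: Optional[float] = None
    labels: List[Label] = field(default_factory=list)

    def cache_opening(self, position: float) -> None:
        # Store a candidate opening boundary; playback elsewhere does not erase it.
        self.opening = position

    def cache_closing(self, position: float) -> None:
        # Store a candidate closing boundary.
        self.closing = position

    def finalize(self, text: str) -> Label:
        # Turn the two cached candidates into a labeled region.
        if self.opening is None or self.closing is None:
            raise ValueError("both boundaries must be cached before finalizing")
        start, end = sorted((self.opening, self.closing))
        label = Label(start, end, text)
        self.labels.append(label)
        # Clear the cache so the next region starts fresh.
        self.opening = self.closing = None
        return label
```

For example, calling `cache_opening(12.5)`, then `cache_closing(31.0)`, then `finalize("Chorus")` yields one label covering 12.5 to 31.0 seconds. Either candidate can be re-cached any number of times before finalizing.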
Candidate base software:
- Praat (version 4.4.31)
  - Written in C
  - Mainly for speech analysis
  - No support for MP3 and many other popular compressed audio formats
  - Self-implemented GUI, hard to extend
- Audacity (version 1.3 beta)
  - Written in C++
  - General audio editing
  - Label Track support
  - Support for popular uncompressed and compressed audio formats
  - GUI based on the open source framework wxWidgets, easy to extend
Label Tracks and Auto-completion
In the original Audacity
Manually create and name only one label at a time.
In our extended Audacity
In binary classification, only one category needs to be labeled manually; the other is labeled automatically.
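One way to picture the auto-completion is as gap filling: every stretch of audio not covered by a manual label receives the complementary category. The sketch below is an assumption about how this could work, not the extension's actual implementation. The function name `complete_binary_labels` and the category names in the usage note are hypothetical, and the manually labeled regions are assumed not to overlap.

```python
from typing import List, Tuple

Region = Tuple[float, float, str]  # (start, end, label), times in seconds


def complete_binary_labels(
    labeled: List[Region], duration: float, other: str
) -> List[Region]:
    """Fill every unlabeled gap in [0, duration] with the `other` category."""
    completed: List[Region] = []
    cursor = 0.0  # end of the audio covered so far
    for start, end, name in sorted(labeled):
        if start > cursor:
            # Gap before this manual label: assign the complementary class.
            completed.append((cursor, start, other))
        completed.append((start, end, name))
        cursor = max(cursor, end)
    if cursor < duration:
        # Trailing gap after the last manual label.
        completed.append((cursor, duration, other))
    return completed
```

With a single manual label `(10.0, 20.0, "chorus")` on a 30-second file and `other="non-chorus"`, the result is three regions: non-chorus from 0 to 10, chorus from 10 to 20, and non-chorus from 20 to 30.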
Export Results in ACE XML Format
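A rough sketch of such an export, using only the Python standard library, is given here. The element names (`classifications_file`, `data_set`, `data_set_id`, `classification`, `section`, `start`, `stop`, `class`) are assumptions modeled loosely on ACE classification files and should be checked against the ACE DTD. The function name `regions_to_ace_xml` is hypothetical, and no DOCTYPE declaration is written.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

Region = Tuple[float, float, str]  # (start, end, label), times in seconds


def regions_to_ace_xml(data_set_id: str, regions: List[Region]) -> str:
    """Serialize labeled regions to an ACE-style XML string (assumed schema)."""
    root = ET.Element("classifications_file")
    ET.SubElement(root, "comments")
    data_set = ET.SubElement(root, "data_set")
    # Identifier of the annotated recording, e.g. its file name.
    ET.SubElement(data_set, "data_set_id").text = data_set_id
    classification = ET.SubElement(data_set, "classification")
    for start, end, name in regions:
        # One section per labeled region.
        section = ET.SubElement(classification, "section")
        ET.SubElement(section, "start").text = str(start)
        ET.SubElement(section, "stop").text = str(end)
        ET.SubElement(section, "class").text = name
    return ET.tostring(root, encoding="unicode")
```

Each label in a label track would map to one `section` element, so the exported file mirrors the annotated regions one to one.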
Limitations of Current Audacity
- Limitations related to audio annotation
  - Region selection: cannot temporarily store unfinalized boundaries.
  - Label tracks: no automatic label creation and naming.
Future Work
Give human annotators more visual cues by visualizing various audio features.