Jiazhi Ou jzou@cs.cmu.edu - PowerPoint PPT Presentation

Slides: 28

Provided by: SchoolofC55

Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

1
Wild Dolphin Project 11-751 Speech Final Project

by
Jiazhi Ou jzou@cs.cmu.edu
Tal Blum blum@cs.cmu.edu

2
Outline

Wild Dolphin Project, Dolphin Speech
Data, Labeling, Labeling problems
Previous work
Model training
Experiments, Results
Conclusions

3
The Wild Dolphin Project (WDP)

The Wild Dolphin Project (WDP), founded by Dr.
Denise Herzing in 1985, is engaged in an
ambitious, long-term scientific study of a
specific pod of Atlantic spotted dolphins that
live 40 miles off the coast of the Bahamas, in
the Atlantic Ocean. For about 100 days each year,
Phase I research has involved the photographing,
videotaping, and audio taping of a group of
resident dolphins, aiming to learn about their
lives.
http://www.wilddolphinproject.org/index.cfm

4
Dolphin Speech

Dolphin speech is very different from human
speech

Range of frequencies is wider
Two mechanisms for producing sound simultaneously
Directionality of some of the frequencies
Carried in water
Can travel large distances

5
Dolphin Speech (2)

Is used for
Identification
Communicating
Fighting
Defending
Courting
Warning
Calling
Hunting

6
Dolphin Speech (3)

3 main types
Whistles (signature and non-signature)
Clicks
Spike trains

7
What do we know?

Not much
We know that each dolphin has a unique whistle,
called its signature whistle.
A dolphin's signature whistle is similar to those
of the dolphins that were in close contact with
it when it was a baby

8
Data

164 files, each containing sounds of one dolphin
whose name is known.
Average file length is 7 sec
Total data length is less than 20 minutes, about
half of which is silence
The data does not contain all of the relevant
frequencies

9
Labeling

Dolphin Names
Dolphin ID project
Pause, Noise, Dolphin Signature Whistles, Dolphin
Non-Signature whistles.

10
Labeling Problems

How do we distinguish between the 2 whistle types
(signature and non-signature)?
How do we distinguish between whistles and
non-whistles?
They co-occur
How do we determine the duration of a label?
Should labels that are close together be merged
into one label?
This has an effect on the model
Some signals are weak, probably due to a change
in the dolphin's direction

11
Mapping from Labels to Models

Label                                                      Model
d                                                          Signature Whistles
dp, md                                                     Non-Signature Whistles
click, electnoise, electricnoise, h, H, MachineSpike, s    GARBAGE
pau                                                        PAUSE (Water)
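
The mapping above can be written down as a small lookup table. This is only a sketch: the names LABEL_TO_MODEL and model_for are made up for illustration and do not come from the project scripts, and the choice to reject unknown labels is an assumption.

```python
# Label-to-model mapping copied from the slide's table.
# LABEL_TO_MODEL and model_for are hypothetical names, not project code.
LABEL_TO_MODEL = {
    "d": "Signature Whistles",
    "dp": "Non-Signature Whistles",
    "md": "Non-Signature Whistles",
    "click": "GARBAGE",
    "electnoise": "GARBAGE",
    "electricnoise": "GARBAGE",
    "h": "GARBAGE",
    "H": "GARBAGE",  # labels are case sensitive: "h" and "H" are both listed
    "MachineSpike": "GARBAGE",
    "s": "GARBAGE",
    "pau": "PAUSE",
}


def model_for(label):
    """Return the model name for a transcription label.

    Unknown labels raise KeyError (an assumption) so that typos in the
    label files are noticed instead of being silently dropped.
    """
    return LABEL_TO_MODEL[label]


if __name__ == "__main__":
    for label in ("d", "md", "click", "pau"):
        print(label, "->", model_for(label))
```

Keeping the mapping in one place makes it easy to try a different grouping, for example folding the non-signature labels into another model.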
12
Label Statistics

                                         PAUSE   SIGWHISTLE   GARBAGE   DOLPHIN
Occurrences                                756          633        13        24
Accumulated time (in secs)                 466          320       7.1      11.3
Average time per occurrence (in secs)      0.6          0.5      0.55      0.47
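
The averages in the table follow from dividing accumulated time by the number of occurrences. The sketch below redoes that arithmetic with the slide's numbers; the variable and function names are illustrative only.

```python
# (occurrences, accumulated seconds) per model, copied from the slide.
STATS = {
    "PAUSE": (756, 466.0),
    "SIGWHISTLE": (633, 320.0),
    "GARBAGE": (13, 7.1),
    "DOLPHIN": (24, 11.3),
}


def average_duration(occurrences, total_seconds):
    """Average length of one labeled segment, in seconds."""
    return total_seconds / occurrences


averages = {name: average_duration(n, t) for name, (n, t) in STATS.items()}
total_seconds = sum(t for _, t in STATS.values())
pause_share = STATS["PAUSE"][1] / total_seconds

if __name__ == "__main__":
    for name, avg in averages.items():
        print(f"{name}: {avg:.2f} s per occurrence")
    print(f"total labeled time: {total_seconds / 60:.1f} min")
    print(f"share labeled PAUSE: {pause_share:.0%}")
```

The total labeled time and the PAUSE share computed this way can be compared with the Data slide's statement that there are less than 20 minutes of data, about half of which is silence.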
13
Previous Work

Dolphin-ID Project by Tanja, Alan and Yue
Task: to identify individual dolphins using their
signature whistles
51 files labeled by Alan
13 HMMs: 10 (one for each dolphin), DOLPHIN,
PAUSE, and GARBAGE
Used Janus for training and testing
Tried different kinds of features

14
Our Work

Model Generalized Signature Whistles
Label More Files
Create HMMs for signature whistles, non-signature
whistles, garbage, and pause
Train and test the HMMs using Janus
Evaluate the test results with our own method
Compare different model selections

15
Signal Processing

Tanja's scripts
Down sampling
High Pass Filter
FFT
LDA
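
As a rough illustration of the first three steps listed above, here is a minimal pure-Python sketch. It is not Tanja's scripts: the decimation factor, the filter coefficient, and the frame sizes are assumed values, a direct DFT stands in for the FFT, and the LDA step is left out.

```python
import cmath
import math


def downsample(signal, factor):
    """Keep every factor-th sample (naive decimation, no anti-alias filter)."""
    return signal[::factor]


def high_pass(signal, alpha=0.95):
    """First-order high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[n] - signal[n - 1]))
    return out


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2, computed directly in O(N^2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def features(signal, factor=2, frame_size=64, step=32):
    """Downsample, high-pass filter, then one spectrum per overlapping frame."""
    processed = high_pass(downsample(signal, factor))
    last_start = len(processed) - frame_size
    return [magnitude_spectrum(processed[i:i + frame_size])
            for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # A synthetic tone stands in for a recording.
    tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
    feats = features(tone)
    print(len(feats), "frames of", len(feats[0]), "bins each")
```

In practice a library FFT and a proper anti-aliasing filter would replace these functions; the sketch only shows the order of operations.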

16
HMM Topologies

Signature Whistles
Non-Signature Whistles
Garbage
Pause (Water)

17
Model Selection

Scheme 1
Signature Whistles, Non-Signature Whistles,
GARBAGE, PAUSE
Scheme 2
Signature Whistles, GARBAGE, PAUSE
Scheme 3
10 HMMs (one for each dolphin), GARBAGE, PAUSE
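
The three schemes differ only in which HMMs are trained, so they can be listed as plain configuration. In this sketch the per-dolphin model names (DOLPHIN_1 to DOLPHIN_10) are placeholders; the slides do not give the real names.

```python
# Model inventory per scheme, following the slide.
# The DOLPHIN_<i> names are hypothetical placeholders.
SCHEMES = {
    1: ["Signature Whistles", "Non-Signature Whistles", "GARBAGE", "PAUSE"],
    2: ["Signature Whistles", "GARBAGE", "PAUSE"],
    3: [f"DOLPHIN_{i}" for i in range(1, 11)] + ["GARBAGE", "PAUSE"],
}

if __name__ == "__main__":
    for number, models in SCHEMES.items():
        print(f"Scheme {number}: {len(models)} HMMs")
```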

18
Evaluation

We cannot use WER (word error rate) here since
there are no words, just segments.
The method we used was to compute a confusion
matrix over hidden states.
Janus treats silence differently and doesn't show
silence classification, which complicates the
evaluation.
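
One way to build such a confusion matrix is to compare the reference label and the recognized label frame by frame. The sketch below assumes both sequences are already aligned to the same frames and that rows are the reference labels; the slides do not spell out either detail, and the function names are illustrative.

```python
from collections import Counter


def confusion_matrix(reference, hypothesis, classes):
    """Count (reference, hypothesis) label pairs, one pair per frame.

    Rows follow the reference label, columns the recognized label.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("sequences must cover the same frames")
    counts = Counter(zip(reference, hypothesis))
    return [[counts[(ref, hyp)] for hyp in classes] for ref in classes]


def row_percentages(matrix):
    """Convert each row of counts to percentages of that row's total."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100.0 * c / total if total else 0.0 for c in row])
    return result


if __name__ == "__main__":
    classes = ["SIG", "GARBAGE", "PAUSE"]
    ref = ["SIG", "SIG", "SIG", "PAUSE", "PAUSE", "GARBAGE"]
    hyp = ["SIG", "SIG", "PAUSE", "PAUSE", "SIG", "SIG"]
    for name, row in zip(classes, row_percentages(confusion_matrix(ref, hyp, classes))):
        print(name, [round(p, 1) for p in row])
```

Normalizing by row makes classes with very different amounts of data comparable, which matters here because GARBAGE has far fewer occurrences than PAUSE.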

19
Experiments

Data
162 labeled files were used
Half of the data for training, half for testing
Swap the training set and test set
162 test results all together
Features
The same as those in dolphin-ID project
Model Selection
3 different schemes
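
The train/test swap described above is two-fold cross-validation: every file is tested exactly once by a model that never saw it in training. A minimal sketch follows; the file names are hypothetical and the alternating split is an assumption, since the slides do not say how the halves were chosen.

```python
def two_fold_splits(files):
    """Return [(train, test), (train, test)] with the two halves swapped."""
    ordered = sorted(files)
    half_a = ordered[0::2]  # every other file, starting with the first
    half_b = ordered[1::2]  # the remaining files
    return [(half_a, half_b), (half_b, half_a)]


if __name__ == "__main__":
    # Hypothetical file names standing in for the 162 labeled files.
    files = [f"file{i:03d}" for i in range(162)]
    for train, test in two_fold_splits(files):
        print(len(train), "training files,", len(test), "test files")
```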

20
Results: Scheme 1

           Sig   Non-Sig   Garbage   Pause
Sig         58         6        18      34
Non-Sig     33         8        37      22
Garbage     77         0         5      18
Pause       31         6        27      34

21
Results: Scheme 2

           Sig   Garbage   Pause
Sig         79         9      21
Garbage     52        21      27
Pause       48        14      38

22
Results: Scheme 3

           Sig   Garbage   Pause
Sig         91       0.6       8
Garbage     80        10      10
Pause       69         1      30

23
Analysis of Results

You can only get as good as your labels
Scheme 3 is the best at aligning signature
whistles: it is speaker dependent
Scheme 1 is the worst: there is not enough data to
model non-signature whistles and garbage
Scheme 2 is in the middle: it is speaker
independent
Pause is the most difficult to model: it contains
many different things, and we modeled it with
only 1 state

24
Conclusion

Analyzing dolphin sounds is quite different from
analyzing human speech. The methods used have to
be adjusted to the characteristics of the dolphin
sounds.
There is a lot of work to be done in the signal
processing stage
Partly supervised training
It might be better to construct models only for
the labels we are sure of and let the model learn
what the signature whistles are, or which units
discriminate between different labels.

25
We also tried