Title: A Sticky HDP-HMM for Systems with State Persistence
1. A Sticky HDP-HMM for Systems with State Persistence
- Emily Fox, Erik Sudderth, Michael Jordan, and Alan Willsky
- ICML 2008
- Helsinki, Finland
2. Application: Speaker Diarization
Total number of people
High probability of self-transition
Multimodal emissions
3. Application: Maneuvering Target Tracking
- HMM emissions: exogenous input driving the dynamical system
- Unknown number of maneuver modes
Dynamical System
4. HDP Prior on Infinite HMM
- Nonparametric Bayesian prior on HMMs with unknown state space cardinality
- Encourages use of sparse subset of infinite state space
- Allows new states to be created as more data are observed
- Inadequately captures temporal state persistence
Infinite HMM: Beal et al., NIPS 2002
HDP-HMM: Teh et al., JASA 2006
5. Outline
- Background: HDP-HMM
- Sticky HDP-HMM
- Capturing multimodal emissions
- Speaker diarization
6. Hidden Markov Models
[Figure: HMM with hidden states and observations; state sequence plot with axes Time and State]
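The generative process behind the HMM figure can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `sample_hmm`, the two-state transition matrix, and the 1-D Gaussian emissions are all assumptions chosen for the example.

```python
import random

def sample_hmm(pi0, trans, means, stdev, T, rng):
    """Draw a hidden state sequence and 1-D Gaussian observations from a finite HMM."""
    K = len(pi0)
    states, obs = [], []
    # Initial state drawn from pi0 (hypothetical initial distribution).
    z = rng.choices(range(K), weights=pi0)[0]
    for _ in range(T):
        states.append(z)
        # Emission depends only on the current state.
        obs.append(rng.gauss(means[z], stdev))
        # Next state depends only on the current state (Markov property).
        z = rng.choices(range(K), weights=trans[z])[0]
    return states, obs

rng = random.Random(0)
# Illustrative values only: strong self-transitions give persistent states.
trans = [[0.95, 0.05], [0.10, 0.90]]
states, obs = sample_hmm([0.5, 0.5], trans, means=[-2.0, 2.0], stdev=0.5, T=200, rng=rng)
```

Plotting `states` against time would give the kind of state-versus-time trace that the slide figure labels refer to.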
10. HDP-HMM
[Figure: state sequence plot with axes Time and State]
- Dirichlet process (DP)
  - State space of unbounded size
  - Model complexity adapts to observations
- Hierarchical
  - Ties state transition distributions
  - Shared sparsity
11. HDP-HMM
Stick-breaking construction for DP(γ, H)
- Average transition distribution
Stick of unit probability mass
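The stick-breaking construction can be sketched as repeatedly breaking off a Beta(1, γ) fraction of whatever remains of a unit-length stick. The sketch below truncates after a fixed number of breaks, which is an assumption made to keep the example finite; the function name `stick_breaking` is hypothetical.

```python
import random

def stick_breaking(gamma, num_sticks, rng):
    """Truncated stick-breaking: beta_k = v_k * prod_{l<k} (1 - v_l), v_k ~ Beta(1, gamma)."""
    remaining = 1.0  # length of the stick not yet assigned
    weights = []
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, gamma)  # fraction of the remaining stick to break off
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining

rng = random.Random(1)
beta, leftover = stick_breaking(gamma=2.0, num_sticks=20, rng=rng)
```

Smaller values of `gamma` concentrate the mass on the first few pieces; `leftover` is the mass that the truncation leaves unassigned.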
12. HDP-HMM
- Average transition distribution
- State-specific transition distributions
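The slide's equations did not survive extraction. In the standard HDP-HMM notation (assumed here, following Teh et al.: γ and α are concentration parameters, H is the base measure over emission parameters θ, and F is the emission family), the two distributions named above fit into the model as:

```latex
\begin{aligned}
\beta \mid \gamma &\sim \mathrm{GEM}(\gamma)
  && \text{(average transition distribution)} \\
\pi_j \mid \alpha, \beta &\sim \mathrm{DP}(\alpha, \beta), \quad j = 1, 2, \ldots
  && \text{(state-specific transition distributions)} \\
z_t \mid z_{t-1} &\sim \pi_{z_{t-1}} \\
y_t \mid z_t &\sim F(\theta_{z_t}), \quad \theta_k \sim H
\end{aligned}
```

Because every π_j shares the same base β, the states that carry most of the mass in β are favored as destinations from every state, which is the shared sparsity mentioned earlier.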
13. Sensitivity to Noise
- HDP-HMM inadequately models temporal persistence of states
- DP bias insufficient to prevent unrealistically rapid dynamics
- Reduces predictive performance of inferred model
14. Sticky HDP-HMM: Part I
State-specific base measure
Increased probability of self-transition
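As a reconstruction in the notation of the accompanying ICML 2008 paper (the symbol κ for the self-transition parameter is taken from that paper, not from the surviving slide text), the state-specific base measure adds extra mass κ at the state's own index:

```latex
\pi_j \mid \alpha, \kappa, \beta \;\sim\;
\mathrm{DP}\!\left(\alpha + \kappa,\; \frac{\alpha\beta + \kappa\,\delta_j}{\alpha + \kappa}\right),
\qquad
\mathbb{E}\bigl[\pi_{jk} \mid \beta\bigr] = \frac{\alpha\beta_k + \kappa\,\delta(j,k)}{\alpha + \kappa}
```

Setting κ = 0 recovers the original HDP-HMM; increasing κ raises the expected self-transition probability of every state.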
15. Direct Assignment Sampler
- Marginalize
  - Transition densities
  - Emission parameters
- Sequentially sample
Conjugate base measure ⇒ closed form
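The sequential sampling step can be written out as a worked conditional. This is a sketch under assumptions: n^{-t}_{jk} denotes the number of j-to-k transitions excluding those into and out of z_t, κ is the self-transition parameter from the paper, and the last factor is the emission predictive likelihood, which is available in closed form when the base measure is conjugate.

```latex
p(z_t = k \mid z_{\setminus t}, y_{1:T}, \beta, \alpha, \kappa) \;\propto\;
\bigl(\alpha\beta_k + n^{-t}_{z_{t-1}k} + \kappa\,\delta(z_{t-1},k)\bigr)
\cdot
\frac{\alpha\beta_{z_{t+1}} + n^{-t}_{k z_{t+1}} + \kappa\,\delta(k,z_{t+1})
      + \delta(z_{t-1},k)\,\delta(k,z_{t+1})}
     {\alpha + n^{-t}_{k\cdot} + \kappa + \delta(z_{t-1},k)}
\cdot
p\bigl(y_t \mid \{y_\tau : z_\tau = k,\ \tau \neq t\}\bigr)
```

The first factor is the predictive probability of the transition into k, the second is the predictive probability of the transition out of k (with the counts updated for the transition just made), and both follow from marginalizing the transition distributions.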
16. Blocked Resampling
HDP-HMM weak limit approximation
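A minimal sketch of the idea, assuming the usual degree-L weak limit approximation: β is drawn from a finite Dirichlet(γ/L, ..., γ/L), each row π_j from a Dirichlet with concentration αβ plus κ on entry j, and the whole state sequence is then resampled jointly with backward messages followed by forward sampling. All function names, the toy likelihoods, the use of β as the initial state distribution, and the numeric settings are assumptions for illustration.

```python
import math
import random

def dirichlet(concentrations, rng):
    """Dirichlet draw via normalized Gamma variates (tiny floor avoids zero shapes)."""
    draws = [rng.gammavariate(max(a, 1e-12), 1.0) for a in concentrations]
    total = sum(draws)
    return [d / total for d in draws]

def sample_sticky_transitions(L, gamma, alpha, kappa, rng):
    """Weak-limit draw of beta and the sticky transition rows pi_j."""
    beta = dirichlet([gamma / L] * L, rng)
    pi = []
    for j in range(L):
        conc = [alpha * b for b in beta]
        conc[j] += kappa  # extra mass on the self-transition
        pi.append(dirichlet(conc, rng))
    return beta, pi

def sample_states_blocked(pi0, pi, lik, rng):
    """Jointly sample z_{1:T}; lik[t][k] = p(y_t | z_t = k), assumed positive."""
    T, L = len(lik), len(pi0)
    # msgs[t][k] is proportional to p(y_{t+1:T} | z_t = k).
    msgs = [[1.0] * L for _ in range(T)]
    for t in range(T - 2, -1, -1):
        raw = [sum(pi[k][l] * lik[t + 1][l] * msgs[t + 1][l] for l in range(L))
               for k in range(L)]
        norm = sum(raw)
        msgs[t] = [r / norm for r in raw]
    z, prev = [], pi0
    for t in range(T):
        w = [prev[k] * lik[t][k] * msgs[t][k] for k in range(L)]
        k = rng.choices(range(L), weights=w)[0]
        z.append(k)
        prev = pi[k]
    return z

rng = random.Random(2)
L = 10
beta, pi = sample_sticky_transitions(L, gamma=3.0, alpha=2.0, kappa=20.0, rng=rng)
# Toy data: unit-variance Gaussian likelihoods around hypothetical state means 0..L-1.
means = [float(k) for k in range(L)]
obs = [0.0] * 25 + [3.0] * 25
lik = [[math.exp(-0.5 * (y - m) ** 2) for m in means] for y in obs]
z = sample_states_blocked(beta, pi, lik, rng)
```

Sampling the whole sequence at once, rather than one z_t at a time, is what makes the sampler "blocked"; the finite truncation L is what makes the message passing possible.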
17. Hyperparameters
- Place priors on hyperparameters and learn them from data
- Weakly informative priors
- All results use the same settings
hyperparameters
can be set using the data
Related self-transition parameter: Beal et al., NIPS 2002
18. Results: Gaussian Emissions
19. Results: Fast Switching
Observations
True state sequence
20. Outline
- Background
- Sticky HDP-HMM
- Capturing multimodal emissions
- Speaker diarization
21. Issues with Multimodal Emissions
22. Sticky HDP-HMM: Part II
- Approximate multimodal emissions with infinite Gaussian mixture
- Temporal state persistence disambiguates model
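One way to write the mixture-emission extension, using the notation of the accompanying paper as an assumption (σ is the concentration of each state's emission DP, s_t is the mixture component chosen at time t):

```latex
\begin{aligned}
\psi_j \mid \sigma &\sim \mathrm{GEM}(\sigma), \quad j = 1, 2, \ldots \\
s_t \mid z_t &\sim \psi_{z_t} \\
y_t \mid z_t, s_t &\sim \mathcal{N}\bigl(\mu_{z_t, s_t},\, \Sigma_{z_t, s_t}\bigr)
\end{aligned}
```

Without a bias towards self-transitions, the model cannot tell whether two Gaussian components belong to one persistent state or to two states that alternate rapidly; the sticky parameter favors the former explanation.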
23. Results: Mixture Emissions
24. Speaker Diarization
25. Processing of Features
- Features: 19-dim MFCCs
- Features similar between speakers ⇒ challenging problem
- Speakers look different over time
- Note
  - No training data
  - Just input the raw features
26. Results: 21 Meetings
27. Results: Meeting 1
Sticky DER: 1.26%; ICSI DER: 7.56%
28. Results: Meeting 2
Sticky DER: 24.06%; ICSI DER: 22.00%
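DER here is the diarization error rate. The official NIST scoring also accounts for missed speech, false alarms, and scoring collars; the sketch below is a deliberately simplified, frame-level stand-in that only measures speaker confusion under the best one-to-one mapping between estimated and true labels. The function name and the toy label sequences are hypothetical, and the brute-force search over mappings is only practical for a handful of speakers.

```python
from itertools import permutations

def frame_error_rate(true_labels, est_labels):
    """Fraction of frames mislabeled under the best one-to-one label mapping."""
    true_ids = sorted(set(true_labels))
    est_ids = sorted(set(est_labels))
    # Pad with None so surplus estimated speakers can stay unmatched.
    targets = true_ids + [None] * max(0, len(est_ids) - len(true_ids))
    best_correct = 0
    for perm in permutations(targets, len(est_ids)):
        mapping = dict(zip(est_ids, perm))
        correct = sum(1 for t, e in zip(true_labels, est_labels) if mapping[e] == t)
        best_correct = max(best_correct, correct)
    return 1.0 - best_correct / len(true_labels)

# Toy example: the estimated change point is one frame early.
true_seq = ["A"] * 5 + ["B"] * 5
est_seq = [2] * 4 + [7] * 6
err = frame_error_rate(true_seq, est_seq)
```

The mapping step matters because an unsupervised model has no way to name its states; only the partition of frames into speakers is scored.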
29. Conclusion
- Examined limitations of original HDP-HMM
- Presented sticky HDP-HMM with
  - Parameter allowing bias towards self-transitions
  - DP emission densities for each HMM state
- Simple and effective addition to the original HDP-HMM
- Able to learn a wide range of dynamics, even when state persistence is not present in the data
30. Results: Fast Switching
Observations
True state sequence
31. NIST Rich Transcription Evaluations
- Competition for past 6 years
- Many teams compete
- Highly engineered systems
  - Large team
  - Years to develop
Wow: ICSI has < 10% DER