Title: Spectral Clustering and Embedding with Hidden Markov Models
Spectral Clustering and Embedding with Hidden Markov Models
Tony Jebara, Yingbo Song, Kapil Thadani
Department of Computer Science, Columbia University
Outline
- Unsupervised learning: parametric vs. nonparametric
- Density estimation: parametric vs. nonparametric
- Semi-parametric likelihood (NIPS 07)
- Clustering: parametric vs. nonparametric
- Expectation maximization
- Spectral clustering
- Semi-parametric clustering (ECML 07)
- Probability product kernels (PPK)
- Hidden Markov model kernel
- Spectral clustering on PPK, and results
- Multidimensional scaling on PPK, and results
- Future/upcoming work
Unsupervised Learning
- Parametric methods (sufficient statistics, exponential family)
  - Do not grow with data
  - Density estimation: maximum likelihood
  - Clustering: expectation maximization
  - Visualization: hidden variables (GTM)
  - Models: mixtures, Bayes nets, hidden Markov models
- Nonparametric frequentist methods
  - Grow with data
  - Density estimation: Parzen windows, L1 fitting, infinite mixtures
  - Clustering: spectral clustering
  - Visualization: kNN, multidimensional scaling, LLE
  - Models: kernels, distance metrics, graphs on data
Density Estimation
- Density estimation, most generally: given samples, find the underlying density
- Nonparametric assumes samples are independently distributed (i.d.)
- Parametric assumes samples are independent and identically distributed (i.i.d.)
- Can we combine the two? Semi-parametric density estimation (NIPS)
- A kernel pulls the models together
Density Estimation
- Nonparametric estimate
- Parametric estimate
- Semi-parametric estimate
- A probability kernel pulls the models together (standard forms of the first two estimates are sketched below)
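For reference, the standard textbook forms of the first two estimates, with $k(\cdot,\cdot)$ a Parzen kernel; the semi-parametric estimate that interpolates between them is developed in the NIPS paper:

$$\hat{p}_{\text{nonparametric}}(x) = \frac{1}{N}\sum_{n=1}^{N} k(x, x_n), \qquad \hat{p}_{\text{parametric}}(x) = p(x \mid \hat{\theta}), \quad \hat{\theta} = \arg\max_{\theta} \sum_{n=1}^{N} \log p(x_n \mid \theta)$$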
Probability Product Kernel
- A natural similarity measure between two distributions
- To compute the kernel for a pair of inputs:
  1) Estimate the densities by maximum likelihood (ML)
  2) Compute the kernel between the two estimated densities (definition below)
- The probability product kernel uses either power beta = 1 or beta = 1/2
- Non-negative; the beta = 1/2 case equals 1 if and only if the two densities are identical
- Measures the overlap of two distributions; pulls pairs together
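For reference, the probability product kernel between two densities $p$ and $p'$ is

$$K(p, p') = \int p(x)^{\beta}\, p'(x)^{\beta}\, dx$$

- With $\beta = 1$ this is the expected likelihood kernel; with $\beta = 1/2$ it is the Bhattacharyya affinity, for which $K(p, p') \le 1$ with equality if and only if $p = p'$.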
Probability Product Kernel
- For the exponential family $p(x \mid \theta) = \exp\left(A(x) + \theta^{\top} T(x) - \mathcal{K}(\theta)\right)$
- The kernel has a closed form (sketched below)
- For the Gaussian case, this recovers the RBF kernel
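A sketch of the closed form for $\beta = 1/2$, written in terms of the log-partition function $\mathcal{K}(\theta)$ of the exponential family above:

$$K(\theta, \theta') = \exp\!\left(\mathcal{K}\!\left(\tfrac{\theta + \theta'}{2}\right) - \tfrac{1}{2}\mathcal{K}(\theta) - \tfrac{1}{2}\mathcal{K}(\theta')\right)$$

- For unit-variance Gaussians ($\theta = \mu$, $\mathcal{K}(\theta) = \|\theta\|^2/2$) this reduces to the RBF form $\exp(-\|\mu - \mu'\|^2/8)$.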
Probability Product Kernel
- For hidden Markov models, the kernel must sum over all pairs of hidden state paths
- The brute-force kernel is therefore exponential work
Probability Product Kernel
- Instead of the brute-force cross product, use forward-backward style recursions
- Only compute elementary sub-kernels Ψ for common parents
- Form clique functions and sum them via the junction tree algorithm
Probability Product Kernel
- PPK for two Gaussian HMMs with S and U states
- Get an S x U interaction table between all pairs of emissions
- Then simple pseudo-code sweeps the state prior and transition terms (a sketch follows below)
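A minimal Python sketch of this recursion (not the authors' code), assuming beta = 1/2, a fixed sequence length T, and a hypothetical dict-based HMM format with keys 'pi' (state prior), 'A' (transition matrix), 'means' and 'covs' (per-state Gaussian emission parameters):

```python
import numpy as np

def gauss_sub_kernel(m1, c1, m2, c2):
    """Closed-form integral of sqrt(N(m1, c1) * N(m2, c2)) over x."""
    p1, p2 = np.linalg.inv(c1), np.linalg.inv(c2)
    a = 0.5 * (p1 + p2)                          # fused precision
    b = 0.5 * (p1 @ m1 + p2 @ m2)                # fused linear term
    log_k = (0.5 * b @ np.linalg.solve(a, b)
             - 0.25 * (m1 @ p1 @ m1 + m2 @ p2 @ m2)
             - 0.25 * np.linalg.slogdet(c1)[1]
             - 0.25 * np.linalg.slogdet(c2)[1]
             - 0.5 * np.linalg.slogdet(a)[1])
    return np.exp(log_k)

def ppk_hmm(h1, h2, T):
    """PPK (beta = 1/2) between two Gaussian HMMs over length-T sequences."""
    S, U = len(h1['pi']), len(h2['pi'])
    # S x U table of emission sub-kernels between all state pairs
    psi = np.array([[gauss_sub_kernel(h1['means'][s], h1['covs'][s],
                                      h2['means'][u], h2['covs'][u])
                     for u in range(U)] for s in range(S)])
    B1, B2 = np.sqrt(h1['A']), np.sqrt(h2['A'])  # transitions raised to beta
    phi = psi.copy()                             # table at the final time step
    for _ in range(T - 1):                       # sweep back to time 1
        phi = psi * (B1 @ phi @ B2.T)
    # contract against the state priors raised to beta
    return float(np.sum(np.sqrt(np.outer(h1['pi'], h2['pi'])) * phi))
```

The loop costs O(T(S²U + SU²)) per pair, versus the exponential cost of enumerating all pairs of state paths.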
Clustering
- Parametric clustering (EM mixture model)
  - Local minima
  - Strict shape assumptions
- Nonparametric clustering (spectral cut, max-cut)
  - Global optimum
  - No parametric assumptions
  - Instead, kernel tweaking
- Semi-parametric clustering (probability kernel pulls models together)
  - Makes a parametric (Markov) assumption about each datum, but not about overall cluster shapes
Parametric EM Clustering
- Parametric clustering:
  - E-step: given two models (one per class), get the responsibility of each model for $x_n$: $r_{nc} = \pi_c\, p(x_n \mid \theta_c) / \sum_{c'} \pi_{c'}\, p(x_n \mid \theta_{c'})$
  - M-step: maximize the expected complete likelihood: $\theta_c \leftarrow \arg\max_{\theta} \sum_n r_{nc} \log p(x_n \mid \theta)$
- What if each x is a sequence? Cluster with two HMM models.
- Just extend EM to an HMM mixture with a hidden state trellis (Alon & Sclaroff)
Parametric EM Clustering
- EM clustering works well if we have a true mixture
- Problem: what if we don't have a mixture of 2 Gaussians or HMMs?
- Example: the sequences come from two slowly drifting HMMs
Nonparametric Spectral Clustering
- Spectral clustering is agnostic about the shape of clusters!
- A popular variant is stabilized spectral clustering (Ng, Jordan & Weiss)
- Get the top eigenvectors of the normalized Laplacian $L = D^{-1/2} A D^{-1/2}$
- Usually use an RBF affinity (a sketch of the full procedure follows below)
- What if each datum is a time series? Can use the Yin-Yang kernel
- But how do we use parametric assumptions on each datum? For example, extend so each datum is a 2-state HMM?
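A minimal sketch of the Ng-Jordan-Weiss procedure with an RBF affinity, assuming vector-valued data; function and parameter names are illustrative:

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def njw_spectral_clustering(X, k, sigma=1.0):
    """Cluster the rows of X into k groups via the normalized Laplacian."""
    A = np.exp(-cdist(X, X, 'sqeuclidean') / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)                 # zero self-affinity
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = D_inv_sqrt @ A @ D_inv_sqrt          # L = D^{-1/2} A D^{-1/2}
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    V = vecs[:, -k:]                         # top-k eigenvectors of L
    V = V / np.linalg.norm(V, axis=1, keepdims=True)  # row-normalize
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)
```

Row-normalizing the spectral embedding before k-means is the stabilization step of Ng, Jordan & Weiss.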
Motion Capture
- Rotating walk/run motion data
- Each sequence is a 2-state HMM
- But each cluster's shape is circular
Spectral Clustering with PPK
- For each time series:
  1) Parametrically learn an HMM
  2) Compute the kernel between all pairs of HMMs
  3) Run nonparametric spectral clustering or embedding (MDS)
Spectral Clustering with PPK
- Algorithm for spectral clustering of HMM models (a sketch follows below)
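A minimal sketch of the full SC-PPK loop, reusing ppk_hmm from the earlier sketch; fit_hmm is a hypothetical stand-in for any EM-based HMM trainer (e.g., Baum-Welch) and is not part of the paper:

```python
import numpy as np
from sklearn.cluster import KMeans

def sc_ppk(sequences, n_clusters, n_states=2, T=10):
    # step 1: fit one HMM per sequence by maximum likelihood
    # (fit_hmm: hypothetical helper returning the dict format of ppk_hmm)
    hmms = [fit_hmm(seq, n_states) for seq in sequences]
    n = len(hmms)
    K = np.zeros((n, n))
    for i in range(n):                      # step 2: PPK Gram matrix
        for j in range(i, n):
            K[i, j] = K[j, i] = ppk_hmm(hmms[i], hmms[j], T)
    # step 3: treat K as the affinity; normalize, embed, k-means
    d = K.sum(axis=1)
    L = K / np.sqrt(np.outer(d, d))         # D^{-1/2} K D^{-1/2}
    _, vecs = np.linalg.eigh(L)
    V = vecs[:, -n_clusters:]
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(V)
```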
Clustering MOCAP
- Starting with a single movie of walking and running
- Generate several rotated versions of each
- Two clusters of sequences: walk and run
- Used 2-state Gaussian HMMs in SC-PPK
- Get 2 circular clusters; better than EM or the time-series kernel
Clustering MOCAP
- Built a dataset from sequences of motion capture
- Two motion categories mixed, with several sequences of each (1 sequence = a 123-dimensional time series)
- Used 2-state Gaussian-emission HMMs
- Spectral clustering to predict classes
- The number in parentheses is the subject
Clustering Arabic Characters
- Dataset consists of example sequences of two different characters
- About 20-30 examples per class
- Each sequence is a 2-dimensional time series
- Used 2-state Gaussian-emission HMMs
Clustering Sign Language
- Sign language dataset; each sign is a time series
- Two categories of expressions
- Used multi-state HMMs with Gaussian emissions
Clustering Network Traces
- Clustering network hosts in the Columbia CS department
- Features: packets per port per hour, over 24 hours
- Fit an HMM to each host, then cluster the HMMs
- Example cluster below (hosts in the cluster and their packet volume)
- All are web servers, NFS servers, or database servers.
( 1) 128.59.20.66    zinc.cs.columbia.edu.        num packets: 75,707,059
( 2) 128.59.20.227   planetlab2.cs.columbia.edu.  num packets: 43,710,510
( 3) 128.59.21.157   bagpipe.cs.columbia.edu.     num packets: 42,139,618
( 4) 128.59.16.20    cs.columbia.edu.             num packets: 39,047,751
( 5) 128.59.16.108   hellfire.cs.columbia.edu.    num packets: 39,019,003
( 6) 128.59.23.17    manycore.cs.columbia.edu.    num packets: 38,135,241
( 7) 128.59.22.220   nemo.cs.columbia.edu.        num packets: 26,873,532
( 8) 128.59.18.100   ober.cs.columbia.edu.        num packets: 25,070,903
( 9) 128.59.22.184   db-pc03.cs.columbia.edu.     num packets: 24,431,779
(10) 128.59.16.101   ground.cs.columbia.edu.      num packets: 23,581,185
(11) 128.59.16.145   flame.cs.columbia.edu.       num packets: 19,861,350
(12) 128.59.21.33    bosch.cs.columbia.edu.       num packets: 17,715,535
Clustering MOCAP: Runtime
- SC-PPK is faster: it runs EM once per HMM, independently for each sequence
- The subsequent spectral clustering step is O(N^3)
- EM and k-means clustering must instead iterate HMM training
Embedding MOCAP
- MDS embedding of rotated walking and running (a sketch of the embedding step follows below)
- a) Yin-Yang kernel
- b) Probability product kernel
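A minimal sketch of the embedding step, assuming a classical MDS / kernel-PCA construction on a precomputed, symmetric positive semi-definite Gram matrix K of pairwise PPK values; names are illustrative:

```python
import numpy as np

def mds_embed(K, dim=2):
    """Embed items into dim coordinates from a Gram/affinity matrix K."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H                           # double-center the Gram matrix
    w, v = np.linalg.eigh(Kc)                # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]          # keep the top-dim components
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

Plotting the two returned columns yields a 2-D map like the ones described on these slides.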
Embedding Arabic & ASL
- MDS embedding using the PPK of:
- a) Arabic dataset: each character is a time series datum of spatial coordinates
- b) Sign language dataset: each datum is a time series of hand movement coordinates
Conclusions
- Semi-parametric methods explore the waters between the complementary parametric and nonparametric approaches
- Semi-parametric clustering avoids shape assumptions on clusters but keeps assumptions on each datum
- A novel semi-parametric likelihood interpolates between i.d. nonparametric and i.i.d. parametric
- It encourages model agreement via the probability product kernel
- This also gives rise to a new clustering criterion:
  1) Fit each datum with parametric maximum likelihood
  2) Compute kernels between the models
  3) Solve for spectral clustering or embedding
- Next: the semi-parametric density estimation aspect (NIPS); iteratively maximize likelihood to avoid overfitting HMMs
- Next: a sneak peek at some new applications