Title: Clustering with k-means and mixture of Gaussian densities
Slide 1: Clustering with k-means and mixture of Gaussian densities
- Jakob Verbeek
- December 4, 2009
Slide 2: Plan for this course
- Introduction to machine learning
- Clustering techniques
- k-means, Gaussian mixture density
- Gaussian mixture density continued
- Parameter estimation with EM, Fisher kernels
- Classification techniques 1
- Introduction, generative methods, semi-supervised
- Classification techniques 2
- Discriminative methods, kernels
- Decomposition of images
- Topic models
Slide 3: Clustering
- Finding a group structure in the data
- Data in one cluster similar to each other
- Data in different clusters dissimilar
- Map each data point to a discrete cluster index
- Flat methods find k groups (k known, or automatically set)
- Hierarchical methods define a tree structure over the data
Slide 4: Hierarchical Clustering
- Data set is partitioned into a tree structure
- Top-down construction
- Start with all data in one cluster (the root node)
- Apply flat clustering into k groups
- Recursively cluster the data in each group
- Bottom-up construction
- Start with each point in a separate cluster
- Recursively merge closest clusters
- Distance between clusters A and B
- Min, max, or mean distance between x in A and y in B (see the sketch below)
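As an illustration of the bottom-up construction, a minimal sketch using scipy's agglomerative clustering; the linkage methods 'single', 'complete', and 'average' correspond to the min, max, and mean inter-cluster distances above (the data here is synthetic and purely illustrative):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic 2-D data: two well-separated blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),
               rng.normal(5, 1, (50, 2))])

# Bottom-up clustering: recursively merge the closest clusters.
# method='single'   -> min distance between x in A and y in B
# method='complete' -> max distance
# method='average'  -> mean distance
Z = linkage(X, method='average')

# Cut the tree to obtain a flat clustering with 2 groups
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)
```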
Slide 5: Clustering example
- Learn face similarity from training pairs labeled as same/different
- Cluster faces based on identity
- Example: Picasa web albums, label face clusters
Guillaumin, Verbeek, Schmid, ICCV 2009
Slide 6: Clustering example: visual words
Slide 7: Clustering for visual vocabulary construction
- Clustering of local image descriptors
- Most often done using k-means or a mixture of Gaussians
- Divides the space of region descriptors into a collection of non-overlapping cells
- Recap of the image representation pipeline
- Extract image regions at different locations and scales: randomly, on a regular grid, or using an interest point detector
- Compute a descriptor for each region (e.g. SIFT)
- Assign each descriptor to a cluster center (see the sketch after this list)
- Or do soft assignment or multiple assignment
- Make a histogram for the complete image
- Possibly separate histograms for different image regions
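A minimal sketch of the assignment and histogram steps, assuming descriptors and cluster centers are given as numpy arrays (the function and variable names are illustrative):

```python
import numpy as np

def bow_histogram(descriptors, centers):
    """Hard-assign each descriptor to its nearest cluster center and
    build a normalized bag-of-words histogram for the image."""
    # Pairwise squared distances, shape (n_descriptors, n_centers)
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assignments = d2.argmin(axis=1)                    # nearest center index
    hist = np.bincount(assignments, minlength=len(centers))
    return hist / hist.sum()                           # normalize to sum to 1
```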
Slide 8: Definition of k-means clustering
- Given a data set of N points x_n, n = 1, ..., N
- Goal: find K cluster centers m_k, k = 1, ..., K
- Clustering: assign each point to the closest center
- Error criterion: sum of squared distances of the data points to their closest cluster centers
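Written out, the error criterion the slide refers to is

$$ E(\{m_k\}) = \sum_{n=1}^{N} \min_{k \in \{1,\dots,K\}} \| x_n - m_k \|^2 . $$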
Slide 9: Examples of k-means clustering
- Data uniformly sampled in R^2
- Data non-uniformly sampled in R^3
Slide 10: Minimizing the error function
- The error function is
- non-differentiable, due to the min operator
- non-convex, i.e. there are local minima
- Minimization can be done with an iterative algorithm (sketched below)
1) Initialize cluster centers
2) Assign each data point to the nearest center
3) Update the cluster centers as the mean of the associated data
4) If the cluster centers changed, return to step 2)
5) Return the cluster centers
- The iterations monotonically decrease the error function
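A minimal numpy sketch of this iteration (Lloyd's algorithm); initializing the centers at K randomly chosen data points is one common choice, an assumption not prescribed by the slide:

```python
import numpy as np

def kmeans(X, K, max_iter=100, seed=0):
    """Iteratively minimize the sum of squared distances to the
    nearest center. X: (N, d) data array, K: number of clusters."""
    rng = np.random.default_rng(seed)
    # 1) Initialize centers with K randomly chosen data points
    centers = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(max_iter):
        # 2) Assign each point to the nearest center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        z = d2.argmin(axis=1)
        # 3) Update each center as the mean of its assigned points
        new_centers = np.array([X[z == k].mean(axis=0) if np.any(z == k)
                                else centers[k] for k in range(K)])
        # 4) Stop once the centers no longer change
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # 5) Return centers and assignments
    return centers, z
```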
Slide 11: Iteratively minimizing the error function
- Introduce latent variables z_n, with values in {1, ..., K}
- z_n assigns data point x_n to one of the clusters
- Upper bound on the error function, without the min operator
- Error function and bound are equal for the min assignment
- Minimize the bound w.r.t. the cluster centers
- Update the cluster centers as the mean of the associated data
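Reconstructing the bound the slide refers to: for any assignment z = (z_1, ..., z_N),

$$ E(\{m_k\}) = \sum_{n=1}^{N} \min_k \|x_n - m_k\|^2 \;\le\; F(\{m_k\}, z) = \sum_{n=1}^{N} \|x_n - m_{z_n}\|^2 , $$

with equality when $z_n = \arg\min_k \|x_n - m_k\|^2$. For fixed z, the bound F is quadratic in each $m_k$ and is minimized by $m_k = \frac{1}{N_k} \sum_{n: z_n = k} x_n$, the mean of the data assigned to cluster k, where $N_k$ is the number of points assigned to it.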
Slide 12: Iteratively minimizing the error function
- Minimization can be done with an iterative algorithm
- Assign each data point to the nearest center: this constructs a tight bound on the error function
- Update the cluster centers as the mean of the associated data: this minimizes the bound
- An example of iterative bound optimization
- The EM algorithm is another example
Slide 13: Examples of k-means clustering
- Several iterations with two centers
[Figure: cluster assignments over the iterations, with the error function value per iteration]
Slide 14: Examples of k-means clustering
- Clustering the RGB vectors of the pixels in an image
- Compression of an image file of N pixels: originally N x 24 bits
- Store the RGB values of the cluster centers: K x 24 bits
- Store the cluster index of each pixel: N x log2(K) bits
[Figure: images compressed with k-means for several values of K; the three panels are labeled 16.7, 8.3, and 4.2]
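These labels are consistent with compressed size as a percentage of the original: ignoring the K x 24 bits for the centers, which are negligible for large N, the compressed size relative to the original is

$$ \frac{N \log_2 K}{24\,N} = \frac{\log_2 K}{24} , $$

which gives about 16.7% for K = 16, 8.3% for K = 4, and 4.2% for K = 2 (an assumption about the panel settings, not stated on the slide).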
Slide 15: Clustering with Gaussian mixture density
- Each cluster represented by Gaussian density
- Center, as in k-means
- Covariance matrix: spread of the cluster around the center
The Gaussian density, reconstructed from the slide's annotations, is

$$ \mathcal{N}(x; m, C) = (2\pi)^{-d/2}\, |C|^{-1/2} \exp\!\left( -\tfrac{1}{2} (x - m)^{\top} C^{-1} (x - m) \right) , $$

with data dimension d, the determinant |C| of the covariance matrix C, and a quadratic function of the point x and the mean m in the exponent.
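A minimal numerical check of this density with scipy (all values illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

m = np.zeros(2)                              # mean (cluster center)
C = np.array([[2.0, 0.5],
              [0.5, 1.0]])                   # covariance matrix
x = np.array([1.0, -0.5])                    # a query point

# Density of x under N(m, C), via scipy
p = multivariate_normal(mean=m, cov=C).pdf(x)

# The same value from the formula above
d = len(m)
quad = (x - m) @ np.linalg.inv(C) @ (x - m)  # quadratic function of x and m
p_manual = (2 * np.pi) ** (-d / 2) * np.linalg.det(C) ** (-0.5) * np.exp(-0.5 * quad)
assert np.isclose(p, p_manual)
```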
Slide 16: Clustering with Gaussian mixture density
- The mixture density is a weighted sum of Gaussians
- Mixing weight: the importance of each cluster
- The density has to integrate to 1, so we require the mixing weights to be non-negative and to sum to one
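In formulas, with mixing weights $\pi_k$:

$$ p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x; m_k, C_k), \qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1 . $$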
Slide 17: Clustering with Gaussian mixture density
- Given a data set of N points x_n, n = 1, ..., N
- Find the mixture of Gaussians (MoG) that best explains the data
- Assigns maximum likelihood to the data
- Assume the data points are drawn independently from the MoG
- Maximize the log-likelihood of the fixed data set X w.r.t. the parameters of the MoG
- As with k-means, the objective function has local optima
- Can use the Expectation-Maximization (EM) algorithm
- Similar to the iterative k-means algorithm
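The objective being maximized is the data log-likelihood

$$ \mathcal{L}(\theta) = \sum_{n=1}^{N} \log p(x_n) = \sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n; m_k, C_k), \qquad \theta = \{\pi_k, m_k, C_k\}_{k=1}^{K} . $$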
Slide 18: Assignment of data points to clusters
- As with k-means, z_n indicates the cluster index for x_n
- To sample a point from the MoG (sketched below):
- Select cluster index k with probability given by the mixing weight
- Sample a point from the k-th Gaussian
- The MoG is recovered if we marginalize over the unknown index
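Marginalizing over the index recovers the mixture: $p(x) = \sum_k p(z = k)\, p(x \mid z = k)$. A minimal sampling sketch following the two steps above (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.5, 0.3, 0.2])                     # mixing weights, sum to 1
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
covs = np.stack([np.eye(2)] * 3)                   # one covariance per cluster

def sample_mog(n):
    """Draw n points: first a cluster index with probability pi,
    then a point from that cluster's Gaussian."""
    z = rng.choice(len(pi), size=n, p=pi)          # cluster indices
    X = np.array([rng.multivariate_normal(means[k], covs[k]) for k in z])
    return X, z

X, z = sample_mog(500)
```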
Slide 19: Soft assignment of data points to clusters
- Given a data point x_n, infer the value of z_n
- Conditional probability of z_n given x_n
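By Bayes' rule, the posterior responsibility of cluster k for point $x_n$ is

$$ p(z_n = k \mid x_n) = \frac{\pi_k \, \mathcal{N}(x_n; m_k, C_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n; m_j, C_j)} . $$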
Slide 20: Maximum likelihood estimation of a Gaussian
- Given data points x_n, n = 1, ..., N
- Find the Gaussian that maximizes the data log-likelihood
- Set the derivatives of the data log-likelihood w.r.t. the parameters to zero
- The parameters are set to the data mean and data covariance
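Setting these derivatives to zero gives the familiar closed-form estimates, the sample mean and sample covariance:

$$ m = \frac{1}{N} \sum_{n=1}^{N} x_n, \qquad C = \frac{1}{N} \sum_{n=1}^{N} (x_n - m)(x_n - m)^{\top} . $$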
Slide 21: Maximum likelihood estimation of MoG
- Use the EM algorithm (a sketch follows below)
- Initialize the MoG parameters, or the soft assignments
- E-step: softly assign the data points to the clusters
- M-step: update the cluster parameters
- Repeat the EM steps, terminate if converged
- Convergence of parameters or assignments
- E-step: compute the posterior on z given x
- M-step: update the Gaussians from the data points, weighted by the posteriors
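A minimal numpy sketch of these EM steps, using the posterior formula from slide 19 and posterior-weighted versions of the slide-20 estimates; the initialization, fixed iteration count, and small covariance regularizer are simple illustrative choices, not prescribed by the slides:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_mog(X, K, n_iter=100, seed=0):
    """Fit a K-component mixture of Gaussians to X (N, d) with EM."""
    N, d = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)                        # uniform mixing weights
    means = X[rng.choice(N, size=K, replace=False)] # init means at data points
    covs = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * K)
    for _ in range(n_iter):
        # E-step: responsibilities q[n, k] = p(z_n = k | x_n)
        q = np.stack([pi[k] * multivariate_normal(means[k], covs[k]).pdf(X)
                      for k in range(K)], axis=1)
        q /= q.sum(axis=1, keepdims=True)
        # M-step: update parameters from posterior-weighted data
        Nk = q.sum(axis=0)                          # effective cluster sizes
        pi = Nk / N
        means = (q.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (q[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
    return pi, means, covs
```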
Slide 22: Maximum likelihood estimation of MoG
- Example of several EM iterations
Slide 23: Clustering with k-means and MoG
- Hard assignment in k-means is not robust near the borders of quantization cells
- Soft assignment in MoG accounts for ambiguity in the assignment
- Both algorithms are sensitive to initialization
- Run from several initializations
- Keep the best result
- The number of clusters needs to be set
- Both algorithms can be generalized to other types of distances or densities
Images from van Gemert et al., IEEE TPAMI, 2010
Slide 24: Plan for this course
- Introduction to machine learning
- Clustering techniques
- k-means, Gaussian mixture density
- Reading for next week
- Neal and Hinton, "A view of the EM algorithm that justifies incremental, sparse, and other variants", in Learning in Graphical Models, 1998
- Part of chapter 3 of my thesis
- Both available on the course website: http://lear.inrialpes.fr/verbeek/teaching
- Gaussian mixture density continued
- Parameter estimation with EM, Fisher kernels
- Classification techniques 1
- Introduction, generative methods, semi-supervised
- Classification techniques 2
- Discriminative methods, kernels