Clustering Algorithms

About This Presentation

Description:

Introduction to Hierarchical Clustering Analysis, by Dinh Dong Luong. Introduction: Data clustering concerns how to group a set of objects based on their similarity of ...

Slides: 25

Provided by: Johan171

Transcript and Presenter's Notes

1
Introduction to Hierarchical Clustering Analysis
Dinh Dong Luong
2
Introduction

Data clustering concerns how to group a set of
objects based on their similarity of attributes
and/or their proximity in the vector space.
Main methods:
Partitioning: K-Means
Hierarchical: BIRCH, ROCK, ...
Density-based: DBSCAN, ...
A good clustering method will produce high-quality
clusters with:
high intra-class similarity (cohesive within
clusters)
low inter-class similarity (distinctive between
clusters)
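
A minimal sketch of the cohesion/distinctiveness idea, assuming Euclidean distance as the (inverse) measure of similarity. The helper names and the sample points are hypothetical and only serve the illustration: a good clustering should show a small average distance inside each cluster and a large average distance between clusters.

```python
from itertools import combinations, product
from math import dist
from statistics import mean


def mean_intra_distance(cluster):
    """Average pairwise distance between members of one cluster (cohesion)."""
    pairs = list(combinations(cluster, 2))
    # A singleton cluster has no pairs; treat it as perfectly cohesive.
    return mean(dist(a, b) for a, b in pairs) if pairs else 0.0


def mean_inter_distance(cluster_a, cluster_b):
    """Average distance over all cross-cluster pairs (distinctiveness)."""
    return mean(dist(a, b) for a, b in product(cluster_a, cluster_b))


# Hypothetical 2-D sample: two tight, well-separated groups.
c1 = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
c2 = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

print("intra c1:", mean_intra_distance(c1))
print("intra c2:", mean_intra_distance(c2))
print("inter   :", mean_inter_distance(c1, c2))
```

Small intra-cluster values combined with a large inter-cluster value correspond to the "high intra-class, low inter-class similarity" criterion.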

3
Stages in clustering
4
Clustering Algorithms

A. Distance and Similarity Measures
B. Hierarchical Clustering
Agglomerative:
single linkage, complete linkage, group average
linkage, median linkage, centroid linkage,
balanced iterative reducing and clustering using
hierarchies (BIRCH), clustering using
representatives (CURE), robust clustering using
links (ROCK)
Divisive:
divisive analysis (DIANA), monothetic analysis
(MONA)

5
Distance and Similarity Measures
6
Similarity Measurements

Pearson Correlation

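Two profiles (vectors) $x = (x_1, \dots, x_N)$
and $y = (y_1, \dots, y_N)$

$$r(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}$$

$-1 \le$ Pearson Correlation $\le 1$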
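
The Pearson correlation of two profiles can be sketched in plain Python as follows. This is an illustrative implementation of the standard definition, not code from the presentation; the function name and the error handling are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: co-variation of the two profiles around their means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the two centered vector lengths.
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


print(pearson([1, 2, 3], [2, 4, 6]))   # same trend
print(pearson([1, 2, 3], [3, 2, 1]))   # opposite trend
```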
7
Similarity Measurements

Pearson Correlation: Trend Similarity

8
Similarity Measurements

Euclidean Distance
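
A short sketch of the Euclidean distance between two profiles, written out explicitly rather than using a library call. The function name is an assumption made for this example.

```python
from math import sqrt


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    # Square root of the summed squared coordinate differences.
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


print(euclidean([0.0, 0.0], [3.0, 4.0]))
```

Unlike a correlation, this value is a dissimilarity: 0 means identical profiles, and there is no upper bound.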

9
Similarity Measurements

Euclidean Distance: Absolute Difference

10
Similarity Measurements

Cosine Correlation

-1 ≤ Cosine Correlation ≤ 1
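
A minimal sketch of the cosine correlation, assuming the standard definition (dot product divided by the product of the vector lengths). The function name is hypothetical.

```python
from math import sqrt


def cosine(x, y):
    """Cosine of the angle between two equal-length numeric profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine is undefined for an all-zero profile")
    return dot / (norm_x * norm_y)


print(cosine([1.0, 2.0], [2.0, 4.0]))    # same direction
print(cosine([1.0, 0.0], [0.0, 1.0]))    # orthogonal
print(cosine([1.0, 2.0], [-1.0, -2.0]))  # opposite direction
```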
11
Similarity Measurements

Cosine Correlation: Trend + Mean Distance
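
One way to see how the three measures differ is to compare a profile with a copy of itself shifted by a constant. The sketch below uses the fact that the Pearson correlation equals the cosine of the mean-centered profiles; the sample data and helper names are assumptions for illustration only.

```python
from math import dist, sqrt


def cosine(x, y):
    """Cosine of the angle between two profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))


def center(x):
    """Subtract the profile mean from every value."""
    m = sum(x) / len(x)
    return [a - m for a in x]


def pearson(x, y):
    """Pearson correlation expressed as the cosine of centered profiles."""
    return cosine(center(x), center(y))


base = [1.0, 2.0, 3.0, 4.0]
shifted = [v + 10.0 for v in base]  # same trend, different mean level

print("Pearson  :", pearson(base, shifted))  # ignores the offset
print("Cosine   :", cosine(base, shifted))   # reduced by the offset
print("Euclidean:", dist(base, shifted))     # grows with the offset
```

Pearson sees only the trend, Euclidean distance sees only the absolute difference, and the cosine responds to both the trend and the difference in mean level.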

12
Similarity Measurements
13
Similarity Measurements
Similar?
14
Taxonomy of Clustering Approaches
15
Hierarchical Clustering

Agglomerative clustering treats each data point
as a singleton cluster, and then successively
merges clusters until all points have been merged
into a single remaining cluster. Divisive
clustering works the other way around.

16
Hierarchical Clustering
Calculate the similarity between all possible
combinations of two profiles.

Keys: Similarity, Clustering

The two most similar clusters are grouped together
to form a new cluster.
Calculate the similarity between the new cluster
and all remaining clusters.
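
The steps above can be sketched as a naive agglomerative loop. This is a simplified illustration, not the presenter's implementation: it assumes Euclidean distance, uses single linkage as the default cluster dissimilarity, and recomputes every pairwise cluster dissimilarity on each pass instead of updating only the entries for the new cluster. All names are hypothetical.

```python
from itertools import combinations
from math import dist


def single_link(a, b):
    """Smallest distance between a member of a and a member of b."""
    return min(dist(p, q) for p in a for q in b)


def agglomerate(points, linkage=single_link):
    """Merge clusters pairwise until one remains; return the merge history."""
    clusters = [[p] for p in points]  # every point starts as a singleton
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest dissimilarity.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        height = linkage(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with the new one.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
merges = agglomerate(points)
for left, right, height in merges:
    print(left, "+", right, "at", round(height, 3))
```

The recorded merge heights are what a dendrogram would plot; passing a different `linkage` function changes which pair is considered most similar.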
17
General agglomerative clustering
18
Clustering
Merge which pair of clusters?
(Figure: three clusters C1, C2, C3)
19
Clustering
Single Linkage
Dissimilarity between two clusters = minimum
dissimilarity between the members of the two clusters

(Figure: clusters C1 and C2)
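
A minimal sketch of single linkage, assuming Euclidean distance as the member-to-member dissimilarity. The clusters `c1` and `c2` are hypothetical sample data.

```python
from math import dist


def single_linkage(c1, c2):
    """Minimum distance over all pairs with one member from each cluster."""
    return min(dist(p, q) for p in c1 for q in c2)


c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]
print(single_linkage(c1, c2))
```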
20
Clustering
Complete Linkage
Dissimilarity between two clusters = maximum
dissimilarity between the members of the two clusters

(Figure: clusters C1 and C2)
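
A minimal sketch of complete linkage, again assuming Euclidean distance between members. The sample clusters are hypothetical.

```python
from math import dist


def complete_linkage(c1, c2):
    """Maximum distance over all pairs with one member from each cluster."""
    return max(dist(p, q) for p in c1 for q in c2)


c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]
print(complete_linkage(c1, c2))
```

On this data the farthest pair is a diagonal one, so the result is larger than the gap between the nearest members.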
21
Clustering
Average Linkage
Dissimilarity between two clusters = averaged
distances of all pairs of objects (one from each
cluster).

(Figure: clusters C1 and C2)
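
A minimal sketch of average linkage, assuming Euclidean distance and an unweighted mean over all cross-cluster pairs. The sample clusters are hypothetical.

```python
from math import dist
from statistics import mean


def average_linkage(c1, c2):
    """Mean distance over all pairs with one member from each cluster."""
    return mean(dist(p, q) for p in c1 for q in c2)


c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]
print(average_linkage(c1, c2))
```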
22
Clustering
Average Group Linkage
Dissimilarity between two clusters = distance
between the two cluster means.

(Figure: clusters C1 and C2)
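
A minimal sketch of the distance between two cluster means, a rule that is commonly also called centroid linkage. It assumes Euclidean distance and points given as coordinate tuples; the helper names and sample clusters are hypothetical.

```python
from math import dist


def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def centroid_linkage(c1, c2):
    """Distance between the means of the two clusters."""
    return dist(centroid(c1), centroid(c2))


c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]
print(centroid(c1), centroid(c2), centroid_linkage(c1, c2))
```

Only the two means enter the calculation, so the spread of points inside each cluster does not affect the result.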
23
My Idea Presentation
24
Future Work

Step 1: Use a simple hierarchical algorithm with
moment features to run and evaluate clustering
results.
Step 2: Find good features for clustering on
our dataset by trying some feature variants
(Haar-like, shape quantization, ...).
Step 3: Choose an optimal hierarchical clustering
algorithm.
