Clustering Algorithms - PowerPoint PPT Presentation

About This Presentation
Title:

Clustering Algorithms

Description:

Introduction to Hierarchical Clustering Analysis Dinh Dong Luong Introduction Data clustering concerns how to group a set of objects based on their similarity of ... – PowerPoint PPT presentation

Slides: 25
Provided by: Johan171
Transcript and Presenter's Notes

1
Introduction to Hierarchical Clustering Analysis
Dinh Dong Luong
2
Introduction
  • Data clustering concerns how to group a set of
    objects based on their similarity of attributes
    and/or their proximity in the vector space.
  • Main methods
    • Partitioning: K-Means
    • Hierarchical: BIRCH, ROCK, ...
    • Density-based: DBSCAN, ...
  • A good clustering method will produce high
    quality clusters with
    • high intra-class similarity (cohesive within
      clusters)
    • low inter-class similarity (distinctive between
      clusters)
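One way to make "cohesive within clusters, distinctive between clusters" concrete is to compare the mean pairwise distance inside a cluster with the mean distance across two clusters. The sketch below is only an illustration: the toy points, the function names, and the choice of Euclidean distance are assumptions, not part of the slides.

```python
import math
from itertools import combinations, product

def mean_intra_distance(cluster):
    """Mean Euclidean distance over all pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def mean_inter_distance(c1, c2):
    """Mean Euclidean distance over all pairs with one point from each cluster."""
    pairs = list(product(c1, c2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

# Toy 2-D data, made up for illustration.
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

intra_a = mean_intra_distance(cluster_a)
inter_ab = mean_inter_distance(cluster_a, cluster_b)
print(f"intra(A) = {intra_a:.3f}, inter(A, B) = {inter_ab:.3f}")
```

For a good clustering under this measure, the intra-cluster value should be clearly smaller than the inter-cluster value.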

3
Stages in clustering
4
Clustering Algorithms
  • A. Distance and Similarity Measures
  • B. Hierarchical Clustering
    • Agglomerative: single linkage, complete linkage,
      group average linkage, median linkage, centroid
      linkage, balanced iterative reducing and
      clustering using hierarchies (BIRCH), clustering
      using representatives (CURE), robust clustering
      using links (ROCK)
    • Divisive: divisive analysis (DIANA), monothetic
      analysis (MONA)

5
Distance and Similarity Measures
6
Similarity Measurements
  • Pearson Correlation

Two profiles (vectors) x = (x_1, ..., x_n) and y = (y_1, ..., y_n)
-1 ≤ Pearson Correlation ≤ 1
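The formula on this slide did not survive in the transcript. As a hedged stand-in, the sketch below computes the standard sample Pearson correlation of two equal-length profiles; the function name and the example vectors are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Undefined (division by zero) if either profile is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Co-variation of the two profiles around their means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Same trend at a different scale, and an exactly opposite trend.
r_same_trend = pearson([1, 2, 3, 4], [10, 20, 30, 40])
r_opposite = pearson([1, 2, 3, 4], [4, 3, 2, 1])
print(r_same_trend, r_opposite)
```

The two examples sit at the ends of the range: a shared trend gives a value near 1 and an opposite trend a value near -1, regardless of scale.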
7
Similarity Measurements
  • Pearson Correlation: Trend Similarity

8
Similarity Measurements
  • Euclidean Distance

9
Similarity Measurements
  • Euclidean Distance: Absolute difference
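A small sketch of why Euclidean distance reflects absolute difference rather than trend. The three profiles are invented for illustration: one has the same shape as the first but is shifted upward, the other has the opposite trend but similar magnitudes.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

profile_a = [1.0, 2.0, 3.0]
profile_b = [11.0, 12.0, 13.0]  # same trend as profile_a, shifted by 10
profile_c = [3.0, 2.0, 1.0]     # opposite trend, similar magnitudes

d_ab = euclidean(profile_a, profile_b)
d_ac = euclidean(profile_a, profile_c)
print(f"d(a, b) = {d_ab:.3f}, d(a, c) = {d_ac:.3f}")
```

Here the shifted profile ends up farther from profile_a than the reversed one, even though it follows the same trend.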

10
Similarity Measurements
  • Cosine Correlation

-1 ≤ Cosine Correlation ≤ 1
11
Similarity Measurements
  • Cosine Correlation: Trend and Mean Distance
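The cosine formula is also missing from the transcript. The sketch below uses the standard definition (dot product divided by the product of the vector norms); the function name and example profiles are assumptions. Because the profiles are not mean-centered, a shift in the mean changes the result, which is one reading of "trend and mean distance".

```python
import math

def cosine(x, y):
    """Cosine of the angle between two equal-length profiles.

    Undefined (division by zero) if either profile is all zeros.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

cos_scaled = cosine([1, 2, 3], [2, 4, 6])      # same direction, rescaled
cos_shifted = cosine([1, 2, 3], [11, 12, 13])  # same trend, shifted mean
print(cos_scaled, cos_shifted)
```

The rescaled profile gives a cosine of 1, while the shifted profile gives a value below 1 even though its trend is identical.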

12
Similarity Measurements
13
Similarity Measurements
Similar?
14
Taxonomy of Clustering Approaches
15
Hierarchical Clustering
  • Agglomerative clustering treats each data point
    as a singleton cluster, and then successively
    merges clusters until all points have been merged
    into a single remaining cluster. Divisive
    clustering works the other way around: it starts
    with all points in one cluster and successively
    splits clusters.

16
Hierarchical Clustering
  • Calculate the similarity between all possible
    combinations of two profiles.
  • Group the two most similar clusters together to
    form a new cluster.
  • Calculate the similarity between the new cluster
    and all remaining clusters.
  • Keys: similarity, clustering
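The steps above can be sketched as a naive loop. This is a minimal illustration, not the presenter's implementation: the function names, the toy points, and the default choice of single linkage over Euclidean distance are all assumptions, and the cluster distances are simply recomputed on every pass instead of being updated incrementally.

```python
import math

def single_link(c1, c2):
    """Assumed default: smallest point-to-point distance between two clusters."""
    return min(math.dist(a, b) for a in c1 for b in c2)

def agglomerate(points, cluster_distance=single_link):
    """Merge the closest pair of clusters until one remains.

    Returns the merge history as (cluster, cluster, distance) tuples.
    """
    clusters = [[p] for p in points]  # every point starts as a singleton
    history = []
    while len(clusters) > 1:
        # Find the most similar (closest) pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the pair with the merged cluster; its distances to the
        # remaining clusters are recomputed on the next pass.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
history = agglomerate(points)
for left, right, d in history:
    print(left, "+", right, f"at distance {d:.3f}")
```

Passing a different `cluster_distance` function swaps the linkage rule without changing the loop.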
17
General agglomerative clustering
18
Clustering
Merge which pair of clusters?
(Figure: three candidate clusters C1, C2 and C3.)
19
Clustering
Single Linkage
Dissimilarity between two clusters = minimum
dissimilarity between the members of the two clusters.
(Figure: clusters C1 and C2.)
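A minimal sketch of single linkage, assuming Euclidean distance as the dissimilarity between members. The clusters `c1` and `c2` are toy data invented for illustration.

```python
import math

def single_linkage(c1, c2):
    """Minimum pairwise distance between members of the two clusters."""
    return min(math.dist(a, b) for a in c1 for b in c2)

c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]

d_single = single_linkage(c1, c2)
print(d_single)  # closest pair is (0, 0)-(3, 0), so this prints 3.0
```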
20
Clustering
Complete Linkage
Dissimilarity between two clusters = maximum
dissimilarity between the members of the two clusters.
(Figure: clusters C1 and C2.)
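A minimal sketch of complete linkage, again assuming Euclidean distance between members and using invented toy clusters.

```python
import math

def complete_linkage(c1, c2):
    """Maximum pairwise distance between members of the two clusters."""
    return max(math.dist(a, b) for a in c1 for b in c2)

c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]

d_complete = complete_linkage(c1, c2)
print(f"{d_complete:.3f}")  # farthest pair is a diagonal, e.g. (0, 0)-(3, 2)
```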
21
Clustering
Average Linkage
Dissimilarity between two clusters = average
distance over all pairs of objects (one from each
cluster).
(Figure: clusters C1 and C2.)
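A minimal sketch of average linkage, assuming Euclidean distance and invented toy clusters: every cross-cluster pair contributes equally to the mean.

```python
import math

def average_linkage(c1, c2):
    """Mean distance over all pairs with one object from each cluster."""
    total = sum(math.dist(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]

d_average = average_linkage(c1, c2)
print(f"{d_average:.3f}")
```

On this toy data the result lies between the smallest and largest cross-cluster distances, as an average must.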
22
Clustering
Average Group Linkage
Dissimilarity between two clusters = distance
between the two cluster means.
(Figure: clusters C1 and C2.)
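A minimal sketch of the rule on this slide, assuming Euclidean distance: each cluster is reduced to its mean point before measuring. The helper names and toy clusters are assumptions made for illustration.

```python
import math

def cluster_mean(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def mean_distance_linkage(c1, c2):
    """Distance between the two cluster means."""
    return math.dist(cluster_mean(c1), cluster_mean(c2))

c1 = [(0.0, 0.0), (0.0, 2.0)]
c2 = [(3.0, 0.0), (3.0, 2.0)]

d_means = mean_distance_linkage(c1, c2)
print(d_means)  # means are (0, 1) and (3, 1), so this prints 3.0
```

Unlike the pairwise average, this measure only looks at the two mean points, so it can give a different value on the same clusters.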
23
My Idea Presentation
24
Future Work
  • Step 1: Use a simple hierarchical algorithm with
    moment features, then run it and evaluate the
    clustering results.
  • Step 2: Find good features for clustering on our
    dataset by trying several feature variants
    (Haar-like, shape quantization, ...).
  • Step 3: Choose an optimal hierarchical clustering
    algorithm.