Title: Classification
1Classification & Clustering
- Pieter Spronck
- http://www.cs.unimaas.nl/p.spronck
2Binary Division of Marbles
3Big vs. Small
4Transparent vs. Opaque
5Marble Attributes
- Size (big vs. small)
- Transparency (transparent vs. opaque)
- Shininess (shiny vs. dull)
- Colouring (monochrome vs. polychrome)
- Colour (blue, green, yellow, ...)
6Grouping of Marbles
7Marbles
8Honouring All Distinctions
9Colour Coding
10Natural Grouping
11Types of Clusters
- Uniquely classifying clusters
- Overlapping clusters
- Probabilistic clusters
- Dendrograms
12Uniquely Classifying Clusters
13Overlapping Clusters
14Probabilistic Clustering
15Dendrogram
[Figure: dendrogram with branch labels transparent, opaque, not clear, clear]
16Classification
- Ordering of entities into groups based on their similarity
- Minimisation of within-group variance
- Maximisation of between-group variance
- Exhaustive and exclusive
- Principal technique: clustering
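The two variance criteria can be made concrete with a small numeric sketch. The marble diameters and both groupings below are invented for illustration and do not come from the slides; `variance_split` is a hypothetical helper that returns the within-group and between-group sums of squares.

```python
from statistics import mean

def variance_split(groups):
    """Return (within, between) sums of squares for a dict of label -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Within-group: squared distance of each value to its own group mean.
    within = sum((v - mean(values)) ** 2
                 for values in groups.values() for v in values)
    # Between-group: squared distance of each group mean to the grand mean,
    # weighted by group size.
    between = sum(len(values) * (mean(values) - grand_mean) ** 2
                  for values in groups.values())
    return within, between

# Hypothetical marble diameters in millimetres (assumption, not slide data).
good = {"small": [14.0, 15.0, 16.0], "big": [24.0, 25.0, 26.0]}
poor = {"a": [14.0, 25.0, 16.0], "b": [24.0, 15.0, 26.0]}

good_within, good_between = variance_split(good)
poor_within, poor_between = variance_split(poor)
print(good_within, good_between)
print(poor_within, poor_between)
```

Both groupings cover the same six values, so the total variation is identical; the better grouping simply moves more of it from the within-group term to the between-group term.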
17Reasons for Classification
- Descriptive power
- Parsimony
- Maintainability
- Versatility
- Identification of distinctive attributes
18Typology vs. Taxonomy
- Typology: conceptual
- Taxonomy: empirical
19Typology
- Define conceptual attributes
- Select appropriate attributes
- Create typology matrix (substruction)
- Insert empirical entities in matrix
- Extend matrix if necessary
- Reduce matrix if necessary
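The steps above can be sketched in a few lines. This is only an illustration under assumptions: the attribute domains are the marble attributes from the slides, but the marbles `m1` to `m4`, the value "translucent", and the helper names are invented, and `reduce_matrix` merely drops empty cells as a crude stand-in for the reduction techniques.

```python
from itertools import product

# Conceptual attributes and their domains (from the marble example).
attributes = {
    "size": ["big", "small"],
    "colouring": ["monochrome", "polychrome"],
    "shininess": ["shiny", "dull"],
    "transparency": ["transparent", "opaque"],
}

def build_typology(attributes):
    """Substruction: one empty cell for every combination of attribute values."""
    return {combo: [] for combo in product(*attributes.values())}

def insert_entity(matrix, attributes, name, entity):
    """Place an entity in its cell; a missing cell extends the matrix."""
    cell = tuple(entity[a] for a in attributes)
    matrix.setdefault(cell, []).append(name)
    return cell

def reduce_matrix(matrix):
    """Keep only cells that contain at least one entity (crude reduction)."""
    return {cell: names for cell, names in matrix.items() if names}

matrix = build_typology(attributes)
cells_before = len(matrix)

# Hypothetical marbles (assumption, not slide data).
marbles = {
    "m1": ("small", "monochrome", "shiny", "transparent"),
    "m2": ("small", "monochrome", "shiny", "transparent"),
    "m3": ("big", "polychrome", "dull", "opaque"),
    "m4": ("big", "polychrome", "shiny", "translucent"),  # value outside the domain
}
for name, values in marbles.items():
    insert_entity(matrix, attributes, name, dict(zip(attributes, values)))

reduced = reduce_matrix(matrix)
print(cells_before, len(matrix), len(reduced))
```

Four binary attributes give 16 cells; the marble with an unforeseen transparency value forces one extra cell, and the reduction keeps only the three occupied cells.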
20Defining Conceptual Attributes
- Meaningful
- Focus on ideal types
- Order of importance
- Exhaustive domains
21Conceptual Marble Attributes
22Typology Matrix
23Matrix Extension
24Reduction
- Functional reduction
- Pragmatic reduction
- Numerical reduction
- Reduction by using criterion types
25Functional Reduction
26Functionally Reduced Matrix
27Pragmatic Reduction
28Pragmatically Reduced Matrix
29Criticising Typological Classification
- Reification
- Resilience
- Problematic attribute selection
- Unmanageability
30Taxonomy
- Define empirical attributes
- Select appropriate attributes
- Create entity matrix
- Apply clustering technique
- Analyse clusters
31Empirical Attributes
32Selecting Attributes
- Size (big/small)
- Colour (yellow, green, blue, red, white)
- Colouring (monochrome/polychrome)
- Shininess (shiny/dull)
- Transparency (transparent/opaque)
- Glass colour (clear, green, ...)
33Entity Matrix
34Automatic Clustering Parameters
- Agglomerative vs. divisive
- Monothetic vs. polythetic
- Outliers permitted
- Limits to number of clusters
- Form of linkage (single, complete, average)
- ...
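The forms of linkage differ only in how the distance between two clusters is derived from the distances between their members. A minimal sketch, assuming Hamming distance over nominal attributes (the slides do not name a distance measure) and two invented clusters of marbles:

```python
from itertools import product

def hamming(a, b):
    """Number of attributes on which two instances differ."""
    return sum(x != y for x, y in zip(a, b))

def linkage_distance(cluster_a, cluster_b, form="single"):
    """Distance between two clusters under the given form of linkage."""
    distances = [hamming(a, b) for a, b in product(cluster_a, cluster_b)]
    if form == "single":      # closest pair of members
        return min(distances)
    if form == "complete":    # farthest pair of members
        return max(distances)
    if form == "average":     # mean over all pairs of members
        return sum(distances) / len(distances)
    raise ValueError(f"unknown form of linkage: {form}")

# Hypothetical clusters (size, colouring, shininess, transparency).
left = [("small", "monochrome", "shiny", "transparent"),
        ("small", "monochrome", "shiny", "opaque")]
right = [("big", "polychrome", "dull", "opaque"),
         ("small", "polychrome", "dull", "opaque")]

for form in ("single", "complete", "average"):
    print(form, linkage_distance(left, right, form))
```

An agglomerative method would repeatedly merge the two clusters with the smallest linkage distance, so the choice of form changes which clusters are merged first.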
35Automatic Clustering
36Polythetic to Monothetic
NNN polychrome, dull, opaque
NYYN small, monochrome, shiny, opaque
NYYY small, monochrome, shiny, transparent
NYY polychrome, shiny, transparent
37Analysing Clusters
small, monochrome, shiny, transparent
small, monochrome, shiny, opaque
polychrome, dull, opaque
Stone
polychrome, shiny, transparent
Vanilla
Classic
Tiger
38Criticising Taxonomical Classification
- Dependent on specimens
- Difficult to generalise
- Difficult to label
- Biased towards academic discipline
- Not the last word
39Typology vs. Taxonomy
40Operational Classification
- Typology (conceptual)
- Taxonomy (empirical)
- Operational typology (conceptual + empirical)
41Automated Clustering Methods
- Iterative distance-based clustering: the k-means method
- Incremental clustering: the Cobweb method
- Probability-based clustering: the EM algorithm
42k-Means Method
- Iterative distance-based clustering
- Divisive
- Polythetic
- Predefined number of clusters (k)
- Outliers permitted
43k-Means (pass 1)
k = 2; attributes: size (big/small), colouring (monochrome/polychrome), shininess (shiny/dull), transparency (transparent/opaque)
?
?
44k-Means (pass 2)
k = 2; attributes: size (big/small), colouring (monochrome/polychrome), shininess (shiny/dull), transparency (transparent/opaque)
Cluster average: small, polychrome, dull, opaque
Cluster average: small, monochrome, shiny, transparent
45k-Means (pass 3)
k = 2; attributes: size (big/small), colouring (monochrome/polychrome), shininess (shiny/dull), transparency (transparent/opaque)
Cluster average: big, polychrome, dull, opaque
?
Cluster average: small, monochrome, shiny, transparent
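The passes can be imitated in code. The sketch below rests on several assumptions that the slides do not state: distance is the number of differing attributes, the "cluster average" is the most frequent value per attribute, the six marbles are invented, and the two starting centres are fixed by hand rather than chosen at random so that the run is reproducible.

```python
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two instances differ."""
    return sum(x != y for x, y in zip(a, b))

def cluster_average(members):
    """Most frequent value per attribute (ties: first value encountered)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*members))

def assign(instances, centres):
    """Put every instance in the cluster of its nearest centre."""
    clusters = [[] for _ in centres]
    for inst in instances:
        nearest = min(range(len(centres)), key=lambda i: hamming(inst, centres[i]))
        clusters[nearest].append(inst)
    return clusters

def k_means(instances, centres, max_passes=10):
    """Alternate assignment and re-averaging until the centres stop moving."""
    centres = list(centres)
    for _ in range(max_passes):
        clusters = assign(instances, centres)
        new_centres = [cluster_average(m) if m else c
                       for m, c in zip(clusters, centres)]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, assign(instances, centres)

# Hypothetical marbles (size, colouring, shininess, transparency).
marbles = [
    ("small", "monochrome", "shiny", "transparent"),
    ("small", "monochrome", "shiny", "transparent"),
    ("small", "monochrome", "shiny", "opaque"),
    ("big", "polychrome", "dull", "opaque"),
    ("small", "polychrome", "dull", "opaque"),
    ("big", "polychrome", "dull", "opaque"),
]
centres, clusters = k_means(marbles, [marbles[2], marbles[4]])
print(centres)
```

With these inputs the centres settle after two passes; a different choice of starting centres can lead to a different result, which is the usual sensitivity of k-means to its initial setup.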
46Cobweb Algorithm
- Incremental clustering
- Agglomerative
- Polythetic
- Dynamic number of clusters
- Outliers permitted
47Cobweb Procedure
- Builds a tree by adding instances to it
- Uses a Category Utility function to determine the quality of the clustering
- Changes the tree structure if this positively influences the Category Utility (by merging nodes or splitting nodes)
- Cutoff value may be used to group sufficiently similar instances together
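The central decision of the procedure, where to put a new instance, can be sketched at a single level of the tree. This is a deliberate simplification and an assumption on my part: it keeps a flat list of clusters, tries the new instance in every existing cluster and in a new cluster of its own, and keeps whichever option scores the highest Category Utility. Merging, splitting, the cutoff value, and the tree itself are left out; the marbles are invented.

```python
from collections import Counter

def category_utility(clusters):
    """CU over all attributes for a flat partition (list of lists of tuples)."""
    instances = [inst for cluster in clusters for inst in cluster]
    n, n_attrs = len(instances), len(instances[0])

    def squared_sum(members):
        # Sum over attributes and values of Pr[attribute = value]^2.
        return sum((count / len(members)) ** 2
                   for i in range(n_attrs)
                   for count in Counter(m[i] for m in members).values())

    base = squared_sum(instances)
    weighted = sum(len(c) / n * (squared_sum(c) - base) for c in clusters)
    return weighted / len(clusters)

def add_instance(clusters, instance):
    """Try the instance in each cluster and on its own; keep the best option."""
    candidates = [
        [c + [instance] if j == i else c for j, c in enumerate(clusters)]
        for i in range(len(clusters))
    ]
    candidates.append(clusters + [[instance]])
    return max(candidates, key=category_utility)

# Hypothetical marbles (size, colouring, shininess, transparency).
s1 = ("small", "monochrome", "shiny", "transparent")
s2 = ("big", "polychrome", "dull", "opaque")
s3 = ("small", "monochrome", "shiny", "opaque")
s4 = ("big", "polychrome", "dull", "opaque")

clusters = []
for marble in (s1, s2, s3, s4):
    clusters = add_instance(clusters, marble)
print(clusters)
```

Because instances are added one at a time, the outcome can depend on the order in which they arrive; the merge and split operations of the full procedure exist to soften that dependence.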
48Category Utility
- Measure for quality of clustering
- The better the predictive value of the average
attribute values of the instances in the clusters
for the individual attribute values, the higher
the CU will be
49Category Utility for Size (1)
C1
C2
$CU = \frac{d\,\left((a^2 - c^2) + (e^2 - g^2)\right) + h\,\left((b^2 - c^2) + (f^2 - g^2)\right)}{2} = 0$
50Category Utility for Size (2)
C1
C2
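$CU = \frac{d\,\left((a^2 - c^2) + (e^2 - g^2)\right) + h\,\left((b^2 - c^2) + (f^2 - g^2)\right)}{2} = \frac{\frac{1}{2}\left(\frac{1}{3} - \frac{1}{3}\right) + \frac{1}{2}\left(-\frac{1}{9} + \frac{5}{9}\right)}{2} = \frac{1}{9}$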
51Category Utility for Size (3)
C1
C2
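a) $\Pr[\text{size} = \text{big} \mid C_1] = 1$
b) $\Pr[\text{size} = \text{big} \mid C_2] = 0$
c) $\Pr[\text{size} = \text{big}] = 1/3$
d) $\Pr[C_1] = 1/3$
e) $\Pr[\text{size} = \text{small} \mid C_1] = 0$
f) $\Pr[\text{size} = \text{small} \mid C_2] = 1$
g) $\Pr[\text{size} = \text{small}] = 2/3$
h) $\Pr[C_2] = 2/3$
$CU = \frac{d\,\left((a^2 - c^2) + (e^2 - g^2)\right) + h\,\left((b^2 - c^2) + (f^2 - g^2)\right)}{2} = \frac{\frac{1}{3}\left(\frac{8}{9} - \frac{4}{9}\right) + \frac{2}{3}\left(-\frac{1}{9} + \frac{5}{9}\right)}{2} = \frac{2}{9}$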
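The Category Utility values for size can be checked mechanically. The sketch below computes CU for a single nominal attribute with exact fractions. The three partitions are assumptions: they are the smallest sets of six marbles I could construct that reproduce the results 0, 1/9 and 2/9, and the slides may have used different marbles.

```python
from collections import Counter
from fractions import Fraction

def category_utility(clusters):
    """CU for one nominal attribute; clusters is a list of non-empty value lists."""
    all_values = [v for cluster in clusters for v in cluster]
    n = len(all_values)
    overall = Counter(all_values)
    total = Fraction(0)
    for cluster in clusters:
        within = Counter(cluster)
        # Sum over values of Pr[value | cluster]^2 - Pr[value]^2.
        gain = sum(Fraction(within[v], len(cluster)) ** 2
                   - Fraction(overall[v], n) ** 2
                   for v in overall)
        total += Fraction(len(cluster), n) * gain  # weight by Pr[cluster]
    return total / len(clusters)

# Hypothetical partitions of two big and four small marbles.
uninformative = [["big", "small", "small"], ["big", "small", "small"]]
mixed = [["big", "big", "small"], ["small", "small", "small"]]
pure = [["big", "big"], ["small", "small", "small", "small"]]

for name, partition in (("uninformative", uninformative),
                        ("mixed", mixed), ("pure", pure)):
    print(name, category_utility(partition))
```

The score rises as cluster membership becomes a better predictor of size: nothing is gained when both clusters mirror the overall distribution, and the most is gained when each cluster contains a single size.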
52Cobweb Example
Attributes: size (big/small), colouring (monochrome/polychrome), shininess (shiny/dull), transparency (transparent/opaque)
53Cobweb Result Example
Attributes: size (big/small), colouring (monochrome/polychrome), shininess (shiny/dull), transparency (transparent/opaque)
54Cobweb Numerical
- Probability of values of attributes of instances in a cluster is based on the standard deviation from the estimate for the mean value
- Acuity is presumed variance in attribute values
55Disadvantages of Previous Methods
- Fast and hard to judge
- Dependent on initial setup
- Ad-hoc limitations
- Hard to escape from local minima
56Probability-based Clustering
- Finite mixture models
- Each cluster is defined by a vector of probabilities for instances to have certain values for their attributes, and a probability for instances to reside in the cluster.
- Clustering equals searching for optimal sets of probabilities for a sample set
57Expectation-Maximisation (EM)
- Probability-based clustering
- Divisive
- Polythetic
- Predefined number of clusters (k)
- Outliers permitted
58EM Procedure
- Select k cluster vectors randomly
- Calculate cluster probabilities for each instance (under the assumption that the instance attributes are independent)
- Use calculations to re-estimate values
- Repeat until increase in quality becomes negligible
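The procedure can be sketched for binary attributes as a mixture of independent yes/no probabilities. Several choices here are assumptions rather than anything the slides prescribe: attributes are encoded as 0/1, "quality" is taken to be the log-likelihood of the data, the random start is drawn from the range 0.25 to 0.75 with a fixed seed, and the six marbles are invented.

```python
import math
import random

def joint_probabilities(x, weights, probs):
    """Pr[instance and cluster] per cluster, assuming independent attributes."""
    return [w * math.prod(p if bit else 1 - p for bit, p in zip(x, ps))
            for w, ps in zip(weights, probs)]

def em(data, k, seed=0, tol=1e-6, max_iter=200):
    rng = random.Random(seed)
    n_attrs = len(data[0])
    # Select k cluster vectors randomly; start with equal cluster probabilities.
    weights = [1 / k] * k
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_attrs)] for _ in range(k)]
    history = []  # log-likelihood per iteration
    for _ in range(max_iter):
        # Expectation: cluster probabilities for each instance.
        joint = [joint_probabilities(x, weights, probs) for x in data]
        history.append(sum(math.log(sum(row)) for row in joint))
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break  # increase in quality has become negligible
        resp = [[v / sum(row) for v in row] for row in joint]
        # Maximisation: re-estimate values from the weighted instances.
        for c in range(k):
            total = sum(r[c] for r in resp)
            weights[c] = total / len(data)
            probs[c] = [sum(r[c] * x[j] for r, x in zip(resp, data))
                        / max(total, 1e-12)
                        for j in range(n_attrs)]
    return weights, probs, history

# Hypothetical marbles encoded as (big, monochrome, shiny, transparent).
data = [(0, 1, 1, 1), (0, 1, 1, 1), (0, 1, 1, 0),
        (1, 0, 0, 0), (0, 0, 0, 0), (1, 0, 0, 0)]
weights, probs, history = em(data, k=2)
print(weights)
print(probs)
```

The log-likelihood never decreases from one iteration to the next, but the point it converges to depends on the random start, so several runs with different seeds are commonly compared.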
59EM Result Example
p(C1) = 0.2: p(big) = 0.6, p(monochrome) = 0.3, p(shiny) = 0.4, p(transparent) = 0.4
p(C2) = 0.8: p(big) = 0.2, p(monochrome) = 0.8, p(shiny) = 0.9, p(transparent) = 0.5
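These probabilities can be used to compute how strongly a given marble belongs to each cluster. The sketch below applies Bayes' rule under the independence assumption; the cluster parameters are the ones listed in the result example, while the marble being classified (big, polychrome, dull, opaque) and the function name `membership` are my own choices for illustration.

```python
# Cluster parameters from the EM result example.
clusters = {
    "C1": {"prior": 0.2, "big": 0.6, "monochrome": 0.3,
           "shiny": 0.4, "transparent": 0.4},
    "C2": {"prior": 0.8, "big": 0.2, "monochrome": 0.8,
           "shiny": 0.9, "transparent": 0.5},
}

def membership(marble):
    """Probability per cluster for a marble given as attribute -> True/False."""
    joint = {}
    for name, params in clusters.items():
        p = params["prior"]
        for attribute, present in marble.items():
            # Independent attributes: multiply the per-attribute probabilities.
            p *= params[attribute] if present else 1 - params[attribute]
        joint[name] = p
    total = sum(joint.values())
    return {name: p / total for name, p in joint.items()}

# Hypothetical marble: big, polychrome, dull, opaque.
marble = {"big": True, "monochrome": False, "shiny": False, "transparent": False}
result = membership(marble)
print(result)
```

Although C2 is four times as likely a priori, this marble ends up in C1 with a probability of roughly 0.95, because its attribute values are far more typical of C1.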
60The Essence of Classification
- A successful classification defines fundamental characteristics
- A classification can never be better than the attributes it is based upon
- There is no magic formula