Cluster Analysis - PowerPoint PPT Presentation

About This Presentation
Title:

Cluster Analysis

Slides: 10
Provided by: jeann173

Transcript and Presenter's Notes

Title: Cluster Analysis


1
Cluster Analysis
  • Objectives
    • Address heterogeneity
    • Combine observations into groups or clusters such
      that the groups formed are homogeneous (similar)
      within the group and heterogeneous (different)
      from other groups on some variables (?).
    • When we don't have "some variables", we can still
      form groups using Multidimensional Scaling (MDS)
      techniques.
    • MDS: continuous space
    • Cluster: discrete groups
  • Main application in marketing: market
    segmentation
  • Data requirement: generally interval or ratio
    (ordinal and nominal??)
  • Steps
    • Decide on measures of distance (similarity or
      dissimilarity)
    • Hierarchical clustering: decide on how to combine
      observations
    • Non-hierarchical clustering (K-means or quick
      cluster)
    • Interpretation of clusters
    • How many clusters?
    • Cluster validation

2
Cluster Analysis: Measures of Distance
Similarity or Dissimilarity
  • Two types of measures of distance (or proximity,
    similarity)
    • Direct: we shall use these in MDS
    • Indirect: derived from original variables or
      factor scores
  • Indirect measures of distance
    • Non-metric: we shall use these in MDS
    • Metric data
      • Euclidean distance
      • Minkowski distance
      • Mahalanobis distance
  • Distance between BMW and Ford
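
The three metric distances named above can be written out in their standard textbook form. The slide's own formulas did not survive in the transcript, so the notation below is an assumption: x_im is the value of variable m for observation i, n is the Minkowski exponent, and S is the covariance matrix of the variables.

```latex
% i = BMW, j = Ford; m = 1, ..., k runs over the k variables
% (the index name m, the exponent n, and the matrix S are assumed notation)
\begin{align}
  D_{ij}^{\text{Euclidean}}   &= \sqrt{\sum_{m=1}^{k} (x_{im} - x_{jm})^{2}} \\
  D_{ij}^{\text{Minkowski}}   &= \Bigl( \sum_{m=1}^{k} \lvert x_{im} - x_{jm} \rvert^{n} \Bigr)^{1/n} \\
  D_{ij}^{\text{Mahalanobis}} &= \sqrt{(\mathbf{x}_i - \mathbf{x}_j)^{\top} \mathbf{S}^{-1} (\mathbf{x}_i - \mathbf{x}_j)}
\end{align}
```

Setting n = 2 in the Minkowski distance gives the Euclidean distance, and the Mahalanobis distance reduces to the Euclidean distance when S is the identity matrix.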

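i = BMW, j = Ford, k = number of variables
[Formulas: Euclidean, Minkowski, and Mahalanobis distance]
[Figure: Euclidean distance (ED) between two points on axes v1 and v2]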
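
A minimal sketch of the three metric distances for two observations measured on two variables (v1, v2). The coordinates for BMW and Ford and the covariance matrix are made-up illustration values, not data from the presentation, and the helper `mahalanobis_2d` is a hypothetical name that handles only the two-variable case.

```python
import math

def euclidean(x, y):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def minkowski(x, y, n):
    """Generalized distance; n=1 is city-block, n=2 is Euclidean."""
    return sum(abs(a - b) ** n for a, b in zip(x, y)) ** (1.0 / n)

def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance for two variables, given a 2x2 covariance matrix."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    # Closed-form inverse of a 2x2 matrix.
    inv = [[s22 / det, -s12 / det], [-s21 / det, s11 / det]]
    d0, d1 = x[0] - y[0], x[1] - y[1]
    quad = d0 * (inv[0][0] * d0 + inv[0][1] * d1) + d1 * (inv[1][0] * d0 + inv[1][1] * d1)
    return math.sqrt(quad)

# Hypothetical scores on (v1, v2); illustration only.
bmw = (5.0, 1.0)
ford = (2.0, 5.0)
identity = [[1.0, 0.0], [0.0, 1.0]]
unequal_variances = [[4.0, 0.0], [0.0, 1.0]]  # assumed covariance matrix

print(euclidean(bmw, ford))
print(minkowski(bmw, ford, 1))
print(mahalanobis_2d(bmw, ford, identity))
print(mahalanobis_2d(bmw, ford, unequal_variances))
```

With the identity matrix the Mahalanobis distance matches the Euclidean distance. With unequal variances, differences on the high-variance variable count for less.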
3
Cluster Analysis: Hierarchical Clustering
  • Methods to combine observations
    • Centroid
    • Nearest neighbor or single linkage
    • Farthest neighbor or complete linkage
    • Average linkage
    • Ward's method
  • Centroid method

[Figure: dendrogram of observations s1 to s6 plotted against distance; panel label: Nearest neighbor]
Data should be scaled?
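
The linkage rules listed on this slide differ only in how the distance between two clusters is measured. The sketch below is a deliberately simple agglomerative procedure covering single, complete, and average linkage (not centroid or Ward's). The six points standing in for s1 to s6 are made-up values, and the function names are hypothetical.

```python
import math
from itertools import combinations

def linkage_distance(a, b, points, method):
    """Distance between clusters a and b (tuples of point indices)."""
    dists = [math.dist(points[i], points[j]) for i in a for j in b]
    if method == "single":      # nearest neighbor
        return min(dists)
    if method == "complete":    # farthest neighbor
        return max(dists)
    if method == "average":     # average linkage
        return sum(dists) / len(dists)
    raise ValueError(f"unknown method: {method}")

def agglomerate(points, method="single"):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(pair[0], pair[1], points, method),
        )
        d = linkage_distance(a, b, points, method)
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
        merges.append((a, b, d))
    return merges

# Hypothetical coordinates for s1..s6; illustration only.
points = [(0, 0), (0, 1), (5, 0), (5, 1.5), (10, 0), (10, 4)]

for a, b, d in agglomerate(points, "single"):
    print(a, b, round(d, 3))
```

The merge distances in the returned history are the heights at which branches would join in a dendrogram. Once two observations are merged they are never separated again.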
4
Cluster Analysis: Non-Hierarchical Clustering
  • K-means cluster / quick cluster
  • The data are divided into k groups, each group
    representing a cluster
  • Steps
    • Select k initial cluster centroids, where k is
      the number of clusters desired
    • Assign each observation to the cluster to which
      it is closest
    • Reassign or relocate each observation to one of
      the k clusters according to a predetermined
      stopping rule

Say we want 3 clusters and the first 3 observations
are the initial centroids.
Change criterion: continue if > 2
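
The K-means steps can be sketched as follows, using the first k observations as the initial centroids as in the example. The data, the tolerance `tol`, and the cap `max_iter` are assumptions made for illustration. The slide's own change criterion is not reproduced, and the loop simply stops once the centroids stop moving.

```python
import math

def kmeans(points, k, tol=1e-9, max_iter=100):
    """Basic K-means; the first k observations seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assign each observation to its closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        # Stopping rule (assumed): largest centroid movement is within tol.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids

# Hypothetical observations; the first three act as initial centroids.
data = [(1, 1), (9, 9), (5, 1), (1.2, 0.8), (8.8, 9.2), (5.1, 1.1), (0.9, 1.1)]
labels, centroids = kmeans(data, k=3)
print(labels)
print(centroids)
```

Unlike the hierarchical procedure, an observation here can move to a different cluster on any pass, but k has to be chosen before the run.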
Which Clustering Method is Best?
1. Hierarchical: which one to use?
   Advantage: no prior knowledge of the number of clusters needed.
   Disadvantage: once assigned, no reassignment.
2. K-means / quick cluster: requires prior knowledge of how many
   clusters.
Complementary: run hierarchical, decide on the number of clusters,
then run K-means.
5
Interpretation of Clusters
  • Pseudo F
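
The slide names the pseudo F statistic without defining it. A common definition, assumed here, is the ratio of the between-cluster mean square to the within-cluster mean square, where larger values indicate tighter and better-separated clusters. The function name and the data are illustrative only.

```python
def pseudo_f(points, labels):
    """Pseudo F = (between SS / (k - 1)) / (within SS / (n - k))."""
    n = len(points)
    groups = sorted(set(labels))
    k = len(groups)
    dim = len(points[0])
    grand_mean = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for g in groups:
        members = [p for p, lab in zip(points, labels) if lab == g]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Between-cluster sum of squares, weighted by cluster size.
        between += len(members) * sum((c - m) ** 2 for c, m in zip(centroid, grand_mean))
        # Within-cluster sum of squares around the cluster centroid.
        within += sum(sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical one-variable data with two candidate groupings.
values = [(0.0,), (2.0,), (10.0,), (12.0,)]
print(pseudo_f(values, [0, 0, 1, 1]))  # compact, well-separated grouping
print(pseudo_f(values, [0, 1, 0, 1]))  # poor grouping
```

Comparing the statistic across solutions with different numbers of clusters is one way to approach the "how many clusters" question.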

6
Cluster Analysis: Validation
Cross-validation

S1: assignment based on the cluster solution for cases 1-14
S2: assignment based on a separate cluster solution
Example from text
Hit rate: 112/151 = 74%
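
The hit rate is the share of cases that receive the same cluster under both assignments. A minimal sketch, assuming the cluster labels in S1 and S2 have already been matched to one another; the label lists below are constructed only to reproduce the 112-of-151 count and are not the textbook data.

```python
def hit_rate(first, second):
    """Proportion of cases given the same label in both assignments."""
    if len(first) != len(second):
        raise ValueError("assignments must cover the same cases")
    hits = sum(1 for a, b in zip(first, second) if a == b)
    return hits / len(first)

# Constructed labels: 112 agreements out of 151 cases.
s1 = [0] * 151
s2 = [0] * 112 + [1] * 39
print(f"{hit_rate(s1, s2):.0%}")
```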
7
  • Latent Segments Model to Incorporate
    Heterogeneity

8
Introduction
  • Customer segmentation: partition consumers into
    homogeneous groups that differ in purchasing
    behavior
  • It provides information about consumer
    preferences and market structure at the segment
    level
  • Consumers with similar socio-demographics can
    have different purchasing behavior
  • Brand choice probabilities can be used to define
    both market segments and market structure
  • Theoretical model: multinomial logit
    • Conceptual appeal: grounded in economic theory
    • Analytical tractability and ease of econometric
      estimation
    • Excellent empirical performance
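
In a multinomial logit model, the probability of choosing a brand is the exponential of its utility divided by the sum of the exponentials over all brands. A latent segment model then averages these within-segment probabilities, weighted by segment size. The sketch below shows only this arithmetic, not estimation. The utilities, the segment sizes, and the function names are hypothetical.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit choice probabilities for one segment."""
    top = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def latent_segment_probs(segment_sizes, segment_utilities):
    """Market-level choice probabilities as a size-weighted mix of segments."""
    n_brands = len(segment_utilities[0])
    mixed = [0.0] * n_brands
    for size, utils in zip(segment_sizes, segment_utilities):
        for j, p in enumerate(mnl_probs(utils)):
            mixed[j] += size * p
    return mixed

# Hypothetical two-segment market with three brands.
sizes = [0.3, 0.7]                 # segment shares, summing to 1
utilities = [[1.0, 0.0, -1.0],     # segment 1 has a clear brand ranking
             [0.0, 0.0, 0.0]]      # segment 2 is indifferent
print(mnl_probs(utilities[0]))
print(latent_segment_probs(sizes, utilities))
```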

9
  • Kamakura and Russell (1989) propose and test
    latent segmentation.
  • Numerous applications and citations (200).
  • Discrete interpretation of a continuous
    distribution.
  • A number of useful applications in marketing and
    other areas.
  • In our own work, used to determine the size of the
    price-sensitive segment (25 to 35).