Segmentation and Profiling using SPSS for Windows

1
Segmentation and Profiling using SPSS for
Windows

Kate Grayson

2
Why Segmentation?

Used by e.g. retail and consumer product
companies
Trying to learn about and describe their
customers' buying habits, gender, age, income
level, etc.
These companies tailor their marketing and
product development strategies to each consumer
group to increase sales and build brand loyalty.
A valuable approach in Market Research, and SPSS
offers some useful tools to facilitate this
commercial process

3
Segmentation in SPSS

Most of the techniques for segmentation and
profiling are exploratory
There is no right or wrong answer, and the
results are open to interpretation
Trying to make sense of the data or find patterns
Iterative techniques
If it does not make business sense then it is not
a good model!

4
Segmentation in SPSS

Techniques include
Factor Analysis / Principal Components Analysis
Hierarchical Clustering
K-Means Cluster
Non-Linear Principal Components Analysis
(PRINCALS/CATPCA)
The new Two-Step Cluster

5
Which Technique to Use?
Cluster Analysis
Categories
Factor Analysis
Exploratory
Confirmatory
Discriminant Analysis
AnswerTree
6
Which Test to use?

Factor Analysis - to find patterns within
variables
Categories - use if data doesn't fit assumptions
for Factor Analysis
Cluster Analysis - to find patterns between
individuals
Two-Step Cluster - to use with both categorical
and continuous variables
Discriminant Analysis - to look for differences
between groups, try to predict target variable
AnswerTree - combinations of data, to predict
target

7
Multivariate Analysis

These techniques are inter-related, but you don't
have to use all of them
Can use a combination of these techniques to
segment the data

8
Main Considerations

Looking for patterns or trying to make
predictions?
Levels of Measurement of the data (categorical or
continuous)
Sample size
Missing values
Does data fulfil assumptions for test?

9
Before you start... Check
your data!
10
Handling Missing Data

Check before analysis for any patterns within
missing data
Check before analysis that missing values are
defined as missing - otherwise may compromise the
model
Be aware that most segmentation techniques ignore
any cases with missing values - so may have less
usable data than you think!
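
As a rough illustration of the last point, the sketch below counts missing values per variable and then applies listwise deletion (dropping any case with a missing value), which is how many procedures treat incomplete cases. It is plain Python rather than SPSS syntax, and the variable names and data are made up for the example.

```python
# Hypothetical survey rows (made-up data); None marks a missing answer.
rows = [
    {"age": 34, "q1": 5, "q2": 3},
    {"age": None, "q1": 4, "q2": 2},
    {"age": 51, "q1": None, "q2": 4},
    {"age": 27, "q1": 2, "q2": 5},
    {"age": 45, "q1": 3, "q2": None},
]


def missing_counts(rows):
    """Count missing values per variable."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + (value is None)
    return counts


def complete_cases(rows):
    """Listwise deletion: keep only rows with no missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


per_variable = missing_counts(rows)
usable = complete_cases(rows)
print(per_variable)
print(f"{len(usable)} of {len(rows)} cases usable")
```

Each variable here is missing only once, yet more than half of the cases drop out of the analysis.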

11
Variable and Value Labels.

It is worth checking the labels on your file
SPSS may truncate long variable and value labels
in the output, making it difficult to interpret
the output
Make sure all the useful information is at the
beginning of the variable and value labels - so
even if they are truncated, the output is still
easy to read

12
Data Coding

Check the direction of the coding scheme, and
maybe consider re-coding the data if the codes
are counter-intuitive
e.g. if you have a rating scale that ranges from high
to low, rather than low to high
... it can be difficult to interpret output and
factor scores etc. once the data has been through
several transformations
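
A minimal sketch of reverse-coding a rating scale, assuming a 1 to 5 scale (the scale limits and the ratings are assumptions for illustration, not taken from the deodorant data). It is plain Python rather than SPSS syntax.

```python
def reverse_code(value, low=1, high=5):
    """Flip a rating so that `low` becomes `high` and vice versa."""
    if not low <= value <= high:
        raise ValueError(f"rating {value} is outside {low}..{high}")
    return low + high - value


# Made-up ratings where 1 meant "agree strongly" and 5 "disagree strongly".
ratings = [1, 2, 5, 4, 3]
recoded = [reverse_code(r) for r in ratings]
print(recoded)
```

Recoding once, before any analysis, means that high factor scores later on can be read as "more of" the attribute.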

13
Sample Data

Data usage of underarm deodorants for men
Three brands tested
Rambo - the current market leader
Brad - second most popular
Clint - recently launched product

14
Profiling the Customers...

Clint isn't selling as well as was hoped, so
the research aims to find out:
Who is buying Clint?
What sort of characteristics do they share?
Who is buying the other deodorants tested?
How might the marketing campaign be changed to
ensure that the correct market is targeted?

15
Data Collected

Ratings of a range of lifestyle attribute
questions, e.g. "I tend to own the most
up-to-date products", "My family is the most
important thing in my life", "I prefer to dress
and entertain casually" etc. (34 of these)
Demographics - age, type of work, exercise etc.
Brand of deodorant (D/O) usually used
How you see yourself in relation to others, e.g.
"What makes you distinctive from your friends?"

16
Segmentation - the steps

Run Principal Components Analysis on attribute
rating questions, to see if there are any underlying
dimensions in the variables
Check using Discriminant Analysis to see if these
dimensions help predict brand used
Run Cluster Analysis to see if we can find
similarities between cases
Decide if other variables need to be included,
e.g. categorical demographics
Run Two-Step Cluster using all variables

17
Factor Analysis
18
Factor Analysis - what is it?

Looks for relationships between continuous
variables (based on correlations), in this case
attribute rating questions
Derives underlying constructs or dimensions in
the data
Tries to reduce a large number of variables to a
small number of factors which explain most of the
variance in the data
If you can't interpret the resulting solution then
it is no good!
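
To make the idea concrete, the sketch below builds a correlation matrix from three made-up rating variables and extracts the first principal component by power iteration. This is only an illustration in plain Python: a full Factor Analysis in SPSS extracts several components and usually rotates them, which this sketch does not attempt. All variable names and data are hypothetical.

```python
import math


def correlation_matrix(columns):
    """Pearson correlations between equal-length lists of numbers."""
    n = len(columns[0])
    z = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        z.append([(x - mean) / sd for x in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def first_component(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and eigenvector of a symmetric matrix."""
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    product = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(v * p for v, p in zip(vec, product))
    return eigenvalue, vec


# Made-up ratings from six respondents on three attribute questions.
likes_new_products = [1, 2, 3, 4, 5, 4]
owns_latest_gadgets = [2, 2, 3, 5, 5, 4]
family_comes_first = [5, 1, 4, 2, 3, 3]

corr = correlation_matrix([likes_new_products, owns_latest_gadgets, family_comes_first])
eigenvalue, weights = first_component(corr)
print(f"variance explained by first component: {eigenvalue / len(corr):.0%}")
print("weights:", [round(w, 2) for w in weights])
```

The first two questions move together, so they receive weights of the same sign on the first component; that shared dimension is the kind of pattern you would then try to name and interpret.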

19
Run Principal Components Analysis on 34 rated
attributes
20
Factor Analysis Results

The best solution produced 9 factors, interpreted
below
F1 - High computer use
F2 - Rules, need to conform
F3 - Party animal
F4 - Family man
F5 - Likes new products, experiments
F6 - Likes pampering, pays more for trusted brands
F7 - Cautious, follower rather than leader for new
products
F8 - Relaxed, casual
F9 - Home loving

21
Do these factors help?

Run Discriminant Analysis to see if the factors
can predict D/O used

22
Factor Analysis Results

The factors are good at predicting Rambo usage,
but not at differentiating between Brad and
Clint
So try instead investigating relationships
between cases using Cluster Analysis
Options for clustering are
Hierarchical Cluster
K-Means Cluster
Two-Step Cluster

23
Hierarchical Cluster

This is often thought of as the "proper" cluster
method
Looking for natural groupings within the data
Bases groupings upon the similarity or
dissimilarity between cases, rather than
variables
Very iterative technique - time consuming!
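
A minimal sketch of agglomerative (hierarchical) clustering in plain Python, assuming Euclidean distance between cases and average linkage between clusters. Both are choices made for this illustration and not necessarily the settings used in the talk; the data points are made up.

```python
import math
from itertools import combinations


def average_linkage(points, n_clusters):
    """Start with one cluster per case; merge the two closest until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def distance(a, b):
        # Average of all case-to-case distances between the two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Made-up scores on two factors for six cases.
points = [(0.1, 0.2), (0.0, 0.4), (2.1, 2.0), (2.3, 1.8), (-2.0, 1.9), (-1.8, 2.2)]
clusters = average_linkage(points, n_clusters=3)
print(clusters)
```

Every merge recomputes distances between all pairs of clusters, which is one reason the method becomes slow on large files.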

24
Clustering Data - Diagram
Each data point = one case
25
Decisions before Cluster

Which variables to use?
Which distance measures between cases to use?
Which criteria for creating clusters to choose?
NB
The quality of the analysis will always depend
upon the variables used
Cluster Analysis will always find a solution!
It is not possible to assess in the analysis
itself how appropriate a variable is

26
Stages of Hierarchical Cluster

Select variables for analysis (carefully!)
Build and assess model
Save cluster membership
If required, create cluster matrix for K-Means
NB
Because clustering is based on distances between
cases, make sure the data is measured on the same
scale - if not, the data should be standardized
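
A short sketch of why standardizing matters, using z-scores (subtract the mean, divide by the sample standard deviation). The two variables and their values are made up; plain Python is used rather than SPSS syntax.

```python
import math
import statistics


def standardize(values):
    """Convert values to z-scores using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Made-up data: age in years and a 1 to 5 rating for five cases.
age = [22, 35, 47, 58, 63]
rating = [1, 3, 2, 5, 4]

z_age = standardize(age)
z_rating = standardize(rating)

# Distance between the first and last case, before and after standardizing.
raw = math.dist((age[0], rating[0]), (age[4], rating[4]))
scaled = math.dist((z_age[0], z_rating[0]), (z_age[4], z_rating[4]))
print(f"raw distance: {raw:.2f}, standardized distance: {scaled:.2f}")
```

On the raw data the distance is driven almost entirely by age, simply because age has the larger numeric range; after standardizing, both variables contribute on a comparable footing.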

27
Run Hierarchical Cluster Analysis on Saved
Factor Variables
28
Decision with D/O Data

I can't get a very good (i.e. useful to the
business) model from Hierarchical Cluster
analysis
Also, I want to be able to include both
categorical and continuous variables in the same
model
So I decide to use Two-Step Cluster instead

29
Two-Step Cluster
30
Two-Step Cluster

The TwoStep Cluster Analysis procedure is an
exploratory tool designed to reveal natural
groupings (or clusters) within a data set that
would otherwise not be apparent.
The algorithm employed by this procedure has
several features that differentiate it from
traditional clustering techniques
The ability to create clusters based on both
categorical and continuous variables.
Automatic selection of the number of clusters.
The ability to analyze large data files
efficiently.

31
TwoStep Cluster

Uses scalable cluster analysis algorithm
This algorithm can handle both continuous and
categorical variables or attributes and requires
only one data pass in the procedure
The first step of the procedure pre-clusters the
records into many small sub-clusters
Then it clusters the sub-clusters created in the
pre-cluster step into the desired number of
clusters
If the desired number of clusters is unknown,
TwoStep Cluster analysis automatically finds the
proper number of clusters
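
The two stages described above can be sketched in plain Python as follows. This is a heavily simplified illustration and not the SPSS TwoStep algorithm: it handles continuous variables only, uses Euclidean distance with a fixed threshold (the `threshold` parameter is an assumption of this sketch), and does not choose the number of clusters automatically. The data are made up.

```python
import math


def precluster(points, threshold):
    """Step 1: a single pass over the data, building many small sub-clusters."""
    subclusters = []  # each holds a case count and per-dimension sums
    for p in points:
        best, best_d = None, threshold
        for sc in subclusters:
            centroid = [s / sc["n"] for s in sc["sum"]]
            d = math.dist(p, centroid)
            if d <= best_d:
                best, best_d = sc, d
        if best is None:
            subclusters.append({"n": 1, "sum": list(p)})
        else:
            best["n"] += 1
            best["sum"] = [s + x for s, x in zip(best["sum"], p)]
    return subclusters


def merge_subclusters(subclusters, n_clusters):
    """Step 2: merge the sub-clusters with the closest centroids until n_clusters remain."""
    clusters = [dict(sc) for sc in subclusters]

    def centroid(c):
        return [s / c["n"] for s in c["sum"]]

    while len(clusters) > n_clusters:
        pairs = [
            (math.dist(centroid(a), centroid(b)), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        ]
        _, i, j = min(pairs)
        clusters[i] = {
            "n": clusters[i]["n"] + clusters[j]["n"],
            "sum": [x + y for x, y in zip(clusters[i]["sum"], clusters[j]["sum"])],
        }
        del clusters[j]  # safe because i < j
    return clusters


# Made-up scores on two continuous variables for seven cases.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (1.0, 0.9), (4.8, 5.3)]
sub = precluster(points, threshold=0.5)
final = merge_subclusters(sub, n_clusters=2)
print(len(sub), "sub-clusters ->", [c["n"] for c in final], "cases per final cluster")
```

Because step 2 works on sub-cluster summaries rather than individual cases, the expensive merging stage stays small even when the data file is large.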

32
Two-Step Cluster

This is unlike other clustering methods in SPSS -
if the desired number of clusters is unknown,
TwoStep Cluster analysis automatically finds the
proper number of clusters
Or you can pre-specify the number of clusters
required - flexibility

33
Run Two-Step Cluster Analysis on Saved Factor
Variables and Categorical Variables
36
Link to more information

More useful information about Two-Step Cluster
can be found at the following websites
http://www.rrz.uni-hamburg.de/RRZ/Software/SPSS/Algorith.120/twostep_cluster.pdf
NB This was the handout for the talk, with
algorithm etc.
Also useful
http://www.spss.com/pdfs/S115AD8-1202A.pdf
http://www.norusis.com/pdf/SPC_v13.pdf

37
Some of the output produced by the Two-Step
Cluster Analysis is reproduced in the next few
slides
38
Brand Usually Used by Cluster

Clint spray seems to be associated with Cluster
6, with the roll-on version being associated with
Clusters 4 and 2
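
Tables like this one are cross-tabulations of cluster membership against a categorical variable. The sketch below shows how the within-cluster percentages could be computed in plain Python; the cases are invented for illustration and do not reproduce the results from the talk.

```python
from collections import Counter

# Made-up (cluster, brand usually used) pairs, one per respondent.
cases = [
    (2, "Clint roll-on"), (2, "Clint roll-on"), (2, "Clint roll-on"), (2, "Rambo"),
    (6, "Clint spray"), (6, "Clint spray"), (6, "Brad"), (6, "Brad"),
]


def within_cluster_percentages(cases):
    """Percentage of each category within each cluster."""
    totals = Counter(cluster for cluster, _ in cases)
    cells = Counter(cases)
    return {
        (cluster, category): 100 * n / totals[cluster]
        for (cluster, category), n in cells.items()
    }


pct = within_cluster_percentages(cases)
for (cluster, brand), value in sorted(pct.items()):
    print(f"Cluster {cluster}: {brand} {value:.0f}%")
```

The same function can be reused for the employment status and age group profiles by swapping the second element of each pair.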

39
Employment Status by Cluster

Cluster 2 (Clint roll-on) is largely made up of
part-time, retired and not working respondents,
Cluster 4 also has a high number of retired
respondents, while Cluster 6 (Clint spray) also
has a high percentage of part-time and unemployed.

40
Age Group by Cluster

Cluster 2 (Clint roll-on) is largely made up of
the younger and older age groups, Cluster 4 also
has a high percentage of older respondents.
Cluster 6 is more from 25 years upwards

41
Cluster 4 (Clint roll-on) has below average
scores on "computer use" and "need to conform", above
average on "Home loving" and "Family man"
42
Cluster 6 (Clint spray) has above average
scores on "Relaxed, casual" but not much else -
this is Mr Laid Back!
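
Statements such as "above average" come from comparing each cluster's mean factor score with the overall mean. A minimal sketch of that comparison in plain Python follows; the factor scores are made up and the cluster numbers are reused only as labels.

```python
import statistics

# Made-up (cluster, factor scores) pairs; scores are hypothetical.
cases = [
    (4, {"Relaxed, casual": -0.2, "High computer use": -0.9}),
    (4, {"Relaxed, casual": 0.1, "High computer use": -0.7}),
    (6, {"Relaxed, casual": 1.1, "High computer use": 0.1}),
    (6, {"Relaxed, casual": 0.9, "High computer use": -0.1}),
]


def profile(cases, factor):
    """Cluster mean minus overall mean for one factor."""
    overall = statistics.mean(scores[factor] for _, scores in cases)
    by_cluster = {}
    for cluster, scores in cases:
        by_cluster.setdefault(cluster, []).append(scores[factor])
    return {c: statistics.mean(v) - overall for c, v in by_cluster.items()}


relaxed = profile(cases, "Relaxed, casual")
computer = profile(cases, "High computer use")
for cluster in sorted(relaxed):
    print(f"Cluster {cluster}: relaxed {relaxed[cluster]:+.2f}, computer use {computer[cluster]:+.2f}")
```

Positive values mean the cluster scores above the overall average on that factor, negative values below it.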
43
Summary of Findings

Profiling of this data suggests that Clint is
not targeting the expected market
Clint is often not seen as sufficiently
different from Brad; it has no perceived USP
Clint is being used by a high percentage of
older, retired, and part-time or not employed
consumers, which may be a result of the
aggressive product launch campaign with free
samples, discounted prices etc.
Clint marketing needs some more work!

44
Summary of Segmenting and Profiling this data
using SPSS

Principal Components Analysis helped investigate
relationships between the rated attribute
variables
Hierarchical Cluster was used to try and find
similarities between cases, using the factors
derived from PCA
Two-Step Cluster was then used to enable
clustering of both continuous and categorical
variables in the same model
Useful conclusions were drawn about the market
positioning of Clint deodorant