Segmentation and Profiling using SPSS for Windows

Provided by: kategr9

1
Segmentation and Profiling using SPSS for
Windows
  • Kate Grayson

2
Why Segmentation?
  • Used by e.g. retail and consumer product
    companies
  • Trying to learn about and describe their
    customers' buying habits, gender, age, income
    level, etc.
  • These companies tailor their marketing and
    product development strategies to each consumer
    group to increase sales and build brand loyalty.
  • A valuable approach in Market Research, and SPSS
    offers some useful tools to facilitate this
    commercial process

3
Segmentation in SPSS
  • Most of the techniques for segmentation and
    profiling are exploratory
  • There is no right or wrong answer, and the
    results are open to interpretation
  • Trying to make sense of the data or find patterns
  • Iterative techniques
  • If it does not make business sense then it is not
    a good model!

4
Segmentation in SPSS
  • Techniques include
  • Factor Analysis / Principal Components Analysis
  • Hierarchical Clustering
  • K-Means Cluster
  • Non-Linear Principal Components Analysis
    (PRINCALS/CATPCA)
  • The new Two-Step Cluster

5
Which Technique to Use?
(Diagram: Cluster Analysis, Categories, Factor Analysis, Discriminant Analysis
and AnswerTree, labelled as Exploratory or Confirmatory)
6
Which Test to use?
  • Factor Analysis - to find patterns within
    variables
  • Categories - use if the data doesn't fit the
    assumptions for Factor Analysis
  • Cluster Analysis - to find patterns between
    individuals
  • Two-Step Cluster - to use with both categorical
    and continuous variables
  • Discriminant Analysis - to look for differences
    between groups, try to predict target variable
  • AnswerTree - combinations of data, to predict
    target

7
Multivariate Analysis
  • These techniques are inter-related, but you don't
    have to use all of them
  • Can use a combination of these techniques to
    segment the data

8
Main Considerations
  • Looking for patterns or trying to make
    predictions?
  • Levels of Measurement of the data (categorical or
    continuous)
  • Sample size
  • Missing values
  • Does the data fulfil the assumptions for the test?

9
Before you start... check
your data!
10
Handling Missing Data
  • Check before analysis for any patterns within
    missing data
  • Check before analysis that missing values are
    defined as missing - otherwise they may compromise
    the model
  • Be aware that most segmentation techniques ignore
    any cases with missing values - so you may have
    less usable data than you think!
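
As a rough illustration of the points above, the sketch below declares a "no answer" code as missing and then counts how many cases survive listwise deletion. It is plain Python rather than SPSS syntax, and the data, the variable names q1 to q3 and the missing code 99 are all made up for the example.

```python
# Hypothetical example data: 99 is an assumed "no answer" code,
# None stands for a value that is already missing.
MISSING_CODE = 99

raw_cases = [
    {"id": 1, "q1": 4, "q2": 5, "q3": 3},
    {"id": 2, "q1": 99, "q2": 2, "q3": 4},
    {"id": 3, "q1": 3, "q2": 3, "q3": None},
    {"id": 4, "q1": 5, "q2": 1, "q3": 2},
]
variables = ["q1", "q2", "q3"]


def declare_missing(cases, variables, code):
    """Replace the given code with None so it is not treated as a real rating."""
    return [
        {k: (None if k in variables and v == code else v) for k, v in case.items()}
        for case in cases
    ]


def missing_counts(cases, variables):
    """Number of missing values per variable."""
    return {v: sum(1 for case in cases if case[v] is None) for v in variables}


def listwise_complete(cases, variables):
    """Keep only the cases with no missing value on any analysis variable."""
    return [case for case in cases if all(case[v] is not None for v in variables)]


cases = declare_missing(raw_cases, variables, MISSING_CODE)
per_variable = missing_counts(cases, variables)
complete = listwise_complete(cases, variables)

print(per_variable)
print(f"{len(complete)} of {len(cases)} cases are usable")
```

If the 99 were left undeclared, case 2 would stay in the analysis with a "rating" of 99 on q1, which is the kind of problem the second bullet warns about.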

11
Variable and Value Labels.
  • It is worth checking the labels on your file
  • SPSS may truncate long variable and value labels
    in the output, making it difficult to interpret
    the output
  • Make sure all the useful information is at the
    beginning of the variable and value labels - so
    even if they are truncated, the output is still
    easy to read

12
Data Coding
  • Check the direction of the coding scheme, and
    maybe consider re-coding the data if the codes
    are counter-intuitive
  • e.g. if you have a rating scale that ranges from
    high to low, rather than low to high
  • ... it can be difficult to interpret output and
    factor scores etc. once the data has been through
    several transformations
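
Reversing a rating scale is simple arithmetic: the new code is (lowest code + highest code) minus the old code. A minimal Python sketch, assuming a 1 to 5 scale by default (the function name and the default scale are illustrative, not taken from the talk):

```python
def reverse_code(value, low=1, high=5):
    """Flip a rating so that the lowest code becomes the highest and vice versa."""
    if value is None:  # leave missing values missing
        return None
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the scale {low}..{high}")
    return low + high - value


ratings = [1, 2, 3, 4, 5, None]
recoded = [reverse_code(r) for r in ratings]
print(recoded)
```

Recoding once, before any analysis, means that a high factor score later on still corresponds to "more" of the attribute.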

13
Sample Data
  • Data - usage of underarm deodorants for men
  • Three brands tested
  • Rambo - the current market leader
  • Brad - second most popular
  • Clint - recently launched product

14
Profiling the Customers
  • Clint isn't selling as well as was hoped, so
    the research aims to find out
  • Who is buying Clint?
  • What sort of characteristics do they share?
  • Who is buying the other deodorants tested?
  • How might the marketing campaign be changed to
    ensure that the correct market is targeted?

15
Data Collected
  • Ratings of a range of lifestyle attribute
    questions, e.g. "I tend to own the most
    up-to-date products", "My family is the most
    important thing in my life", "I prefer to dress
    and entertain casually" etc. (34 of these)
  • Demographics - age, type of work, exercise etc.
  • Brand of deodorant (D/O) usually used
  • How you see yourself in relation to others, e.g.
    "What makes you distinctive from your friends?"

16
Segmentation - the steps
  1. Run Principal Components Analysis on the attribute
    rating questions, to see if there are any underlying
    dimensions in the variables
  2. Check using Discriminant Analysis to see if these
    dimensions help predict the brand used
  3. Run Cluster Analysis to see if it can find
    similarities between cases
  4. Decide if other variables need to be included,
    e.g. categorical demographics
  5. Run Two-Step Cluster using all variables

17
Factor Analysis
18
Factor Analysis - what is it?
  • Looks for relationships between continuous
    variables (based on correlations), in this case
    attribute rating questions
  • Derives underlying constructs or dimensions in
    the data
  • Tries to reduce a large number of variables to a
    small number of factors which explain most of the
    variance in the data
  • If you can't interpret the resulting solution then
    it is no good!
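
To make "based on correlations" and "explains most of the variance" concrete, here is a small pure-Python sketch that builds a correlation matrix for three made-up rating items and extracts the first principal component by power iteration. This is only an illustration of the idea: it is not how SPSS computes its solution, and the items q1 to q3 and their values are invented.

```python
import math


def standardize(column):
    """Convert a column to z-scores (sample standard deviation)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def correlation_matrix(columns):
    """Pearson correlations between every pair of columns."""
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def first_component(corr, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    product = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v * w for v, w in zip(vec, product))
    return eigenvalue, vec


# Hypothetical ratings: q1 and q2 move together, q3 does not.
q1 = [1, 2, 3, 4, 5, 4, 2]
q2 = [2, 2, 3, 5, 5, 4, 1]
q3 = [3, 5, 1, 4, 2, 5, 3]

corr = correlation_matrix([q1, q2, q3])
eigenvalue, weights = first_component(corr)
share = eigenvalue / len(corr)  # proportion of total variance explained

print([round(w, 2) for w in weights])
print(f"first component explains {share:.0%} of the variance")
```

In this toy data the first component is dominated by q1 and q2, so those two items would be read together as one underlying dimension, while q3 contributes very little to it.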

19
Run Principal Components Analysis on 34 rated
attributes
20
Factor Analysis Results
  • The best solution produced 9 factors, interpreted
    below
  • F1 - High computer use
  • F2 - Rules, need to conform
  • F3 - Party animal
  • F4 - Family man
  • F5 - Likes new products, experiments
  • F6 - Likes pampering, pays more for trusted brands
  • F7 - Cautious, follower rather than leader for new
    products
  • F8 - Relaxed, casual
  • F9 - Home loving

21
Do these factors help?
  • Run Discriminant Analysis to see if the factors
    can predict the D/O used

22
Factor Analysis Results
  • The factors are good at predicting Rambo usage,
    but not at differentiating between Brad and
    Clint
  • So try instead investigating relationships
    between cases using Cluster Analysis
  • Options for clustering are
  • Hierarchical Cluster
  • K-Means Cluster
  • Two-Step Cluster

23
Hierarchical Cluster
  • This is often thought of as the proper cluster
    method
  • Looking for natural groupings within the data
  • Bases groupings upon the similarity or
    dissimilarity between cases, rather than
    variables
  • Very iterative technique - time consuming!

24
Clustering Data - Diagram
(Diagram: each data point = one case)
25
Decisions before Cluster
  • Which variables to use?
  • Which distance measures between cases to use?
  • Which criteria for creating clusters to choose?
  • NB
  • The quality of the analysis will always depend
    upon the variables used
  • Cluster Analysis will always find a solution!
  • It is not possible to assess in the analysis
    itself how appropriate a variable is

26
Stages of Hierarchical Cluster
  • Select variables for analysis (carefully!)
  • Build and assess model
  • Save cluster membership
  • If required, create cluster matrix for K-Means
  • NB
  • Because it is based on cases, you need to make sure
    the data is measured on the same scale - if not,
    the data should be standardized
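
The stages above can be sketched in a few lines of plain Python: standardize the variables, merge the closest clusters step by step, then save a cluster membership for each case. The sketch assumes Euclidean distance and average linkage, which are only one of several possible choices, and the ages and ratings are invented.

```python
import math
from itertools import combinations


def zscores(column):
    """Standardize one variable so that all variables share the same scale."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def average_linkage(points, a, b):
    """Mean Euclidean distance between all pairs of cases in clusters a and b."""
    distances = [math.dist(points[i], points[j]) for i in a for j in b]
    return sum(distances) / len(distances)


def agglomerate(points, n_clusters):
    """Start with one cluster per case and merge the closest pair each round."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(points, clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


# Hypothetical cases: age in years and a 1-5 rating (very different scales).
age = [20, 22, 24, 60, 62, 64]
rating = [1, 2, 1, 5, 4, 5]

points = list(zip(zscores(age), zscores(rating)))
clusters = agglomerate(points, n_clusters=2)

# "Save cluster membership": one cluster number per case, in case order.
membership = [None] * len(points)
for number, members in enumerate(clusters):
    for case in members:
        membership[case] = number

print(membership)
```

Without the z-scores, the age differences (tens of years) would swamp the rating differences (a few scale points) in every distance calculation.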

27
Run Hierarchical Cluster Analysis on Saved
Factor Variables
28
Decision with D/O Data
  • I can't get a very good (i.e. useful to the
    business) model from Hierarchical Cluster
    analysis
  • Also, I want to be able to include both
    categorical and continuous variables in the same
    model
  • So I decide to use Two-Step Cluster instead

29
Two-Step Cluster
30
Two-Step Cluster
  • The TwoStep Cluster Analysis procedure is an
    exploratory tool designed to reveal natural
    groupings (or clusters) within a data set that
    would otherwise not be apparent.
  • The algorithm employed by this procedure has
    several features that differentiate it from
    traditional clustering techniques
  • The ability to create clusters based on both
    categorical and continuous variables.
  • Automatic selection of the number of clusters.
  • The ability to analyze large data files
    efficiently.

31
TwoStep Cluster
  • Uses scalable cluster analysis algorithm
  • This algorithm can handle both continuous and
    categorical variables or attributes and requires
    only one data pass in the procedure
  • The first step of the procedure pre-clusters the
    records into many small sub-clusters
  • Then it clusters the sub-clusters created in the
    pre-cluster step into the desired number of
    clusters
  • If the desired number of clusters is unknown,
    TwoStep Cluster analysis automatically finds the
    proper number of clusters
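
The two-stage idea can be sketched as follows. This is a deliberately simplified stand-in, not the SPSS TwoStep algorithm: it handles continuous variables only, uses plain Euclidean distance with a made-up threshold for the pre-clustering pass, and needs the final number of clusters to be supplied rather than finding it automatically.

```python
import math


def centroid(sub):
    """Mean point of a sub-cluster, from its running sums and count."""
    return [total / sub["n"] for total in sub["sum"]]


def precluster(points, threshold):
    """Step 1: a single pass over the data. Each record joins the nearest
    sub-cluster within `threshold`, otherwise it starts a new sub-cluster."""
    subs = []
    for index, point in enumerate(points):
        best, best_distance = None, threshold
        for sub in subs:
            distance = math.dist(point, centroid(sub))
            if distance <= best_distance:
                best, best_distance = sub, distance
        if best is None:
            subs.append({"n": 1, "sum": list(point), "members": [index]})
        else:
            best["n"] += 1
            best["sum"] = [a + b for a, b in zip(best["sum"], point)]
            best["members"].append(index)
    return subs


def merge_subclusters(subs, n_clusters):
    """Step 2: repeatedly merge the two sub-clusters with the closest centroids."""
    subs = list(subs)
    while len(subs) > n_clusters:
        pairs = [(a, b) for a in range(len(subs)) for b in range(a + 1, len(subs))]
        a, b = min(pairs, key=lambda p: math.dist(centroid(subs[p[0]]), centroid(subs[p[1]])))
        subs[a] = {
            "n": subs[a]["n"] + subs[b]["n"],
            "sum": [x + y for x, y in zip(subs[a]["sum"], subs[b]["sum"])],
            "members": subs[a]["members"] + subs[b]["members"],
        }
        del subs[b]
    return [sorted(sub["members"]) for sub in subs]


# Hypothetical two-variable data with three visible groups.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0),
          (5.1, 4.8), (4.9, 5.2), (9.0, 1.0), (9.2, 1.1)]

subs = precluster(points, threshold=0.3)
final = merge_subclusters(subs, n_clusters=3)
print(len(subs), "sub-clusters ->", final)
```

The point of the design is that the expensive pairwise merging in step 2 runs on a handful of sub-cluster summaries rather than on every record, which is what makes the approach workable on large files.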

32
Two-Step Cluster
  • This is unlike other clustering methods in SPSS -
    if the desired number of clusters is unknown,
    TwoStep Cluster analysis automatically finds the
    proper number of clusters
  • Or you can pre-specify the number of clusters
    required - flexibility

33
Run Two-Step Cluster Analysis on Saved Factor
Variables and Categorical Variables
36
Link to more information
  • More useful information about Two-Step Cluster
    can be found at the following websites
  • http://www.rrz.uni-hamburg.de/RRZ/Software/SPSS/Algorith.120/twostep_cluster.pdf
  • NB This was the handout for the talk, with
    algorithm etc.
  • Also useful
  • http://www.spss.com/pdfs/S115AD8-1202A.pdf
  • http://www.norusis.com/pdf/SPC_v13.pdf

37
Some of the output produced by the Two-Step
Cluster Analysis is reproduced in the next few
slides
38
Brand usually use by Cluster
  • Clint spray seems to be associated with Cluster
    6, with the roll-on version being associated with
    Clusters 4 and 2

39
Employment Status by Cluster
  • Cluster 2 (Clint roll-on) is largely made up of
    part-time, retired and not working respondents.
    Cluster 4 also has a high number of retired
    respondents, while Cluster 6 (Clint spray) also
    has a high percentage of part-time and unemployed.

40
Age Group by Cluster
  • Cluster 2 (Clint roll-on) is largely made up of
    the younger and older age groups, Cluster 4 also
    has a high percentage of older respondents.
    Cluster 6 is more from 25 years upwards
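
Profiles like "brand by cluster" or "age group by cluster" are cross-tabulations of the saved cluster membership against a categorical variable. A minimal Python sketch of the within-cluster percentages, using a handful of invented records (these are not the survey results):

```python
from collections import Counter, defaultdict

# Hypothetical (cluster, brand) pairs, one per respondent.
records = [
    (2, "Clint roll-on"), (2, "Clint roll-on"), (2, "Brad"),
    (4, "Clint roll-on"), (4, "Rambo"),
    (6, "Clint spray"), (6, "Clint spray"), (6, "Clint spray"), (6, "Rambo"),
]


def within_cluster_percentages(records):
    """For each cluster, the percentage of its members in each category."""
    counts = defaultdict(Counter)
    for cluster, category in records:
        counts[cluster][category] += 1
    table = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        table[cluster] = {cat: 100.0 * n / total for cat, n in counter.items()}
    return table


table = within_cluster_percentages(records)
for cluster in sorted(table):
    row = ", ".join(f"{cat}: {pct:.0f}%" for cat, pct in sorted(table[cluster].items()))
    print(f"Cluster {cluster} - {row}")
```

The same function works for employment status or age group by swapping the second element of each record.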

41
Cluster 4 (Clint roll-on) has below average
computer use and need to conform, above
average on Home Loving and Family Man
42
Cluster 6 (Clint spray) has above average
scores on Relaxed, Casual but not much else -
this is Mr Laid Back!
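
Statements such as "above average" or "below average" come from comparing each cluster's mean factor score with the overall mean. A small Python sketch of that comparison, with invented factor scores and column names:

```python
from statistics import mean

# Hypothetical saved factor scores with a cluster number per respondent.
rows = [
    {"cluster": 4, "computer_use": -1.0, "home_loving": 0.8, "relaxed": 0.0},
    {"cluster": 4, "computer_use": -0.6, "home_loving": 1.0, "relaxed": -0.2},
    {"cluster": 6, "computer_use": 0.1, "home_loving": -0.1, "relaxed": 1.2},
    {"cluster": 6, "computer_use": -0.1, "home_loving": 0.1, "relaxed": 0.9},
    {"cluster": 1, "computer_use": 1.6, "home_loving": -1.8, "relaxed": -1.9},
]
factors = ["computer_use", "home_loving", "relaxed"]


def cluster_profile(rows, factors):
    """Cluster mean minus overall mean for each factor (positive = above average)."""
    overall = {f: mean(r[f] for r in rows) for f in factors}
    profile = {}
    for cluster in sorted({r["cluster"] for r in rows}):
        members = [r for r in rows if r["cluster"] == cluster]
        profile[cluster] = {f: mean(r[f] for r in members) - overall[f] for f in factors}
    return profile


profile = cluster_profile(rows, factors)
for cluster, differences in profile.items():
    summary = ", ".join(f"{f}: {d:+.2f}" for f, d in differences.items())
    print(f"Cluster {cluster} - {summary}")
```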
43
Summary of Findings
  • Profiling of this data suggests that Clint is
    not targeting the expected market
  • Clint is often not seen as sufficiently
    different from Brad, it has no perceived USP
  • Clint is being used by a high percentage of
    older, retired, and part-time or not employed
    consumers, which may be a result of the
    aggressive product launch campaign with free
    samples, discounted prices etc.
  • Clint marketing needs some more work!

44
Summary of Segmenting and Profiling this data
using SPSS
  • Principal Components Analysis helped investigate
    relationships between the rated attribute
    variables
  • Hierarchical Cluster was used to try and find
    similarities between cases, using the factors
    derived from PCA
  • Two-Step Cluster was then used to enable
    clustering of both continuous and categorical
    variables in the same model
  • Useful conclusions were drawn about the market
    positioning of Clint deodorant