Title: A Mixed-Integer Programming Approach to Customer Segmentation Problem
1. A Mixed-Integer Programming Approach to Customer Segmentation Problem
- Burcu Saglam, F. Sibel Salman, Metin Türkay
- bsaglam,ssalman,mturkay_at_ku.edu.tr
- Dept. of Industrial Engineering
- Serpil Sayin
- ssayin_at_ku.edu.tr
- Dept. of Business Administration
- June 20, 2004
- ESI 2004, METU, Ankara
2. Koç University, Istanbul
www.ku.edu.tr www.eng.ku.edu.tr
3. Outline
- Introduction
- Clustering Problem
- Clustering Approaches
- Motivation of the Study
- Proposed Model
- Illustrative Example
- Evaluation in a Real-World Scenario
- Conclusions and Future Work
4. Introduction
- This study presents a new mathematical programming-based segmentation model that is applied to a digital platform company's customer database
5. Digiturk
- Private digital platform
- Eager to identify opportunities in customer relationship management, such as one-to-one marketing
6. Digiturk
- Pay-Per-View Services
- Vision halls
- Football matches
- Erotic channels
- Interactive Events
- Banking
- TV games, etc.
- Products
- Standard package
- Sports package
- Cinema package
- Super package
- Mega package
7. Why Data Mining and Segmentation?
- Analysis of large data collections and huge databases
- Understanding the needs, desires, and expectations of customers
- Grouping ongoing and potential customers
- Hidden patterns and knowledge within the data
- Segmentation is applied when there is a need to partition the instances into natural groups
8. Clustering Analysis
- A data mining technique developed for the purpose of identifying groups of entities that are similar to each other with respect to certain characteristics
- Dividing heterogeneous sets of data into smaller, homogeneous ones
- Evaluating the result and performance of a supervised learning model
- Analyzing the set of input attributes
- Determining outliers
9. Clustering Problem
- Given a data set with n data items in m dimensions
- Partition the data into k clusters
- In an optimization setting, an objective function can be defined, such as minimizing the sum of 1-norm distances between each data point and the center of the cluster to which it belongs (Bradley et al., 1997)
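The sum-of-1-norm-distances objective can be sketched in a few lines of Python; the points, centers, and assignment below are illustrative, not data from the talk.

```python
# Sum of 1-norm distances between each point and the center of its cluster,
# the objective discussed by Bradley et al. (1997). Toy data for illustration.

def one_norm(p, q):
    """1-norm (Manhattan) distance between two m-dimensional points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def clustering_cost(points, centers, assign):
    """assign[i] is the index of the cluster that point i belongs to."""
    return sum(one_norm(p, centers[assign[i]]) for i, p in enumerate(points))

points = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (10.0, 10.0)]
centers = [(0.5, 0.0), (9.5, 9.5)]
assign = [0, 0, 1, 1]
print(clustering_cost(points, centers, assign))  # prints 3.0
```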
10. Main Considerations
- The notion of similarity
- Exclusive, overlapping, probabilistic, or fuzzy clusters
- Iterative or non-iterative
- Hierarchical or non-hierarchical
- Distance-based or probability-based approaches, graph-theoretic methods, continuous/discrete optimization, ...
11. Analytical Clustering Methods
- Hierarchical: the number of clusters is not assumed to be known a priori
- Divisive and agglomerative methods
- Once an assignment is made, it is irrevocable
- Well-known BIRCH algorithm
12. Analytical Clustering Methods
- Non-hierarchical: the number of clusters is known a priori
- Initially, the data is divided into k partitions, where each partition represents a cluster
- Two main decisions
- Selection of the initial cluster centroids
- Assignment of the instances to clusters
- Sensitive to the initial partition
- Too many local minima
- K-Means, K-Medoids, CLARANS, etc.
13. Classical K-Means
- Iterative, distance-based
- Works in numeric domains
- Partitions instances into disjoint clusters
- Two steps
- Assignment
- Updating the cluster centers
- Works well when the candidate clusters are of approximately equal size
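The two alternating steps can be sketched in pure Python. This is a minimal illustration, not the implementation used in the study; for simplicity it seeds the centers with the first k points, whereas real implementations use random or smarter initialization.

```python
# Minimal k-means sketch: alternate the assignment step and the
# center-update step until the assignment stops changing.
# Initial centers are simply the first k points (illustrative choice).

def kmeans(points, k, iters=100):
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_assign = [
            min(range(k), key=lambda j: sum((a - b) ** 2
                for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_assign == assign:
            break
        assign = new_assign
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, assign

points = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (10.0, 10.0)]
centers, assign = kmeans(points, 2)
print(assign)  # prints [0, 0, 1, 1]
```

Because only the assignment/update loop is shown, the sensitivity to the initial centers discussed on the next slide is easy to see: a different seeding of `centers` can converge to a different local minimum.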
14. Shortcomings of K-Means
- The solution is a local minimum
- Convergence to a local optimum is proven
- Sensitivity to the initially selected cluster centers
- Worst-case time complexity is stated to be exponential
- To approach a global minimum, the algorithm has to be repeated several times
- Impossible to interpret which attributes are significant
15. Motivation of the Study
- Considering the limitations of existing clustering approaches and algorithms, an exact, non-hierarchical, distance-based clustering algorithm is proposed
16. Proposed Approach
- Given a data set of n data items in m dimensions
- The aim is to find the optimal partitioning of the data set into k exclusive clusters
- Objective function: minimization of the maximum diameter of the generated clusters
- The number of clusters is known a priori
17. MIP-Max Model
Let d_il denote the distance between data items i and l, and let the binary variable x_ij equal 1 if item i is assigned to cluster j, 0 otherwise. With D_max denoting the maximum cluster diameter, the model is:

Minimize  D_max
s.t.      sum_{j=1,...,k} x_ij = 1                 for all i
          D_max >= d_il (x_ij + x_lj - 1)          for all i < l and all j
          x_ij in {0, 1}
18. MIP-Max
- O(kn) variables and O(kn2) constraints
- Non-hierarchical
- Not iterative
- No need for an initial solution
- Finds the global optimum
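Solving the MIP requires an integer-programming solver, so as a solver-free illustration of the objective, the sketch below enumerates every assignment of n points to k clusters and keeps the one whose largest cluster diameter is smallest. This brute force is exponential in n and only viable on toy instances; it is not the MIP-Max model itself, but on small data it returns the same global optimum the model guarantees.

```python
# Brute-force illustration of the min-max-diameter objective:
# try all k^n assignments and keep the one with the smallest maximum
# cluster diameter. Exponential -- for real data the MIP model is used.
from itertools import product

def one_norm(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def max_diameter(points, assign, k):
    """Largest pairwise distance within any single cluster."""
    diam = 0.0
    for j in range(k):
        members = [p for p, a in zip(points, assign) if a == j]
        for i in range(len(members)):
            for l in range(i + 1, len(members)):
                diam = max(diam, one_norm(members[i], members[l]))
    return diam

def min_max_diameter(points, k):
    best = min(product(range(k), repeat=len(points)),
               key=lambda a: max_diameter(points, a, k))
    return max_diameter(points, best, k), best

points = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (10.0, 10.0)]
d, assign = min_max_diameter(points, 2)
print(d)  # prints 2.0, the optimal maximum diameter
```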
19. Illustrative Example
20. Comparison with the Results of K-Means
21. Comparison with the Results of K-Means
22. Evaluation in a Real-World Scenario
- The data set includes demographic and transactional information
- Each row represents a unique customer
- 18 real-valued and categorical attributes
23. Experiments with the MIP-Max Model
k = 2
CPU times are reported for a computer with a Pentium IV processor at 2.56 GHz and 1 GB of memory.
24. Comparison of MIP-Max with K-Means and Interpretations
- The approaches are compared on 100 data items using a 3-cluster solution
- The MIP-Max model grouped 39 instances in the first cluster, 34 in the second, and 27 in the third
- K-Means generated clusters 1, 2, and 3 with 59, 22, and 19 instances, respectively
- Interpretations are based on the predictiveness score
25. Predictiveness Score
- Given class C and attribute A with values v1, v2, ..., vn, the attribute-value predictiveness score for vi is defined as the probability that an instance resides in C given that the instance has value vi for A.
- A between-class measure
- Defined for categorical attributes; most of our attributes are in nominal form
- An attribute has distinguishing power in a cluster if its predictiveness scores are higher than 75%
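Estimated from counts, the score is simply the conditional frequency P(cluster = C | A = v). A minimal sketch, on invented toy customer records (the `package` attribute and cluster labels below are illustrative, not from the study's data set):

```python
# Predictiveness score: P(instance is in `cluster` | record[attr] == value),
# estimated from counts over the records. Toy records for illustration only.

def predictiveness(records, attr, value, cluster):
    """Fraction of records having attr == value that lie in `cluster`."""
    with_value = [r for r in records if r[attr] == value]
    if not with_value:
        return 0.0
    in_cluster = [r for r in with_value if r["cluster"] == cluster]
    return len(in_cluster) / len(with_value)

records = [
    {"package": "sports", "cluster": 1},
    {"package": "sports", "cluster": 1},
    {"package": "sports", "cluster": 2},
    {"package": "cinema", "cluster": 2},
]
print(predictiveness(records, "package", "sports", 1))  # 2 of 3 sports records
```

With the 75% threshold above, "sports" (score 2/3) would not yet count as distinguishing for cluster 1 in this toy example, while "cinema" (score 1.0 for cluster 2) would.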
26. Predictiveness Score
27. Conclusions and Future Work
- The sensitivity of K-Means to the initial solution is analyzed
- The interpretation of the MIP-Max model is more meaningful than that of K-Means (e.g., sports-package subscribers are grouped together)
- MIP-Max is significantly better than K-Means in terms of quality and stability of the solutions
- Future work
- Improvement of run times
- Determination of the number of clusters
28. Thanks! Questions are welcome.
29. Clustering Approaches
- Hierarchical and non-hierarchical clustering methods
- Classical K-Means
- COBWEB
- CLARANS
- BIRCH
- Advantages and disadvantages
- Motivation of the study
30. COBWEB
- Conceptual clustering technique
- Forms a hierarchy to capture knowledge
- Deals only with categorical (nominal) data
- Cluster quality is measured by Category Utility
- This measure is expensive to compute
- Instance ordering has an impact on the resulting clustering
31. CLARANS
- A type of K-Medoids algorithm; differs by its randomized partial search strategy
- The clustering problem is represented by a graph
- Limitations
- Convergence to a definite local minimum
- Efficiency considerations
32. BIRCH
- Motivated by the fact that available memory is limited
- One iteration ends with a successful clustering
- Applicable to large data sets
- But sensitive to parameter settings
33. Experimental Results of MIP-Max
34. Comparison with the Results of K-Means