Based on the book

About This Presentation

Title:

Based on the book

Description:

Data Mining Applications for CRM. Based on the book Building Data Mining Applications for CRM by Alex Berson, Stephen Smith, and Kurt Thearling. Data Mining ...

Slides: 169

Provided by: circusofl

Transcript and Presenter's Notes

Title: Based on the book

1
Data Mining Applications for CRM

Based on the book Building Data Mining
Applications for CRM
By
Alex Berson
Stephen Smith
Kurt Thearling

2
Data Mining Applications for CRM

Summary of Topics
1. Customer Relationship Management: Framework and Architecture
2. Reinforcing CRM with Data Mining
3. Data Mining: An Overview
4. Key Terms
5. Data Mining Methodology
6. Classical Techniques: Statistics, Neighborhoods, and Clustering
7. Next Generation Techniques: Trees, Networks, and Rules
8. CRM: The Business Perspective
9. Deploying Data Mining for CRM
10. Data Quality
11. Next Generation of Information Mining and Knowledge Discovery for Effective CRM
12. CRM in the e-Business World

3
Topic 1 Customer Relationship Management-Framework and Architecture

CRM is an enterprise approach to customer service
that uses meaningful communication to understand
and influence consumer behavior. The purpose of
the process is twofold:
a. To impact all aspects of the consumer
relationship (e.g., improve customer
satisfaction, enhance customer loyalty, and
increase profitability), and
b. To ensure that employees within an
organization are using CRM tools.
The need for
greater profitability requires an organization to
proactively pursue its relationships with
customers.

4
Customer Relationship Management-Framework and
Architecture

Which customers are most profitable to me? Why?
What promotions are most effective? For which
customers?
What kind of customers will be interested in my
new product?
What customers are at risk to defect to my
competitor?
How do I identify prospects with the greatest
profit potentials?
Customer information is rapidly becoming a
company's most
important asset for answering these questions.
However, to answer these
questions in broad generalities is not enough.
Each customer must be
analyzed and potentially treated uniquely.
Customer Relationship
Management provides the framework for analyzing
customer
profitability and improving marketing
effectiveness.

5
Customer Relationship Management -Framework and
Architecture

Many organizations have collected and stored a
wealth of data about their
customers, suppliers, and business partners.
However, the inability to
discover valuable information hidden in the data
prevents these organizations
from transforming this data into knowledge. The
business desire is, therefore, to
extract valid, previously unknown, and
comprehensible information from large
databases and use it for profit. To fulfill
these goals, organizations need to
follow these steps:
- Capture and integrate both the internal and
external data into a
comprehensive view that encompasses the whole
organization.
- Mine the integrated data for information.
- Organize and present the information with
knowledge for decision-making.

6
Data, Information, and Decision

Data Resource Management (DRM)
MIS (OLTP) OOAD
KM (Knowledge Mgt), KWS (Knowledge Work Systems)
DSS ESS, EIS (Executive Information Systems)
Data Warehousing/Data Mart/Data Mining/OLAP
(Executive, Collaborative and individual
levels)
Business Intelligence

Data
Information (Data Process)
Knowledge/Business Intelligence
Decision (Information
Knowledge)
Data/Information/Decision /Business Intelligence

7
Customer Relationship Management --Framework and
Architecture

From an architectural point of view, the entire
CRM framework can
be classified into three key components:
- Operational CRM: The automation of horizontally
integrated business processes, including customer
touch-points, channels, and front-back office
integration.
- Analytical CRM: The analysis of data created by
the Operational CRM.
- Collaborative CRM: Applications of collaborative
services, including e-mail, personalized
publishing, e-communities, and similar vehicles
designed to facilitate interactions between
customers and organizations.

8
CRM Architecture
[Architecture diagram. Labels recoverable from the slide: Business Rules and Metadata Management; Data Sources; Transaction History; Contact History; Customer Profile and Account; External Data; Other; ETL Tools; Market Data Store; Marketing Data Marts; Analytics Data Mart; Reporting Data Mart; Decision Support Applications; Data Mining Analytics; Campaign Mgt; Contact Mgt; Communication Channels; Direct Mails; Call Center; Customer Service Center; Internet; E-mail; Workflow Management]
9
Campaign MGT Software-Managing Campaigns

- Accommodation of many new touch points besides
direct mail, for example, the Web, direct TV ads,
hard-copy advertising, customer services, street
brochure dispatch, and signage.
- Focus on profitability (not only on which
customer was most profitable, but also on what
was the most profitable promotion that could be
sent, e.g., send a .025 postcard rather than the
25 rebate if both have the same effect).
- Optimization of the sequence of promotion
delivery.
- Tools for constructing experiments that allow
marketing professionals to test the
effectiveness of new promotions and new
segmentation techniques, for example, using different
contents and timing for signage advertising.
- Accommodation by the system of predictive
modeling from data mining, which provides
insights into future customer behavior and future
customer profitability.

10
Web-Enabled Information Delivery
Structured Content
Query Engine Analytics Drill Down Agents
Web Browser
Web Server
SQL
CGI
HTML
HTML
Unstructured Content

What about the web log, or blog, which has become
a popular source for information acquisition?
11
Topic 2 Reinforcing CRM with Data Mining

Companies worldwide are beginning to realize that
surviving an intensely competitive and global
marketplace requires closer relationships with
customers. In turn, enhanced customer
relationships can boost profitability in three ways:
a) by reducing costs by attracting more suitable
customers, b) by generating profits through
cross-selling and up-selling activities, and c)
by extending profits through customer retention.
Slightly expanded explanations of these
activities follow.

12
Reinforcing CRM with Data Mining

- Attracting more suitable customers: Data
mining can help firms understand which customers
are most likely to purchase specific products and
services, thus enabling businesses to develop
targeted marketing programs for higher response
rates and better returns on investment.
- Better cross-selling and up-selling:
Businesses can increase their value proposition
by offering additional products and services that
are actually desired by customers, thereby
raising satisfaction levels and reinforcing
purchasing habits.
- Better retention: Data-mining techniques can
identify which customers are more likely to
defect and why. A company can use this
information to generate ideas that allow it to
retain these customers.

13
DW Technologies and Tools-An Overview
Data Acquisition
Data Storage
Information Delivery
OLAP
Source Systems
Data Modeling
DW/ Data Marts
Extraction
Report Writer
Data Loading
Transformation
Staging Area
Quality Assurance
Load Image Creation
Alert Systems
Data Mining
14
DW Information Flow
15
Data Warehouse Database

The central data warehouse database is a
cornerstone of the data warehousing
environment. On the architecture diagram, the
database is almost always
implemented on relational database management
system (RDBMS)
technology. Newer approaches include the
following:
- Multidimensional databases (MDDBs): These are
tightly coupled with the online analytical
processing (OLAP) tools that act as clients to
the multidimensional data stores.
- An innovative approach to speed up a traditional
RDBMS by using new index structures to bypass
relational table scans.
- Parallel relational database designs that require
a parallel computing platform, for example,
symmetric multiprocessors (SMPs), massively
parallel processors (MPPs), and/or clusters of
uni- or multiprocessors.

16
Information Delivery Tool Taxonomy

Tools are generally divided into five main groups
Data query and reporting tools.
Application development tools.
Executive Information System (EIS) tools.
Online analytical processing tools.
Data mining tools.

17
Topic 3 Data Mining An Overview

Data mining can help reduce information overload
and improve decision making. This is achieved by
extracting and refining useful knowledge through
a process of searching for relationships and
patterns from the extensive data collected by
organizations. The extracted information is used
to predict, classify, model, and summarize the
data being mined. Data-mining technologies, such
as rule induction, neural networks, genetic
algorithms, fuzzy logic, and rough sets, are used
for classification and pattern recognition in
many industries.

18
Data Mining An Overview

A supermarket organizes its merchandise stock
based on shoppers' purchase patterns.
An airline reservation system uses customers'
travel patterns and trends to increase seat
utilization.
Web pages alter their organizational structure or
visual appearance based on information about the
person who is requesting the pages.
Individuals perform a Web-based query to find the
median income of households in Iowa.

19
Data Mining An Overview

Data mining builds models of customer
behavior by using established statistical and
machine-learning techniques. The basic objective
is to construct a model for one situation in
which the answer or output is known and then
apply that model to another situation in which
the answer or output is sought. The best
applications of the above techniques are
integrated with data warehouses and other
interactive, flexible business analysis tools.
The analytic data warehouse can thus improve
business processes across the organization in
areas such as campaign management, new product
rollout, and fraud detection.

20
Data Mining An Overview

Data mining integrates different
technologies to populate, organize, and manage
the data store. Because quality data is crucial
to accurate results, data-mining tools must be
able to clean the data, making it consistent,
uniform, and compatible with the data store. Data
mining employs several techniques to extract
important information. Operations are the actions
that can be performed on accumulated data,
including predictive modeling, database
segmentation, link analysis, and deviation
detection.

21
Taxonomy of Data Mining Tools

We can divide the entire data mining tool market
into three main
groups: general-purpose tools, integrated
DSS/OLAP/data mining
tools, and rapidly growing, application-specific
tools.
The general-purpose tools, which occupy the larger
and more mature
segment of the market, include the following:
SAS Enterprise Miner
IBM Intelligent Miner
Unica PRW
SPSS Clementine
SGI MineSet
Oracle Darwin
Angoss KnowledgeSEEKER

22
Taxonomy of Data Mining Tools

The integrated data mining tool segment addresses
a very real and
compelling business requirement of having a
single multi-function,
decision-support tool that can provide management
reporting, online
analytical processing, and data mining
capabilities within a common
framework. Examples of these integrated tools
include Cognos
Scenario and Business Objects.
The application-specific tools segment is rapidly
gaining momentum.
Among these tools are the following:

KDI (focuses on retail)
Options & Choices (focuses on insurance
industries)
HNC (focuses on fraud detection)
Unica Model 1 (focuses on marketing)

23
Database Mining Workstation (HNC)

HNC is one of the most successful data mining
companies. Its Database
Mining Workstation (DMW) is a neural network tool
that is widely accepted
for credit card fraud analysis applications. DMW
consists of Windows-based
software applications and a custom processing
board. Other HNC products
include Falcon and ProfitMax processing
applications for financial services,
and the Advanced Telecommunications Abuse Control
System (ATACS)
fraud-detection solution that HNC plans to deploy
in the telecommunications
industry.

24
Taxonomy of Data Mining Tools

There are specific tools, for example, for the
following applications:
- Financial data analysis: Neural networks have
been used in forecasting stock prices, option
trading, rating bonds, portfolio management,
commodity-price prediction, and mergers and
acquisitions analysis. Using IBM Intelligent
Miner, Mellon Bank developed a credit
card-attrition model to predict which customers
will stop using Mellon's credit card in the next
few months.
- Telecommunications industry: The
hyper-competitive nature of the industry has
created a need to understand customers, to keep
them, and to model effective ways to market new
products.

25
Taxonomy of Data Mining Tools

- Retail industry: Retail data mining can help
identify customer-buying behaviors and discover
consumer-shopping patterns and trends.
- Healthcare and biomedical research: The analysis
of large quantities of time-stamped data will
provide doctors with important information
regarding the progress of the disease. For example,
NeuroMedicalSystems used neural networks to
build a pap smear diagnostic aid.
- Science and engineering: To improve its
manufacturing process, Boeing has successfully
applied machine-learning algorithms to the
discovery of informative and useful rules from
its plant data.

26
Data Mining vs. Data Warehouse

Major challenge to exploit data mining is
identifying suitable data to mine.
Data mining requires single, separate, clean,
integrated, and self-consistent source of data.
A data warehouse is well equipped for providing
data for mining.
Data quality and consistency are prerequisites
for mining to ensure the accuracy of the
predictive models. Data warehouses are populated
with clean, consistent data.

27
Data Mining vs. Data Warehouse

Data Mining does not require that a Data
Warehouse be built. Often, data can be downloaded
from the operational files to flat files that
contain the data ready for the data mining
analysis.
Data Mining can be implemented rapidly on
existing software and hardware platforms. Data
Mining tools can analyze massive databases to
deliver answers to questions such as, Which
customers are most likely to respond to my next
promotional mailing, and why?

28
Data Mining vs. Data Warehouse

Advantageous to mine data from multiple sources
to discover as many interrelationships as
possible. Data warehouses contain data from a
number of sources.
Selecting relevant subsets of records and fields
for data mining requires query capabilities of
the data warehouse.
Results of a data mining study are useful if
there is some way to further investigate the
uncovered patterns. Data warehouses provide
capability to go back to the data source.

29
Data Mining vs. OLAP

They are two separate breeds of analysis with
entirely different objectives, not to mention
tools, skill sets, and implementation methods.

30
Data Mining vs. OLAP

With canned reports, ad hoc querying, and OLAP,
the
end user defines a hypothesis and determines
which data
to examine. With data mining, the tool identifies
the
hypothesis, and it actually tells the user where
in the data
to start the exploration process.

31
Data Mining vs. OLAP

Rather than using SQL to filter out values and
methodically
reduce the data into a concise answer set, data
mining uses
algorithms that exhaustively review the
relationships among
data elements to determine if any patterns exist.
The whole
purpose of data mining is to yield new business
information
that a business person can act on.

32
OLAP vs. Data Mining Tools
OLAP Tools
- Are ad hoc, shrink-wrapped tools that provide an
interface to data
- Are used when you have specific known questions
- Look and feel like a spreadsheet that allows
rotation, slicing, and graphing
- Can be deployed to a large number of users

Data Mining Tools
- Provide methods for analyzing multiple data types
-- Regression trees
-- Neural networks
-- Genetic algorithms
- Are used when you don't know what the questions
are
- Are usually textual in nature
- Are usually deployed to a small number of analysts

33
Topic 4 Key Terms

Application Service Providers
Offer outsourcing solutions that supply,
develop, and manage application specific software
and hardware so that customers' internal
information technology resources can be freed up.
Business Intelligence
The type of detailed information that
business managers need for analyzing sales
trends, customers' purchasing habits, and other
key performance metrics in the company.

34
Key Terms

Categorical Data
Fits into a small number of distinct
categories of a discrete nature, in contrast to
continuous data, and can be ordered (ordinal),
for example, high, medium, or low temperatures,
or nonordered (nominal), for example, gender or
city.
Classification
The distribution of things into classes or
categories of the same type, or the prediction of
the category of data by building a model based on
some predictor variables.

35
Key Terms

Clustering
Groups of items that are similar as
identified by algorithms. For example, an
insurance company can use clustering to group
customers by income, age, policy types, and prior
claims. The goal is to divide a data set into
groups such that records within a group are as
homogeneous as possible and groups are as
heterogeneous as possible. When the categories
are unspecified, this may be called unsupervised
learning.
Genetic Algorithm
Optimization techniques based on
evolutionary concepts that employ processes such
as genetic combination, mutation, and natural
selection in a design.

36
Key Terms

Online Profiling
The process of collecting and analyzing data
from Web site visits, which can be used to
personalize a customer's subsequent experiences
on the Web site. Network advertisers, for
example, can use online profiles to track a
user's visits and activities across multiple Web
sites, although such a practice is controversial
and may be subject to various forms of
regulation.
Rough Sets
A mathematical approach to extract knowledge
from imprecise and uncertain data.

37
Key Terms

Rule Induction
The extraction of valid and useful
if-then-else rules from data based on their
statistical significance levels, which are
integrated with commercial data warehouse and
OLAP platforms.
Visualization
Graphically displayed data from simple
scatter plots to complex multidimensional
representations to facilitate better
understanding.

38
Topic 5 Data Mining Methodology

The methodology used today in data mining, when
it is well thought
out and well executed, consists of just a few
very important concepts.
- Finding a pattern in the data and building a
model. In general, a pattern means any sequence or
arrangement of data that occurs more often than one
would expect it to if it were a random event.
- Sampling, or not having to use all of the data in
order to draw significant conclusions about what
might be happening with other parts of the data.
- Validating the predictive models that arise out
of data mining algorithms.
- Finally, finding the pattern or
model that is the best.
The four parts of data mining technology are
patterns, sampling, validation,
and choosing the model.

39
Pattern and Model

Pattern: An event or combination of events in a
database that occurs more
often than expected. Typically, this means that
its actual occurrence is
significantly different from what would be expected by
random chance (for example,
the sequence 121212).
Model: A description that adequately explains and
predicts relevant data but
is generally much smaller than the data itself.
For real-world applications, a
model can be anything from a mathematical
equation, to a set of rules that
describes customer segments, to the computer
representation of a complex
neural network architecture, which translates to
several sets of mathematical
equations.
Predictive model: A model created or used to
perform prediction, in contrast
to models created solely for pattern detection,
exploration, or general
organization of the data.

40
Types Of Models
Descriptive: The dealer sold 200 cars last month. (Operational, OLTP)
Explanatory: For every increase of 1% in the interest rate, auto sales decrease by 5%. (Traditional DW, OLAP)
Predictive: Predictions about future buyer behavior. (Data Mining)
41
A high-level View of Modeling Process
Historical Data
Model Building
Prediction
Record ???
123
Model
42
The Needs for Sampling

Containing costs
Speeding up the data gathering
Improving effectiveness
Reducing bias

43
Sampling Design

Four steps
Determine the data to be collected or described
Determine the population to be sampled
Choose the type of sample
Decide on the sample size
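
The four design steps can be sketched with a simple random sample, which is only one of several possible sample types. The customer records, the `simple_random_sample` helper, and the sample size of 100 are hypothetical and chosen purely for illustration.

```python
import random

def simple_random_sample(population, sample_size, seed=0):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(population, sample_size)

# Step 1: the data to be described (hypothetical customer spend).
# Step 2: the population to be sampled.
population = [{"customer_id": i, "spend": 50 + (i % 7) * 10} for i in range(1000)]

# Steps 3 and 4: choose the sample type (simple random) and the sample size.
sample = simple_random_sample(population, sample_size=100)

population_mean = sum(c["spend"] for c in population) / len(population)
sample_mean = sum(c["spend"] for c in sample) / len(sample)
print(f"population mean: {population_mean:.2f}, sample mean: {sample_mean:.2f}")
```

Comparing the two means gives a rough feel for how much is lost by not using all of the data.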

44
Two Types of Data Mining Modeling- Verification
and Discovery

The verification model utilizes a process that
looks in a database to detect trends and patterns
in data that will help answer some specific
questions about the business.
In this mode, the user generates a hypothesis
about the data, issues a query against the data,
and examines the results of the query. The results
either verify the hypothesis or lead the user
to decide that the hypothesis is not valid.

45
Verification Model

In this model, very little information is created
in the extraction process: either the hypothesis
is verified or it is not.
Common tools used in this mode are queries,
multidimensional analysis, and visualization. What
they all have in common is that the user is
essentially guiding the exploration of the data
being inspected.

46
Discovery Model

A more popular model is the Discovery Model that
utilizes a process that looks in a database to
discover and/or predict future patterns. The
discovery model is divided into two modes
Descriptive and Predictive.

47
Discovery Model- Descriptive Mode

The Descriptive mode finds hidden patterns
without a predetermined idea or hypothesis about
what the patterns may be. In other words, the
Data Mining software or program takes the
initiative in finding what the interesting
patterns are, without the user thinking of the
relevant questions first. In this mode,
information is created about the data with very
little or no guidance from the user. The exploration
of the data is done in such a way as to yield as
large a number of useful facts about the data as
possible in the shortest amount of time.

48
Discovery Model- Predictive Mode

In the Predictive mode patterns discovered from
the database are used to predict the future
patterns or trends. Predictive modeling allows
the user to submit records with some unknown
field values, and the system will guess the
unknown values based on previous patterns
discovered from the database.
In comparing the two models, one can state that
verification can be very inefficient, time-consuming,
and costly, whereas discovery modeling can be
very efficient and cost effective, is less dependent on
user input, and increases modeling accuracy.

49
Predictive Modelling

Similar to the human learning experience
uses observations to form a model of the
important characteristics of some phenomenon.
Uses generalizations of real world and ability
to fit new data into a general framework.
Can analyze a database to determine essential
characteristics (model) about the data set.

50
Predictive Modelling

The model is developed using a supervised learning
approach, which has two phases: training and
testing.
Training builds a model using a large sample of
historical data called a training set.
Testing involves trying out the model on new,
previously unseen data to determine its accuracy
and physical performance characteristics.
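
The two phases can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the historical records are synthetic, the 75/25 split is an arbitrary choice, and the "model" is a hypothetical single age threshold learned from the training set.

```python
import random

def train_threshold(training_set):
    """Training: pick the age threshold that best separates buyers from non-buyers."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({age for age, _ in training_set}):
        correct = sum((age > threshold) == bought for age, bought in training_set)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def accuracy(threshold, records):
    """Share of records whose known outcome matches the model's prediction."""
    return sum((age > threshold) == bought for age, bought in records) / len(records)

# Hypothetical historical data: (age, bought_property), outcome already known.
ages = [20 + (i * 7) % 51 for i in range(200)]
history = [(age, age > 45) for age in ages]
random.Random(42).shuffle(history)

# Hold back a quarter of the records so the model is tested on unseen data.
training_set, test_set = history[:150], history[150:]

model = train_threshold(training_set)
print("learned threshold:", model)
print("training accuracy:", accuracy(model, training_set))
print("test accuracy:", accuracy(model, test_set))
```

The test accuracy, not the training accuracy, is the figure that indicates how the model is likely to behave on new records.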

51
Predictive Modelling

Applications of predictive modelling include
customer retention management, credit approval,
cross selling, and direct marketing.
Two techniques associated with predictive
modelling
A. classification
B. value prediction, distinguished by nature
of the variable being predicted.

52
Predictive Modelling - Classification

Used to establish a specific predetermined class
for each record in a database from a finite set
of possible class values.
Two specializations of classification: tree
induction and neural induction.

53
Example of Classification using Tree Induction
54
Example of Classification using Tree Induction
Customer renting property > 2 years?
  No: Rent property
  Yes: Customer age > 45?
    No: Rent property
    Yes: Buy property
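
Read as code, the tree on this slide is a pair of nested tests. The sketch below assumes the slide's "gt" stands for "greater than" and that the first test is the root; the function name `classify_customer` and its parameters are hypothetical.

```python
def classify_customer(years_renting, age):
    """Follow the tree from the root to a leaf and return the predicted class."""
    if years_renting > 2:      # root test: renting for more than 2 years?
        if age > 45:           # second test: older than 45?
            return "Buy property"
        return "Rent property"
    return "Rent property"

print(classify_customer(years_renting=3, age=50))
print(classify_customer(years_renting=3, age=30))
print(classify_customer(years_renting=1, age=50))
```

Tree induction is the process of learning such tests from data; here they are simply written out by hand.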
55
Example of Classification using Neural Induction
56
Example of Classification Using Neural Induction

Each processing unit (circle) in one layer is
connected to each processing unit in the next
layer by a weighted value, expressing the
strength of the relationship. The network
attempts to mirror the way the human brain works
in recognizing patterns by arithmetically
combining all the variables with a given data
point.
In this way, it is possible to develop nonlinear
predictive models that learn by studying
combinations of variables and how different
combinations of variables affect different data
sets.

57
Predictive Modelling - Value Prediction

Used to estimate a continuous numeric value that
is associated with a database record.
Uses the traditional statistical techniques of
linear regression and non-linear regression.
Relatively easy-to-use and understand.

58
Predictive Modelling - Value Prediction

Linear regression attempts to fit a straight line
through a plot of the data, such that the line is
the best representation of the average of all
observations at that point in the plot.
The problem is that the technique only works well
with linear data and is sensitive to the presence
of outliers (i.e., data values that do not
conform to the expected norm).
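
The sensitivity to outliers can be seen in a small sketch. It assumes the usual ordinary least squares fit, which the slide does not name explicitly; the data points and the `fit_line` helper are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data lying exactly on y = 2x + 1.
clean = [(x, 2 * x + 1) for x in range(10)]
slope_clean, intercept_clean = fit_line(clean)

# The same data with a single outlier replacing the last observation.
with_outlier = clean[:-1] + [(9, 100)]
slope_outlier, intercept_outlier = fit_line(with_outlier)

print(f"clean fit:        slope={slope_clean:.2f}, intercept={intercept_clean:.2f}")
print(f"with one outlier: slope={slope_outlier:.2f}, intercept={intercept_outlier:.2f}")
```

One unusual value out of ten is enough to move the fitted slope far from the slope of the other nine points.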

59
Predictive Modelling - Value Prediction

Although non-linear regression avoids the main
problems of linear regression, it is still not flexible
enough to handle all possible shapes of the data
plot.
Statistical measurements are fine for building
linear models that describe predictable data
points; however, most data is not linear in
nature.

60
Predictive Modelling - Value Prediction

Data mining requires statistical methods that can
accommodate non-linearity, outliers, and
non-numeric data.
Applications of value prediction include credit
card fraud detection or target mailing list
identification.

61
Database Segmentation

Aim is to partition a database into an unknown
number of segments, or clusters, of similar
records.
Uses unsupervised learning to discover
homogeneous sub-populations in a database to
improve the accuracy of the profiles.

62
Database Segmentation

Less precise than other operations thus less
sensitive to redundant and irrelevant features.
Sensitivity can be reduced by ignoring a subset
of the attributes that describe each instance or
by assigning a weighting factor to each variable.
Applications of database segmentation include
customer profiling, direct marketing, and cross
selling.

63
Example of Database Segmentation using a Scatter
plot
64
Database Segmentation

Associated with demographic or neural clustering
techniques, distinguished by
Allowable data inputs
Methods used to calculate the distance between
records
Presentation of the resulting segments for
analysis.
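
One common way to calculate the distance between records is a weighted Euclidean distance, where a weight of zero ignores an attribute altogether. This is a sketch of that idea only, not the measure used by any specific clustering product; the attribute names, values, and weights are hypothetical.

```python
import math

def weighted_distance(record_a, record_b, weights):
    """Weighted Euclidean distance over the attributes named in `weights`."""
    return math.sqrt(sum(
        w * (record_a[attr] - record_b[attr]) ** 2
        for attr, w in weights.items()
    ))

# Hypothetical customer records with attributes already scaled to 0..1.
a = {"income": 0.9, "age": 0.2, "postcode_noise": 0.1}
b = {"income": 0.8, "age": 0.3, "postcode_noise": 0.9}

all_equal = {"income": 1.0, "age": 1.0, "postcode_noise": 1.0}
ignore_noise = {"income": 1.0, "age": 1.0, "postcode_noise": 0.0}

print(round(weighted_distance(a, b, all_equal), 3))
print(round(weighted_distance(a, b, ignore_noise), 3))
```

With the irrelevant attribute weighted out, the two customers come out as much closer neighbours, which is the effect the weighting factor is meant to achieve.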

65
Example of Database Segmentation using a
Visualization
66
Link Analysis

Aims to establish links (associations) between
records, or sets of records, in a database.
There are three specializations
Associations discovery
Sequential pattern discovery
Similar time sequence discovery
Applications include product affinity analysis,
direct marketing, and stock price movement.

67
Link Analysis - Associations Discovery

Finds items that imply the presence of other
items in the same event.
Affinities between items are represented by
association rules.
e.g. When a customer rents property for more than
2 years and is more than 25 years old, in 40% of
cases, the customer will buy a property. This association
happens in 35% of all customers who rent
properties.
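
The two figures attached to an association rule are usually called confidence and support. The sketch below uses the standard definitions (support as the share of all records matching both sides of the rule, confidence as the share of antecedent matches that also match the consequent); the ten renter records are invented and the helper name is hypothetical.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    matches_antecedent = [r for r in records if antecedent(r)]
    matches_both = [r for r in matches_antecedent if consequent(r)]
    support = len(matches_both) / len(records)
    # Assumes at least one record matches the antecedent.
    confidence = len(matches_both) / len(matches_antecedent)
    return support, confidence

# Hypothetical renters: (years_renting, age, bought_property).
renters = [
    (3, 30, True), (4, 41, True), (5, 28, False), (3, 52, False), (6, 33, False),
    (1, 45, False), (2, 23, False), (1, 31, True), (3, 22, False), (0, 60, False),
]

support, confidence = support_and_confidence(
    renters,
    antecedent=lambda r: r[0] > 2 and r[1] > 25,   # rents > 2 years and age > 25
    consequent=lambda r: r[2],                     # buys a property
)
print(f"support={support:.0%}, confidence={confidence:.0%}")
```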

68
Link Analysis - Sequential Pattern Discovery

Finds patterns between events such that the
presence of one set of items is followed by
another set of items in a database of events over
a period of time.
e.g. Used to understand long term customer buying
behaviour.

69
Link Analysis - Similar Time Sequence Discovery

Finds links between two sets of data that are
time-dependent, and is based on the degree of
similarity between the patterns that both time
series demonstrate.
e.g. Within three months of buying property, new
home owners will purchase goods such as cookers,
freezers, and washing machines.

70
Deviation Detection

Relatively new operation in terms of commercially
available data mining tools.
Often a source of true discovery because it
identifies outliers, which express deviation from
some previously known expectation and norm.

71
Deviation Detection

Can be performed using statistics and
visualization techniques or as a by-product of
data mining.
Applications include fraud detection in the use
of credit cards and insurance claims, quality
control, and defects tracing.
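
A simple statistical form of deviation detection flags values that lie far from the mean. The sketch below uses a z-score style rule with a cut-off of two standard deviations, which is an arbitrary illustrative choice; the card charges and the `find_deviations` helper are hypothetical.

```python
import statistics

def find_deviations(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > threshold * stdev]

# Hypothetical daily card charges for one customer.
charges = [42, 38, 45, 40, 41, 39, 44, 43, 37, 900]
print(find_deviations(charges))
```

In a fraud-detection setting the flagged charge would be passed on for review rather than treated as proof of fraud.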

72
A Summary Data-Driven Techniques

Data Visualization
Decision Trees
Clustering
Factor Analysis
Neural Network
Association Rules
Rule Induction
Based on Sakhr Youness's book Professional
Data Warehousing with SQL Server 7.0 and OLAP
Services

73
Data Visualization
A pie chart showing the sales of a product by
region is sometimes much more effective than
presenting the same data in a text or tabular
form.
[Pie chart: sales of a product by region (Northeast, South, North, West, East); slice values 9, 11, 39, 21, and 20]
74
Decision Tree
75
Cluster Analysis
First segment (high income > 8,000)
Have children
Second segment (8,000 > middle income > 3,000)
Married
Last car is a used one
Third segment (low income < 3,000)
Own car
76
Factor Analysis

Unlike cluster analysis, factor analysis builds a
model from data. The technique finds underlying
factors, also called latent variables, and
provides models for these factors based on
variables in the data. For example, a software
company is considering a survey to find out the
nine most perceived attributes of one of its
products. It might group these attributes into
categories such as service for technical support,
availability for training, and a help system.
Factor analysis is also used for grouping together
products based on a similarity of buying patterns
so that vendors may bundle several products and
sell them together at a lower price than
the sum of their individual prices.

77
Neural Networks
78
Association Rules

Association models are models that examine the
extent to which values of one field depend on, or
are produced by, values of another field. These
models are often referred to as Market Basket
Analysis when they are applied to retail
industries to study the buying patterns of these
customers, especially in grocery and retail
stores that issue their own credit cards.
Charging against these cards gives the store the
chance to associate the purchases of customers
with their identities, which allows them to study
associations among other things.

79
Rules Induction

This is a powerful technique that involves a
large number of rules using a set of if-then
statements in the pursuit of all possible
patterns in the dataset. For example, if the customer
is male, is between 30 and 40 years
of age, and has an income of less than 50,000 and
more than 20,000, he is likely to be driving a
car that was bought new.

80
A Summary Theory-Driven Techniques

Correlations
T-Tests
Analysis of Variance
Linear Regression
Logistic Regression
Discriminant Analysis
Forecasting Methods

81
Validating and Picking the Model

Validating any model that comes out of a data
mining tool is going to be the
most important thing that you can do. The
validation required for data
mining is that after you build the model on some
historical data, you apply
the model to similar historical data from which
the model was not built.
Because the data is historical, you already know
the outcome so that the
accuracy of the predictive model can be measured.
One of the most important things that needs to be
done when you are
building a predictive model is to make sure that
you have picked up the
essential patterns in the data that will hold
true the next time you apply
your model.
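
The validation step described above can be sketched in a few lines. This is only an illustration, not the book's procedure: the churn records, the one-rule `build_model` function, and the half-and-half split are all hypothetical choices made for the example.

```python
# Hold-out validation sketch: build a model on one part of the historical
# data, then measure accuracy on historical data the model never saw.
# The records and the simple threshold rule are hypothetical.

def build_model(train):
    """Pick the tenure threshold that best separates churners in `train`."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({months for months, _ in train}):
        correct = sum((months < threshold) == churned for months, churned in train)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def accuracy(threshold, records):
    """Share of records where 'tenure below threshold' matches the known outcome."""
    hits = sum((months < threshold) == churned for months, churned in records)
    return hits / len(records)

# (months as a customer, churned?) pairs; outcomes are already known.
history = [(2, True), (30, False), (5, True), (24, False),
           (3, True), (36, False), (8, True), (18, False)]

train, holdout = history[:4], history[4:]   # the model is built only on `train`
model = build_model(train)
print("threshold:", model)
print("accuracy on training data:", accuracy(model, train))
print("accuracy on held-out data:", accuracy(model, holdout))
```

A model that looks perfect on the data it was built from can still miss cases in the held-out data, which is why the held-out figure is the one to report.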

82
Three Additional Ways in Which Data Mining
Supports CRM Initiatives

1. Database marketing
2. Customer acquisition
3. Campaign optimization

83
Database Marketing

Data mining helps database marketers develop
campaigns that are closer to the targeted needs,
desires, and attitudes of their customers. If the
necessary information resides in a database, data
mining can model a wide range of customer
activities. The key objective is to identify
patterns that are relevant to current business
problems. For example, data mining can help
answer questions such as "Which customers are
most likely to cancel their cable TV service?"
and "What is the probability that a customer will
spend over $120 at a given store?" Answering
these types of questions can boost customer
retention and campaign response rates, which
ultimately increase sales and returns on
investment.

84
Database Marketing

Database marketing software enables companies to
send customers and prospective customers timely
and relevant messages and value propositions.
Modern campaign management software also monitors
and manages customer communications on multiple
channels including direct mail, telemarketing,
e-mail, the Internet, point of sale, and customer
service. Furthermore, this software can be used
to automate and unify diverse marketing campaigns
at their various stages of planning, execution,
assessment, and refinement. The software can also
launch campaigns in response to specific customer
behaviors, such as the opening of a new account.

85
Database Marketing

Generally, better business results are obtained
when data mining and campaign management work
closely together. For example, campaign
management software can apply the data-mining
model's scores to sharpen the definition of
targeted customers, thereby raising response
rates and campaign effectiveness. Furthermore,
data mining may help to resolve the problems that
traditional campaign management processes and
software typically do not adequately address,
such as scheduling, resource assignment, and so
forth. Although finding patterns in data is
useful, data mining's main contribution is
providing relevant information that enables
better decision making. In other words, it is a
tool that can be used along with other tools
(e.g., knowledge, experience, creativity,
judgment, etc.) to obtain better results. A
data-mining system manages the technical details,
thus enabling decision makers to focus on
critical business questions such as "Which
current customers are likely to be interested in
our new product?" and "Which market segment is
best for the launch of our new product?"

86
Customer Acquisition

The growth strategy of businesses depends
heavily on acquiring new customers, which may
require finding people who have been unaware of
various products and services, who have just
entered specific product categories (for example,
new parents and the diaper category), or who have
purchased from competitors. Although experienced
marketers often can select the right set of
demographic criteria, the process increases in
difficulty with the volume, pattern complexity,
and granularity of customer data. Highlighting
the challenges of customer segmentation has
resulted in an explosive growth in consumer
databases. Data mining offers multiple
segmentation solutions that could increase the
response rate for a customer acquisition
campaign. Marketers need to use creativity and
experience to tailor new and interesting offers
for customers identified through data-mining
initiatives.

87
Campaign Optimization

Many marketing organizations have a variety of
methods to interact with current and prospective
customers. The process of optimizing a marketing
campaign establishes a mapping between the
organization's set of offers and a given set of
customers that satisfies the campaign's
characteristics and constraints, defines the
marketing channels to be used, and specifies the
relevant time parameters. Data mining can elevate
the effectiveness of campaign optimization
processes by modeling customers' channel-specific
responses to marketing offers.

88
Topic 6 Classical Techniques Statistics,
Neighborhoods, and Clustering

Statistics can help to answer several important
questions about the data:
What patterns are there in my database?
What is the chance that an event will occur?
What patterns are significant?
What is a high-level summary of the data that
gives me some idea of what is contained in my
database?

89
Statistics --Histogram

The first step in understanding statistics is to
understand how the data is collected into a
higher-level form. One of the most notable ways
of doing this is with the histogram, which can
show, for example, the number of customers or the
amount of sales.
90
Histogram
[Figure: histogram of the number of customers (vertical axis, 500 to 3,000) by age (horizontal axis, bins starting at 1, 11, 21, ..., 81)]
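
A histogram of this kind is just a count of customers per age bin. The sketch below assumes ten-year bins starting at 1, 11, 21, and so on, to match the axis labels; the ages themselves are made up for illustration.

```python
from collections import Counter

def age_histogram(ages, bin_width=10):
    """Count customers per age bin; bins start at 1, 11, 21, ..."""
    counts = Counter(((age - 1) // bin_width) * bin_width + 1 for age in ages)
    return dict(sorted(counts.items()))

ages = [23, 25, 31, 38, 39, 42, 47, 51, 58, 64]   # hypothetical customer ages
for start, count in age_histogram(ages).items():
    # One '#' per customer gives a rough text version of the chart.
    print(f"{start:>2}-{start + 9:<2} {'#' * count} ({count})")
```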
91
Linear Regression Is Similar to the Task of
Finding the Line that Minimizes the Total
Distance to a Set of Data.
[Figure: scatter plot with a fitted line. Vertical axis: prediction (average consumer bank balance). Horizontal axis: predictor (consumer annual income)]
92
Linear Regression

The predictive model is the line shown in the
previous chart. The line will take a given value
for a predictor and map it into a given value
for a prediction. The actual equation would look
something like
Prediction = a + b × Predictor.
This is just the equation for a line,
Y = a + bX.
As an example for a bank, the predicted average
consumer bank balance might equal
$1,000 + 0.01 × the customer's annual income.
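
As a rough sketch of how such a line is found, the code below fits Y = a + bX by ordinary least squares, which minimizes the squared vertical distance from the points to the line. The function name `fit_line` and the income and balance figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x   # the fitted line passes through the means
    return a, b

incomes = [20_000, 40_000, 60_000, 80_000]    # predictor (hypothetical)
balances = [1_250, 1_350, 1_650, 1_750]       # prediction target (hypothetical)

a, b = fit_line(incomes, balances)
print(f"balance = {a:.2f} + {b:.4f} * income")
print("predicted balance at 50,000 income:", round(a + b * 50_000, 2))
```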

93
Linear Regression

Linear regression attempts to fit a straight line
through a plot of the data, such that the line is
the best representation of the average of all
observations at that point in the plot.
The problem is that the technique only works well
with linear data and is sensitive to the presence
of outliers (i.e., data values that do not
conform to the expected norm).

94
Linear Regression

Although non-linear regression avoids the main
problems of linear regression, it is still not
flexible enough to handle all possible shapes of
the data plot.
Statistical measurements are fine for building
linear models that describe predictable data
points; however, most data is not linear in
nature.

95
Linear Regression

Data mining requires statistical methods that can
accommodate non-linearity, outliers, and
non-numeric data.
Applications of value prediction include credit
card fraud detection or target mailing list
identification.

96
The Nearest Neighbor Prediction

One of the classic areas in which nearest neighbor
has been used for prediction is text retrieval.
The end user marks a document (for example, a
Wall Street Journal article) as being of interest,
and the documents whose characteristics are
nearest to those of the marked documents are the
ones most likely to be retrieved.
Another good example is that supermarkets tend to
put similar products in the same area, for
example, an apple closer to an orange than to a
tomato. Thus, if you know the predictive value of
one of the objects, you can predict it for its
nearest neighbors.
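
A minimal sketch of nearest neighbor prediction follows. It assumes numeric features and Euclidean distance, neither of which is specified in the slides; the function name and the customer records are hypothetical.

```python
import math

def nearest_neighbor_predict(known, query, k=1):
    """Predict a value for `query` by averaging its k closest known records."""
    by_distance = sorted(known, key=lambda item: math.dist(item[0], query))
    nearest = by_distance[:k]
    return sum(value for _, value in nearest) / len(nearest)

# (age, years at current address) -> annual income in thousands (hypothetical)
known = [((25, 1), 30.0), ((27, 2), 34.0), ((45, 10), 80.0), ((50, 12), 90.0)]

print(nearest_neighbor_predict(known, (26, 1)))         # single closest record
print(nearest_neighbor_predict(known, (47, 11), k=2))   # average of two closest
```

Using more than one neighbor (k greater than 1) smooths the prediction, at the cost of mixing in records that are less similar to the query.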

97
Data Clustering

Clustering analysis is an important means of
processing multimedia data. It is basically the
organization of a collection of patterns into
clusters of similar objects. Patterns within a
valid cluster are more similar to each other than
they are to a pattern in a different cluster.

98
Data Clustering

Clustering can allow us to carry out the
following activities that can help in query
processing:
Representing patterns in the data so that we can
reduce the size of the media
Defining a way of measuring the proximity of
different patterns in the data so that we can
find the instances that match our example.
Clustering or grouping the data in preparation
for matching
Data abstraction, particularly of features that
we can store as metadata
Assessing the output by estimating how good the
selection is.
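
Several of the activities above (measuring proximity, grouping, and assessing the result) can be sketched with k-means, one common clustering method. The slides do not name a specific algorithm, so k-means, Euclidean distance, seeding with the first k points, and the sample points are all assumptions made for this example.

```python
import math

def k_means(points, k, iterations=10):
    """Group points into k clusters; the first k points seed the centers."""
    centers = [list(p) for p in points[:k]]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Proximity measure: Euclidean distance to each cluster center.
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(c) / len(members) for c in zip(*members)]
    return centers, clusters

def within_cluster_spread(centers, clusters):
    """Sum of squared distances to the assigned center (lower means tighter)."""
    return sum(math.dist(p, centers[i]) ** 2
               for i, members in enumerate(clusters) for p in members)

points = [(1, 1), (9, 8), (1, 2), (2, 1), (8, 9), (9, 9)]   # hypothetical data
centers, clusters = k_means(points, k=2)
print(clusters)
print(round(within_cluster_spread(centers, clusters), 3))
```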

99
Clustering and Nearest Neighbor

A simple example of clustering is the grouping
that most people perform when they do the
laundry: sorting clothes into permanent press, dry
cleaning, whites, and brightly colored groups is
important because the items in each group have
similar characteristics.
A simple example of the nearest neighbor
prediction algorithm is what you see when you
look at the people in your neighborhood. You may
notice that, in general, you all have somewhat
similar incomes.

100
Statistical Analysis of Actual Sales (dollars and
quantities) relative to these Signage Variables-a
predictive modeling example.

Content
Frequency
Depth
Focus
Depth
Scale
Length
Location
Statistical Analysis: Correlation, Regression,
Experiment Design, Optimization. Now it goes into
real-time analysis.

101
Signage
102
Signage
103
Topic 7 Next Generation Techniques Decision
Trees, Networks, and Rules
A Decision Tree
Customer renting property > 2 years?
  No: Rent property
  Yes: Customer age > 45?
    No: Rent property
    Yes: Buy property
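
Read as code, a decision tree is a set of nested tests. The sketch below follows one reading of the flattened diagram on this slide (a first split on renting for more than 2 years, then a split on age over 45); the function and parameter names are hypothetical.

```python
def predict_housing(years_renting, age):
    """Walk the tree: first the renting-duration split, then the age split."""
    if years_renting > 2:
        if age > 45:
            return "Buy property"
        return "Rent property"
    return "Rent property"

print(predict_housing(years_renting=1, age=50))   # short-term renter
print(predict_housing(years_renting=5, age=30))   # long-term renter, younger
print(predict_housing(years_renting=5, age=50))   # long-term renter, older
```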
104
A Decision Tree
105
CART and CHAID

CART, which stands for Classification and
Regression Trees, is a data exploration and
prediction algorithm developed by Leo Breiman,
Jerome Friedman, Richard Olshen, and Charles
Stone. It is nicely detailed in their 1984 book,
Classification and Regression Trees (Breiman,
Friedman, Olshen, and Stone, 1984). These
researchers from Stanford University and the
University of California at Berkeley showed how
this new algorithm could be used on a variety of
different problems, such as the detection of
chlorine from the data contained in a mass
spectrum. One of the great advantages of CART is
that the algorithm has the validation of the
model and the discovery of the optimally general
model built deeply into the algorithm.
Another popular decision tree technology is CHAID
(Chi-Square Automatic Interaction Detector).
CHAID is similar to CART in that it builds a
decision tree, but it differs in the way that it
chooses its splits.

106
B Neural Networks

A neural network is loosely based on the way some
people believe that the human brain is organized
and how it learns. There are two main structures
of consequence in a neural network:
The node, which loosely corresponds to the neuron
in the human brain
The link, which loosely corresponds to the
connections between neurons (axons, dendrites,
and synapses) in the human brain.

107
Neural Networks
When a customer rents a property for more than 2
years and is more than 25 years old, in 40% of
cases the customer will buy a property. This
association occurs in 35% of all customers who
rent properties.
108
Example of Classification using Neural Induction

Each processing unit (circle) in one layer is
connected to each processing unit in the next
layer by a weighted value, expressing the
strength of the relationship. The network
attempts to mirror the way the human brain works
in recognizing patterns by arithmetically
combining all the variables with a given data
point.
In this way, it is possible to develop nonlinear
predictive models that learn by studying
combinations of variables and how different
combinations of variables affect different data
sets.

109
How Does a Neural Induction Make a prediction?

The age of 47 is normalized to fall between 0.0
and 1.0, giving the value 0.47, and the income is
normalized to the value 0.65. This simplified
neural network makes the prediction of no default
for a 47-year-old making $65,000. The links are
weighted at 0.7 and 0.1, and the resulting value,
after multiplying the node values by the link
weights and adding the results, is 0.39.

[Figure: two input nodes, Age (value 0.47, link weight 0.7) and Income (value 0.65, link weight 0.1), feeding one output node, Default (value 0.39)]
0.47(0.7) + 0.65(0.1) ≈ 0.39
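
The arithmetic on this slide can be checked with a few lines of code. The sketch assumes that age is normalized by dividing by 100 and income by dividing by 100,000, which reproduces the slide's 0.47 and 0.65 but is not stated in the source; like the slide, it applies no activation function to the weighted sum.

```python
def node_output(inputs, weights):
    """Weighted sum of the input node values, as in the slide's example."""
    return sum(value * weight for value, weight in zip(inputs, weights))

age = 47 / 100              # assumed scaling: age divided by 100
income = 65_000 / 100_000   # assumed scaling: income divided by 100,000

score = node_output([age, income], [0.7, 0.1])
print(round(score, 2))
```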
110
C Rule Induction

This is a powerful technique that involves a
large number of rules, expressed as a set of
if-then statements, in the pursuit of all possible
patterns in the dataset. For example, if the
customer is male, is between 30 and 40 years of
age, and has an income of more than 20,000 and
less than 50,000, then he is likely to be driving
a car that was bought new.

111
What Is A Rule?
Rule | Accuracy | Coverage
If breakfast cereal is purchased, then milk is purchased. | 85% | 20%
If bread is purchased, then Swiss cheese will be purchased. | 15% | 6%
If 42 years old and purchased pretzels and dry roasted peanuts, then beer will be purchased. | 95% | 0.01%
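
The two measures can be computed directly from transaction data. The sketch below assumes the usual reading of the terms: coverage is the share of all records to which the "if" part applies, and accuracy is the share of those records for which the "then" part also holds. The baskets and the function name are hypothetical.

```python
def rule_stats(baskets, antecedent, consequent):
    """Return (accuracy, coverage) for the rule 'if antecedent then consequent'."""
    applies = [b for b in baskets if antecedent <= b]   # 'if' part holds
    if not applies:
        return 0.0, 0.0
    correct = sum(1 for b in applies if consequent <= b)  # 'then' part also holds
    return correct / len(applies), len(applies) / len(baskets)

baskets = [   # hypothetical shopping baskets
    {"cereal", "milk"},
    {"cereal", "milk", "bread"},
    {"bread"},
    {"cereal"},
    {"milk"},
]

accuracy, coverage = rule_stats(baskets, {"cereal"}, {"milk"})
print(f"accuracy {accuracy:.0%}, coverage {coverage:.0%}")
```

A rule can be highly accurate yet apply to very few records, as with the third rule in the table, so both numbers are needed to judge its usefulness.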

112
Topic 8 CRM -The Business Perspective

Tools and technologies will be applied to real
business problems across a variety of industries.
They are:
Customer Profitability: provides a blueprint for
how to define and use customer profitability as
the bedrock for your CRM processes.
Customer Acquisition: shows how to use data
mining to acquire new customers in the most
profitable way possible.
Customer Cross-selling: details how the
technology architecture can be used to increase
the value of existing customers by selling more
to them.
Customer Retention: uses a case study from the
telecommunications industry to show how to
execute successful CRM systems to retain your
profitable customers.
Customer Segmentation: provides the business
methodology of how to segment and manage your
customers in a consistent and repeatable way
across the enterprise.

113
The Business-Centric View of Data Mining Process
[Figure: process diagram with the labels Business Problem, Data, Understand, Define Value, Data Definition, ROI Definition, Data Mining, Predictive Model, Predicted ROI, Application, Display, ROI]
114
Customer Profitability

Customer profitability is the bedrock of data
mining. Data mining earns its keep by helping you
to understand and improve customer profitability.
How does the organization define what a
profitable customer is versus an unprofitable
customer? Keeping a customer loyal can have
profound effects on per-customer profitability.
The effect of customer loyalty on customer
profitability also compounds over time, because
sales costs are lower and revenue generally
increases. Data mining can be used to predict
customer profitability under a variety of
different marketing campaigns.

115
A Customer Value Matrix Showing Recommended
Service Level
Segment | Current Value | Lifetime Value | Potential Value | Potential Lifetime Value | Customer Service Level | Best Service Level
1 | High | High | High | High | Gold | Gold
2 | High | Low | High | High | Gold | Gold
3 | High | Low | High | Low | Gold | Bronze
4 | Low | Low | Low | High | Bronze | Gold
5 | Low | Low | High | High | Bronze | Gold
6 | Low | Low | Low | Low | Bronze | Bronze
116
A Customer Value Matrix

Building this matrix should be one of the first
things that you do with data mining.
Segment 1 is your best customers. They will
remain your best customers throughout their lives,
and their current value matches their potential.
Segment 2 is similar, except that these customers
are likely to have low lifetime value, despite
their high value today, probably because they are
not loyal and are likely to switch to a competitor
at some time in their customer life.
Segments 4 and 5 represent customers who, with
the right care and service, can be transitioned
to high-value customers, either short-term or
long-term.
Segment 6 represents your low-value customers,
whom you will treat with some of your least
expensive services.

117
Customer Acquisition

The traditional approach to customer acquisition
involved a marketing manager developing a
combination of mass marketing (magazine
advertisements, billboards, etc.) and direct
marketing (telemarketing, mail, etc.) campaigns
based on their knowledge of the particular
customer base that was being targeted.
A marketing manager selects the demographics
(age, gender, interest in particular subjects,
etc.) and then works with a data vendor
(sometimes known as a service bureau) to obtain
lists of customers who meet those characteristics.
Although a marketer with a wealth of experience
can often choose relevant demographic selection
criteria, the process becomes more difficult as
the amount of data increases.
Data mining can help this process.

118
Defining Some Key Customer Acquisition Concepts

The responses that come in as a result of a
marketing campaign are called response behaviors.
Binary response behaviors (either a yes or a no)
are the simplest kind of response.
Beyond binary response behaviors are categorical
response behaviors, which allow for multiple
behaviors to be defined. The rules that define
the behaviors are based on the kind of business
you are involved in.
There are usually several different kinds of
positive response behaviors that can be
associated with an acquisition marketing
campaign. They are:
Customer inquiry
Purchase of the offered product or products
Purchase of a product different from the one
offered.

119
Response Analysis Broken Down By Behaviors
Behavior | Measure | 12/1/05 | 12/5/05 | 12/7/05 | 12/9/05 | Total
Inquiry | # of Responses | 1,556 | 1,340 | 328 | 352 | 3,576
Purchase A | # of Responses | 210 | 599 | 128 | 167 | 1,104
Purchase B | # of Responses | 739 | 476 | 164 | 97 | 1,476
Purchase C | # of Responses | 639 | 647 | 113 | 105 | 1,504
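
The Total column is simply the sum of the daily response counts for each behavior. A short check, using the daily figures from the table and a hypothetical dictionary layout:

```python
# Daily response counts per behavior, taken from the table above.
responses = {
    "Inquiry":    [1556, 1340, 328, 352],
    "Purchase A": [210, 599, 128, 167],
    "Purchase B": [739, 476, 164, 97],
    "Purchase C": [639, 647, 113, 105],
}

totals = {behavior: sum(counts) for behavior, counts in responses.items()}
for behavior, total in totals.items():
    print(f"{behavior:<10} {total:>6,}")
```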
120
Cross-Selling

Cross-selling is the process by which you offer
your existing customers new
products and services. Customers who purchase
baby diapers might also be
interested in hearing about your other baby
products.
One form of cross-selling, sometimes called
up-selling, takes place when the new offer is
related to existing purchases by the customer.
For example, an up-sell opportunity might exist
for a telephone company to market a premium
long-distance service to existing long-distance
customers who currently have the standard service.

121
How Cross-Selling Works

Assume that you are a marketing manager for a
mid-size bank. You
have the following products available for your
customers
Value checking account
Standard checking account
Gold credit card
Platinum credit card
Primary mortgage
Secondary mortgage
Of these products, you're responsible for
marketing the mortgage products to your
customers. Your goal is to find out which
customers might be interested in a mortgage
offering at least 60 days before they would apply
for the loan. It is important that any
predictions are made with sufficient lead time
(in this case, two months), so that any
interactions with the customers take place before
they are committed to a relationship with your
competition.

122
How Cross-Selling Works

You have already done some thinking about your
customers and their
motivations in this area and came up with several
scenarios, which you
presented to your boss when pitching this new
campaign
Customer preparing to buy a new home. These
customers might be building up cash reserves in
their checking and/or savings account in order to
put together a down payment.
Customer preparing to refinance an existing home.
These customers might be paying off credit card
debt (thus making them more acceptable from a
risk point of view), and hold a mortgage whose
interest rate is higher than the current interest
rate.
Customer preparing to add a second mortgage.
These customers might have increasing credit card
debt, an on-time payment history for their credit
cards and existing mortgage (which means that
they are a good risk), and enough equity in their
house to cover the outstanding credit card
balance.

123
Data Mining Process for Cross-Selling

The actual data mining process for cross-selling
contains three distinct steps:
Modeling of individual behaviors
Scoring data with predictive models
Optimization of the scoring matrices
Model: A description that adequately explains and
predicts relevant data but is generally much
smaller than the data itself. For real-world
applications, a model can be anything from a
mathematical equation, to a set of rules that
describes customer segments, to the computer
representation of a complex neural network
architecture, which translates to several sets of
mathematical equations.
Predictive model: A model created or used to
perform prediction, in contrast to models created
solely for pattern detection, exploration, or
general organization of the data.
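
The scoring and optimization steps can be pictured as a matrix of model scores, one row per customer and one column per offer, from which an offer is chosen for each customer. The slides do not say how the optimization is done; the sketch below uses the simplest possible choice (the highest-scoring offer per customer, subject to a minimum score), and the scores, customer IDs, and threshold are hypothetical.

```python
def best_offers(score_matrix, min_score=0.5):
    """For each customer, pick the highest-scoring offer, or None below min_score."""
    choices = {}
    for customer, scores in score_matrix.items():
        offer, score = max(scores.items(), key=lambda pair: pair[1])
        choices[customer] = offer if score >= min_score else None
    return choices

# Hypothetical model scores: estimated likelihood of responding to each offer.
scores = {
    "cust_1": {"primary mortgage": 0.72, "secondary mortgage": 0.10},
    "cust_2": {"primary mortgage": 0.05, "secondary mortgage": 0.61},
    "cust_3": {"primary mortgage": 0.20, "secondary mortgage": 0.15},
}

for customer, offer in best_offers(scores).items():
    print(customer, "->", offer or "no offer")
```

A real campaign would typically add constraints such as budget or contact limits, which this per-customer choice ignores.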

124
Customer Retention

As industries become more competitive and the
cost of acquiring new customers increases, the
value of retaining current customers also
increases. For instance, in the cellular phone
industry, it is estimated that the cost of
attracting and signing up a new customer is $300
or more when the costs of discounted hardware and
sales commissions are included. The cost of
retaining a current customer, however, can be as
low as the price of a phone call or as high as
the cost of updating their cellular phone to the
latest technology offering. Although the latter
is expensive, it is still significantly cheaper
than signing up a wholly new customer.

125
A Case Study- Cellular Phone Industry

Customer churn is the term used in the cellular
telephone industry to denote
the movement of cellular telephone customers from
one provider to another.
In many industries, this is called customer
attrition, but because of the highly
