Title: Jerry Held
1Open World 2003
2Data Warehousing for the Communications
Industry: A Data Mining Approach to
Customer Churn Analysis in Wireless Industry
Session id: 40332
- Shyam Varan Nath, Senior Database Engineer
- Daleen Technologies
3Introduction
- Oracle Data Mining
- JDeveloper
- DM4J
- Wireless Industry and Customer Churn
- Data Modeling for Churn Management
4WLNP Threatens to Significantly Impact Wireless
Churn Rates.
Source: In-Stat 2002
5Churn
North American wireless industry monthly churn
rate in Q4-02
[Chart: Monthly Churn (%), 4Q-02. Bars: Canadian
Average, U.S. Average. Values shown: 2.8, 2.4]
- Source: Company analyst reports
6Wireless Industry: Some Facts
- Wireless Local Number Portability (WLNP) from Nov
2003
- Average Cost to Acquire a New Wireless Customer:
$400 to $500
- Data Mining as a Solution to the Business Problem
7facts
Source: Duke Teradata 2002
8facts
9Reasons for Churn
- Many companies to choose from
- Similarity of their Offerings
- Cheap prices of the handsets
- The biggest current barrier to churn: the lack
of phone number portability!
10A Dilemma
- Cross-Selling Through Database Marketing
- Cross-selling is effective for customer retention
because it increases switching costs and enhances
customer loyalty
- On the other hand, cross-selling can also
weaken the firm's relationship with the customer,
because frequent attempts to cross-sell can render
the customer non-responsive or even motivated to
switch to a competitor
11Role of Data Mining
- Business Issues in a Wireless Industry
12Some Definitions
- Data Warehousing: A data warehouse is a database
or a collection of databases designed to give
business decision-makers instant access to
information
- Data Mining: Data mining is the process of
using raw data to infer important business
relationships that can then be used for business
advantage
13- Simply put, data mining is used to discover
hidden patterns and relationships in your data
in order to help you make better business
decisions.
Source: Oracle9i Data Mining 2001
14Choice of Tools
15Justification for Data Mining
- Reporting Tools: Good at drilldowns into the
details
- OLAP/Statistical Tools: Used to draw conclusions
from representative samples
- Data Mining: Goes deep into the data. It uses
machine-learning algorithms to automatically sift
through each record and variable to uncover
patterns and information that may have been
hidden.
16Predictive Modeling
Visual Representation of Predictive Modeling
17Benefits Of Data Warehousing And Predictive
Modeling
- Immediate Information Delivery
- Data Integration from across, and even outside,
the Organization
- Future Vision from Historical Trends
- Tools for Looking at Data in New Ways
18What is ODM?
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production
SQL>
Oracle9i Data Mining is an option to Oracle9i
Enterprise Edition that allows users to build
advanced business intelligence applications that
mine corporate databases to discover new insights,
and integrate those insights into business
applications.
19Why Oracle?
Integrated Environment of Oracle Relational
Database
20Supervised vs. Unsupervised Learning
- Supervised learning requires identification of a
target field or dependent variable. The
supervised-learning technique then sifts through
data trying to find patterns and relationships
between the independent variables and the
dependent variable. (ODM provides the Naïve Bayes
data mining algorithm for supervised-learning
problems.)
- Unsupervised learning does not require the user
to indicate an objective to the data mining
algorithm. Association and clustering algorithms
make no assumptions about the target field.
Instead, they try to find associations and
clusters in the data independent of any a priori
defined business objective (market-basket
analysis, for example).
(ODM provides the Association Rules data mining
algorithm for unsupervised-learning problems.)
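As a rough illustration of the market-basket idea mentioned above, the following Python sketch computes the support and confidence of a single association rule. It is not ODM's Association Rules implementation; the function name `rule_metrics` and the basket contents are hypothetical and were made up for this example.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(baskets)
    # Baskets containing every item of the antecedent.
    n_antecedent = sum(1 for b in baskets if antecedent <= set(b))
    # Baskets containing the antecedent and the consequent together.
    n_both = sum(1 for b in baskets if (antecedent | consequent) <= set(b))
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical baskets of wireless add-on services (invented data).
baskets = [
    {"voicemail", "text_plan"},
    {"voicemail", "text_plan", "roaming"},
    {"voicemail"},
    {"roaming"},
]
support, confidence = rule_metrics(baskets, {"voicemail"}, {"text_plan"})
print(support, confidence)
```

No target field is named anywhere in the sketch: the rule is scored only by how often the items occur together, which is the sense in which the technique is unsupervised.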
21Naive Bayes algorithm
- The Naive Bayes algorithm uses the mathematics of
Bayes' Theorem to make its predictions. The
algorithm is typically used for:
  - Identifying which customers are likely to
  purchase a certain product
  - Identifying customers who are likely to churn
  - Predicting the likelihood that a part will be
  defective
- Adaptive Bayes Network
  - Human readable rules
IF RELATIONSHIP = "Husband" AND EDUCATION_NUM =
"13-16" THEN CHURN = "TRUE"
22Bayes Theorem
According to the Bayesian rule, the probability
of an example E with attribute values
a_1, a_2, ..., a_n being in class c is
$$P(C = c \mid a_1, a_2, \ldots, a_n) = \frac{p(a_1, a_2, \ldots, a_n \mid C = c)\, p(C = c)}{p(a_1, a_2, \ldots, a_n)}$$
The classification is taken as the value of C
with the largest probability.
Assume all attributes are independent given the
class:
$$p(a_1, a_2, \ldots, a_n \mid c) = p(a_1 \mid c)\, p(a_2 \mid c) \cdots p(a_n \mid c)$$
The resulting Bayesian classifier is called the
Naïve Bayesian classifier.
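The rule above can be sketched in a few lines of Python for categorical attributes. This is a minimal illustration, not ODM's implementation: the function names, the two toy attributes, and the training rows are all hypothetical, and the Laplace smoothing term `alpha` is an addition that does not appear on the slide.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row, alpha=1.0):
    """Return (best_class, posteriors) from p(c) * prod_i p(a_i | c)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior p(C = c)
        for i, value in enumerate(row):
            # Laplace smoothing (assumption, not on the slide): avoids a
            # zero probability when a value never occurs with class c.
            k = len(values_seen[i])
            score *= (value_counts[(i, c)][value] + alpha) / (n_c + alpha * k)
        scores[c] = score
    evidence = sum(scores.values())  # plays the role of p(a_1, ..., a_n)
    posteriors = {c: s / evidence for c, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Invented toy data: (dual-band handset?, usage level) -> churned?
rows = [("yes", "high"), ("yes", "low"), ("no", "low"),
        ("no", "high"), ("yes", "high")]
labels = ["TRUE", "TRUE", "FALSE", "FALSE", "TRUE"]
model = train_naive_bayes(rows, labels)
best, posteriors = predict(model, ("yes", "high"))
print(best, posteriors)
```

Dividing by the sum of the per-class scores stands in for the denominator of Bayes' rule, so the returned posteriors sum to one and the largest one gives the predicted class.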
23Major Steps Of Data Mining
- Build Model: Models are built in the data-mining
server
- Test Model: Model testing gives an estimate of
model accuracy
- Compute Lift: ODM supports computing lift for a
binary classification model (confidence of
prediction)
- Apply Model: Applying a supervised learning model
to data results in scores or predictions with an
associated probability
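To make the lift step concrete, here is a small Python sketch of cumulative lift for a binary classifier. It assumes each record carries a predicted churn probability and an actual 0/1 outcome; the function name `cumulative_lift`, the quantile scheme, and the scores are illustrative assumptions, and ODM's own lift computation may differ in its details.

```python
def cumulative_lift(scored, n_bins=10):
    """Cumulative lift per quantile.

    scored: list of (predicted_probability, actual) with actual in {0, 1}.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_positives = sum(actual for _, actual in ranked)
    if n == 0 or total_positives == 0:
        raise ValueError("need at least one record and one positive")
    overall_rate = total_positives / n
    lifts = []
    for b in range(1, n_bins + 1):
        cutoff = max(1, round(n * b / n_bins))  # records in the top b quantiles
        hits = sum(actual for _, actual in ranked[:cutoff])
        # Positive rate among the top-ranked records vs. the overall rate.
        lifts.append((hits / cutoff) / overall_rate)
    return lifts


# Invented scores: (predicted churn probability, actually churned?)
scored = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0),
          (0.4, 0), (0.3, 0), (0.2, 1), (0.1, 0), (0.05, 0)]
lifts = cumulative_lift(scored, n_bins=5)
print(lifts)
```

A lift above 1 in the first quantiles means the model concentrates actual churners among its highest-scored records; the final value is 1 by construction because the last quantile covers every record.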
24Build Model
25Apply Process
26Data For Modeling
Nature of Dataset Used for Study (real Wireless
Customer Data)
27System Setup
- Database
- Java Environment
- Data Mining Wizard
28Database: Oracle 9.2.0.1.0
Installation of Oracle Database Software
9.2.0.1.0 with Oracle Data Mining Option, with
the database patch for version 9.2.0.2.1.
29Java Environment: JDeveloper
Installation of JDeveloper 9.0.3
30Data Mining Wizard: DM4J
31Question
32Getting Started
- Unlock odm user
- Grants on the tables for wizard to display
- Odm_mtr schema
33Working with the DM4J Wizard
Creating a new Workspace
Configuring a Database Connection
34DM4J
Selecting a model type in the DM4J wizard.
35Algorithm for Data Modeling
Selecting the Algorithm
Fine tuning the algorithm
36DM4J
The DM4J wizard generates the Java code that is
compiled and executed to create the model.
37DM4J
Here is the Java Code!
38Our Study
The input data was stored in a table called
CALIBRATION.
Our target variable for prediction is CHURN.
39study
We pick all the input predictor variables (except
the customer ID) from the list of 171 to predict
churn.
40study
Compilation and execution of the Java code
containing the ODM model.
The program runs in asynchronous mode and we
can monitor the progress of the task. The
screenshot shows the successful completion of the
model.
41study
The Adaptive Bayes Network also generates the
rules for the model in human readable form.
42study
Confusion Matrix
Testing the Model using the data from table
PRESENT
Cumulative Lift Chart
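A confusion matrix of the kind shown on this slide can be tallied as in the Python sketch below. The label values and the two short lists are invented for illustration; they are not taken from the PRESENT table, and the function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive="TRUE"):
    """Return (tp, fn, fp, tn) with `positive` as the churn class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        counts[(a == positive, p == positive)] += 1
    tp = counts[(True, True)]    # churned, predicted to churn
    fn = counts[(True, False)]   # churned, predicted to stay
    fp = counts[(False, True)]   # stayed, predicted to churn
    tn = counts[(False, False)]  # stayed, predicted to stay
    return tp, fn, fp, tn


def accuracy(tp, fn, fp, tn):
    """Share of records whose predicted class matches the actual class."""
    total = tp + fn + fp + tn
    return (tp + tn) / total if total else 0.0


# Invented test labels and predictions.
actual = ["TRUE", "TRUE", "FALSE", "FALSE", "TRUE", "FALSE"]
predicted = ["TRUE", "FALSE", "FALSE", "TRUE", "TRUE", "FALSE"]
tp, fn, fp, tn = confusion_matrix(actual, predicted)
print(tp, fn, fp, tn, accuracy(tp, fn, fp, tn))
```

The four cells are the same kind of counts that a cost-savings estimate needs: how many predicted churners and predicted stayers actually leave or stay.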
43study
The last step is to apply the tested model to the
data set where we want to predict the CHURN
44study
After the Apply task is run
When we apply the model, the predictions are
obtained and stored in an output table
45study
Rating the importance of the various predictor
variables.
46Top Ten Variables
- DUALBAND: type of phone set
- CARTYPE: dominant vehicle lifestyle
- EDUC1: education level of first household member
- ETHNIC: ethnicity
- TOT_ACPT: total offers accepted from retention
team
- OCCU1: occupation of the first household member
- AREA: geographic area
- INCOME: estimated household income
- DWLLSIZE: dwelling size
- PROPTYPE: property type details
47Cost Savings Based on Churn Data
$$\text{savings per churnable subscriber} = \frac{\text{net(no intervention)} - \text{net(incentive)}}{L + NL}$$
$$\text{net(no intervention)} = (L + NL) \times C_l$$
$$\text{net(incentive)} = (L + LS)\, C_i + (P_i L + NL)\, C_l$$
To estimate cost savings, the parameters $C_i$
(cost of incentive per customer), $P_i$ (reduction
in probability to churn due to incentive $C_i$),
and $C_l$ (lost-revenue cost when a subscriber
churns) are combined with four statistics obtained
from a predictor model:
- L: number of subscribers who are predicted to
leave (churn) and who actually leave barring
intervention
- NL: number of subscribers who are predicted to
stay (nonchurn) and who actually leave barring
intervention
- LS: number of subscribers who are predicted to
leave and who actually stay
- SS: number of subscribers who are predicted to
stay and who actually stay
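The savings estimate can be worked through numerically with the Python sketch below. It reads the slide's formula as net(incentive) = (L + LS) C_i + (P_i L + NL) C_l, with P_i multiplying L as written; if P_i is read strictly as a reduction in churn probability, the factor would be (1 - P_i) instead, and the two readings coincide at P_i = 0.5. The function name and every number in the example are hypothetical.

```python
def savings_per_churnable(L, NL, LS, c_i, p_i, c_l):
    """Expected savings per churnable subscriber, per the slide's formula.

    L, NL, LS: counts from the predictor (see the definitions above).
    c_i: incentive cost per customer; c_l: lost revenue per churner.
    p_i: factor applied to L, used exactly as the slide writes it.
    """
    churnable = L + NL  # everyone who leaves if nothing is done
    if churnable == 0:
        raise ValueError("no churnable subscribers")
    net_no_intervention = churnable * c_l
    # Incentives go to everyone predicted to leave (L + LS); NL subscribers
    # are never offered one, so their lost revenue is unchanged.
    net_incentive = (L + LS) * c_i + (p_i * L + NL) * c_l
    return (net_no_intervention - net_incentive) / churnable


# Invented figures for illustration only.
saving = savings_per_churnable(L=300, NL=100, LS=200,
                               c_i=50, p_i=0.5, c_l=500)
print(saving)
```

SS does not enter the calculation: subscribers who are correctly predicted to stay receive no incentive and generate no lost revenue.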
48Churn Management
Expected Saving to Carrier / Churnable Subscriber
Source: Mozer 2000
49Future Trends and Conclusion
- Real-time Analytics and Text Mining (Oracle 10g)
can take Data Mining to the next level.
- Oracle Data Mining can resolve a Business
problem.
- Churn Prediction and Churn Management can yield
significant savings to the wireless provider.
50Daleen at a Glance
- Founded in 1989 with a mission to build custom
software for finance and telecom sectors
- Worldwide base of over 80 billing and customer
care contracts since 1997
- Innovator in deployment of convergent billing,
event management and revenue assurance solutions
for next-generation services
- Long term focus on delivering exceptional
customer service through a site license or
service bureau relationship
- Offices in Boca Raton, St. Louis, Amsterdam and
Sydney
51Q&A
52References and Useful Links
- Technet: http://technet.oracle.com/products/bi/odm/9idm4j
- Armstrong, G., and P. Kotler. 2001. Principles of
Marketing. Prentice Hall: New Jersey.
- Duke Teradata. 2002. Teradata Center for Customer
Relationship Management. Online. Retrieved on
Nov 7, 2002. Available: http://www.teradataduke.org/news_t_2.html
- In-Stat. 2002. WLNP threatens to significantly
impact wireless churn rates. Online. Retrieved
on Sep 2002. Available: http://www.instat.com/newmk.asp?ID=312
- Mozer, Michael, Richard Wolniewicz, Eric Johnson
and Howard Kaushansky. 1999. Churn
reduction in the wireless industry. Proceedings
of the Neural Information Processing Systems
Conference, San Diego, CA.
- Oracle9i Data Mining. 2001. An Oracle white paper,
December 2001. Online. Retrieved on Nov 8, 2002.
Available: http://otn.oracle.com/products/bi/pdf/o9idm_bwp.pdf
- Skedd, Kirsten. 2002. WLNP threatens to
significantly impact wireless churn rates.
Online. Retrieved on Sep 14, 2002.
Available: http://www.instat.com/press.asp?ID=311&sku=IN020258WP
53Acknowledgements
- Dr Ravi Behara, Faculty (Florida Atlantic
University)
- David Eastlund and Jennifer from Oracle
- Cohorts at Daleen Technologies
54Reminder
Please complete the OracleWorld online session
survey.
Session id: 40332, Data Warehousing for the
Communications Industry
Thank you.
55Contact Information
- Email: snath_at_daleen.com
- Cell Phone: (954) 609-2402
- Text Message: 9546092402_at_mobile.att.net