How to Mine Your Data

Slides: 18
Provided by: kend88

Transcript and Presenter's Notes

Title: How to Mine Your Data


1
How to Mine Your Data
  • K Dyson
  • School of Business

2
The Crucial Elements
  • Planning
  • The salient objective

3
Identifying Your Business Objective
  • What information do you wish to capture?
  • Need to involve your business, sales and
    marketing teams
  • What data should be purchased?
  • What are your business marketing goals, and what
    data do you need to achieve them?

4
Possible Objectives
  • Are you looking to identify potential new website
    customers?
  • Are you looking to find specific website product
    trends?
  • Are you looking to identify buying patterns over
    time on your website?
  • Are you trying to increase response rate to a web
    campaign?
  • Are you trying to identify the features of your
    most profitable clients?

5
Select Data
  • Objective defined; next, the data requirements
  • Is the data adequate to describe the phenomena?
  • Can the data from your web-site be consolidated
    with your data warehouse?
  • What external/internal information is available?
  • Are the datasets being merged consistent?
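
One way to answer the consistency question before consolidating web-site data with a data warehouse is to compare the two sources on a shared key. The sketch below is only an illustration: the function name `merge_report`, the `customer_id` key and the sample rows are all assumptions, not part of the original slides.

```python
def merge_report(web_rows, warehouse_rows, key="customer_id"):
    """Summarise how well two datasets line up on a shared key.

    Both inputs are lists of dicts; `key` is a hypothetical identifier
    assumed to be present in every row of both datasets.
    """
    web_keys = {row[key] for row in web_rows}
    warehouse_keys = {row[key] for row in warehouse_rows}
    warehouse_by_key = {row[key]: row for row in warehouse_rows}

    # Fields that appear in both sources (judged from the first row of each).
    shared_fields = (set(web_rows[0]) & set(warehouse_rows[0])) - {key}

    conflicts = []
    for row in web_rows:
        other = warehouse_by_key.get(row[key])
        if other is None:
            continue  # no counterpart in the warehouse, reported separately
        for field in shared_fields:
            if row.get(field) != other.get(field):
                conflicts.append((row[key], field, row.get(field), other.get(field)))

    return {
        "only_in_web": sorted(web_keys - warehouse_keys),
        "only_in_warehouse": sorted(warehouse_keys - web_keys),
        "conflicts": sorted(conflicts),
    }


# Made-up sample data for illustration only.
web = [
    {"customer_id": 1, "country": "UK"},
    {"customer_id": 2, "country": "FR"},
    {"customer_id": 3, "country": "DE"},
]
warehouse = [
    {"customer_id": 1, "country": "UK"},
    {"customer_id": 2, "country": "France"},
    {"customer_id": 4, "country": "ES"},
]
report = merge_report(web, warehouse)
```

In this toy data, customers 3 and 4 each appear in only one source, and customer 2 has a differently coded country, which is the kind of inconsistency worth resolving before merging.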

6
Data Requirements: Transactional Data
  • Clear link between data and objectives
  • The more the better
  • Want a good sample of sales/no-sales
  • Plenty of descriptive information about customers
  • Full involvement of retail, sales, marketing and
    shipping departments

7
Data Preparation
  • Clean-up time
  • What condition is the data in, and what steps are
    needed to prepare it for analysis?
  • What strategies must be taken for handling
    missing data and noise or outliers?
  • How skewed is the data? Transform?
  • Format: Y/N or 1/0?
  • Missing data?
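
The clean-up questions above can be made concrete with a small sketch. It assumes three common choices that the slides do not prescribe: recoding Y/N flags to 1/0, filling missing numeric values with the median, and applying log(1 + x) to a right-skewed, non-negative column. The function names and sample values are illustrative only.

```python
import math
import statistics


def recode_flag(value):
    """Map Y/N style flags to 1/0; anything unrecognised becomes None (missing)."""
    if value is None:
        return None
    return {"Y": 1, "N": 0}.get(str(value).strip().upper())


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def log_transform(values):
    """Compress a right-skewed, non-negative column using log(1 + x)."""
    return [math.log1p(v) for v in values]


# Made-up columns for illustration only.
responded = [recode_flag(v) for v in ["Y", "n", " y ", None, "?"]]
spend = impute_median([10.0, None, 12.0, 11.0, 500.0])
spend_log = log_transform(spend)
```

The median is used here rather than the mean because a single outlier (500.0) would otherwise dominate the imputed value; other strategies, such as dropping incomplete records, may suit a given dataset better.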

8
Evaluate Your Data
  • Perform a structural evaluation: aids choice of
    data mining tool
  • Neural networks: large number of numeric
    attributes
  • Rule based: large number of records and attributes
  • Highly skewed and/or binary response: rule based
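
A structural evaluation of this kind can start from a few simple counts. The sketch below assumes the data is a list of dicts with a binary response column; `structural_summary`, the column names and the sample rows are hypothetical. It deliberately reports the figures only and does not encode any thresholds for choosing a tool, since the slides give none.

```python
def structural_summary(rows, target):
    """Report record count, attribute counts and response balance.

    `rows` is a list of dicts; `target` names a binary (1/0) response column.
    """
    attributes = [name for name in rows[0] if name != target]
    # An attribute counts as numeric only if every record holds an int or float.
    numeric = [
        name
        for name in attributes
        if all(
            isinstance(r[name], (int, float)) and not isinstance(r[name], bool)
            for r in rows
        )
    ]
    positives = sum(1 for r in rows if r[target] == 1)
    return {
        "records": len(rows),
        "attributes": len(attributes),
        "numeric_attributes": len(numeric),
        "positive_rate": positives / len(rows),
    }


# Made-up records for illustration only.
rows = [
    {"age": 34, "spend": 120.5, "region": "north", "bought": 1},
    {"age": 51, "spend": 15.0, "region": "south", "bought": 0},
    {"age": 27, "spend": 0.0, "region": "north", "bought": 0},
    {"age": 45, "spend": 60.0, "region": "east", "bought": 0},
]
summary = structural_summary(rows, target="bought")
```

A `positive_rate` far from 0.5 indicates a skewed response, and the share of numeric attributes indicates how numeric the dataset is; both feed into the choice of tool described above.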

9
Format the Solution
  • What approach: rule based, neural network, etc.?
  • Goal of the solution?
  • How will the knowledge gained be distributed?
  • What does management need?
  • Multiple tool solutions
  • Cluster identification: Kohonen
  • Rules: expert system

10
Tool Selection

11
The Fourteen Crucial Points (Part 1)
  • Scalability: larger dataset, improved
    performance
  • Accuracy: error rate
  • Format of solution: breakdown of rules/solution
  • Solutions: sensitivity analysis
  • Pre-processing, i.e. the tool should allow you to
    carry out data:
      • Cleaning
      • Selection
      • Transformation
      • Description

12
The Fourteen Crucial Points (Part 2)
  • Connectivity, i.e. file formats, consolidation of
    datasets, etc.
  • Import/Export, i.e. Excel, SPSS, etc.
  • Memory management, i.e. time to process
  • Performance, i.e. speed/problems
  • Noise, i.e. how well does the tool cope with it?
  • Paradigms
  • Split data for training/testing?
  • Customisation
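
If a tool does not split the data for training and testing itself, a hold-out split is straightforward to do by hand. The following is a minimal sketch using only the standard library; the 25% test fraction and the fixed seed are arbitrary assumptions, and `train_test_split` here is a hand-written helper, not a call into any particular data mining package.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and hold out a fraction for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in records (just the integers 0..19) for illustration only.
records = list(range(20))
train, test = train_test_split(records, test_fraction=0.25, seed=42)
```

The model is then built on `train` only and its error rate measured on `test`, so that the reported accuracy reflects records the model has not seen.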

13
The Fourteen Crucial Points (Part 3)
  • Efficiency, i.e. time to derive a solution
  • Non-technical, i.e. price, training support
  • Outsourcing?

14
Constructing the Model Solution
  • Data mining begins
  • Search for patterns
  • Choose your methodology

15
Questions to Be Asked
  • Is more data needed?
  • What are the error rates for the model? Are they
    acceptable?
  • Would a different methodology improve results?
  • Supervised or unsupervised learning?
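
For a supervised model with a binary response, the error-rate question can be answered by comparing predictions with known outcomes on held-out data. The sketch below assumes 1/0 labels; the function name `error_report` and the sample labels are illustrative, not taken from the slides.

```python
def error_report(actual, predicted):
    """Count the four outcomes of a binary model and derive its error rate."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return {
        "tp": tp,
        "tn": tn,
        "fp": fp,
        "fn": fn,
        "error_rate": (fp + fn) / len(pairs),
    }


# Made-up outcomes and predictions for illustration only.
actual = [1, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 0, 1, 0]
result = error_report(actual, predicted)
```

Whether a given error rate is acceptable depends on the business objective: a missed buyer (false negative) and a wasted mailing (false positive) rarely cost the same, so the two counts are worth reviewing separately.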

16
Validate the Findings
  • Complete analysis
  • Share and discuss the results, e.g. with sales,
    marketing, etc.
  • Brief domain experts
  • Do the findings make sense?
  • What important relationships were discovered?
  • Prepare a data mining report
  • Integrate the solution
  • Be dynamic, i.e. business is in flux

17
Summary
  • Business Objectives
  • Data selection
  • Formulation of solution
  • Methodology selection
  • Delivery of results