CRISPDM - PowerPoint PPT Presentation

Provided by: dly3

Transcript and Presenter's Notes


1
CRISP-DM
  • Cross-Industry Standard Process for Data Mining
  • CRISP-DM was conceived in late 1996 by three
    veterans of the young and immature data mining
    market: DaimlerChrysler (then Daimler-Benz), SPSS
    (then ISL), and NCR. CRISP-DM succeeds because it
    is soundly based on the practical, real-world
    experience of how people do data mining projects.

2
CRISP-DM Process
  • It is an iterative, adaptive process
  • Communication of the business problem
  • Data collection and management
  • Data preprocessing
  • Model building
  • Model evaluation
  • Model deployment

3
The CRISP-DM Process Model: An overview of the
life cycle of a data mining project
  • The model contains the corresponding phases of a
    project, their respective tasks, and the
    relationships between these tasks.
  • Which relationships exist depends on the goals,
    background, and interest of the user and, most
    importantly, on the data.
  • The Version 1.0 Process Guide and User Manual
    contains step-by-step directions, tasks and
    objectives for each phase of the Data Mining
    Process.

4
Phases of the CRISP-DM Process Model
5
http://www.crisp-dm.org/index.htm
  • The sequence of the phases is not strict.
  • Moving back and forth between different phases is
    always required.
  • The outcome of each phase determines which phase,
    or which particular task of a phase, has to be
    performed next. (adaptive)
  • The arrows indicate the most important and
    frequent dependencies between phases.
  • The outer circle in the figure symbolizes the
    cyclic nature of data mining itself. (iterative)
  • A data mining process continues after a solution
    has been deployed.
  • The lessons learned during the process can
    trigger new, often more focused business
    questions.
  • Subsequent data mining processes will benefit
    from the experiences of previous ones.
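
The non-strict ordering described above can be sketched as a small directed graph. This is only an illustration: the phase names come from the slides, but the arrow set is an assumption based on the commonly reproduced CRISP-DM diagram (the figure itself is not part of this transcript), and the helper names are hypothetical.

```python
# Arrows between phases. ASSUMPTION: taken from the commonly reproduced
# CRISP-DM diagram, not from this transcript.
ARROWS = {
    "Business Understanding": ["Data Understanding"],
    "Data Understanding": ["Business Understanding", "Data Preparation"],
    "Data Preparation": ["Modeling"],
    "Modeling": ["Data Preparation", "Evaluation"],
    "Evaluation": ["Business Understanding", "Deployment"],
    "Deployment": [],
}


def is_frequent_move(src, dst):
    """Return True if an arrow leads directly from src to dst."""
    return dst in ARROWS[src]


def follows_arrows(path):
    """Return True if every consecutive step in path follows an arrow."""
    return all(is_frequent_move(a, b) for a, b in zip(path, path[1:]))


# A hypothetical project that steps back once from Modeling to Data Preparation.
example_path = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]
print(follows_arrows(example_path))
```

The arrows mark only the most frequent dependencies, so a move that is not listed here is unusual rather than forbidden.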

6
Outline of the 6 phases
  • Business Understanding
  • Focuses on understanding the project objectives
    and requirements from a business perspective
  • Convert this knowledge into a data mining problem
    definition, and a preliminary plan designed to
    achieve the objectives.
  • Data Understanding
  • Start with an initial data collection
  • Proceed with activities in order to get familiar
    with the data, to identify data quality problems,
    to discover first insights into the data, or to
    detect interesting subsets to form hypotheses for
    hidden information.
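
As a rough illustration of getting familiar with the data and spotting quality problems, the sketch below counts missing and distinct values per attribute. The `profile` function, the sample records, and the choice to treat `None` and the empty string as missing are all assumptions made for this example; they do not come from the CRISP-DM guide.

```python
from collections import Counter


def profile(records):
    """Summarize missing and distinct values per attribute in a list of dicts."""
    attributes = sorted({key for row in records for key in row})
    summary = {}
    for attr in attributes:
        values = [row.get(attr) for row in records]
        # ASSUMPTION: None and the empty string both count as missing.
        present = [v for v in values if v not in (None, "")]
        summary[attr] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1),
        }
    return summary


# Hypothetical initial data collection.
records = [
    {"age": 34, "region": "north"},
    {"age": None, "region": "north"},
    {"age": 51, "region": ""},
]
report = profile(records)
for attr, stats in report.items():
    print(attr, stats)
```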

7
  • Data Preparation
  • Covers all activities needed to construct the
    final dataset to be fed into the modeling tool(s).
  • Preparation tasks are likely to be performed
    multiple times, and not in any prescribed order.
    Tasks include table, record, and attribute
    selection as well as transformation and cleaning
    of data for modeling tools.
  • Modeling
  • Various modeling techniques are selected and
    applied, and their parameters are calibrated to
    optimal values.
  • There are several techniques for the same data
    mining problem type. Some have specific
    requirements on the form of data, so going back
    to the data preparation phase is often needed.
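
A minimal sketch of how data preparation and parameter calibration fit together, assuming a toy threshold rule as the "model". The function names, the sample records, and the candidate thresholds are hypothetical. For brevity the sketch scores candidates on the same data it was prepared from; a real project would hold out separate data for this.

```python
def prepare(records, attribute, target):
    """Select one attribute and the target, dropping rows with missing values."""
    return [
        (row[attribute], row[target])
        for row in records
        if row.get(attribute) is not None and row.get(target) is not None
    ]


def accuracy(pairs, threshold):
    """Accuracy of the rule 'predict True when value >= threshold'."""
    hits = sum((value >= threshold) == label for value, label in pairs)
    return hits / len(pairs)


def calibrate(pairs, candidates):
    """Return the candidate threshold with the highest accuracy."""
    return max(candidates, key=lambda t: accuracy(pairs, t))


# Hypothetical raw records; one row has a missing attribute value.
raw = [
    {"spend": 2, "responded": False},
    {"spend": 3, "responded": False},
    {"spend": None, "responded": True},
    {"spend": 8, "responded": True},
    {"spend": 9, "responded": True},
]
pairs = prepare(raw, "spend", "responded")
best = calibrate(pairs, candidates=[1, 5, 10])
print(best, accuracy(pairs, best))
```

If a modeling technique needed the data in a different form, for example binned instead of numeric, only `prepare` would change, which mirrors the step back to data preparation.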

8
  • Evaluation
  • Review the steps executed to construct the model,
    to be certain it properly achieves the business
    objectives.
  • Check whether any important business issue has
    not been sufficiently considered.
  • At the end of this phase, a decision on the use
    of the data mining results should be reached.
  • Deployment
  • Organize and present the results in a way that
    customers can use them.
  • The deployment phase can be as simple as
    generating a report or as complex as implementing
    a repeatable data mining process.
  • If the customers carry out the deployment steps,
    they must understand up front the actions needed
    to use the created models.
