Title: CS/EngMt/CpEng 404 Data Mining
1 CS/EngMt/CpEng 404 Data Mining and Knowledge Discovery
- Daniel C. St. Clair, PhD, University of Missouri-Rolla
- Christopher Merz, PhD, Mastercard International
- Lecture 1b: Introduction to CRISP-DM
2 Joe (Data) Miner
3 Lecture 1 Contents
- Intro to CS/EngMt/CpEng 404
- What are data mining and knowledge discovery (KD)?
- Data sources
- Data mining tasks
- Introduction to CRISP-DM
4 What is CRISP-DM?
- The Cross Industry Standard Process for Data Mining
- Consortium members include
  - SPSS
  - Teradata
  - DaimlerChrysler
- URL: www.crisp-dm.org
- Product: CRISP-DM version 1.0
5 Advantages of CRISP-DM?
- Industry Neutral
- Tool Neutral
- Closely Related to KDD Process Model
- Anchors the Data Mining Process
- Attach data mining goals to business or scientific goals
- Prevents requirements drift
- Follow through to deployment
6 Disadvantages of CRISP-DM?
- Less emphasis on addressing scientific problems
- Assumes knowledge of tools and/or modeling methods
- Does not fit all problems well
7 How Will We Use CRISP-DM?
- See modified template on class web site: CRISP-DM-UMR-template.doc
- Required for class project
- Major sections correspond to class project milestones
- Class schedule specifies when each major section will be discussed
8 The CRISP-DM Process Model
9 The Knowledge Discovery Process
Source: Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P., "From Data Mining to Knowledge Discovery in Databases," AI Magazine, Fall 1996.
10 Relating the CRISP-DM Process to the Knowledge Discovery Process
Figure labels, by CRISP-DM template section number:
2. Business Understanding
3. Data Understanding
4.1 Select Data
4.2 Clean Data
4.3 Construct Data
5. Modeling
6. Evaluation
7. Deployment
11 Relating the CRISP-DM and KDD Process Models
- CRISP-DM subsumes the KDD process
- Up-front anchoring of data mining goals to business / scientific goals
- Tail-end emphasis on deployment
- More emphasis on Data Understanding
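The relationship between the two process models can be sketched as a small lookup table. This is only an illustrative sketch: the section numbers and names are the ones shown on the slides, but the pairing with KDD steps (Selection, Preprocessing, Transformation, Data Mining, Interpretation / Evaluation, from the Fayyad et al. figure) is an assumption made here for illustration, as are the names `Section`, `SECTIONS`, `crisp_only`, and `kdd_steps`. Data Understanding is treated as having no separate step in the five-step KDD figure, which is also an assumption.

```python
from typing import NamedTuple, Optional


class Section(NamedTuple):
    number: str               # section number as shown on the slide
    name: str                 # CRISP-DM section name as shown on the slide
    kdd_step: Optional[str]   # assumed KDD counterpart, or None


# Section numbers and names come from the slides; the KDD pairing is an
# illustrative assumption based on the five steps in Fayyad et al. (1996).
SECTIONS = [
    Section("2", "Business Understanding", None),
    Section("3", "Data Understanding", None),
    Section("4.1", "Select Data", "Selection"),
    Section("4.2", "Clean Data", "Preprocessing"),
    Section("4.3", "Construct Data", "Transformation"),
    Section("5", "Modeling", "Data Mining"),
    Section("6", "Evaluation", "Interpretation / Evaluation"),
    Section("7", "Deployment", None),
]


def crisp_only(sections):
    """Return names of sections with no assumed KDD counterpart."""
    return [s.name for s in sections if s.kdd_step is None]


def kdd_steps(sections):
    """Return the assumed KDD steps in CRISP-DM section order."""
    return [s.kdd_step for s in sections if s.kdd_step is not None]


if __name__ == "__main__":
    for s in SECTIONS:
        counterpart = s.kdd_step or "(no KDD counterpart assumed)"
        print(f"{s.number:<4} {s.name:<24} -> {counterpart}")
```

Under these assumptions, `crisp_only(SECTIONS)` lists the sections that CRISP-DM adds around the KDD steps: the up-front business anchoring, data understanding, and the tail-end deployment.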
12 Peruse CRISP-DM Template
- Section 1 contains guidelines for completing the document
13 Lecture 2 Covers
- CRISP-DM Section 2: Business / Scientific Understanding
- Data Mining project description
14 Assignment for Week 2
- CRISP-DM Class Template
  - Retrieve template
  - Display hidden text
  - Review Sections 1 and 2
- CRISP-DM Reference
  - Visit www.crisp-dm.org
  - Retrieve CRISP-DM 1.0 Reference Guide
  - NOTE: Section numbers do not line up
15 CS 404 Class Information
Instructors:
- Daniel C. St. Clair, PhD, University of Missouri-Rolla. Phone: (573) 341-6352. E-mail: stclair_at_umr.edu
- Christopher Merz, PhD, Mastercard International. Phone: (636) 722-2143. E-mail: merzc_at_umr.edu
CS 404 web page: www.umr.edu/stclair or http://web.umr.edu/stclair/class/classfiles/cs404_ws04/
16 CS/EngMt/CpEng 404 Data Mining and Knowledge Discovery
- Daniel C. St. Clair, PhD, University of Missouri-Rolla
- Christopher Merz, PhD, Mastercard International
- Lecture 1: Introduction to Data Mining