Title: Data Mining and Decision Support Integration
1 Data Mining and Decision Support Integration
ACAI05/SEKT05 ADVANCED COURSE ON KNOWLEDGE DISCOVERY
- Marko Bohanec
- Jožef Stefan Institute
- Department of Knowledge Technologies
- University of Ljubljana
- Faculty of Administration
2 Data Mining vs. Decision Support
Data Mining
knowledge discovery from data
- Use of models
- classification
- clustering
- evaluation
- analysis
- visualization
- explanation
- ...
model
data
3 Overview
- 1. Decision Support
- Decision problem
- Decision-making
- Decision support
- Decision analysis
- Multi-attribute modeling
- 2. Decision Support and Data Mining
- How to combine and integrate DS and DM?
- DS for DM
- DM for DS
- DM, then DS
- DS, then DM
- DM and DS
- DS for DM: ROC space
- DM and DS: Combining DEX and HINT
4 Literature
- Part I: Basic Technologies
- Chapter 3: Decision Support
- Chapter 4: Integration of Data Mining and Decision Support
- Part II: Integration Aspects of DM and DS
- Chapter 7: DS for DM: ROC Analysis
- Part III: Applications of DM and DS
- Chapter 15: Five Decision Support Applications
- Chapter 16: Large and Tall Buildings
- Chapter 17: Educational Planning
5 Decision Support
Decision Problem
Decision-Making
Decision Support
Decision Analysis
Multi-Attribute Modeling
Chapter 3: M. Bohanec, Decision Support
6 Decision-Making
- Decision
- The choice of one among a number of alternatives
- Decision-Making
- A process of making the choice that includes
- Assessing the problem
- Collecting and verifying information
- Identifying alternatives
- Anticipating consequences of decisions
- Making the choice using sound and logical judgment based on available information
- Informing others of decision and rationale
- Evaluating decisions
7 Decision Problem
options (alternatives)
goals
- FIND the option that best satisfies the goals
- RANK options according to the goals
- ANALYSE, JUSTIFY, EXPLAIN, ..., the decision
8 Types of Decisions
- Easy (routine, everyday) vs. Difficult (complex)
- One-Time vs. Recurring
- One-Stage vs. Sequential
- Single Objective vs. Multiple Objectives
- Individual vs. Group
- Structured vs. Unstructured
- Tactical, Operational, Strategic
9 Characteristics of Complex Decisions
- Novelty
- Unclearness: Incomplete knowledge about the problem
- Uncertainty: Outside events that cannot be controlled
- Multiple objectives (possibly conflicting)
- Group decision-making
- Important consequences of the decision
- Limited resources
10 Decision-Making
- Decision Systems
- Switching circuits
- Processors
- Computer programs
- Systems for routine DM
- Autonomous agents
- Space probes
Decision Sciences
11 Decision-Making
Decision Sciences
Normative
Descriptive
Decision Support
Decision Theory, Utility Theory, Game Theory, Theory of Choice, ...
Cognitive Psychology, Social and Behavioral Sciences, ...
12 Decision Support
- Decision Support: Methods and tools for supporting people involved in the decision-making process
- Central Disciplines
- Operations Research and Management Sciences
- Decision Analysis
- Decision Support Systems
- Contributing and Related Disciplines
- Decision Sciences (other than DS itself)
- Statistics, Applied Mathematics
- Computer Sciences: Information Systems, Databases, Data Warehouses, OLAP
- Artificial Intelligence: Expert Systems, ML, NN, GA
- Knowledge Discovery from Databases and Data Mining
- Other Methods and Tools
- Representation and visualization tools
- Methods and tools for organizing data, facts, thoughts, ...
- Communication technology
- Mediation systems
13 Decision-Making
Decision Sciences
Normative
Descriptive
Decision Support
Decision trees
Influence diagrams
Multi-attribute models
14 Decision Analysis
- Decision Analysis: Applied Decision Theory
- Provides a framework for analyzing decision problems by
- structuring and breaking them down into more manageable parts,
- explicitly considering the
- possible alternatives,
- available information,
- uncertainties involved, and
- relevant preferences
- combining these to arrive at optimal (or "good") decisions
15 The Decision Analysis Process
Identify decision situation and understand objectives
Identify alternatives
- Decompose and model
- problem structure
- uncertainty
- preferences
Sensitivity Analyses
Choose best alternative
Implement Decision
16 Evaluation Models
options
EVALUATION MODEL
17 Types of Models in Decision Analysis
18 Multi-Attribute Models
[Figure: problem decomposition for evaluating cars. The root CAR is decomposed into PRICE (buying, maint) and TECH (safety, COMF); COMF is decomposed into doors, pers, lug.]
19 Tree of Attributes
- Decomposition of the problem to sub-problems ("Divide and Conquer!")
CAR
The most difficult stage!
20 Utility Functions (Aggregation)
- Aggregation: bottom-up aggregation of attributes' values
CAR
21 Evaluation and Analysis
- direction: bottom-up (terminal → root attributes)
- result: each option evaluated
- inaccurate/uncertain data?
22 Evaluation and Analysis
- interactive inspection
- what-if analysis
- sensitivity analysis
- explanation
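The bottom-up evaluation and what-if analysis described on the last two slides can be sketched as follows. This is a minimal illustration, not DEXi itself: the tree follows the car example, but the numeric coding of the scales, the `aggregate` rule (the weakest child decides), and the sample option are all assumptions made for the example. A real model would use expert-defined decision rules for each aggregate attribute.

```python
# Tree of attributes for the car example: parent -> children.
TREE = {
    "CAR": ["PRICE", "TECH"],
    "PRICE": ["buying", "maint"],
    "TECH": ["safety", "COMF"],
    "COMF": ["doors", "pers", "lug"],
}

# Assumed coding: every scale runs from 0 (worst) to 2 (best).
SCALE = ["unacceptable", "acceptable", "good"]


def aggregate(values):
    """Placeholder utility function: the weakest child decides.

    Stands in for the expert-defined decision rules of a real model.
    """
    return min(values)


def evaluate(attribute, option):
    """Evaluate an attribute bottom-up, from terminal attributes to the root."""
    if attribute not in TREE:  # terminal attribute: read the option's data
        return option[attribute]
    return aggregate([evaluate(child, option) for child in TREE[attribute]])


def what_if(option, attribute, new_value, root="CAR"):
    """Re-evaluate the root after changing one terminal attribute."""
    changed = dict(option, **{attribute: new_value})
    return evaluate(root, changed)


# Hypothetical option (one car), described by its terminal attributes.
car = {"buying": 2, "maint": 1, "safety": 2, "doors": 2, "pers": 2, "lug": 2}

baseline = evaluate("CAR", car)      # limited by the maintenance cost
improved = what_if(car, "maint", 2)  # what if maintenance were rated best?
print(SCALE[baseline], "->", SCALE[improved])
```

Because every aggregate attribute is computed only from its children, changing one terminal value and re-running `evaluate` is all a what-if analysis needs.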
23 DEXi: Computer Program for Multi-Attribute Decision Making
- Creation and editing of
- model structure (tree of attributes)
- value scales of attributes
- decision rules (incl. using weights)
- options and their descriptions (data)
- Evaluation of options (can handle missing values)
- What-if analysis
- Reporting
- tables
- charts
http://www-ai.ijs.si/MarkoBohanec/dexi.html
24 Some Application Areas
- INFORMATION TECHNOLOGY
- evaluation of computers
- evaluation of software
- evaluation of Web portals
- PROJECTS
- evaluation of projects
- evaluation of proposals and investments
- product portfolio evaluation
- COMPANIES
- business partner selection
- performance evaluation of companies
- PERSONNEL MANAGEMENT
- personnel evaluation
- selection and composition of expert groups
- evaluation of personal applications
- educational planning
- MEDICINE and HEALTH-CARE
- risk assessment
- diagnosis and prognosis
- OTHER AREAS
- assessment of technologies
- assessments in ecology and environment
- granting personal/corporate loans
- choosing sports
25 Allocation of Housing Loans
[Figure: tree of attributes of the housing loan model. Node labels: Ownership, Present, Suitability, Solving, Housing, Stage, Work stage, Advantages, Earnings, Priority, Status, Maint/Employ, Health, Family, Soc-Health, Age, Social, Children.]
26 Medicine: Breast Cancer Risk Assessment
Bohanec, M., Zupan, B., Rajkovic, V.: Applications of qualitative multi-attribute decision models in health care, International Journal of Medical Informatics 58-59, 191-205, 2000.
27 Evaluation and Analysis of Options
28 Selective Explanation of Options
29 Diabetic Foot Risk Assessment
- Who
- General Hospital Novo Mesto, Slovenia
- IJS
- Infonet, d.o.o.
- Why
- Reduce the number of amputations
- Improve the risk assessment methodology
- Improve the DSS module of clinical information system
- How
- Develop multi-attribute risk assessment model
- Evaluate it on patient data (about 3400 patients)
- Integrate into the clinical information system
Chapter 15: M. Bohanec, V. Rajkovic, B. Cestnik, 5 DS Applications
30 Diabetic Foot Risk Assessment
31 2. Combining Data Mining and Decision Support
How to combine DS and DM?
DS for DM: ROC space
DM and DS: Combining DEX and HINT
Chapter 4: N. Lavrac, M. Bohanec, Integration of DM and DS
32 Data Mining vs. Decision Support
Data Mining
knowledge discovery from data
- Use of models
- classification
- clustering
- evaluation
- analysis
- visualization
- explanation
- ...
model
data
33 DM and DS Integration?
Data Mining
Decision Support
34 DM and DS Integration!
35 Combining DM and DS
- DS for DM
- ROC methodology
- meta-learning
- DM for DS
- MS Analysis Services
- model revision (from data)
- DM, then DS (sequential application)
- Decisions-At-Hand approach
- DS, then DM (sequential application)
- using models in data pre-processing for DM
- DM and DS (parallel application)
- combining through models, e.g., DEXi and HINT
- considering different problem dimensions
36 DS for DM
Data Mining
Decision Support
Decision support within the DM process, e.g., ROC curves
37 ROC space
- True positive rate = true pos. / pos.
- TPr1 = 40/50 = 80%
- TPr2 = 30/50 = 60%
- False positive rate = false pos. / neg.
- FPr1 = 10/50 = 20%
- FPr2 = 0/50 = 0%
- ROC space has
- FPr on X axis
- TPr on Y axis
Chapter 7: Slides by Peter Flach
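As a small worked example, the two classifiers above can be placed in ROC space with a few lines of Python. The function name `roc_point` and the argument names are illustrative choices; the counts are the ones given on the slide (50 positives and 50 negatives).

```python
def roc_point(true_pos, false_pos, pos, neg):
    """Return (FPr, TPr), i.e. the (x, y) position of a classifier in ROC space."""
    tpr = true_pos / pos   # true positive rate  = true pos. / pos.
    fpr = false_pos / neg  # false positive rate = false pos. / neg.
    return fpr, tpr


classifier_1 = roc_point(true_pos=40, false_pos=10, pos=50, neg=50)
classifier_2 = roc_point(true_pos=30, false_pos=0, pos=50, neg=50)

for name, (fpr, tpr) in [("1", classifier_1), ("2", classifier_2)]:
    print(f"classifier {name}: FPr = {fpr:.0%}, TPr = {tpr:.0%}")
```

Classifier 1 finds more positives but at the price of false alarms. Classifier 2 makes no false alarms but misses more positives, so neither point dominates the other.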
38 The ROC convex hull
39 The ROC convex hull
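A minimal sketch of how a ROC convex hull can be computed, assuming each classifier is given as an (FPr, TPr) point. It uses the standard monotone-chain method for the upper hull and always includes the trivial classifiers (0, 0) and (1, 1). The third classifier at (0.5, 0.7) is a hypothetical point added only to show a classifier that falls below the hull.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Return the upper convex hull of ROC points, from (0, 0) to (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the line
        # from the point before it to the new point p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


classifiers = [
    (0.2, 0.8),  # classifier 1 from the ROC space slide
    (0.0, 0.6),  # classifier 2 from the ROC space slide
    (0.5, 0.7),  # hypothetical third classifier, below the hull
]
hull = roc_convex_hull(classifiers)
print(hull)
```

Classifiers that are not on the hull are never the best choice under any class distribution or cost ratio, which is why the hull is a useful filter before selecting a classifier.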
40 Choosing a classifier
41 Choosing a classifier
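One common way to choose among classifiers in ROC space is to minimise the expected misclassification cost for a given class distribution and cost ratio. The sketch below assumes this formulation. The prior `p_pos` and the costs `cost_fn` and `cost_fp` are illustrative parameters, not values from the slides. Only the two (TPr, FPr) pairs come from the ROC space example.

```python
# (TPr, FPr) of the two classifiers from the ROC space example.
CLASSIFIERS = {
    "classifier 1": (0.8, 0.2),
    "classifier 2": (0.6, 0.0),
}


def expected_cost(tpr, fpr, p_pos, cost_fn, cost_fp):
    """Expected cost per example: missed positives plus false alarms."""
    return p_pos * (1 - tpr) * cost_fn + (1 - p_pos) * fpr * cost_fp


def choose(p_pos, cost_fn, cost_fp):
    """Return the name of the classifier with the lowest expected cost."""
    return min(
        CLASSIFIERS,
        key=lambda name: expected_cost(*CLASSIFIERS[name], p_pos, cost_fn, cost_fp),
    )


# Missing a positive is twice as costly as a false alarm.
print(choose(p_pos=0.5, cost_fn=2, cost_fp=1))
# A false alarm is twice as costly as missing a positive.
print(choose(p_pos=0.5, cost_fn=1, cost_fp=2))
# Equal costs, but positives are rare.
print(choose(p_pos=0.2, cost_fn=1, cost_fp=1))
```

With equal priors and equal costs the two classifiers tie, since both classify 80 of the 100 examples correctly. The choice only becomes clear once the operating conditions are stated.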
42 DM for DS
Data Mining
Decision Support
- Introducing DM methods into the DS process
- MS SQL Server - Analysis Services
- model revision
43 DM for DS: Model Revision
44 Sequential Application: First DS, then DM
Decision Support
Data Mining
Model 1
Model 2
45 First DS, then DM in Data Pre-Processing
Input attributes
Generated attributes
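A minimal sketch of the "first DS, then DM" idea in pre-processing: an expert-defined sub-model computes a new attribute from the input attributes, and the extended records are then handed to a data mining method. The rule `price_level`, the attribute names, and the sample records are hypothetical and serve only to illustrate the data flow.

```python
def price_level(record):
    """Hypothetical expert rule: the overall price rating is the worse of
    the buying and maintenance ratings (0 = worst, 2 = best)."""
    return min(record["buying"], record["maint"])


def add_generated_attributes(records):
    """Return copies of the records extended with the generated attribute."""
    return [dict(record, PRICE=price_level(record)) for record in records]


# Input attributes only (hypothetical data).
records = [
    {"buying": 2, "maint": 0, "class": "reject"},
    {"buying": 2, "maint": 2, "class": "accept"},
]

# Input attributes plus the generated attribute, ready for a DM algorithm.
extended = add_generated_attributes(records)
for row in extended:
    print(row)
```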
46 Sequential Application: First DM, then DS
Data Mining
Decision Support
Model 1
Model 2
47 Decisions-At-Hand Schema
Decision Support Shells
on Palm
Data Mining (Model Construction)
on the Web
(Synchronization or Upload)
Blaž Zupan et al.: http://www.ailab.si/app/palm/
48 DM and DS Through Model Development
Requirements
Expertise
Expertise
Data
Data
Data Mining
Decision Support
Model
Chapter 4 references
49 Multi-Attribute Decision Models
Expertise
Data
HINT
DEX
Data Mining
Decision Support
Model
Qualitative Hierarchical Multi-Attribute Decision Models
50 1. Qualitative Multi-Attribute Models
Model
- Decomposition of the problem to less complex subproblems
- Qualitative attributes
- Decision rules
51 2. Expertise
Expertise
- Understanding of the decision problem and of ways to solve it, held by
- Decision owner(s)
- Expert(s)
- Decision analyst(s)
- User(s)
3. Data
Data
- Previously solved decision problems
- Attribute-value representation
52 4. DEX
DEX
- "An Expert System Shell for Multi-Attribute Decision Making"
- Functionality
- Acquisition of attributes and their hierarchy.
- Acquisition and consistency checking of decision rules.
- Description, evaluation and analysis of options.
- Explanation of evaluation results.
- Over 50 real-life applications
- Health-care
- Education
- Industry
- Land-use planning
- Ecology
- Evaluation of enterprises, products, projects,
investments, ...
53 5. HINT
HINT
Hierarchy INduction Tool: Automated development of hierarchical models from data based on Function Decomposition
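The core step of function decomposition can be illustrated with a simplified sketch. This is not HINT's actual implementation. It rests on the following assumptions: the examples are a table from attribute-value tuples to a class, a candidate bound set of attributes is given, and the number of values needed for the new intermediate concept (the column multiplicity) is estimated by greedily colouring the incompatibility graph of partition-matrix columns. The names `column_multiplicity` and `examples` and the Boolean target function are illustrative.

```python
from itertools import product


def column_multiplicity(examples, bound, n_attrs):
    """Estimate how many values a new concept over `bound` needs.

    examples: dict mapping attribute-value tuples to a class value.
    bound:    indices of the attributes to be merged into a new concept.
    Returns (number of values, assignment of bound combinations to values).
    """
    free = [i for i in range(n_attrs) if i not in bound]
    # Partition matrix: one column per combination of bound-attribute
    # values, one row per combination of free-attribute values.
    columns = {}
    for x, y in examples.items():
        b = tuple(x[i] for i in bound)
        a = tuple(x[i] for i in free)
        columns.setdefault(b, {})[a] = y

    def compatible(c1, c2):
        # Columns are compatible if they agree wherever both are defined.
        return all(c2[a] == y for a, y in c1.items() if a in c2)

    # Greedy colouring: compatible columns may share a concept value.
    value = {}
    for b in sorted(columns):
        used = {value[o] for o in value if not compatible(columns[b], columns[o])}
        v = 0
        while v in used:
            v += 1
        value[b] = v
    return len(set(value.values())), value


# Illustrative target: y = x0 XOR (x1 AND x2), fully specified.
examples = {x: x[0] ^ (x[1] & x[2]) for x in product([0, 1], repeat=3)}

nu_12, concept = column_multiplicity(examples, bound=(1, 2), n_attrs=3)
nu_01, _ = column_multiplicity(examples, bound=(0, 1), n_attrs=3)
print(nu_12, nu_01)
```

In this example the bound set (x1, x2) needs only two concept values while (x0, x1) needs four, so the first partition yields the simpler intermediate concept. Preferring partitions with a small column multiplicity is the kind of criterion the supervised-mode slides refer to when they list partitions with a minimal value.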
54 HINT: Further Information
http://magix.fri.uni-lj.si/hint/
55 HINT Implementation in ORANGE
http://magix.fri.uni-lj.si/orange/
56 Application: Housing Loan Allocation
- User: Housing Fund of the Republic of Slovenia
- Task: Allocating available funds to applicants for housing loans
- Method: Using a multi-attribute model for priority evaluation of applications
- Supported by a DSS since 1991
- Completed floats of loans: 21
- Applications: 44378 received, 27813 approved
- Allocated loans: 254 million (2/3 of housing loans in Slovenia)
57 Modes of Operation
- DEX only: from expertise
- HINT only: from data
- Supervised: from data under expert supervision
- Serial: HINT-developed model subsequently refined by the expert
- Parallel: parallel development of model(s) by DEX and HINT
- Combined: combining sub-models developed in different ways
58 1. DEX-Only Mode
59 2. HINT-Only Mode (1 of 2)
- Reconstruction of the original model from unstructured data
- Real-life data from one float in 1994
- 1932 applications
- 12 attributes (2 to 5 values)
- 722 unique examples
- 3.7% coverage of the attribute space
- unsupervised decomposition
60 2. HINT-Only Mode (2 of 2)
- Results
- Relatively good overall structure
- Inappropriate structure around c3
- Excellent classification accuracy
- HINT: 94.7 ± 2.5
- C4.5: 88.9 ± 3.9
61 3. Supervised Mode (1 of 4)
Unstructured dataset
Redundant: cult_hist, fin_sources
62 3. Supervised Mode (2 of 4)
- All partitions with b = 3 and minimal ν (ν = 3): 11 of 120
New concept: status
63 3. Supervised Mode (3 of 4)
- All partitions with b = 3 and minimal ν (ν = 4): 3 of 56
New concepts: social and then present
64 3. Supervised Mode (4 of 4)
- Results
- Expert satisfied with the structure
- Improved classification accuracy
- supervised: 97.8 ± 1.8
- unsupervised: 94.7 ± 2.5
65 4. Serial Mode
- Develop an initial model by HINT from data
- Extend/enhance the model "manually" using DEX
- For example
- Take the model developed by HINT in supervised mode
- Add the attributes cult_hist and fin_sources
- Extend the model structure
- Define the corresponding decision rules
66 5. Parallel Mode
- Develop two or more independent models by HINT and DEX for
- comparison
- "second opinion"
- flexibility
- For example, in this research we developed
- one DEX model
- two HINT models in supervised and unsupervised mode
67 6. Combined Mode
- Develop a single model using sub-models developed
- by different methods and
- from different sources
- Hypothetical example
- Develop subtree for status by HINT
- Develop soc-health by HINT from a different data set
- A real-estate expert develops the house subtree using DEX
- All three models "glued" together in DEX by a loan-allocation expert
68 DEX and HINT: Results
- Integration of DM and DS for model-based problem solving
- Requirements
- common model representation
- expertise and data (possibly partial)
- methods for "automatic" (DM) and "manual" (DS) model development
- Offers a multitude of method combinations
- independent, serial, parallel, combined, ...
- Specific schema
- qualitative hierarchical multi-attribute models
- DEX as a DS method
- HINT as a DM method
- Real-world application: Housing loan allocation
- Application of DEX-only, HINT-only, supervised and parallel modes
- Integration of DS and DM through HINT improved both the classification accuracy and comprehensibility of the model
69 Parallel Applications: Multiple DM models, then DS
Data Mining
Decision Support
Model 1
Model 3
Model 2
70 Problem: Prediction of Academic Achievement
Primary School
High School
Chapter 17: S. Gasar, M. Bohanec, V. Rajkovic
71 DM and DS Integration: Academic Achievement
Data
DM: HINT
DM: Weka
DS: DEXi
72 Parallel Application: EC Harris
Chapter 16: Steve Moyle, Marko Bohanec, Eric Ostrowski
73 Conclusion
- DM and DS approaches are
- complementary
- supplementary
- New and developing research area
- Typical combinations
- DS for DM
- DM for DS
- DM, then DS
- DS, then DM
- DM and DS
- Open questions
- formalization (framework) of DM and DS integration
- common methodologies and approaches
- standardization