Intelligent Data Analysis IDA - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Intelligent Data Analysis (IDA)
  • by
  • Josipa Kern, PhD
  • Andrija Stampar School of Public Health
  • Medical School University of Zagreb
  • Zagreb, Croatia

2
Interest and Excitement for Intelligent Data
Analysis
  • Decision making requires information and
    knowledge
  • Data processing can provide them
  • Multidimensional problems call for methods of
    adequate, in-depth data processing and analysis

3
Learning Objectives
  • To understand the concept of the IDA
  • To become familiar with websites and literature on IDA
  • To become familiar with some tools for IDA
  • To learn how to use IDA tools and to validate the
    IDA results

4
Performance Objectives
  • Recognizing problems that call for IDA
  • Preparing data and performing the analysis
  • Validating and interpreting the results of IDA

5
IDA is
  • an interdisciplinary study concerned with the
    effective analysis of data
  • used for extracting useful information from
    large quantities of online data, and for extracting
    desirable knowledge or interesting patterns from
    existing databases

6
IDA or
  • Data mining
  • Knowledge acquisition from data
  • Genetic algorithm-based rule discovery
  • Knowledge discovery
  • Learning classifier system
  • Machine learning
  • etc.

7
IDA gives knowledge
8
Knowledge is
  • the distillation of information that has been
    collected, classified, organized, integrated,
    abstracted and value-added
  • at a level of abstraction higher than the data
    and information on which it is based, and can be
    used to deduce new information and new knowledge
  • usually in the context of human expertise used in
    solving problems.

9
Knowledge acquisition
  • The process of eliciting, analyzing,
    transforming, classifying, organizing and
    integrating knowledge and representing that
    knowledge in a form that can be used in a
    computer system.

10
Knowledge in a domain can be expressed as a
number of rules
11
A rule is
  • A formal way of specifying a recommendation,
    directive, or strategy, expressed as "IF premise
    THEN conclusion" or "IF condition THEN action".
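
The "IF premise THEN conclusion" form maps naturally onto a small data structure. The following is a minimal Python sketch, not part of the original presentation: the `Rule` class, the `fever_rule` example, and the `temperature_c` attribute are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Case = Dict[str, Any]  # one record: attribute name -> value


@dataclass
class Rule:
    """IF premise THEN conclusion."""
    premise: Callable[[Case], bool]  # the IF part: a test on one case
    conclusion: str                  # the THEN part: a recommendation or class

    def apply(self, case: Case) -> Optional[str]:
        """Return the conclusion if the premise holds, otherwise None."""
        return self.conclusion if self.premise(case) else None


# Hypothetical rule, for illustration only (not taken from the presentation).
fever_rule = Rule(
    premise=lambda c: c["temperature_c"] > 38.0,
    conclusion="suspect fever",
)

print(fever_rule.apply({"temperature_c": 39.2}))  # premise holds
print(fever_rule.apply({"temperature_c": 36.8}))  # premise fails, so None
```

Keeping the premise as a plain function means a set of rules can be checked against a case one by one, which is roughly how a rule set is read by hand.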

12
How to discover rules hidden in the data?
13
Some tools for IDA
  • See5 - program for analyzing data and generating
    classifiers in the form of decision trees and/or
    rule sets.
  • http://www.rulequest.com

14
Some tools for IDA
  • Cubist - analyzes data and generates rule-based
    piecewise linear models: collections of rules,
    each with an associated linear expression for
    computing a target value.
  • http://www.rulequest.com

15
Some tools for IDA
  • ILLM - the tool constructs classification models
    in the form of rules which represent knowledge
    about relations hidden in data.
  • http://dms.irb.hr

16
Some tools for IDA
  • Magnum Opus - finds association rules providing
    competitive advantage by revealing underlying
    interactions between factors within the data. 
  • http://www.rulequest.com

17
Evaluation of IDA results
  • Absolute and relative accuracy
  • Sensitivity and specificity
  • False positives and false negatives
  • Error rate
  • Reliability of rules
  • Etc.
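
Several of these measures follow directly from the four cells of a two-class confusion matrix. The sketch below uses the standard textbook definitions; the function name `evaluate` and the counts passed to it are hypothetical and serve only to show the arithmetic.

```python
def evaluate(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute common evaluation measures from a 2x2 confusion matrix.

    tp: positives classified as positive   fn: positives classified as negative
    fp: negatives classified as positive   tn: negatives classified as negative
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,     # share of cases classified correctly
        "error_rate": (fp + fn) / total,   # share of cases classified wrongly
        "sensitivity": tp / (tp + fn),     # true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
    }


# Hypothetical counts, for illustration only.
measures = evaluate(tp=40, fn=10, fp=5, tn=45)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Which class counts as "positive" is a choice made by the analyst, and sensitivity and specificity swap roles if that choice is reversed.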

18
Example of IDA
  • Illustration of IDA by using See5

19
See5 application
  • application.names - lists the classes to which
    cases may belong and the attributes used to
    describe each case.
  • Attributes are of two types: discrete attributes
    have a value drawn from a set of possibilities,
    and continuous attributes have numeric values.

20
See5 application
  • application.data - provides information on the
    training cases from which See5 will extract
    patterns.
  • The entry for each case consists of one or more
    lines that give the values for all attributes.

21
See5 application
  • application.test - provides information on the
    test cases (used for evaluation of results).
  • The entry for each case consists of one or more
    lines that give the values for all attributes.

22
See5 application example
  • Epidemiological study (1970-1990)
  • Sample of examinees who died from cardiovascular
    diseases during the period
  • Question: Did they know they were ill?
  • 1: they were healthy
  • 2: they were ill (drug treatment, positive
    clinical and laboratory findings)

23
See5 application example
  • application.names example
  • Goal.
  • gender: M,F
  • activity: 1,2,3
  • age: continuous
  • smoking: No,Yes
  • Goal: 1,2
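
To make the structure of such a file concrete, here is a minimal Python sketch that reads a simplified names file into a target name and a dictionary of attributes. It assumes the "name: values" syntax with a colon after the attribute name, an optional trailing period, and "|" as the comment character; `parse_names` and `NAMES_TEXT` are hypothetical names, and the sketch covers only the features visible in the slide, not the full See5 format.

```python
# The slide's example, written with the assumed "name: values" syntax.
NAMES_TEXT = """\
Goal.
gender: M,F
activity: 1,2,3
age: continuous
smoking: No,Yes
Goal: 1,2
"""


def parse_names(text: str):
    """Return (target, attributes) from a simplified .names file."""
    entries = []
    for line in text.splitlines():
        line = line.split("|", 1)[0].strip()  # drop comments (assumed '|')
        if line:
            entries.append(line.rstrip("."))  # trailing period is optional here
    target = entries[0]                       # first entry names the target
    attributes = {}
    for entry in entries[1:]:
        name, spec = (part.strip() for part in entry.split(":", 1))
        if spec == "continuous":
            attributes[name] = "continuous"   # numeric attribute
        else:
            attributes[name] = [v.strip() for v in spec.split(",")]  # discrete
    return target, attributes


target, attributes = parse_names(NAMES_TEXT)
print("target:", target)
for name, spec in attributes.items():
    print(f"{name}: {spec}")
```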

24
See5 application example
  • application.data example
  • M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5
  • M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6
  • M,1,61,No,0,0,0,0,130,79,148,86,209,115,21719,12324,10593,11458,1,74,2.5
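
Each case is a comma-separated list of attribute values, and some values are given as "?". The sketch below reads the three cases shown on the slide with Python's standard `csv` module and counts the "?" entries per case. Treating "?" as an unknown value follows the usual See5 convention and is an assumption here, since the slide does not say so; `parse_cases` and `DATA_TEXT` are hypothetical names.

```python
import csv
import io

# The three cases from the slide, one case per line.
DATA_TEXT = """\
M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5
M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6
M,1,61,No,0,0,0,0,130,79,148,86,209,115,21719,12324,10593,11458,1,74,2.5
"""


def parse_cases(text: str):
    """Return one list of values per case; '?' becomes None (assumed: unknown)."""
    cases = []
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip empty lines
            cases.append([None if v.strip() == "?" else v.strip() for v in row])
    return cases


cases = parse_cases(DATA_TEXT)
missing_per_case = [sum(v is None for v in case) for case in cases]
print("values per case:", [len(case) for case in cases])
print("unknown values per case:", missing_per_case)
```

Values are kept as strings on purpose: deciding which columns are numeric belongs to the names file, not to the data file.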

25
See5 application example
  • Results example
  • Rule 1: (cover 26)
  • gender = M
  • SBP > 111
  • oil_fat > 2.9
  • -> class 1 [0.929]

26
See5 application example
  • Results example
  • Rule 4: (cover 14)
  • smoking = Yes
  • SBP > 131
  • glucose > 93
  • glucose <= 118
  • oil_fat <= 2.9
  • -> class 2 [0.938]

27
See5 application example
  • Results example
  • Rule 15: (cover 2)
  • SBP <= 111
  • oil_fat > 2.9
  • -> class 2 [0.750]
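
To see how such rules are read against a single case, the three rules shown (1, 4 and 15) can be written out as conditions and checked one by one. This is a minimal Python sketch under stated assumptions: the comparison operators are read as ">" and "<=", which is an interpretation of the printed output; the example case is hypothetical; and `matching_rules` only reports which rules fire, without reproducing how See5 combines several matching rules into a final prediction.

```python
# Each rule: (rule number, conditions, predicted class, confidence from the slide).
RULES = [
    (1, [lambda c: c["gender"] == "M",
         lambda c: c["SBP"] > 111,
         lambda c: c["oil_fat"] > 2.9], 1, 0.929),
    (4, [lambda c: c["smoking"] == "Yes",
         lambda c: c["SBP"] > 131,
         lambda c: c["glucose"] > 93,
         lambda c: c["glucose"] <= 118,
         lambda c: c["oil_fat"] <= 2.9], 2, 0.938),
    (15, [lambda c: c["SBP"] <= 111,
          lambda c: c["oil_fat"] > 2.9], 2, 0.750),
]


def matching_rules(case: dict):
    """Return (rule number, class, confidence) for every rule whose conditions all hold."""
    return [
        (number, predicted_class, confidence)
        for number, conditions, predicted_class, confidence in RULES
        if all(condition(case) for condition in conditions)
    ]


# Hypothetical case, for illustration only (not from the study data).
case = {"gender": "M", "smoking": "Yes", "SBP": 140, "glucose": 100, "oil_fat": 2.5}
print(matching_rules(case))  # only Rule 4 fires for this case
```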

28
See5 application example
  • Results example
  • Evaluation on training data (199 cases)
  •   (a)   (b)   <-classified as
  •   ----  ----
  •   107     3   (a): class 1
  •    17    72   (b): class 2

29
See5 application example
  • Results example (training set)
  • Sensitivity = 0.97
  • Specificity = 0.81
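
These two values can be traced back to the training confusion matrix (107 and 3 in the class 1 row, 17 and 72 in the class 2 row). The worked arithmetic below assumes that class 1 is treated as the "positive" class, which is the reading that reproduces the reported figures:

```latex
\begin{align*}
\text{Sensitivity} &= \frac{107}{107 + 3} = \frac{107}{110} \approx 0.97 \\
\text{Specificity} &= \frac{72}{72 + 17} = \frac{72}{89} \approx 0.81
\end{align*}
```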

30
See5 application example
  • Results example
  • Evaluation on test data (73 cases)
  •   (a)   (b)   <-classified as
  •   ----  ----
  •    43     1   (a): class 1
  •     3    26   (b): class 2

31
See5 application example
  • Results example (test set)
  • Sensitivity = 0.98
  • Specificity = 0.90
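
The same check works for the test confusion matrix (43 and 1 in the class 1 row, 3 and 26 in the class 2 row), again assuming class 1 is the "positive" class. The error rate is added here as a further illustration; it is computed from the matrix and is not a figure reported on the slides:

```latex
\begin{align*}
\text{Sensitivity} &= \frac{43}{43 + 1} = \frac{43}{44} \approx 0.98 \\
\text{Specificity} &= \frac{26}{26 + 3} = \frac{26}{29} \approx 0.90 \\
\text{Error rate} &= \frac{1 + 3}{73} = \frac{4}{73} \approx 0.055
\end{align*}
```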

32
All the suggested IDA tools are available at the
URLs mentioned, at least as demo versions.
Try your own IDA.
Thank you!