Treatment Learning: Implementation and Application

Ying Hu, Electrical & Computer Engineering, University of British Columbia



1
Treatment Learning: Implementation and Application
  • Ying Hu
  • Electrical & Computer Engineering
  • University of British Columbia

2
Outline
  • An example
  • Background Review
  • TAR2: Treatment Learner
    • TARZAN: Tim Menzies
    • TAR2: Ying Hu, Tim Menzies
  • TAR3: improved TAR2
    • TAR3: Ying Hu
  • Evaluation of treatment learning
  • Application of Treatment Learning
  • Conclusion

3
First Impression
  • Boston Housing Dataset
  • (506 examples, 4 classes)

4
Review Background
  • What is KDD?
    • KDD: Knowledge Discovery in Databases [fayyad96]
    • Data mining: one step in the KDD process
    • Machine learning: learning algorithms
  • Common data mining tasks
    • Classification
      • Decision tree induction (C4.5) [quinlan86]
      • Nearest neighbors [cover67]
      • Neural networks [rosenblatt62]
      • Naive Bayes classifier [duda73]
    • Association rule mining
      • APRIORI algorithm [agrawal93]
      • Variants of APRIORI

5
Treatment Learning: Definition
  • Input: classified dataset
    • Assume classes are ordered
  • Output: Rx = conjunction of attribute-value pairs
    • Size of Rx = # of pairs in the Rx
    • confidence(Rx w.r.t. Class) = P(Class | Rx)
  • Goal: to find Rx that have different levels of
    confidence across classes
  • Evaluate Rx: lift
  • Visualization form of output
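
The definitions on this slide can be made concrete with a small sketch. It estimates confidence(Rx w.r.t. Class) = P(Class | Rx) from row counts on a toy dataset. The slide does not define lift, so the `lift` function below is an assumption: it divides the mean class score of the rows selected by Rx by the mean class score of all rows, with the ordered classes scored 1, 2, 3. The dataset, attribute names, and scoring scheme are all hypothetical.

```python
from collections import Counter

# Hypothetical toy dataset: each row is (attributes, class).
# Classes are ordered from worst to best: low < medium < high.
ROWS = [
    ({"rooms": "many", "crime": "low"}, "high"),
    ({"rooms": "many", "crime": "low"}, "high"),
    ({"rooms": "many", "crime": "high"}, "medium"),
    ({"rooms": "few", "crime": "low"}, "medium"),
    ({"rooms": "few", "crime": "high"}, "low"),
    ({"rooms": "few", "crime": "high"}, "low"),
]
CLASS_SCORE = {"low": 1, "medium": 2, "high": 3}  # assumed scoring of ordered classes


def matches(attrs, rx):
    """True if a row satisfies every attribute-value pair in the treatment Rx."""
    return all(attrs.get(a) == v for a, v in rx.items())


def confidence(rows, rx, cls):
    """confidence(Rx w.r.t. cls) = P(cls | Rx), estimated from row counts."""
    selected = [c for attrs, c in rows if matches(attrs, rx)]
    if not selected:
        return 0.0
    return Counter(selected)[cls] / len(selected)


def worth(classes):
    """Mean class score of a collection of class labels."""
    return sum(CLASS_SCORE[c] for c in classes) / len(classes)


def lift(rows, rx):
    """Assumed lift: worth of the rows selected by Rx over the worth of all rows."""
    selected = [c for attrs, c in rows if matches(attrs, rx)]
    if not selected:
        return 0.0
    return worth(selected) / worth([c for _, c in rows])


rx = {"rooms": "many"}                 # an Rx of size 1
print(confidence(ROWS, rx, "high"))    # P(high | rooms=many)
print(lift(ROWS, rx))                  # values above 1 mean Rx favors better classes
```

Under this assumed scoring, an Rx with lift above 1 shifts the class distribution toward the preferred classes, which is the kind of contrast the goal bullet describes.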

6
Motivation: Narrow Funnel Effect
  • When is enough learning enough?
    • Attributes < 50, accuracy decreases 3-5%
      [shavlik91]
    • 1-level decision tree is comparable to C4
      [Holte93]
    • Data engineering: ignoring 81 features results in
      a 2% increase in accuracy [kohavi97]
    • Scheduling: random sampling outperforms complete
      search (depth-first) [crawford94]
  • Narrow funnel effect
    • Control variables vs. derived variables
    • Treatment learning: finding funnel variables

7
TAR2: The Algorithm
  • Search and attribute utility estimation
    • Estimation heuristic: confidence1
    • Search: depth-first search
    • Search space: confidence1 > threshold
  • Discretization: equal width interval binning
  • Reporting Rx
    • Lift(Rx) > threshold
  • Software package and online distribution
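
The discretization step named above, equal width interval binning, can be sketched as follows. This is a generic illustration of the technique and not TAR2's own code; the function name, the bin count, and the sample values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so a single bin is enough.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last interval, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [1, 5, 10, 20, 35, 50, 100]     # hypothetical numeric attribute
print(equal_width_bins(ages, 3))       # prints [0, 0, 0, 0, 1, 1, 2]
```

Each interval spans the same range of values, so skewed data (as in this sample) can leave most rows in one bin. That is the usual trade-off of equal-width binning against equal-frequency binning.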

8
The Pilot Case Study
  • Requirement optimization
  • Goal: optimal set of mitigations in a
    cost-effective manner

[Diagram: Requirements, Risks, Mitigations, Cost, and Benefit, connected by the relations "relates", "incur", "reduce", and "achieve"]
  • Iterative learning cycle

9
The Pilot Study (continued)
  • Cost-benefit distribution (30/99 mitigations)

10
Problem of TAR2
  • Runtime vs. Rx size
  • To generate Rx of size r
  • To generate Rx from size 1..N
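
To get a feel for why runtime grows with Rx size, here is a rough sketch. It assumes an exhaustive search that considers every conjunction of r items drawn from N candidate attribute-value pairs, which gives at most C(N, r) candidates of size r and at most 2^N - 1 candidates over sizes 1..N. This counting argument is an assumption for illustration, not a statement of TAR2's measured runtime, and the value of N is hypothetical.

```python
from math import comb


def candidates_of_size(n_items, r):
    """Upper bound on the number of conjunctions of exactly r items out of n_items."""
    return comb(n_items, r)


def candidates_up_to(n_items, max_size):
    """Upper bound on the number of conjunctions of size 1..max_size."""
    return sum(comb(n_items, r) for r in range(1, max_size + 1))


n = 30  # hypothetical number of candidate attribute-value pairs
for r in (1, 2, 4, 8):
    print(r, candidates_of_size(n, r))

# Summing over every size gives all non-empty subsets of the n items.
print(candidates_up_to(n, n) == 2 ** n - 1)
```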

11
TAR3: The Improvement
  • Random sampling
  • Key idea
    • Treat the confidence1 distribution as a
      probability distribution
    • Sample Rx from the confidence1 distribution
  • Steps
    • Place items (ai) in increasing order according to
      confidence1 value
    • Compute the CDF of each ai
    • Sample a uniform value u in [0..1]
    • The sample is the least ai whose CDF > u
    • Repeat until we get an Rx of the given size
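
The sampling steps above can be sketched as follows. This is a minimal illustration, not the TAR3 source: it assumes strictly positive confidence1 scores, treats each item as an opaque label, and ignores any rule about combining two values of the same attribute. The scores and item names are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# Hypothetical confidence1 scores for four candidate items (attribute-value pairs).
SCORES = {"crime=low": 5.0, "rooms=many": 3.0, "age=old": 1.0, "tax=high": 1.0}


def build_cdf(confidence1):
    """Order items by increasing confidence1 and return (items, cdf)."""
    if any(score <= 0 for score in confidence1.values()):
        raise ValueError("this sketch assumes strictly positive scores")
    ordered = sorted(confidence1.items(), key=lambda kv: kv[1])
    items = [item for item, _ in ordered]
    total = sum(score for _, score in ordered)
    cdf = [running / total for running in accumulate(score for _, score in ordered)]
    return items, cdf


def sample_item(items, cdf, rng):
    """Draw u uniformly from [0, 1) and return the least item whose CDF exceeds u."""
    u = rng.random()
    idx = bisect_right(cdf, u)               # first index with cdf[idx] > u
    return items[min(idx, len(items) - 1)]   # guard against rounding in the last CDF entry


def sample_rx(confidence1, size, rng):
    """Repeat the draw until the treatment holds `size` distinct items."""
    items, cdf = build_cdf(confidence1)
    if size > len(items):
        raise ValueError("requested Rx size exceeds the number of items")
    rx = set()
    while len(rx) < size:
        rx.add(sample_item(items, cdf, rng))
    return rx


rng = random.Random(0)                       # seeded only to make the demo repeatable
print(sample_rx(SCORES, 2, rng))
```

Items with large confidence1 scores occupy wide slices of the CDF, so they are drawn more often than low-scoring items, while low-scoring items still have a nonzero chance of being selected.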

12
Comparison of Efficiency

13
Comparison of Results
  • 10 UCI domains, identical best Rx
  • Final Rx: TAR2: 19, TAR3: 20

14
External Evaluation
  • FSS (feature subset selection) framework
[Diagram: all attributes (10 UCI datasets); feature subset selector (TAR2less); learners C4.5 and Naive Bayes]

15
The Results
  • Number of attributes
  • Accuracy using C4.5 (avg. decrease 0.9%)
  • Accuracy using Naïve Bayes (avg. increase 0.8%)

16
Compare to other FSS methods
  • # of attributes selected (C4.5)
  • # of attributes selected (Naive Bayes)
  • 17/20, fewest attributes selected
  • Further evidence for funnels

17
Applications of Treatment Learning
  • Download site: http://www.ece.ubc.ca/yingh/
  • Collaborators: JPL, WV, Portland, Miami
  • Application examples
    • Pair programming vs. conventional programming
    • Identify software metrics that are superior error
      indicators
    • Identify attributes that make FSMs easy to test
    • Find the best software inspection policy for a
      particular software development organization
  • Other applications
  • 1 journal, 4 conference, 6 workshop papers

18
Main Contributions
  • New learning approach
  • A novel mining algorithm
  • Algorithm optimization
  • Complete package and online distribution
  • Narrow funnel effect
  • Treatment learner as FSS
  • Application on various research domains

19
  • Some notes follow

20
Rx Definition: Example
  • Input example
    • Classified dataset
  • Output example
    • Rx = conjunction of attribute-value pairs
    • confidence(Rx w.r.t. C) = P(C | Rx)

21
TAR2 in practice
  • Domains containing narrow funnels
  • A tail in the confidence1 distribution
  • A small number of variables that have
    disproportionately large confidence1 values
  • Satisfactory Rx of small size (< 6)

22
Background: Classification
  • 2-step procedure
    • The learning phase
    • The testing phase
  • Strategies employed
    • Eager learning
      • Decision tree induction (e.g. C4.5)
      • Neural networks (e.g. backpropagation)
    • Lazy learning
      • Nearest neighbor classifiers (e.g. k-nearest
        neighbor classifier)

23
Background: Association Rule
ID  Transactions
1   A, B, C, E, F
2   B, C, E
3   B, C, D, E
4
  • Representative algorithms
    • APRIORI
      • Apriori property of large itemsets
    • Max-Miner
      • More concise representation of the discovered
        rules
      • Different pruning strategies
  • Possible rule
    • B -> C, E
    • support = 2, confidence = 80%
  • Where
    • support(X -> Y) = P(X)
    • confidence(X -> Y) = P(Y | X)
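
The two measures defined above can be computed directly from a transaction list. The sketch below follows the slide's definitions, support(X -> Y) = P(X) and confidence(X -> Y) = P(Y | X), and uses only the three transactions whose items are listed, so its numbers are not meant to reproduce the figures quoted for the example rule. The function names are hypothetical.

```python
# The three transactions whose items are listed in the table.
TRANSACTIONS = [
    {"A", "B", "C", "E", "F"},
    {"B", "C", "E"},
    {"B", "C", "D", "E"},
]


def prob(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def support(x, y, transactions):
    """support(X -> Y) = P(X), following the definition given on the slide."""
    return prob(x, transactions)


def confidence(x, y, transactions):
    """confidence(X -> Y) = P(Y | X) = P(X and Y) / P(X)."""
    p_x = prob(x, transactions)
    return prob(x | y, transactions) / p_x if p_x else 0.0


x, y = {"B"}, {"C", "E"}
print(support(x, y, TRANSACTIONS))      # every listed transaction contains B
print(confidence(x, y, TRANSACTIONS))   # every transaction with B also has C and E
print(confidence({"C"}, {"F"}, TRANSACTIONS))
```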

24
Background: Extension
  • CBA classifier
    • CBA: Classification Based on Association
    • X -> Y, Y = class label
    • More accurate than C4.5 (16/26)
  • JEP classifier
    • JEP: Jumping Emerging Patterns
    • Support(X w.r.t. D1) = 0, Support(X w.r.t. D2) > 0
    • Model: collection of JEPs
    • Classify: maximum collective impact
    • More accurate than both C4.5 and CBA (15/25)
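
The JEP condition above (zero support in D1, positive support in D2) is easy to check for a single itemset. The sketch below covers only that test; it does not implement JEP mining or the "maximum collective impact" classification step. The two datasets and the function names are hypothetical.

```python
def support(itemset, dataset):
    """Fraction of rows in dataset that contain every item in itemset."""
    if not dataset:
        return 0.0
    return sum(itemset <= row for row in dataset) / len(dataset)


def is_jep(itemset, d1, d2):
    """True if itemset never occurs in d1 but occurs at least once in d2."""
    return support(itemset, d1) == 0 and support(itemset, d2) > 0


# Hypothetical datasets, one per class.
D1 = [{"a", "b"}, {"a", "c"}, {"b", "c"}]
D2 = [{"a", "b", "c"}, {"a", "d"}, {"b", "c"}]

print(is_jep({"a", "b", "c"}, D1, D2))  # absent from D1, present in D2
print(is_jep({"b", "c"}, D1, D2))       # present in both, so not a JEP
```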

25
Background: Standard FSS Methods
  • Information Gain attribute ranking
  • Relief
  • Principal Component Analysis (PCA)
  • Correlation based feature selection
  • Consistency based subset evaluation
  • Wrapper subset evaluation

26
Comparison
  • Relation to classification
    • Class boundary / class density
    • Class weighting
  • Relation to association rule mining
    • Multiple classes / no class
    • Confidence-based pruning
  • Relation to change-detection algorithms
    • support: P(X | y = c1) - P(X | y = c2)
    • confidence: P(y = c1 | X) - P(y = c2 | X)
    • Bayes rule

27
Confidence Property
  • Universal-existential upward closure
    • R1: Age.young -> Salary.low
    • R2: Age.young, Gender.m -> Salary.low
    • R3: Age.young, Gender.f -> Salary.low
  • Long rules tend to have high confidence
  • Large Rx tend to have high lift value
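
The upward-closure property can be checked with the law of total probability. In the sketch below, Y stands for Age.young, L for Salary.low, R_2 for the rule that adds Gender.m, and R_3 for the rule that adds Gender.f. It assumes Gender takes exactly the two values m and f, which is an assumption made for this illustration.

```latex
\begin{align*}
\mathrm{conf}(R_1) &= P(L \mid Y) \\
  &= P(m \mid Y)\,P(L \mid Y, m) + P(f \mid Y)\,P(L \mid Y, f) \\
  &= P(m \mid Y)\,\mathrm{conf}(R_2) + P(f \mid Y)\,\mathrm{conf}(R_3),
  \qquad P(m \mid Y) + P(f \mid Y) = 1 .
\end{align*}
```

Because conf(R_1) is a weighted average of conf(R_2) and conf(R_3), at least one of the two longer rules has confidence no lower than R_1. This is consistent with the observation that longer rules tend to show higher confidence.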

28
TAR3: Usability
  • Usability: more user-friendly
  • Intuitive, default setting