Classification in Complex Systems

1
Classification in Complex Systems
  • Why we should look at the paper
  • CAEP: Classification by Aggregating Emerging
    Patterns
  • G. Dong, X. Zhang, L. Wong, and J. Li

2
What are Common Problems in Classification?
  • Many variables
  • Graphs that relate tuples
  • Protein-protein interactions (KDD Cup 2002)
  • Citations (KDD Cup 2003)
  • Anything that violates standard table format

3
Many Variables
  • Solution:
  • Naïve Bayes way of multiplying probabilities
  • Other additive models
  • Problems:
  • Many factors
  • Factors may be correlated
  • Noise
  • ... but it gets worse
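
A minimal sketch of the Naïve Bayes combination named on this slide, assuming the attributes are conditionally independent given the class. The function name and the toy probabilities are illustrative and not taken from the presentation. Working in log space turns the product into a sum, which avoids numerical underflow when there are many variables.

```python
import math
from typing import Dict, Iterable


def naive_bayes_log_score(prior: float,
                          likelihoods: Dict[str, float],
                          observed: Iterable[str]) -> float:
    """log P(C) + sum of log P(attribute | C) over the observed attributes."""
    return math.log(prior) + sum(math.log(likelihoods[a]) for a in observed)


# Hypothetical numbers: P(C) = 0.5, P(x1 | C) = 0.8, P(x2 | C) = 0.5
nb_log_score = naive_bayes_log_score(0.5, {"x1": 0.8, "x2": 0.5}, ["x1", "x2"])
nb_probability = math.exp(nb_log_score)  # product 0.5 * 0.8 * 0.5
```

Correlated attributes are effectively counted more than once by this product, which is one reason the slide lists correlation as a problem.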

4
Graphs
  • 2 kinds of attributes
  • Attributes within nodes
  • Attributes of neighbor and more distant nodes
  • How do neighbor attributes count?
  • Take disjunction?
  • At least one neighbor that has a particular
    property
  • Probably preferable
  • Use links or, more generally, paths as basis
  • Integration into classification???
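
The "at least one neighbor has a particular property" idea can be sketched as a derived Boolean attribute. This is only an illustration under assumed data structures: the adjacency dictionary, the property sets, and the name `has_neighbor_with` are hypothetical and do not come from the presentation.

```python
from typing import Dict, List, Set


def has_neighbor_with(node: str, prop: str,
                      edges: Dict[str, List[str]],
                      node_props: Dict[str, Set[str]]) -> bool:
    """True if at least one direct neighbor of `node` carries `prop`."""
    return any(prop in node_props.get(nb, set()) for nb in edges.get(node, []))


# Hypothetical toy interaction graph (adjacency lists) and node attributes
edges = {"p1": ["p2", "p3"], "p2": ["p1"], "p3": ["p1"], "p4": []}
node_props = {"p1": {"membrane"}, "p2": {"kinase"}, "p3": set(), "p4": {"kinase"}}

# Disjunction over neighbors, computed once per node
features = {n: has_neighbor_with(n, "kinase", edges, node_props) for n in edges}
```

The resulting Boolean could be treated like any other item of a tuple. Extending the idea from links to longer paths would need a graph traversal, which this sketch does not attempt.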

5
Idea
  • Get away from strict set of n attributes
  • If an attribute or combination of attributes is
    interesting, use it
  • Combining rules?
  • I would have guessed as in Naïve Bayes
  • CAEP adds probabilities!?

6
What is interesting?
  • CAEP paper's answer: growth rate
  • Support of a rule increases significantly from
    one class label to another
  • Note: Only increase, not decrease!
  • What does that mean?
  • For pattern e and classes P and N
  • growth_rate_{P→N}(e) = supp_N(e) / supp_P(e)
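
The growth rate defined on this slide can be computed directly from two sets of instances. The sketch below assumes each instance is a set of items and that support is the fraction of instances containing the pattern. The handling of zero support in P (0 if the pattern is absent from both classes, infinity if it occurs only in N) is a convention assumed here, and the toy data is made up.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def support(pattern: Itemset, instances: List[Itemset]) -> float:
    """Fraction of instances that contain every item of the pattern."""
    if not instances:
        return 0.0
    return sum(1 for inst in instances if pattern <= inst) / len(instances)


def growth_rate(pattern: Itemset,
                from_class: List[Itemset],
                to_class: List[Itemset]) -> float:
    """supp_to(pattern) / supp_from(pattern), i.e. growth from P to N."""
    supp_from = support(pattern, from_class)
    supp_to = support(pattern, to_class)
    if supp_from == 0.0:
        # Assumed convention: 0 if absent everywhere, infinite if only in the target class
        return 0.0 if supp_to == 0.0 else float("inf")
    return supp_to / supp_from


# Hypothetical toy data for classes P and N
P = [frozenset({"a", "b"}), frozenset({"a"}), frozenset({"b", "c"}), frozenset({"c"})]
N = [frozenset({"a", "b"}), frozenset({"a", "b", "c"}), frozenset({"a", "b"}), frozenset({"c"})]

e = frozenset({"a", "b"})
rate = growth_rate(e, P, N)  # supp_N = 3/4, supp_P = 1/4
```

Because only increases count, a pattern whose support drops from P to N is not an emerging pattern for N, although it may be one for P when the roles are swapped.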

7
2 Things Worth Investigating
  • Is interestingness measure related to
    information gain?
  • Under certain assumptions Yes
  • Can the score be justified?
  • Sum of P(C)!?
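
For reference, a sketch of the additive score being questioned here. It follows the aggregate score as I recall it from the CAEP paper: each emerging pattern of a class that is contained in the instance contributes growth_rate / (growth_rate + 1) times its support in that class. The class and function names and the toy patterns are illustrative assumptions.

```python
import math
from typing import FrozenSet, List, NamedTuple


class EmergingPattern(NamedTuple):
    items: FrozenSet[str]
    growth_rate: float  # growth rate into the target class
    support: float      # support within the target class


def strength(gr: float) -> float:
    """gr / (gr + 1), taking the limit 1.0 for an infinite growth rate."""
    return 1.0 if math.isinf(gr) else gr / (gr + 1.0)


def aggregate_score(instance: FrozenSet[str],
                    patterns: List[EmergingPattern]) -> float:
    """Sum the contributions of all patterns contained in the instance."""
    return sum(strength(ep.growth_rate) * ep.support
               for ep in patterns if ep.items <= instance)


# Hypothetical emerging patterns of class N
eps_N = [
    EmergingPattern(frozenset({"a", "b"}), 3.0, 0.75),
    EmergingPattern(frozenset({"d"}), math.inf, 0.5),
    EmergingPattern(frozenset({"c", "e"}), 4.0, 0.25),
]

s = frozenset({"a", "b", "d"})
score_N = aggregate_score(s, eps_N)  # 0.75 * 0.75 + 1.0 * 0.5
```

When both classes have the same size, growth_rate / (growth_rate + 1) equals supp_N(e) / (supp_N(e) + supp_P(e)), which can be read as an estimate of the class probability given the pattern. That may be what the "sum of probabilities" remark refers to: the terms are added rather than multiplied as in Naïve Bayes.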

8
Other Issues
  • Normalization
  • Emerging patterns only consider increase in
    support, which leads to a different number of
    relevant patterns for each class
  • How to mine for EPs
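
To illustrate why normalization matters, the sketch below divides each raw score by a per-class base score. Using the median of the training-instance scores as the base is an assumption made for this example (the CAEP paper describes a percentile-based base score, as far as I recall), and all numbers are invented.

```python
from statistics import median
from typing import Dict, List


def normalized_scores(raw: Dict[str, float],
                      training_scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Divide each class's raw score by the median training score of that class."""
    return {c: raw[c] / median(training_scores[c]) for c in raw}


# Hypothetical setting: class P has many more patterns, so its raw scores run higher
training_scores = {"P": [2.0, 4.0, 6.0], "N": [0.5, 1.0, 1.5]}
raw = {"P": 3.0, "N": 1.2}

norm = normalized_scores(raw, training_scores)
predicted = max(norm, key=norm.get)
```

Here the raw score favors P, but after normalization the instance scores above the typical N instance and below the typical P instance, so N is predicted.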

9
Conclusions
  • Idea very valuable
  • Classification split into an ARM (association
    rule mining) step and a rule-combination step
  • Justification of details?
  • Not great
  • Should be possible to do it right with poorer
    accuracy :-)