Stability analysis on rough set based feature evaluation - PowerPoint PPT Presentation

Provided by: huqin

Transcript and Presenter's Notes

Title: Stability analysis on rough set based feature evaluation


1
Stability analysis on rough set based feature
evaluation
  • Qing-Hua Hu
  • Harbin Institute of Technology
  • May 15, 2008

2
1. How to evaluate a feature evaluation and
selection algorithm?
  • Classification performance of the selected
    features
  • Number of selected features
  • Estimation precision of Bayes error rate
  • Whether it is linear or nonlinear
  • Whether it can deal with heterogeneous features

3
2. Rough set based feature evaluation
  • Rough set theory is widely discussed in feature
    evaluation and attribute reduction
  • Pawlak rough set: nominal features
  • Neighborhood rough set: heterogeneous features
  • Dominance rough set: heterogeneous features
    in ordinal decision
  • Fuzzy rough set: heterogeneous features

4
3. Stability problem in feature evaluation
  • A. Kalousis, J. Prados, M. Hilario. Stability of
    feature selection algorithms: a study on
    high-dimensional spaces. Knowledge and
    Information Systems 12 (1) 95-116, 2007
  • In fact, the feature quality is estimated with
    training samples
  • Precision of estimation depends on training
    samples and evaluation function
  • A question: how do evaluation functions behave if
    the samples and parameters used in the functions
    are perturbed?

5
3. Technique for stability analysis (1)
  • perturbing samples like cross validation
  • dividing samples into k subsets S1, S2, ..., Sk
  • using k-1 of the subsets to compute the quality
  • Q1 = (q11, q12, ..., q1N)
  • producing k estimates of feature quality
  • (Q1, Q2, ..., Qk)
  • computing the correlation coefficients of the k
    estimates
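
The steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which correlation coefficient is used, so Pearson's is assumed here, and `mean_gap` is a hypothetical stand-in for a real evaluation function, not one of the rough set measures compared later.

```python
import random
from math import sqrt

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0

def quality_estimates(X, y, evaluate, k=10, seed=0):
    """Return k quality vectors Q1..Qk, each computed with one subset left out."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # subsets S1..Sk
    estimates = []
    for held_out in folds:
        skip = set(held_out)
        keep = [i for i in idx if i not in skip]   # the other k-1 subsets
        estimates.append(evaluate([X[i] for i in keep], [y[i] for i in keep]))
    return estimates

def correlation_matrix(estimates):
    """R[i][j] is the correlation between the ith and jth quality vectors."""
    return [[pearson(qi, qj) for qj in estimates] for qi in estimates]

def mean_gap(X, y):
    """Hypothetical evaluation function: per-feature gap between the
    means of class 0 and class 1 (assumes exactly these two classes)."""
    scores = []
    for f in range(len(X[0])):
        a = [row[f] for row, c in zip(X, y) if c == 0]
        b = [row[f] for row, c in zip(X, y) if c == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return scores

# Toy data (assumption): 60 samples, 3 features of decreasing relevance.
rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[c * 2.0 + rng.gauss(0, 1), c * 0.5 + rng.gauss(0, 1), rng.gauss(0, 1)]
     for c in y]

Q = quality_estimates(X, y, mean_gap, k=5)   # (Q1, ..., Q5)
R = correlation_matrix(Q)                    # 5 x 5 matrix of Rij
```

Any function that maps a training set to a vector of N feature scores can be passed as `evaluate`, so the same loop works for each evaluation function under comparison.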

6
3. Technique for stability analysis (1)
  • computing the overall stability of estimates
  • Rij reflects the correlation between the ith and
    the jth estimates
  • If a feature evaluation function is stable,
    Rij is close to 1; otherwise, Rij is close to 0
  • Given the correlation coefficient matrix, we
    then compute the overall stability
  • Algorithm 1

7
3. Technique for stability analysis (1)
  • Algorithm 2: fuzzy entropy
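
The transcript names fuzzy entropy but does not reproduce its formula. The sketch below assumes it is the fuzzy information entropy of a relation matrix as used in Hu, Yu, Xie (2006), H(R) = -(1/n) Σ_i log(|[x_i]_R| / n) with |[x_i]_R| = Σ_j r_ij. The base-2 logarithm and the two toy matrices are assumptions made for illustration.

```python
from math import log2

def fuzzy_entropy(R):
    """Fuzzy entropy of an n x n relation matrix R.

    Computes (1/n) * sum_i log2(n / sum_j R[i][j]), which equals
    -(1/n) * sum_i log2(sum_j R[i][j] / n).
    Assumes entries lie in [0, 1] and R[i][i] == 1, so every row sum
    is positive.
    """
    n = len(R)
    return sum(log2(n / sum(row)) for row in R) / n

# All estimates perfectly correlated: entropy 0 (most stable).
stable = [[1.0] * 4 for _ in range(4)]
# Estimates mutually uncorrelated: entropy log2(4) = 2 (least stable).
unstable = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

h_stable = fuzzy_entropy(stable)      # 0.0
h_unstable = fuzzy_entropy(unstable)  # 2.0
```

Under this reading, a smaller entropy means a more stable evaluation function. Negative correlations would need to be clipped or rescaled before being passed in.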

8
3. Technique for stability analysis (2)
  • Feature ranking
  • perturbing samples like cross validation
  • dividing samples into k subsets S1, S2, ..., Sk
  • using k-1 of the subsets to compute the quality
  • Q1 = (q11, q12, ..., q1N)
  • producing k estimates of feature quality
  • (Q1, Q2, ..., Qk)
  • Ranking features by feature quality
  • Computing the Spearman's rank correlation
    coefficient matrix
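
As a minimal sketch of the ranking variant, Spearman's coefficient can be computed as the Pearson correlation of the rank vectors. The tie handling (average ranks) and the three example quality vectors are assumptions; the slides do not specify either.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 (smallest) to N, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0

def spearman_matrix(estimates):
    """Spearman rank correlation between every pair of quality vectors."""
    ranked = [average_ranks(q) for q in estimates]
    return [[pearson(ri, rj) for rj in ranked] for ri in ranked]

# Three made-up quality vectors over four features.
Q = [
    [0.9, 0.5, 0.1, 0.3],
    [0.8, 0.6, 0.2, 0.1],   # nearly the same ordering as the first
    [0.1, 0.2, 0.9, 0.8],   # reversed ordering
]
rho = spearman_matrix(Q)
# rho[0][1] is 0.8, rho[0][2] is -1.0 (up to rounding)
```

Because only the ordering matters, two estimates with very different score magnitudes still count as stable when they rank the features identically.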

9
3. Technique for stability analysis (3)
  • Feature subsets
  • perturbing samples
  • dividing samples into k subsets S1, S2, ..., Sk
  • using k-1 of the subsets to select features f
  • producing k feature subsets (f1, f2, ..., fk)
  • Computing the similarity matrix of the
    feature subsets
  • Computing the fuzzy entropy of the matrix
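
The subset variant can be sketched as follows. The slides do not say which similarity measure is used, so the Jaccard (Tanimoto) index is assumed here; the entropy formula is likewise an assumed form, (1/n) Σ_i log2(n / Σ_j s_ij), and the three subsets are made up.

```python
from math import log2

def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two feature subsets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0                      # two empty subsets are identical
    return len(a & b) / len(a | b)

def similarity_matrix(subsets):
    """S[i][j] is the similarity between the ith and jth feature subsets."""
    return [[jaccard(si, sj) for sj in subsets] for si in subsets]

def fuzzy_entropy(S):
    """Assumed entropy of a similarity matrix with entries in [0, 1]
    and ones on the diagonal; 0 means all subsets are identical."""
    n = len(S)
    return sum(log2(n / sum(row)) for row in S) / n

# Feature subsets selected on three perturbed training sets (made up).
subsets = [{0, 1, 2}, {0, 1, 3}, {0, 1, 2}]
S = similarity_matrix(subsets)   # S[0][1] is 0.5, S[0][2] is 1.0
H = fuzzy_entropy(S)             # about 0.37, between 0 and log2(3)
```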

10
4. Feature evaluation functions to be compared
  • Pawlak dependency (D)
  • Swiniarski, Skowron. Rough set
    methods in feature selection and recognition.
    Pattern Recognition Letters 24 (6) 833-849, 2003
  • Consistency (C)
  • Dash, Liu. Consistency-based search
    in feature selection. Artificial Intelligence 151
    (2003) 155-176
  • Neighborhood dependency (ND)
  • Hu, Yu, Xie. Neighborhood
    classifier. Expert Systems with Applications 34
    (2008) 866-876
  • Neighborhood consistency (NC)
  • Hu, Yu, et al. submitted
  • Entropy (E)
  • Slezak. Approximate Entropy Reducts.
    Fundam. Inform. 53 (3-4) 365-390 (2002)
  • Fuzzy entropy (FE)
  • Hu, Yu, Xie. Information-preserving
    hybrid data reduction based on fuzzy-rough
    techniques. Pattern Recognition Letters 27
    (2006) 414-423
  • Fuzzy rough set based dependency (FRS)
  • Chen, Hu, Wang. A novel feature
    selection method based on fuzzy rough sets for
    Gaussian kernel SVM. Submitted to Neurocomputing,
    2007
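
To make the first two functions concrete, here is a minimal sketch for nominal data. It assumes the standard definitions: Pawlak dependency is the fraction of samples whose equivalence class (same values on the chosen features) carries a single decision, and consistency is one minus the inconsistency rate, where the minority-decision samples of each class count as inconsistent. The helper names and the toy table are hypothetical.

```python
from collections import Counter, defaultdict

def _decisions_per_class(rows, labels, features):
    """Group samples by their values on `features`; count decisions per group."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[f] for f in features)][label] += 1
    return groups

def pawlak_dependency(rows, labels, features):
    """|POS_B(D)| / |U|: share of samples in groups with one decision only."""
    groups = _decisions_per_class(rows, labels, features)
    pos = sum(sum(c.values()) for c in groups.values() if len(c) == 1)
    return pos / len(rows)

def consistency(rows, labels, features):
    """1 - inconsistency rate: majority-decision samples of each group count."""
    groups = _decisions_per_class(rows, labels, features)
    return sum(max(c.values()) for c in groups.values()) / len(rows)

# Toy decision table (made up): two nominal features, binary decision.
rows = [("a", "x"), ("a", "x"), ("a", "y"),
        ("b", "x"), ("b", "y"), ("b", "y")]
labels = [0, 0, 1, 1, 1, 0]

d1 = pawlak_dependency(rows, labels, [0])     # 0.0: both groups are mixed
c1 = consistency(rows, labels, [0])           # 4/6
d2 = pawlak_dependency(rows, labels, [0, 1])  # 4/6
c2 = consistency(rows, labels, [0, 1])        # 5/6
```

The example shows why the two can behave differently: dependency discards a whole group as soon as it contains one conflicting sample, while consistency still credits the majority of that group.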

11
10 estimates of feature quality in wine (1)

12
Experiment 2

13
Experiment 3

14
Experiment 4

15
Experiment 5

16
Experiment 6

17
Experiment 7

18
Conclusion
  • Under sample perturbation, entropy and fuzzy
    entropy based evaluation functions are more
    stable than the neighborhood dependency,
    neighborhood consistency and consistency
    functions, while fuzzy rough sets based on
    Gaussian kernel approximation are comparable to
    the entropy functions. Moreover, neighborhood
    consistency and consistency are the most
    unstable.
  • Under parameter perturbation, neighborhood
    consistency is the most stable of the four
    evaluation functions that can be directly used
    to evaluate numerical features.
  • In feature selection, we cannot get the optimal
    subset of features if we don't carefully select
    the algorithm and specify the parameters
    according to the classification task.

19
Thank you!
Best wishes to the earthquake victims!