Untitled slide (Bez tytulu slajdu) - PowerPoint PPT Presentation

About This Presentation
Description:

Title: Bez tytulu slajdu (Untitled slide). Author: W. Duch. Last modified by: W. Duch. Created: 12/30/1999 5:20:59 PM. Presentation format: On-screen show. PowerPoint PPT presentation.

Slides: 21

Transcript and Presenter's Notes


1
Rules for Melanoma Skin Cancer Diagnosis
Wlodzislaw Duch, K. Grabczewski, R. Adamczak, K. Grudzinski
Department of Computer Methods, Nicholas Copernicus University, Torun, Poland. http://www.phys.uni.torun.pl/kmk
Zdzislaw Hippe
Department of Computer Chemistry and Physical Chemistry, Rzeszów University of Technology, zshippe_at_prz.rzeszow.pl
2
Content
  • Melanoma skin cancer data
  • 5 methods: GTS, SSV, MLP2LN, FSM, SBL, and
    their results.
  • Final comparison of results
  • Conclusions and future prospects

3
Skin cancer
  • Most common skin cancer
  • Basal cell carcinoma (rak podstawnokomórkowy)
  • Squamous cell carcinoma (rak kolczystonablonkowy)
  • Melanoma: uncontrolled growth of melanocytes, the
    skin cells that produce the skin pigment melanin.
  • Too much exposure to the sun, sunburn.
  • Melanoma is 4% of skin cancers and the most difficult to
    control; 1 in 79 Americans will develop melanoma.
  • Almost 2000 percent increase since 1930.
  • Survival is now 84%, with early detection 95%.

4
Melanoma skin cancer data summary
  • Collected in the Outpatient Center of Dermatology
    in Rzeszów, Poland.
  • Four types of melanoma: benign, blue, suspicious,
    or malignant.
  • 250 cases, with almost equal class distribution.
  • Each record in the database has 13 attributes.
  • TDS (Total Dermatoscopy Score) - single index
  • 26 new test cases.
  • Goal: understand the data, find a simple
    description.

5
Melanoma AB attributes
  • Asymmetry: symmetric spot, 1-axial asymmetry,
    or 2-axial asymmetry.
  • Border irregularity: the edges are ragged,
    notched, or blurred. Integer, from 0 to 8.

6
Melanoma CD attributes
  • Color: white, blue, black, red, light brown, and
    dark brown; several colors are possible
    simultaneously.
  • Diversity: pigment globules, pigment dots,
    pigment network, branched strikes,
    structureless areas.

7
Melanoma TDS index
  • Combine ABCD attributes to form one index
  • TDS index: ABCD formula
  • TDS = 1.3 × Asymmetry + 0.1 × Border + 0.5 × Σ
    Colors + 0.5 × Σ Diversities
  • Coefficients from statistical analysis.
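The TDS index on this slide can be sketched in a few lines of Python. This is only an illustration: the numeric coding of asymmetry (0, 1, 2 for symmetric, 1-axial, 2-axial) and the representation of colors and diversities as sets of the names present are assumptions made here, not details stated in the slides.

```python
# Attribute names as listed on the "CD attributes" slide.
COLORS = {"white", "blue", "black", "red", "light brown", "dark brown"}
DIVERSITIES = {
    "pigment globules", "pigment dots", "pigment network",
    "branched strikes", "structureless areas",
}


def total_dermatoscopy_score(asymmetry, border, colors, diversities):
    """TDS = 1.3*Asymmetry + 0.1*Border + 0.5*sum(Colors) + 0.5*sum(Diversities).

    asymmetry: 0, 1 or 2 (ASSUMED coding of symmetric / 1-axial / 2-axial).
    border: integer from 0 to 8.
    colors, diversities: sets of the names that are present in the lesion.
    """
    if asymmetry not in (0, 1, 2):
        raise ValueError("asymmetry must be 0, 1 or 2")
    if not 0 <= border <= 8:
        raise ValueError("border must be between 0 and 8")
    colors, diversities = set(colors), set(diversities)
    if not colors <= COLORS or not diversities <= DIVERSITIES:
        raise ValueError("unknown color or diversity name")
    # Each present color or structure contributes 1 to its sum.
    return (1.3 * asymmetry + 0.1 * border
            + 0.5 * len(colors) + 0.5 * len(diversities))


# Hypothetical lesion, for illustration only.
example_tds = total_dermatoscopy_score(
    asymmetry=2,
    border=4,
    colors={"light brown", "dark brown"},
    diversities={"pigment dots", "pigment network", "structureless areas"},
)
```

Because C-blue is just one of the color flags, the same record also yields the second attribute used by the rule sets later in the talk.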

8
Remarks on testing
  • Test: only 26 cases for 4 classes.
  • Estimation of expected statistical accuracy on
    276 training + test cases with 10-fold
    crossvalidation. Not done with most methods!
  • Risk matrices are desirable: identifying Blue
    nevus instead of Benign nevus carries no risk, but
    confusing it with Malignant carries great risk.
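The risk-matrix idea can be illustrated by weighting each kind of confusion instead of counting all errors equally. The sketch below is a minimal example; the risk weights and both confusion matrices are hypothetical values chosen for illustration and do not come from the study.

```python
CLASSES = ["Benign-nevus", "Blue-nevus", "Suspicious", "Malignant"]

# HYPOTHETICAL risk weights, RISK[true][predicted]; not from the slides.
RISK = [
    [0, 0, 1, 1],    # true Benign-nevus
    [0, 0, 1, 1],    # true Blue-nevus
    [5, 5, 0, 1],    # true Suspicious
    [10, 10, 5, 0],  # true Malignant
]


def average_risk(confusion, risk):
    """Mean risk per case; confusion[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in confusion)
    cost = sum(
        confusion[i][j] * risk[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return cost / total


# Two hypothetical classifiers, each making 5 errors on 250 cases.
harmless = [[62, 0, 0, 0], [5, 59, 0, 0], [0, 0, 62, 0], [0, 0, 0, 62]]
dangerous = [[62, 0, 0, 0], [0, 64, 0, 0], [0, 0, 62, 0], [5, 0, 0, 57]]

risk_harmless = average_risk(harmless, RISK)
risk_dangerous = average_risk(dangerous, RISK)
```

Both classifiers have the same accuracy, yet the second one misses malignant cases and therefore receives a much higher average risk.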

9
Methods used: GTS
  • GTS covering algorithm (Hippe, 1997): recursive
    reduction of the number of decision rules.
  • Interactive; the user guides the development of the
    learning model.
  • Selection of the combination of attributes that generates
    the learning model is based on Frequency and Ranking.
  • GTS allows many different sets of rules to be
    created.
  • In complex situations it may be rather difficult to
    use.

10
GTS results.
  • GTS generated a large number (198) of rules.
  • Experimentation made it possible to find the important
    attributes.
  • Various sets of decision rules were generated:
    TDS, C-blue, Asymmetry, Border (4 attributes,
    based on the experience of medical doctors)
    TDS, C-blue, D-structureless-areas (3 attributes)
    TDS, C-Blue (2 attributes)
    TDS (1 attribute) - poor results.
    Models with 2-4 attributes give 81-85% accuracy.
  • Combination and generalization of these rules
    made it possible to select the 4 best simplified rules.
  • Overall 6 errors on training, 0 errors on the test
    set.

11
Methods used: SSV
  • Decision tree (Grabczewski, Duch 1999)
  • Based on a separability criterion: maximize the index of
    separability for a given split value of a
    continuous attribute or a subset of discrete
    values.
  • Easily converted into a set of crisp logical
    rules.
  • Pruning used to ensure the simplest set of rules
    that generalize well.
  • Fully automatic, very efficient, crossvalidation
    tests provide estimation of statistical accuracy.
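The idea of choosing a split value by maximizing a separability index can be sketched as follows. This is a simplified score assumed for illustration (the number of pairs of vectors from different classes that a threshold separates); the exact SSV index is defined in the cited paper and may differ in detail.

```python
from collections import Counter


def separated_pairs(values, labels, threshold):
    """Count pairs from different classes that fall on opposite sides of threshold."""
    left = Counter(l for v, l in zip(values, labels) if v < threshold)
    right = Counter(l for v, l in zip(values, labels) if v >= threshold)
    n_right = sum(right.values())
    # For each class c on the left, pair it with every non-c vector on the right.
    return sum(n_c * (n_right - right[c]) for c, n_c in left.items())


def best_split(values, labels):
    """Return the midpoint threshold with the highest score, or None if no split exists."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    if not candidates:
        return None
    return max(candidates, key=lambda t: separated_pairs(values, labels, t))


# Toy data, for illustration only.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_labels = ["a", "a", "b", "b"]
toy_split = best_split(toy_values, toy_labels)
```

Applying such a search recursively to each resulting subset yields a tree whose branches read directly as crisp rules.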

12
SSV results
  • Pruning degree is the only user-defined
    parameter.
  • Finds TDS, C-BLUE as most important.
  • Rules are easy to understand:
    IF TDS ≤ 4.85 ∧ C-BLUE is absent => Benign-nevus
    IF TDS ≤ 4.85 ∧ C-BLUE is present => Blue-nevus
    IF 4.85 < TDS < 5.45 => Suspicious
    IF TDS ≥ 5.45 => Malignant
  • 98% accuracy on training, 100% on test.
  • 5 errors: vector pairs from C1/C2 have identical
    TDS and C-BLUE.
  • 10xCV on all data: 97.5±0.3%
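The four rules on this slide translate directly into a small function. The comparison symbols were lost in the transcript, so the handling of the boundary values 4.85 and 5.45 (inclusive at the lower and upper ends) is an assumption, as is the function name.

```python
def classify_lesion(tds, c_blue_present):
    """Apply the four crisp rules built on TDS and C-BLUE.

    Boundary handling at 4.85 and 5.45 is ASSUMED (<= 4.85, >= 5.45).
    """
    if tds <= 4.85:
        return "Blue-nevus" if c_blue_present else "Benign-nevus"
    if tds < 5.45:
        return "Suspicious"
    return "Malignant"


# Hypothetical inputs, for illustration only.
examples = [
    classify_lesion(4.0, False),
    classify_lesion(4.0, True),
    classify_lesion(5.0, False),
    classify_lesion(6.2, True),
]
```

Note that C-BLUE only matters for low TDS values; above 4.85 the decision depends on TDS alone.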

13
Methods used: MLP2LN
  • Constructive constrained MLP algorithm, 0, ±1
    weights at the end of training.
  • MLP is converted into an LN, a network performing a
    logical function (Duch, Adamczak, Grabczewski
    1996)
  • Network function is written as a set of crisp
    logical rules.
  • Automatic determination of crisp and fuzzy
    "soft-trapezoidal" membership functions.
  • Tradeoff simplicity vs. accuracy explored.
  • Tradeoff confidence vs. rejection rate explored.
  • Almost fully automatic algorithm.

14
MLP2LN results
  • Rules very similar to those of SSV were found.
  • Confusion matrix (rows: calculated class,
    columns: original class)

    Calculated      Benign-nevus  Blue-nevus  Malignant  Suspicious
    Benign-nevus         62            5          0          0
    Blue-nevus            0           59          0          0
    Malignant             0            0         62          0
    Suspicious            0            0          0         62
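The confusion matrix on this slide can be checked against the accuracy reported for MLP2LN. The short sketch below copies the counts from the slide and computes the fraction of cases on the diagonal; the variable names are chosen here for illustration.

```python
# Counts copied from the slide; class order is
# Benign-nevus, Blue-nevus, Malignant, Suspicious.
confusion = [
    [62, 5, 0, 0],
    [0, 59, 0, 0],
    [0, 0, 62, 0],
    [0, 0, 0, 62],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(confusion)))
errors = total - correct
accuracy = correct / total
```

The only off-diagonal entry is the Benign-nevus / Blue-nevus confusion, which matches the 5 errors mentioned for the SSV rules.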

15
Methods used: FSM
  • Feature-Space Mapping (Duch 1994)
  • FSM estimates probability density of training
    data.
  • Neuro-fuzzy system, based on separable transfer
    functions.
  • Constructive learning algorithm with feature
    selection and network pruning.
  • Each transfer function component is a
    context-dependent membership function.
  • Crisp logic rules from rectangular functions.
  • Trapezoidal, triangular, Gaussian functions for fuzzy
    logic rules.
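The membership function shapes named on this slide, and the way a separable transfer function combines them as a product over features, can be sketched as follows. This is an illustrative sketch, not the FSM implementation; the parameterization of each shape and the example node bounds are assumptions.

```python
import math


def rectangular(x, lo, hi):
    """Crisp membership: 1 inside [lo, hi], 0 outside."""
    return 1.0 if lo <= x <= hi else 0.0


def triangular(x, a, b, c):
    """Rises from a to a peak at b, then falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Rises from a to b, equals 1 on [b, c], falls from c to d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def gaussian(x, center, sigma):
    """Smooth membership centered at `center` with width `sigma`."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def separable_activation(x, components):
    """Product of one-dimensional memberships, one per feature."""
    value = 1.0
    for xi, membership in zip(x, components):
        value *= membership(xi)
    return value


# HYPOTHETICAL node over (TDS, C_blue); bounds are illustrative only.
crisp_node = [
    lambda tds: rectangular(tds, 0.0, 4.85),
    lambda blue: rectangular(blue, 0.5, 1.5),
]
inside = separable_activation((4.0, 1), crisp_node)
outside = separable_activation((5.0, 1), crisp_node)
```

With rectangular components the node output is 0 or 1 and reads as a crisp rule; swapping in the other shapes gives graded outputs that read as fuzzy rules.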

16
FSM results
  • Rectangular functions used for C-rules.
  • 7 nodes (rules) created on average.
  • 10xCV accuracy on training 95.5±1.0%, test 100%.
  • Committee of 20 FSM networks: 95.5±1.1%, test
    92.6%.
  • F-rules, Gaussian membership functions: 15 fuzzy
    rules, lower accuracy.
  • Simplest solution should strongly be preferred.

17
Methods used: SBL
  • Similarity-Based Methods (SBM): many models based on
    evaluation of similarity.
  • Similarity-Based Learner (SBL): software
    implementation of SBM.
  • Various extensions of the k-nearest neighbor
    algorithms.
  • S-rules, more general than C-rules and F-rules.
  • Small number of prototype cases used to explain
    the data class structure.

18
SBL results
  • SBL optimized by performing 10xCV on the training set.
  • Manhattan distance; feature selection: TDS,
    C_Blue
  • 97.4±0.3% on training, 100% on test.
  • S-rules of the form:
    IF (X sim Pi) THEN C(X)=C(Pi)
    IF (|TDS(X)-TDS(Pi)| + |C_blue(X)-C_blue(Pi)|) < T(Pi)
    THEN C(X)=C(Pi)
  • Prototype selection left 13 vectors (7 for the Benign-nevus
    class, 2 for every other class): 97.5% or 6 errors
    on training (237 vectors), 100% on test.
  • 7 prototypes: 91.4% on training (243 vectors), 100% on
    test.
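An S-rule of the form described on this slide assigns X the class of a prototype Pi when their Manhattan distance on (TDS, C_blue) is below that prototype's threshold T(Pi). The sketch below is a minimal illustration: the prototype vectors, thresholds, and the choice to prefer the nearest prototype when several rules fire are all assumptions, not values from the study.

```python
def manhattan(x, p):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(x, p))


def s_rule_classify(x, prototypes):
    """Return the class of the nearest prototype whose rule fires, else None.

    prototypes: list of (vector, threshold, label) triples.
    Preferring the nearest firing prototype is an ASSUMED tie-breaking choice.
    """
    best = None
    for vector, threshold, label in prototypes:
        distance = manhattan(x, vector)
        if distance < threshold and (best is None or distance < best[0]):
            best = (distance, label)
    return best[1] if best else None


# HYPOTHETICAL prototypes over (TDS, C_blue); not taken from the study.
PROTOTYPES = [
    ((4.0, 0), 0.8, "Benign-nevus"),
    ((4.0, 1), 0.8, "Blue-nevus"),
    ((5.1, 0), 0.3, "Suspicious"),
    ((6.0, 0), 0.6, "Malignant"),
]

predictions = [
    s_rule_classify((4.2, 0), PROTOTYPES),
    s_rule_classify((4.3, 1), PROTOTYPES),
    s_rule_classify((5.2, 0), PROTOTYPES),
    s_rule_classify((9.0, 0), PROTOTYPES),
]
```

A vector that falls outside every prototype's threshold is left unclassified (None), which is one way to express a rejection rather than forcing a decision.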

19
Results - comparison
Method                       Rules   Training %   Test %
SSV Tree, crisp rules           4    97.5±0.3     100
MLP2LN, crisp rules             4    98.0 all     100
GTS - final simplified          4    97.6 all     100
FSM, rectangular f.             7    95.5±1.0     100±0.0
knn prototype selection        13    97.5±0.0     100
FSM, Gaussian f.               15    93.7±1.0     95±3.6
GTS initial rules             198    85 all       84.6
knn k=1, Manh, 2 feat.        250    97.4±0.3     100
LERS, weighted rules           21    --           96.2
20
Conclusions
  • TDS is the most important attribute; Color-blue is second.
  • Without TDS: many rules.
  • Optimize TDS: automatic aggregation of features,
    e.g. with a 2-layered neural network.
  • Very simple and reliable rules have been found.
  • S-rules are being improved: prototypes obtained
    from learning instead of selection.
  • The database is expanding; there is a need for non-cancer data.