Bez tytulu slajdu - PowerPoint PPT Presentation


About This Presentation

Title:

Bez tytulu slajdu (Polish: "Untitled slide")

Description:

Title: Bez tytulu slajdu (Untitled slide). Author: W. Duch. Last modified by: W. Duch. Created Date: 12/30/1999 5:20:59 PM. Document presentation format: On-screen show (Pokaz na ekranie).


Slides: 21

Provided by: W94


Transcript and Presenter's Notes


1
Rules for Melanoma Skin Cancer Diagnosis
Wlodzislaw Duch, K. Grabczewski, R. Adamczak, K. Grudzinski, Department of Computer Methods, Nicholas Copernicus University, Torun, Poland. http://www.phys.uni.torun.pl/kmk
Zdzislaw Hippe, Department of Computer Chemistry and Physical Chemistry, Rzeszów University of Technology, zshippe_at_prz.rzeszow.pl
2
Content

Melanoma skin cancer data
5 methods: GTS, SSV, MLP2LN, FSM, SBL, and their results.
Final comparison of results
Conclusions and future prospects

3
Skin cancer

Most common skin cancers:
Basal cell carcinoma (Polish: rak podstawnokomórkowy)
Squamous cell carcinoma (Polish: rak kolczystonablonkowy)

Melanoma: uncontrolled growth of melanocytes, the skin cells that produce the skin pigment melanin.
Too much exposure to the sun, sunburn.
Melanoma is 4% of skin cancers, the most difficult to control; 179 Americans will develop melanoma.
Almost 2000 percent increase since 1930.
Survival now 84%, with early detection 95%.

4
Melanoma skin cancer data summary

Collected in the Outpatient Center of Dermatology
in Rzeszów, Poland.
Four types of melanoma: benign, blue, suspicious, or malignant.
250 cases, with almost equal class distribution.
Each record in the database has 13 attributes.
TDS (Total Dermatoscopy Score) - single index
26 new test cases.
Goal: understand the data, find a simple description.

5
Melanoma: AB attributes

Asymmetry: symmetric spot, 1-axial asymmetry, and 2-axial asymmetry.
Border irregularity: the edges are ragged, notched, or blurred. Integer, from 0 to 8.

6
Melanoma: CD attributes

Color: white, blue, black, red, light brown, and dark brown; several colors are possible simultaneously.
Diversity: pigment globules, pigment dots, pigment network, branched strikes, structureless areas.

7
Melanoma: TDS index

Combine ABCD attributes to form one index.
TDS index, ABCD formula:
TDS = 1.3 Asymmetry + 0.1 Border + 0.5 Σ Colors + 0.5 Σ Diversities
Coefficients from statistical analysis.
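
The ABCD formula can be sketched in a few lines of Python. This is only an illustration: the numeric encoding of Asymmetry as 0, 1, 2 and the representation of colors and diversities as 0/1 flags are assumptions, and the sample lesion is hypothetical.

```python
def tds(asymmetry, border, colors, diversities):
    """Total Dermatoscopy Score from the ABCD attributes.

    asymmetry:   assumed coding 0 (symmetric spot), 1 (1-axial), 2 (2-axial)
    border:      integer from 0 to 8
    colors:      iterable of 0/1 flags, one per color that may be present
    diversities: iterable of 0/1 flags, one per structure that may be present
    """
    return (1.3 * asymmetry
            + 0.1 * border
            + 0.5 * sum(colors)
            + 0.5 * sum(diversities))


# Hypothetical lesion: 1-axial asymmetry, border 4, two colors, three structures.
score = tds(1, 4, [0, 0, 1, 0, 1, 0], [1, 1, 1, 0, 0])
print(round(score, 2))  # 4.2
```

With six colors and five structures the score ranges from 0 to 8.9 under this encoding.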

8
Remarks on testing

Test: only 26 cases for 4 classes.
Estimation of expected statistical accuracy on all 276 training + test cases with 10-fold crossvalidation. Not done with most methods!
Risk matrices are desirable: identifying a Blue nevus instead of a Benign nevus carries no risk, but confusing it with a malignant one carries great risk.
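
A risk matrix of this kind could be applied roughly as follows. The cost values are purely hypothetical (the slides give none); the sketch only shows how unequal misclassification costs change the evaluation compared with plain error counting.

```python
NEVI = {"Benign-nevus", "Blue-nevus"}


def risk(true_class, predicted_class):
    """Hypothetical cost of predicting predicted_class for a true_class case."""
    if true_class == predicted_class:
        return 0.0
    if true_class in NEVI and predicted_class in NEVI:
        return 0.0   # Blue nevus vs. Benign nevus: no risk
    if true_class == "Malignant":
        return 10.0  # missing a malignant case: great risk (illustrative value)
    return 1.0       # any other confusion (illustrative value)


def mean_risk(pairs):
    """Average risk over (true_class, predicted_class) pairs."""
    return sum(risk(t, p) for t, p in pairs) / len(pairs)


pairs = [
    ("Blue-nevus", "Benign-nevus"),
    ("Malignant", "Benign-nevus"),
    ("Malignant", "Malignant"),
    ("Benign-nevus", "Suspicious"),
]
print(mean_risk(pairs))  # 2.75
```

Three of the four pairs are errors, yet only one of them dominates the risk.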

9
Methods used: GTS

GTS covering algorithm (Hippe, 1997): recursive reduction of the number of decision rules.
Interactive, user guides the development of the learning model.
Selection of the combination of attributes generating the learning model is based on Frequency and Ranking.
GTS allows many different sets of rules to be created.
In a complex situation it may be rather difficult to use.

10
GTS results

GTS generated a large number (198) of rules.
Experimentation made it possible to find the important attributes.
Various sets of decision rules were generated:
TDS, C-blue, Asymmetry, Border (4 attributes, based on the experience of medical doctors)
TDS, C-blue, D-structureless-areas (3 attributes)
TDS, C-blue (2 attributes)
TDS (1 attribute) - poor results.
Models with 2-4 attributes give 81-85% accuracy.
Combination and generalization of these rules made it possible to select the 4 best simplified rules.
Overall 6 errors on training, 0 errors on test set.

11
Methods used: SSV

Decision tree (Grabczewski, Duch 1999)
Based on a separability criterion: maximize the index of separability for a given split value of a continuous attribute or a subset of discrete values.
Easily converted into a set of crisp logical
rules.
Pruning used to ensure the simplest set of rules
that generalize well.
Fully automatic, very efficient, crossvalidation
tests provide estimation of statistical accuracy.
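
A rough sketch of choosing a split value by a separability index is given below. The slides do not state the formula; the form used here (twice the number of pairs from different classes separated by the split, minus a penalty for classes cut in two) is an assumption about the SSV criterion, and the helper names are hypothetical.

```python
from collections import Counter


def ssv(values, labels, split):
    """Separability of a split value for one continuous attribute (assumed form)."""
    left = Counter(c for v, c in zip(values, labels) if v < split)
    right = Counter(c for v, c in zip(values, labels) if v >= split)
    n_right = sum(right.values())
    classes = set(labels)
    # Pairs of vectors from different classes that land on opposite sides.
    separated = sum(left[c] * (n_right - right[c]) for c in classes)
    # Penalty: vectors of the same class that the split pulls apart.
    same_class_split = sum(min(left[c], right[c]) for c in classes)
    return 2 * separated - same_class_split


def best_split(values, labels):
    """Try midpoints between consecutive distinct values; return the best one."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    if not candidates:
        return None  # a constant attribute cannot be split
    return max(candidates, key=lambda s: ssv(values, labels, s))


# Toy data: TDS-like values with two classes.
values = [4.0, 4.5, 5.0, 6.0]
labels = ["B", "B", "M", "M"]
print(best_split(values, labels))  # 4.75
```

A tree is then grown by applying the best split recursively to each side; pruning is not shown.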

12
SSV results

Pruning degree is the only user-defined parameter.
Finds TDS, C-BLUE as most important.
Rules are easy to understand:
IF TDS ≤ 4.85 ∧ C-BLUE is absent => Benign-nevus
IF TDS ≤ 4.85 ∧ C-BLUE is present => Blue-nevus
IF 4.85 < TDS < 5.45 => Suspicious
IF TDS ≥ 5.45 => Malignant
98% accuracy on training, 100% on test.
5 errors: vector pairs from C1/C2 have identical TDS and C-BLUE.
10xCV on all data: 97.5±0.3%
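
The four rules translate directly into a small function. This is a sketch that assumes the boundary cases fall as TDS ≤ 4.85 and TDS ≥ 5.45, and that C-BLUE is given as a boolean; the function name is hypothetical.

```python
def classify(tds, c_blue):
    """Apply the four crisp rules on TDS and C-BLUE."""
    if tds <= 4.85:
        return "Blue-nevus" if c_blue else "Benign-nevus"
    if tds < 5.45:
        return "Suspicious"
    return "Malignant"


print(classify(4.2, False))  # Benign-nevus
print(classify(4.2, True))   # Blue-nevus
print(classify(5.0, False))  # Suspicious
print(classify(6.1, False))  # Malignant
```

Note that C-BLUE only matters below the first TDS threshold, which is why two cases with identical TDS and C-BLUE but different classes cannot be separated by these rules.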

13
Methods used: MLP2LN

Constructive constrained MLP algorithm, 0, ±1 weights at the end of training.
MLP is converted into LN, network performing
logical function (Duch, Adamczak, Grabczewski
1996)
Network function is written as a set of crisp
logical rules.
Automatic determination of crisp and fuzzy
"soft-trapezoidal" membership functions.
Tradeoff simplicity vs. accuracy explored.
Tradeoff confidence vs. rejection rate explored.
Almost fully automatic algorithm.
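
One common way to build a "soft-trapezoidal" membership function is as the difference of two sigmoids; whether MLP2LN uses exactly this form is an assumption here, as are the interval and slope values in the example.

```python
import math


def sigmoid(x):
    """Logistic function, written with tanh so large |x| cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def soft_trapezoid(x, a, b, slope=10.0):
    """Soft window over [a, b]: close to 1 inside, close to 0 outside."""
    return sigmoid(slope * (x - a)) - sigmoid(slope * (x - b))


# Hypothetical window between two TDS thresholds.
for x in (3.0, 4.85, 5.15, 5.45, 7.0):
    print(x, round(soft_trapezoid(x, 4.85, 5.45, slope=50.0), 3))
```

As the slope grows, the window approaches a rectangular function, which corresponds to a crisp rule of the form a < x < b.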

14
MLP2LN results

Rules very similar to those of SSV were found.
Confusion matrix (rows: calculated class, columns: original class)

Calculated      Benign-nevus  Blue-nevus  Malignant  Suspicious
Benign-nevus              62           5          0           0
Blue-nevus                 0          59          0           0
Malignant                  0           0         62           0
Suspicious                 0           0          0          62
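
The overall accuracy follows directly from the confusion matrix. The sketch below uses the counts from the slide (rows: calculated class, columns: original class); the function names are hypothetical.

```python
CLASSES = ["Benign-nevus", "Blue-nevus", "Malignant", "Suspicious"]

# Rows: calculated class, columns: original class.
CONFUSION = [
    [62, 5, 0, 0],
    [0, 59, 0, 0],
    [0, 0, 62, 0],
    [0, 0, 0, 62],
]


def accuracy(matrix):
    """Fraction of all cases on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Fraction of each original class (column) that was classified correctly."""
    n = len(matrix)
    return [matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)]


print(accuracy(CONFUSION))  # 0.98
print(dict(zip(CLASSES, per_class_recall(CONFUSION))))
```

All 5 errors are Blue-nevus cases calculated as Benign-nevus, so 245 of 250 training cases are correct.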

15
Methods used: FSM

Feature-Space Mapping (Duch 1994)
FSM estimates probability density of training
data.
Neuro-fuzzy system, based on separable transfer
functions.
Constructive learning algorithm with feature
selection and network pruning.
Each transfer function component is a
context-dependent membership function.
Crisp logic rules from rectangular functions.
Trapezoidal, triangular, Gaussian functions for fuzzy logic rules.

16
FSM results

Rectangular functions used for C-rules.
7 nodes (rules) created on average.
10xCV accuracy on training: 95.5±1.0%, test 100%.
Committee of 20 FSM networks: 95.5±1.1%, test 92.6%.
F-rules, Gaussian membership functions: 15 fuzzy rules, lower accuracy.
Simplest solution should strongly be preferred.

17
Methods used: SBL

Similarity-Based Methods (SBM): many models based on evaluation of similarity.
Similarity-Based Learner (SBL): software implementation of SBM.
Various extensions of the k-nearest neighbor
algorithms.
S-rules, more general than C-rules and F-rules.
Small number of prototype cases used to explain
the data class structure.

18
SBL results

SBL optimized by performing 10xCV on the training set.
Manhattan distance, feature selection: TDS, C_Blue
97.4±0.3% on training, 100% test.
S-rules of the form: IF (X sim Pi) THEN C(X)=C(Pi)
IF (|TDS(X)-TDS(Pi)| + |C_blue(X)-C_blue(Pi)|) < T(Pi) THEN C(X)=C(Pi)
Prototype selection left 13 vectors (7 for Benign-nevus class, 2 for every other class).
97.5% or 6 errors on training (237 vectors), 100% test
7 prototypes: 91.4% training (243 vectors), 100% test
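
A prototype-based S-rule with Manhattan distance on (TDS, C_blue) might look like the sketch below. The prototype vectors and thresholds T(Pi) are hypothetical, not the ones selected by SBL, and resolving several firing rules by taking the nearest prototype is an assumption.

```python
def manhattan(x, p):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(xi - pi) for xi, pi in zip(x, p))


# Hypothetical prototypes: (TDS, C_blue) vector, class label, threshold T(Pi).
PROTOTYPES = [
    ((4.0, 0), "Benign-nevus", 1.0),
    ((4.0, 1), "Blue-nevus", 1.0),
    ((5.1, 0), "Suspicious", 0.4),
    ((6.5, 0), "Malignant", 1.2),
]


def classify(x, prototypes=PROTOTYPES):
    """Return the class of the nearest prototype whose S-rule fires, else None."""
    fired = [(manhattan(x, p), label)
             for p, label, t in prototypes
             if manhattan(x, p) < t]
    return min(fired)[1] if fired else None


print(classify((4.2, 0)))  # Benign-nevus
print(classify((4.2, 1)))  # Blue-nevus
print(classify((9.0, 0)))  # None: no prototype is similar enough
```

Each prototype acts as one rule, so the number of prototypes plays the role that the number of rules plays for C-rules and F-rules.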

19
Results - comparison

Method                     Rules  Training %   Test %
SSV Tree, crisp rules          4  97.5±0.3     100
MLP2LN, crisp rules            4  98.0 (all)   100
GTS - final simplified         4  97.6 (all)   100
FSM, rectangular f.            7  95.5±1.0     100±0.0
knn, prototype selection      13  97.5±0.0     100
FSM, Gaussian f.              15  93.7±1.0     95±3.6
GTS initial rules            198  85 (all)     84.6
knn, k=1, Manh, 2 feat.      250  97.4±0.3     100
LERS, weighted rules          21  --           96.2

20
Conclusions

TDS - most important; Color-blue second.
Without TDS - many rules.
Optimize TDS: automatic aggregation of features, e.g. a 2-layered neural network.
Very simple and reliable rules have been found.
S-rules are being improved - prototypes obtained from learning instead of selection.
The database is expanding; need for non-cancer data.
