Title: Testing Predictive Performance of Ecological Niche Models
1. Testing Predictive Performance of Ecological Niche Models
A. Townsend Peterson, STOLEN FROM Richard Pearson
2. Niche Model Validation
- Diverse challenges
- Not a single loss function or optimality criterion
- Different uses demand different criteria
- In particular, relative weights applied to omission and commission errors in evaluating models
- Nakamura: "which way is relevant to adopt is not a mathematical question, but rather a question for the user"
- Asymmetric loss functions
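The idea of weighting omission and commission errors differently can be sketched as a simple asymmetric loss. This is an illustration only, not a formula from the slides: the function name `asymmetric_loss` and the weights `w_omission` and `w_commission` are hypothetical, and the counts a, b, c, d follow the confusion-matrix notation used later in the deck.

```python
def asymmetric_loss(a, b, c, d, w_omission=1.0, w_commission=1.0):
    """Weighted sum of omission and commission rates (hypothetical loss).

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    omission_rate = c / (a + c)      # presences predicted absent
    commission_rate = b / (b + d)    # absences predicted present
    return w_omission * omission_rate + w_commission * commission_rate


# Same model, two users: the second penalizes missed presences more heavily.
counts = dict(a=40, b=30, c=10, d=20)
symmetric = asymmetric_loss(**counts)
omission_averse = asymmetric_loss(**counts, w_omission=5.0)
print(symmetric, omission_averse)
```

The choice of weights is the user's call, which is the point of the Nakamura quote: the same confusion matrix scores differently depending on which error matters more for the application.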
4. Where do I get testing data?
5. Model calibration and evaluation strategies: resubstitution
[Diagram: the model is calibrated on all available data (100%) and evaluated against those same data. Projection options: same region, different region, different time, different resolution.]
(after Araújo et al. 2005 Gl. Ch. Biol.)
6. Model calibration and evaluation strategies: independent validation
[Diagram: the model is calibrated on all available data (100%) and evaluated against independent data. Projection options: same region, different region, different time, different resolution.]
(after Araújo et al. 2005 Gl. Ch. Biol.)
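7. Model calibration and evaluation strategies: data splitting
[Diagram: the available data are split into calibration data (70%) and test data (30%); the model is calibrated on the former and evaluated against the latter. Projection options: same region, different region, different time, different resolution.]
(after Araújo et al. 2005 Gl. Ch. Biol.)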
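A minimal sketch of the 70/30 data-splitting strategy, assuming a simple random split of occurrence records. The function name `split_records`, the `seed` parameter, and the record identifiers are hypothetical; the slides do not say how the split was drawn.

```python
import random


def split_records(records, calibration_fraction=0.7, seed=0):
    """Randomly split occurrence records into calibration and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_calibration = round(len(shuffled) * calibration_fraction)
    return shuffled[:n_calibration], shuffled[n_calibration:]


# Hypothetical record identifiers standing in for occurrence localities.
records = [f"locality_{i}" for i in range(20)]
calibration, test = split_records(records)
print(len(calibration), len(test))  # 14 6
```

A purely random split keeps test points spatially close to calibration points, so it is a weaker check than the independent-validation strategy on the previous slide.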
8. Types of Error
9. The four types of results that are possible when testing a distribution model
(see Pearson NCEP module 2007)
10. Presence-absence confusion matrix
                     Recorded present      Recorded (or assumed) absent
Predicted present    a (true positive)     b (false positive)
Predicted absent     c (false negative)    d (true negative)
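The four cells can be tallied directly from paired predictions and records. A minimal sketch, assuming both are already binary (thresholded); the function name `confusion_matrix` and the example values are illustrative.

```python
def confusion_matrix(predicted, recorded):
    """Return (a, b, c, d) from paired boolean predictions and records."""
    a = b = c = d = 0
    for pred, rec in zip(predicted, recorded):
        if pred and rec:
            a += 1          # true positive
        elif pred and not rec:
            b += 1          # false positive
        elif rec:
            c += 1          # false negative
        else:
            d += 1          # true negative
    return a, b, c, d


predicted = [True, True, False, True, False, False]
recorded = [True, False, True, True, False, False]
print(confusion_matrix(predicted, recorded))  # (2, 1, 1, 2)
```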
11. Thresholding
13. Selecting a decision threshold (presence/absence data)
(Liu et al. 2005 Ecography 28: 385-393)
16. Selecting a decision threshold (presence-only data)
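The threshold figures on these slides did not survive in the transcript, so the following is only a sketch of two commonly used rules, and it is an assumption that they match what was shown. For presence/absence data it picks the threshold that maximizes sensitivity plus specificity (one of the approaches compared by Liu et al. 2005). For presence-only data it takes the lowest predicted value at any known presence. Function names and example scores are hypothetical.

```python
def max_sens_spec_threshold(scores, recorded):
    """Threshold maximizing sensitivity + specificity.

    Assumes at least one recorded presence and one recorded absence.
    """
    best_threshold, best_sum = None, -1.0
    for t in sorted(set(scores)):
        predicted = [s >= t for s in scores]
        a = sum(p and r for p, r in zip(predicted, recorded))
        b = sum(p and not r for p, r in zip(predicted, recorded))
        c = sum((not p) and r for p, r in zip(predicted, recorded))
        d = sum((not p) and (not r) for p, r in zip(predicted, recorded))
        total = a / (a + c) + d / (b + d)
        if total > best_sum:        # ties keep the lower threshold
            best_threshold, best_sum = t, total
    return best_threshold


def lowest_presence_threshold(presence_scores):
    """Lowest model score at any known presence (presence-only data)."""
    return min(presence_scores)


scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]
recorded = [True, True, True, False, True, False, False, False]
print(max_sens_spec_threshold(scores, recorded))        # 0.4
print(lowest_presence_threshold([0.9, 0.8, 0.6, 0.4]))  # 0.4
```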
17. Threshold-dependent Tests (loss functions)
18. The four types of results that are possible when testing a distribution model
(see Pearson NCEP module 2007)
19. Presence-absence test statistics
Confusion matrix as on slide 10: a (true positive), b (false positive), c (false negative), d (true negative).
Proportion (%) correctly predicted (or accuracy, or correct classification rate): (a + d)/(a + b + c + d)
20. Presence-absence test statistics
Confusion matrix as on slide 10: a (true positive), b (false positive), c (false negative), d (true negative).
Cohen's Kappa
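The Kappa formula itself is missing from the transcript. As a sketch, the standard definition compares observed agreement with the agreement expected by chance from the row and column totals; the code below uses that textbook form with the a, b, c, d cell notation, alongside the accuracy statistic. The function names and example counts are illustrative.

```python
def accuracy(a, b, c, d):
    """Proportion correctly predicted: (a + d) / n."""
    return (a + d) / (a + b + c + d)


def cohens_kappa(a, b, c, d):
    """Cohen's Kappa: agreement beyond that expected by chance."""
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the marginal totals of the confusion matrix.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


print(accuracy(40, 10, 10, 40))                # 0.8
print(round(cohens_kappa(40, 10, 10, 40), 3))  # 0.6
```

In this example accuracy is 0.8, but half of that agreement would be expected by chance, so Kappa is lower.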
21. Presence-only test statistics
Confusion matrix as on slide 10: a (true positive), b (false positive), c (false negative), d (true negative).
Proportion of observed presences correctly predicted (or sensitivity, or true positive fraction): a/(a + c)
22. Presence-only test statistics
Confusion matrix as on slide 10: a (true positive), b (false positive), c (false negative), d (true negative).
Proportion of observed presences correctly predicted (or sensitivity, or true positive fraction): a/(a + c)
Proportion of observed presences incorrectly predicted (or omission rate, or false negative fraction): c/(a + c)
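Both statistics need only the presence records. A minimal sketch, assuming the model gives a continuous score at each test presence and a threshold has already been chosen; the function name `presence_only_stats` and the scores are hypothetical.

```python
def presence_only_stats(presence_scores, threshold):
    """Return (sensitivity, omission rate) for test presences."""
    a = sum(s >= threshold for s in presence_scores)  # true positives
    c = len(presence_scores) - a                      # false negatives
    return a / (a + c), c / (a + c)


scores_at_presences = [0.91, 0.75, 0.62, 0.58, 0.33]
sens, omission = presence_only_stats(scores_at_presences, threshold=0.5)
print(sens, omission)  # 0.8 0.2
```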
23. Presence-only test statistics: testing for statistical significance
- U. sikorae: success rate 4 from 7; proportion predicted present 0.231; binomial p = 0.0546
- U. sikorae: success rate 6 from 7; proportion predicted present 0.339; binomial p = 0.008
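The slide does not say how the binomial p-values were computed. One plausible reading, assumed here, is a one-sided test: the probability of at least the observed number of successes if each test presence fell in the predicted-present area purely by chance, with chance equal to the proportion of the study area predicted present. The function name `binomial_p_value` is illustrative.

```python
from math import comb


def binomial_p_value(successes, trials, proportion_predicted_present):
    """One-sided P(X >= successes) under random prediction."""
    p = proportion_predicted_present
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


print(round(binomial_p_value(4, 7, 0.231), 4))
print(round(binomial_p_value(6, 7, 0.339), 3))  # 0.008
```

Under this assumption the first case gives roughly 0.054 and the second 0.008, close to the values on the slide; the small gap in the first case presumably reflects rounding of the reported proportion.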
24. Absence-only test statistics
Confusion matrix as on slide 10: a (true positive), b (false positive), c (false negative), d (true negative).
Proportion of observed (or assumed) absences correctly predicted (or specificity, or true negative fraction): d/(b + d)
25. Absence-only test statistics
Confusion matrix as on slide 10: a (true positive), b (false positive), c (false negative), d (true negative).
Proportion of observed (or assumed) absences correctly predicted (or specificity, or true negative fraction): d/(b + d)
Proportion of observed (or assumed) absences incorrectly predicted (or commission rate, or false positive fraction): b/(b + d)
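The absence-side statistics mirror the presence-side ones, using only the absence records. A minimal sketch under the same assumptions (continuous model scores, a threshold already chosen); the function name `absence_only_stats` and the scores are hypothetical.

```python
def absence_only_stats(absence_scores, threshold):
    """Return (specificity, commission rate) for test absences."""
    b = sum(s >= threshold for s in absence_scores)  # false positives
    d = len(absence_scores) - b                      # true negatives
    return d / (b + d), b / (b + d)


scores_at_absences = [0.72, 0.41, 0.30, 0.12]
spec, commission = absence_only_stats(scores_at_absences, threshold=0.5)
print(spec, commission)  # 0.75 0.25
```

When the absences are assumed rather than recorded, a high commission rate may reflect unsurveyed suitable habitat rather than model error.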
26. AUC: a threshold-independent test statistic
- sensitivity = a/(a + c) (1 - omission rate)
- 1 - specificity (fraction of absences predicted present), where specificity = d/(b + d)
27. Threshold-independent assessment: The Receiver Operating Characteristic (ROC) Curve
[Figure, panels A to C: frequency distributions of predicted probability of occurrence (0 to 1) for the set of absences and the set of presences, and the ROC plot of sensitivity against 1 - specificity (both 0 to 1).]
(check out http://www.anaesthetist.com/mnm/stats/roc/Findex.htm)
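The area under the ROC curve can be computed without drawing the curve, because it equals the probability that a randomly chosen presence receives a higher predicted value than a randomly chosen absence (ties counted as one half). A minimal sketch of that pairwise form; the function name `auc` and the example scores are illustrative.

```python
def auc(presence_scores, absence_scores):
    """AUC as P(score at a presence > score at an absence), ties count half."""
    wins = 0.0
    for p in presence_scores:
        for q in absence_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


presences = [0.9, 0.8, 0.6, 0.4]
absences = [0.55, 0.3, 0.2, 0.1]
print(auc(presences, absences))  # 0.9375
```

An AUC of 0.5 corresponds to fully overlapping distributions of presences and absences (no discrimination), and 1.0 to complete separation.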