Frosted Elfin Larvae Habitat Use

Description:

... 1956) readings measured at breast height in each of the 4 cardinal directions. The linear distance (cm) from the nearest edge of the wild indigo plant to the ...

1
Frosted Elfin Larvae Habitat Use
Caryl Becerra, Rachel Bolus, Juliet Lamb, and
Skye Long
2
(No Transcript)
3
Evaluating the suitability of the data set
  • Circular transformation into radians

# function to change to radians
rad <- function(deg) deg * pi / 180
circ.trans <- function(angle, ref.angle) cos(rad(ref.angle - angle))
y$refangle1 <- circ.trans(y$DirectionTree, 0)
y$refangle2 <- circ.trans(y$DirectionTree, 45)
y$refangle3 <- circ.trans(y$DirectionTree, 90)
y$refangle4 <- circ.trans(y$DirectionTree, 135)
y
4
Evaluating the model assumptions
  • Independent samples
  • Z-score standardization
  • Removed outliers
  • Manhattan distance measure
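
The two preprocessing steps named above, z-score standardization and the Manhattan distance, can be sketched in base R. This is only an illustration on simulated values: the column names are borrowed from later slides, the numbers are made up, and `y.std` is used as the name of the standardized data only because that name appears in the random forest call later in the transcript.

```r
# Simulated habitat variables (hypothetical values, not the study data)
set.seed(42)
y <- data.frame(
  TotalCanopy = runif(10, 0, 100),
  Slope       = runif(10, 0, 30)
)

# Z-score standardization: subtract each column's mean, divide by its SD
y.std <- scale(y)

# Manhattan distance: sum of absolute differences across variables
d <- dist(y.std, method = "manhattan")

# Check one pair by hand against dist()
manual <- sum(abs(y.std[1, ] - y.std[2, ]))
```

Standardizing first keeps a variable measured on a large scale (canopy cover in percent, say) from dominating the summed absolute differences.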

5
Testing for differences
  • MRPP
  • Null hypothesis: average within-group distance
    is no smaller than expected by chance (M G)
  • Group 0: wild indigo plants without frosted
    elfin larvae (n = 247)
  • Group 1: wild indigo plants with frosted elfin
    larvae (n = 207)
  • Dropped: slope aspect
  • Dissimilarity index: manhattan
  • Weights for groups: n
  • Equal location, unequal spread

6
MRPP
  • Effect size (chance-corrected within-group
    agreement): A = 0.0395
  • Based on observed delta = 8.834 and expected
    delta = 9.198
  • Delta: weighted mean within-group distance
  • Significance of delta: < 0.001
  • Effects of sample size are obvious
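
The quantities reported on this slide fit together as A = 1 - (observed delta / expected delta). A minimal base-R sketch of an MRPP-style permutation test is shown below, assuming group weights proportional to n and the mean of the permuted deltas as the expected delta. The data are simulated and `mrpp_delta` is a hypothetical helper, so this is not the authors' analysis.

```r
# Simulated data: two groups of 20 plants, 3 variables (not the study data)
set.seed(1)
x <- rbind(matrix(rnorm(60), ncol = 3),
           matrix(rnorm(60, mean = 0.8), ncol = 3))
grp <- rep(c(0, 1), each = 20)

d <- as.matrix(dist(scale(x), method = "manhattan"))

# Delta: mean within-group distance, weighted by group size n
mrpp_delta <- function(d, grp) {
  groups <- unique(grp)
  n <- sapply(groups, function(g) sum(grp == g))
  within <- sapply(groups, function(g) {
    sub <- d[grp == g, grp == g]
    mean(sub[upper.tri(sub)])
  })
  sum(n / sum(n) * within)
}

delta_obs  <- mrpp_delta(d, grp)
delta_perm <- replicate(999, mrpp_delta(d, sample(grp)))  # shuffle group labels
delta_exp  <- mean(delta_perm)

A <- 1 - delta_obs / delta_exp                        # chance-corrected agreement
p <- (sum(delta_perm <= delta_obs) + 1) / (999 + 1)   # permutation p-value

# The slide's own numbers follow the same formula
A_slide <- 1 - 8.834 / 9.198
```

With several hundred plants per group, even a small A can be highly significant, which is one reading of the remark about sample size.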

7
MRPP
  • Ecological interpretation: within the range of
    all possible wild indigo plants (in xeric open
    habitats maintained by disturbance), there is a
    subset of habitat that is more suitable for
    larval survival
  • Not all patches of suitable habitat contain
    frosted elfin, but frosted elfin are more likely
    to be found in that habitat

8
Summary of variables
TOTAL CANOPY
WILD-INDIGO SIZE
F = 84.216, p < 2.2e-16
F = 22.545, p = 2.76e-06
9
Summary of variables
NEAREST WILD INDIGO
DISTANCE NEAREST TREE
F = 91.174, p < 2.2e-16
F = 19.127, p = 1.532e-05
10
Summary of variables
TREE DIRECTION
SLOPE ASPECT
F = 1.0918, p = 0.2966 (45°)
F = 0.2456, p = 0.6205 (45°)
11
Summary of variables
SLOPE
  • Frosted elfins prefer
  • ? Canopy cover
  • ? Wild Indigo Size
  • ? Distance between wild indigo plants
  • ? Distance to nearest tree
  • ? Slope (maybe)
  • ? Direction to tree
  • ? Slope aspect

F = 3.7234, p = 0.05429
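
The F and p values on the "Summary of variables" slides look like one-way comparisons of each variable between plants with and without larvae. Assuming that is how they were obtained, a per-variable test can be sketched in base R as follows; the variable `canopy` and its values are simulated for illustration.

```r
# Simulated canopy cover for plants without (0) and with (1) larvae
set.seed(7)
grp <- factor(rep(c(0, 1), times = c(60, 50)))
canopy <- rnorm(110, mean = ifelse(grp == 1, 40, 60), sd = 15)

# One-way ANOVA of the variable on group membership
fit <- anova(lm(canopy ~ grp))
F_value <- fit[["F value"]][1]
p_value <- fit[["Pr(>F)"]][1]

# With two groups, F equals the squared equal-variance t statistic
t_value <- unname(t.test(canopy ~ grp, var.equal = TRUE)$statistic)
```

Repeating the same call for each habitat variable gives one F and one p per variable, as on the slides.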
12
MRPP without noise variables
  • Dissimilarity index: manhattan
  • Weights for groups: n
  • Chance-corrected within-group agreement:
    A = 0.07613
  • Based on observed delta = 4.361 and expected
    delta = 4.72
  • Significance of delta: < 0.001

13
MANTEL
  • Mantel statistic based on Pearson's
    product-moment correlation
  • Mantel statistic r = 0.09994
  • Significance: < 0.001
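
A Mantel statistic of this kind is the Pearson correlation between the corresponding entries of two distance matrices, with significance from permuting the rows and columns of one of them. The sketch below assumes the two matrices were a habitat distance matrix and a group-membership distance matrix, which the slides do not state; the data are simulated and `mantel_r` is a hypothetical helper.

```r
set.seed(3)
n <- 30
grp <- rep(c(0, 1), each = n / 2)
x <- matrix(rnorm(n * 2), ncol = 2) + grp   # group 1 shifted by 1 in both variables

d_env <- as.matrix(dist(scale(x), method = "manhattan"))  # habitat distances
d_grp <- as.matrix(dist(grp))               # 0 = same group, 1 = different group

# Pearson correlation between the lower triangles of two distance matrices
mantel_r <- function(a, b) cor(a[lower.tri(a)], b[lower.tri(b)])

r_obs <- mantel_r(d_env, d_grp)

# Permute rows and columns of one matrix together to build the null distribution
r_perm <- replicate(999, {
  i <- sample(n)
  mantel_r(d_env[i, i], d_grp)
})
p <- (sum(r_perm >= r_obs) + 1) / (999 + 1)
```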

14
MANTEL
  • With slope aspect and tree direction:
    Mantel statistic r = 0.09994, significance < 0.001
  • Without slope aspect and tree direction:
    Mantel statistic r = 0.1377, significance < 0.001

15
Discriminant Analysis Yes or No?
16
Assumption Homogeneity of Variance
  • A. Conduct univariate tests of variance
  • Bartlett Test of Homogeneity of Variances

    Variable               Bartlett's K-squared   p-value
    TotalCanopy                          16.237     0.000
    WildIndigoSize                       79.392     0.000
    NearestWildIndigo                   118.186     0.000
    DistanceNearestTree                 407.882     0.000
    Slope                                 0.004     0.952
    refangle0                             0.046     0.830
    refangle45                            0.082     0.775
    refangle90                            0.030     0.863
    refangle135                           0.057     0.812
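
For reference, a single row of a table like this can be produced with `bartlett.test` from base R's stats package. The sketch below uses simulated values with deliberately unequal spread; `size` is a made-up stand-in for one habitat variable.

```r
# Two groups with different variances (simulated, not the study data)
set.seed(11)
grp  <- factor(rep(c(0, 1), each = 200))
size <- c(rnorm(200, sd = 1), rnorm(200, sd = 3))

res <- bartlett.test(size ~ grp)
k2  <- unname(res$statistic)   # Bartlett's K-squared
p   <- res$p.value             # a small p rejects equal variances
```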

17
Assumption Homogeneity of Variance
(Figure panels: Total Canopy, Wild Indigo Size, Nearest Wild Indigo, Distance Nearest Tree, Slope)
18
Assumption Homogeneity of Variance
  • B. Conduct multivariate tests of distribution
  • 2-sample E-test of equal distributions
  • Sample sizes: 237, 206
  • Replicates: 999
  • E-statistic = 54.3809
  • P-value < 2.2e-16
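
The two-sample E-test compares between-sample distances with within-sample distances. A base-R sketch of the energy statistic with a permutation p-value follows; it is written from the standard definition of the statistic rather than taken from the authors' code, the data are simulated, and `energy_stat` is a hypothetical helper.

```r
set.seed(5)
x <- matrix(rnorm(40 * 2), ncol = 2)             # sample 1 (simulated)
y <- matrix(rnorm(30 * 2, mean = 1), ncol = 2)   # sample 2, shifted (simulated)

# Energy statistic for the first n1 rows of z versus the remaining rows
energy_stat <- function(z, n1) {
  n  <- nrow(z)
  n2 <- n - n1
  d  <- as.matrix(dist(z))                       # Euclidean distances
  i1 <- seq_len(n1)
  i2 <- n1 + seq_len(n2)
  (n1 * n2 / n) *
    (2 * mean(d[i1, i2]) - mean(d[i1, i1]) - mean(d[i2, i2]))
}

z <- rbind(x, y)
e_obs  <- energy_stat(z, nrow(x))

# Shuffle rows so that sample membership is random under the null
e_perm <- replicate(999, energy_stat(z[sample(nrow(z)), ], nrow(x)))
p <- (sum(e_perm >= e_obs) + 1) / (999 + 1)
```

With 999 replicates, the smallest p-value this permutation scheme can return is 0.001.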

19
Assumption Homogeneity of Variance
Group distributions for all replicates
Predicted vs. actual group distributions
20
Assumption Multivariate normality
A. Conduct univariate tests for normal distribution
Anderson-Darling Test of Normality

Variable               Anderson-Darling A   p-value
TotalCanopy                        34.852   < 0.001
WildIndigoSize                     20.214   < 0.001
NearestWildIndigo                  48.353   < 0.001
DistanceNearestTree                27.682   < 0.001
Slope                              17.193   < 0.001
refangle0                          20.903   < 0.001
refangle45                         21.288   < 0.001
refangle90                         19.365   < 0.001
refangle135                        25.079   < 0.001
21
Assumption Multivariate normality
(Figure panels: Total Canopy, Wild Indigo Size, Nearest Wild Indigo, Distance Nearest Tree, Slope)
22
Assumption Multivariate normality
B. Conduct multivariate assessment of normality
Energy test of multivariate normality
Sample size: 237; replicates: 999
E-statistic = 9.753, p-value < 2.2e-16
23
Assumption Multivariate normality
We attempted to standardize our data using
univariate log transformations,
but it didn't work
24
Assumption Multivariate normality
(Figure panels: Wild Indigo Size, Nearest Wild Indigo, Slope, Distance Nearest Tree)
25
Assumption Multivariate normality
Multivariate normality assessment of
log-transformed data
Energy test of multivariate normality
Sample size: 237; replicates: 999
E-statistic = 10.246, p-value < 2.2e-16
26
Assumption Singularity
A. Determine correlation coefficients for all
possible pairs of variables

              Total     W.I.      Nearest   Nearest
              Canopy    Size      W.I.      Tree      Slope     0         45        90
W.I. Size     -0.046
Nearest W.I.   0.107    -0.217
Nearest Tree  -0.433    -0.088    -0.054
Slope          0.155     0.112    -0.004    -0.239
refangle0      0.037     0.008    -0.008    -0.059     0.026
refangle45     0.053    -0.008    -0.036    -0.019     0.000     0.689
refangle90     0.036    -0.019    -0.042     0.031    -0.025    -0.027     0.705
refangle135   -0.000    -0.019    -0.024     0.063    -0.036    -0.709     0.021     0.723
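
The large correlations among the reference-angle variables (about ±0.7 for angles 45° or 135° apart, about 0 for 90° apart) are what the cosine transformation would produce by construction. A short simulation illustrates this, assuming bearings spread evenly around the compass, which is a simplification and not a property of the field data.

```r
rad <- function(deg) deg * pi / 180
circ.trans <- function(angle, ref.angle) cos(rad(ref.angle - angle))

set.seed(9)
direction <- runif(100000, 0, 360)   # simulated compass bearings, not field data

# Correlations between the transform at reference angle 0 and at 45, 90, 135
r0_45  <- cor(circ.trans(direction, 0), circ.trans(direction, 45))
r0_90  <- cor(circ.trans(direction, 0), circ.trans(direction, 90))
r0_135 <- cor(circ.trans(direction, 0), circ.trans(direction, 135))

# For uniform bearings these approach cos(45°), cos(90°) and cos(135°)
expected <- cos(rad(c(45, 90, 135)))
```

Because the four transformed variables are so strongly tied to one another, keeping only one of them is a reasonable way to avoid near-singularity.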
27
Assumption Singularity
  • Calculate F statistics for non-singular variables
    and retain the variable with the highest F value

                        Df    Sum Sq   Mean Sq   F value   Pr(>F)
Reference Angle 0        1     0.102     0.102    0.2068   0.6495
Residuals              441   216.579     0.491

Reference Angle 45       1     0.506     0.506    1.0384   0.3088
Residuals              441   214.744     0.487

Reference Angle 90       1     0.472     0.472    0.9228   0.3373
Residuals              441   225.508     0.511

Reference Angle 135      1     0.068     0.068    0.1316   0.717
Residuals              441   227.343     0.516
28
Assumption Linear Relationships Between Variables
Most of our variables did not appear to have
linear relationships with one another. Often,
exponential relationships appeared more fitting.
29
Attempting Discriminant Analysis
We conducted an LDA and examined the resulting
classification rate

LDA (Cobs = 0.808, Kappa = 0.614)
       0     1
  0  195    42
  1   43   163

Jackknife (Cobs = 0.742, Kappa = 0.494)
       0     1
  0  142    95
  1   19   187

Split Sample Validation (Cobs = 0.798, Kappa = 0.599)
       0     1
  0   76    39
  1    6   102
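
The observed classification rate (Cobs) and Kappa can be recomputed from any of these confusion matrices. The sketch below uses the usual Cohen's kappa formula, assumed here to match the authors' definition; `kappa_stats` is a hypothetical helper.

```r
# Observed agreement and chance-corrected agreement from a confusion matrix
kappa_stats <- function(cm) {
  n    <- sum(cm)
  cobs <- sum(diag(cm)) / n                       # proportion correctly classified
  cexp <- sum(rowSums(cm) * colSums(cm)) / n^2    # agreement expected by chance
  c(Cobs = cobs, Kappa = (cobs - cexp) / (1 - cexp))
}

# LDA resubstitution matrix from the slide
lda_cm <- matrix(c(195,  42,
                    43, 163), nrow = 2, byrow = TRUE)
lda_stats <- round(kappa_stats(lda_cm), 3)

# Unpruned Gini CART matrix from a later slide
gini_cm <- matrix(c(232,  15,
                     29, 178), nrow = 2, byrow = TRUE)
gini_stats <- round(kappa_stats(gini_cm), 3)
```

Both matrices reproduce the values reported in the transcript (0.808 and 0.614 for LDA; 0.903 and 0.804 for the Gini tree).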
30
Attempting Discriminant Analysis
Split-sample Cross-Validation Classification Summary

                   Correct Classification Rate    Kappa (real or
                      0        1      Total       observed)
minimum            0.491    0.803    0.686        0.396
5th percentile     0.547    0.855    0.714        0.441
median             0.624    0.908    0.754        0.516
95th percentile    0.700    0.952    0.797        0.596
maximum            0.800    0.990    0.855        0.708
mean               0.623    0.906    0.755        0.517

Mean Cobs = 0.755, Mean Kappa = 0.517
31
Attempting Discriminant Analysis
  • Our data did not meet three of the assumptions
    underlying discriminant analysis:
  • - homogeneity of variance: NO
  • - multivariate normal distribution: NO
  • - linear relationships between variables: NO
  • When we used DA and assessed the results, our
    chance-adjusted classification rates (Kappa) were
    around 50%.

32
Discriminant Analysis: Yes or No?
  • NO

33
Initial CART: Gini method, no pruning, all
variables.
The Gini index is the probability that 2 individuals
chosen at random will be in different groups, like
Simpson's diversity index.
34
Gini Confusion Matrix
CCR: Null 54%, Model 90% (410/454)
Kappa (chance-corrected CCR): 0.804
Confusion matrix
       0     1
  0  232    15
  1   29   178
35
Information
The information index is the sum, over classes, of
-p log(p), where p is the probability that a sample
at a node falls into a given class
Not much change in structure from the Gini
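
Both splitting criteria can be evaluated by hand for a single node. The sketch below uses the root-node class counts that appear in the tree output later in the transcript (247 plants without larvae, 207 with); the base-2 logarithm is an assumption made for readability, and another base only rescales the index.

```r
# Root-node class counts: 247 without larvae (0), 207 with larvae (1)
counts <- c(`0` = 247, `1` = 207)
p <- counts / sum(counts)

# Gini index: chance that two random draws (with replacement) differ in class
gini <- 1 - sum(p^2)

# Information index (entropy), here in bits
info <- -sum(p * log2(p))
```

Both measures are near their two-class maximum at this node (0.5 for Gini, 1 bit for entropy) because the classes are almost balanced, and both fall to 0 for a pure node, which may be why the two trees differ so little in structure.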
36
Information Confusion Matrix
CCR: Null 54%, Model 91% (413/454)
Kappa: 0.818
Confusion matrix
       0     1
  0  225    22
  1   19   188
Better Kappa
Better but still poor classification rate for
larval sites
37
Information at 9 leaves
CCR: Null 54%, Model 89% (406/454)
Kappa: 0.788
Confusion matrix
       0     1
  0  217    30
  1   18   189
Still have poor classification
38
Cost VS. Priors
  • Cost is better used when making management
    decisions
  • Priors make more sense if estimating a real
    population
  • Priors did not give any better separation than
    cost

Cost 7:1
CCR: Null 54%, Model 84% (383/454)
Kappa: 0.693
Leaves: 11
Confusion matrix
       0     1
  0  177    70
  1    1   206

Priors .1/.8
CCR: Null 54%, Model 87% (397/454)
Tau: 0.762
Leaves: 8
Confusion matrix
       0     1
  0  199    48
  1    9   198
39
Cost
Cost 5:1
Correct classification rate: Null 54%, Model 86% (389/454)
Kappa: 0.719
Leaves: 14
Confusion matrix (rows observed, cols predicted)
       0     1
  0  182    65
  1    0   207

Cost 3:1
Correct classification rate: Null 54%, Model 90% (407/454)
Kappa: 0.795
Leaves: 18
Confusion matrix (rows observed, cols predicted)
       0     1
  0  202    45
  1    2   205
40
Cost 7:1
41
Cost 7:1, 11 leaves
Correct classification rate: Null 54%, Model 84% (383/454)
Kappa: 0.693
Confusion matrix (rows observed, cols predicted)
       0     1
  0  177    70
  1    1   206
42
Surrogates
With Cost
Node number 1: 454 observations,    complexity param=0.2307692
  predicted class=1  expected loss=0.5440529
    class counts:   247   207
   probabilities: 0.544 0.456
  left son=2 (113 obs) right son=3 (341 obs)
  Primary splits:
      DistanceNearestTree < 0.02284431 to the right, improve=36.3279000, (0 missing)
      TotalCanopy         < -0.822055  to the left,  improve=35.7119700, (0 missing)
  Surrogate splits:
      TotalCanopy       < -0.822055 to the left,  agree=0.899, adj=0.593, (0 split)
      NearestWildIndigo < 4.582521  to the right, agree=0.756, adj=0.018, (0 split)

Without Cost
Node number 1: 454 observations,    complexity param=0.4347826
  predicted class=0  expected loss=0.4559471
    class counts:   247   207
   probabilities: 0.544 0.456
  left son=2 (210 obs) right son=3 (244 obs)
  Primary splits:
      TotalCanopy         < -0.4788286 to the left,  improve=58.5416900, (0 missing)
      DistanceNearestTree < 0.02284431 to the right, improve=55.4422000, (0 missing)
  Surrogate splits:
      DistanceNearestTree < -0.3673464 to the right, agree=0.881, adj=0.743, (0 split)
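
The root-node lines of the two outputs (predicted class 1 with expected loss 0.5440529 under cost, predicted class 0 with expected loss 0.4559471 without) can be reproduced from the class counts alone. The sketch assumes the cost setting is 7:1 and that the higher cost applies to classifying a plant with larvae as one without; `root_node` is a hypothetical helper, not part of any package.

```r
counts <- c(`0` = 247, `1` = 207)   # root-node class counts from the output

# loss[i, j]: cost of predicting class j when the true class is i (rows/cols = 0, 1)
loss_equal <- matrix(c(0, 1,
                       1, 0), nrow = 2, byrow = TRUE)
loss_7to1  <- matrix(c(0, 1,
                       7, 0), nrow = 2, byrow = TRUE)  # assumed: missing a larval plant costs 7

root_node <- function(counts, loss) {
  total_loss <- as.vector(counts %*% loss)   # total loss for predicting 0, then 1
  best <- which.min(total_loss)
  c(predicted = best - 1, expected_loss = total_loss[best] / sum(counts))
}

without_cost <- root_node(counts, loss_equal)
with_cost    <- root_node(counts, loss_7to1)
```

Under the 7:1 loss, predicting "larvae present" for every plant is the cheaper default, so the tree starts from class 1 and accepts more false positives in exchange for very few missed larval plants.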
43
Surrogates
Correct classification rate: Null 54%, Model 87% (393/454)
Kappa: 0.734
Confusion matrix
       0     1
  0  195    52
  1    9   198
higher
lower
44
Surrogates, 8:1 cost, 13 leaves
Correct classification rate: Null 54%, Model 85% (387/454)
Kappa: 0.71
Confusion matrix
       0     1
  0  182    65
  1    2   205
45
Monte Carlo 7:1
CCR: 0.8326
Correct classification rate: Null 54%, Model 83% (378/454)
Kappa: 0.672
Confusion matrix
       0     1
  0  172    75
  1    1   206
Permutation test results: P < 0.01
Kernel-based P = 0
46
Random Forest
Call:
 randomForest(formula = grp ~ ., data = y.std, ntree = 1000,
     importance = TRUE, proximity = TRUE)
               Type of random forest: classification
                     Number of trees: 1000
No. of variables tried at each split: 3

        OOB estimate of error rate: 14.98%
Confusion matrix:
    0   1 class.error
0 214  33   0.1336032
1  35 172   0.1690821
47
Random Forest
 1  TotalCanopy           100.00
 2  WildIndigoSize         80.80
 3  NearestWildIndigo      55.44
 4  DistanceNearestTree    54.75
 5  Slope                  20.15
 6  refangle135            18.76
 7  refangle90slope        16.82
 8  refangle135slope       11.64
 9  refangle0               9.63
10  refangle45              9.34
11  refangleslope45         7.11
12  refangleslope0          6.69
13  refangle90              6.21

                        0     1  MeanDecreaseAccuracy  MeanDecreaseGini
TotalCanopy          2.57  3.04                  2.12             54.79
WildIndigoSize       2.67  3.17                  2.19             53.36
NearestWildIndigo    1.40  1.89                  1.44             20.62
DistanceNearestTree  2.38  2.53                  1.98             37.55
Slope                0.86  1.03                  0.92             10.30
refangle0            0.24  0.33                  0.27              5.86
refangle45           0.38  0.66                  0.51              5.48
refangle90           0.33  0.80                  0.56              5.67
refangle135          0.05  0.67                  0.37              5.70
refangleslope0       0.49  0.76                  0.61              5.87
refangleslope45      0.10  0.83                  0.45              5.04
refangle90slope      0.13  0.85                  0.50              5.18
refangle135slope     0.30  0.98                  0.66              5.46
48
Random Forest
        OOB estimate of error rate: 16.67%
Confusion matrix:
    0   1 class.error
0 119  19   0.1376812
1  21  81   0.2058824
                Test set error rate: 15.89%
Confusion matrix:
    0   1 class.error
0  95  14   0.1284404
1  20  85   0.1904762
49
Ecological Importance