Provided by: David2863

Learn more at: https://www.stat.colostate.edu

Transcript and Presenter's Notes

Title: Local Enhancement of Global Estimation

1
Local Enhancement of Global Estimation
Molly Leecaster, Ph.D., and Kerry Ritter, Ph.D.
DAMARS and STARMAP 2nd Annual Conference
Oregon State University, Corvallis, OR
August 11, 2003
2
Acknowledgement
PROJECT FUNDING

The work reported here was developed under the
STAR Research Assistance Agreement CR-829095
awarded by the U.S. Environmental Protection
Agency (EPA) to Colorado State University. This
presentation has not been formally reviewed by
EPA. The views expressed here are solely those
of the presenter and STARMAP, the Program they
represent. EPA does not endorse any products or
commercial services mentioned in this
presentation.

3
Outline of Presentation

Introduction
Two-stage sample design
Spatial modeling of binary EMAP data
Indicator kriging
Conditional autoregressive model
Simulation Example
Future work

4
Introduction

EMAP developed for estimation of areal extent of
resources
Sample locations are spatially separated
EMAP participants are interested in global
estimation but also have local concerns
Spatial modeling
EMAP data do not provide information on the
local spatial structure required for good spatial
models
Therefore:
Augment EMAP design to improve spatial modeling

5
Goals

Present enhancement to EMAP design
Use of enhanced sample in spatial models of
indicator data
Indicator kriging
Conditional autoregressive model

6
Outline of Presentation

Introduction
Two-stage sample design
Spatial modeling of EMAP data
Simulation Example
Future work

7
Two-stage Systematic Grid Plus Star Cluster
Sample Design

Two-stage because two goals
Systematic (EMAP) grid for global structure
Star cluster sample for variogram estimation
Enhance EMAP design with additional sample
locations
Ideal for areal extent and prediction
Ideal for variogram estimation

8
Two-Stage Design
Legend: pink = absence; blue = presence; black = systematic; green = star clusters 1; orange = star clusters 2
9
Stage One: Systematic Component (EMAP)

Based on global estimation requirements
e.g., 30 spatially separated locations per stratum

10
Stage Two: Star Cluster Component

Star clusters of sample sites around stage-one
locations
Star clusters provide estimate of small scale
pair-wise variance
Star clusters also provide many added pairs of
samples at various distance lags
Star clusters provide directional information at
small scale
How to specify star clusters?
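
Note: a minimal sketch of why a star cluster adds many short-lag pairs. The slides do not give the star geometry, so the number of arms, the step distances, and the function names below are illustrative assumptions (6 arms with 2 points each gives 12 added sites per star, which happens to match the 24 extra sites for two stars in the simulation later in the talk).

```python
import math
from itertools import combinations

def star_cluster(center, n_arms=6, steps=(1.0, 2.0)):
    """Place points along equally spaced arms around `center`.

    n_arms and steps are hypothetical choices, not values from the slides.
    """
    cx, cy = center
    points = []
    for k in range(n_arms):
        angle = 2.0 * math.pi * k / n_arms
        for r in steps:
            points.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return points

def pair_distances(points):
    """All pairwise distances between points, sorted ascending."""
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))

# One stage-one location plus its star of added sites.
center = (10.0, 10.0)
cluster = [center] + star_cluster(center)
distances = pair_distances(cluster)

print(len(cluster), "sites give", len(distances), "pairs")
print("shortest lag:", round(distances[0], 3))
print("longest lag:", round(distances[-1], 3))
```

Every pair inside the star has a lag no larger than the star diameter, and the arms supply pairs in several directions, which is the small-scale and directional information the slide refers to.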

11
Stage Two: Star Cluster Component

Location of star clusters
Adaptive, locate at specified observed response
Does this bias the variogram estimation?
Random stage-one locations
Systematic subset of stage-one locations
Size of star clusters
Diameter of star variogram range
Diameter of star > variogram range
Number of star clusters
At least two, but how many more?

12
Outline of Presentation

Introduction
Two-stage sample design
Spatial modeling of EMAP data
Simulation Example
Future work

13
Spatial Models for Binary Data

Indicator kriging for geo-referenced data
Conditional autoregressive model for binary
lattice data

14
Indicator Kriging

Binary geo-referenced data
Spatial correlation structure modeled from data
Precision of predictions depends on sample
spacing and variogram parameters

15
Ordinary Indicator Kriging

Estimate the local indicator mean at each location
Apply simple IK estimator using estimated mean
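
Note: the two steps above can be written out in standard textbook notation. The symbols are assumptions, since the slide's own notation did not survive: I(s_i) is the indicator at sample location s_i, \hat{m}(s_0) is the estimated local mean, \nu_i are the weights for estimating the mean (summing to one), and \lambda_i are simple kriging weights from the indicator covariance.

```latex
% Step 1: estimate the local indicator mean from the n nearby samples
\hat{m}(s_0) = \sum_{i=1}^{n} \nu_i \, I(s_i), \qquad \sum_{i=1}^{n} \nu_i = 1

% Step 2: simple indicator kriging around the estimated mean
\hat{p}(s_0) = \hat{m}(s_0) + \sum_{i=1}^{n} \lambda_i \, \bigl[ I(s_i) - \hat{m}(s_0) \bigr]
```

Here \hat{p}(s_0) is read as the predicted probability of presence at the unsampled location s_0.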

16
Conditional Autoregressive Model for Binary Data

Binary lattice data
Spatial correlation structure assumed to be a locally
(neighborhood) dependent Markov random field
Neighborhood defined as fixed pattern of
surrounding grid points
Precision of predictions depends on neighborhood
structure, grid size, and variance of response
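
Note: the model formula itself is not in the transcript. One common formulation for binary lattice data, shown here only as an assumption about the general form and not necessarily the presenters' exact model, is a logistic model with an intrinsic conditional autoregressive random effect. The symbols (\alpha, b_i, \delta_i, n_i, \sigma^2) are introduced for illustration.

```latex
% Binary response at lattice point i
Y_i \sim \mathrm{Bernoulli}(p_i), \qquad \mathrm{logit}(p_i) = \alpha + b_i

% Spatial effect depends only on the neighborhood \delta_i of point i,
% which contains n_i neighboring lattice points
b_i \mid b_{j \neq i} \sim \mathrm{N}\!\left( \frac{1}{n_i} \sum_{j \in \delta_i} b_j ,\; \frac{\sigma^2}{n_i} \right)
```

In this form the neighborhood definition and the grid spacing fix the assumed spatial relationship, which is why the slide lists neighborhood structure and grid size as drivers of prediction precision.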

17
Conditional Autoregressive Model for Binary Data
18
Comparison of Models

Ordinary Indicator Kriging
Advantages
Knowledge of spatial relationship improves
prediction
Assumed spatial relationship based on data
Disadvantages
Not robust to variogram mis-specification
Requires strong stationarity assumption
Conditional autoregressive
Advantages
No need to estimate or model variogram
Can be used without geo-referenced data
Disadvantages
Assumed spatial relationship based on a grid size
that could be inaccurate

19
Outline of Presentation

From last year to now: progress & new directions
Two-stage sample design
Spatial modeling of EMAP data
Simulation Example
Future work

20
Simulation Example

Used simulation so spatial structure was known
Simulated response from a specific variogram model
onto a 50x50 hexagon grid of points
Specified presence/absence cutoff
Applied two-stage sample design (2 realizations)
Estimated and modeled variogram from sample data
For some, did two manual and one automatic fit
Predicted probability of presence using indicator
kriging and conditional autoregressive model
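
Note: a minimal sketch of the "estimated variogram from sample data" step, using the usual method-of-moments semivariogram on indicator values. The function name, the lag binning, and the four-site toy data are assumptions for illustration; the presenters worked in S-Plus and their fitting procedure is not shown in the slides.

```python
import math
from collections import defaultdict

def empirical_variogram(sites, lag_width):
    """Method-of-moments semivariogram for (x, y, indicator) triples.

    Pairs are binned by distance: bin k covers [k * lag_width, (k + 1) * lag_width).
    Returns {bin: (semivariance, number_of_pairs)}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for a in range(len(sites)):
        xa, ya, ia = sites[a]
        for b in range(a + 1, len(sites)):
            xb, yb, ib = sites[b]
            lag_bin = int(math.hypot(xa - xb, ya - yb) // lag_width)
            sums[lag_bin] += (ia - ib) ** 2
            counts[lag_bin] += 1
    # Semivariance is half the mean squared difference within each bin.
    return {k: (sums[k] / (2.0 * counts[k]), counts[k]) for k in sorted(counts)}

# Toy transect: presence (1) at the first two sites, absence (0) at the last two.
sites = [(0, 0, 1), (1, 0, 1), (2, 0, 0), (3, 0, 0)]
vg = empirical_variogram(sites, lag_width=1.0)
for lag_bin, (gamma, n_pairs) in vg.items():
    print(lag_bin, round(gamma, 3), n_pairs)
```

The pair counts per bin matter as much as the semivariances: bins with few pairs, typically the shortest lags in a purely systematic sample, are the ones the star clusters are meant to fill.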

21
Simulation Methods

Simulated data from Gaussian random field
(S-Plus)
Spherical variogram, range 22, sill 0.4,
nugget 0
Simulated value > 2 => presence
Sample Designs
Systematic sample (n = 30)
Systematic sample plus 2 star clusters (n = 54)
Systematic sample plus 4 star clusters (n = 78)
Models
Indicator kriging
Conditional autoregressive model
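
Note: a short sketch of the spherical variogram with the parameters quoted on this slide (range 22, sill 0.4, nugget 0) and of the presence cutoff. The function names are hypothetical, and treating "sill" as the partial sill added on top of the nugget is an assumption that makes no difference here because the nugget is 0.

```python
def spherical_variogram(h, range_=22.0, sill=0.4, nugget=0.0):
    """Spherical semivariogram; `sill` is treated as the partial sill (assumption)."""
    if h <= 0.0:
        return 0.0
    if h >= range_:
        return nugget + sill
    ratio = h / range_
    return nugget + sill * (1.5 * ratio - 0.5 * ratio ** 3)

def to_indicator(value, cutoff=2.0):
    """Presence (1) if the simulated value exceeds the cutoff, else absence (0)."""
    return 1 if value > cutoff else 0

for h in (0.0, 5.5, 11.0, 22.0, 40.0):
    print(h, round(spherical_variogram(h), 4))
```

Beyond the range the semivariance stays at the sill, so pairs of sites farther apart than 22 grid units carry no spatial information under this model.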

22
Data Simulation with Sample Sites
Legend: pink = absence; blue = presence; black = systematic; green = star clusters 1; orange = star clusters 2
23
Variogram for Sample Designs
Panels: Systematic; Systematic + 2 Stars; Systematic + 4 Stars

Design                Range  Sill  Nugget
Systematic            17     0.17  0
Systematic + 2 Stars  20     0.4   0
Systematic + 4 Stars  14     0.4   0
24
Systematic Sample Results
25
Systematic Sample with 2 Stars
26
Systematic Sample with 4 Stars
27
Three Fits: Systematic + 2 Stars
Panels: Automatic Fit; Manual Fit 1; Manual Fit 2

Range Sill Nugget
17 0.3 0
0.4 0
0.27 0
All use correct model
28
Predictions from 3 Variogram Fits
Automatic Fit
Manual Fit 1
Manual Fit 2
29
Comparison of Prediction Errors

Sensitivity
Percentage of presence sites predicted to be present
Specificity
Percentage of absence sites predicted to be absent
True Positive Rate
Percentage of predicted presence sites that truly are
present
True Negative Rate
Percentage of predicted absence sites that truly are
absent
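
Note: a minimal sketch of the four summaries as defined on this slide. The slide's names are kept as keys; as defined here, "True Positive Rate" and "True Negative Rate" correspond to what is more often called positive and negative predictive value. Reporting them as percentages is an assumption based on the tables that follow, and the function name and toy data are hypothetical.

```python
def prediction_summary(truth, prob, cutoff=0.5):
    """Compare true presence/absence (1/0) with predicted probabilities.

    A site is predicted present when its probability exceeds `cutoff`.
    Returns percentages, using the slide's own names for each summary.
    """
    tp = fp = tn = fn = 0
    for t, p in zip(truth, prob):
        predicted_present = p > cutoff
        if t == 1 and predicted_present:
            tp += 1
        elif t == 1:
            fn += 1
        elif predicted_present:
            fp += 1
        else:
            tn += 1

    def pct(num, den):
        return 100.0 * num / den if den else float("nan")

    return {
        "sensitivity": pct(tp, tp + fn),         # presence sites predicted present
        "specificity": pct(tn, tn + fp),         # absence sites predicted absent
        "true_positive_rate": pct(tp, tp + fp),  # predicted presence truly present
        "true_negative_rate": pct(tn, tn + fn),  # predicted absence truly absent
    }

truth = [1, 1, 1, 0, 0, 0, 0, 0]
prob = [0.9, 0.4, 0.2, 0.6, 0.1, 0.2, 0.35, 0.05]
at_05 = prediction_summary(truth, prob, cutoff=0.5)
at_03 = prediction_summary(truth, prob, cutoff=0.3)
print(at_05)
print(at_03)
```

Lowering the cutoff from 0.5 to 0.3 in this toy example raises sensitivity and lowers specificity, the same trade-off visible between the two cutoffs in the comparison tables.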

30
Comparison of Predictions (Data1F)
Positive if probability > 0.5; values in parentheses: (Auto, Manual 2)

Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            28           98           85                  74
                   Systematic + 2 Stars  41 (36, 27)  94 (96, 99)  77 (80, 76)         77 (90, 74)
                   Systematic + 4 Stars  32           97           85                  75
Conditional Auto.  Systematic            15           96           63                  70
                   Systematic + 2 Stars  56           85           64                  80
                   Systematic + 4 Stars  54           86           65                  80
31
Comparison of Predictions (Data1F)
Positive if probability > 0.3; values in parentheses: (Auto, Manual 2)

Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            48           91           71                  78
                   Systematic + 2 Stars  59 (56, 44)  85 (87, 93)  65 (67, 76)         81 (80, 78)
                   Systematic + 4 Stars  49           91           73                  79
Conditional Auto.  Systematic            48           80           53                  76
                   Systematic + 2 Stars  80           46           42                  83
                   Systematic + 4 Stars  80           49           43                  83
32
Data Simulation with Sample Sites
Legend: pink = absence; blue = presence; black = systematic; green = star clusters 1; orange = star clusters 2
33
Variograms for Sample Designs
Panels: Systematic; Systematic + 2 Stars; Systematic + 4 Stars

Design                Range  Sill  Nugget
Systematic            15     0.27  0
Systematic + 2 Stars  12     0.30  0.05
Systematic + 4 Stars  13     0.30  0.03
34
Systematic Sample Results
35
Systematic Sample with 2 Stars
36
Systematic Sample with 4 Stars
37
Three Fits: Systematic
Panels: Automatic Fit; Manual Fit 1; Manual Fit 2

Range Sill Nugget
30 .25 .21
15 .27 0
.22 0
All use correct model
38
Predictions from 3 Variogram Fits
Automatic Fit
Manual Fit 1
Manual Fit 2
39
Comparison of Predictions (Data3F)
Positive if probability > 0.5; values in parentheses: (Auto, Manual 2)

Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            31 (1, 15)   92 (99, 97)  65 (88, 69)         73 (68, 70)
                   Systematic + 2 Stars  21           96           75                  72
                   Systematic + 4 Stars  24           97           81                  72
Conditional Auto.  Systematic            7            98           65                  69
                   Systematic + 2 Stars  17           97           71                  71
                   Systematic + 4 Stars  18           99           88                  71
40
Comparison of Predictions (Data3F)
Positive if probability > 0.3; values in parentheses: (Auto, Manual 2)

Model              Sample                Sensitivity  Specificity  True Positive Rate  True Negative Rate
Indicator Kriging  Systematic            62 (72, 37)  80 (69, 89)  60 (53, 63)         81 (84, 75)
                   Systematic + 2 Stars  43           90           68                  77
                   Systematic + 4 Stars  44           91           71                  77
Conditional Auto.  Systematic            68           57           41                  77
                   Systematic + 2 Stars  78           58           47                  84
                   Systematic + 4 Stars  80           56           47                  85
41
Simulation Conclusions - Design

Two star clusters improved small-scale features
of variogram
Two star clusters improved prediction accuracy
Four star clusters offered little improvement
over two stars

42
Simulation Conclusions - Models

Variogram model affects predictions
Kriging tends toward overall mean probability of
presence, i.e. it smooths
Kriging builds patches whose diameter is
approximately the range of the variogram
Conditional autoregressive model attempts to
connect observed presence
Neither model had consistently higher sensitivity
or specificity

43
Outline of Presentation

From last year to now: progress & new directions
Two-stage sample design
Spatial modeling of EMAP data
Simulation Example
Future work

44
Future Work

Further simulation studies on two stage design
Effect of sample size
Number of star clusters necessary to improve
variogram estimation
Effect of size of star clusters
Bias from adaptive second-stage sampling
Advantages of indicator kriging and conditional
autoregressive model
Sensitivity of conditional autoregressive model
to initial values, prior distributions, and grid
size
Sensitivity of kriging to variogram model
specification

45
Future Work

Apply two-stage sample design to real data
DDT data from Santa Monica Bay, CA
EMAP data and local monitoring data
Freely distribute functions for applying the
conditional autoregressive model on a hexagon
lattice
Functions in R to produce hexagon lattice input
for WinBUGS
File in WinBUGS to apply model
Investigate optimal grid size to achieve EMAP and
spatial modeling goals
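
Note: a hedged sketch of what "hexagon lattice input for WinBUGS" could look like. The presenters planned R functions that are not shown here; this Python version, its function name, and the offset-row indexing of the hexagon grid are all assumptions. The output follows the adjacency-list layout expected by the WinBUGS/GeoBUGS car.normal distribution (num, adj, weights), with 1-based node ids and unit weights chosen for illustration.

```python
def hex_lattice_adjacency(n_rows, n_cols):
    """Neighbor lists for a hexagon lattice stored in offset rows.

    Odd-numbered rows (0-based) are assumed to be shifted half a cell to the right.
    Returns (num, adj, weights): neighbor counts per node, the flattened
    1-based neighbor ids, and unit weights.
    """
    def node_id(r, c):
        return r * n_cols + c + 1

    num, adj = [], []
    for r in range(n_rows):
        shift = r % 2  # 0 for even rows, 1 for odd (shifted) rows
        offsets = [(0, -1), (0, 1),
                   (-1, shift - 1), (-1, shift),
                   (1, shift - 1), (1, shift)]
        for c in range(n_cols):
            neighbors = [node_id(r + dr, c + dc)
                         for dr, dc in offsets
                         if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols]
            num.append(len(neighbors))
            adj.extend(neighbors)
    weights = [1] * len(adj)
    return num, adj, weights

num, adj, weights = hex_lattice_adjacency(3, 3)
print("num =", num)
print("adj =", adj)
```

Interior points have six neighbors and edge points fewer, so the choice of lattice size and spacing directly determines the neighborhood that the conditional autoregressive model treats as spatially related.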

46
Systematic (EMAP) Grid Based on Variogram Model

Kriging variance
Analog for conditional autoregressive model

47
Systematic (EMAP) Grid Based on Variogram Model

Prediction variance is minimized by large
covariance between prediction location and sample
locations
For kriging, grid refers to sample locations
For conditional autoregressive, grid refers to
sample locations and prediction locations
Want: sample locations close together
Samples too far apart =>
Kriging -> correctly uses no spatial relationship
Conditional autoregressive -> incorrectly uses
assumed spatial relationship
Samples too close together => waste of resources
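
Note: a small worked example of the trade-off on this slide, using the simplest possible case: simple kriging from a single sample at distance h, where the prediction variance is C(0) - C(h)^2 / C(0). The one-sample setup, the function names, and the reuse of the simulation's spherical model (range 22, sill 0.4, nugget 0) are illustrative assumptions, not the presenters' calculation.

```python
def spherical_covariance(h, range_=22.0, sill=0.4):
    """Covariance implied by a spherical variogram with zero nugget."""
    if h >= range_:
        return 0.0
    ratio = h / range_
    return sill * (1.0 - 1.5 * ratio + 0.5 * ratio ** 3)

def one_sample_kriging_variance(h, range_=22.0, sill=0.4):
    """Simple kriging variance when predicting from one sample at distance h."""
    c0 = spherical_covariance(0.0, range_, sill)
    ch = spherical_covariance(h, range_, sill)
    return c0 - ch * ch / c0

for h in (0.0, 5.5, 11.0, 22.0, 30.0):
    print(h, round(one_sample_kriging_variance(h), 4))
```

The variance grows from zero at the sample itself to the full sill once h reaches the range: at that spacing the sample contributes nothing, which is the sense in which kriging "correctly uses no spatial relationship" when samples are too far apart.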