Spatial Interpolation - PowerPoint PPT Presentation

About This Presentation
Title:

Spatial Interpolation

Description:

Arthur J. Lembo, Jr., Cornell University


Transcript and Presenter's Notes

1
Spatial Interpolation
  • Inverse Distance Weighting
  • The Variogram
  • Kriging
  • Much thanks to Bill Harper for his insights in
    Practical Geostatistics 2000 and personal
    conversation

2
Objectives
  • In this session we will evaluate a dataset and
    attempt to
  • Explore the theory and implementation of inverse
    distance weighting
  • Evaluate issues with IDW interpolation
  • Explore the theory and implementation of the
    semi-variogram and its applicability to
    interpolation
  • Explore the theory and implementation of kriging
    and its applicability to interpolation

3
Data Set
  • Simulated Borehole data (PG 2000)
  • Iron concentration
  • Need to interpolate iron content for unsampled
    areas
  • General Statistics
  • 47 samples
  • Mean value 36.3
  • S.D. 3.73

4
General Statistics
  • Histogram shows the relative distribution of the
    data
  • Generally follows a normal distribution
  • Other observations
  • Minor skew, no big deal

5
Data Set
  • The best unbiased estimate for the standard
    deviation is 3.726 (see right)
  • Therefore, we are 90% confident that a point
    drawn at random would be
  • 30 < T < 42.6
  • This is based on consulting a Student's t
    distribution with 47 samples
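
As a rough check of the interval above, the sketch below recomputes it from the slide's summary statistics. The critical value `T_CRIT_46` is an assumption taken from a standard Student's t table (two-sided 90%, 46 degrees of freedom), and the function name is illustrative only.

```python
def t_interval(mean, sd, t_crit):
    """Return (low, high) for a single value drawn at random: mean +/- t_crit * sd."""
    half_width = t_crit * sd
    return mean - half_width, mean + half_width


# Two-sided 90% critical value for 46 degrees of freedom (table look-up, assumed).
T_CRIT_46 = 1.679

low, high = t_interval(mean=36.3, sd=3.726, t_crit=T_CRIT_46)
print(f"{low:.1f} < T < {high:.1f}")
```

With these inputs the bounds round to the 30 and 42.6 quoted on the slide.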

6
Subset of Area (northwest area)
  • Subset of borehole data
  • Upper left side
  • General Statistics
  • 7 samples
  • Mean value 40
  • S.D. 2.82
  • Getting somewhat better

7
  • The best unbiased estimate for the standard
    deviation is 3.05 (see right)
  • Therefore, we are 90% confident that a point
    drawn at random would be
  • 34.2 < T < 45.7
  • This is based on consulting a Student's t
    distribution with 7 samples
  • Now, the question is, do some of the points
    exhibit more influence than others?
  • Probably, so let's evaluate the point taking
    nearness into account

8
Inverse Distance Weighting
  • IDW works by using an unbiased weight matrix
    based on the distances from an unknown value to
    known values.
  • Weights may be defined a number of different ways
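
A minimal sketch of the idea, assuming the common choice of weights proportional to 1 / distance^power and normalized to sum to one. The coordinates and iron values are hypothetical, not the borehole data, and `idw` is an illustrative name rather than an ArcGIS function.

```python
import math


def idw(points, values, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`."""
    weights = []
    for point, value in zip(points, values):
        d = math.dist(point, target)
        if d == 0.0:
            return value  # target coincides with a sample: return it exactly
        weights.append(1.0 / d ** power)
    total = sum(weights)
    # Dividing by the total makes the weights add up to one.
    return sum(w * z for w, z in zip(weights, values)) / total


# Hypothetical sample locations and iron values.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
values = [40.0, 42.0, 38.0]
estimate = idw(points, values, target=(5.0, 5.0))
```

Here the target is equally far from all three samples, so the estimate reduces to their plain average. Raising `power` shifts weight toward the nearest samples.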

9
IDW
  • ArcGIS provides a nice interface to view points
  • This example looks at 7 neighbors
  • Now, let's look at it the old-fashioned way

10
IDW
  • Using 7 neighboring points allows us to
    interpolate a value based on distances
  • Interpolated value is 39.9
  • So, our calculation is the same as that in ArcGIS:
    it's just math.

11
IDW Standard Error
  • We will compute it, without considering the
    autocorrelation in the data
  • Standard error 2.75
  • Therefore, we are 90% confident that a point
    drawn at random would be
  • 34.7 < T < 45.1
  • This is based on consulting a Student's t
    distribution with 7 samples

Caveat: we are treating IDW like a weighted mean,
and the standard deviation like a weighted
standard deviation. In reality, you shouldn't
develop confidence intervals for data that are
autocorrelated.
12
IDW Methods
[Figure panels: Power 2, search 150; Power 2, search 230; Power 2, search 600; Power 4, search 600]
So which is best?
13
10 Questions to Evaluate [1]
  • What function of distance should we use?
  • How do we handle different continuity in
    different directions?
  • How many samples should we include in the
    estimation?
  • How do we compensate for irregularly spaced or
    highly clustered sampling?
  • How far should we go to include samples in our
    estimation process?
  • Should we honor the sample values?
  • How reliable is the estimate when we have it?
  • Why is our map too smooth?
  • What happens if our sample data is not Normal?
  • What happens if there is a strong trend in the
    values?

[1] Clark and Harper, Practical Geostatistics 2000.
Ecosse North America, LLC
14
Answering the 10 Questions
  • The Variogram

15
What is a Semi-Variogram
  • The semi-variogram is a function that relates
    semi-variance (or dissimilarity) of data points
    to the distance that separates them.
  • If we can understand the difference between an
    unknown quantity and a known quantity, we can
    estimate the unknown point

16
Estimating via semi-variogram
  • Let's assume the relationship between the unknown
    and known point depends on their separation, 121 feet
    NE/SW
  • If these two points have the same relationship as
    the other points, we can look at the other pairs of
    points that are 121 feet apart NE/SW

17
Computing the standard differences
  • For all 31 pairs we can compute the standard
    deviation of the differences in value
  • We are assuming a mean of 0, and a normal
    distribution
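
The step above can be sketched as follows, assuming (as the slide does) that the differences have a mean of 0. The pairs are hypothetical stand-ins for the 31 pairs on the slide, and the normal critical value 1.645 is an assumption used in place of a t-table look-up.

```python
import math


def sd_of_differences(pairs):
    """Standard deviation of value differences, assuming their mean is 0."""
    squared = [(a - b) ** 2 for a, b in pairs]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical (value_1, value_2) pairs sharing one separation vector.
pairs = [(37.0, 35.0), (40.0, 41.0), (36.0, 38.0), (39.0, 38.0)]
sd = sd_of_differences(pairs)

# 90% interval around a known neighbor of 37 Fe, using the normal
# critical value 1.645 (assumed).
known = 37.0
low, high = known - 1.645 * sd, known + 1.645 * sd
```

Half of the mean squared difference computed inside `sd_of_differences` is the semi-variance at that separation, which is the quantity the variogram plots against distance.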

18
Computing the standard differences
  • The single point we are looking at is 37 Fe.
  • If our original samples come from a normal
    distribution, the differences will be normal, so
    we can be 90% confident that a point drawn at random
    would be

19
Taking the semi-variogram further
  • Chances are, we won't get to sample our data on a
    regular grid.
  • We have to algebraically define some function of
    distance with the differences in value
  • Therefore, we will assign h to the distance

20
Variograms
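  • Variogram: $\gamma(h) = \tfrac{1}{2}\,\mathrm{Var}[Z(x) - Z(x+h)]$
  • $= \tfrac{1}{2}\,E\{[Z(x) - Z(x+h)]^2\}$
  • In practice:
  • $\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} [Z(x_i) - Z(x_i+h)]^2$
  • where N(h) is the total number of pairs of
    observations separated by a distance h.
  • The fitted curve minimizes the variance of the
    errors.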
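
A minimal sketch of the sample variogram, assuming an omnidirectional search and lag bins with a tolerance of half a lag. The function name and the test data are hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, values, lag_size):
    """Return {lag_distance: (gamma_hat, n_pairs)} for bins of width lag_size."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(points[i], points[j])
            k = round(h / lag_size)  # nearest lag, i.e. tolerance of half a lag
            if k == 0:
                continue  # pairs closer than half a lag are skipped
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    # gamma_hat(h) = sum of squared differences / (2 * N(h))
    return {k * lag_size: (sums[k] / (2 * counts[k]), counts[k])
            for k in sorted(counts)}


# Hypothetical transect: values rise by 2 units per unit of distance.
points = [(float(x), 0.0) for x in range(5)]
values = [2.0 * x for x in range(5)]
result = empirical_semivariogram(points, values, lag_size=1.0)
```

Each entry also reports the number of pairs N(h), which helps when judging whether a lag has enough pairs to be trusted.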

21
Variogram components
  • Nugget variance: a non-zero value for the variogram
    when h = 0. Produced by various sources of unexplained
    error (e.g. measurement error).
  • Sill: for large values of h the variogram levels
    out, indicating that there no longer is any
    correlation between data points. The sill should
    be equal to the variance of the data set.
  • Range: the value of h where the sill occurs
    (or 95% of the value of the sill).
  • In general, 30 or more pairs per point are needed
    to generate a reasonable sample variogram.
  • The most important part of a variogram is its
    shape near the origin, as the closest points are
    given more weight in the interpolation process.

22
Variogram models
Variogram models must be valid (the covariance
function they imply must be positive definite) so
that the kriging matrix built from them can be
inverted (which occurs in the kriging process).
Because of this, only certain models can be used.
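
Two commonly used valid models, the spherical and the exponential, are sketched below. The parameter names (`nugget`, `partial_sill`, `rng`) are illustrative, and the factor of 3 in the exponential model is the "practical range" convention, assumed here so that the curve reaches about 95% of the sill at the range.

```python
import math


def spherical(h, nugget, partial_sill, rng):
    """Spherical model: reaches nugget + partial_sill exactly at h = rng."""
    if h == 0:
        return 0.0  # the variogram is zero at zero separation
    if h >= rng:
        return nugget + partial_sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, partial_sill, rng):
    """Exponential model: reaches about 95% of the partial sill at h = rng."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / rng))


# Hypothetical parameters: nugget 2, partial sill 10, range 100.
at_range = spherical(100.0, 2.0, 10.0, 100.0)
half_way = spherical(50.0, 2.0, 10.0, 100.0)
```

Entering different values for the three parameters shows how the nugget lifts the curve at the origin, the sill sets the plateau, and the range sets where the plateau begins.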
23
Semi-variogram models
We can enter some numbers in Mathcad and see how
the variogram changes.
24
Effect of lag size on variograms
Variogram with a lag size of 5m and a lag
tolerance of 2.5m.
Variogram with a lag size of 10m and a lag
tolerance of 5m.
25
Anisotropy
  • There may be higher spatial autocorrelation in
    one direction than in others, which is called
    anisotropy
  • The figure shows a case of geometric anisotropy,
    which is incorporated in the variogram model by
    means of a linear transformation.
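
One way to picture that linear transformation is as a rotation onto the direction of greatest continuity followed by a stretch of the perpendicular component. The sketch assumes the angle is measured counter-clockwise from the x-axis and that `ratio` is the major range divided by the minor range; both conventions are assumptions, and software packages differ.

```python
import math


def anisotropic_distance(dx, dy, major_angle_deg, ratio):
    """Equivalent isotropic distance for a separation vector (dx, dy)."""
    theta = math.radians(major_angle_deg)
    # Components along and across the direction of greatest continuity.
    along = dx * math.cos(theta) + dy * math.sin(theta)
    across = -dx * math.sin(theta) + dy * math.cos(theta)
    # Stretching the across component makes values decorrelate faster that way.
    return math.hypot(along, ratio * across)


# Hypothetical case: major axis along x, range twice as long as across it.
d_along = anisotropic_distance(3.0, 0.0, major_angle_deg=0.0, ratio=2.0)
d_across = anisotropic_distance(0.0, 2.0, major_angle_deg=0.0, ratio=2.0)
```

The transformed distance can then be fed to a single isotropic variogram model.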

26
Semi-variogram tips
  • We are assuming a normal distribution
  • Gives us a picture of the relationship of data
    values with distance.
  • If you don't have a good spatial structure in the
    semi-variogram, don't revert to IDW: this is
    stupid!

27
Comparing Software for Computing the
Semi-Variogram
ArcGIS Geostatistical Analyst
Practical Geostatistics 2000
28
Assessing Fit of the Variogram
  • Cressie Goodness of Fit
  • For each point used to create the variogram,
    match how well the model actually fits it

29
Kriging
  • Kriging is based on the idea that you can make
    inferences regarding a random function Z(x),
    given data points Z(x1), Z(x2), ..., Z(xn).

3 components: structural (constant mean), random
spatially correlated component, and residual
error.
$Z(x) = m(x) + \gamma(h) + \varepsilon$
30
Kriging
  • This is our variogram from the borehole data
  • To discuss the mathematics of kriging, we will
    look at a simple example of 3 points, and get
    back to our data in a moment

31
Kriging
  • Numerical Example of Iron Ore Data
  • From Practical Geostatistics 2000

32
Data Set
  • Iron Ore Data, based on sample set from PG 2000
  • Three point example for simplicity

33
Calculating Distances
  • The first thing we do is determine the distances
    between each point
  • Also calculate difference in Z values between all
    points

34
Semi Variogram
  • We apply the GLM, based on other tests performed
    on the data
  • The values chosen give the best Cressie
    statistics for fit on all data points
  • Note: Mathcad is not great at creating
    semivariograms!

35
Computing Weights
  • Using basic matrix algebra, we can solve for the
    weights.
  • The weights will add to one, due to our eventual
    sleight of hand with the last row.

36
Solving the Unknown
  • Basic matrix algebra will solve for the unknown
    value
  • We also compute the standard error and variance
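
A minimal sketch of the matrix algebra, assuming ordinary kriging written in variogram form. The extra row and column of ones is what forces the weights to add to one. The three points, their values, and the linear variogram are hypothetical, not the numbers from the slides, and the small solver stands in for whatever matrix routine is at hand.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(points, values, target, variogram):
    """Return (weights, estimate, kriging_variance) at `target`."""
    n = len(points)
    # Variogram values between samples, bordered by a column of ones.
    a = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])  # last row: the weights must sum to one
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights, mu = solution[:n], solution[n]  # mu is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return weights, estimate, variance


def linear_variogram(h):
    """Hypothetical linear model with zero nugget and slope 0.05."""
    return 0.05 * h


points = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
values = [40.0, 42.0, 38.0]
weights, estimate, variance = ordinary_kriging(
    points, values, (30.0, 30.0), linear_variogram)
std_error = math.sqrt(variance)
```

With a zero nugget, kriging at a sample location returns that sample's value with zero variance, which is a quick sanity check on the system.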

37
Solving Our Borehole Data
  • Start with our original example
  • Since we have 7 points rather than 3, the screens
    will be busier

38
Borehole Data
  • The ability to create semi-variograms in Mathcad
    is pretty bad, but this allows us to visualize
    the mathematics
  • Here we are using the spherical model

39
Borehole Data
  • Again, we can see with this dataset the weights
    also add up to one

40
Solution
  • Here we've computed the value of the unknown
    point, and the standard error
  • This was based on the limited set of 7 points,
    now we'll do it with the rest.

41
Predicting the Point
  • ArcGIS has a good interface for evaluating the
    weights of the points, in addition to predicting
    a test location

42
Kriging Results
  • ESRI Geostatistical Analyst
  • Interpolated value
  • 41.26
  • Standard error
  • 2.16
  • PG 2000
  • Interpolated value
  • 41.14
  • Standard error
  • 2.11

43
Standard Errors
  • Based on Kriging results, we can assume the
    true value of the unknown point, with 90%
    confidence, as
  • 37.6 < 41.14 < 44.68 Fe
  • So, we are getting better results, better looking
    maps, and smaller confidence intervals
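
The interval above can be reproduced from the PG 2000 estimate and standard error. The slides do not say which critical value was used. The bounds are consistent with a Student's t value for 46 degrees of freedom (about 1.679, a table look-up), which is the assumption made in this sketch.

```python
# Two-sided 90% critical value for 46 degrees of freedom (table look-up, assumed).
T_CRIT_46 = 1.679

estimate, std_error = 41.14, 2.11  # PG 2000 kriging results from the slides
half_width = T_CRIT_46 * std_error
low, high = estimate - half_width, estimate + half_width
print(f"{low:.2f} < {estimate} < {high:.2f}")
```

The half-width of about 3.5 is narrower than the 5.2 or so implied by the IDW interval of 34.7 to 45.1.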

44
IDW vs. Kriging
Kriging
  • Kriging appears to give a more natural look to
    the data
  • Kriging avoids the bull's-eye effect
  • Kriging also gives us a standard error

IDW
45
Results
46
Review of 10 Questions to Ask [1]
  • What function of distance should we use?
  • The variogram shows us the spatial structure, and
    association of the data, and will give us a hint
    as to what function to possibly use.
  • How do we handle different continuity in
    different directions?
  • Here again, the variogram will tell us whether
    there is any spatial association, and we can
    determine which direction by evaluating whether
    anisotropy exists.
  • How many samples should we include in the
    estimation?
  • Again, we can look at the variogram
  • How do we compensate for irregularly spaced or
    highly clustered sampling?
  • The variogram defines the relationship between
    points and their distances from other points.
    Calculating weights in Kriging takes the
    distances among all points into account.

47
10 Questions to Ask [1]
  • How far should we go to include samples in our
    estimation process?
  • By looking at the variogram we can identify the
    sill (that area where the spatial correlation has
    little value). The range tells us the distance
    where the points are no longer correlated.
  • Should we honor the sample values?
  • Still lots of debate on this one. IDW says yes,
    that's why we get the bull's-eye. The nugget
    effect in Kriging allows us to say no. But, we
    can set the nugget to zero with Kriging.
  • How reliable is the estimate when we have it?
  • Kriging allows us to compute the standard error
  • Why is our IDW map too smooth?
  • In IDW when you include points far away they
    become part of the weights. Since the weights
    have to add up to one, you are basically taking
    power away from the closer ones.
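
The effect is easy to see numerically. The sketch assumes weights proportional to 1 / distance^power, normalized to sum to one; the distances are hypothetical.

```python
def idw_weights(distances, power=2.0):
    """Normalized inverse-distance weights for a list of distances."""
    raw = [1.0 / d ** power for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


# Three nearby samples only, then the same three plus five distant ones.
near_only = idw_weights([10.0, 12.0, 15.0])
with_far = idw_weights([10.0, 12.0, 15.0, 200.0, 250.0, 300.0, 300.0, 350.0])

share_lost = near_only[0] - with_far[0]  # weight taken from the closest sample
```

Every distant point added to the search claims a share of the total, so the closest sample's weight can only go down. A lower power or a larger search radius makes the loss more pronounced.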

48
10 Questions to Ask
  • What happens if our sample data is not Normal?
  • Basically, transform the data so that it is
    approximately normal
  • What happens if there is a strong trend in the
    values?
  • First, remove the trend, then re-interpolate the
    points (see ESRI Calif. Ozone example, or Clark
    and Harper Wolfcamp Data)

49
Conclusions
  • It is possible to interpolate an unknown point
    based on other points in a data set
  • While it can be done with descriptive statistics,
    other methods are clearly better
  • The variogram helps answer many questions related
    to our data, and provides a wealth of information
    related to the spatial structure of the data
  • More robust (geostatistical) methods for
    interpolation appear to provide better results