Point Pattern Analysis - PowerPoint PPT Presentation

Provided by: ariz74
1
Point Pattern Analysis
Point patterns fall between two extremes: highly
clustered and highly dispersed. Most tests of
point patterns compare the observed pattern to
complete spatial randomness (CSR). Two
measurements are used to describe a pattern:
  • Density of points across the analysis area
  • Distance between points within the analysis area
2
Distance Methods
  • Distance methods are becoming more common
  • Do not require rasterization
  • Easy to do with GIS

3
Issues with Length Measurement
  • Measurements in GIS are often made on horizontal
    projections of objects
  • Length and area measured this way may be
    substantially lower than on the true
    three-dimensional surface

4
Be careful
  • 0.25:1, hypotenuse = 1.03
  • 0.5:1, hypotenuse = 1.12
  • 1:1, hypotenuse = 1.41
  • 2:1, hypotenuse = 2.24
  • 3:1, hypotenuse = 3.16
  • Not an issue if the gradient is uniform.
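The hypotenuse values on this slide follow from the Pythagorean theorem. A minimal sketch, assuming each ratio is a vertical rise over a horizontal run of 1 (the function name `surface_length` is made up for illustration):

```python
import math

def surface_length(rise: float, run: float = 1.0) -> float:
    """Length measured along the slope for a given vertical rise and horizontal run."""
    return math.hypot(rise, run)

# Compare the horizontal (projected) length of 1 with the length along the slope.
for rise in (0.25, 0.5, 1.0, 2.0, 3.0):
    print(f"{rise}:1 -> hypotenuse {surface_length(rise):.2f}")
```

The steeper the slope, the more a horizontal projection understates the true length.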

5
Manhattan Distance
  • Distance is computed between two points (cells)
    by moving only N-S or E-W.

Cell 1: 15, 15 (row, column)
Cell 2: 10, 20 (row, column)
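For the two cells on this slide, the Manhattan distance can be compared with the straight-line distance. A small sketch, assuming distances are counted in cell units (the helper names are illustrative):

```python
import math

def manhattan(cell_a, cell_b):
    """Sum of the row and column offsets (moves restricted to N-S and E-W)."""
    (r1, c1), (r2, c2) = cell_a, cell_b
    return abs(r1 - r2) + abs(c1 - c2)

def euclidean(cell_a, cell_b):
    """Straight-line distance between the two cells."""
    (r1, c1), (r2, c2) = cell_a, cell_b
    return math.hypot(r1 - r2, c1 - c2)

cell_1 = (15, 15)  # (row, column)
cell_2 = (10, 20)

print(manhattan(cell_1, cell_2))            # 10
print(round(euclidean(cell_1, cell_2), 2))  # 7.07
```

The Manhattan distance is never shorter than the Euclidean distance, which is one reason the two methods can give different nearest neighbor results for the same points.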
6
Distance Methods
  • Nearest-Neighbor Distance (NND)
  • Basic Statistics from Sample (Mean, SD)
  • Compare to Expected Population Mean, SD
  • Z statistic, R statistic
  • Assumes a normal distribution to compute
    expected values
  • Global estimate of pattern

7
Nearest Neighbor Distance
R < 1 (clustered)
R > 1 (dispersed)
8
Nearest Neighbor Analysis
Nearest neighbor analysis examines the distance
between each point and the closest point to it,
and then compares these distances to the values
expected for a random sample of points from a CSR
(complete spatial randomness) pattern. CSR rests
on two assumptions: 1) all places are equally
likely to be the recipient of a case (event), and
2) all cases are located independently of one
another. The mean nearest neighbor distance is
$$\bar{d} = \frac{1}{N} \sum_{i=1}^{N} d_i$$
where N is the number of points and d_i is the
nearest neighbor distance for point i.
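The mean nearest neighbor distance can be computed directly from point coordinates. A brute-force sketch in plain Python, with made-up sample points (for large data sets a spatial index would be the usual choice):

```python
import math

def nearest_neighbor_distances(points):
    """Return d_i for every point: the distance to its closest other point."""
    distances = []
    for i, (xi, yi) in enumerate(points):
        d_i = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        distances.append(d_i)
    return distances

def mean_nnd(points):
    """Mean nearest neighbor distance: the sum of d_i divided by N."""
    d = nearest_neighbor_distances(points)
    return sum(d) / len(d)

# Hypothetical coordinates, for illustration only.
points = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 0.0)]
print(nearest_neighbor_distances(points))
print(mean_nnd(points))
```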
9
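The expected value of the nearest neighbor
distance in a random pattern is
$$E(d_i) = 0.5\sqrt{\frac{A}{N}} + \left(0.0514 + \frac{0.041}{\sqrt{N}}\right)\frac{B}{N}$$
where A is the area and B is the length of the
perimeter of the study area. The variance is
$$\operatorname{Var}(d) = 0.070\,\frac{A}{N^{2}} + 0.037\,B\sqrt{\frac{A}{N^{5}}}$$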
10
And the Z statistic is
$$Z = \frac{\bar{d} - E(d_i)}{\sqrt{\operatorname{Var}(d)}}$$
This approach assumes that the study area is a
regular rectangle or square. The equations for
the expected mean and variance cannot be used for
irregularly shaped study areas. Area (A) is
calculated as (Xmax − Xmin) × (Ymax − Ymin),
where these values represent the study area
boundaries.
R statistic = observed mean d / expected d
R = 1: random; R → 0: clustered; R → 2: uniform
11
2 x 0.5:  A = 1, B = 5,  E(d_i) = 0.05277,  Var(d) = 8.85 x 10^-6
1 x 1:    A = 1, B = 4,  E(d_i) = 0.05222,  Var(d) = 8.48 x 10^-6
2 x 2:    E(d_i) = 0.10444
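The figures on this slide can be reproduced numerically. The sketch below makes two assumptions that the transcript does not state: the expected value and variance use Donnelly's edge-corrected approximation (a common form that takes area A and perimeter B), and N = 100 points. Under those assumptions the computed values match the slide; the function names are illustrative.

```python
import math

def expected_nnd(n, area, perimeter):
    """Expected mean nearest neighbor distance under CSR (assumed Donnelly form)."""
    return 0.5 * math.sqrt(area / n) + (0.0514 + 0.041 / math.sqrt(n)) * perimeter / n

def variance_nnd(n, area, perimeter):
    """Variance of the mean nearest neighbor distance (assumed Donnelly form)."""
    return 0.070 * area / n**2 + 0.037 * perimeter * math.sqrt(area / n**5)

def z_statistic(observed_mean, expected, variance):
    """Standardized difference between observed and expected mean distance."""
    return (observed_mean - expected) / math.sqrt(variance)

def r_statistic(observed_mean, expected):
    """Ratio of observed to expected mean distance (1 = random)."""
    return observed_mean / expected

N = 100  # assumption: not given on the slide
for label, area, perimeter in [("2 x 0.5", 1.0, 5.0), ("1 x 1", 1.0, 4.0), ("2 x 2", 4.0, 8.0)]:
    e = expected_nnd(N, area, perimeter)
    v = variance_nnd(N, area, perimeter)
    print(f"{label}: E = {e:.5f}, Var = {v:.2e}")
```

The same number of points in the same area gives a different expected distance when the perimeter changes, which is why the shape of the study area matters.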
12
Wilderness Campsites
Real-world study areas are complex and violate
the assumptions of most equations for expected
values.
13
  • Solution
  • Simulate randomization using Monte Carlo
    Methods.
  • Compare simulated distribution to observed.
  • If possible use the true area and perimeter
    to compute the expected value.
  • Software that does not ask for area/perimeter
    or a shapefile of the study area will assume a
    rectangle.
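The Monte Carlo approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular GIS package: the study area is described by a hypothetical `inside(x, y)` test (here an L-shaped region), random patterns are drawn by rejection sampling within its bounding box, and the observed mean nearest neighbor distance is ranked against 99 simulated values.

```python
import math
import random

def mean_nnd(points):
    """Mean distance from each point to its nearest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)

def random_points(n, bounds, inside, rng):
    """Rejection sampling: draw uniformly in the bounding box, keep points inside the area."""
    xmin, xmax, ymin, ymax = bounds
    points = []
    while len(points) < n:
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        if inside(x, y):
            points.append((x, y))
    return points

def monte_carlo_nnd(observed_points, bounds, inside, n_sim=99, seed=0):
    """Return observed mean NND, the simulated values, and a one-sided rank p-value for clustering."""
    rng = random.Random(seed)
    observed = mean_nnd(observed_points)
    n = len(observed_points)
    simulated = [mean_nnd(random_points(n, bounds, inside, rng)) for _ in range(n_sim)]
    rank = sum(1 for s in simulated if s <= observed)
    return observed, simulated, (rank + 1) / (n_sim + 1)

def in_l_shape(x, y):
    """Hypothetical irregular study area: a 2 x 2 square without its upper-right quadrant."""
    return not (x > 1.0 and y > 1.0)

# Made-up, tightly clustered points for illustration.
clustered = [(0.5 + 0.001 * i, 0.5 + 0.001 * (i % 3)) for i in range(20)]
observed, simulated, p = monte_carlo_nnd(clustered, (0.0, 2.0, 0.0, 2.0), in_l_shape)
print(observed, min(simulated), p)
```

Because the random patterns are generated inside the actual study area, no formula for the expected value is needed and the rectangle assumption is avoided.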

14
Autotheft Within City
15
Autotheft - Downtown
16
Autotheft - Neighborhood
17
Nearest Neighbor - ArcMap
Method     Area        Observed NND  Expected NND  Z Score  P-Value
Euclidean  1668437432  278           729           -33.1    0.000
Euclidean  943000863   278           548           -26.3    0.000
Manhattan  1668437432  399           729           -28.6    0.000
Manhattan  57850697    227           235           -1.1     0.284
Manhattan  10743164    251           223           1.8      0.071
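The R statistic (observed mean distance divided by expected mean distance) is not listed in this table, but it follows directly from the two NND columns. A short sketch using the table's values:

```python
# (method, area, observed NND, expected NND) as listed in the table
rows = [
    ("Euclidean", 1668437432, 278, 729),
    ("Euclidean", 943000863, 278, 548),
    ("Manhattan", 1668437432, 399, 729),
    ("Manhattan", 57850697, 227, 235),
    ("Manhattan", 10743164, 251, 223),
]

def r_statistic(observed, expected):
    """R < 1 suggests clustering, R near 1 randomness, R > 1 dispersion."""
    return observed / expected

r_values = [round(r_statistic(obs, exp), 3) for _, _, obs, exp in rows]
for (method, area, _, _), r in zip(rows, r_values):
    print(f"{method:<10} area={area:<11} R={r}")
```

The result depends strongly on the area supplied: for the same theft locations, R moves from well below 1 toward 1 as the analysis area shrinks.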
18
Distance Methods
  • G Function (Revised NND)
  • Same measurements as NND
  • Analyzed using a CDF, compared to the expected CDF
  • Expected CDF can be theoretical or
    generated, E(G(d))
  • d statistic (max distance between observed
    and expected CDF)
  • Can test the d statistic with the
    Kolmogorov-Smirnov test
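A sketch of the G function and the d statistic, assuming the theoretical CSR form G(d) = 1 − exp(−λπd²) with no edge correction as the expected CDF. The function names are illustrative, and the input is a list of nearest neighbor distances computed elsewhere.

```python
import math

def g_observed(nn_distances, d):
    """Empirical CDF: share of points whose nearest neighbor is within distance d."""
    return sum(1 for x in nn_distances if x <= d) / len(nn_distances)

def g_expected_csr(d, density):
    """Theoretical G(d) under CSR for a given density (points per unit area)."""
    return 1.0 - math.exp(-density * math.pi * d * d)

def d_statistic(nn_distances, density):
    """Largest vertical gap between the observed and expected CDFs."""
    xs = sorted(nn_distances)
    n = len(xs)
    gap = 0.0
    for k, x in enumerate(xs):
        theo = g_expected_csr(x, density)
        # Check the gap just below and just above each step of the empirical CDF.
        gap = max(gap, abs((k + 1) / n - theo), abs(k / n - theo))
    return gap

# Made-up distances for illustration.
nn = [0.8, 1.0, 1.1, 1.3, 2.0, 2.4]
print(g_observed(nn, 1.2), d_statistic(nn, density=0.1))
```

Nearest neighbor distances within one pattern are not independent, so the Kolmogorov-Smirnov critical values are only approximate here; comparing against a simulated expected CDF avoids that issue.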

19
G Function
1/12 = 0.083
From O'Sullivan and Unwin, Geographic Information
Analysis
20
Distance Methods
  • F Function
  • Similar to G but measures distance for a
    set of random points
  • Also uses CDF and same Expected Distribution
    Function as G
  • Harder to Interpret!!!
  • I have never used it. I also do not like it!
  • Both G and F Functions have edge and area
    problems. Better to use a generated expected
    distribution
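For comparison with G, the F function can be sketched as below. The assumptions here are a rectangular study area and uniformly drawn random locations; `f_observed` is an illustrative name, not a library function.

```python
import math
import random

def f_observed(events, bounds, d, n_random=1000, seed=0):
    """Share of random locations whose nearest event lies within distance d."""
    rng = random.Random(seed)
    xmin, xmax, ymin, ymax = bounds
    hits = 0
    for _ in range(n_random):
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        nearest = min(math.hypot(x - ex, y - ey) for ex, ey in events)
        if nearest <= d:
            hits += 1
    return hits / n_random

# Made-up events in a unit square, for illustration only.
events = [(0.2, 0.2), (0.25, 0.2), (0.8, 0.7)]
unit_square = (0.0, 1.0, 0.0, 1.0)
for d in (0.1, 0.3, 0.6):
    print(d, f_observed(events, unit_square, d))
```

G measures event-to-event distances while F measures empty-space distances, so in a clustered pattern G rises quickly and F rises slowly.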

21
G and F Functions
Clustered
Evenly Spaced
From O'Sullivan and Unwin, Geographic Information
Analysis
22
Distance Methods
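  • K Function (Ripley, 1976)
  • Statistic is based on the sum of all the
    points within a distance d of each observation:
    $$K(d) = \frac{1}{n\lambda} \sum_{i=1}^{n} \#\left[S \in C(s_i, d)\right]$$
  • where n = number of points
  • λ = density (n/area)
  • C(s_i, d) = a circle with radius d centered at
    point s_i; #[S ∈ C(s_i, d)] is the number of
    other points inside that circle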
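A naive estimator that follows this definition is sketched below, assuming no edge correction (most software applies one) and using four made-up points at the corners of a unit square.

```python
import math

def ripley_k(points, area, d):
    """Naive K(d): neighbors within d, summed over all points, divided by n * density."""
    n = len(points)
    density = n / area
    count = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count every other point inside the circle of radius d around point i.
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                count += 1
    return count / (n * density)

corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(ripley_k(corners, area=1.0, d=0.5))  # 0.0
print(ripley_k(corners, area=1.0, d=1.0))  # 0.5
print(ripley_k(corners, area=1.0, d=1.5))  # 0.75
```

K stays at 0 until d reaches the shortest inter-point distance, then rises in steps as more neighbors fall inside each circle.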

23
Ripley's K counts the number of points found
within distance r of each point. The maximum r
should be about ½ the shorter dimension of the
extent of the input points. If K increases more
quickly than expected, the points are clustered.
If K increases more slowly than expected, the
points are dispersed.
24
Distance Methods
  • Expected K(d)
  • $E(K(d)) = \lambda \pi d^{2} / \lambda = \pi d^{2}$
  • $L(d) = (K(d)/\pi)^{1/2}$
  • $E(L(d)) = d$

25
K Function
Clustered
L(d)
Evenly Spaced
L(d)
From O'Sullivan and Unwin, Geographic Information
Analysis
26
(No Transcript)
27
There are a total of 32 points in this analysis.
New Mexico is approximately 500 km per side, so
we set our maximum study distance at 250 km.
We choose 25 increments so that we calculate
the observed L(d) and confidence interval
every 10 km. 99 permutations are used to
create the confidence envelope in order to test
the null hypothesis at approximately the α = 0.01
level.
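The setup described here (32 points, 25 distance increments of 10 km, 99 permutations) can be sketched as a simulation envelope for L(d). The sketch assumes a 500 km square study area, uniform random points, and no edge correction; the function names are illustrative and the software used for the original analysis is not known.

```python
import math
import random

def l_values(points, area, distances):
    """L(d) = sqrt(K(d) / pi) for each d, using a naive K without edge correction."""
    n = len(points)
    density = n / area
    pairs = [
        math.hypot(x1 - x2, y1 - y2)
        for i, (x1, y1) in enumerate(points)
        for (x2, y2) in points[i + 1:]
    ]
    values = []
    for d in distances:
        # Each unordered pair within d is counted once from each of its two points.
        count = 2 * sum(1 for p in pairs if p <= d)
        values.append(math.sqrt(count / (n * density) / math.pi))
    return values

def csr_envelope(n, side, distances, n_sim=99, seed=0):
    """Minimum and maximum simulated L(d) over n_sim random patterns in a square."""
    rng = random.Random(seed)
    lows = [math.inf] * len(distances)
    highs = [-math.inf] * len(distances)
    for _ in range(n_sim):
        sim = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]
        for k, value in enumerate(l_values(sim, side * side, distances)):
            lows[k] = min(lows[k], value)
            highs[k] = max(highs[k], value)
    return lows, highs

def classify(observed, low, high):
    """Compare an observed L(d) with the simulated envelope at the same distance."""
    if observed < low:
        return "dispersed"
    if observed > high:
        return "clustered"
    return "consistent with CSR"

distances = [10.0 * k for k in range(1, 26)]  # 10 km, 20 km, ..., 250 km
lows, highs = csr_envelope(n=32, side=500.0, distances=distances)
for d, lo, hi in zip(distances[:5], lows, highs):
    print(f"{d:5.0f} km: envelope {lo:7.2f} to {hi:7.2f}")
```

An observed L(d) can then be passed to `classify` together with the envelope bounds at the same distance: values below the minimum point to dispersion and values above the maximum to clustering.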
28
Figure 2 Graph of K-Function Results
29
A graph of the K-function results is shown below.
The observed L(d) is 0 at 10 km and 20 km because
the closest pair of points is approximately 29 km
apart. At a distance of 30 km, the observed L(d)
falls within the generated confidence interval.
However, for distances between 40 km and 90 km
the observed L(d) lies outside the confidence
interval, so we can reject the null hypothesis of
CSR. Because the observed L(d) is less than the
minimum L(d) of the confidence envelope, the
points show a statistically significant dispersed
or regular distribution.