Title: Spatial Modeling Kernel Density Estimation
1Spatial Modeling Kernel Density Estimation
2Histogram Density Interpolation
- Continuous variable is divided into bins of a specified interval, with counts falling into each bin.
- Histogram is assumed to represent a smoothed distribution, that is, a density function.
- Estimating a smooth density function is done by linking the center points of the intervals with a line.
- Causes three statistical problems:
  - Information is discarded by center point assignment.
  - Creates a discontinuous density function.
  - Dependent on arbitrarily specified bin sizes.
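The binning and center-linking steps above can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular package; the function names `histogram_density` and `frequency_polygon` and the sample values are hypothetical.

```python
def histogram_density(values, start, width, n_bins):
    """Bin `values` into equal-width bins and return (centers, densities)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)  # index of the bin containing v
        if 0 <= k < n_bins:
            counts[k] += 1
    total = sum(counts)
    centers = [start + (k + 0.5) * width for k in range(n_bins)]
    # Divide by (total * width) so the bars integrate to 1.
    densities = [c / (total * width) if total else 0.0 for c in counts]
    return centers, densities


def frequency_polygon(centers, densities, x):
    """Estimate density at x by linking adjacent bin centers with straight lines."""
    if x <= centers[0]:
        return densities[0]
    if x >= centers[-1]:
        return densities[-1]
    for k in range(len(centers) - 1):
        x0, x1 = centers[k], centers[k + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return densities[k] + t * (densities[k + 1] - densities[k])


# Hypothetical sample data, binned with two different bin widths.
values = [0.2, 0.4, 1.1, 1.9, 2.5, 3.7]
centers, densities = histogram_density(values, start=0.0, width=1.0, n_bins=4)
coarse_centers, coarse_densities = histogram_density(values, start=0.0, width=2.0, n_bins=2)
estimate = frequency_polygon(centers, densities, 2.0)
```

Running the same data through both bin widths gives different density curves, which is the dependence on arbitrarily specified bin sizes listed above.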
4Kernel Density Interpolation
- Overcomes the first two statistical problems. Bin size, or bandwidth, is still a problem.
- Involves placing a symmetrical surface over each central reference point; these are the cell centroids of a grid overlaid on the study area.
- Evaluates and sums the distance from all points to the central reference point.
- The functions of all the symmetrical surfaces are summed over each other to produce an estimate at that central reference point.
- Was developed as an alternative method for estimating the density of a frequency histogram.
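The grid-and-sum procedure described above can be sketched as follows. This is a minimal sketch under stated assumptions: it uses an unweighted bivariate normal surface, and the names `normal_kernel` and `kernel_density_grid`, the grid layout arguments, and the sample point are all hypothetical.

```python
import math


def normal_kernel(d, h):
    """Assumed bivariate normal surface: height at distance d for bandwidth h."""
    return math.exp(-(d * d) / (2.0 * h * h)) / (2.0 * math.pi * h * h)


def kernel_density_grid(points, x_min, y_min, cell_size, n_cols, n_rows, h):
    """Sum the kernel contribution of every point at each cell centroid."""
    grid = []
    for r in range(n_rows):
        cy = y_min + (r + 0.5) * cell_size  # centroid y of this row
        row = []
        for c in range(n_cols):
            cx = x_min + (c + 0.5) * cell_size  # centroid x of this column
            total = 0.0
            for px, py in points:
                d = math.hypot(px - cx, py - cy)  # distance point -> centroid
                total += normal_kernel(d, h)
            row.append(total)
        grid.append(row)
    return grid


# One hypothetical incident on a 2 x 2 grid of unit cells.
grid = kernel_density_grid([(0.5, 0.5)], 0.0, 0.0, 1.0, 2, 2, h=1.0)
```

The cell whose centroid coincides with the incident receives the largest value, and the estimate falls off smoothly with distance rather than jumping at bin edges.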
6(Figure values: .51, .20, .17, .13, .01)
7(Figure panels: Uniform, Quartic, Normal, Negative Exponential, Triangular)
8Normal Density Functions
- Functional Form
- Function extends in all directions over the entire study area.
- All points are factored in.
Weight at point location i.
Intensity at point location i.
Bandwidth at reference point i (Standard
Deviation).
Distance weight between incident i and
reference point j.
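The functional form itself did not survive in this transcript. A sketch of the bivariate normal kernel that matches the four labels above is given here; the symbols are assumptions introduced for illustration: $W_i$ for the weight, $I_i$ for the intensity, $h$ for the bandwidth, $d_{ij}$ for the distance between incident $i$ and reference point $j$, and $g(x_j)$ for the estimate at reference point $j$.

```latex
g(x_j) = \sum_{i} W_i \, I_i \, \frac{1}{2\pi h^{2}}
         \exp\!\left( -\frac{d_{ij}^{2}}{2h^{2}} \right)
```

Because the exponential term never reaches zero, every incident contributes something to every reference point, which is why all points are factored in.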
9Quartic Function
- Within radius
- Outside radius
Weight at point location i.
Intensity at point location i.
Bandwidth at reference point i (Standard
Deviation).
Distance weight between incident i and
reference point j.
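The two cases listed above lost their formulas in the transcript. A sketch of the quartic kernel consistent with the labels is shown here, using assumed symbols: $W_i$ weight, $I_i$ intensity, $h$ bandwidth (the radius), $d_{ij}$ distance between incident $i$ and reference point $j$, and $k(d_{ij})$ for the contribution of one incident.

```latex
g(x_j) = \sum_{i} W_i \, I_i \, k(d_{ij}), \qquad
k(d_{ij}) =
\begin{cases}
\dfrac{3}{\pi h^{2}} \left( 1 - \dfrac{d_{ij}^{2}}{h^{2}} \right)^{2}, & d_{ij} \le h \quad \text{(within radius)} \\[2ex]
0, & d_{ij} > h \quad \text{(outside radius)}
\end{cases}
```

Unlike the normal function, only incidents inside the radius contribute to the estimate at a reference point.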
10Triangular Function
- Within radius
- Outside radius
Constant is set to 0.25.
Bandwidth at reference point i (Standard
Deviation).
Distance weight between incident i and
reference point j.
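The formulas for the two cases are missing from the transcript. A sketch of a triangular (linearly declining) kernel consistent with the labels above follows; the symbols $K$ for the constant, $h$ for the bandwidth, and $d_{ij}$ for the distance are assumptions.

```latex
k(d_{ij}) =
\begin{cases}
K - \dfrac{K}{h} \, d_{ij}, & d_{ij} \le h \quad \text{(within radius)} \\[2ex]
0, & d_{ij} > h \quad \text{(outside radius)}
\end{cases}
\qquad K = 0.25
```

The contribution starts at $K$ at the reference point and falls linearly to zero at the edge of the radius; the estimate at reference point $j$ is the sum of $k(d_{ij})$ over all incidents $i$.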
11Negative Exponential Function
- Within radius
- Outside radius
Constant is set to 1.
Exponent is set to 3.
Distance weight between incident i and
reference point j.
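The formulas for the two cases are missing from the transcript. A sketch of a negative exponential kernel consistent with the labels above follows; the symbols $A$ for the constant, $K$ for the exponent, $h$ for the radius, and $d_{ij}$ for the distance are assumptions.

```latex
k(d_{ij}) =
\begin{cases}
A \, e^{-K d_{ij}}, & d_{ij} \le h \quad \text{(within radius)} \\[1ex]
0, & d_{ij} > h \quad \text{(outside radius)}
\end{cases}
\qquad A = 1, \; K = 3
```

The contribution drops off very quickly near the reference point; the estimate at reference point $j$ is the sum of $k(d_{ij})$ over all incidents $i$.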
12Uniform Function
- Within radius
- Outside radius
Constant is set to 0.1.
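The formulas for the two cases are missing from the transcript. A sketch of a uniform (flat) kernel consistent with the label above follows; the symbols $K$ for the constant, $h$ for the radius, and $d_{ij}$ for the distance between incident $i$ and reference point $j$ are assumptions.

```latex
k(d_{ij}) =
\begin{cases}
K, & d_{ij} \le h \quad \text{(within radius)} \\[1ex]
0, & d_{ij} > h \quad \text{(outside radius)}
\end{cases}
\qquad K = 0.1
```

Every incident inside the radius counts equally regardless of its distance; the estimate at reference point $j$ is the sum of $k(d_{ij})$ over all incidents $i$.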
13Size and Shape of Bandwidth
- Spatial effects/autocorrelation are to be captured.
- Too large a bandwidth hides local clustering trends by producing one large, combined hot spot.
- Too small a bandwidth may produce too many peaks and valleys, possibly indicating false hot spots or cold spots.
- Use results to determine size and shape from:
  - Previous Results, such as Moran Correlogram, Nearest Neighbor Index or Ripley's K.
  - Theoretical Guidelines.
  - Knowledge of the Environment.
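The effect of bandwidth size can be illustrated by counting the peaks of a one-dimensional density estimate at several bandwidths. This is a minimal sketch assuming a normal kernel; the names `kde_1d` and `count_peaks`, the evaluation range, and the two-cluster sample data are hypothetical.

```python
import math


def kde_1d(points, h, x):
    """One-dimensional normal kernel density estimate at x with bandwidth h."""
    norm = len(points) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-((x - p) ** 2) / (2.0 * h * h)) for p in points) / norm


def count_peaks(points, h, lo=-5.0, hi=12.0, step=0.01):
    """Count strict local maxima of the estimate on an evenly spaced grid."""
    n = int(round((hi - lo) / step))
    ys = [kde_1d(points, h, lo + i * step) for i in range(n + 1)]
    return sum(1 for i in range(1, n) if ys[i] > ys[i - 1] and ys[i] > ys[i + 1])


# Two hypothetical clusters of three incidents each.
incidents = [1.0, 1.2, 1.4, 5.0, 5.2, 5.4]
peaks_small = count_peaks(incidents, h=0.05)   # very narrow bandwidth
peaks_medium = count_peaks(incidents, h=0.3)   # bandwidth near cluster size
peaks_large = count_peaks(incidents, h=3.0)    # very wide bandwidth
```

With these inputs the narrow bandwidth treats every incident as its own peak, the wide bandwidth merges both clusters into one combined hot spot, and the intermediate value recovers the two clusters.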
16Number of Points in Bandwidth
- Spatial process/autocorrelation are to be captured.
- Too many produces a lot of hot spots, with several possibly being false (random variation).
- Too few produces an even surface with nothing distinguishable as a hot spot.
- Use results to determine minimum number of points from:
  - Previous Results, such as Nearest Neighbor Index, Ripley's K or count distributions from areal units.
  - Theoretical Guidelines.
  - Knowledge of the Environment.
17Kernel Interpolation Input
19Prue NNI Base Comparison
20Prue K Statistic Base Comparison
21Type of Bandwidth (Single)
- Fixed
  - A non-changing distance interval must be specified in units of measurement.
  - Should be based on some distance from theoretical guidelines or empirical evidence.
  - Assumes stationary spatial processes.
- Variable (for Dual Interpolation)
- Adaptive
  - An adjusted distance based on capturing the minimum number of points.
  - Improves statistical precision by being narrower in areas with a higher concentration of incidents and wider in areas with more dispersed incidents.
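One common way to realize an adaptive bandwidth is to widen the radius at each reference point until it captures the minimum number of points. The sketch below assumes that interpretation; the function name `adaptive_bandwidth` and the sample coordinates are hypothetical.

```python
import math


def adaptive_bandwidth(ref, points, min_points):
    """Distance from `ref` to its min_points-th nearest incident."""
    if min_points < 1 or min_points > len(points):
        raise ValueError("min_points must be between 1 and the number of points")
    dists = sorted(math.hypot(px - ref[0], py - ref[1]) for px, py in points)
    return dists[min_points - 1]


# Hypothetical incidents: a tight cluster near the origin plus two outliers.
incidents = [(0.0, 0.1), (0.1, 0.0), (0.0, -0.1), (5.0, 0.0), (0.0, 6.0)]
bw_dense = adaptive_bandwidth((0.0, 0.0), incidents, min_points=3)
bw_sparse = adaptive_bandwidth((5.0, 5.0), incidents, min_points=3)
```

The reference point inside the cluster gets a much narrower bandwidth than the one in the sparse area, which matches the narrower-where-concentrated behavior described above.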
22Density Calculations (Single)
- Absolute: Estimates of each cell are re-scaled so that the sum of the densities over all cells is equal to the total number of observations; estimates are points per grid cell. Used for comparisons between crime types or the same crime type in different time periods.
- Relative: Absolute densities of each cell are divided by the area of the cell; estimates are points per square unit of measurement. Used for expression in units across the study area.
- Probabilities: Absolute density is divided by the total number of observations within the grid; estimates are the likelihood of an incident occurring within a cell.
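The three output types follow from one another by simple rescaling, as this minimal sketch shows. The function name `density_outputs` and the sample grid are hypothetical, and `raw` stands for the unscaled kernel sums per cell.

```python
def density_outputs(raw, n_points, cell_area):
    """Return (absolute, relative, probability) grids from raw kernel sums."""
    total = sum(sum(row) for row in raw)
    # Absolute: rescale so all cells together sum to the number of observations.
    scale = n_points / total if total else 0.0
    absolute = [[v * scale for v in row] for row in raw]
    # Relative: points per square unit of measurement.
    relative = [[v / cell_area for v in row] for row in absolute]
    # Probability: share of all observations expected in each cell.
    probability = [[v / n_points for v in row] for row in absolute]
    return absolute, relative, probability


# Hypothetical 2 x 2 grid of raw kernel sums for 16 incidents, cells of area 4.
absolute, relative, probability = density_outputs([[1.0, 3.0], [2.0, 2.0]], 16, 4.0)
```

The absolute grid sums to the number of incidents and the probability grid sums to 1, so the choice among the three changes only the units, not the shape of the surface.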
23Single Kernel Density Estimation
24Single Kernel Interpolation Input
25Single Kernel Interpolation Output
29Dual Kernel Density Estimation
30Dual Kernel Interpolation Input
31Type of Bandwidth (Dual)
- Fixed (same as for Single Interpolation)
- Variable
  - Each file (Primary and Secondary) has different intervals.
  - Divides cell value from primary file by value from same cell in secondary file. However, as a value in a cell from the secondary file approaches zero, the quotient grows without bound, providing an overestimation for that cell.
- Adaptive (same as for Single Interpolation)
32Density Calculations (Dual)
- Ratio: The primary file cell values are divided by the secondary file cell values to produce a risk ratio.
- Log Ratio: Natural logarithm of the density ratio, for a set of grid cells that have a very skewed distribution of density values. This will mute the over-estimations, or spikes.
- Absolute Difference: The secondary file cell values are subtracted from the primary file cell values, producing a differential. Also used for comparing grids that are created with and without a weight or intensity variable to produce more precise estimates when clustering occurs in a spatial process.
33Density Calculations (Dual)
- Relative Difference: Standardizes the values from the primary and secondary cells and subtracts secondary cell relative density from the primary relative density. Used for comparing the relative density change between two periods of the same crime type.
- Sum: Adds the values from primary and secondary cells. Used for combining two density surfaces to show the additive effect of two different crime types.
- Relative Sum: Standardizes the values from the primary and secondary cells and adds the secondary cell relative density to the primary relative density. Used for identifying the total effect of two different crime types.
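The dual calculations can be sketched cell by cell. This is a minimal sketch with several assumptions: "standardizes" is read as dividing by cell area (the relative density from the single-kernel calculations), differences are taken as primary minus secondary, and cells where a ratio is undefined return `None`. The names `dual_cell` and `dual_grid` and the sample values are hypothetical.

```python
import math


def dual_cell(p, s, cell_area, method):
    """Combine one primary (p) and one secondary (s) absolute density."""
    if method == "ratio":
        return p / s if s > 0 else None  # undefined when secondary is zero
    if method == "log_ratio":
        return math.log(p / s) if p > 0 and s > 0 else None
    if method == "absolute_difference":
        return p - s
    if method == "relative_difference":
        return p / cell_area - s / cell_area
    if method == "sum":
        return p + s
    if method == "relative_sum":
        return p / cell_area + s / cell_area
    raise ValueError("unknown method: " + method)


def dual_grid(primary, secondary, cell_area, method):
    """Apply dual_cell to matching cells of two flattened grids."""
    return [dual_cell(p, s, cell_area, method) for p, s in zip(primary, secondary)]


# Hypothetical flattened grids with a zero in the secondary file.
primary = [4.0, 2.0, 1.0]
secondary = [2.0, 2.0, 0.0]
ratio = dual_grid(primary, secondary, 2.0, "ratio")
log_ratio = dual_grid(primary, secondary, 2.0, "log_ratio")
rel_diff = dual_grid(primary, secondary, 2.0, "relative_difference")
```

Returning `None` for a zero secondary cell is one design choice for handling the spikes that arise when the denominator approaches zero; the log ratio compresses large ratios that remain.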
34Dual Kernel Interpolation Output
36Population at Risk Burglary
37Population at Risk Burglary
39Exercise
- Calculate several single densities for different crime types using different bandwidths, minimum number of points and kernel types. Each of these should be based on what was found in previous analysis from descriptives and/or from NNI, Ripley's K or Moran Correlogram.
- Examine the differences and identify the parameters that make sense for further analysis.
- Repeat this process for any base data that will be used as a baseline.
- Calculate several dual densities for different crime types with the input parameters from the single density analysis.