Spatial Data Analysis: Areas II

1
Spatial Data Analysis: Areas II
Exploratory Spatial Data Analysis
Ifgi, Muenster, Fall School 2005
  • Gilberto Câmara
  • INPE, Brazil

2
Data-Driven Approaches
  • Exploratory spatial data analysis (ESDA)
  • Point pattern analysis
  • Indices of spatial association
  • Compare the observed pattern in the data (e.g.,
    locations in point pattern analysis, values at
    locations in spatial autocorrelation) to one in
    which space is irrelevant.
  • The second common aspect is that the spatial
    pattern, spatial structure, or form for the
    spatial dependence are derived from the data
    only.

3
Spatial Autocorrelation
  • Complicated name, simple concept...
  • Expresses the amount of spatial dependence
  • How much proximity matters in spatial data
  • Correlation is the key notion
  • It indicates how much two properties vary
    together
  • Correlation in space
  • Is a variable in a location correlated with its
    values in nearby places?
  • Spatial autocorrelation

4
Positive, High Correlation
5
Sometimes we need to transform the data
Scatter plots: (a) Y versus PORC3_NR (percentage
of large farms, by number); (b) log10 Y versus
log10 PORC3_NR.
Predicted versus observed plots: (a) model with
untransformed variables, R2 = 0.61; (b) Model 7,
R2 = 0.85.
6
Log vs. linear correlation
  • $Y = aX$: linear correlation
  • $Y = X^a$, or $\log Y = a \log X$: log correlation
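
As a small illustration of why a log transform can help, the sketch below compares the Pearson correlation before and after taking logarithms. The data are hypothetical (an exact power law with exponent 2.5) and the helper name `pearson` is ours, not from the slides.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical data following a power law Y = X^a with a = 2.5
xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
ys = [x ** 2.5 for x in xs]

r_linear = pearson(xs, ys)
r_log = pearson([math.log10(x) for x in xs],
                [math.log10(y) for y in ys])

print(f"linear scale: r = {r_linear:.3f}")
print(f"log-log scale: r = {r_log:.3f}")
```

On the log-log scale the relation is exactly linear, so the correlation reaches 1; on the original scale it is lower.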

7
No Correlation
8
Is this data spatially autocorrelated?
9
Spatial Randomness
  • Null hypothesis: no spatial autocorrelation
  • Spatial randomness
  • values observed at a location do not depend on
    values observed at neighboring locations
  • observed spatial pattern of values is equally
    likely as any other spatial pattern
  • the location of values may be altered without
    affecting the information content of the data

10
Random or Clustered?
Columbus homicide data (source: Luc Anselin)
11
Random or Clustered?
Columbus homicide data (source: Luc Anselin)
12
Random or Clustered?
Columbus homicide data (source: Luc Anselin)
13
Exploratory Spatial Data Analysis
  • Visualization of spatial data
  • Global Indicators of Spatial Autocorrelation
  • Local Indicators of Spatial Autocorrelation
    (LISA)

14
Visualization of Area Patterns
  • Grouping
  • Equal intervals
  • Quantiles
  • Standard deviation
  • Be careful!
  • Color maps can lead to wrong interpretations

Breast cancer in England (1985-1989)
Source: Bailey and Gatrell, 1995
15
Equal-Interval Visualization
  • Defined by maximum and minimum values.
  • Shows data dispersion.
  • Outliers can mask differences.

Source: Bailey and Gatrell, 1995
16
Quantiles
  • Each group has the same number of elements
  • Based on rank ordering
  • e.g., best 25% and worst 25%

Source: Bailey and Gatrell, 1995
17
Standard Deviations
  • Dispersion around a mean value
  • Breaks at 1 stdev, 1/2 stdev
  • Shows the statistical behaviour
  • Best suited to normally distributed data

Source: Bailey and Gatrell, 1995
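
The three grouping schemes (equal intervals, quantiles, standard deviations) can be compared on a toy example. This is a sketch under our own assumptions: the function names and the `rates` values are hypothetical, and the standard-deviation breaks are placed at plus or minus 1/2 and 1 standard deviation around the mean.

```python
import statistics

def equal_interval_breaks(values, k):
    """Breaks for k classes of equal width between minimum and maximum."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]

def quantile_breaks(values, k):
    """Breaks for k classes holding roughly the same number of elements."""
    return statistics.quantiles(values, n=k, method="inclusive")

def stdev_breaks(values, steps=(-1.0, -0.5, 0.5, 1.0)):
    """Breaks at multiples of the standard deviation around the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [mean + s * sd for s in steps]

# Hypothetical rates with a single outlier (100)
rates = [1, 2, 3, 4, 5, 6, 7, 8, 100]

print(equal_interval_breaks(rates, 4))  # [25.75, 50.5, 75.25]
print(quantile_breaks(rates, 4))        # [3.0, 5.0, 7.0]
print(stdev_breaks(rates))
```

With equal intervals, eight of the nine values fall into the first class, which is the "outliers can mask differences" effect; the quantile breaks split the same data evenly.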
18
Visualization
Source: Bailey and Gatrell, 1995
19
Visualization
Source: Bailey and Gatrell, 1995
20
Spatial Proximity Matrix
  • Matrix $W$ ($n \times n$), where each element $w_{ij}$
    represents a measure of nearness between $O_i$ and
    $O_j$
  • Criteria
  • $w_{ij} = 1$ if $O_i$ touches $O_j$
  • $w_{ij} = 1$ if $\mathrm{distance}(O_i, O_j) < h$
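
A minimal sketch of the distance criterion, assuming each object is represented by a centroid and that an object is not counted as its own neighbour (both are our assumptions, as are the function names and coordinates).

```python
import math

def distance_weights(centroids, h):
    """Binary proximity matrix: w[i][j] = 1 if distance(O_i, O_j) < h."""
    n = len(centroids)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Assumption: the diagonal stays 0 (no self-neighbours).
            if i != j and math.dist(centroids[i], centroids[j]) < h:
                w[i][j] = 1
    return w

def row_standardize(w):
    """Divide each row by its sum; rows without neighbours stay at zero."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

# Hypothetical centroids: three areas in a row and one isolated area
centroids = [(0, 0), (1, 0), (2, 0), (5, 0)]
w = distance_weights(centroids, h=1.5)
for row in w:
    print(row)
```

The last area has no neighbour within h, so its row is all zeros; choosing h (or switching to the "touches" criterion) changes which areas end up isolated.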

21
Moving Averages
  • Local smoothing of attribute values
  • $\hat{\mu}_i = \dfrac{\sum_{j=1}^{n} w_{ij}\, y_j}{\sum_{j=1}^{n} w_{ij}}$
  • where
  • $w_{ij}$ is the spatial weights matrix.
  • $y_i$ is the attribute value for each area.
  • $n$ is the number of areas
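
A minimal sketch of the local mean, assuming it is the weighted average of the neighbouring values, mu_i = sum_j w_ij y_j / sum_j w_ij. The weights and attribute values below are hypothetical (four areas in a chain).

```python
def local_means(w, y):
    """Spatial moving average: mu_i = sum_j w_ij * y_j / sum_j w_ij."""
    means = []
    for w_i in w:
        total = sum(w_i)
        if total == 0:
            means.append(None)  # area without neighbours: no local mean
        else:
            means.append(sum(wij * yj for wij, yj in zip(w_i, y)) / total)
    return means

# Hypothetical chain of four areas: 0 - 1 - 2 - 3
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
y = [10.0, 20.0, 60.0, 40.0]

print(local_means(w, y))  # [20.0, 35.0, 30.0, 60.0]
```

Area 2 has the value 60 but a local mean of 30; large gaps of this kind are what the bar-graph comparison of attribute and local mean is meant to expose.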

22
Moving Averages
  • Proportion of population aged 70 or older, São
    Paulo, 1991

23
Moving Averages using Bar Graphs
Regions where there is a large difference between
the original value and the local mean indicate
places of spatial transition.
Legend: Attribute; Local mean
24
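Moran Scatterplot: Values x Local Means
Q1 (values +, means +) and Q2 (values -, means -):
locations of positive spatial association
("I'm similar to my neighbours").
Q3 (values +, means -) and Q4 (values -, means +):
locations of negative spatial association
("I'm different from my neighbours").
[Figure: scatterplot of z (horizontal axis) against
Wz (vertical axis), both centred at 0. Q1 is the
upper-right quadrant, Q4 the upper-left, Q2 the
lower-left and Q3 the lower-right.]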
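
The quadrant assignment can be sketched as follows, using the labels Q1 (high value, high local mean), Q2 (low, low), Q3 (high, low) and Q4 (low, high). The data are hypothetical, and the sketch assumes every area has at least one neighbour and treats a deviation of exactly zero as "high".

```python
def moran_quadrants(w, y):
    """Label each area by the signs of z_i and of its neighbours' mean z."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]
    labels = []
    for i, w_i in enumerate(w):
        # Local mean of the deviations (assumes at least one neighbour).
        wz = sum(wij * zj for wij, zj in zip(w_i, z)) / sum(w_i)
        if z[i] >= 0 and wz >= 0:
            labels.append("Q1 (HH)")
        elif z[i] < 0 and wz < 0:
            labels.append("Q2 (LL)")
        elif z[i] >= 0:
            labels.append("Q3 (HL)")
        else:
            labels.append("Q4 (LH)")
    return labels

# Hypothetical chain of four areas: 0 - 1 - 2 - 3
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
y = [10.0, 12.0, 2.0, 1.0]

print(moran_quadrants(w, y))
# ['Q1 (HH)', 'Q3 (HL)', 'Q4 (LH)', 'Q2 (LL)']
```

Areas 1 and 2 sit on the boundary between the high and the low group, which is why they land in the negative-association quadrants.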
25
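Moran Scatterplot Map
São Paulo, old-aged population
[Figure: map of the Moran scatterplot quadrants
(z on the horizontal axis, Wz on the vertical axis):
Q1 = HH, Q2 = LL, Q3 = HL, Q4 = LH]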
26
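Indicators of spatial autocorrelation
  • Generic formulation

global: $\Gamma = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, a_{ij}$
local: $\Gamma_i = \sum_{j=1}^{n} w_{ij}\, a_{ij}$
where
$w_{ij}$: spatial proximity between i and j
$a_{ij}$: measured relation between object and its
neighbors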
27
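Indicators of spatial autocorrelation
$\Gamma = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, a_{ij}$ (global), $\Gamma_i = \sum_{j=1}^{n} w_{ij}\, a_{ij}$ (local)
Moran (covariance): $a_{ij} = z_i z_j = (x_i - \bar{x})(x_j - \bar{x})$
Geary (variance): $a_{ij} = (z_i - z_j)^2 = (x_i - x_j)^2$
G or G* (moving averages): $a_{ij} = x_i x_j$ or $x_j$ (in deviations, $z_j$ or $z_i z_j$)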
28
Global Indicators of Spatial Autocorrelation
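  • Moran's I
  • $I = \dfrac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \dfrac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$
  • where
  • $n$: number of areas,
  • $y_i$: attribute value in area i,
  • $\bar{y}$: mean value in the study region,
  • $w_{ij}$: spatial weights matrix.
  • How to interpret the above equation?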
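
One way to build intuition for Moran's I is to compute it by hand on a tiny example. The sketch below assumes the standard form I = (n / S0) * sum_ij w_ij (y_i - mean)(y_j - mean) / sum_i (y_i - mean)^2, with S0 the sum of all weights; the chain of four areas and the values are hypothetical.

```python
def morans_i(w, y):
    """Global Moran's I for a weights matrix w and attribute values y."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    variance_sum = sum(v * v for v in z)
    return (n / s0) * cross / variance_sum

# Hypothetical chain of four areas: 0 - 1 - 2 - 3
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_i(w, [1, 2, 3, 4]))  # smooth gradient: about 0.333
print(morans_i(w, [1, 4, 1, 4]))  # alternating values: -1.0
```

Neighbours with deviations of the same sign push I up; neighbours on opposite sides of the mean push it down, which is why the alternating pattern reaches -1.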

29
Global Indicators of Spatial Autocorrelation
  • Similar to the traditional correlation calculation,
    but restricted to spatial neighbours
  • Values of I typically range from -1 to 1.
  • -1: negative spatial autocorrelation
  • 0: no spatial autocorrelation
  • 1: positive spatial autocorrelation
  • For the old-age population in São Paulo, I = 0.45
  • Is this significant?

30
Randomization Strategy
  • Empirical Distribution Function
  • permute arrangement of objects
  • associate values with locations
  • associate locations with values
  • recompute indicators
  • Obtain a distribution
  • Compare observed $\Gamma$ to the distribution
  • Pseudo-significance
  • $p = (T + 1) / (M + 1)$
  • $M$: number of permutations
  • $T$: number of times the permuted $\Gamma \geq$ the observed $\Gamma$
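
The randomization strategy can be sketched with Moran's I as the indicator. This is an illustration under our own assumptions: the pseudo-significance is computed as p = (T + 1) / (M + 1), where T counts permutations whose statistic is at least as large as the observed one; the chain of ten areas, the values and the seed are hypothetical.

```python
import random

def morans_i(w, y):
    """Global Moran's I for a weights matrix w and attribute values y."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)

def permutation_test(w, y, m=999, seed=0):
    """Return (observed I, pseudo p-value) from m random permutations."""
    rng = random.Random(seed)
    observed = morans_i(w, y)
    shuffled = list(y)
    t = 0
    for _ in range(m):
        rng.shuffle(shuffled)  # reassign the values to locations at random
        if morans_i(w, shuffled) >= observed:
            t += 1
    return observed, (t + 1) / (m + 1)

# Hypothetical chain of ten areas with a smooth gradient of values
n = 10
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
y = list(range(1, n + 1))

observed, p_value = permutation_test(w, y)
print(f"I = {observed:.3f}, pseudo p = {p_value:.3f}")
```

The one-sided count used here tests for positive autocorrelation; the smallest p this procedure can report is 1 / (M + 1).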

31
Random or Clustered?
Extreme value
Simulated distribution
  • Testing Moran's I
  • Permute the spatial values 999 times
  • Obtain a probability distribution
  • Locate the real value in the distribution
  • In this case, I = 0.45 (very significant!)

32
Pros and cons of randomization
  • Advantages
  • non-parametric
  • no distributional assumptions
  • easy to compute
  • easy to interpret
  • Disadvantages
  • sample specific
  • no generalization to population
  • precision of pseudo significance arbitrary
  • 1/(99+1) yields 0.01, and 1/(999+1) yields 0.001
  • sensitive to random number generator

33
Random or Clustered?
Moran's I = -0.003
Moran's I = 0.486
Columbus homicide data (source: Luc Anselin)
34
Spatial Analysis
What distinguishes spatial statistical data
analysis is that its main focus is on inquiring
about spatial patterns of places and values, the
spatial association between them and the
systematic variation of the phenomenon in
different locations. (Anselin, 1992)
35
Local Indicators of Spatial Autocorrelation (LISA)
  • Moran's I is global
  • What if we want to find out the spatial
    correlation of each area?
  • Use a local indicator
  • Compares local value to that of its neighbours

36
Local and Global Analysis
  • Global
  • one statistic to summarize pattern
  • Clustering
  • Homogeneity
  • Local
  • location-specific statistics
  • clusters
  • heterogeneity

37
LISA Definition (Anselin 1995)
  • LISA satisfies two requirements
  • indicate significant spatial clustering for each
    location
  • sum of LISA proportional to a global indicator of
    spatial association
  • LISA Forms of Global Statistics
  • local Moran, local Geary, local Gamma

38
Use of LISA
  • Identify Hot Spots
  • significant local clusters in the absence of
    global autocorrelation
  • some complications in the presence of global
    autocorrelation (extra heterogeneity)
  • significant local outliers
  • high surrounded by low and vice versa
  • Indicate Local Instability
  • local deviations from global pattern of spatial
    autocorrelation

39
Local Indicators of Spatial Autocorrelation (LISA)
LISAs enable a quantitative expression of spatial
distribution of values
Distribution characteristics:
concentrations, persistences, transitions
40
Local Indicators of Spatial Autocorrelation (LISA)
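Local Moran: $I_i = z_i \sum_{j} w_{ij}\, z_j$
G index: $G_i(d) = \dfrac{\sum_{j} w_{ij}(d)\, x_j}{\sum_{j} x_j}$
where $w_{ij}$ is the spatial weight for
objects i and j and $z_i$ is the (standardized)
deviation of $x_i$ from the mean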
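
A minimal sketch of the local Moran index, assuming the common form I_i = z_i * sum_j w_ij z_j with z standardized to mean 0 and variance 1 (other normalizations exist). It also checks the LISA property that the local values sum to a multiple of the global index; the chain of four areas is hypothetical.

```python
def local_moran(w, y):
    """Local Moran I_i = z_i * sum_j w_ij * z_j, with standardized z."""
    n = len(y)
    mean = sum(y) / n
    sd = (sum((v - mean) ** 2 for v in y) / n) ** 0.5
    z = [(v - mean) / sd for v in y]
    return [z[i] * sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]

# Hypothetical chain of four areas: 0 - 1 - 2 - 3
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
y = [1.0, 2.0, 3.0, 4.0]

local = local_moran(w, y)
s0 = sum(sum(row) for row in w)  # sum of all weights

print(local)
print(sum(local) / s0)  # equals the global Moran's I for this w and y
```

With standardized z, dividing the sum of the local values by the total weight S0 recovers the global Moran's I (1/3 for this example).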
41
Distance Statistics for Local Spatial Association
  • Getis-Ord Gi and Gi*
  • one statistic for each location
  • contiguity as distance bands, wij(d)
  • Gi Statistic
  • does not include observation i
  • Gi* Statistic
  • includes observation i in sum

42
Interpretation of Gi Statistics
  • Local Spatial Association
  • positive: clusters of high values
  • negative: clusters of low values
  • Inference
  • randomization
  • permutation
  • Visualization
  • map of locations with significant Gi or Gi*
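
A sketch of the Gi* statistic in its commonly used standardized (z-score) form, which is an assumption here since the slides do not spell out the normalization. Observation i is included by setting w_ii = 1; the chain of six areas and the values are hypothetical.

```python
import math

def getis_ord_gi_star(w, x):
    """Standardized Gi* per area, with w_ii set to 1 (i is in the sum)."""
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum(v * v for v in x) / n - mean ** 2)
    scores = []
    for i in range(n):
        w_i = list(w[i])
        w_i[i] = 1  # Gi* includes observation i itself
        sw = sum(w_i)
        sw2 = sum(v * v for v in w_i)
        numerator = sum(wij * xj for wij, xj in zip(w_i, x)) - mean * sw
        # Undefined if area i neighbours every area (denominator is zero).
        denominator = s * math.sqrt((n * sw2 - sw ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical chain of six areas: high values on the left, low on the right
n = 6
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
x = [10.0, 9.0, 8.0, 1.0, 2.0, 1.0]

for i, score in enumerate(getis_ord_gi_star(w, x)):
    print(i, round(score, 2))
```

Positive scores mark areas surrounded by high values and negative scores mark areas surrounded by low values; leaving the diagonal at 0 instead would give the Gi variant that excludes observation i.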

43
Spatial weights matrix
44
Local Indicators of Spatial Autocorrelation (LISA)
  • How can we know if a LISA value means anything?
  • Use permutation to construct a probability
    distribution
  • Hold one region fixed and permute the values of
    all the others
  • Produce a map showing those areas whose LISA
    values are different from the rest (LISA MAP).
  • Statistical Significance
  • Not significant
  • Significant at 95% (1.96σ), 99% (2.58σ) and 99.9%
    (3.29σ).
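
The "hold one region fixed" idea can be sketched as a conditional permutation test for the local Moran index. This is an illustration under our own assumptions: I_i = z_i * sum_j w_ij z_j with standardized z, a one-sided count (permuted value at least as large as the observed one), and hypothetical data and seed.

```python
import random

def lisa_pseudo_p(w, y, m=999, seed=0):
    """Return a list of (I_i, pseudo p) using conditional permutation."""
    rng = random.Random(seed)
    n = len(y)
    mean = sum(y) / n
    sd = (sum((v - mean) ** 2 for v in y) / n) ** 0.5
    z = [(v - mean) / sd for v in y]
    results = []
    for i in range(n):
        observed = z[i] * sum(w[i][j] * z[j] for j in range(n))
        other_idx = [j for j in range(n) if j != i]
        other_z = [z[j] for j in other_idx]
        t = 0
        for _ in range(m):
            rng.shuffle(other_z)  # area i keeps its value; the rest move
            lag = sum(w[i][j] * zj for j, zj in zip(other_idx, other_z))
            if z[i] * lag >= observed:
                t += 1
        results.append((observed, (t + 1) / (m + 1)))
    return results

# Hypothetical chain of ten areas with a cluster of high values at the start
n = 10
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
y = [9.0, 10.0, 9.0, 1.0, 1.0, 2.0, 1.0, 2.0, 1.0, 1.0]

for i, (value, p) in enumerate(lisa_pseudo_p(w, y)):
    print(i, round(value, 3), round(p, 3))
```

Areas whose pseudo p falls below the chosen threshold would be the ones coloured on a LISA map. Because one test is run per area, some apparent significance is expected by chance alone.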

45
LISA Map for old age in São Paulo


46
Data
proportion of jobs per local population in
greater São Paulo
47
Local Moran significance map
48
Spatial Analysis II - LISA
Normalized Gi map, classified by standard
deviations
49
Spatial Analysis II - LISA
Moran scatterplot map
50
Interpretation and Limitations
  • Most Important
  • assessing lack of spatial randomness
  • suggests significant spatial structure
  • Multivariate Association
  • univariate spatial autocorrelation may result
    from
  • multivariate association
  • scale mismatch
  • need to control for other variables: spatial
    regression
  • LISA Clusters and Hot Spots
  • suggest interesting locations
  • do not explain

51
LISAs in São Paulo
Population density
population
52
Local Moran: percentage of rented houses
District          LM
SÉ                 3.579176
LAPA              -1.555046
SANTA CECÍLIA      3.128312
REPÚBLICA          5.159141
BOM RETIRO         2.788280
BRÁS               2.360710
53
Gi - percentage of rented housing
Gi_qi_aluguel
Region            Z         PROB
CONSOLAÇÃO         3.7602   0.0002
SÉ                 3.6893   0.0002
CAMPO LIMPO       -3.0400   0.0024
JD. ÂNGELA        -2.7608   0.0058
54
Gi* - percentage of rented housing
GI_qi_aluguel
IBGE              Z         PROB
SÉ                 4.1501   0.0000
REPÚBLICA          4.0764   0.0000
JD. SÃO LUIS      -3.2949   0.0010
CAMPO LIMPO       -3.1093   0.0019
55
Local Moran: no income
IBGE              LOCAL MORAN
CONSOLAÇÃO         3.372106
JD. PAULISTA       5.925623
ITAIM PAULISTA     4.440743
JD. HELENA         3.608146
MOEMA              4.258492
56
Gi statistic: no income
Gi_qi_srend
                  Z         PROB
VILA CURUÇÁ        4.1568   0.0000
SÃO MIGUEL         3.7919   0.0001
MOEMA             -4.6730   0.0000
ITAIM BIBI        -4.5468   0.0000
57
Gi statistic: no income
                  Z         PROB
VILA CURUÇÁ        4.3464   0.0000
ITAIM PAULISTA     4.0837   0.0000
MOEMA             -5.0586   0.0000
ITAIM BIBI        -4.8552   0.0000