Geographic Information Systems (GIS): Spatial Analysis November 1, 2005

Slides: 71
Provided by: keith249

1
Geographic Information Systems (GIS): Spatial Analysis
November 1, 2005
2
Notes
  • Oslo Project
  • Groups
  • Assignment
  • Due Date December 15, 2005
  • Mid-term quiz 2 November 8
  • Progress in GI Science eSeminar Series

3
Existing Groups
  • Marita Sanni, Julie Aaraas, Kristin I. Dankel,
    Solveig Melå (4)
  • Åslaug Enger Olsen, Maria Lyngstad, Guro Bakke
    Håndlykken and Jorunn Randby (M3)
  • Nina Ambro Knutsen, Ellen Winje and Leif Ingholm
    (3)
  • Birte Mobraaten, Hans Petter Wiken, Silje Hernes
    and Bente Lise Stubberud (4)
  • Daniel Molin, Ida Sjølander, Anne-Lise Folland
    and Nicolai Steineger (4)
  • Hæge Skjæveland, Marie Aaberge, Cecilie Hirsch,
    Kaja Korsnes Kristensen
  • Urs Dippon, Steven huiching Yip, Harald Kvifte,
    Eirik Waag
  • Marthe Stiansen, Marielle Stigum, Tomas
    Nesset, Andreas Skjetne
  • Gjermund Steinskog (Archaeology M16-18)
  • Solveig Lyby (Archaeology - M10-12)
  • 10. Andreas Dyken, Håkon Grevbo, Terje-Andre
    Gudmundsen (3)

4
Project Examples from 2004
  • Accessibility of medical centres in the Grorud
    district
  • Immigrants' settlement patterns
  • Distinctions in Oslo: a Bourdieusian analysis of
    inequality using geographic information
    systems
  • Social divides in Oslo
  • Social inequalities in Oslo
  • Income and housing structure in Oslo, with a
    focus on the Gamle Oslo district
  • Privatization and income level in the Vestre Aker
    district

5
  • GI Science eSeminar Series

6
Outline for Today's lecture
  • What is spatial analysis?
  • Queries and reasoning
  • Measurements
  • Spatial Interpolation
  • Descriptive Summaries
  • Optimization
  • Hypothesis Testing

7
Spatial Analysis
  • Turns raw data into useful information
  • by adding greater informative content and value
  • Reveals patterns, trends, and anomalies that
    might otherwise be missed
  • Provides a check on human intuition
  • by helping in situations where the eye might
    deceive

8
Definitions
  • A method of analysis is spatial if the results
    depend on the locations of the objects being
    analyzed
  • move the objects and the results change
  • results are not invariant (i.e., they vary!)
    under relocation
  • Spatial analysis requires both attributes and
    locations of objects
  • a GIS has been designed to store both

9
The Snow Map (cholera outbreaks in the 1850s)
  • Provides a classic example of the use of location
    to draw inferences
  • But the same pattern could arise from contagion
    (cholera spread through the air)
  • if the original carrier lived in the center of
    the outbreak
  • contagion was the hypothesis Snow was trying to
    refute. Today, a GIS could be used to show a
    sequence of maps as the outbreak developed
  • contagion would produce a concentric sequence,
    drinking water a random sequence

10
Types of Spatial Analysis
  • There are literally thousands of techniques
  • Six categories are used in this course, each
    having a distinct conceptual basis
  • Queries and reasoning
  • Measurements
  • Transformations
  • Descriptive summaries
  • Optimization
  • Hypothesis testing

11
Queries and Reasoning
  • A GIS can respond to queries by presenting data
    in appropriate views
  • and allowing the user to interact with each view
  • It is often useful to be able to display two or
    more views at once
  • and to link them together
  • linking views is one important technique of
    exploratory spatial data analysis (ESDA)

12
The Catalog View
Shows folders, databases, and files on the left,
and a preview of the contents of a selected data
set on the right. The preview can be used to
query the data set's metadata, or to look at a
thumbnail map, or at a table of attributes. This
example shows ESRI's ArcCatalog.
13
The Map View
A user can interact with a map view to identify
objects and query their attributes, to search for
objects meeting specified criteria, or to find
the coordinates of objects. This illustration
uses ESRI's ArcMap.
14
The Table View
Here attributes are displayed in the form of a
table, linked to a map view. When objects are
selected in the table, they are automatically
highlighted in the map view, and vice versa. The
table view can be used to answer simple queries
about objects and their attributes.
15
Measurements
  • Many tasks require measurement from maps
  • measurement of distance between two points
  • measurement of area, e.g. the area of a parcel of
    land
  • Such measurements are tedious and inaccurate if
    made by hand
  • measurement using GIS tools and digital databases
    is fast, reliable, and accurate

16
Measurement of Length
  • A metric is a rule for determining distance from
    coordinates
  • The Pythagorean metric gives the straight-line
    distance between two points on a flat plane
  • The Great Circle metric gives the shortest
    distance between two points on a spherical globe
  • given their latitudes and longitudes
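
The two metrics above can be sketched as follows. This is a minimal illustration, not the method of any particular GIS: the function names are hypothetical, the Earth is treated as a perfect sphere, and the mean radius of 6371 km is an assumption (the slide gives no value). The great circle distance uses the haversine form of the formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the slides


def pythagorean_distance(x1, y1, x2, y2):
    """Straight-line distance between two points on a flat plane."""
    return math.hypot(x2 - x1, y2 - y1)


def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Shortest distance over a sphere, from latitudes and longitudes in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine form of the great circle formula.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp guards against rounding slightly above 1 for near-antipodal points.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))
```

For example, `great_circle_distance(0, 0, 0, 90)` covers a quarter of the equator, while `pythagorean_distance(0, 0, 3, 4)` is the familiar 3-4-5 triangle.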

17
Issues with Length Measurement
  • The length of a true curve is almost always
    longer than the length of its polyline or polygon
    representation

18
Issues with Length Measurement
  • Measurements in GIS are often made on horizontal
    projections of objects
  • length and area may be substantially lower than
    on a true three-dimensional surface

19
Measurement of Area
  • Calculate and sum the areas of a series of
    polygons, formed by dropping perpendiculars to
    the x axis. Subtract the area of the extended
    trapezium (in this case, a rectangle).
  • The area for each polygon is calculated as the
    difference in x times the average of y.
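
A minimal sketch of this area calculation, assuming the polygon is simple (no self-intersections) and given as an ordered list of (x, y) vertices. The function name is hypothetical. Each edge contributes the signed area of the trapezium beneath it, so edges traversed in the opposite x direction perform the subtraction automatically.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its ordered (x, y) vertices."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        # Signed trapezium area: difference in x times the average of y.
        total += (x2 - x1) * (y1 + y2) / 2.0
    # The sign only reflects whether vertices run clockwise or counterclockwise.
    return abs(total)
```

A 4 by 3 rectangle, `polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)])`, gives 12.0.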

[Figure labels: x1, x2, y1, y2]
20
Measurement of Shape
  • Shape measures capture the degree of
    contortedness of areas, relative to the most
    compact circular shape
  • by comparing perimeter to the square root of area
  • normalized so that the shape of a circle is 1
  • the more contorted the area, the higher the shape
    measure
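
One way to write such a measure, as a sketch: dividing the perimeter by the perimeter of a circle with the same area gives exactly 1 for a circle. The slide does not state the normalizing constant, so this particular form and the function name are assumptions.

```python
import math


def shape_index(perimeter, area):
    """Perimeter relative to that of a circle with the same area (circle = 1)."""
    return perimeter / (2.0 * math.sqrt(math.pi * area))
```

A unit square scores about 1.13, and a 10 by 0.1 strip with the same area scores about 5.7, reflecting its more contorted outline.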

21
Shape as an indicator of gerrymandering in
elections
The 12th Congressional District of North Carolina
was drawn in 1992 using a GIS and designed to be
a majority-minority district: with a majority of
African American voters, it could be expected to
return an African American to Congress. This
objective was achieved at the cost of a very
contorted shape. The U.S. Supreme Court
eventually rejected the design.
22
Slope and Aspect
  • Calculated from a grid of elevations (a digital
    elevation model)
  • Slope and aspect are calculated at each point in
    the grid, by comparing the point's elevation to
    that of its neighbors
  • usually its eight neighbors
  • but the exact method varies
  • in a scientific study, it is important to know
    exactly what method is used when calculating
    slope, and exactly how slope is defined
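
Since the exact method varies, the sketch below shows only one possibility: a weighted finite difference over the eight neighbors, a scheme often credited to Horn. The choice of scheme, the function name, and the conventions (rows listed north to south, aspect as the downslope compass direction) are all assumptions made for illustration.

```python
import math


def slope_aspect(window, cell_size):
    """Slope (degrees) and aspect (compass degrees) at the center of a 3x3 window.

    `window` holds elevations as three rows, listed north to south, each row
    running west to east. Aspect is the downslope direction measured clockwise
    from north, or None where the surface is flat.
    """
    (a, b, c), (d, _, f), (g, h, i) = window
    # Weighted differences between opposite sides of the window.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)  # toward east
    dz_dy = ((a + 2 * b + c) - (g + 2 * h + i)) / (8.0 * cell_size)  # toward north
    gradient = math.hypot(dz_dx, dz_dy)  # rise over horizontal run
    slope_deg = math.degrees(math.atan(gradient))
    if gradient == 0.0:
        return slope_deg, None
    # Downslope vector is (-dz_dx, -dz_dy); convert it to a compass bearing.
    aspect_deg = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope_deg, aspect_deg
```

A surface that rises one unit per cell toward the east has a 45 degree slope and faces west (aspect 270).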

23
Alternative Definitions of Slope
The ratio of the change in elevation to the
actual distance traveled (range 0 to 1).
The angle between the surface and the horizontal
(range 0 to 90 degrees).
The ratio of the change in elevation to the
horizontal distance traveled (range 0 to infinity).
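
The three definitions can be compared for the same piece of terrain. This sketch assumes a non-negative rise and run that are not both zero; the function name is hypothetical.

```python
import math


def slope_measures(rise, run):
    """Return the three slope definitions for a given rise and horizontal run."""
    actual = math.hypot(rise, run)                    # distance along the surface
    ratio_actual = rise / actual                      # range 0 to 1
    angle_deg = math.degrees(math.atan2(rise, run))   # range 0 to 90 degrees
    ratio_horizontal = rise / run if run > 0 else math.inf  # range 0 to infinity
    return ratio_actual, angle_deg, ratio_horizontal
```

For a rise of 1 over a run of 1 the three values are roughly 0.707, 45 degrees, and 1, which shows why a study should state which definition it uses.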
24
Transformations
  • Create new objects and attributes, based on
    simple rules
  • involving geometric construction or calculation
  • may also create new fields, from existing fields
    or from discrete objects

25
Buffering (Dilation)
  • Create a new object consisting of areas within a
    user-defined distance of an existing object
  • e.g., to determine areas impacted by a proposed
    highway
  • e.g., to determine the service area of a proposed
    hospital
  • Feasible in either raster or vector mode
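
In vector mode, testing whether a location falls inside the buffer of a line reduces to a distance test. The sketch below assumes planar coordinates and a polyline given as a list of (x, y) vertices; it does not construct the buffer polygon itself, and the function names are hypothetical.

```python
import math


def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment from A to B."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: A and B coincide
        return math.hypot(px - ax, py - ay)
    # Position of the nearest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_buffer(point, polyline, distance):
    """True if the point lies within `distance` of any segment of the polyline."""
    px, py = point
    return any(
        distance_to_segment(px, py, ax, ay, bx, by) <= distance
        for (ax, ay), (bx, by) in zip(polyline, polyline[1:])
    )
```

With a hypothetical highway `[(0, 0), (10, 0)]` and a buffer distance of 3, the point (5, 2) is impacted and (5, 4) is not.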

26
Buffering
Line
Point
Polygon
27
Raster Buffering Generalized
  • Vary the distance buffered according to values in
    a friction layer

City limits
Areas reachable in 5 minutes
Areas reachable in 10 minutes
Other areas
28
Point in Polygon Transformation
  • Determine whether a point lies inside or outside
    a polygon (enclosure)
  • Basis for answering many simple queries
  • used to assign crimes to police precincts, voters
    to voting districts, accidents to reporting
    counties

29
The Point in Polygon Algorithm
Draw a line from the point to infinity in any
direction, and count the number of intersections
between this line and each polygon's boundary.
The polygon with an odd number of intersections
is the containing polygon; all other polygons
have an even number of intersections.
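
A minimal sketch of this algorithm, using a ray drawn toward positive x. Polygons are assumed to be simple rings of (x, y) vertices; points lying exactly on a boundary are not treated specially. The function names and the precinct example are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: an odd number of boundary crossings means the point is inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:  # crossing lies on the ray to the right of the point
                inside = not inside
    return inside


def containing_polygon(x, y, polygons):
    """Return the name of the first polygon that contains the point, else None."""
    for name, ring in polygons.items():
        if point_in_polygon(x, y, ring):
            return name
    return None
```

Assigning an incident at (3, 7) to one of two precincts stored in a dictionary of rings is then a single call to `containing_polygon`.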
30
Polygon Overlay
  • Two cases: for discrete objects and for fields
  • Discrete object case: find the polygons formed by
    the intersection of two polygons. There are many
    related questions, e.g.
  • do two polygons intersect?
  • Which areas fall in Polygon A but not in Polygon
    B?
  • The complexity of computing polygon overlays was
    one of the greatest barriers to the development
    of vector GIS

31
Polygon Overlay, Discrete Object Case
B
A
In this example, two polygons are intersected to
form 9 new polygons. One is formed from both
input polygons; four are formed by Polygon A and
not Polygon B; and four are formed by Polygon B
and not Polygon A.
32
Polygon Overlay, Field Case
  • Two complete layers of polygons are input,
    representing two classifications of the same area
  • e.g., soil type and land ownership
  • The layers are overlaid, and all intersections
    are computed, creating a new layer
  • each polygon in the new layer has both a soil
    type and a land ownership
  • the attributes are said to be concatenated
  • The task is often performed in raster

33
Polygon overlay, field case
Owner X
Owner Y
Public
A layer representing a field of land ownership
(colors) is overlaid on a layer of soil type
(layers offset for emphasis). The result after
overlay will be a single layer with 5 polygons,
each with a land ownership value and a soil type.
34
Spurious or Sliver Polygons
  • In any two such layers there will almost
    certainly be boundaries that are common to both
    layers
  • e.g. following rivers
  • The two versions of such boundaries will not be
    coincident
  • As a result large numbers of small sliver
    polygons will be created
  • these must somehow be removed
  • this is normally done using a user-defined
    tolerance

35
Overlay of fields represented as rasters
The two input data sets are maps of (A) travel
time from the urban area shown in black, and (B)
county (red indicates County X, white indicates
County Y). The output map identifies travel time
to areas in County Y only, and might be used to
compute average travel time to points in that
county in a subsequent step.
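
With rasters, this overlay is a cell-by-cell comparison of two aligned grids. The sketch below uses small hypothetical 3x3 grids (travel times in minutes and county codes are invented for illustration) and assumes both layers share the same cell size and extent.

```python
def overlay_select(values, zones, target_zone):
    """Cell-by-cell overlay: keep a value where the zone matches, else None."""
    return [
        [v if z == target_zone else None for v, z in zip(value_row, zone_row)]
        for value_row, zone_row in zip(values, zones)
    ]


def mean_of_cells(layer):
    """Average of all cells that hold a value (None cells are skipped)."""
    cells = [v for row in layer for v in row if v is not None]
    return sum(cells) / len(cells) if cells else None


# Hypothetical inputs: travel time in minutes, and county codes.
travel_time = [[5, 10, 15], [5, 10, 20], [10, 15, 20]]
county = [["X", "X", "Y"], ["X", "Y", "Y"], ["Y", "Y", "Y"]]

county_y_time = overlay_select(travel_time, county, "Y")
average_y = mean_of_cells(county_y_time)
```

The output layer keeps travel times only in County Y cells, and the subsequent averaging step works on that layer alone.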
36
Spatial Interpolation
  • Values of a field have been measured at a number
    of sample points
  • There is a need to estimate the complete field
  • to estimate values at points where the field was
    not measured
  • to create a contour map by drawing isolines
    between the data points
  • Methods of spatial interpolation are designed to
    solve this problem

37
Spatial Interpolation
  • Thiessen polygons define individual areas of
    influence around each of a set of points. They
    are polygons whose boundaries enclose the area
    that is closest to each point relative to all
    other points, and are constructed from the
    perpendicular bisectors of the lines between
    the points.
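
The defining property (every location belongs to its nearest sample point) can be sketched without constructing the bisectors. This raster approximation assigns each cell center to the closest site; it is not the vector construction described above, and the names and site coordinates are hypothetical.

```python
import math


def nearest_site(x, y, sites):
    """Name of the site closest to (x, y); `sites` maps name -> (x, y)."""
    return min(
        sites,
        key=lambda name: math.hypot(x - sites[name][0], y - sites[name][1]),
    )


def thiessen_raster(sites, n_cols, n_rows, cell_size=1.0):
    """Label every cell with its nearest site, evaluated at the cell center."""
    return [
        [
            nearest_site((col + 0.5) * cell_size, (row + 0.5) * cell_size, sites)
            for col in range(n_cols)
        ]
        for row in range(n_rows)
    ]
```

With two sites at (1, 1) and (5, 1), the boundary between the labels falls along x = 3, the perpendicular bisector of the line joining them.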

38
(No Transcript)
39
Inverse Distance Weighting (IDW)
  • The unknown value of a field at a point is
    estimated by taking an average over the known
    values
  • weighting each known value by the inverse of its
    distance from the point, so that the nearest
    points receive the greatest weight
  • an implementation of Tobler's Law

40
Point i: known value zi, location xi, weight wi,
distance di.
Unknown value (to be interpolated) at location x.
The estimate is a weighted average.
Weights decline with distance.
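
A minimal sketch of the weighted average. The transcript does not preserve the weight formula, so the inverse-square weight (power of 2) used here is an assumption, as is the function name. A query that coincides with a sample returns that sample's value.

```python
import math


def idw_estimate(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    `samples` is a list of (xi, yi, zi) tuples with known values zi.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # exactly at a sample point
        w = 1.0 / d ** power  # weight declines with distance
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total
```

Because the result is an average with positive weights, it always lies between the smallest and largest observed values.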
41
Issues with IDW
  • The range of interpolated values cannot exceed
    the range of observed values
  • it is important to position sample points to
    include the extremes of the field
  • this can be very difficult

42
A Potentially Undesirable Characteristic of IDW
interpolation
This set of six data points clearly suggests a
hill profile (dashed line). But in areas where
there is little or no data the interpolator will
move towards the overall mean (solid line).
43
Kriging
  • A technique of spatial interpolation firmly
    grounded in geostatistical theory
  • Kriging is based on the assumption that the
    parameter being interpolated can be treated as a
    regionalized variable (intermediate between a
    truly random and a completely deterministic
    variable)
  • Points near each other have a certain degree of
    spatial autocorrelation, and points that are
    widely separate are statistically independent.
  • Kriging is a set of linear regression routines
    which minimize estimation variance from a
    predefined covariance model.

44
A semivariogram. Each cross represents a pair of
points. The solid circles are obtained by
averaging within the ranges or bins of the
distance axis. The solid line represents the best
fit to these five points, using one of a small
number of standard mathematical functions.
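
The crosses and solid circles in such a plot can be computed as sketched below, assuming the usual definition of semivariance as half the squared difference between the values of a pair. Fitting the solid line (a model function) is not shown. The function name and bin parameters are hypothetical.

```python
import math
from itertools import combinations


def empirical_semivariogram(samples, bin_width, n_bins):
    """Average semivariance per distance bin.

    `samples` is a list of (x, y, z). Returns (bin center, mean semivariance)
    for every bin that contains at least one pair of points.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        distance = math.hypot(x2 - x1, y2 - y1)
        b = int(distance // bin_width)
        if b < n_bins:
            sums[b] += 0.5 * (z1 - z2) ** 2  # one "cross" in the plot
            counts[b] += 1
    return [
        ((b + 0.5) * bin_width, sums[b] / counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]
```

Semivariance that rises with distance indicates that nearby values are more alike than distant ones.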
45
Stages of Kriging
  • Analyze observed data to estimate a semivariogram
  • Estimate values at unknown points as weighted
    averages
  • obtaining weights based on the semivariogram
  • the interpolated surface replicates statistical
    properties of the semivariogram

46
Density Estimation and Potential
  • Spatial interpolation is used to fill the gaps in
    a field
  • Density estimation creates a field from discrete
    objects
  • the field's value at any point is an estimate of
    the density of discrete objects at that point
  • e.g., estimating a map of population density (a
    field) from a map of individual people (discrete
    objects)

47
The Kernel Function
  • Each discrete object is replaced by a
    mathematical function known as a kernel
  • Kernels are summed to obtain a composite surface
    of density
  • The smoothness of the resulting field depends on
    the width of the kernel
  • narrow kernels produce bumpy surfaces
  • wide kernels produce smooth surfaces
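
A minimal sketch of kernel summation. The slides do not name the kernel function, so the two-dimensional Gaussian used here is an assumption, with `bandwidth` standing in for the kernel width; the function name is hypothetical.

```python
import math


def kernel_density(x, y, points, bandwidth):
    """Sum of Gaussian kernels centered on each point, evaluated at (x, y)."""
    norm = 2.0 * math.pi * bandwidth ** 2  # each kernel integrates to 1
    total = 0.0
    for px, py in points:
        d_sq = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d_sq / (2.0 * bandwidth ** 2)) / norm
    return total
```

For two points 10 units apart, a bandwidth of 1 leaves a deep trough between two separate peaks, while a bandwidth of 10 merges them into one smooth surface.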

48
The result of applying a 150km-wide kernel to
points distributed over California
A typical kernel function
49
When the kernel width is too small (in this case
16km, using only the S California part of the
database) the surface is too rugged, and each
point generates its own peak.
50
Other types of spatial analysis
  • Data mining
  • Descriptive summaries
  • Optimization
  • Hypothesis testing

51
Data Mining
  • Analysis of massive data sets in search of
    patterns, anomalies, and trends
  • spatial analysis applied on a large scale
  • must be semi-automated because of data volumes
  • widely used in practice, e.g. to detect unusual
    patterns in credit card use

52
Descriptive Summaries
  • Attempt to summarize useful properties of data
    sets in one or two statistics
  • The mean or average is widely used to summarize
    data
  • centers are the spatial equivalent
  • there are several ways of defining centers

53
The Centroid
  • Found for a point set by taking the weighted
    average of coordinates
  • The balance point
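
As a sketch, the centroid is the weighted mean of the x coordinates paired with the weighted mean of the y coordinates. Equal weights are assumed when none are supplied; the function name is hypothetical.

```python
def centroid(points, weights=None):
    """Weighted average of coordinates for a list of (x, y) points."""
    if weights is None:
        weights = [1.0] * len(points)
    total = float(sum(weights))
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return cx, cy
```

Two points at (0, 0) and (10, 0) with weights 1 and 3 balance at (7.5, 0), closer to the heavier point.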

54
The Histogram
  • A useful summary of the values of an attribute
  • showing the relative frequencies of different
    values
  • A histogram view can be linked to other views
  • e.g., click on a bar in the histogram view and
    objects with attributes in that range are
    highlighted in a linked map view

55
A histogram or bar graph, showing the relative
frequencies of values of a selected attribute.
The attribute is the length of street between
intersections. Lengths of around 100m are
commonest.
56
Spatial Dependence
  • There are many ways of measuring this very
    important summary property
  • Most methods have been developed for points
  • Patterns can be random, clustered, or dispersed
  • Measures differ for unlabeled and labeled
    features (e.g. individual house locations, versus
    housing types)

57
Dispersion
  • A measure of the spread of points around a center
    (standard deviation)
  • Related to the width of the kernel used in
    density estimation
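
One common form of this measure is the standard distance: the square root of the mean squared distance of the points from their mean center. The slide only says "standard deviation", so this specific two-dimensional form is an assumption, as is the function name.

```python
import math


def standard_distance(points):
    """Root mean squared distance of (x, y) points from their mean center."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    mean_sq = sum(
        (x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points
    ) / n
    return math.sqrt(mean_sq)
```

Scaling a point pattern outward from its center scales this value by the same factor.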

58
Fragmentation Statistics
  • Measure the patchiness of data sets
  • e.g., of vegetation cover in an area
  • Useful in landscape ecology, because of the
    importance of habitat fragmentation in
    determining the success of animal and bird
    populations
  • populations are less likely to survive in highly
    fragmented landscapes

59
Three images of part of the state of Rondonia in
Brazil, for 1975, 1986, and 1992. Note the
increasing fragmentation of the natural habitat
as a result of settlement. Such fragmentation can
adversely affect the success of wildlife
populations.
60
Optimization
  • Spatial analysis can be used to solve many
    design problems or to create improved designs
    (minimizing distance traveled or construction
    costs, maximizing profit)
  • A spatial decision support system (SDSS) is an
    adaptation of GIS aimed at solving a particular
    design problem

61
Optimizing Point Locations
  • The minimum aggregate travel (MAT) problem is a
    simple case: one location, with the goal of
    minimizing total distance traveled to get there
  • The operator of a chain of convenience stores
    (e.g. Seven Eleven) might want to solve for many
    locations at once
  • where are the best locations to add new stores?
  • which existing stores should be dropped?
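
A simplified sketch of the single-location case: rather than searching the whole plane, it picks the best site from a short list of candidate locations, using straight-line distance. The restriction to candidates and the function names are assumptions made to keep the example small.

```python
import math


def total_travel(site, customers):
    """Sum of straight-line distances from every customer to the site."""
    return sum(math.hypot(site[0] - cx, site[1] - cy) for cx, cy in customers)


def best_location(candidates, customers):
    """Candidate site with the minimum aggregate travel."""
    return min(candidates, key=lambda site: total_travel(site, customers))
```

For customers at (0, 0), (2, 0), and (10, 0), the candidate at (2, 0) wins with a total of 10, even though it is far from the midpoint of the extremes.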

62
Routing Problems
  • Search for optimum routes among several
    destinations
  • The traveling salesperson problem
  • find the shortest tour from an origin, through a
    set of destinations, and back to the origin
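
For a handful of destinations the problem can be solved by trying every visiting order, as sketched below with straight-line distances. This brute-force search grows factorially and is only an illustration; practical routing software relies on heuristics and network distances. The function name is hypothetical.

```python
import math
from itertools import permutations


def shortest_tour(origin, destinations):
    """Try every visiting order and return (best order, tour length)."""
    best_order, best_length = None, math.inf
    for order in permutations(destinations):
        route = [origin, *order, origin]  # leave from and return to the origin
        length = sum(
            math.hypot(bx - ax, by - ay)
            for (ax, ay), (bx, by) in zip(route, route[1:])
        )
        if length < best_length:
            best_order, best_length = list(order), length
    return best_order, best_length
```

With the origin at (0, 0) and stops at the other three corners of a 4 by 3 rectangle, the best tour follows the perimeter for a length of 14.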

63
Routing service technicians for Schindler
Elevator. Every day this company's service crews
must visit a different set of locations in Los
Angeles. GIS is used to partition the day's
workload among the crews and trucks (color
coding) and to optimize the route to minimize
time and cost.
64
Optimum Paths
  • Find the best path across a continuous cost
    surface
  • between defined origin and destination
  • to minimize total cost
  • cost may combine construction, environmental
    impact, land acquisition, and operating cost
  • used to locate highways, power lines, pipelines
  • requires a raster representation
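
A minimal sketch of a least-cost path search over a raster, using Dijkstra's algorithm. Several details are assumptions rather than anything stated in the slides: moves are limited to the four edge-sharing neighbors, and the cost of a move is the average of the two cells' friction values. The function name and example grid are hypothetical.

```python
import heapq
import math


def least_cost_path(cost, start, goal):
    """Return (path, total cost) between two (row, col) cells of a cost raster."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}      # cheapest known cost to reach each cell
    previous = {}            # back-pointers for rebuilding the path
    queue = [(0.0, start)]
    while queue:
        total, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if total > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                step = (cost[r][c] + cost[nr][nc]) / 2.0
                new_total = total + step
                if new_total < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_total
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_total, (nr, nc)))
    if goal not in best:
        return None, math.inf
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[goal]
```

On a 3x3 grid whose middle column is expensive except in the bottom row, the search detours through that low-cost "pass" instead of crossing the ridge directly.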

65
Solution of a least-cost path problem. The white
line represents the optimum solution, or path of
least total cost, across a friction surface
represented as a raster. The area is dominated by
a mountain range, and cost is determined by
elevation and slope. The best route uses a
narrow pass through the range. The blue line
results from solving the same problem using a
coarser raster.
66
Hypothesis Testing
  • Hypothesis testing is a recognized branch of
    statistics
  • A sample is analyzed, and inferences are made
    about the population from which the sample was
    drawn
  • The sample must normally be drawn randomly and
    independently from the population

67
Hypothesis Testing with Spatial Data
  • Frequently the data represent all that are
    available
  • e.g., all of the census tracts of Los Angeles
  • It is consequently difficult to think of such
    data as a random sample of anything
  • not a random sample of all census tracts
  • Tobler's Law guarantees that independence is
    problematic
  • unless samples are drawn very far apart

68
Possible Approaches to Inference
  • Treat the data as one of a very large number of
    possible spatial arrangements
  • useful for testing for significant spatial
    patterns
  • Discard data until cases are independent
  • no one likes to discard data
  • Use models that account directly for spatial
    dependence
  • Be content with descriptions and avoid inference

69
Summary
  • All methods of spatial analysis work best in the
    context of a collaboration between human and
    machine. One benefit of the machine is that it
    sometimes serves to correct any misleading
    aspects of human intuition. (Humans can be poor at
    guessing the answers to optimization problems in
    space.)

70
(No Transcript)