GIS Data Sources and Data Quality - PowerPoint PPT Presentation

Provided by: thomas120
Transcript and Presenter's Notes


1
GIS Data Sources and Data Quality
Glen Johnson, New York State Department of
Health and School of Public Health, University at
Albany
2
GIS: a marriage of geospatial objects and attributes
1. Geospatial objects
  • vector coverage: points, lines, polygons
  • raster coverage (grid cells): images, model results
2. Attribute data: information associated with geospatial objects
3
Vector data
  • points (locations of residences, hospitals, hazardous waste sites, etc.)
  • lines (roads, streams, etc.)
  • polygons
    - artificial, for human management, such as counties, census tracts, ZIP codes
    - real, such as watershed boundaries, geology, soil, water bodies
Raster data (for continuous spatial coverage)
  • remote sensing products
    - aerial photographs, orthorectified to adjust for distortion
    - satellite imagery
  • grid-based model results (air pollution, soil erosion, etc.)
  • digital elevation models (DEMs) for topography
4
(No Transcript)
5
So where do we acquire all these data?
  • Many sources, increasing every year
  • Our focus is on Public Health applications
  • We will highlight key sources
  • You are responsible for the quality of any data
    used for your GIS projects

6
Some key GIS data sources (other than what ships
with commercial software or that you can purchase)
Nationwide
  • Federal Geographic Data Committee (FGDC) Geospatial One-Stop:
    http://gos2.geodata.gov/wps/portal/gos
Statewide (a quick sample)
  • New York State GIS Clearinghouse: http://www.nysgis.state.ny.us/
    (though access is increasingly limited for security reasons)
  • New York State Data Center:
    http://www.nylovesbiz.com/nysdc/download_intro.asp
7
Global (another quick sample)
  • Cornell University Library:
    www.library.cornell.edu/olinuris/ref/maps/intldata.html
  • Stanford University Library:
    www-sul.stanford.edu/depts/gis/web.html
Many, many URLs today; just search.
8
Procuring Data
  • Download
  • Uncompress
  • Translate (for particular software and coordinate projections)
Metadata are needed for information on coordinate projections.
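
The uncompress step and the metadata check can be sketched in Python. This is a minimal sketch using only the standard library, not a procedure from the presentation. It assumes the data arrive as a ZIP archive holding a shapefile, whose coordinate projection is conventionally described in a `.prj` sidecar file; the function name `extract_and_check_projection` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_and_check_projection(archive_path, out_dir):
    """Unzip a downloaded archive and report whether projection info came with it.

    Returns (names, has_prj). Assumption: when no .prj file is present, the
    coordinate projection has to be looked up in the dataset's metadata.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()
        zf.extractall(out)
    has_prj = any(name.lower().endswith(".prj") for name in names)
    return names, has_prj
```

If `has_prj` comes back false, the translate step should wait until the projection has been confirmed from the metadata.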
9
  • If several GIS users need the same data, best
    to locate on a central server or database
  • Avoid wasteful duplication of effort
  • Central maintenance - updating - data quality
    assurance

If using a virtual globe environment like Google
Earth, ArcExplorer, etc., much of the
public-domain data are already in place
10
Source data: geographic objects and attributes
PC or central server, or database (personal or enterprise)
Tables / reports
Statistical and other external analyses
Other, external data
Maps: printed, or served for more interactive viewing/analysis
11
  • Attribute data for public health applications
    include
  • Population (based on census)
  • Health Outcomes
  • Exposure
  • Environmental

12
U.S. census data
Source: www.census.gov
Census geography is based on TIGER files (see handout)
13
US Census TIGER Files: New York State (2000)
  • County: 62
  • Census Tract: 4,907
  • Block Group: 15,079
  • Census Block: 298,506
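
The nesting of these units (block within block group within tract within county) is reflected in the identifiers themselves. The sketch below assumes the 15-digit census block identifier used in Census 2000 and later: 2 digits for state, 3 for county, 6 for tract, and 4 for block, with the first block digit giving the block group. The function name and the sample identifier are made up for illustration.

```python
def split_block_geoid(geoid):
    """Split a 15-digit census block identifier into its nested parts."""
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError("expected a 15-digit block identifier")
    return {
        "state": geoid[:2],          # 2-digit state FIPS code
        "county": geoid[:5],         # state + 3-digit county code
        "tract": geoid[:11],         # county + 6-digit tract code
        "block_group": geoid[:12],   # tract + first digit of the block code
        "block": geoid,              # tract + 4-digit block code
    }


# Made-up identifier for illustration (36 is New York's state FIPS code).
parts = split_block_geoid("360010001001000")
print(parts["tract"])        # 36001000100
print(parts["block_group"])  # 360010001001
```

Because each level is a prefix of the level below it, block-level attribute data can be aggregated to block groups, tracts, or counties by truncating the identifier.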

14
Census Geography, Albany City
15
Census Geography, Albany County
16
Socio-demographic attribute data
  • Census Short Form (Summary Tape File 1 in 1990;
    Summary File 1 in 2000)
  • 100% data
  • Lowest level of geography is census block
  • Age groups
  • Sex
  • Race (much more detailed in 2000)
  • Ethnicity (Hispanic origin)
  • Housing Units

17
  • Census Long Form (STF3 in 1990; SF3 in 2000)
  • 1 in 6 households sampled.
  • Lowest level of census geography is block group.
  • Education
  • Income
  • Housing
  • Source of water and sewer
  • Commuting time
  • Country of origin
  • Occupation
  • many other attributes (variables)

18
Health Outcomes Data
- subject to confidentiality protection at the personal level
- public domain data often available at an aggregated level
  • Vital statistics: birth and death (ICD codes)
  • Hospital Discharge Data (ICD, DRG, MDC codes) -
    SPARCS in New York
  • Cancer Incidence
  • Congenital Malformations
  • STDs, HIV/AIDS
  • Infectious Diseases

19
Exposure Registries
  • Some New York State examples
  • Occupational Heavy Metals Registry
  • Childhood Blood Lead Reporting System
  • Radon Registry
  • Volatile Organic Compound Registry
  • Pesticide Registry

20
Environmental Exposure Sources (mostly from U.S.
EPA and state agencies): some examples
  • Toxic Release Inventory
  • Inactive Hazardous Waste Sites
  • Municipal landfills
  • Discharges to water (SPDES)
  • Household measures of radon
  • Soil sample data
  • Drinking water contaminants
  • Air pollution modeled and measured
  • Contaminants in fish
  • Power plants
  • Contaminants in raw and finished drinking water

21
Data Quality in GIS
22
Producer - responsible for documenting data
quality (producing metadata)
User - responsible for checking data quality,
especially with respect to the particular
application
feedback
23
Data Quality Standards
Federal Geographic Data Committee (FGDC) (www.fgdc.gov)
- established the Spatial Data Transfer Standard and the
  Content Standard for Digital Geospatial Metadata
In other words, it provides a common set of terminology and a
common structure for geospatial metadata:
- fundamental data quality information to be reported
- tests to be performed
24
Fundamental aspects of data quality that apply to
both geospatial and attribute components of a GIS
  • Accuracy (closeness to truth)
  • Resolution (level of detail)
  • Consistency (logical?)
  • Completeness (degree of omission)

geospatial data quality requirements depend on
application and scale
25
Scale means different things
Map Scale
- say 1 unit on the map represents 24,000 units on real land
- 1:24,000 is said to be a larger scale map than, say, 1:100,000
Measurement Scale (primary unit of observation, also known as grain or resolution)
- i.e., areal unit of aggregation (census tract, ZIP code, etc.)
- pixel size in raster (grid) image
Extent
- spatial boundary within which a study applies
- e.g., state of New York, Kings County, Adirondack Blue Line, etc.
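
The map-scale arithmetic can be made concrete with a short sketch. It assumes the scale is a representative fraction 1:N, so one inch on the map corresponds to N inches on the ground. The function names are illustrative and do not come from the presentation.

```python
def ground_distance_feet(map_inches, scale_denominator):
    """Convert a distance measured on the map (inches) to ground distance (feet)."""
    ground_inches = map_inches * scale_denominator
    return ground_inches / 12.0


def is_larger_scale(denominator_a, denominator_b):
    """A 1:A map is larger scale than a 1:B map when A is the smaller denominator."""
    return denominator_a < denominator_b


print(ground_distance_feet(1, 24_000))   # 2000.0 feet per map inch
print(ground_distance_feet(1, 100_000))  # roughly 8333.3 feet per map inch
print(is_larger_scale(24_000, 100_000))  # True
```

The larger-scale map covers less ground per map inch, which is why it can show more detail.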
26
Spatial / positional accuracy is a function of map scale.
Accuracy standards employed by the U.S.G.S. for various scale maps:

Map scale     Accuracy
1:1,200       3.33 feet
1:2,400       6.67 feet
1:4,800       13.33 feet
1:10,000      27.78 feet
1:12,000      33.33 feet
1:24,000      40.00 feet
1:63,360      105.60 feet
1:100,000     166.67 feet

This means that when we see a point on a map we
have its "probable" location within a certain
area. The same applies to lines.
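
The values listed above are consistent with the U.S. National Map Accuracy Standards thresholds: 1/30 inch at publication scale for maps larger than 1:20,000, and 1/50 inch for maps at 1:20,000 or smaller. The sketch below reproduces them under that assumption. The function name is illustrative.

```python
def nmas_horizontal_tolerance_feet(scale_denominator):
    """Ground tolerance in feet for a 1:scale_denominator map.

    Assumes 1/30 inch on the map for scales larger than 1:20,000
    and 1/50 inch for scales of 1:20,000 or smaller.
    """
    map_inches = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    return map_inches * scale_denominator / 12


for denom in (1_200, 2_400, 4_800, 10_000, 12_000, 24_000, 63_360, 100_000):
    print(f"1:{denom:,}  {nmas_horizontal_tolerance_feet(denom):.2f} feet")
```

The change of threshold at 1:20,000 is why the tolerance rises only from 33.33 to 40.00 feet between 1:12,000 and 1:24,000 even though the scale denominator doubles.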
27
Spatial Accuracy of a Point Object
28
Spatial Accuracy of a Line Object
29
Digitizing Errors
Other issues of geospatial errors
30
  • Attribute Quality
  • same issues as with non-GIS studies (must do
    usual data checking and cleaning)
  • attribute errors become spatial errors when
    attributes are used for mapping (for example,
    incorrect entry of ZIP code or
    latitude/longitude)
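
Checks of this kind can be automated before mapping. The sketch below is a minimal illustration, not the presenter's procedure. The record field names (`zip`, `lat`, `lon`) and the approximate bounding box for New York State are assumptions. Passing these checks shows only that a value is plausible, not that it is correct.

```python
import re

# Approximate bounding box for New York State (an assumption for illustration).
NY_LAT_RANGE = (40.4, 45.1)
NY_LON_RANGE = (-79.8, -71.8)

# Five digits, optionally followed by a hyphen and four more (ZIP+4).
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def attribute_problems(record):
    """Return a list of problems found in one record's location attributes."""
    problems = []
    if not ZIP_PATTERN.fullmatch(str(record.get("zip", ""))):
        problems.append("malformed ZIP code")
    lat, lon = record.get("lat"), record.get("lon")
    if lat is None or lon is None:
        problems.append("missing latitude/longitude")
    else:
        if not NY_LAT_RANGE[0] <= lat <= NY_LAT_RANGE[1]:
            problems.append("latitude outside study extent")
        if not NY_LON_RANGE[0] <= lon <= NY_LON_RANGE[1]:
            problems.append("longitude outside study extent")
    return problems


# A longitude entered without its minus sign falls far outside the study extent.
print(attribute_problems({"zip": "12208", "lat": 42.65, "lon": 73.75}))
```

A well-formed but wrong ZIP code would still pass, so range checks like these complement the usual data checking and cleaning rather than replace it.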

31
EPA 1989 Toxic Release Inventory: query
for New York State sites
32
EPA 1989 Toxic Release Inventory: query
for New York State sites
33
(No Transcript)
34
Resolution
  • Spatial
    - nodes per line (vector format data)
    - pixel size (raster format data)
  • Temporal
    - incidence rates over a 5-year period vs. a 1-year period

Low resolution is not necessarily bad. Optimum
resolution depends on the application and,
consequently, the desired map scale.
35
When digitizing natural boundaries, greater
resolution generally means greater accuracy (at
the cost of greater data storage requirements and
processing times)
36
Local analysis, such as identifying buildings
in a neighborhood, may call for fine resolution
digital orthophotos
37
Can zoom in very close before actual pixel
structure emerges
38
(No Transcript)
39
Regional analysis, such as analyzing land cover
patterns, may be better suited for coarser
resolution satellite imagery
Southeast Pennsylvania land cover based on
30-meter resolution LANDSAT image
40
Not appropriate for local analysis, such as
identifying buildings
Philadelphia International Airport, zoomed in
from previous image
41
Optimum resolution is a balance between the objective
of the analysis and data storage/processing efficiency.
Consider: increasing resolution by halving the
length of a pixel side results in quadrupling
the data set size.
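
The quadrupling follows from pixel count scaling with the inverse square of the pixel side. A quick check, assuming a square study area and square pixels (the 30 km extent is a made-up example; the 30-meter pixel matches the LANDSAT resolution mentioned earlier):

```python
import math


def pixel_count(extent_m, pixel_side_m):
    """Number of square pixels needed to cover a square extent."""
    per_side = math.ceil(extent_m / pixel_side_m)
    return per_side * per_side


coarse = pixel_count(30_000, 30)    # 1000 x 1000 grid
fine = pixel_count(30_000, 15)      # 2000 x 2000 grid
print(coarse, fine, fine / coarse)  # 1000000 4000000 4.0
```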
42
Temporal Accuracy
43
  • Key Points on Data Quality
  • accuracy, resolution, consistency, completeness
  • scale dependence
  • consider spatial, temporal and attribute errors
  • metadata - how complete is it? - does it
    exist at all?

44
An excellent, up-to-date overview of many aspects
of GIS and spatial analysis that is freely
accessible:
De Smith, Goodchild and Longley, 2006-2008.
Geospatial Analysis - a comprehensive guide.
http://www.spatialanalysisonline.com/output/