Title: Putting time into the GeoWeb
1Putting time into the GeoWeb
- Data persistence in a web services environment
- Steve Morris
- NCSU Libraries
July 23, 2008
2Overview
- Background to the digital preservation problem
- Problems
- Temporal data access issues
- Capturing data state in a services or API context
- Making the business case for older data
- Preservation approaches
- Future directions
3Project background: North Carolina Geospatial Data Archiving Project
- Partnership between university library (NCSU) and state agency (NCCGIA)
- Under cooperative agreement with Library of Congress in NDIIPP national preservation program
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
- Goal: Engage spatial data infrastructure (SDI) in data preservation and archiving
Demonstration repository as catalyst for an industry conversation
4SDI role in data preservation
- Data inventories support content identification
- Metadata standards support discoverability and use
- Content standards support data interoperability over time and help eliminate semantic confusion
- Data exchange networks
- Minimize need to make contact
- Add technical, administrative, descriptive metadata
- Establish rights and provenance
5Project roots: NCSU Libraries data directories
- Tracking data, map servers, and web services since 2000
- Ranked 3rd in traffic among entry points to entire library website
- Persistent identifiers
- usage tracking
- ID links used in other sites
- Community help in site maintenance
6County map and data services in NC
100 Counties in North Carolina
7Carrboro, NC: Population 17,797 (2005 est.)
24 downloadable GIS data layers
6 web mapping applications
4 WMS data layers
9 downloadable PDF map layers
8Downtown Raleigh Near State Capitol: 1914 Sanborn Map
9Downtown Raleigh Near State Capitol: 1993 DOQQ
10Downtown Raleigh Near State Capitol: 1999 Wake County Ortho
11Downtown Raleigh Near State Capitol: 2005 Wake County Ortho
12Imagery: durable, static, simple structure, mostly open formats
Vector data: volatile, frequent update, complex structure, mostly proprietary (commercial) formats
13Data preservation points of failure
- Data is not saved, or
- can't be found, or
- media is obsolete, or
- media is corrupt, or
- format is obsolete, or
- file is corrupt, or
- meaning is lost
Solutions: Migration, Emulation, Encapsulation, XML
14Problem: Data state in a web services or API-driven environment
15Problem: Temporal data unavailability
- Industry focus on latest and greatest data
- "Kill and fill" as a common approach to data management (past versions of vector data lost)
- Not just data loss, but also loss of memory about data
- Of superseded county orthophoto flights in NC, only 22 recorded in the state's GIS inventory
Some older inventories only available through Internet Archive
16Availability of older orthoimagery on county map
servers in NC
Only 30 of superseded digital ortho flights accessible through county map servers
17Availability of older orthoimagery on county map
servers in NC
23 Counties in NC publish ortho WMS services
0 Counties in NC publish superseded orthos as WMS services
18Problem: Making business case for archiving
[Figure panels labeled 1993, 1998, 1999, 2002, 2005]
Use case: Land use and impervious surface change analysis
19Building the preservation business case
- Land use change analysis
- Site location analysis
- Real estate trends analysis
- Disaster response
- Resolution of legal challenges
- Impervious surface change mapping
20Planned 2008 NC business case survey
- Case description
- Resources/Scope of effort
- Benefits and results
- Fiscal assessment
Based on previous experience, pending projects,
examples of when a project could have been served
better if archival data were available
21Geospatial data preservation challenges
- Producer focus on current data
- Future support of data formats in question
- Inadequate or nonexistent metadata
- Spatial databases
- Complex data objects (multi-file, multi-format)
- Shift to web services-based access (ephemeral data)
- Difficult to capture data state at point of decision-making
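One way to picture "capturing data state" for a service-based decision is to record, alongside each response, what was requested, when, and a digest of what came back. The sketch below is a minimal illustration under assumptions that are not in the slides: the record layout, the function name capture_state, and the example URL are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def capture_state(request_url, response_bytes, retrieved_at=None):
    """Describe a service response so the data state can be cited later.

    Hypothetical record layout for illustration; not a standard.
    """
    if retrieved_at is None:
        retrieved_at = datetime.now(timezone.utc)
    return {
        "request_url": request_url,
        "retrieved_at": retrieved_at.isoformat(),
        # The digest identifies the exact bytes that informed the decision.
        "sha256": hashlib.sha256(response_bytes).hexdigest(),
        "size_bytes": len(response_bytes),
    }


if __name__ == "__main__":
    record = capture_state(
        "https://example.org/wms?REQUEST=GetMap",  # placeholder URL
        b"...response body...",                    # placeholder payload
        datetime(2008, 7, 23, 12, 0, tzinfo=timezone.utc),
    )
    print(json.dumps(record, indent=2))
```

A record like this does not preserve the data itself; it only makes it possible to tell later whether an archived copy matches what the service returned at the time.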
22Preservation approaches: Temporal data snapshots
Issue: How frequently should county and municipal vector data layers be captured in archives?
Parcels, centerlines, jurisdictions, zoning, ...
Parcel Boundary Changes 2001-2004, North
Raleigh, NC
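The capture-frequency question can be made concrete by asking how stale the archived copy would be on a given decision date. The sketch below assumes a hypothetical list of snapshot dates for one layer and looks up the snapshot in effect on a date of interest. The function name, the dates, and the annual and quarterly schedules are illustrative only, not survey results.

```python
from bisect import bisect_right
from datetime import date


def snapshot_in_effect(snapshot_dates, decision_date):
    """Return (snapshot_date, age_in_days) for the latest snapshot taken
    on or before decision_date, or None if no snapshot is that old.

    snapshot_dates must be sorted in ascending order.
    """
    i = bisect_right(snapshot_dates, decision_date)
    if i == 0:
        return None
    snap = snapshot_dates[i - 1]
    return snap, (decision_date - snap).days


# Hypothetical capture schedules for a parcel layer, 2001-2004.
annual = [date(y, 1, 1) for y in range(2001, 2005)]
quarterly = [date(y, m, 1) for y in range(2001, 2005) for m in (1, 4, 7, 10)]

decision = date(2003, 9, 15)
print(snapshot_in_effect(annual, decision))
print(snapshot_in_effect(quarterly, decision))
```

For this example date, the annual schedule returns a snapshot 257 days old and the quarterly schedule one that is 76 days old, which is the kind of trade-off a capture-frequency policy has to weigh against storage and effort.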
23NC frequency of data capture surveys
- How often should continually changing vector datasets be captured?
- Tap into data custodian understanding of production patterns and uses
- Tap into local innovation
- Learn about local business drivers for data archiving
- 2006 and 2008 surveys of NC cities and counties
- 2008 survey of archival practice in state agencies in NC
- Planned survey of data users in NC
http://www.nconemap.com/AboutNCOneMap/tabid/289/Default.aspx#preservation
24Preservation approaches: Desiccated data
Complex data representations can be made more
preservable (and less useful) through
simplification
25Preservation approaches: Desiccated data
- Complex documents may be very hard to preserve over time
- GIS project files
- Layer definitions
- Web services or API interactions
- Image outputs capture some sense of final product, but lose underlying data intelligence
26Desiccated data: PDF and GeoPDF
- Cartographic outputs analogous to the old paper maps
- Combined datasets, with data models, classification, symbolization, annotation
- More data intelligence than in images
27Desiccated data: Geospatial PDF
- Explosion of geospatial PDF content in past few years
- Standards issues
- GeoPDF: proprietary TerraGo technology
- PDF: an open ISO standard
- Open PDF variants created through ISO standards process (PDF/E, PDF/X, PDF/A, ...)
- PDF content retained in addition to, NOT instead of, data
28Preservation approaches: Historical WMS tile caches?
No market for archived tiles without standard way
to describe tiles and without commonly used
tiling schemes
29Preservation approaches: Historical WMS tile caches?
- Tile cache systems developed for more responsive WMS or mapping systems
- WMS Tile Caching (WMS-C) incubated by OSGeo
- WMTS (Web Map Tiling): OGC white paper
- No explicit temporal component in WMS-C or WMTS
To what extent do temporal geospatial systems
become video-like?
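For comparison, the base WMS specification does define an optional TIME parameter on GetMap requests, which is the kind of temporal hook the tile-cache proposals lack. The sketch below builds such a request URL. The endpoint, layer name, and bounding box are placeholders, and whether a given server honors TIME depends on the layer advertising a time dimension in its capabilities.

```python
from urllib.parse import urlencode


def getmap_url(endpoint, layer, bbox, time_value, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL that asks for a layer at a given time.

    bbox is (min_lon, min_lat, max_lon, max_lat) in CRS:84 (lon/lat order).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TIME": time_value,  # ISO 8601 date or date-time
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and layer name, for illustration only.
url = getmap_url(
    "https://example.org/wms",
    "county_ortho",
    (-78.65, 35.77, -78.63, 35.79),
    "1999-01-01",
)
print(url)
```

A tile cache keyed only on layer, zoom level, row, and column has no slot for that time value, so superseded imagery would need either separate layer names per flight or an extension to the tiling scheme.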
30Old maps coming into the GeoWeb
Pronounced local agency interest in archiving,
digitizing, and geo-referencing older analog
products
- Use Sanborn map slide or replacement
31New archiving interest: Location-based content
Oblique Imagery
Street Views
3D Images
- Present-day value in location-based services and
mobile applications
32New archiving interest: Location-based content
- Future value of non-spatial place-based imagery as cultural heritage resource
- More descriptive of place and function than spatial imagery
33Moving forward
- GICC Archival and Long-Term Access Committee
- Geo Multistate Archive and Preservation Partnership (GeoMAPP)
- OGC Data Preservation Working Group
34Community response to data archiving challenge
- Nov. 2007: NC Geographic Information Coordinating Council (GICC)
- Ten Recommendations in Support of Geospatial Data Sharing released
- Recommendation: Establish archive and long-term data access strategies
- Suggested best practices include: Establish a policy and procedure for the provision of access to historic data, especially for framework data layers.
35GICC Archival and Long-Term Access Committee
- Initiated in response to agency requests for guidance on temporal data management
- Federal, state, regional, and local agency representation
- Key focus
- Best practices for data snapshots and retention
- State Archives processes: appraisal, selection, retention schedules, etc.
- Who, What, Why, When, Where, How
36Geo Multistate Archive and Preservation Partnership (GeoMAPP)
- Lead organizations: North Carolina Center for Geographic Information and Analysis (NCCGIA), State Archives of NC, with Library of Congress
- Partners
- State geospatial organizations of Kentucky and Utah
- State Archives of Kentucky and Utah
- NCSU Libraries in catalytic/advisory role
- State-to-state and geo-to-Archives collaboration
- 2 year project: Nov. 2007-Dec. 2009
- Archives as part of Spatial Data Infrastructure
37OGC Data Preservation Working Group
- Formed Dec. 2006
- Engage archival community
- Find points of intersection with other OGC activities
- GML for archiving
- Content packaging
- Large scale data transfers
- Time in decision support
38The Content Packaging Problem
- Files
- Multi-file dataset
- Georeferencing
- Metadata file
- Symbols file
- Additional documentation
- License
- Disclaimer
- More
- Metadata
- ISO/FGDC
- Acquisition metadata
- Transfer metadata
- Ingest metadata
- Archive rights
- Archive processes
- Collection metadata
- Series metadata
Metadata Exchange Format (MEF) in GeoNetwork: a form of content packaging
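The packaging problem can be illustrated with a simple bundle: the dataset's files plus a manifest that says what role each file plays and which collection and series it belongs to. The sketch below is only an illustration. It does not implement MEF or any other packaging standard, and the manifest layout, role names, and file names are hypothetical.

```python
import io
import json
import zipfile


def build_package(files, collection, series):
    """Bundle files and a manifest into an in-memory ZIP archive.

    files maps archive path -> (role, content bytes).
    Returns the ZIP archive as bytes.
    """
    manifest = {
        "collection": collection,
        "series": series,
        "files": [
            {"path": path, "role": role, "size_bytes": len(data)}
            for path, (role, data) in sorted(files.items())
        ],
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, (_role, data) in sorted(files.items()):
            zf.writestr(path, data)
        # The manifest travels inside the package with the data it describes.
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buf.getvalue()


# Hypothetical multi-file dataset with placeholder contents.
example_files = {
    "parcels.shp": ("data", b"\x00"),
    "parcels.prj": ("georeferencing", b"PROJCS[...]"),
    "parcels.xml": ("metadata", b"<metadata/>"),
    "LICENSE.txt": ("license", b"placeholder license text"),
}
package = build_package(example_files, "Example County GIS", "Parcels 2008")
print(len(package), "bytes")
```

Acquisition, transfer, and ingest metadata from the list above would be further manifest sections in a fuller design; they are left out here to keep the sketch short.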
39Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
Steven_Morris_at_ncsu.edu
NCGDAP site: http://www.lib.ncsu.edu/ncgdap/