Title: Data Standards at the IRI Data Library
1Data Standards at the IRI Data Library
M. Benno Blumenthal, Michael Bell, John del Corral, and Emily Grover-Kopec
International Research Institute for Climate and Society
Columbia University
http://iridl.ldeo.columbia.edu/
2Current Data Exchange Standards
- There are many of them
- Some are flexible but semantically weak
- Others are semantically specific but not sufficiently flexible
- We are working on this
3Data Library Overview
Specialized Data Tools
Maproom
Generalized Data Tools
Data Viewer
Data Language
IRI Data Collection
URL/URI for data, calculations, figs, etc
4IRI Data Collection
Ocean/Atm: geolocated by lat/lon, multidimensional
GIS: geolocation by vector object or projection metadata
spectral harmonics, equal-area grids, GRIB grid codes, climate divisions
5IRI Data Collection
6IRI Data Collection
7OpenDAP
- OpenDAP is very important to us because we can act as both a client and a server, and because it is flexible enough to represent all our calculations (virtual variables), i.e. a user can specify an analysis and export it.
- At the moment we cannot read shapefile data using it (and the serving of shapes over OpenDAP is consequently untested), but hopefully that is temporary
- Impedance mismatch is low
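As an illustration of what an OpenDAP client request can look like, the sketch below builds a DAP2-style constraint expression, which selects a hyperslab of a variable with `[start:stride:stop]` index ranges. The base URL, the variable name `sst`, and the helper names are hypothetical and are not taken from the Data Library.

```python
from typing import Sequence, Tuple

# Hypothetical dataset URL, used only for illustration.
BASE_URL = "http://example.org/opendap/sst_monthly"


def hyperslab(variable: str, ranges: Sequence[Tuple[int, int, int]]) -> str:
    """Build a DAP2 array projection such as sst[0:1:11][40:1:60]."""
    parts = []
    for start, stride, stop in ranges:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError(f"bad index range: {(start, stride, stop)}")
        parts.append(f"[{start}:{stride}:{stop}]")
    return variable + "".join(parts)


def subset_url(base_url: str, projections: Sequence[str], response: str = "dods") -> str:
    """Append a response suffix (e.g. dds, das, dods) and a constraint expression."""
    return f"{base_url}.{response}?{','.join(projections)}"


# First 12 time steps and a rectangular index window in the two spatial dimensions.
url = subset_url(BASE_URL, [hyperslab("sst", [(0, 1, 11), (40, 1, 60), (100, 1, 140)])])
print(url)
```

The same URL pattern works whether the variable is stored on disk or computed on request, which is one way to read the "virtual variables" point above.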
8Other Important Standards
- netCDF
- GRIB
- GeoTIFF
- Shapefiles vs. PostGIS in Postgres (OGC compliant)
9Standards becoming important to us (we think)
- OGC GIS Conceptual Framework
- OGC WMS, WFS, WCS
- These are designed to be partial: we will have many datasets/analyses that we cannot transfer using these protocols
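To make the "partial" point concrete, here is a minimal sketch of a WMS 1.1.1 GetMap request. The parameter set is fixed (layer, bounding box, image size, format), so there is no obvious place to express an arbitrary multidimensional analysis. The endpoint and layer name are hypothetical.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def getmap_url(endpoint, layer, bbox, width, height, srs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name.
url = getmap_url("http://example.org/wms", "sst", (-180, -90, 180, 90), 720, 360)

# Parse the query string back to check what the server would receive.
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(query["BBOX"], query["LAYERS"])
```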
10Interoperability requires Semantics
- Currently we have some numeric interoperability, but we have a long way to go for semantic interoperability
11Standard Metadata
Standard Metadata Schema/Data Services
Datasets
Tools
Users
12Many Data Communities
13Super Schema
Standard metadata schema
14Super Schema direct
Standard metadata schema/data service
15Flaws
- A lot of work
- Super Schema/Service is the Lowest-Common-Denominator
- Science keeps evolving, so that standards either fall behind or constantly change
16RDF Standard Data Model Exchange
Standard metadata schema
RDF
17RDF Data Model Exchange
Standard metadata schema
RDF
18RDF Architecture
Virtual (derived) RDF
19Why is this better?
- Maps the original dataset metadata into a standard format that can be transported and manipulated
- Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but
- When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and late semantic binding means everyone can use the new semantic mapping
- Can use enhanced mappings between models that are close
- EASIER: these are tools to enhance the mapping process
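The remapping argument above can be sketched in a few lines. This is only an illustration in plain Python, not the Data Library's implementation: the original metadata is kept as (subject, predicate, object) triples, and each standard is a separate mapping applied later. All dataset names, predicates, and prefixes (`native:`, `std1:`, `std2:`) are hypothetical.

```python
# Original, complete-but-nonstandard metadata, stored once and never rewritten.
original = [
    ("dataset:sst", "native:units", "degree_Celsius"),
    ("dataset:sst", "native:long_name", "Sea Surface Temperature"),
    ("dataset:sst", "native:gridtype", "periodic"),
]

# Mapping to a first (lowest-common-denominator) standard: gridtype has no equivalent.
mapping_v1 = {
    "native:units": "std1:units",
    "native:long_name": "std1:title",
}

# Mapping to a later, richer standard that can also express gridtype.
mapping_v2 = {
    "native:units": "std2:units",
    "native:long_name": "std2:title",
    "native:gridtype": "std2:gridTopology",
}


def derive(triples, mapping):
    """Return derived (virtual) triples with mapped predicates.

    Triples whose predicate has no mapping are left out of the derived view,
    but they remain available in the original list for a later remapping.
    """
    return [(s, mapping[p], o) for s, p, o in triples if p in mapping]


view_v1 = derive(original, mapping_v1)
view_v2 = derive(original, mapping_v2)
print(len(view_v1), len(view_v2))
```

The first view loses the grid information, the second recovers it, and the stored metadata is never touched in between.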
20Key Features of RDF/OWL
- Web-based framework for writing down and interrelating semantic standards
- Non-contextual modeling: data object relationships are stated explicitly, not inferred from context
- Late semantic binding: semantics do not alter transport/storage; semantic mapping can be added later as scientific fields evolve
- Not much track record yet
21RDF vs. XML Schema
- RDF is usually transported as XML
- So it is XML
- But it differs from XML Schema in that the schema is not fixed beforehand
- XML Schema: a prearranged exchange
- RDF/XML: add to/query an information space
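A toy sketch of the "add to/query an information space" idea, assuming nothing beyond plain Python: statements are triples, new kinds of statements can be added without any schema change, and queries are patterns with wildcards. The class and all names in it are hypothetical, and a real system would use an RDF library rather than this in-memory set.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """A minimal in-memory set of (subject, predicate, object) statements."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        # No schema check: any predicate may be used at any time.
        self._triples.add((s, p, o))

    def query(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


store = TripleStore()
store.add("dataset:sst", "hasUnits", "degree_Celsius")
store.add("dataset:precip", "hasUnits", "mm/day")
# A predicate nobody planned for is added later, with no change to the store.
store.add("dataset:sst", "derivedFrom", "dataset:raw_sst")

print(store.query(p="hasUnits"))
print(store.query(s="dataset:sst"))
```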
22Sample Tool Faceted Search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
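The general idea of faceted search over such metadata can be sketched as follows. This is not a description of how the tool at the URL above works; it only shows the technique: filter subjects by the selected property values, then count the remaining values of each property so the user can narrow further. The datasets and property names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical metadata triples: (subject, property, value).
triples = [
    ("ds:sst", "variable", "temperature"),
    ("ds:sst", "realm", "ocean"),
    ("ds:t2m", "variable", "temperature"),
    ("ds:t2m", "realm", "atmosphere"),
    ("ds:precip", "variable", "precipitation"),
    ("ds:precip", "realm", "atmosphere"),
]


def matching(triples, selections):
    """Return the subjects that carry every selected (property, value) pair."""
    props = defaultdict(set)
    for s, p, o in triples:
        props[s].add((p, o))
    return {s for s, pairs in props.items() if set(selections) <= pairs}


def facet_counts(triples, subjects):
    """Count, per property, how many of the given subjects have each value."""
    counts = defaultdict(Counter)
    for s, p, o in triples:
        if s in subjects:
            counts[p][o] += 1
    return counts


hits = matching(triples, [("variable", "temperature")])
counts = facet_counts(triples, hits)
print(sorted(hits))
print(dict(counts["realm"]))
```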