Title: Data Management Concepts
Slide 1
Data Management Concepts for CUAHSI
Michael Piasecki
Department of Civil, Architectural, and Environmental Engineering
Drexel University
Coordination Meeting for Environmental Observatories
Ocean.US, Arlington, VA
March 23, 2005
Drexel University, College of Engineering
Slide 2
Background
Consortium of Universities for the Advancement of
the Hydrologic Sciences, Inc. (CUAHSI)
Hydrologic Information Systems (HIS) Group
The objectives of HIS are:
- a hydrologic information system prototype of mapping and hydrologic data from the Neuse basin
- a structure for the hydrologic data model
- a hydrologic metadata definition that is compatible with the hydrologic data model and is interoperable with other metadata profiles
- a hydrologic case study in interpretation and visualization to estimate and represent streamflow and rainfall continuously in space and time across the stream network in the Neuse basin
Slide 3
The Demands
METADATA
Slide 4
Metadata Standards
ISO 19115: International Organization for Standardization
DIF: Directory Interchange Format, NASA
ANZLIC: Australia New Zealand Land Information Council
DC: Dublin Core
FGDC: Federal Geographic Data Committee
ADN: ADEPT, DLESE, NASA
Slide 5
Related Markup Languages
Slide 6
Semantics
Metadata standards do not resolve semantic heterogeneities.
Metadata (ISO) about dataset X: keyword = Stage Height, thesaurusName = GCMD
Metadata (FGDC) about dataset Y: Theme_Keyword = Gage Height, Theme_Keyword_Thesaurus = USGS
A search for Stage Height finds only dataset X and not dataset Y.
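The mismatch can be reproduced with a minimal Python sketch. The record layout and the `search` function are hypothetical and only illustrate exact keyword matching; the keyword and thesaurus values are the ones shown on the slide.

```python
# Hypothetical in-memory metadata records; only the keyword and
# thesaurus values are taken from the slide.
RECORDS = [
    {"dataset": "X", "standard": "ISO", "keyword": "Stage Height", "thesaurus": "GCMD"},
    {"dataset": "Y", "standard": "FGDC", "keyword": "Gage Height", "thesaurus": "USGS"},
]


def search(records, term):
    """Return the datasets whose keyword equals the term (case-insensitive)."""
    wanted = term.strip().lower()
    return [r["dataset"] for r in records if r["keyword"].lower() == wanted]


# Exact matching cannot know that the two keywords describe the same quantity.
print(search(RECORDS, "Stage Height"))  # ['X']
```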
Slide 7
Semantics Solution
Slide 8
Ontologies for Metadata Profile
Prepare the CUAHSI Metadata Profile for the
Future!
Slide 9
CUAHSI Profile V.1.0
- Extend ISO
- Set as core (metadata elements selected to be used by CUAHSI)
- Controlled Vocabulary
- Create domain list to fit needs
Express in a machine-readable format (OWL/XML), fully interoperable with the original ISO
Slide 10
CUAHSI Profile V.1.0
- Created a CUAHSI Keyword Controlled Vocabulary based on
  - UNESCO International Glossary of Hydrology
  - Global Change Master Directory
- Work still in progress
  - need to identify additional terms
  - cast it in an ontology
Hydrology: Hydrologic Model, Runoff, Direct Runoff, Base Runoff, Surface Runoff, Surface water, Discharge or Flow, Water Depth, Stage Height, Base flow, Pressure, Water yield, Ground water, Ground Water Discharge or Flow, Safe yield, Infiltration, Hydraulic Conductivity, Drainage, Moisture, Wells, Aquifer
Meteorology, Fundamental Parameters: Air Temperature, Dew Point, Humidity, Wind Chill, Evaporation, Land Temperature, Precipitation, Standardized Precipitation Index, Rainfall, Sleet, Surface Snow, Hail, ...
Water Quality, Fundamental Parameters: BOD5, COD, TSS, MLVSS, Ammonia, Phosphate, Dissolved Oxygen, ...
Topography: Barometric Altitude, Bathymetry, Contours, Geotechnical Properties, Land Use Classes, Soil Classification, Landforms, Terrain Elevation, Vegetation, Snow/Ice Cover, Satellite Imagery, ...
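A controlled vocabulary of this kind can be used to check the keywords attached to a dataset. The sketch below is illustrative only: it holds a small excerpt of the terms listed on this slide, the grouping into four categories is an assumption, and the function names are hypothetical.

```python
# Excerpt of the keyword lists on this slide; the assignment of terms
# to these four categories is an assumption made for illustration.
VOCABULARY = {
    "Hydrology": {"Runoff", "Stage Height", "Infiltration", "Aquifer"},
    "Meteorology": {"Air Temperature", "Dew Point", "Humidity", "Rainfall"},
    "Water Quality": {"BOD5", "COD", "TSS", "Dissolved Oxygen"},
    "Topography": {"Bathymetry", "Contours", "Landforms", "Vegetation"},
}


def category_of(keyword):
    """Return the category containing the keyword, or None if it is not controlled."""
    wanted = keyword.strip().lower()
    for category, terms in VOCABULARY.items():
        if wanted in {t.lower() for t in terms}:
            return category
    return None


def unknown_keywords(keywords):
    """Return the keywords that do not appear in the controlled vocabulary."""
    return [k for k in keywords if category_of(k) is None]


print(category_of("stage height"))                    # Hydrology
print(unknown_keywords(["Rainfall", "Gage Height"]))  # ['Gage Height']
```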
Slide 11
Current CUAHSI Status
- Metadata Profile is under CUAHSI internal review
  - report by August 2005
  - CV and metadata beta-tested in HydroViewer application
- Data Formats are under consideration
  - several formats are under consideration for adaptation
  - netCDF, HDF5, SHEF (also GRIB)
- Digital Library System is being built
  - store Arbitrary Digital Objects (ADO) using Storage Resource Broker (SRB)
  - store metadata information in MIF files and also in a PostgreSQL database
- Digital Watershed prototype is being developed
  - GIS-type display of watershed-relevant data compilations
  - Neuse River Basin as test-bed
- Observational Geodatabase concept is tested
  - numerical obs, field obs, lab obs
  - use data cube as conceptual base
Slide 12
More Ontologies
We currently have
What we need is
Ontology Examples
Slide 13
Slide 14
Thank you
Questions?
http://loki.cae.drexel.edu:8080/web/how/me/metadatacuahsi.html
Slide 15
Slide 16
Meta-Semantics
Categorization of Metadata Elements
search
use
Slide 17
Semantics Problem 1
Metadata standards lack domain-specific elements.
- They do not suggest whether area and outlet location should be defined when a watershed is being described.
- They do not incorporate a list of possible stations and variables related to surface water collected by a particular Hydrologic Information Community (HIC).
Slide 18
Resolve Semantic Heterogeneity
e.g. search for Stage Height
Finds data set X and Y
Metadata repository
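One way to obtain this behaviour is to map every keyword to a preferred term before matching. The sketch below assumes a hand-written synonym table that treats Gage Height and Stage Height as the same concept; the table, the record layout, and the function names are hypothetical, and a real system would more likely derive the equivalence from an ontology than from a dictionary.

```python
# Assumed synonym table: variant keyword -> preferred term.
PREFERRED = {
    "stage height": "stage height",
    "gage height": "stage height",
}

RECORDS = [
    {"dataset": "X", "keyword": "Stage Height"},  # ISO record, GCMD thesaurus
    {"dataset": "Y", "keyword": "Gage Height"},   # FGDC record, USGS thesaurus
]


def normalize(term):
    """Map a keyword to its preferred term; unknown terms map to themselves."""
    key = term.strip().lower()
    return PREFERRED.get(key, key)


def search(records, term):
    """Return datasets whose normalized keyword matches the normalized term."""
    wanted = normalize(term)
    return [r["dataset"] for r in records if normalize(r["keyword"]) == wanted]


print(search(RECORDS, "Stage Height"))  # ['X', 'Y']
```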
Slide 19
CUAHSI Profile V.1.0
1) Extend ISO 19115 and use other ISO 191xx family standards
ISO 19108 (temporal objects)
ISO 19103 (units)
ISO 19110
ISO 19115 (general)
CUAHSI Profile
In progress
Features-Profile
TimeSeries-Profile
HUC
Slide 20
CUAHSI Profile V.1.0
2) Set as core
Using a flag to mark the core elements:
<owl:AnnotationProperty rdf:ID="core">
core
true
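Once elements carry such a flag, the core set can be extracted mechanically. The following Python sketch is a hedged illustration, not the profile's actual file: the OWL snippet, the `example.org` namespace, and the three property names are assumptions chosen only to show how a `core` annotation with the value `true` could be read with the standard library.

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/cuahsi-profile#"  # hypothetical namespace

# Hypothetical snippet: two properties flagged as core, one not flagged.
SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:owl="{OWL_NS}" xmlns="{EX_NS}">
  <owl:AnnotationProperty rdf:ID="core"/>
  <owl:DatatypeProperty rdf:ID="title"><core>true</core></owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:ID="abstract"><core>true</core></owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:ID="purpose"/>
</rdf:RDF>"""


def core_elements(owl_xml):
    """Return the rdf:ID of every top-level element annotated with core = true."""
    root = ET.fromstring(owl_xml)
    names = []
    for elem in root:
        flag = elem.find(f"{{{EX_NS}}}core")
        if flag is not None and (flag.text or "").strip() == "true":
            names.append(elem.get(f"{{{RDF_NS}}}ID"))
    return names


print(core_elements(SNIPPET))  # ['title', 'abstract']
```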
Slide 21
CUAHSI Profile V.1.0
Core Metadata Version 1.0
- selected 77 descriptive elements that are CUAHSI core, i.e. that are mandatory for any data set within CUAHSI
- made available online for review and comments
- output is in SDSC formats: MTF (template format) and MIF (interchange format)
- entire ISO 191xx family of metadata norms is also available for inclusion in the CUAHSI profile
Metadata Tree Visualization
Slide 22
CUAHSI Profile V.1.0
Slide 23
CUAHSI Profile V.1.0
MD_Classification Code for Observation Metadata, based on SensorML
Mandatory:
1. SensorDescription
2. SensorName (long)
3. SensorModelNumber
4. SensorType
5. SensorName (short)
6. SensorDeployAgency
7. SensorInServiceDateTime
8. SensorManufacturer
9. ObservableProperty
10. SensorDocumentationOrganization
11. SensorOperatedOrganization
Optional:
11. MeasurementMethod
12. SamplerType
13. SensorIdentificationNumber
14. SensorNotOperationalSinceDateTime
15. SensorPlatformType
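A mandatory/optional split like this lends itself to a simple completeness check. The sketch below is an assumption-laden illustration: it represents a sensor description as a plain dictionary keyed by the element names from this slide (with the spelling of "Sensor" normalized), and the `missing_mandatory` helper and sample values are hypothetical.

```python
# Mandatory element names as listed on this slide.
MANDATORY = (
    "SensorDescription",
    "SensorName (long)",
    "SensorModelNumber",
    "SensorType",
    "SensorName (short)",
    "SensorDeployAgency",
    "SensorInServiceDateTime",
    "SensorManufacturer",
    "ObservableProperty",
    "SensorDocumentationOrganization",
    "SensorOperatedOrganization",
)


def missing_mandatory(record):
    """Return the mandatory elements that are absent or empty in the record."""
    missing = []
    for name in MANDATORY:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical, deliberately incomplete sensor description.
record = {
    "SensorDescription": "Pressure transducer for river stage",
    "SensorType": "pressure transducer",
    "ObservableProperty": "Stage Height",
    "SamplerType": "",  # optional element, ignored by the check
}
print(len(missing_mandatory(record)))  # 8
```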
Slide 24
CUAHSI Profile V.1.0
3) Controlled Vocabulary
Ontologies
Labels
Annotations
Pangloss TOOL
Slide 25
CUAHSI Profile V.1.0
4) Set New Codelists
- MD_Classification Code for security Constraints
- World
- Group
- Owner
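A codelist restricts an element to a fixed set of values, which maps naturally onto an enumeration. The sketch below is a minimal illustration using the three values on this slide; the class and function names are hypothetical, and no access-control behaviour is implied.

```python
from enum import Enum


class SecurityConstraint(Enum):
    """Codelist values for security constraints, as listed on the slide."""
    WORLD = "World"
    GROUP = "Group"
    OWNER = "Owner"


def parse_constraint(value):
    """Convert free text to a codelist entry, rejecting anything outside the list."""
    try:
        return SecurityConstraint(value.strip().title())
    except ValueError:
        raise ValueError(f"{value!r} is not in the security codelist") from None


print(parse_constraint("owner"))  # SecurityConstraint.OWNER
```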
Slide 26
The Solution
Hydrology Community Metadata Profile
Slide 27
e.g. restrict the descriptor code to only have W-station values
Dynamic HTML form using the extension
Program could infer
Slide 28
Metadata Standards
Which one to pick?
- ANZLIC has recommended the use of ISO 19115.
- FGDC is about to recommend that the 3rd version of its Content Standard for Digital Geospatial Metadata (CSDGM) use ISO 19115. Since the beginning they have developed cross-walks from FGDC to ISO.
- DC is based on library science and is therefore not very suitable, even though it is compact. DCMI has also developed cross-walks to ISO.
- ADN is based on a collaboration between ADEPT, DLESE, and NASA (DIF). There is no basic standard and only very few cross-walks exist, none to FGDC, which is interesting as its use has been mandated.
=> ISO 19115 is bound to become the reference standard; it is also the only one that has been conceptualized in UML.
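A cross-walk is, at its simplest, a table that renames elements of one standard to their counterparts in another. The Python sketch below is illustrative and not an official cross-walk: the two keyword entries follow the element names shown on the Semantics slide, the `Title` and `Abstract` entries are assumed for illustration, and the `to_iso` helper is hypothetical.

```python
# Illustrative FGDC -> ISO 19115 element mapping (not an official cross-walk).
FGDC_TO_ISO = {
    "Title": "title",
    "Abstract": "abstract",
    "Theme_Keyword": "keyword",
    "Theme_Keyword_Thesaurus": "thesaurusName",
}


def to_iso(fgdc_record):
    """Rename FGDC elements to ISO names; return (converted, unmapped) dictionaries."""
    converted, unmapped = {}, {}
    for name, value in fgdc_record.items():
        target = FGDC_TO_ISO.get(name)
        if target is None:
            unmapped[name] = value  # no counterpart in this sketch's table
        else:
            converted[target] = value
    return converted, unmapped


fgdc = {
    "Theme_Keyword": "Gage Height",
    "Theme_Keyword_Thesaurus": "USGS",
    "Progress": "Complete",
}
converted, unmapped = to_iso(fgdc)
print(converted)  # {'keyword': 'Gage Height', 'thesaurusName': 'USGS'}
print(unmapped)   # {'Progress': 'Complete'}
```

Note that a cross-walk of this kind only renames elements; the keyword value itself (Gage Height) is carried over unchanged, so the semantic mismatch with Stage Height remains.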