Title: Data Management for CENS
1. Data Management for CENS
- Stasa Milojevic
- Information Studies
- UCLA
2. CENS Data
- CENS will generate massive amounts of
heterogeneous scientific and technical data from
the sensors.
- The data need to be useful for CENS researchers
- Real time
- Archived
- The data also need to be useful for other
researchers in those problem domains (larger
community).
3. Data Management Goals
- Data -> Metadata -> Share with community
<dataset>
  <alternateIdentifier>PLT-GCEM-0311b.1.0</alternateIdentifier>
  <title>Fall 2003 plant monitoring survey -- biomass
  calculated from shoot height and flowering status
  of plants in permanent plots at GCE sampling
  sites 1-10</title>
  <creator>
    <organizationName>Georgia Coastal Ecosystems LTER Project</organizationName>
    <address>
      <deliveryPoint>Dept. of Marine Sciences</deliveryPoint>
      <deliveryPoint>University of Georgia</deliveryPoint>
      <city>Athens</city>
      <administrativeArea>Georgia</administrativeArea>
      <postalCode>30602-3636</postalCode>
      <country>USA</country>
    </address>
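A record like the one above is useful to the wider community only if other tools can read it. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module, of pulling a few discovery fields out of such a record. The fragment is a shortened copy of the slide's excerpt: the title is abbreviated, the identifier is joined across the slide's line break, and the closing `</creator>` and `</dataset>` tags are added only so that it parses. The function name `summarize_dataset` and the keys it returns are hypothetical, and XML namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the excerpt on the slide. The closing </creator> and
# </dataset> tags are an assumption added so the fragment is well-formed.
EML_FRAGMENT = """
<dataset>
  <alternateIdentifier>PLT-GCEM-0311b.1.0</alternateIdentifier>
  <title>Fall 2003 plant monitoring survey</title>
  <creator>
    <organizationName>Georgia Coastal Ecosystems LTER Project</organizationName>
    <address>
      <deliveryPoint>Dept. of Marine Sciences</deliveryPoint>
      <deliveryPoint>University of Georgia</deliveryPoint>
      <city>Athens</city>
      <administrativeArea>Georgia</administrativeArea>
      <postalCode>30602-3636</postalCode>
      <country>USA</country>
    </address>
  </creator>
</dataset>
"""


def summarize_dataset(xml_text):
    """Return a small dict of discovery fields from a <dataset> fragment."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.findtext("alternateIdentifier"),
        "title": root.findtext("title"),
        "creator": root.findtext("creator/organizationName"),
        "city": root.findtext("creator/address/city"),
        # An address may carry several deliveryPoint lines; keep them all.
        "delivery_points": [
            e.text for e in root.findall("creator/address/deliveryPoint")
        ],
    }


summary = summarize_dataset(EML_FRAGMENT)
print(summary["id"], "|", summary["creator"], "|", summary["city"])
```

A real EML document declares namespaces and carries many more elements, so a production reader would need namespace-aware lookups rather than the bare paths used here.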
4. How to make data useful and usable?
- One data model for all of CENS
- Not likely; that presumes that all science
problems are the same
- One data model for each CENS research area
- More promising approach
- Various scientific communities have already agreed
on common models
5. Seismology
- Seismic data have been collected via digital
instruments for over 30 years.
- There are robust and stable standards for
describing seismic data across systems and data
formats (SEED: Standard for the Exchange of
Earthquake Data)
- Consortia to centralize and disseminate seismic
datasets
- IRIS (Incorporated Research Institutions for
Seismology)
- NEES (Network for Earthquake Engineering
Simulation)
6. Habitat Monitoring
- Habitat monitoring research
- Draws upon multiple disciplines and technologies
- Integrates data across a wide range of ecological
scales (chemistry, physiology, ecology, and
environment)
- Available testbeds include embedded microclimate
sensor network and embedded phenology network
(including wildlife and plant monitoring)
- Habitat monitoring data
- Temperature, moisture, and barometric pressure
- Video data
7. James Reserve and habitat monitoring community
- Why did we start with this community?
- One of the initial CENS sensor deployments
- The project is at an early stage of defining data
and metadata requirements
- Data from this project are being used as the
basis for our initial inquiry learning research
in CENS
8. Ecological Metadata Language (EML)
- XML-based standard, developed by and for
the ecological community
- Divided into modules such as eml-access,
eml-attribute, eml-project
- Describes data, literature, software, products
- Not well optimized for sensor data
- Optimized for describing data and not the
derivation of data
- Uses the Morpho client as a cross-platform tool for
creating and organizing data and metadata, either
locally or on a shared network server
9. Ecological Metadata Language (EML)
<coverage>
  <geographicCoverage>
    <geographicDescription>GCE Study Site GCE1 --
    Eulonia, Georgia, USA. Transitional salt
    marsh/upland forest site at the upper reach of
    the Sapelo River near Eulonia, Georgia. The main
    marsh area is to the north of the channel where
    the upland is controlled by DNR. Several small
    creeks lie within the study area. Residential
    development is increasing on the upland areas
    south of the channel. A hydrographic sonde is
    deployed within this site attached to a private
    dock to the south of the main channel near the
    HW-17 bridge.</geographicDescription>
    <boundingCoordinates>
      <westBoundingCoordinate>-81.427321</westBoundingCoordinate>
      <eastBoundingCoordinate>-81.410390</eastBoundingCoordinate>
      <northBoundingCoordinate>31.546173</northBoundingCoordinate>
      <southBoundingCoordinate>31.535095</southBoundingCoordinate>
    </boundingCoordinates>
  </geographicCoverage>
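The bounding coordinates in the coverage excerpt above are what make a dataset discoverable by location. As a minimal sketch, assuming Python's standard `xml.etree.ElementTree` and a shortened copy of the excerpt (description abbreviated, namespaces omitted), the box can be read and used to test whether a point falls inside the study site. The helper names `read_bounding_box` and `contains`, and the two test points, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the coverage excerpt; only the description is abbreviated.
COVERAGE_FRAGMENT = """
<geographicCoverage>
  <geographicDescription>GCE Study Site GCE1 -- Eulonia, Georgia, USA.</geographicDescription>
  <boundingCoordinates>
    <westBoundingCoordinate>-81.427321</westBoundingCoordinate>
    <eastBoundingCoordinate>-81.410390</eastBoundingCoordinate>
    <northBoundingCoordinate>31.546173</northBoundingCoordinate>
    <southBoundingCoordinate>31.535095</southBoundingCoordinate>
  </boundingCoordinates>
</geographicCoverage>
"""


def read_bounding_box(xml_text):
    """Return the four bounding coordinates as floats keyed by side."""
    root = ET.fromstring(xml_text)
    box = root.find("boundingCoordinates")
    return {
        side: float(box.findtext(f"{side}BoundingCoordinate"))
        for side in ("west", "east", "north", "south")
    }


def contains(box, lon, lat):
    """True if (lon, lat) lies inside the box.

    Assumes the box does not cross the antimeridian (west <= east).
    """
    return box["west"] <= lon <= box["east"] and box["south"] <= lat <= box["north"]


box = read_bounding_box(COVERAGE_FRAGMENT)
inside = contains(box, -81.42, 31.54)    # hypothetical point within the site
outside = contains(box, -81.50, 31.54)   # hypothetical point west of the site
print(inside, outside)
```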
10. Describing Instruments
- Sensor Model Language (SensorML)
- Emerging OpenGIS standard for describing sensors
and sensor data
- Developed to support data discovery, data
processing and geolocation
- Can be used for in-situ or remote sensors,
dynamic or static platforms
- Optimized for large sensors and large platforms
- Describes resources for sensor management and
discovery, but not sensor-derived data
11. Sensor Model Language (SensorML)
12. Science and Education
- We need to make the science data useful for
teaching grade 6-12 science.
- This is a problem because the scientific models
describe the data, while the education models describe
lessons (grade level, instruments required for
the lesson, time required to perform the lesson,
educational standards, etc.)
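The mismatch described above can be made concrete by writing the two kinds of record side by side. This is only an illustrative sketch: the class names, field names, and sample values are hypothetical and are not taken from EML, SensorML, or any education metadata standard. The lesson fields follow the list on the slide, and `dataset_ids` is an assumed bridge between the two models.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """What a scientific model describes: the data itself."""
    identifier: str
    title: str
    variables: List[str]
    sampling_interval_seconds: int


@dataclass
class LessonRecord:
    """What an education model describes: the lesson, not the data."""
    title: str
    grade_levels: List[int]
    instruments_required: List[str]
    minutes_required: int
    educational_standards: List[str]
    # Hypothetical bridge: identifiers of datasets the lesson draws on.
    dataset_ids: List[str] = field(default_factory=list)


def lessons_using(dataset, lessons):
    """Return the lessons that reference the given dataset."""
    return [lesson for lesson in lessons if dataset.identifier in lesson.dataset_ids]


# Made-up sample records for illustration only.
microclimate = DatasetRecord(
    identifier="example-microclimate-001",
    title="Microclimate sensor readings",
    variables=["temperature", "moisture", "barometric pressure"],
    sampling_interval_seconds=60,
)
lesson = LessonRecord(
    title="How does temperature change over a day?",
    grade_levels=[6, 7, 8],
    instruments_required=["web browser"],
    minutes_required=45,
    educational_standards=["(placeholder standard)"],
    dataset_ids=["example-microclimate-001"],
)
matches = lessons_using(microclimate, [lesson])
print([m.title for m in matches])
```

Apart from the assumed `dataset_ids` link, the two records share no fields, which is why neither model can stand in for the other.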
13. Science and Education Data Models
14. Science and Education Data Models: Possible
Solution
- Manage scientific data with models appropriate to
the scientific community
- Construct filters and tools to make scientific
data useful to K-12 students and teachers
- Reduce granularity of data (e.g. temperature at
hourly, rather than minute intervals)
- Develop tools to display these data (e.g. simple
charts and graphs)
- Describe filters and tools using models
appropriate to educational community (e.g. LOM,
SCORM, GEM)
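The filter and display steps listed above can be sketched with the Python standard library alone. The example below averages minute-level temperature readings into hourly values and prints a simple text chart. The readings are synthetic, and the function names `hourly_means` and `text_chart` are hypothetical; this is one possible filter, not a description of CENS tooling.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def hourly_means(readings):
    """Average (timestamp, value) readings into one value per clock hour."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Truncate each timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return [(hour, mean(values)) for hour, values in sorted(buckets.items())]


def text_chart(rows, scale=1.0):
    """Render (hour, value) rows as a plain-text bar chart, one line per hour."""
    return "\n".join(
        f"{hour:%H:%M} {'#' * round(value * scale)} {value:.1f}"
        for hour, value in rows
    )


# Synthetic data: three hours of one-per-minute readings, stepping up
# one degree each hour. The date is arbitrary.
start = datetime(2004, 1, 1, 0, 0)
readings = [(start + timedelta(minutes=m), 10.0 + m // 60) for m in range(180)]

hourly = hourly_means(readings)
print(text_chart(hourly))
```

Averaging is only one way to reduce granularity. Taking the hourly minimum, maximum, or a single sample would be equally simple, and the choice depends on what the lesson is meant to show.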
15. Science and Education Data Models: Possible
Solution
Sets of collected data are run through filters and
tools to produce understandable tables, charts, and
graphs.
16. Current accomplishments and next steps
- James Reserve
- Map current data structures to EML and SensorML
to determine the fit
- Analyze scientific papers and documents to
determine required data elements
- Create use scenarios
- Interview scientists
17. Current accomplishments and next steps
- Education
- Work with inquiry module team to identify data
requirements
- Interview teachers
18. Discussion and Conclusions
- Ensuring accessibility and integrity of CENS data
to multiple communities requires:
- Understanding of the practices of each community
- Understanding of relationships between those
practices
- Means to bridge the gaps
19. Acknowledgements
- Christine Borgman
- Andrew Wu
- Bill Sandoval
- Noel Enyedy
- Joe Wise
- Mike Wimbrow