Title: EarthChem Solid Earth Geochemistry in Geoinformatics
1 EarthChem: Solid Earth Geochemistry in Geoinformatics
2 Why Do We Need Data Management in Solid Earth Geochemistry?
- Geochemical data are essential for answering fundamental questions about the composition, structure, and evolution of the Earth, its oceans, continents, and climate
Problems:
- Data are dispersed in the literature, often not in electronic form
- Compilations by investigators are time-consuming, redundant, and often incomplete
- Links among related data are missing
- Data are lost due to incomplete publication
3 Data Management in Solid Earth Geochemistry
SedDB
4 PetDB, NAVDAT, GEOROC
- Offer the only generally accessible compilations of large volumes of data on the compositional variation of igneous rocks.
- Provide desktop access to the entire published geochemical literature within minutes,
  - allowing researchers to address questions that otherwise would be dropped due to the large effort required to find and compile the data.
  - allowing students to explore the global dataset within a formerly unimaginable timeframe that can be accommodated in the course schedule.
5 PetDB, NAVDAT, GEOROC
- Compile and serve ALL raw geochemical data
- Share a common relational data model (Lehnert et al. 2000)
- Data fully integrated
- Wide range of sample and analytical metadata
- Generally applicable for sample-based petrological and chemical data for rocks
- Each value linked to original publication or producer
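The idea of a relational model in which every value carries its provenance can be illustrated with a minimal sketch. The table and column names below are hypothetical and much simpler than the model of Lehnert et al. (2000); the sample, citation, and values are made-up placeholders, not real data.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
# It is NOT the published PetDB/GEOROC/NAVDAT data model.
SCHEMA = """
CREATE TABLE reference (
    ref_id   INTEGER PRIMARY KEY,
    citation TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id TEXT PRIMARY KEY,
    rock_type TEXT
);
CREATE TABLE method (
    method_id  INTEGER PRIMARY KEY,
    technique  TEXT,
    laboratory TEXT
);
CREATE TABLE chem_value (
    value_id  INTEGER PRIMARY KEY,
    sample_id TEXT    NOT NULL REFERENCES sample(sample_id),
    ref_id    INTEGER NOT NULL REFERENCES reference(ref_id),
    method_id INTEGER REFERENCES method(method_id),
    item      TEXT    NOT NULL,
    value     REAL    NOT NULL,
    unit      TEXT    NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding placeholder (made-up) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO reference VALUES (1, 'Placeholder citation (made up)')")
    conn.execute("INSERT INTO sample VALUES ('DEMO-001', 'basalt')")
    conn.execute("INSERT INTO method VALUES (1, 'XRF', 'Placeholder lab')")
    conn.executemany(
        "INSERT INTO chem_value (sample_id, ref_id, method_id, item, value, unit) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("DEMO-001", 1, 1, "MgO", 8.0, "wt%"),
            ("DEMO-001", 1, None, "Sr", 120.0, "ppm"),  # analytical method not reported
        ],
    )
    return conn


def values_with_provenance(conn, sample_id):
    """Return every value for a sample together with its citation and technique."""
    return conn.execute(
        """
        SELECT v.item, v.value, v.unit, r.citation, m.technique
        FROM chem_value AS v
        JOIN reference AS r ON r.ref_id = v.ref_id
        LEFT JOIN method AS m ON m.method_id = v.method_id
        WHERE v.sample_id = ?
        ORDER BY v.item
        """,
        (sample_id,),
    ).fetchall()


if __name__ == "__main__":
    for row in values_with_provenance(build_demo_db(), "DEMO-001"):
        print(row)
```

The `LEFT JOIN` on the method table keeps values whose analytical metadata are missing, so incomplete records remain visible rather than silently disappearing from a query.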
6 Interactive, Dynamic Web Interfaces
- Select, filter, view, download customized data sets
- Explore metadata
7 Other Features (database-specific)
- Visualization tools (NAVDAT)
- Interactive map interfaces (NAVDAT)
- Disparate data for individual samples linked via unique sample IDs (PetDB)
- Interoperability (PetDB)
8 Data Quality Control
Comprehensive analytical metadata
- allow proper data quality assessment
- can be used as data quality filters
Example: PetDB interface
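How analytical metadata can act as a data quality filter is sketched below. This is a minimal illustration under stated assumptions: the `Analysis` fields (`technique`, `year`, `precision_pct`) and the filter criteria are hypothetical choices, not the actual options of the PetDB interface, and the records are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Analysis:
    """One chemical value plus hypothetical analytical metadata fields."""
    sample_id: str
    item: str
    value: float
    technique: Optional[str] = None        # e.g. "ICP-MS"; None if not reported
    year: Optional[int] = None             # year of analysis or publication
    precision_pct: Optional[float] = None  # reported relative precision in percent


def quality_filter(
    analyses: Iterable[Analysis],
    techniques: Optional[Set[str]] = None,
    min_year: Optional[int] = None,
    max_precision_pct: Optional[float] = None,
) -> List[Analysis]:
    """Keep analyses that satisfy every criterion the caller supplies.

    A record with missing metadata fails any criterion that needs that field.
    """
    kept = []
    for a in analyses:
        if techniques is not None and a.technique not in techniques:
            continue
        if min_year is not None and (a.year is None or a.year < min_year):
            continue
        if max_precision_pct is not None and (
            a.precision_pct is None or a.precision_pct > max_precision_pct
        ):
            continue
        kept.append(a)
    return kept


# Placeholder records (made up for illustration).
DEMO = [
    Analysis("DEMO-001", "Nd", 10.0, technique="ICP-MS", year=2001, precision_pct=2.0),
    Analysis("DEMO-002", "Nd", 12.0, technique="XRF", year=1985, precision_pct=10.0),
    Analysis("DEMO-003", "Nd", 11.0),  # no analytical metadata reported
]

if __name__ == "__main__":
    for a in quality_filter(DEMO, techniques={"ICP-MS"}, min_year=1990):
        print(a)
```

Treating missing metadata as a failed criterion is a design choice; a more permissive filter could instead keep such records and flag them.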
9 Content of PetDB, NAVDAT, GEOROC
- > 4 million individual chemical values
- for > ca. 230,000 igneous rock samples
- from > 6,300 publications
10 Benefits of Rigorous Scientific Data Management
- Maximized Utility of the Geochemical Dataset
- Enhanced Data Quality Control
- Data Integration and Visualization across the Geosciences
- Impact on Science Education
11 Maximize Utility of the Geochemical Dataset
"More than just a timesaver, these databases make it possible to address both global and regional questions that I would otherwise never bother to attempt. The amount of time saved is such that countless ideas cross from the realm of the totally impractical for a busy working scientist into the realm of easy to squeeze into a spare half hour. Simply put, I can now test theoretical ideas against all the world's data, and can readily compare any specific region I am working on to its global counterparts. This is a monumental benefit."
Paul Asimov, California Institute of Technology (EarthChem User Survey, January 2005)
12 Scientific Return
> 120 papers that cite PetDB and GEOROC
- Plank, T. Constraints from Thorium/Lanthanum on Sediment Recycling at Subduction Zones and the Evolution of the Continents. Journal of Petrology 46, 921-944, 2005.
- Ballentine, C. J. et al. Neon isotopes constrain convection and volatile origin in the Earth's mantle. Nature 433, 33-38, 2005.
- Salters, V. and Stracke, A. Composition of the depleted mantle. G3, 2004.
- Cipriani, A. et al. Oceanic crust generated by elusive parents: Sr and Nd isotopes in basalt-peridotite pairs from the Mid-Atlantic Ridge. Geology 32 (8), 657-660, 2004.
- Herzberg, C. Geodynamic Information in Peridotite Petrology. Journal of Petrology 45, 2507-2530, 2004.
- Hirschmann, M. et al. Alkalic magmas generated by partial melting of garnet pyroxenite. Geology 31, 2003.
- Kellogg, J. B., Jacobsen, S. B., and O'Connell, R. J. Modeling the distribution of isotopic ratios in geochemical reservoirs. Earth Planet. Sci. Letters 217, 2004.
13 Application to Education
14 Challenges for Database Providers
- Optimize interaction with the data for a broad audience ranging from the casual to the expert user
- Efficiently populate databases with legacy and new data
- Integrate data with the larger Earth Science dataset
- Ensure longevity of data systems
15 The Problem of Distributed Datasets
A typical science question: What is the relationship between what is being subducted at the Aleutian trench and what is being erupted in Aleutian volcanoes?
- Need Nd, Sr, Pb, Hf isotope ratios, and incompatible trace element compositions
Aleutian Volcanics
North Pacific (Juan de Fuca Ridge) MORB
16 The EarthChem Consortium
Founded in 2003 by R. Carlson, A. Hofmann, K. Lehnert, and D. Walker
- Build an integrated data management and information system for solid earth geochemistry,
  - based on and expanding the collaboration of PetDB, GEOROC, and NAVDAT.
- Nurture synergies among projects
- Minimize duplication of efforts
- Share tools and approaches
17 EarthChem Activities
- Community Workshop (October 2003, Carnegie Institution of Washington)
  - Reviewed the current status of data management efforts in Solid Earth Geochemistry
  - Discussed ways in which these activities can grow and collaborate to best participate in and contribute to the Cyber Infrastructure revolution in the Geosciences
- Exhibits and demos at AGU 2003 and 2004 and GSA 2004
- Presentations at GSA 2003, AGU 2004, and various workshops
- Session on Geoinformatics for Geochemistry at AGU 2004, co-chaired with GERM
- Web site at www.earthchem.org
18 EarthChem Priorities
- Build the EarthChem portal as a central access point to a system of federated geochemistry databases (One-Stop Shop for Geochemical Data)
- Ensure efficient and continuing update and expansion of data holdings
Proposal submitted to NSF (EAR IF), January 2005: K. Lehnert, D. Walker
19 One-Stop-Shop for Geochemical Data
Users
Geoscience CI
Interoperability
EARTHCHEM PORTAL
- Uniform data submission
- Search capability across federated databases
- Standardized, integrated data output
- Generally applicable tools for DQ assessment and data analysis/visualization
SedDB
and more...
20 Building the One-Stop Shop
- Interface federated databases
- Implement web services: SOAP/XML/WSDL, OAI, OGC
- Standardize metadata (ISO 19115, OGC-GML)
- Systematize nomenclature and vocabulary (ontologies)
- Register database schemas with GEON?
- Implement unique sample identification through use of the International Geo Sample Number
- Build user interfaces with flexible data selection and extraction, tiered for different levels of expertise
- Use customized GEON Portal technology?
- Use EarthChem map viewer, GeoMapApp browser, or other tools to integrate with other data types such as seismic tomography, gravity, structural features, etc.
- Provide tools for data evaluation such as
  - interactive discriminant plots, P/T calculators, data quality filters
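The combination of a search across federated databases with unique sample identification can be sketched as follows. This is only an illustration under assumptions: the in-memory "sources" stand in for remote databases (no real web service is called), the function names are hypothetical, and the identifiers are placeholders rather than registered International Geo Sample Numbers.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Record = Dict[str, object]
Source = Callable[[str], List[Record]]


def make_source(name: str, records: List[Record]) -> Source:
    """Wrap a list of records as a searchable stand-in for one database."""
    def search(item: str) -> List[Record]:
        # Tag each hit with the database it came from so provenance is kept.
        return [dict(r, source=name) for r in records if r["item"] == item]
    return search


def federated_search(sources: List[Source], item: str) -> Dict[str, List[Record]]:
    """Query every source and group the hits by unique sample identifier."""
    merged: Dict[str, List[Record]] = defaultdict(list)
    for search in sources:
        for rec in search(item):
            merged[str(rec["sample_uid"])].append(rec)
    return dict(merged)


# Placeholder holdings (made up); "sample_uid" plays the role of a unique sample ID.
database_a = make_source("database_A", [
    {"sample_uid": "DEMO0001", "item": "Sr", "value": 100.0, "unit": "ppm"},
    {"sample_uid": "DEMO0002", "item": "Nd", "value": 10.0, "unit": "ppm"},
])
database_b = make_source("database_B", [
    {"sample_uid": "DEMO0001", "item": "Sr", "value": 101.0, "unit": "ppm"},
    {"sample_uid": "DEMO0003", "item": "Sr", "value": 250.0, "unit": "ppm"},
])

if __name__ == "__main__":
    hits = federated_search([database_a, database_b], "Sr")
    for uid, records in sorted(hits.items()):
        print(uid, [(r["source"], r["value"]) for r in records])
```

Without a shared identifier, the two Sr values for the same physical sample would appear as unrelated records; grouping by the identifier is what lets a portal present them side by side.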
21 The Bottleneck: Data Entry
- Difficult to find knowledgeable data managers
- Missing metadata (e.g. locations, analytical info)
- No unique sample identification
- Missing standards for data presentation (e.g. units)
- Unavailable data files
- Errors in original data tables
- Missing cooperation from authors
EXPENSIVE!
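The missing unit standards listed above are one reason data entry is laborious: values reported as wt%, ppm, or ppb must be brought to a common unit before they can be compared. A minimal sketch, assuming all three are mass fractions and using a hypothetical helper name `to_ppm`:

```python
# Conversion factors to ppm by mass (1 wt% = 10,000 ppm; 1 ppb = 0.001 ppm).
_TO_PPM = {
    "wt%": 10_000.0,
    "ppm": 1.0,
    "ppb": 0.001,
}


def to_ppm(value: float, unit: str) -> float:
    """Convert a mass-fraction concentration to ppm.

    Raises ValueError for a unit that is not in the table, so an
    unrecognized unit is reported instead of being guessed at.
    """
    try:
        factor = _TO_PPM[unit.strip()]
    except KeyError:
        raise ValueError(f"unrecognized unit: {unit!r}") from None
    return value * factor


if __name__ == "__main__":
    print(to_ppm(0.5, "wt%"))
    print(to_ppm(250.0, "ppb"))
```

Failing loudly on an unknown unit mirrors the point of the slide: a value whose unit cannot be determined needs a human decision, which is part of what makes data entry costly.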
22 Efficient Update and Expansion of Data Holdings
- Encourage direct data contributions from the community
- Build on-line data submission capability for future data (compliance with data policies for science programs!)
- Provide services for on-line storage of routine data about analytical procedures (MyEarthChem)
- Facilitate incorporation of existing large data compilations
- Provide technical assistance to investigators who want to compile new datasets
23 Facilitate Community Contributions
- Assist contributors with design, implementation, and population of databases.
- Serve databases via the EarthChem portal.
- Contributed datasets will retain their identity within the EarthChem system.
- PILOT PROJECT
  - A relational database of the Mexican Volcanic Belt
  - Straub, Ferrari, Langmuir
24 Expansion of Data Holdings
- Generate additional datasets
- Identify and prioritize new target datasets through community outreach and the EarthChem Advisory Committee
- Data entry by dedicated EarthChem personnel
25 Integration with Science and GeoInformatics
26 A User's Vision
"in theory the best thing would be one big Geo-database where all different types of geochemical reservoirs are included and all analytical tools as well and where you can search for either regions or reservoir type or method... ok that's a big goal."