Title: The Earth System Grid
1 The Earth System Grid
UCRL-PRES-148116
- Presented by Dean N. Williams
- PIs: Ian Foster (ANL), Don Middleton (NCAR), and Dean Williams (LLNL)
- http://www.earthsystemgrid.org
Presented at the EO GRID Workshop, Frascati, Italy
2 Earth System Grid (ESG) Overview
- Funded by the Scientific Discovery through Advanced Computing (SciDAC) program, ESG seeks a new paradigm in the climate change community, evolving from centralized data sharing to distributed data sharing.
- Enabling geographically distributed teams of researchers to effectively and rapidly acquire knowledge and understanding of massive climate data holdings.
- Multiple interfaces to ESG will allow researchers to focus on science rather than on issues with data receipt, format, and data set manipulation.
3 ESG: Why is ESG Important to the U.S. Climate Change Program?
- Climate model output and quality observations are vital to providing timely assessments of climate change and its impacts.
- Recent U.S. and IPCC assessment efforts made it clear that the lack of access to model simulations is a major problem for future assessments.
- Access to retrospective climate data (input and output) is needed to enable a feedback mechanism that ties researchers directly back to quality control and diagnostics of models.
- Researchers require access to format-independent climate and observational data for case-study training.
- In the U.S., climate simulation can be viewed as a systems problem, requiring a team of multiple agencies and institutions working together in collaboration.
4 ESG: U.S. Collaborations and Development
- ANL: Computational grids, grid-based applications
- LBNL: Climate storage facility
- LLNL: Model diagnostics and inter-comparison
- USC/ISI: Computational grids, grid-based applications
- ORNL: Climate storage and computational resources
- LANL: Next-generation coupled models and computing
- NCAR: Climate change prediction and scenarios
5 ESG: Requirements Priority Matrix
6 ESG: U.S. Department of Energy (DOE) Next Generation Internet (NGI) Project
- ESG-I (past)
- Focused on developing techniques for high-speed data movement between sites and users (e.g., GridFTP, the secure, highly efficient file transfer service developed by ANL as part of Globus)
- Developed replica catalogs for keeping track of data locations
- Developed request managers for coordinating multiple transfers
- Developed a grid-enabled version of LLNL's data analysis package
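The relationship between a replica catalog and a request manager can be illustrated with a minimal Python sketch. It is not the ESG-I or Globus implementation: the class names, the per-site cost table, the file names, and the `example.org` hosts are all hypothetical and chosen only for illustration. The catalog maps a logical file name to the sites that hold a copy, and the request manager plans one transfer per requested file by picking the lowest-cost replica.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalog:
    """Maps a logical file name to the sites holding a physical copy (hypothetical)."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name, site, url):
        # Record that `site` holds a copy of `logical_name` at `url`.
        self.entries.setdefault(logical_name, {})[site] = url

    def locations(self, logical_name):
        # Return a copy so callers cannot modify the catalog by accident.
        return dict(self.entries.get(logical_name, {}))


class RequestManager:
    """Plans one transfer per requested file, choosing the cheapest replica."""

    def __init__(self, catalog, site_cost):
        self.catalog = catalog
        self.site_cost = site_cost  # assumed per-site cost; lower is better

    def plan(self, logical_names):
        plan = []
        for name in logical_names:
            replicas = self.catalog.locations(name)
            if not replicas:
                raise KeyError(f"no replica registered for {name}")
            # Sites without a known cost are treated as the worst choice.
            best_site = min(
                replicas, key=lambda s: self.site_cost.get(s, float("inf"))
            )
            plan.append((name, best_site, replicas[best_site]))
        return plan


# Made-up sample entries; the hosts and file names are placeholders.
catalog = ReplicaCatalog()
catalog.register("pcm_run1_tas.nc", "LLNL",
                 "gsiftp://llnl.example.org/data/pcm_run1_tas.nc")
catalog.register("pcm_run1_tas.nc", "NCAR",
                 "gsiftp://ncar.example.org/data/pcm_run1_tas.nc")
catalog.register("pcm_run1_pr.nc", "LBNL",
                 "gsiftp://lbnl.example.org/data/pcm_run1_pr.nc")

manager = RequestManager(catalog, site_cost={"LLNL": 3.0, "NCAR": 1.0, "LBNL": 2.0})
transfer_plan = manager.plan(["pcm_run1_tas.nc", "pcm_run1_pr.nc"])
for name, site, url in transfer_plan:
    print(f"{name}: fetch from {site} at {url}")
```

In this sketch the plan only names the source of each transfer; the data movement itself would be handed to a transfer service such as GridFTP.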
7 ESG: ESG-I Architecture
8 ESG: ESG-I Team Presented Their Work at Supercomputing 2001
[Figure labels: RAID; parallel disk system; LDAP/Server Metadata Catalog (LLNL); LDAP/Server Metadata Catalog (LBNL); CLOUD; TERRAIN; U; V]
9 ESG: DOE SciDAC Project
- ESG-II (present)
- Building upon the substantial work of ESG-I
- Grid-wide services supporting authentication, authorization, data discovery, and user-specified analysis
- Metadata services supporting remote data browsing, querying, accessing, displaying, etc.
- Filtering services performing intelligent, model-specific analysis before delivering the results to the user
- Integration of next-generation data analysis and visualization applications (such as ongoing work at LLNL and NCAR), web-based data portals and other thin clients supporting the Distributed Oceanographic Data System (DODS), and collaborative problem-solving environments
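The idea behind a filtering service, reducing data at the source so that only the result travels to the user, can be sketched in a few lines of Python. This is an illustrative assumption, not ESG-II code: the function name, the tiny grid, and the temperature values are made up, and the mean is unweighted, whereas a real climate analysis would normally weight grid cells by area.

```python
def regional_mean(field, lats, lons, lat_range, lon_range):
    """Average field[i][j] over grid points inside the requested box.

    Hypothetical server-side filter: the full field stays at the data
    site and only a single number is returned to the user.
    Note: this is a plain (unweighted) mean, kept simple for illustration.
    """
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    total = 0.0
    count = 0
    for i, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        for j, lon in enumerate(lons):
            if lon_min <= lon <= lon_max:
                total += field[i][j]
                count += 1
    if count == 0:
        raise ValueError("requested region contains no grid points")
    return total / count


# Made-up 3 x 4 surface temperature field (kelvin) on a coarse grid.
lats = [-30.0, 0.0, 30.0]
lons = [0.0, 90.0, 180.0, 270.0]
temperature = [
    [290.0, 291.0, 292.0, 293.0],
    [298.0, 299.0, 300.0, 301.0],
    [288.0, 289.0, 290.0, 291.0],
]

tropics_mean = regional_mean(temperature, lats, lons, (-10.0, 10.0), (0.0, 180.0))
print(tropics_mean)
```

Here twelve grid values are reduced to one before delivery; on model output measured in terabytes, the same pattern is what keeps wide-area transfers manageable.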
10 ESG: ESG-II Architecture
11 ESG: Metadata Services
- ESG clients, API, and user interfaces: publishing; analysis and visualization; search and discovery; administration; browsing and display
- High-level metadata services: metadata extraction; metadata browsing; metadata query; metadata annotation; metadata and data registration; metadata display; metadata validation; metadata aggregation; metadata discovery
- Core metadata services: metadata access (update, insert, delete, query); service translation library
- Metadata holdings: Dublin Core XML files (mirror); data and metadata catalog; Dublin Core database; COARDS database; comments XML files
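The four core metadata access operations named above (update, insert, delete, query) can be sketched as a small in-memory store. This is a minimal illustration under stated assumptions, not the ESG metadata service: the class, its method signatures, and the sample records are hypothetical. The record fields `title`, `creator`, and `format` are borrowed from the Dublin Core element set.

```python
class MetadataStore:
    """In-memory stand-in for a metadata holding, keyed by dataset identifier."""

    def __init__(self):
        self._records = {}

    def insert(self, identifier, record):
        # Refuse to overwrite an existing record silently.
        if identifier in self._records:
            raise KeyError(f"{identifier} is already registered")
        self._records[identifier] = dict(record)

    def update(self, identifier, **changes):
        # Raises KeyError if the identifier is unknown.
        self._records[identifier].update(changes)

    def delete(self, identifier):
        del self._records[identifier]

    def query(self, **criteria):
        """Return sorted identifiers whose records match every criterion exactly."""
        return sorted(
            ident
            for ident, rec in self._records.items()
            if all(rec.get(key) == value for key, value in criteria.items())
        )


# Made-up sample records using Dublin Core style element names.
store = MetadataStore()
store.insert("amip-run-01", {"title": "AMIP run 01",
                             "creator": "Example Group A", "format": "netCDF"})
store.insert("cmip-run-07", {"title": "CMIP run 07",
                             "creator": "Example Group B", "format": "netCDF"})

store.update("cmip-run-07", creator="Example Group A")
print(store.query(format="netCDF"))
print(store.query(creator="Example Group A", title="AMIP run 01"))

store.delete("amip-run-01")
print(store.query(format="netCDF"))
```

The higher-level services in the diagram (browsing, discovery, validation, and so on) would be built on top of primitives like these rather than touching the holdings directly.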
12 ESG: Collaboration Network
Grid and Network Infrastructure
13 ESG: Example of a Web-based Data Portal
(currently serving 40 simulations of AMIP, CMIP, and PCM data, and growing)
14 ESG: Example of a Client Application
15 ESG: Example of Script Access
- The next-generation language, Python, is used to access the Earth System Grid at LLNL
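As one illustration of script access, a Python script might request a subset of a remote dataset through a DODS server, which accepts array-subset constraints of the form `variable[start:stride:stop]` appended to the dataset URL. The sketch below only builds such a URL. The helper name, the host `dods.example.org`, the file name, and the variable `tas` are assumptions made for illustration and do not describe the actual ESG interface at LLNL.

```python
def dods_url(base_url, variable, slices, response="dods"):
    """Build a DODS request URL with an array-subset constraint.

    Each entry in `slices` is a (start, stride, stop) tuple for one
    dimension, written in the [start:stride:stop] hyperslab form.
    `response` is the suffix selecting the reply type, for example
    "dds" (structure), "das" (attributes), or "dods" (binary data).
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{base_url}.{response}?{variable}{hyperslab}"


# Hypothetical dataset: first 12 time steps on a subset of the grid.
url = dods_url(
    "http://dods.example.org/data/pcm_run1.nc",
    "tas",
    [(0, 1, 11), (0, 1, 45), (0, 1, 89)],
)
print(url)
```

A script would then fetch this URL with any HTTP client, so only the requested slab crosses the network instead of the whole file.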
16 ESG: Concluding Statements
- ESG is a highly collaborative effort and will allow users to quickly access data storage facilities holding petabytes of raw or processed data in an application-independent manner.
- Payoffs of this distributed collaborative infrastructure would include:
  - Distributed data sharing
  - Simplified discovery of climate data
  - Large-scale climate data processing and analysis
  - Increased collaboration among climate research scientists
  - Aid in climate assessments and estimates of future climate variability and trends
- For more information on ESG, visit our website at http://www.earthsystemgrid.org