Title: GENIE: Delivering eScience to the environmental scientist
1 GENIE: Delivering e-Science to the environmental
scientist
2 Contents
- What is GENIE?
- The problem: Thermohaline circulation
- The solution: e-Science
- Delivering Grid computing resources
- Delivering Grid resource access
- Delivering Grid data management
- The results: Scientific achievements
- Summary
- Acknowledgements
3 What is GENIE?
- Grid ENabled Integrated Earth system model.
- Investigate long-term changes to the Earth's
  climate (i.e. global warming) by integrating
  numerical models of the Earth system.
- e-Science aims:
  - Flexibly couple together state-of-the-art
    components to form a unified Earth System Model
    (ESM).
  - Execute the resultant ESM on a Grid infrastructure.
  - Share the data produced by simulation runs.
  - Provide high-level open access to the system,
    creating and supporting a virtual organisation of
    Earth System modellers.
4 GENIE model framework
3D atmosphere
Atmospheric CO2
2D sea ice
3D ice sheets
3D ocean
2D land surface
Land biogeochemistry
Ocean biogeochemistry
Ocean sediments
5 The problem: Thermohaline circulation
- Ocean transports heat through the global
  conveyor belt.
- Heat transport controls global climate.
- Wish to investigate strength of model ocean
  circulation as a function of two external
  parameters.
- Use GENIE-Trainer.
- Wish to perform 31 × 31 = 961 individual
  simulations.
- Each simulation takes ≈ 4 hours to execute on a
  typical Intel P3/1GHz, 256MB RAM machine
  ⇒ time taken for 961 sequential runs ≈ 163
  days!
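The size of the sweep and the sequential run time can be checked with a short sketch. Only the 31 × 31 grid and the roughly 4 hours per run come from the slide; the parameter ranges and the function names are placeholders chosen for illustration.

```python
import itertools

N_STEPS = 31          # grid points per parameter (from the slide)
HOURS_PER_RUN = 4.0   # approximate time per simulation (from the slide)


def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def build_sweep(range_x, range_y, n=N_STEPS):
    """Return every (x, y) pair on an n-by-n parameter grid."""
    xs = linspace(*range_x, n)
    ys = linspace(*range_y, n)
    return list(itertools.product(xs, ys))


def sequential_days(n_runs, hours_per_run=HOURS_PER_RUN):
    """Wall-clock days needed to run all simulations one after another."""
    return n_runs * hours_per_run / 24.0


# Hypothetical ranges: the slides do not state the actual parameter values.
sweep = build_sweep((-0.3, 0.3), (-0.3, 0.3))
print(len(sweep))                               # number of simulations
print(round(sequential_days(len(sweep)), 1))    # days if run sequentially
```

With exactly 4 hours per run this gives about 160 days; the quoted 163 days corresponds to runs averaging slightly more than 4 hours each.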
6 The solution: Delivering Grid computing resources
- Use flocked Condor pools between SReSC, DoC at
  Imperial College London, and LeSC (≈ 200 Linux and
  Solaris nodes).
- Time taken for 961 Condor runs ≈ 3 days!
- Advantages of Condor:
  - simulations are nearly parallel.
  - automatic checkpointing and job migration.
  - Condor File Transfer Mechanism.
- Problems:
  - Firewalls! Overcome by designating and using
    port ranges specified by the Condor and firewall
    administrators.
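As a rough check on these figures, the sketch below compares the observed speed-up with an idealised bound. It assumes, purely for illustration, that all 200 nodes are dedicated, identical and always available; a flocked Condor pool does not guarantee any of this.

```python
import math

N_RUNS = 961             # simulations in the experiment (from the slides)
HOURS_PER_RUN = 4.0      # approximate time per simulation (from the slides)
SEQUENTIAL_DAYS = 163.0  # quoted sequential time
CONDOR_DAYS = 3.0        # quoted time on the flocked Condor pools
N_NODES = 200            # approximate pool size


def observed_speedup(sequential_days, parallel_days):
    """Ratio of sequential to parallel wall-clock time."""
    return sequential_days / parallel_days


def ideal_wall_clock_hours(n_runs, n_nodes, hours_per_run):
    """Best case: runs proceed in full batches on dedicated, identical nodes."""
    batches = math.ceil(n_runs / n_nodes)
    return batches * hours_per_run


speedup = observed_speedup(SEQUENTIAL_DAYS, CONDOR_DAYS)
ideal_hours = ideal_wall_clock_hours(N_RUNS, N_NODES, HOURS_PER_RUN)
print(f"observed speed-up: about {speedup:.0f}x")
print(f"idealised time on {N_NODES} dedicated nodes: {ideal_hours:.0f} hours")
```

The gap between the idealised figure and the observed 3 days is plausibly due to the pools being shared machines of varying speed and availability, though the slides do not break this down.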
7 The solution: Delivering Grid resource access
- User authenticated with their X.509 e-Science
  certificate.
- Ability to create experiments:
  - create 961 simulation input files.
  - create necessary files for Condor.
  - create metadata files.
- Ability to submit experiments to flocked Condor
  pool.
- Ability to monitor and manage progress of running
  experiments.
- Ability to archive resultant data in database.
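The "create experiment" step can be sketched as follows. This is an illustration, not the portal's actual code: the input-file layout, the file names, the executable name `genie.exe`, the JSON metadata format and the choice of the vanilla universe are all assumptions. DFWX and DFWY are the two freshwater-flux parameters named on the results slide.

```python
import json
import tempfile
from pathlib import Path


def submit_description(executable, n_runs):
    """Return the text of a Condor submit description queuing n_runs jobs."""
    lines = [
        "universe = vanilla",                        # assumption
        f"executable = {executable}",
        "arguments = run_$(Process).in",
        "transfer_input_files = run_$(Process).in",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "output = run_$(Process).out",
        "error = run_$(Process).err",
        "log = experiment.log",
        f"queue {n_runs}",
    ]
    return "\n".join(lines) + "\n"


def create_experiment(directory, name, owner, sweep, executable="genie.exe"):
    """Write one input file per run, a submit file and a metadata file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for i, (dfwx, dfwy) in enumerate(sweep):
        # Hypothetical input format: one KEY=value pair per line.
        (directory / f"run_{i}.in").write_text(f"DFWX={dfwx}\nDFWY={dfwy}\n")
    submit_file = directory / "experiment.sub"
    submit_file.write_text(submit_description(executable, len(sweep)))
    metadata = {
        "experiment": name,
        "owner": owner,
        "runs": len(sweep),
        "parameters": ["DFWX", "DFWY"],
    }
    (directory / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return submit_file


# Small demonstration with four runs instead of 961.
with tempfile.TemporaryDirectory() as tmp:
    demo_sweep = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1)]
    demo_submit = create_experiment(tmp, "demo", "alice", demo_sweep)
    created = sorted(p.name for p in Path(tmp).iterdir())
    submit_text = demo_submit.read_text()
print(created)
```

The resulting `experiment.sub` would then be handed to `condor_submit`; a full experiment would pass all 961 parameter pairs instead of the four used here.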
8 GENIE channel in web portal
9 Example of a typical Condor request through the portal
1. User makes request through web portal to list
status of jobs on Condor.
2. Servlet container interprets request and makes
a system level call to Condor.
3. Condor responds with status of user's jobs.
4. Response is translated by container and
returned to user as web page.
[Diagram components: web portal; Apache Tomcat Servlet Container; Linux / Solaris OS]
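The four steps above can be sketched in a few lines. The real portal runs Java servlets in Apache Tomcat; this Python version only illustrates the flow. The function names and the HTML layout are hypothetical, and `condor_q` followed by an owner name is the standard Condor command for listing that user's jobs.

```python
import html
import subprocess


def condor_status_command(owner):
    """Build the system-level call that lists one user's Condor jobs."""
    return ["condor_q", owner]


def run_command(command):
    """Run the command and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def render_page(owner, raw_status):
    """Translate Condor's plain-text response into a simple web page."""
    return (
        "<html><body>"
        f"<h1>Condor jobs for {html.escape(owner)}</h1>"
        f"<pre>{html.escape(raw_status)}</pre>"
        "</body></html>"
    )


def handle_status_request(owner, runner=run_command):
    """Steps 2 to 4: call Condor, capture the response, return a page."""
    raw_status = runner(condor_status_command(owner))
    return render_page(owner, raw_status)


# Demonstration with a stand-in runner, so no Condor installation is needed.
def fake_runner(command):
    return f"output of {' '.join(command)}: 2 jobs <running>"


page = handle_status_request("alice", runner=fake_runner)
print(page)
```

Passing the runner as a parameter keeps the system call separate from the page rendering, so the flow can be exercised without access to a Condor pool.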
10 The solution: Delivering Grid data management
- Built on and extended the Grid-enabled database
  management system employed in the Geodise project:
  - simpler to install and deploy.
  - removed specific metadata requirements to allow
    easier integration with existing systems.
- Data generated by experiments in the GENIE portal are
  archived in a database hosted at Southampton
  Regional e-Science Centre.
- Users can query, retrieve and visualise data
  using a database client in MATLAB.
11 Architecture of GENIE data management system
[Architecture diagram components: Grid; Visualisation Client; Portal; Matlab Functions; Java clients; CoG; GridFTP; Apache SOAP; Java Web Services; Location Service; Authorisation Service; Metadata Archive Query Services; Metadata Database]
12 Data query and retrieval
13 The results: Scientific achievements
Intensity of the thermohaline circulation as a
function of freshwater flux between Atlantic and
Pacific oceans (DFWX), and mid-Atlantic and North
Atlantic (DFWY).
Surface air temperature difference between
extreme states (off - on) of the thermohaline
circulation. North Atlantic 2°C colder when the
circulation is off.
14 Summary: Real science through real e-Science
- Delivered Grid resources to perform simulations
  of prototype Earth System Model.
- Delivered web-based system to allow a virtual
  organisation of environmental scientists to
  create and manage simulations at a high level.
- Delivered database management system to allow
  scientists to share, access and visualise data
  produced by simulation runs.
⇒ Exciting new scientific results with profound
  implications. Papers have been written!
15 Acknowledgements
- GENIE investigators
  - Prof. Paul Valdes (Reading), Prof. John Shepherd
    (SOC, Southampton), Prof. Andrew Watson (UEA),
    Prof. Melvyn Cannell (CEH Edinburgh), Dr. Anthony
    Payne (Bristol), Prof. Richard Harding (CEH
    Wallingford), Prof. Simon Cox (SReSC), Dr. Steven
    Newhouse (LeSC) and Prof. John Darlington (LeSC).
- Recognised researchers
  - Andrew Price (SReSC), Andrew Yool (SOC), Dr.
    Robert Marsh (SOC), Dr. Timothy Lenton (CEH
    Edinburgh), Dr. Neil Edwards (Bern), J. L.
    Wason (SReSC) and Marko Krznaric (LeSC).