Title: Seismic Hazard Modeling using Heterogeneous Scientific Workflows
1Seismic Hazard Modeling using Heterogeneous
Scientific Workflows
Philip Maechling
Information Technology Architect
Southern California Earthquake Center (SCEC)
University of Southern California
http://www.scec.org
14 September 2007
2(No Transcript)
3Example SCEC Project Rosters
- Large collaborative projects in which computer
scientists and geophysicists worked together.
- TeraShake
- Kim B. Olsen (SDSU), Bernard Minster (IGPP),
Reagan Moore (SDSC), Steve Day (SDSU), Phil
Maechling (USC), Tom Jordan (USC), Marcio Faerman
(SDSC), Geoffrey Ely (IGPP), Boris Shkoller
(IGPP), Carey Marcinkovich (ExxonMobil), Jacobo
Bielak (CMU), David Okaya (USC), Ralph Archuleta
(UCSB), Steve Cutchin (SDSC) , Amit Chourasia
(SDSC), George Kremenek (SDSC), Yuanfang Hu
(SDSC), Arun Jagatheesan (SDSC), Nancy
Wilkins-Diehr (SDSC), Richard Moore (SDSC), Bryan
Banister (SDSC), Leesa Brieger (SDSC), Amit
Majumdar (SDSC), Yifeng Cui (SDSC), Giridhar
Chukkapalli (SDSC), Qiao Xin (SDSC), Donald Thorp
(SDSC), Patricia Kovatch (SDSC), Larry Diegel
(SDSC), Tom Sherwin (SDSC), Christopher Jordan
(SDSC), Marcus Thiebaux (ISI), Julio Lopez (CMU) - Workflow Systems Including CyberShake and
Earthworks - Hans Chalupsky (ISI), Maureen Dougherty
(USC/HPCC), Ewa Deelman (ISI), Yolanda Gil (ISI),
Sridhar Gullapalli (ISI), Vipin Gupta (USC), Carl
Kesselman (ISI), Jihie Kim (ISI), Gaurang Mehta
(ISI), Brian Mendenhall (USC/HPCC), Thomas Russ
(ISI), Gurmeet Singh (ISI), Marc Spraragen (ISI),
Garrick Staples (USC/HPCC), Karan Vahi (ISI),
Yifeng Cui (SDSC), Thomas Jordan (SCEC), Li Zhao
(USC), David Okaya (USC), Robert Graves (URS),
Ned Field (USGS), Nitin Gupta (SCEC), Scott
Callaghan (USC), Hunter Francoeur (USC), Joanna
Muench (IRIS), Philip Maechling (USC)
4Outline
- SCEC Earthquake System Science
- Earthquake System Science Computing
- Heterogeneous Workflow Research
- SCEC View of the Computing World
5One Week of Earthquakes in California
6Seismic Hazard Analysis as a System-Level
Earthquake Research Problem
- Definition: Specification of the maximum
intensity of shaking expected at a site during a
fixed time interval - Example: National seismic hazard maps
- Intensity measure: peak ground acceleration (PGA)
- Interval: 50 years
- Probability of exceedance: 2% (see the note below)
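A note on the 2% in 50 years specification (added here for reference, not from the slide): under the Poisson, time-independent occurrence model normally assumed for such maps, the exceedance probability P over an interval of t years corresponds to a return period T_R via

  P = 1 - exp(-t / T_R),  so  T_R = -t / ln(1 - P) = -50 / ln(0.98) ≈ 2475 years,

which is why the 2%-in-50-years shaking level is often described as the 2475-year ground motion.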
7(No Transcript)
8To SCEC, a Geosystem is Defined by Predicted
Behavior
- A geosystem comprises a set of interacting
elements that change in time according to a set
of prescribed laws and produce a specific
behavior. - In other words, the behavior defines the system
- Geosystem → Behavior
- Climate system (a.k.a. Earth system) → climate
- Mantle convection → plate tectonics
- Core dynamo → magnetic field
- Orogen → mountain building and erosion
- Active fault system → earthquakes
- Super-cell thunderstorm → tornado
- Volcano → magmatic eruption
- Petroleum reservoir → accumulation and flow of petroleum
9Earthquake System Science research seeks to
predict ground motions. Ground motion (future
shaking) is the behavior that the SCEC system
science program seeks to predict.
10Puente Hills Earthquake Scenario
(E. Field, H. Seligson, N. Gupta, V. Gupta, T.
Jordan, and K. Campbell, 2005)
Projected losses: $82B - $252B; 3,000 - 18,000
fatalities; 142,000 - 735,000 displaced
households; 30,000 - 99,000 tons of debris
Percent Building Loss
11System Science Integrates Observations,
Geological Models, and Simulations
Seismicity
Paleoseismology
Geologic structure
Local site effects
Faults
Seismic Hazard Model
Stress transfer
Rupture dynamics
Crustal motion
Crustal deformation
Seismic velocity structure
12Earthquake System Science
- Prediction of seismic shaking is a true
system-level problem - Involves nonlinear, multiscale interactions among
many geologic components, themselves often
complex subsystems - SCEC research seeks to develop a predictive
understanding of earthquake processes by applying
an interdisciplinary, system science, research
approach that uses Southern California as a
natural laboratory.
13Foster and Kesselman - IEEE Computer (2006)
14- Southern California Earthquake Center
- Involves 500 scientists at 55 institutions
worldwide - Focuses on earthquake system science using
Southern California as a natural laboratory - Translates basic research into practical products
for earthquake risk reduction
Communicate understanding to society at large as
useful knowledge for reducing earthquake risk
Integrate information into a comprehensive,
physics-based understanding of earthquake
phenomena
Gather data on earthquakes in Southern
California and elsewhere
SCEC Mission Statement
15- Southern California Earthquake Center
- Involves 500 scientists at 55 institutions
worldwide - Focuses on earthquake system science using
Southern California as a natural laboratory - Translates basic research into practical products
for earthquake risk reduction
Tectonic Evolution B.C.s
Deformation Models
Earthquake Rupture Forecasts
Earthquake Rupture Models
Fault Models
Block Models
Seismic Hazard Products
Risk Mitigation Products
Anelastic Structures
Attenuation Relationships
Ground Motion Simulations
SCEC Master Model Focus Groups
16SCEC Member Institutions (October 1, 2006)
17SCEC Research Funding Opportunities
- SCEC performs a yearly proposal process and
distributes a large percentage of NSF and USGS
SCEC funding to individual research groups. - Request for Proposals posted on the SCEC Web site in
September - Proposals due in November
- Proposals selected in January
- Funding available in March
- Proposals are typically small ($30k, more or less) and
must be in line with the SCEC Science Plan. A wide
range of activities is funded - Workshops
- Support for Interns
- Field studies
- Simulation programs
- Geoinformatic developments
- etc
18SCEC3 Organization
SCEC Director Board of Directors
External Advisory Council
Planning Committee
CEO Program
Center Administration
Information Architect
Earthquake Geology
Unified Structural Representation
Seismic Hazard Risk Analysis
CME
Tectonic Geodesy
Fault Rupture Mechanics
PetaSHA
Knowledge Transfer
Seismology
Crustal Deformation Modeling
Public Outreach
PetaShake
Lithospheric Architecture Dynamics
K-12 Informal Education
CSEP
Earthquake Forecasting Predictability
USEIT/SURE Intern Programs
ExGM
Ground Motion Prediction
MPRESS
ACCESS Forum
Disciplinary Committees
CEO Activities
Special Projects
Focus Groups
19- SCEC Community Modeling Environment (SCEC/CME)
-
- Funded by NSF/EAR/CISE
- $10.0M for 5 years + 1 year extension (ending 2007)
- Science thrust areas
- Extend scale range of deterministic ground-motion
simulations and dynamic rupture simulations - Compute physics-based PSHA maps and validate them
using seismic and paleoseismic data - Computer science objectives
- Grid-based Workflows
- Data Management
- Knowledge Capture
- Promote vertical integration of
cyberinfrastructure
[SCEC special projects: CME, PetaSHA, PetaShake, CSEP, ExGM, MPRESS]
20- A Petascale Cyberfacility for Physics-Based
Seismic Hazard Analysis (PetaSHA) -
- Funded by NSF/EAR
- $2.1M for 2 years (ending 2008)
- Science thrust areas
- Extend scale range of deterministic ground-motion
simulations and dynamic rupture simulations - Compute physics-based PSHA maps and validate them
using seismic and paleoseismic data - Computer science objectives
- Reach petascale computing capability by 2009
- Promote vertical integration of
cyberinfrastructure
21- Enabling Earthquake System Science through
Petascale Calculations (PetaShake) -
- Funded by NSF/EAR/CISE/OCI
- $1.8M for 2 years (ending 2009)
- Science thrust areas
- Extend scale range of deterministic ground-motion
simulations and dynamic rupture simulations - Highly scalable Earthquake Wave Propagation codes
- Computer science objectives
- Codes that scale to 100,000 processors by 2009
- Workflow-based On-demand verification and
Validation for Petascale codes
22- Collaboratory for the Study of Earthquake
Predictability (CSEP) -
- Funded by W. M. Keck Foundation
- $1.2M for 3 years (ending 2009)
- Science thrust areas
- Formulate Techniques for Evaluating Earthquake
Predictions - Develop international collaboration to study
earthquake predictability - Computer science objectives
- Robust, reliable testing center with reproducible
results - Automated execution of prediction models and
prediction evaluation algorithms
23Students in Computer Science and Geoscience
SCEC/USEIT
2002, 2003, 2004, 2005
24(No Transcript)
25Current SCEC ACCESS Students
- SCEC - Advancement of Cyberinfrastructure Careers
through Earthquake System Science (ACCESS)
involves both undergraduate and graduate students
in computer science and geoscience research
26- Scientific Publications and Public Outreach raise
awareness and improve preparedness.
27Outline
- SCEC Earthquake System Science
- Earthquake System Science Computing
- Heterogeneous Workflow Research
- SCEC View of the Computing World
28SCEC Science Goals
- SCEC seeks to improve predictive earthquake
system models by constantly integrating more
realistic physics into the numerical simulations. - As simulations become more physically realistic,
the computational and data management
requirements increase enormously.
29Types of SCEC Numerical Simulations
Earthquake Wave Propagation Simulations
Ensemble Probabilistic Seismic Hazard Calculations
Friction-based Dynamic Rupture Simulations
30Types of SCEC Physics-based Earthquake Simulations
- Large Earthquake Wave Propagation Simulations
- Dynamic Earthquake Rupture Simulations
- Ensemble Probabilistic Seismic Hazard Simulations
- Data Inversion Simulations improve Structure
Models - Ground motion coupled to building response
31System Science Implies Broad Computing Needs
SCEC Cyberinfrastructure Needs
- High Performance Computing (Capability Computing)
- Distributed, high throughput, grid computing
(Capacity Computing) - Large-scale Data and Metadata Management (Data
Intensive Computing) - 2D, 3D, and 4D Data Visualization
- Scientific workflows
- Interactive Science Gateway technologies and
Collaboration tools. - High Performance Computing Expertise.
- Urgent Computing (after a significant event)
32SCEC Simulation Working Group
33(No Transcript)
34Development of SCEC Community Models Structural
Representation
35SCEC Community Fault Model
A. Plesch and J. Shaw (2003)
36SCEC Community Velocity Model
H. Magistrale et al. (2000)
37SCEC Community Block Model
Set of interconnected, closed volumes that are
bounded by major faults, as well as topography,
base-of-seismicity, and Moho surfaces.
Intended for use in fault systems analysis
(FEM) property modeling
J. Shaw et al. (2004)
38SCEC Crustal Motion Map From GPS Data
CMM.3.0.1 (Agnew et al., 2003)
39Unified Structural Representation
Crustal Motion Map
Tectonic models
Community Fault Model
Community Block Model
Structural models
40SCEC Science Goals
- SCEC seeks to improve predictive earthquake
system models by constantly integrating more
realistic physics into the numerical simulations. - Example 1: Ground motion predictions based on
wave propagation simulations through realistic
geological structures are more physically
realistic than ground motion predictions using
attenuation relationships and simple geological
models.
41Predicted Strong Ground Motions for a
Hypothetical Earthquake - Old Style (Attenuation
Relationship)
Light
Moderate
Heavy
Very Heavy
Ground Motion Levels for a Hypothetical Ml 6.75
in Hollywood
42Wave Propagation Assuming Simple Consistent
Geology
43Geology in Los Angeles Is Complex with Deep
Sedimentary Basins
44Wave Propagation using 3D waveform modeling and
Realistic Geology in Los Angeles
45Verification of Earthquake Wave Propagation Codes
Using Simplified Problem
Mxy
46SCEC's predictive simulations are validated using
historical earthquakes. We perform simulations
of historic earthquakes for which there are
seismic recordings. These two maps compare
observed ground motions (top) to results of
simulation (bottom) for a recent earthquake.
47SCEC's Predictive Models Are Validated Using
Historical Earthquakes
48Scenario Earthquake Simulations Puente Hills
Peak SA 2.0 magnitude Map
Velocity Y Component Animation
Puente Hills Simulation Scenario Earthquake (10
Hz) Robert Graves (AWM), Amit Chourasia et al
(Viz)
49TeraShake-1 Simulation Results New Scale and
Resolution (1.2GB Mesh Points)
50SCEC Science Goals
- SCEC seeks to improve predictive earthquake system
models by constantly integrating more realistic
physics into the numerical simulations. - Example 2: Fault rupture simulations that are
constrained to obey friction laws (dynamic
ruptures) are more physically realistic than
rupture simulations that aren't constrained by
friction laws (kinematic ruptures).
51Cartoons that illustrate kinematic and dynamic
ruptures
52Kinematic Rupture Simulation Using a Constant
Rupture Velocity
Why Faults Start Slipping and Why They Stop
Slipping is Still Difficult to Understand and
Model.
53(No Transcript)
54Earthquake Rupture Mode Code Verification
- Supported by SCEC ESP Focus Group
- First workshop held in November, 2003
- Second workshop planned for Summer, 2004
- Results will be archived in CME
55TeraShake-2 Simulation Results
56SCEC Science Goals
- SCEC seeks to improve predictive earthquake
system models by constantly integrating more
realistic physics into the numerical simulations. - Example 3: Produce probabilistic seismic hazard
maps by running ensemble calculations that
simulate all possible earthquakes that might
affect the region.
57Probabilistic Seismic Hazard Analysis Long Term
Ground Motion Forecast
- Definition: Specification of the maximum
intensity of shaking expected at a site during a
fixed time interval - Example: National seismic hazard maps, used as the
basis for building codes
- Intensity measure: peak ground acceleration (PGA)
- Interval: 50 years
- Probability of exceedance: 2%
58Probabilistic Seismic Hazard Calculations Require
Ensemble Calculations (like weather forecasts)
59Probabilistic Seismic Hazard Calculations Require
Ensemble Calculations (like weather forecasts)
60The multiple programs that are used to calculate
a CyberShake hazard curve call for the use of
scientific workflow technology.
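As a minimal illustration of what workflow technology buys here, the chain of programs can be expressed as a dependency graph and executed in dependency order. The step names below are placeholders standing in for the actual CyberShake programs, and a real run would submit each step as a grid job rather than call a local function:

  from graphlib import TopologicalSorter  # Python 3.9+

  # Hypothetical step names; not the real CyberShake executables.
  steps = {
      "generate_ruptures": set(),
      "build_velocity_mesh": set(),
      "compute_sgt": {"build_velocity_mesh"},            # wave-propagation simulation
      "synthesize_seismograms": {"generate_ruptures", "compute_sgt"},
      "compute_hazard_curve": {"synthesize_seismograms"},
  }

  for step in TopologicalSorter(steps).static_order():
      print("run", step)  # a workflow system would schedule each step on grid resources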
61- Simulate all possible earthquakes within 200 km
of the site we are studying.
62For Every Possible Earthquake, Define Every
Possible Rupture
63Validation Using Precarious Rocks
UNR Database
64Outline
- SCEC Earthquake System Science
- Earthquake System Science Computing
- Heterogeneous Workflow Research
- SCEC View of the Computing World
65SCEC IT Priorities
- SCEC IT group's charge is to perform large scale
scientific computing - We must get research done
- We tend to be technology neutral
- We are careful to apply technology only when it
helps us get our research done.
66SCEC Research-oriented computing needs are
heterogeneous - full of contrasts
- Compiled codes (C, Fortran)
- Interpreted code (Java, Python)
- Geological Time scales
- Emergency Response Time scales
- Distributed Stateful Object codes.
- Stateless web services.
- Supercomputer savvy software engineers.
- Domain experts with little high performance
computing experience. - Very Long running jobs (24hours)
- Very short running jobs (lt 1 minute)
- Small serial calculations
- Very large parallel calculations
- Single (Hero) runs - Parallel supercomputers
calculations - Many small calculations high throughput jobs
- Very large data files (1TB)
- Many small files (1M)
- Data in relational databases
- Legacy, heavily used codes
- Rapidly changing codes
- Single programs
- Multi-stage processing streams (workflows)
67Accepting Heterogeneity
- We have been forced into the view that
heterogeneity is inevitable and we must deal with
it. - Grids help deal with heterogeneity.
- Our workflow tools are specifically selected to
support grid-based workflows in a heterogeneous
environment.
68Some Characteristics of Grids
Numerous resources
Owned by multiple organizations and individuals
Connected by heterogeneous, multi-level networks
Different security requirements and policies
Different resource management policies
Geographically distributed
Unreliable resources and environments
Resources are heterogeneous
Slide by Hiro
69Dealing with Heterogeneity of Computing
- Applying a grid software stack to a homogeneous
computing environment may be wasteful. - Example: Originally, TeraGrid was envisioned as
homogeneous hardware and software - an example of being unclear on the concept of grids as
tools for dealing with heterogeneity. - It has since evolved away from this.
70Example SCEC Research Application OpenSHA
- Research Computing Challenge
- Collection of existing scientific codes that
implement different techniques for calculating
probabilistic seismic hazard curves and
maps. - Mostly small, short, serial calculations.
- Must be extensible as new techniques and codes
are developed. - Solution
- Standard object model for domain
- Java language framework
- Java interface wrapping FORTRAN codes to avoid
re-writing existing solutions. - Distributed, stateful objects implemented using
Java RMI - CondorG-based calculations for high throughput
codes.
71Pathway 1 OpenSHA
OpenSHA: A Community Modeling Environment
for Seismic Hazard Analysis
[Pathway 1 diagram: an Earthquake-Rupture Forecast (with a Time Span) supplies each Source_i and its ruptures Rup_n,i; an Intensity-Measure Relationship combines each rupture with a Site and an IM Type and Level - the combination is sketched in code below]
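To make the data flow concrete, here is a minimal Python sketch of how these Pathway 1 components combine into a hazard curve. It is illustrative only - the function and variable names are hypothetical, not the OpenSHA Java API - but the loop structure (sum exceedance contributions over every rupture in the forecast, then convert the annual rate to a probability for the chosen time span) is the standard PSHA calculation these components represent:

  import math

  def hazard_curve(erf_ruptures, imr_prob_exceed, site, im_levels, years=50.0):
      """erf_ruptures: (annual_rate, rupture) pairs from a rupture forecast.
      imr_prob_exceed(rup, site, level): P(IM > level | rupture occurs)."""
      erf_ruptures = list(erf_ruptures)
      curve = {}
      for level in im_levels:
          # Total annual rate of exceeding this shaking level, over all ruptures.
          rate = sum(annual_rate * imr_prob_exceed(rup, site, level)
                     for annual_rate, rup in erf_ruptures)
          # Poisson conversion from annual exceedance rate to probability in 'years'.
          curve[level] = 1.0 - math.exp(-rate * years)
      return curve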
72OpenSHA
A framework where any arbitrarily complex (e.g.,
physics based) SHA component can plug in for
end-to-end SHA calculations.
- open source
- object oriented
- platform ind.
- web/GUI enabled
- distributed (potentially)
- Java (or wrapped code)
- validated
73OpenSHA
Applications Available
1) Hazard Curve Calculator
Publications: WGCEP-2002 ERF (Field et al., 2005,
SRL) Distributed Object Technologies (Maechling
et al., 2005, SRL)
74OpenSHA
Applications Available
1) Hazard Curve Calculator
2) Scenario ShakeMap Calculator
75OpenSHA
Applications Available
1) Hazard Curve Calculator
2) Scenario ShakeMap Calculator
3) Hazard Map Data Calculator
4) Hazard Map Plotter
Utilization of GRID computing (Maechling et al.,
2005, SRL)
76OpenSHA Publications
Field, E.H., H.A Seligson, N. Gupta, V. Gupta,
T.H. Jordan, and K.W. Campbell (2005). Loss
Estimates for a Puente Hills Blind-Thrust
Earthquake in Los Angeles, California, Earthquake
Spectra, 21, 329-338. Field, E.H., N. Gupta, V.
Gupta, M. Blanpied, P. Maechling, and T.H. Jordan
(2005). Hazard Calculations for the WGCEP-2002
Earthquake Forecast Using Distributed Object
Technologies, Seism. Res. Lett., 76,
161-167. Maechling, P., V. Gupta, N. Gupta, E.H.
Field, D. Okaya and T.H. Jordan. (2005) Seismic
Hazard Analysis Using Distributed Computing in
the SCEC Community Modeling Environment, Seism.
Res. Lett., 76, 177-181. Field, E.H., V. Gupta,
N. Gupta, P. Maechling, and T.H. Jordan (2005).
Hazard Map Calculations Using OpenSHA and GRID
Computing, Seism. Res. Lett., In
Press. Maechling, P., V. Gupta, N. Gupta, E.H.
Field, D. Okaya and T.H. Jordan. (2005). Grid
Computing in the SCEC Community Modeling
Environment , Seism. Res. Lett., In Press.
77Example SCEC Research Application Community
Modeling Environment to Help Non-expert Users Run
Wave Propagation Simulations
- Research Computing Challenge
- Help users set up and run earthquake wave
propagation simulations - Run simulations on any available supercomputer
- Run multiple stages including setup, execution,
data extraction, and discovery - Manage data and provide data discovery.
- Solution
- Formulate calculations as a grid-based workflow.
- Build knowledge-based user interfaces for
workflow construction and data discovery. - Utilize USC High Performance Computing and
TeraGrid computers - Integrate Data Management using Digital Library
technology (Storage Resource Broker)
78SCEC/CME Computational Pathway Construction
A major SCEC/CME objective is the ability to
construct and run complex computational pathways
for SHA
[Pathway 1 example diagram: inputs (ERF Definition, IMR Definition, Gridded Region Definition, Probability of Exceedance and IMR Definition, GMT Map Configuration Parameters) feed the steps Define Scenario Earthquake, Calculate Hazard Curves, Extract IMR Value, and Plot Hazard Map; intermediate data include 9000 hazard curve files (9000 x 0.5 MB, about 4.5 GB) and a Lat/Long/Amp xyz file with 3000 data points (about 100 KB)]
79Users run Wave Propagation Simulations
- Our simulation system used a lifecycle concept
that recognized several phases in workflow
development and use and tried to address each one - Construct workflow.
- Submit workflows
- Monitor workflows
- Archive workflow data
- Discover workflow data
- Retrieve workflow data
80(No Transcript)
81Creation of Workflows in Layers of Increasing
Detail
- Workflow Template
- Specifies executables and dataflow
- No data specified, just their type
- Workflow Instance
- Specifies data files for a given template
- Logical file names, not physical file replicas
- Executable Workflow
- Specifies physical locations of data files,
hosts/pools for execution of jobs, and data
movement jobs
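A minimal data-structure sketch of the three layers above, using made-up file and host names; this is illustrative only, not the actual Pegasus/VDS DAX or catalog formats:

  # Illustrative only - hypothetical names, not the Pegasus/VDS data model.

  # 1. Workflow template: executables and dataflow; data described only by type.
  template = {
      "jobs": {"extract_mesh": {"in": ["VelocityModel"], "out": ["Mesh"]},
               "run_awm":      {"in": ["Mesh", "Source"], "out": ["Seismograms"]}},
      "dataflow": [("extract_mesh", "run_awm")],
  }

  # 2. Workflow instance: logical file names bound to the template's types.
  instance = {"bindings": {"VelocityModel": "lfn:cvm3.0",
                           "Source": "lfn:northridge_src",
                           "Mesh": "lfn:mesh_600m",
                           "Seismograms": "lfn:seis_northridge"}}

  # 3. Executable workflow: physical replicas, execution sites, and transfer jobs.
  executable = {
      "replicas": {"lfn:cvm3.0": "gsiftp://hpc.usc.edu/data/cvm3.0"},
      "site_map": {"extract_mesh": "hpc.usc.edu", "run_awm": "teragrid.sdsc.edu"},
      "transfers": [("lfn:mesh_600m", "hpc.usc.edu", "teragrid.sdsc.edu")],
  }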
82The Process of Creating an Executable Workflow
User guided
- Creating a valid workflow template (human guided)
- Selecting application components and connecting
inputs and outputs - Adding other steps for data conversions/transformations
- Creating instantiated workflow
- Providing input data to pathway inputs (logical
assignments) - Creating executable workflow (automatically)
- Given requirements of each model, find and assign
adequate resources for each model - Select physical locations for logical names
- Include data movement steps, including data
deposition steps
Automated
83USC HPCC Resources
Grid
SCEC Computing Resources
TeraGrid Computing Resources
84Common TeraGrid Software Stack a Key Component of
a Scientific Computing Environment
85SCEC Workflow System Software Stack
Pegasus Meta-scheduler: ability to convert
abstract workflows to concrete workflows; automatic
expansion of workflows to stage and transfer
files. Virtual Data Toolkit (VDT): includes
Condor and Globus; job scheduling including Condor
glide-in provisioning for embarrassingly parallel
execution on a cluster. Metadata Catalog
Service: metadata archive and data
discovery. Replica Location Service: mapping
logical files to physical files. Globus: GSI
security supported at USC and TeraGrid; GridFTP
for data transfer; support for Condor job
scheduling. Storage Resource Broker Client (SRB
Server at SDSC).
86INTEGRATED WORKFLOW ARCHITECTURE
J. Zechar @ USC (Teamwork Geo CS)
Workflow Template Editor (CAT)
Query for components
D. Okaya @ USC
Tools
Domain Ontology
Workflow Template (WT)
Workflow Library
Component Library
Query for WT
Data Selection
Query for data given metadata
L. Hearn @ UBC
COMPONENTS
I/O data descriptions
Conceptual Data Query Engine (DataFinder)
Metadata Catalog
Workflow Instance (WI)
Execution requirements
Engineer
Workflow Mapping (Pegasus)
Grid info svcs
Tools
Grid
K. Olsen @ SDSU
Executable Workflow
87Assisted Pathway Composition
- Problem: In order to bring sophisticated models
to a wide range of users we need to provide
assistance and automation while allowing users to
guide the process - Approach: A mixed-initiative system that helps
users create, reuse, and combine pathways by
exploiting - Knowledge-based descriptions of components
- Ontology of components and component types based
on common features and parameter constraints - Analysis of (partially constructed) pathways
based on AI planning techniques - Provide formal definitions of desirable
properties of pathways
ErrorScan
Input: Workflow W = <C, L, I, G>
Output: list of errors and corresponding fix suggestions
I. If W is not purposeful, return Error.
   Suggestion: define end result e using types from the KB, AddEndResult(e).
II. For each Component c in W:
   a. If c is not Justified, return Error.
      Suggestions: for each p that is an output-parameter(c), find components cj in the
      workflow or the KB that have pj as input-parameter(cj) and subsumes(pj, p),
      AddLink(c, p, cj, pj).
   b. If c is not Grounded, return Error.
      Suggestions: for each Cj in FindDirectSubtypes(c), SpecializeComponent(c, Cj).
   c. For each i in input-parameter(c):
      1. If i is not Satisfied, return Error.
         Suggestions: for each cj in C with output parameter pj such that
         subsumes(range(c, i), range(cj, pj)), AddLink(cj, pj, c, i).
         Suggestions: for each cj in FindMatchingOutput(i), AddLink(cj, pj, c, i).
         Suggestion: AddAndLinkComponent(W, AddInitialInput(i), range(i), c, i).
III. For each Link L in W:
   a. If L is not Consistent, return Error.
      Suggestions: for each Ci in FindInterposingComponent(L), InterposeComponent(Ci, L).
      Suggestion: RemoveLink(L).
   b. If L is Redundant, return Error.
      Suggestion: RemoveLink(L).
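For a sense of how such checks look in ordinary code, here is a toy Python version of just the "satisfied input" test (step II.c.1) over a plain dict-based workflow. The real CAT/Wings system reasons over ontology-backed component descriptions rather than strings, so this is only a sketch:

  # Illustrative only: a toy version of ErrorScan step II.c.1.
  # A link is (producer, producer_param, consumer, consumer_param).

  def unsatisfied_inputs(components, links, initial_inputs):
      """Return (component, input_param) pairs with no incoming link or initial input."""
      satisfied = {(c, p) for (_prod, _out, c, p) in links}
      satisfied |= set(initial_inputs)
      errors = []
      for name, spec in components.items():
          for param in spec["inputs"]:
              if (name, param) not in satisfied:
                  errors.append((name, param))  # suggest AddLink or AddInitialInput
      return errors

  # Example: a hazard-map plotter whose Region input is not yet wired up.
  components = {"HazardCurveCalc": {"inputs": ["ERF", "IMR", "Site"], "outputs": ["Curve"]},
                "MapPlotter": {"inputs": ["Curve", "Region"], "outputs": ["Map"]}}
  links = [("HazardCurveCalc", "Curve", "MapPlotter", "Curve")]
  initial = [("HazardCurveCalc", "ERF"), ("HazardCurveCalc", "IMR"), ("HazardCurveCalc", "Site")]
  print(unsatisfied_inputs(components, links, initial))  # -> [('MapPlotter', 'Region')]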
88CAT Composition Analysis Tool Now Called Wings
Declarative descriptions of models are linked to
ontologies and reasoners
System reasons about model constraints and points
out errors and fixes
User builds a pathway specification from library
of models
System guarantees correctness of pathway templates
89SCEC Digital Library Contains TB of Data and
Metadata
- SRB-based Digital Library
- More than 160 Terabytes of tape archive
- 4 Terabytes of on-line disk
- 5 Terabytes of disk cache for derivations
90Conceptual Queries DataFinder
- Browsing interface is the wrong metaphor
- Too many files to look through
- Too many metadata attributes
- Search is the right answer
- Focus on items of interest
- Provide conceptual level abstractions
- Need scalable approach
91Velocity Mesh Conceptual Model
92Maechling, P., H. Chalupsky, M. Dougherty, E.
Deelman, Y. Gil, S. Gullapalli, V. Gupta, C.
Kesselman, J. Kim, G. Mehta, B. Mendenhall, T.
Russ, G. Singh, M. Spraragen, G. Staples, K. Vahi
(2005) Simplifying Construction of Complex
Workflows for Non-Expert Users of the Southern
California Earthquake Center Community Modeling
Environment, ACM SIGMOD Special issue on
Scientific Workflows, Record Vol. 34 No. 3,
September 2005, pp. 24-30
93Lesson from Our Integrated Workflow Architecture
- Conclusions of Paper (maybe still valid)
- Move knowledge about the workflow components
(programs) and computational infrastructure
(grid) into knowledge-bases and databases - Let experts in the domain, or in the computing
environment, create these data structures. - Then the workflow tools can utilize these data
structures and the user doesn't need to be an
expert in these areas.
94Lesson from Our Integrated Workflow Architecture
- On Workflows
- The separation into 3 stages of creation
(template, abstract, and concrete) works well. It
is especially well suited for grid-based
workflows. - Many tools, especially visual workflow
construction tools, combine the three stages, and
they only help you create an instance of a
concrete workflow. This makes it difficult to
run a lot of workflows through a GUI. - The Achilles' heel of workflow tools continues to be
workflow template construction (the first stage).
Workflow templates are difficult to
create and modify.
95Lesson from Our Integrated Workflow Architecture
- On Knowledge Representation
- Work started on this workflow system continues
- "Provenance Trails in the Wings/Pegasus Workflow
System", Jihie Kim, Ewa Deelman, Yolanda Gil,
Gaurang Mehta, Varun Ratnakar. To appear in
Concurrency and Computation: Practice and
Experience, Special Issue on the First Provenance
Challenge, 2007 - "Wings for Pegasus Creating Large-Scale
Scientific Applications Using Semantic
Representations of Computational Workflows",
Yolanda Gil, Varun Ratnakar, Ewa Deelman, Gaurang
Mehta, Jihie Kim. To appear in Proceedings of
the 19th Annual Conference on Innovative
Applications of Artificial Intelligence (IAAI),
Vancouver, British Columbia, Canada, July 22-26,
2007 - "Workflow Composition", Yolanda Gil. In In
Workflows for e-Science, D. Gannon, E. Deelman,
M. Shields, I. Taylor (Eds), Springer Verlag,
2006 - The workflow construction tools that use
knowledge-bases require knowledge engineers. C
programmers are hard enough to find and hire, and
knowledge engineers are much harder. So the
knowledge-bases were very difficult to maintain. - Development of ontologies about computer domains
worked because computer terms were less ambiguous.
Development of scientific ontologies was rough
sledding and did not provide enough benefit toward
getting our SCEC research done to justify
continued support.
96Integrated Workflow system Evolved into Science
Gateway
- Goals of Earthworks System
- Automatically run simulation after local event as
an urgent workflow. - Post results (data products) on SCEC web site.
- Allow scientists to re-configure and re-run
simulations.
97Integrated Workflow System Evolved into SCEC
Earthworks Science Gateway
wave.usc.edu
ia64.sdsc.edu
scecdata.usc.edu
desktop
intensity.usc.edu
Maechling, P., J. Muench, H. Francoeur, D. Okaya,
Y. Cui (2007) SCEC Earthworks Science Gateway
Interactive Configuration and Automated Execution
of Earthquake Simulations on the TeraGrid, In
Proceedings of TeraGrid 2007, Madison, Wisconsin
98Simulation Configuration Interface
underlying XML-based description file - no web
page upkeep
99SCEC Earthworks Science Gateway Widening Access
to the TeraGrid
Velocity Model options: CVM 4.0, CVM 3.0, CVM 2.2, Harvard VM, 1D Hadley-Kanamori, constant velocity, top layer over halfspace
Seismic Wave Propagation codes: TeraShake2 @HPCC.usc, TeraShake2 @Teragrid.sdsc, Graves FD @HPCC.usc, Graves FD @Teragrid.sdsc, Carnegie-Mellon FE, Caltech spectral element
Post-processing and delivery: intensity measures (PGA, PGV, spectral accel.), wave processing (debias, filter, vel2accel), PEER deconvolution, archive (digital library), PG map making, Viz SDSC, Viz ISI
100Choices in AWM code and Velocity Model
Same velocity model (CVM3.0) three AWM codes
Same AWM code (Olsen) three velocity models
Hollywood EQ: -118.398°, 34.053°, 7.00 km depth, Mw 4.23, strike 165°, dip 60°, rake 0°. Region 24 x 24 x 12 km, dx = 150 m
101Earthworks Gateway Supports On-demand
verification and validation - an emphasis for us.
- Research Computing Challenge
- Keeping codes validated as they change
- Verification and validation is a multi-step
process. - Solution
- Implement verification problems as workflows
102Example PEER UHS-1 Verification
Lifelines Program Task 1A01
Mxy
103Example SCEC Research Application Ensemble
Calculations using both MPI and Post processing
- Research Computing Challenge
- Run large MPI jobs that output large data files
- Run 100K serial post processing jobs
- Manage data and provide data discovery.
- Solution
- Formulate calculations as a grid-based workflow.
- Utilize clusters as Condor pools
- Use condor glide-ins to reduce job scheduler
delays - Utilize USC High Performance Computing and
TeraGrid computers - Archive results and metadata in database
104Physics-based Probabilistic Seismic Hazard
Calculations - CyberShake Platform
- Objective
- Capability for physics-based probabilistic
seismic hazard calculation in Southern
California, accounting for source complexity and
3D earth structure - Simulates ground motions for potential fault
ruptures within 200 km of each site - 12,700 sources in Southern California from USGS
2002 earthquake source model - Extends USGS 2002 to multiple hypocenters and
slip models for each source - 100,000 ground motion simulations for each site
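The jump from 12,700 sources to on the order of 100,000 simulations per site comes from enumerating rupture variations. A rough sketch of that expansion, with made-up per-rupture hypocenter and slip-model counts (the actual CyberShake rules tie these to magnitude and rupture geometry):

  # Illustrative only: expand each source's ruptures into simulation tasks.
  # The per-rupture variation counts below are placeholders, not CyberShake's rules.
  def rupture_variations(sources, n_hypocenters=4, n_slip_models=2):
      tasks = []
      for src in sources:
          for rup in src["ruptures"]:
              for hypo in range(n_hypocenters):
                  for slip in range(n_slip_models):
                      tasks.append((src["id"], rup, hypo, slip))
      return tasks  # one ground-motion simulation per task, for each site studied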
105Workflows are used only on the high-throughput
aspect of the calculations
The multiple programs that are used to calculate
a CyberShake hazard curve call for the use of
scientific workflow technology.
106SCEC workflows on the TeraGrid Described in
Workflows for e-Science
Executable workflow
Maechling P., E. Deelman, G. Mehta, R. Graves, L.
Zhao, N. Gupta (2007) SCEC CyberShake Workflows -
Automating Probabilistic Seismic Hazard Analysis
Calculations, Workflows for e-Science, Springer
2007, XXII, 530 p., 181 illus., Hardcover ISBN
978-1-84628-519-6
107Pegasus Workflow Mapping
Original workflow: 15 compute nodes, devoid of
resource assignment
[Diagram: the original DAG of numbered compute nodes before Pegasus maps them to resources]
108Distribution of seismogram jobs
70 hours
109(No Transcript)
110Scalability
SCEC workflows run each week using Pegasus and
DAGMan on the TeraGrid and USC resources.
Cumulatively, the workflows consisted of over
half a million tasks and used over 2.5 CPU Years.
Managing Large-Scale Workflow Execution from
Resource Provisioning to Provenance Tracking: The
CyberShake Example, Ewa Deelman, Scott Callaghan,
Edward Field, Hunter Francoeur, Robert Graves,
Nitin Gupta, Vipin Gupta, Thomas H. Jordan, Carl
Kesselman, Philip Maechling, John Mehringer,
Gaurang Mehta, David Okaya, Karan Vahi, Li Zhao,
e-Science 2006, Amsterdam, December 4-6, 2006,
best paper award
111CyberShake Computational Results for First 2 sites
- For the first two sites (Pasadena and USC) we used NCSA and
SDSC IA-64 machines - 23 days total runtime
- Condor Glide-in processing approach submitted via
Pegasus. - Failed job recovery
- Retries
- Rescue DAG
112Recent Performance on CyberShake
113Specific Techniques that Pegasus Uses to Support
Our CyberShake Workflows
- Workflows have too many jobs
- Separate into smaller workflows (partition) and run
them one after another - Workflows have too many small jobs
- Bundle jobs together (a bundling sketch follows below)
- Scheduling of jobs on the cluster
- Schedule Condor glide-ins on cluster nodes, then
bypass the PBS scheduler
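A minimal sketch of the bundling idea noted above, assuming each small job is just a command line; Pegasus performs this kind of job clustering internally, so the code is only illustrative:

  # Illustrative only: bundle many short serial jobs into fewer scheduler submissions.
  def bundle(commands, bundle_size=50):
      """Group individual command lines into bundles that run as one scheduled job each."""
      return [commands[i:i + bundle_size] for i in range(0, len(commands), bundle_size)]

  # e.g. 100,000 one-minute seismogram-extraction jobs become 2,000 bundled jobs of 50
  # commands, each submitted once to Condor/PBS (or run inside a glide-in slot).
  jobs = [f"./extract_seismogram rupture_{i}.in" for i in range(100000)]
  bundles = bundle(jobs, 50)
  assert len(bundles) == 2000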
114Large, long-running workflows present planning
issues.
- As workflows get large, the decision of when to
convert to a concrete workflow becomes quite
important because conditions change during
execution of the workflow - Advanced (planning) Decide prior to task being
ready - Deferred Decide only when task is ready
- Mixed mode
- When you separate abstract (logical) workflow
from concrete workflow, you (or job planner e.g.
Pegasus) can decide where to run the jobs at
run-time.
115Outstanding Middleware Implications for 3D Wave
Propagation Simulations
- Schedule reservations on clusters
- Reservations and special queues are often
arranged. - Large file and data movement
- TeraByte transfers require high reliably, long
term, data transfers - Ability to stop and restart
- Can we move restart from one system to another
- Draining of temporary storage during runs
- Storage required for full often exceeds
capability of scratch, so output files must be
moved during simulation
116Major Middleware related issues for
SCEC/CMESupercomputing and Storage
- Globally (TeraGrid wide) visible disk storage
- Well supported, reliable file transfers with
monitoring and restart of jobs with problems are
essential. - Interoperability between grid tools and data
management tools such as SRB must include data
and metadata and metadata search.
117Major Middleware related issues for
SCEC/CMEUsability Related and Monitoring
- Monitoring tools that include status of available
storage resources. - On-the-fly visualizations for run-time validation
of results - Interfaces to workflow systems are complex,
developer oriented interfaces. Easier to user
interfaces needed
118Outline
- SCEC Earthquake System Science
- Earthquake System Science Computing
- Heterogeneous Workflow Research
- SCEC View of the Computing World
119- Simulation program has produced physics-based
seismic hazard research results that advanced
SCEC's earthquake system science goals.
120Milestones in Numerical Simulation of Global
Geosystems
- First long-term climate simulations 1965
- Coupled ocean-atmosphere climate model 1969
- First simulations of mantle convection 1974
- Climate models with ocean eddy-scale
resolution 1988 - Realistic model of self-sustaining core
dynamo 1995 - Realistic model of mantle convection 1996
- Fully coupled climate model with interactive
carbon cycle 2004
Japanese Earth Simulator dedicated March, 2002
121The Geoscientific Community Is Pursuing a
Petascale Collaboratory
122SCEC Needs High Performance Computing
- SCEC's science program requires high performance
computing. - We are positioning ourselves as a scientific
group that can perform important science on
current and future NSF high performance
computers. - We must show that we are ready to use the next
generation of computers. - The next generation of computers is likely to
be a huge challenge to acquire, operate, and use.
123SCEC NSF Supercomputer Allocations
124Real Crisis With HPC Is With The Software
- Programming is stuck
- Arguably hasn't changed since the 1960s
- It's time for a change
- Complexity is rising dramatically
- highly parallel and distributed systems
- From 10 to 100 to 1,000 to 10,000 to 100,000
processors! - multidisciplinary applications
- A supercomputer application and its software are
usually much longer-lived than the hardware - Hardware life is typically five years at most.
- Fortran and C are the main programming models
- Software can be a major cost component of modern
technologies. - The tradition in HPC system procurement is to
assume that the software is free. - We don't have many great ideas about how to solve
this problem.
125SCEC Pursuing Leadership Class Computer Systems
Data-oriented Science and Engineering
Environment
100 TF Systems 10s of Projects
Key function of the NSF Supercomputer Centers
Provide facilities over and above what can
be found in the typical campus/lab environment
10s of 10 TF Systems 1,000s of Users
HPC Centers
Data (more BYTES)
100s of 1 TF Systems 10,000s of Users
Home, Lab, Campus, Desktop
Traditional HPC environment
Departmental HPC
GigaFLOPS Millions of Users
Workstations
Compute (more FLOPS)
126AWM Olsen (2006) 40960 Processors
- Regional M8.1 domain
- 32 billion mesh points
- outer/inner scale ratio of 8,000
- 40,960 BGW processors
- 96% efficiency
- 6.1 Tflops
Yifeng Cui, SDSC F06 AGU
40,960 processors
127Along with scaling to large processor count, we
are also improving codes that can express
geometric complexity
128Collection of Codes used to run AWM-Olsen Finite
Difference 3D Wave Propagation Software
129SCEC Computational Platform Concept
- Our experience is that getting a research
result requires more than a single program - Input data
- Multiple simulation codes
- Verification and validation work
- Knowledgeable users
- Software infrastructure
- Out of this experience, the concept of a
computational platform has emerged to refer to
a collection of codes that can be used to produce
a specific research result.
130SCEC Computational Platform Concept
- Definition of Computational Platform
- A vertically integrated collection of hardware,
software, and people that provides a broadly
useful research capability - Implied capabilities
- Validated simulation software and geophysical
models - Re-usable simulation capabilities
- Imports parameters from other systems. Exports
results to other systems - IT/geoscience collaboration involved in operation
- Access to High-performance hardware and large
scale data and metadata management. - May use Workflow management tools
131SCEC Computational Platforms
132SCEC Simulation Projects begin with Scientific
Objectives
- Extend deterministic simulations of strong ground
motions to 3 Hz for investigating the upper
frequency limit of deterministic ground-motion
prediction. - Improve the resolution of dynamic rupture
simulations by an order of magnitude for
investigating the effects of realistic friction
laws, geologic heterogeneity, and near-fault
stress states on seismic radiation. - Compute physics-based PSHA maps and validate
those using seismic and paleo-seismic data.
133PetaSHA Milestone Simulations
Simulation Volumes: V1 Northridge domain; V2 PSHA
site volume; V3 regional M7.7 domain; V4 regional M8.1 domain
http://scecdata.usc.edu/petasha
134(No Transcript)
135(No Transcript)
136Petascale computing will be needed for SHA
simulations
137Possible Science and Cyberinfrastructure Lifecycle
New Science Research goals
motivate
enable
New infrastructure capabilities
New infrastructure capabilities
New Science Research goals
enable
motivate
138Possibly Improved Science and Cyberinfrastructure
Lifecycle Advancement of Knowledge
New Science Research goals
New Knowledge and Understanding
New Science Research goals
enable
motivate
enable
motivate
New infrastructure capabilities
New infrastructure capabilities
139End
140Abstract Workflow Reduction
141Optimizing from the point of view of Virtual Data
Job c
Job a
Job b
Job f
Job e
Job d
Job g
Job h
Job i
- Jobs d, e, f have output files that have been
found in the Replica Location Service. - Additional jobs are deleted.
- All jobs (a, b, c, d, e, f) are removed from the
DAG.
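A minimal Python sketch of this reduction, treating the workflow as a parent/child DAG and the Replica Location Service as a set of already-registered logical file names; Pegasus operates on its own workflow representation, so this is illustrative only:

  # Illustrative only: prune jobs whose outputs already exist, then prune ancestors
  # whose results are no longer needed by any remaining job.
  def reduce_workflow(jobs, children, outputs, replica_catalog):
      """jobs: set of job ids; children[j]: set of child job ids;
      outputs[j]: set of logical files produced by j; replica_catalog: set of existing files."""
      removed = {j for j in jobs if outputs[j] and outputs[j] <= replica_catalog}
      changed = True
      while changed:  # cascade: a job is unneeded once every child of it is removed
          changed = False
          for j in jobs - removed:
              if children[j] and children[j] <= removed:
                  removed.add(j)
                  changed = True
      return jobs - removed

  # In the slide's example: the outputs of d, e, f are already registered, so d, e, f
  # are cut, their upstream-only parents a, b, c are cut as well, and g, h, i still run.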
142Planner picks execution and replica
locations Plans for staging data in
Job c
adding transfer nodes for the input files for the
root nodes
Job a
Job b
Job f
Job e
Job d
Job g
Job h
Job i
143Staging data out and registering new derived
products in the RLS
Job c
Job a
Job b
Job f
Job e
Job d
Job g
Job h
Staging and registering for each job that
materializes data (g, h, i ).
Job i
KEY: the original node; input transfer node; registration node; output transfer node; node deleted by the reduction algorithm
144(No Transcript)