Transcript and Presenter's Notes

Title: Application Requirements Petaflop Computing


1
Application Requirements: Petaflop Computing
  • Geoffrey Fox
  • Computer Science, Informatics, Physics
  • Indiana University, Bloomington IN 47404
  • gcf@indiana.edu

2
Petaflop Studies
  • Recent Livermore Meeting on Processor in Memory
    Systems
  • http://www.epcc.ed.ac.uk/direct/newsletter4/petaflops.html (1999)
  • http://www.cacr.caltech.edu/pflops2/ (1999)
  • Several earlier special sessions and workshops
  • Feb. 94 Pasadena Workshop on Enabling
    Technologies for Petaflops Computing Systems
  • March 95 Petaflops Workshop at Frontiers'95
  • Aug. 95 Bodega Bay Workshop on Applications
  • PETA online: http://cesdis.gsfc.nasa.gov/petaflops/peta.html
  • Jan. 96 NSF Call for 100 TF "Point Designs"
  • April 96 Oxnard Petaflops Architecture Workshop
    (PAWS) on Architectures
  • June 96 Bodega Bay Petaflops Workshop on System
    Software

3
Crude Classification
  • Classic Petaflop MPP
  • Latency: 1 to 10 microseconds
  • Single (petaflop) machine
  • Tightly coupled problems
  • Classic Petaflop Grid
  • Network latency: 10-100 or more milliseconds
  • Computer latency: < 1 millisecond (routing time)
  • e.g. 100 networked 10-teraflop machines
  • Only works for loosely coupled modules (rough latency sketch below)
  • Bandwidth is not usually the hard problem
  • Current studies largely science and engineering
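As a rough back-of-the-envelope illustration of the classification above, the sketch below estimates the fraction of time lost to latency in each regime; the latencies come from this slide, while the work-per-message figures are invented purely for illustration.

```python
# Rough sketch: fraction of time lost to latency per message exchange.
# Latencies are the ones quoted on this slide; the amount of work done
# between messages is an invented illustrative number.

def latency_fraction(flops_between_messages, peak_flops, latency_s):
    """Fraction of elapsed time spent waiting on one message latency."""
    compute_time = flops_between_messages / peak_flops
    return latency_s / (latency_s + compute_time)

PEAK = 1e15  # a petaflop/s of aggregate compute

# Classic Petaflop MPP: ~5 microsecond latency, tightly coupled work
print(latency_fraction(1e12, PEAK, 5e-6))    # ~0.005 -> latency negligible

# Classic Petaflop Grid: ~50 ms latency between 100 x 10 TF machines
print(latency_fraction(1e12, PEAK, 50e-3))   # ~0.98  -> latency dominates

# Only loosely coupled modules (huge work between messages) stay efficient
print(latency_fraction(1e17, PEAK, 50e-3))   # ~0.0005
```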

4
Styles in Problem Architectures I
  • Classic Engineering and Scientific Simulation: FEM, Particle Dynamics, Moments, Monte Carlo
  • CFD, Cosmology, QCD, Chemistry, ...
  • Work ∝ Memory^(4/3) (derivation sketched after this list)
  • Needs classic low latency MPP
  • Classic Loosely Coupled Grid: Ocean-Atmosphere, Wing-Engine-Fuselage-Electromagnetics-Acoustics
  • Few-way functional parallelism
  • ASCI
  • Generate Data → Analyse Data → Visualize is a 3-way Grid
  • Classic MPP or few-way distributed not-so-big MPPs
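The Memory^(4/3) scaling above follows from a standard argument for explicit 3D time-stepped simulations, sketched below; the assumption that the number of time steps grows with the linear resolution is the usual stability/accuracy condition and is not stated on the slide itself.

```latex
% Standard scaling argument, assuming a 3D grid with n points per
% dimension (N = n^3 points) and an explicit time-stepped method whose
% step count grows with the linear resolution n.
\begin{align*}
  \text{Memory}          &\propto N = n^{3}\\
  \text{Work per step}   &\propto N\\
  \text{Number of steps} &\propto n = N^{1/3}\\
  \text{Total work}      &\propto N \cdot N^{1/3} = N^{4/3}
                          \propto \text{Memory}^{4/3}
\end{align*}
```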

5
Classic MPP Software Issues
  • Large scale parallel successes mainly using MPI
  • MPI: low level and initial effort hard, but
  • Portable
  • Packaged as libraries like PETSc
  • Scalable to very large machines
  • Good to have higher level interfaces
  • DoE Common Component Architecture (CCA) packaging of modules will work at coarse grain size
  • Can build HPF/Fortran90 parallel arrays (I extend this with HPJava) but it is hard to support general complex data structures
  • We should restart parallel computing research
  • Note the Grid is set up (tomorrow) as a set of Web services; this is totally message based (as is CCA)
  • Run-time compilation to inline a SOAP message to an MPI message to a Java method call
  • Opposite model: Message Passing is high level (minimal MPI sketch below)
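For concreteness, here is a minimal sketch of the low-level message-passing style the slide contrasts with higher-level interfaces; mpi4py is used only for brevity of illustration and is not a binding mentioned in the talk.

```python
# Minimal sketch of low-level MPI-style message passing (via mpi4py,
# chosen only for brevity). Run with: mpirun -n 2 python this_file.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # The programmer spells out every data movement by hand: this is
    # the "low level and initial effort hard" part of the slide ...
    comm.send({"field": [1.0, 2.0, 3.0]}, dest=1, tag=7)
elif rank == 1:
    # ... but the identical code runs on a workstation or a large MPP,
    # which is the "portable" and "scalable" part.
    data = comm.recv(source=0, tag=7)
    print("rank 1 received", data)
```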

6
Styles in Problem Architectures II
  • Data Assimilation: combination of a sophisticated (parallel) algorithm and a real-time fit to data (toy sketch below)
  • Environment: Climate, Weather, Ocean
  • Target-tracking
  • Growing number of applications (in earth science)
  • Classic low latency MPP with good I/O
  • Of growing importance due to Moore's law applied to sensors and large investments in new instruments by NASA, NSF
  • Need improved algorithms to avoid being
    overwhelmed by data
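To make the "real-time fit to data" idea concrete, the toy sketch below blends a scalar model forecast with streamed observations, weighting each by its error variance; operational assimilation systems use far more sophisticated (and parallel) variational or ensemble algorithms, so this is illustration only.

```python
# Toy scalar sketch of data assimilation: blend a model forecast with
# an observation, weighting each by the inverse of its error variance.
# Real systems assimilate huge state vectors on low-latency MPPs.

def assimilate(forecast, forecast_var, obs, obs_var):
    gain = forecast_var / (forecast_var + obs_var)   # how much to trust the data
    analysis = forecast + gain * (obs - forecast)    # corrected state
    analysis_var = (1.0 - gain) * forecast_var       # reduced uncertainty
    return analysis, analysis_var

state, var = 15.0, 4.0            # model forecast (e.g. temperature) and variance
for obs in [14.2, 14.5, 14.1]:    # incoming sensor readings
    state, var = assimilate(state, var, obs, obs_var=1.0)
    print(f"analysis = {state:.2f}, variance = {var:.2f}")
```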

7
Styles in Problem Architectures III
  • Data Deluge Grid: massive distributed data analyzed in embarrassingly parallel fashion (pattern sketched below)
  • Virtual Observatory
  • Medical Image Databases (e.g. Mammography)
  • Genomics (distributed gene analysis)
  • Particle Physics Accelerators (100 PB by 2010)
  • Classic Distributed Data Grid
  • Corresponds to the fields X-Informatics (X = Bio, Laboratory, Chemistry, ...)
  • See http://www.grid2002.org
  • Underlies the e-Science initiative in the UK
  • Industrial applications include health, equipment monitoring (Rolls Royce generates gigabytes of data per engine per flight), transactional databases
  • DoD sees this in places like Aberdeen Proving
    Ground (Test and Evaluation)
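The embarrassingly parallel pattern behind these data-deluge applications can be sketched as below: each file (image, gene set, event batch) is analyzed independently, so the work scales out over however many processors or Grid sites hold the data. The file names and the analyze() body are placeholders, not anything from the talk.

```python
# Sketch of embarrassingly parallel data analysis: every input file is
# processed independently, with no communication between tasks.
from multiprocessing import Pool

def analyze(path):
    """Stand-in for an independent per-file analysis (mammogram scan,
    gene comparison, particle-physics event batch, ...)."""
    with open(path, "rb") as f:
        return path, len(f.read())          # trivial placeholder result

if __name__ == "__main__":
    files = ["part-0000.dat", "part-0001.dat", "part-0002.dat"]  # hypothetical inputs
    with Pool() as pool:                    # one worker per local CPU;
        results = pool.map(analyze, files)  # a Grid spreads this over sites
    for path, size in results:
        print(path, size)
```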

8
Styles in Problem Architectures IV
  • Complex Systems: simulations of sets of often non-fundamental entities with phenomenological or idealized interactions. Often multi-scale, systems of systems, and can be real Grids; data-intensive simulation
  • Critical or National Infrastructure Simulations
    (power grid)
  • Biocomplexity (molecules, proteins, cells,
    organisms)
  • Geocomplexity (grains, faults, fault systems,
    plates)
  • Semantic Web (simulated) and Neural Networks
  • Exhibit phase transitions, emergent network
    structure (small worlds)
  • Data used in equations of motion as well as
    initial conditions (data assimilation)
  • Several fields (e.g. biocomplexity) are immature
    and not currently using major MPP time
  • Could set a (computational) Grid to catch a (real) Grid, but many cases will need a real petaflop MPP

9
Styles in Problem Architectures V
  • Although problems are hierarchical and multi-scale, it is not obvious that one can use a Grid (putting each subsystem on a different Grid node), as the ratio of Grid latency to MPP latency is typically 10^4 or more and most algorithms can't accommodate this
  • X-Informatics is the data (information) aspect of field X; X-complexity integrates mathematics, simulation and data
  • Military simulations (using HLA/RTI from DMSO)
    are of this style
  • Entities in a complex system could be vehicles, forces
  • Or packets in a network simulation
  • T&E: DoD Integrated Modeling and Testing (IMT) is also of this data-intensive simulation style

10
Societal Scale Applications
  • Environment: Climate, Weather, Earthquakes
  • Health: Epidemics
  • Critical Infrastructure
  • Electrical Power
  • Water, Gas, Internet (all the real Grids)
  • Wild Fire (weather + fire)
  • Transportation: TRANSIMS from Los Alamos
  • All parallelize well due to geometric structure
  • Military: HLA/RTI (DMSO)
  • HLA/RTI usually uses event-driven simulations, but the future could be classic time-stepped simulations, as these appear to work in many cases IF you define them at fine enough grain size
  • Homeland Security?

11
Electric Power Details
12
Data Intensive Requirements
  • Grid-like: accelerator, satellite, sensor data from distributed resources
  • Particle Physics: all parts of the process essentially independent; 10^12 events giving 10^16 bytes of data per year (arithmetic sketched below)
  • Happy with tens of thousands of PCs at ALL stages of analysis
  • Size reduction as one proceeds through the different stages
  • Need to select interesting data at each stage
  • Data Assimilation: start with Grid-like gathering of data (similar in size to particle physics) and reduce the size by a factor of 1000
  • Note particle physics doesn't reduce data size but maintains the embarrassingly parallel structure
  • Size reduction probably determined by computer
    realism as much as by algorithms
  • Tightly coupled analysis combining data and
    PDE/coupled ODE
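The quick arithmetic below just restates the numbers on this slide: bytes per event, the implied sustained data rate, and the volume left after a factor-of-1000 assimilation-style reduction.

```python
# Quick arithmetic behind the figures quoted above.
events_per_year = 1e12
bytes_per_year  = 1e16

print(bytes_per_year / events_per_year)          # 1e4  -> about 10 kB per event

seconds_per_year = 3.15e7
print(bytes_per_year / seconds_per_year / 1e6)   # ~320 -> roughly 320 MB/s sustained

# Data assimilation path: comparable input volume reduced ~1000x
print(bytes_per_year / 1000 / 1e12)              # 10.0 -> about 10 TB/year retained
```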

13
SERVO Grid Tiers
14
HEP Grid
15
Particle Physics Web Services
A Service is just a computer process running on a (geographically distributed) machine with a message-based I/O model. It has input and output ports; data comes from users, raw data sources or other services. Big services are built hierarchically from basic services. Each service invokes a CPU farm. (A toy composition sketch follows the list of services below.)
Accelerator Data as a Webservice (WS)
Physics Model WS
Detector Model WS
ML Fit WS
Calibration WS
Data Analysis WS
PWA WS
Monte Carlo WS
Experiment Management WS
Visualization WS
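The toy sketch below illustrates the service model described on this slide: each service is a process with a message-in/message-out interface, and bigger services are composed hierarchically from basic ones. The class, the compose helper and the wiring are purely illustrative; only the service names echo the boxes above.

```python
# Toy sketch of message-based services composed hierarchically.
class Service:
    def __init__(self, name, transform):
        self.name = name
        self.transform = transform            # message in -> message out

    def invoke(self, message):
        print(f"{self.name}: handling message")
        return self.transform(message)

def compose(*services):
    """Build a bigger service from basic ones by chaining their ports."""
    def run(message):
        for service in services:
            message = service.invoke(message)
        return message
    return Service(" -> ".join(s.name for s in services), run)

calibration = Service("Calibration WS", lambda m: {**m, "calibrated": True})
analysis    = Service("Data Analysis WS", lambda m: {**m, "histograms": 42})
fit         = Service("ML Fit WS", lambda m: {**m, "fit": "converged"})

print(compose(calibration, analysis, fit).invoke({"run": 1001}))
```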
16
Particle Physics: Teraflop Analysis Portal
Earth Science: Petaflop MPP for Data Assimilation
17
Weather Requirements
18
Weather Accuracy Goals
19
Weather Methodology 2010
20
Climate Requirements (NASA)
21
Solid Earth
22
Components
  • USArray (US Seismic Array): a continental-scale seismic array to provide a coherent 3-D image of the lithosphere and deeper Earth
  • SAFOD (San Andreas Fault Observatory at Depth): a borehole observatory across the San Andreas Fault to directly measure the physical conditions under which earthquakes occur
  • PBO (Plate Boundary Observatory): a fixed array of strainmeters and GPS receivers to measure real-time deformation on a plate boundary scale
  • InSAR (Interferometric Synthetic Aperture Radar): images of tectonically active regions providing spatially continuous strain measurements over wide geographic areas
23
EarthScope Integration
  • Structural Representation
  • Structural Geology Field Investigations
  • Seismic Imaging (USArray)
  • Gravity and Electromagnetic Surveying
  • Kinematic (Deformational) Representation
  • Geologic Structures
  • Geochronology
  • Geodesy (PBO and InSAR)
  • Earthquake Seismology (ANSS)
  • Behavioral (Material Properties) Representation
  • Subsurface Sampling (SAFOD)
  • Seismic Wave Propagation
  • Structures + Deformation + Material properties → Process (community models) → Prediction

24
Facilities in Support of Science
a Facility
  • Data for Science and Education
  • Funding and Management
  • NSF Major Research Equipment Account
  • Internal NSF process
  • Interagency collaboration
  • Cooperative Agreement funding
  • Community-based management
  • MRE - $172 M / 5 years
  • Product - Data
  • Science-appropriate
  • Community-driven
  • Hazards and resources emphasis
  • Cutting edge technology
  • Free and open access

an NSF Science Program
  • Fundamental Advances in Geoscience
  • Funding and Management
  • Science-driven, research-based
  • Peer reviewed
  • Individual investigator
  • Collaborative / Multi-institutional
  • Operations - $71 M / 10 years
  • Science - $13 M / year
  • Product - Scientific Results
  • Multi-disciplinary trend
  • Cross-directorate encouragement
  • Fundamental research and applications
  • Education and human resources
25
USArray
26
San Andreas Fault Observatory at Depth
27
PBO: A Two-Tiered Deployment of Geodetic Instrumentation
  • A backbone of 100 sparsely distributed
    continuous GPS receivers to provide a synoptic
    view of the entire North American plate boundary
    deformation zone.
  • Clusters of GPS receivers and strainmeters to be
    deployed in areas requiring greater spatial and
    temporal resolution, such as fault systems and
    magmatic centers (775 GPS units and 200 strainmeters).

28
(No Transcript)
29
Computational Pathway for Seismic Hazard Analysis
Full fault system dynamics simulation:
  • FSM: Fault System Model
  • RDM: Rupture Dynamics Model
  • AWM: Anelastic Wave Model
  • SRM: Site Response Model
30
Seismic Hazard Map
31
Earth Science Computing
32
Earth Science Data
33
Can it all be done with the Grid? For particle physics: yes. For data-intensive simulations: no.
34
Societal Scale Applications Issues
  • Need to overlay with Decision Support, as problems are often optimization problems supporting tactical or strategic decisions
  • Verification and Validation needed, as the dynamics are often not fundamental
  • Related to the ASCI Dream: physics-based stewardship
  • Some of the new areas like Biocomplexity and Geocomplexity are quite primitive and have not even moved to today's parallel machines
  • Crisis Management links infrastructure simulations to a collaborative peer-to-peer Grid

35
Interesting Optimization Applications
  • Military Logistics Problems such as Manpower
    Planning for Distributed Repair/Maintenance
    Systems
  • Multi-Tiered, Multi-Modal Transportation Systems
  • Gasoline Supply Chain Model
  • Multi-level Distribution Systems
  • Supply Chain Manufacturing Coordination Problems
  • Retail Assortment Planning Problems
  • Integrated Supply Chain and Retail Promotion
    Planning
  • Large-scale Production Scheduling Problems
  • Airline Planning Problems
  • Portfolio Optimization Problems

36
Decision Application Object Framework
  • Support Policy Optimization and Simulation of
    Complex Systems
  • Whose Time Evolution Can Be Modeled Through a
    Set of Agents Independently Engaging in Evolution
    and Planning Phases, Each of Which Are
    Efficiently Parallelizable,
  • In Mathematically Sound Ways
  • That Also Support Computational Scaling

37
Intrinsic Computational Difficulties
  • Large-scale Simulations of Complex Systems
  • Typically Modeled in Terms of Networks of Interacting Agents With Incoherent, Asynchronous Interactions
  • Lack the Global Time Synchronization That Provides the Natural Parallelism Exploited in Data Parallel Applications Such As Fluid Dynamics or Structural Mechanics
  • Currently, the Interactions Between Agents are
    Modeled by Event-driven Methods that cannot be
    Parallelized Effectively
  • But increased performance (using machines like
    the Teragrid) needs massive parallelism
  • Need new approaches for large system simulations

38
Los Alamos SDS Approach
  • Networks of particles and (partial differential equation) grid points interact instantaneously, and simulations reduce to iterating calculate/communicate phases: calculate the next positions/values at a given time or iteration number (massively parallel) and then update
  • Complex systems are made of agents evolving with irregular time steps (cars stopping at traffic lights, crashing, sitting in the garage while the driver sleeps, ...)

This lack of global time synchronization stops natural parallelism in the old approaches. SDS combines iterative local planning with massively parallel updates (sketched below). The method seems general.
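A minimal sketch of the plan-then-update iteration described above: in each global step every agent plans its next state independently (massively parallel), and all updates are then applied together. The agents and their "planning" rule are invented for illustration; this is not the Los Alamos SDS code.

```python
# Minimal sketch of the SDS-style iteration: parallel local planning,
# then a synchronized update, repeated each global time step.
from multiprocessing import Pool
import random

def plan(agent):
    """Local planning: each agent picks its next state from its own
    state only, so this phase needs no communication."""
    position, speed = agent
    if random.random() < 0.2:                # e.g. stopped at a traffic light
        return position, speed
    return position + speed, speed           # otherwise advance one step

def step(agents, pool):
    """One global time step: massively parallel plan, then update all."""
    return pool.map(plan, agents)

if __name__ == "__main__":
    agents = [(10.0 * i, 1.0 + 0.1 * i) for i in range(8)]  # toy vehicles
    with Pool() as pool:
        for _ in range(3):
            agents = step(agents, pool)
    print(agents)
```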