Scalable Numerical Algorithms and Methods on the ASCI Machines

Author: Ewing Lusk. Last modified by: Tim Stitt.



1
CS61V
Overview of High-Performance Computing
Part II
2
Parallel Programming?
ENIAC, University of Pennsylvania
1946 (http://www.library.upenn.edu/special/gallery/mauchly/jwmintro.html)
3
The Need For Power
4
Computational Science
  • Traditional scientific and engineering paradigm
    • Do theory or paper design
    • Perform experiments or build system
  • Replacing both by numerical experiments
  • Real phenomena are too complicated to model by
    hand
  • Real experiments are:
    • too hard, e.g., build large wind tunnels
    • too expensive, e.g., build a throw-away passenger
      jet
    • too slow, e.g., wait for climate or galactic
      evolution
    • too dangerous, e.g., weapons, drug design

5
Computational Science Examples
  • Astrophysical thermonuclear flashes
  • Nuclear weapons
  • Weather prediction
  • Climate and atmospheric modeling
  • Drug design
  • Blood flow
  • Fluid dynamics (CFD)

6
Fluid Dynamics
  • Hairpin vortex generation
  • Forced convective heat transfer
  • Buoyant convection
  • Rayleigh-Taylor instability

7
Hairpin Vortices - Transition to Turbulence
  • Boundary layer flow past a hemispherical
    roughness element
  • Re = 200-2000 based on hemisphere height
  • K = 512-8168 spectral elements of polynomial degree
    N = 7-15

8
Simulation Cost
  • Cost is O(Re^3)
  • Re = 1K simulation: 1 week on 512 processors of
    ASCI Red
    • 50 GF, 64 GB
  • Re = 10K: 1 year on all 8192 processors of ASCI
    Red
    • 800 GF, 1 TB
  • We're really interested in Re = 1M
  • Can't even think of doing the Re = 1K problem on a
    uniprocessor machine, let alone the 10K or 1M
    problems!
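
The O(Re^3) scaling can be checked against the quoted run times with a little arithmetic. The sketch below is only an illustration: it assumes the exponent is exactly 3 and that the code scales perfectly with processor count, and the function name scaled_weeks is hypothetical, not something from the slides.

```python
# Sketch: extrapolate run time assuming cost grows as Re^3 and that the
# code scales perfectly with processor count (both are idealizations).

def scaled_weeks(base_weeks, base_procs, base_re, new_re, new_procs, exponent=3):
    """Estimate wall-clock weeks at new_re on new_procs processors."""
    # Total work in processor-weeks, grown by (new_re / base_re) ** exponent.
    work = base_weeks * base_procs * (new_re / base_re) ** exponent
    return work / new_procs

# Baseline from the slide: Re = 1K takes 1 week on 512 processors.
weeks_10k = scaled_weeks(1, 512, 1_000, 10_000, 8192)
weeks_1m = scaled_weeks(1, 512, 1_000, 1_000_000, 8192)

print(f"Re = 10K on 8192 processors: about {weeks_10k:.1f} weeks")
print(f"Re = 1M on 8192 processors: about {weeks_1m / 52:.2e} years")
```

Under these assumptions the Re = 10K estimate comes out close to the one year quoted above, while Re = 1M is out of reach by many orders of magnitude on the same machine.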

9
The Necessity of Parallel Computing
10
How fast can a serial computer be?
[Figure: 1 Tflop, 1 TB sequential machine, r = 0.3 mm]
  • Consider the 1 Tflop sequential machine
  • data must travel some distance, r, to get from
    memory to CPU
  • to get 1 data element per cycle, this means 10^12
    times per second at the speed of light, c = 3e8
    m/s
  • r < c/10^12 = 0.3 mm
  • Now put 1 TB of storage in a 0.3 mm^2 area
  • each word occupies about 3 Angstroms^2, the size
    of a small atom
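
The speed-of-light bound above is easy to reproduce numerically. This is a rough sketch under stated assumptions: 1 TB is taken as 10^12 bytes, the memory is assumed to fill a square of side r, and the result is given per byte, so it only confirms the order of magnitude (atomic scale) rather than the exact figure on the slide.

```python
# Sketch: how close must memory sit to a 1 Tflop serial CPU, and how
# densely must 1 TB be packed into that space?

C = 3.0e8            # speed of light in m/s, as used on the slide
RATE = 1.0e12        # one data element per cycle, 10^12 cycles per second
BYTES = 1.0e12       # 1 TB taken as 10^12 bytes (assumption)
ANGSTROM2 = 1.0e-20  # one square angstrom expressed in m^2

r = C / RATE                  # farthest a datum can travel in one cycle, in m
area = r * r                  # assumes memory fills an r-by-r square
area_per_byte = area / BYTES  # m^2 available to each byte

print(f"r = {r * 1e3:.1f} mm")
print(f"area per byte = {area_per_byte / ANGSTROM2:.0f} square angstroms")
```

Either way the count is done, each stored element gets only a few square angstroms, which is the size of an atom.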

11
Even if we could make it ...
  • ... it'd be too expensive
  • Market forces are dictating use of COTS
    (commercial off-the-shelf) components

19
The Solution?
  • Add more workers!
  • Use a collection of processors and memory
    modules to work together to solve our problems
  • Supercomputers, MPPs, Clusters, Beowulfs

20
Bad News
21
Still Lots of Work
  • Decide on and implement an interconnection
    network for the processors and memory modules
  • Design and implement system software for the
    hardware
  • Devise algorithms and data structures for solving
    our problems
  • Divide the algorithms and data structures up into
    subproblems
  • Identify the communication that will be needed
    between the subproblems
  • Assign subproblems to processors and memory
    modules
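
The last three steps (divide, identify communication, assign) can be sketched for the simplest case. The example below assumes a hypothetical 1D grid where each point needs values from its immediate neighbours; the function names block_ranges and neighbours are illustrative only and do not come from the slides or from any library.

```python
# Sketch: split a 1D grid into contiguous blocks, one per processor, and
# list which processors each block must exchange boundary values with.

def block_ranges(n, p):
    """Split indices 0..n-1 into p contiguous blocks of near-equal size."""
    base, extra = divmod(n, p)
    ranges = []
    start = 0
    for rank in range(p):
        # The first `extra` blocks take one additional point each.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

def neighbours(rank, p):
    """Ranks a block communicates with (1D, non-periodic boundaries)."""
    return [q for q in (rank - 1, rank + 1) if 0 <= q < p]

n_points, n_procs = 10, 4
for rank, (lo, hi) in enumerate(block_ranges(n_points, n_procs)):
    print(f"processor {rank}: points [{lo}, {hi}) "
          f"exchanges with {neighbours(rank, n_procs)}")
```

Contiguous blocks keep communication confined to block edges; real problems in two or three dimensions need a more careful trade-off between load balance and communication volume.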

36
Modern Layered Framework