Role of spectral turbulence simulations in developing HPC systems (presentation transcript)
1
Role of spectral turbulence simulations in
developing HPC systems
  • YOKOKAWA, Mitsuo
  • Next-Generation Supercomputer R&D Center
  • RIKEN

2
Background
  • Experience of developing the Earth Simulator
  • a 40 Tflops vector-type distributed-memory supercomputer system
  • A simulation code for box turbulence flow was used in the final adjustment of the system.
  • A large simulation of box turbulence flow was carried out.
  • A peta-flops supercomputer project

3
Contents
  • Simulations on the Earth Simulator
  • A Japanese peta-scale supercomputer project
  • Trends of HPC system
  • Summary

4
Simulations on the Earth Simulator
5
The Earth Simulator
  • It was completed in 2002.
  • It achieved a sustained 35.86 Tflops in the LINPACK benchmark.
  • It was chosen by TIME as one of the best inventions of 2002.

6
Why did I do it?
  • It was important to evaluate the performance of the Earth Simulator during the final adjustment phase.
  • Suitable codes had to be chosen
  • to evaluate the performance of the vector processors,
  • to measure the performance of all-to-all communication among compute nodes through the crossbar switch,
  • to make operation of the Earth Simulator stable.
  • Candidates
  • LINPACK benchmark?
  • Atmospheric general circulation model (AGCM)?
  • Any other code?

7
Why did I do it? (cont'd)
  • Spectral turbulence simulation code
  • an intensive computational kernel plus a lot of data communication
  • simple code
  • significance to computational science:
  • one of the grand challenges in computational science and high-performance computing
  • A new spectral code for the Earth Simulator (a minimal sketch follows below)
  • Fourier spectral method for spatial discretization
  • mode truncation and phase-shift techniques to control aliasing error when calculating the nonlinear terms
  • fourth-order Runge-Kutta method for time integration
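As a minimal sketch of the same recipe, assuming nothing about the actual Earth Simulator code, the toy Python example below applies a Fourier spectral discretization, 2/3-rule mode truncation against aliasing, and fourth-order Runge-Kutta time stepping to the 1D Burgers equation; the phase-shift technique and the full 3D Navier-Stokes machinery are omitted.

```python
# Toy pseudo-spectral solver for the 1D Burgers equation,
#   u_t + u u_x = nu * u_xx,
# illustrating the ingredients named on this slide: Fourier spectral
# discretization, mode truncation (2/3 rule) against aliasing, and
# classical 4th-order Runge-Kutta time integration. This is a sketch,
# not the Earth Simulator code.
import numpy as np

N = 256                                   # grid points / Fourier modes
nu = 1e-3                                 # viscosity
x = 2.0 * np.pi * np.arange(N) / N
k = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
keep = np.abs(k) < N / 3.0                # 2/3-rule truncation mask

def rhs(u_hat):
    """Spectral-space right-hand side, with the nonlinear term dealiased."""
    u = np.fft.ifft(u_hat).real
    nl = 1j * k * np.fft.fft(0.5 * u * u)  # transform of d/dx (u^2 / 2)
    nl[~keep] = 0.0                        # mode truncation removes aliases
    return -nl - nu * k**2 * u_hat

def rk4_step(u_hat, dt):
    """One classical 4th-order Runge-Kutta step."""
    k1 = rhs(u_hat)
    k2 = rhs(u_hat + 0.5 * dt * k1)
    k3 = rhs(u_hat + 0.5 * dt * k2)
    k4 = rhs(u_hat + dt * k3)
    return u_hat + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

u_hat = np.fft.fft(np.sin(x))              # smooth initial condition
for _ in range(1000):
    u_hat = rk4_step(u_hat, dt=1e-3)
print("max |u| after 1000 steps:", np.abs(np.fft.ifft(u_hat).real).max())
```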

8
Points of coding
  • Optimization for the Earth Simulator
  • coordinated assignment of the calculation to three levels of parallelism (vector processing, micro-tasking, and MPI parallelization)
  • higher-radix FFT
  • B/F ratio, i.e., the data transfer rate between CPU and memory relative to arithmetic performance (a rough estimate follows below)
  • removal of redundant processes and variables
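To suggest why a higher-radix FFT eases the B/F constraint, here is a back-of-envelope Python estimate. It assumes that each of the log_r(N) butterfly passes streams the whole complex array through memory once; the constants are assumptions, but the trend is the point: fewer passes over memory per flop means a lower B/F demand.

```python
# Back-of-envelope bytes-per-flop (B/F) demand of a radix-r FFT of length N,
# assuming each of the log_r(N) passes reads and writes the whole array of
# 16-byte complex doubles. Real kernels differ in the constants; only the
# trend (higher radix -> fewer memory passes -> lower B/F demand) matters.
import math

def fft_bf(N, radix, bytes_per_elem=16):
    passes = math.log(N, radix)                      # passes over the data
    bytes_moved = 2.0 * N * bytes_per_elem * passes  # read + write per pass
    flops = 5.0 * N * math.log2(N)                   # standard FFT estimate
    return bytes_moved / flops

for r in (2, 4, 8):
    print(f"radix-{r}: B/F demand ~ {fft_bf(2**20, r):.2f}")
```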

9
Calculation for one time step
[Figure: wall-clock time per time step (0.01 to 100 s, log scale) versus number of nodes (64 to 512); measured points include 30.7 s and 3.21 s per step.]
10
Performance
[Figure: sustained performance in Tflops (log scale) versus number of PNs (64 to 512); 16.4 Tflops achieved, about 50% of peak (single-precision analytical FLOP count).]
11
Achievement of box turbulence flow simulations
  • Orszag (1969), IBM 360-95: 32³
  • Siggia (1981), Cray-1, NCAR: 64³
  • Kerr (1985), Cray-1S, NCAR: 128³
  • Jimenez et al. (1993), Caltech Delta machine: 512³
  • Yamamoto (1994), Numerical Wind Tunnel: 240³
  • Gotoh and Fukayama (2001), VPP5000/56, NUCC: 1024³
  • K-I-Y (2002), Earth Simulator: 2048³ and 4096³
  (Entries give the number of grid points in each simulation.)
12
A Japanese Peta-Scale Supercomputer Project
13
Next-Generation Supercomputer Project
  • Objectives, as one of Japan's Key Technologies of National Importance:
  • to develop the world's most advanced, highest-performance supercomputer, and
  • to develop and deploy its usage technologies as well as application software.
  • Period and budget: FY2006-FY2012, $1 billion US (expected)
  • RIKEN (The Institute of Physical and Chemical Research) plays the central role in the project, developing the supercomputer under the law.

14
Goals of the project
  • Development and installation of the most advanced, high-performance supercomputer system, with a LINPACK performance of 10 petaflops.
  • Development and deployment of application software, built to attain the system's maximum capability, in various science and engineering fields.
  • Establishment of an Advanced Computational Science and Technology Center (tentative name) as one of the Centers of Excellence for research, personnel development, and training, built around the supercomputer.

15
Major applications for the system
Grand Challenges
16
Configuration of the system
  • The Next-Generation Supercomputer will be a
    hybrid general-purpose supercomputer that
    provides the optimum computing environment for a
    wide range of simulations.
  • Calculations will be performed in processing
    units that are suitable for the particular
    simulation.
  • Parallel processing in a hybrid configuration of
    scalar and vector units will make larger and more
    complex simulations possible.

17
Roadmap of the project
[Roadmap figure; the current position is marked "We are here."]
18
Location of the supercomputer site: Kobe City
450 km (280 miles) west of Tokyo
19
Artist's image of the building
20
Photo of the site (under construction)
June 10, 2008
July 17, 2008
Aug. 20, 2008
Photos taken from the south side
21
Trends of HPC system
22
Trends of HPC system
  • Systems will have a large number of processors, around 1 million or more.
  • Each chip will be a multi-core (8, 16, or 32 cores) or many-core (more than 64 cores) processor:
  • low performance per core,
  • small main memory capacity per core,
  • fine-grain parallelism.
  • Each processor will consume little energy (low-power processors).
  • Bandwidth between CPU and main memory will be narrow
  • bottleneck in the number of signal pins.
  • Bisection bandwidth among compute nodes will be narrow
  • one-to-one connections are very expensive and power-consuming.

23
Impact to spectral simulations
  • High performance in the LINPACK benchmark
  • The more processors, the higher the LINPACK performance.
  • LINPACK performance does not necessarily reflect real-world application performance, especially for spectral simulations.
  • Small memory capacity per processor
  • fine-grain decomposition of space
  • increasing communication cost among parallel compute nodes
  • Narrow memory bandwidth and narrow inter-node bisection bandwidth (see the cost model below)
  • memory wall problem and low all-to-all communication performance
  • necessity of a low-B/F algorithm in place of the FFT
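To make the bisection-bandwidth point concrete, the crude lower-bound model below (my own sketch, not from the slides) treats each transpose of a slab- or pencil-decomposed 3D FFT as an all-to-all that moves the entire field across the network bisection; the bandwidth figures are illustrative assumptions, not measured machines.

```python
# Crude lower bound on the communication time of one distributed 3D FFT of
# an N^3 complex-double field: each of the (assumed) two transposes is an
# all-to-all that moves the whole field across the network bisection.
def fft3d_comm_seconds(N, bisection_gb_per_s, transposes=2, bytes_per_elem=16):
    total_bytes = (N ** 3) * bytes_per_elem * transposes
    return total_bytes / (bisection_gb_per_s * 1e9)

# Illustrative (assumed) bisection bandwidths in GB/s:
for bw in (10_000, 1_000, 100):
    t = fft3d_comm_seconds(4096, bw)
    print(f"bisection {bw:>6} GB/s: >= {t:8.1f} s per 4096^3 3D FFT")
```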

24
Impact to spectral simulations (contd)
  • The trend does not fit 3D FFTs well; in other words, box turbulence simulations are becoming difficult to perform.
  • We will be able to use more and more computational resources in the near future,
  • but finer-resolution simulations with spectral methods will need very long calculation times because communication among parallel compute nodes is extremely slow, and we might not be able to obtain final results in a reasonable time.

25
Estimates for simulations larger than 4096³
  • If 500 Tflops of sustained performance can be used (a consistency check follows below):
  • an 8192³ simulation needs
  • 7 seconds per time step,
  • 100 TB of total memory,
  • 8 days for 100,000 steps and 1 PB of storage for a complete simulation;
  • a 16384³ simulation needs
  • 1 minute per time step,
  • 800 TB of total memory,
  • 3 months for 125,000 steps and 10 PB in total for a complete simulation.
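A quick arithmetic check of the quoted memory figures (my own calculation, not from the slide): going from 8192³ to 16384³ multiplies the grid-point count by 8, and the quoted memory grows by the same factor, implying a fixed byte count per grid point.

```python
# Consistency check of the quoted memory sizes: the bytes per grid point
# should match across the two resolutions if only the grid grows.
for n, mem_tb in ((8192, 100), (16384, 800)):
    bpp = mem_tb * 1e12 / n ** 3
    print(f"{n}^3: {bpp:.0f} bytes per grid point")
# Both give ~182 bytes/point, i.e., roughly a dozen double-precision complex
# 3D fields; the slide does not state the exact variable count.
```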

26
Summary
  • Spectral methods are very useful algorithms for evaluating HPC systems.
  • In this sense, the trend of HPC system architecture is getting worse:
  • even if the peak performance of the system is very high,
  • we cannot expect high sustained performance, and
  • it may take a long time to finish a simulation due to very slow data transfer between nodes.
  • Can we discard spectral methods and change the algorithm? Or, do we have to
  • put strong pressure on the computer architecture community, and
  • think about international collaboration to develop a supercomputer system that fits turbulence studies?
  • I would think of such an HPC system as a particle accelerator, like CERN's.