Title: High Performance Computing Methods
1High Performance Computing Methods
Content and information
Ralf Gruber, EPFL-STI-LIN ralf.gruber_at_epfl.ch
2HPC methods
AT: What you hear today need not be valid tomorrow. These days, you must be flexible at all times.
RG: Theory is good, examples are better, even if they are only valid at the time they are executed.
HPC methods: numerical experimentation
4 credits
3Content
Part 1 Computer architectures and optimisation
Part 2 Approximation methods
Part 3 Efficient computing
Exercises
4Part 1 Computer architectures and optimisation
1.1. Evolution of supercomputing
1.1.1. Introduction to HPC
1.1.2. Historical aspects: Hardware
1.1.3. Historical aspects: Software and algorithmics
1.1.4. Bibliography
1.2. Single processor architecture and optimisation
1.2.1. Memory and processor architectures
1.2.2. Data representation and pipelining
1.2.3. System software issues
1.2.4. Application related single processor parameters
1.2.5. Application optimisation on a single processor
1.3. Parallel computer architectures
1.3.1. SMP and NUMA architectures
1.3.2. Communication network technologies
1.3.3. Cluster architectures
1.3.4. Parameterisation of parallel applications
1.3.5. Grid computing
5Part 2 Approximation methods
2.1. Stable numerical approach
2.1.1. grad-div equations: Primal form
2.1.2. Bilinear elements leading to spectral pollution
2.1.3. Ecological solution for a Cartesian grid
2.1.4. Problems with triangular meshes
2.1.5. Spectral pollution for a non-Cartesian grid
2.1.6. Non-polluting finite hybrid element method
2.1.7. Dual formulation approach
2.1.8. Isoparametric transformation
2.1.9. Bibliography
2.2. Improve precision by an h-p approach
2.2.1. Standard h-p finite elements
2.2.2. Hybrid h-p Method
2.3. Improve precision by mesh adaptation
2.3.1. The redistribution (r) method
2.3.2. 1D example: Optimal boundary layer
2.3.3. 2D example: The Gyrotron
6Part 3 Efficient computing
3.1. Programme environment for rapid prototyping
3.1.1. Domain decomposition
3.1.2. Memcom
3.1.3. Astrid
3.1.4. Baspl
3.1.5. Gyrotron example
3.1.6. 3D example: S
3.1.7. Electrofilter
3.2. Mathematical libraries
3.2.1. Introduction on direct and iterative solvers
3.2.2. BLAS, LAPACK, ScaLAPACK, ARPACK
3.2.3. MUMPS: Direct matrix solver
3.2.4. PETSc: Iterative matrix solver
3.2.5. FFTW, PRNG
3.2.6. Matlab
3.2.7. Visualisation
3.3. Parallel computing
3.3.1. Direct matrix solver
3.3.2. Iterative matrix solver and MPI implementation
7Exercises
E1 Exercises proposed during the course
E2 Practical work
- proposed by attendees
- proposed by RG
8Proposals for practical work (together with
specialists)
Parallel Poisson finite element solver
1. For the existing 2D solver, implement an interface to PETSc
2. For the existing 2D solver, implement an interface to MUMPS
3. For the existing 2D solver, implement a preconditioner
Grad-div eigenvalue solvers using the h-p method
4. Convergence study for primal and dual forms with quadrangles and a Cartesian grid
5. Influence of a non-Cartesian grid
6. Triangular elements
7. Replace the LAPACK eigenvalue solver by a more efficient one (ARPACK)
Personal programme
8. Optimise an existing programme
9. Parallelise an existing serial programme
10. Replace a solver by a more efficient one
Plasma physics programme
11. Optimise VMEC
12. Optimise TERPSICHORE
9Exam
Practical work: 60 %, 15 min presentation and questions
Questions on the course: 40 %, 15 min
10Dates/Places
Course
Thursday 16:15-18:00, 27.10./3.11.05, CM1 06
Friday 10:15-12:00, 28.10.05-27.1.06, ME B31
Exercises
Thursday 16:15-18:00, 10.11.-9.2.06, CM103
Friday 10:15-12:00, 3.2./10.2.06, ?
11Team
Course: Ralf.Gruber_at_epfl.ch (35906)
Exercises:
Vincent Keller (33856)
Ralf Gruber
Trach-Minh Tran: MPICH, math. libraries, Linux, plasma physics
Ali Tolou (33565): Pleiades, Linux
12Server architectures: the past at EPFL
[Figure: timeline with year axis (ticks 86, 90, 94, 98, 02, 06, 10); labels: "Computer architectures", "buy HPC", "Application-related R&D"]
13Parallel computer architectures accessible by
EPFL
Cluster    Site      Vendor  Node       Procs/node  Network 1       Network 2
NoW        LIN-EPFL  Logics  Pentium 4  1           FE Bus (EPNET)  -
Pleiades1  STI-EPFL  Logics  Pentium 4  1           FE switch       -
Pleiades2  STI-EPFL  DELL    Xeon 64    1           GbE switch      -
Mizar      DIT-EPFL  Dalco   Opteron    2           Myrinet         -
BlueGene   DIT-EPFL  IBM     Power 4    2           3D Grid/Torus   Fat Tree
Horizon    CSCS      Cray    Opteron    1           3D Grid/Torus   -
SX-5       CSCS      NEC     vector     1           SMP             -
Regatta    CSCS      IBM     Power 4    1           Colony          -
WWW        Internet
14Parallel computer architectures accessible by
EPFL
Cluster    P     R [Gflop/s]  P·R [Gflop/s]  M [Gwords/s]  VM [f/w]  C [Gwords/s]  VC [f/w]  L [μs]  B
NoW        10    6            60             0.8           7.5       0.0032        19200     60      750
Pleiades1  132   5.6          739            0.8           7         0.4           1792      60      750
Pleiades2  120   5.6          672            0.8           7         3.75          179       60      7500
Mizar      160   9.6          1536           1.6           6         10            154       10      2500
BlueGene   4096  5.6          22937          0.7           8         1065          22        2.5     4800
Horizon    1100  5.2          5720           0.8           6.5       1760          3.3       6.8     52000
SX-5       16    8            128            8             1         128           -         -       -
Regatta    256   5            1300           0.4           12        16            80        10      640

WWW: 100/F  8  800  0.016  5000  1000  64000
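
The derived columns of this table appear to follow from the basic ones. Most listed values are consistent, up to rounding, with the total peak being the product of P and R, with VM = R/M, and with VC = (P·R)/C. This reading is inferred from the numbers and is not stated on the slide; a few entries deviate from it (for example, the VC of Pleiades1 matches 128 rather than 132 processors). A minimal Python sketch, with a hypothetical helper name, that recomputes the ratios for three rows:

```python
# Recompute the derived columns of the cluster table.
# Assumption (inferred from the table values, not stated in the source):
#   total peak = P * R
#   VM         = R / M         (flops per word moved from memory)
#   VC         = (P * R) / C   (flops per word moved over the network)

def derived(P, R, M, C):
    """Return (total_peak, VM, VC) for one cluster row."""
    total = P * R
    return total, R / M, total / C

# Rows copied from the table: P, R [Gflop/s], M [Gwords/s], C [Gwords/s]
rows = {
    "Pleiades2": (120, 5.6, 0.8, 3.75),
    "Mizar": (160, 9.6, 1.6, 10.0),
    "Horizon": (1100, 5.2, 0.8, 1760.0),
}

for name, (P, R, M, C) in rows.items():
    total, vm, vc = derived(P, R, M, C)
    print(f"{name:10s} P*R={total:7.0f}  VM={vm:4.1f}  VC={vc:6.1f}")
```

The printed values can be compared with the P·R, VM and VC columns of the corresponding rows.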
15Pleiades
16Pleiades 1
[Network diagram; labels: LIN offices; Itanium 1.3 GHz/2 GB; P1; P2; network 128.178.87.0; switch ProCurve 5300 (76.8 Gb/s, 144 FE ports, 8 GbE ports); network 192.168.0.0]
17Pleiades 2
[Network diagram; labels: network 128.178.87.0; GbE; switch Black Diamond 8810 (432 Gb/s, 144 GbE ports); network 192.168.0.0; 120 Xeon (64 bit) 2.8 GHz/4 GB]
18Processors of Pleiades 1 cluster (November 2003)
132 Pentium 4 (32 bit) 2.8 GHz processors -> 5.6 Gflop/s peak
2 GB dual access DDR memory (max. 6.4 GB/s)
80 GB disk (7200 rpm)
Motherboard based on Intel 875P chipset
0.5 MB (processors 1-100) / 1 MB (processors 101-132) secondary cache
Low cost (CHF 1600 per processor)
NFS for I/O
Linux SuSE 10.0
ifc and icc compilers from Intel
gcc
MKL mathematical library
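
The quoted peak of 5.6 Gflop/s is consistent with a 2.8 GHz clock delivering two floating-point operations per cycle, and the 6.4 GB/s memory bandwidth corresponds to 0.8 Gwords/s for 64-bit words, which is the M value listed for Pleiades1 in the cluster table. The factor of two flops per cycle and the 8-byte word are assumptions made for this sketch; the slide gives only the resulting numbers. The function names are hypothetical.

```python
# Back-of-the-envelope peak numbers for one Pleiades 1 processor.
# Assumptions (not stated on the slide): 2 flops per clock cycle
# and a word of 8 bytes (64-bit floating point).

CLOCK_GHZ = 2.8        # from the slide
FLOPS_PER_CYCLE = 2    # assumption
MEM_BW_GBYTES = 6.4    # from the slide (max. memory bandwidth, GB/s)
BYTES_PER_WORD = 8     # assumption
N_PROCS = 132          # from the slide

def peak_gflops(clock_ghz, flops_per_cycle):
    """Peak floating-point rate of one processor in Gflop/s."""
    return clock_ghz * flops_per_cycle

def mem_gwords(bw_gbytes, bytes_per_word):
    """Peak memory bandwidth in Gwords/s."""
    return bw_gbytes / bytes_per_word

r = peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE)
m = mem_gwords(MEM_BW_GBYTES, BYTES_PER_WORD)
print(f"peak per processor: {r:.1f} Gflop/s")
print(f"memory bandwidth:   {m:.1f} Gwords/s")
print(f"cluster peak:       {N_PROCS * r:.1f} Gflop/s")
```

Under these assumptions the cluster peak comes out at about 739 Gflop/s, in line with the P·R entry for Pleiades1.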
19Processors of Pleiades 2 cluster (November 2005)
120 Xeon (64 bit) 2.8 GHz servers -> 5.6 Gflop/s peak
4 GB dual access DDR memory (max. 6.4 GB/s)
40 GB disk (7200 rpm)
Motherboard based on Intel E7520
1 MB secondary cache
NFS for user files, PVFS with 8 I/O nodes for scratch files
Low voltage processors (140 W per server)
Linux SuSE 10.0
ifort and icc compilers from Intel
gcc
MKL mathematical library
20Software on Pleiades
SuSE Linux 10.0
SystemImager
OpenPBS resource management / Maui scheduling with fairshare
NFS, PVFS
MPICH, PVM, (MPI-FCI)
ifc/icc/gcc compilers
MKL (BLAS/LAPACK) basic mathematical library
21Software on Pleiades
petsc, aztec: parallel iterative matrix solvers
ScaLAPACK, mumps: direct parallel matrix solvers
arpack, Parpack: eigenvalue solvers, serial and parallel
nag: general numerical library, serial
GSL: GNU scientific library
fftw: serial and parallel Fast Fourier Transforms ("the best in the west")
sprng: serial and parallel random number generator
OpenDX: visualisation system, serial
paraview: parallel visualisation system
22Software on Pleiades
memcom/astrid/baspl: program environment based on domain decomposition
matlab: technical computing environment
Fluent
cfx/cfxturbogrid
icem cfd
gamess
others
23Access to cluster Pleiades
Ali Tolou will open accounts. Please put your name on the circulating sheet.
Check that your .bashrc includes the links to Memcom/Baspl/Astrid, Matlab, paraview, and the ifort compiler:
module add smr
module add matlab
module add paraview
module switch intel_comp/7.1 intel_comp/9.0
24How to get exercises
To get the exercises into your account:
scp -p -r rgruber/Cours5_6/ .
You will get:
daphne       The Gyrotron simulation programme
divrot       Eigenvalue computation of the 2D grad(div) and curl(curl) operators
DOOR         A test example of parsol
h2pm         The eigenvalue code using a H2pM
manuals      Manuals of MUMPS, Astrid, petsc, Aztec
parsol       Parallel 2D Poisson solver
playground   Benchmark codes, exercises, and results
S            A test case of the 3D ASTRID solver
Stokes       The 2D Stokes eigenvalue solver with div(v)=0 basis functions
terpsichore  The 3D ideal MHD stability programme
VMEC         The 3D MHD equilibrium programme
There is a README in each directory.
25Contributions
W.A. Cooper (benchmarks VMEC/TERPSICHORE)
M. Deville (some tables)
J. Dongarra (Top500)
S. Merazzi (MEMCOM/BASPL/ASTRID)
A. Tolou (Pleiades system manager, LINUX)
T.-M. Tran (MPI, math. libraries, benchmarks, Pleiades)
26Announcement of complementary courses
Trach-Minh Tran, CRPP
MPI, Introduction to parallel programming
February 2006
Registration: Josiane.Scalfo_at_epfl.ch, DIT
Some doctoral schools give 2 credits
27Announcement of a "General Course"
Laurent Villard, CRPP, and André Jaun, KTH
Numerical methods for PDE
On-line course with exercises
6 credits
28Questions ?