Title: THE EARTH SIMULATOR SYSTEM
1THE EARTH SIMULATOR SYSTEM
- By Shinichi HABATA, Mitsuo YOKOKAWA,
- Shigemune KITAWAKI
- Presented by Anisha Thonour
2Extracted from the government website
A high-end supercomputer (the Earth Simulator) is
just like an Alien with a very big head (brain)
but small arms and legs.
To make the most of its CPU power, thousands of
arms and legs are necessary.
3Definitions
- Supercomputer
- A supercomputer is a computer that leads the
world in terms of processing capacity,
particularly speed of calculation, at the time of
its introduction.
- Cost is no object with advanced technologies
(Dr. Pfeiffer)
- Parallel Processing
- Processing in which multiple processors work on
a single application simultaneously.
4Cross-sectional View of the Earth Simulator
Building
5Topics to be introduced
- Introduction
- System Overview
- Processor Node
- Interconnection Network
- Performance
- Conclusion
6Introduction
- Global change prediction using computer
simulation
- 1000 times faster
- Development: 1997 to February 2002
- 87.5% of peak performance (35.86 TFLOPS) on LINPACK
- 64.9% of peak performance (26.58 TFLOPS) on a global
atmospheric circulation model with the spectral
method
7System Overview
- Parallel vector supercomputer
- 640 processor nodes and an interconnection network
- 1 processor node holds 8 arithmetic processors
and main memory
- Peak performance
- All processor nodes combined: 40 TFLOPS
- Achieved performance
- All processor nodes combined: 35.86 TFLOPS
- Interconnection network: 640 x 640 non-blocking
crossbar switch
- Bandwidth: 12.3 GB/s
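The peak and efficiency figures can be cross-checked with a short calculation. This is a minimal sketch that combines the 640 nodes and 8 arithmetic processors per node from this slide with the 8 GFLOPS per arithmetic processor quoted on the Processor Node slide. It assumes 1 TFLOPS = 1000 GFLOPS; the function and constant names are illustrative only. The exact product is 40.96 TFLOPS, which the slides round to 40 TFLOPS.

```python
# Figures taken from the slides.
NODES = 640            # processor nodes
APS_PER_NODE = 8       # arithmetic processors per node
GFLOPS_PER_AP = 8.0    # peak GFLOPS per arithmetic processor


def peak_tflops(nodes=NODES, aps=APS_PER_NODE, gflops=GFLOPS_PER_AP):
    """System peak in TFLOPS (assumes 1 TFLOPS = 1000 GFLOPS)."""
    return nodes * aps * gflops / 1000.0


def efficiency_percent(achieved_tflops, peak):
    """Achieved performance as a percentage of peak."""
    return 100.0 * achieved_tflops / peak


peak = peak_tflops()
linpack = efficiency_percent(35.86, peak)   # LINPACK result
agcm = efficiency_percent(26.58, peak)      # atmospheric circulation model

print(f"peak    = {peak:.2f} TFLOPS")
print(f"LINPACK = {linpack:.1f}% of peak")
print(f"AGCM    = {agcm:.1f}% of peak")
```

The two percentages reproduce the 87.5 and 64.9 quoted in the Introduction, which confirms they are percentages of the 40.96 TFLOPS peak.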
8System Overview ctd.
9System Overview ctd..
- 1 cluster consists of 16 processor nodes, a
cluster control station, an I/O control station
and system disk
- 640 nodes are divided into 40 clusters
- 2 types of clusters: S cluster (1) and L
cluster (39)
- S cluster: 2 nodes are used for interactive
use and the others for small-size batch jobs
- User disks: storing user files
- Mass storage system: cartridge tape library
system
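The cluster arithmetic above can be sketched as a node-to-cluster mapping. The numbering scheme is an assumption made for illustration: the slides do not say how nodes are numbered or which index the S cluster has, so consecutive node ids and "cluster 0 is the S cluster" are hypothetical choices.

```python
NODES = 640
NODES_PER_CLUSTER = 16
CLUSTERS = NODES // NODES_PER_CLUSTER   # 40 clusters


def cluster_of(node_id):
    """Cluster index for a node, assuming consecutive numbering from 0."""
    if not 0 <= node_id < NODES:
        raise ValueError(f"node id out of range: {node_id}")
    return node_id // NODES_PER_CLUSTER


def cluster_type(cluster_id):
    """'S' for the single S cluster, 'L' otherwise.

    Assumption: cluster 0 is the S cluster (not stated in the slides).
    """
    if not 0 <= cluster_id < CLUSTERS:
        raise ValueError(f"cluster id out of range: {cluster_id}")
    return "S" if cluster_id == 0 else "L"


types = [cluster_type(c) for c in range(CLUSTERS)]
print(CLUSTERS, types.count("S"), types.count("L"))
```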
10ctd.
- The super cluster control station manages all 40
clusters and provides a single-system-image
operational environment
- High-performance and high-efficiency
architectural features
- Vector processor
- Shared memory
- High-bandwidth and non-blocking interconnection
crossbar network
- Parallelization for high sustained performance
- Vector processing on a processor
- Parallel processing with shared memory within a
node
- Parallel processing among distributed nodes via
the interconnection network
11Processor Node
- Each PN consists of 8 APs, a main memory system, a
remote-access control unit (RCU) and an I/O processor.
- Each arithmetic processor (AP) can deliver up to 8 GFLOPS,
so the 8 APs give 64 GFLOPS per node.
- It uses a high-efficiency heat sink with a heat
pipe.
- A high-speed main memory device reduces the
memory access latency.
- Paradigms provided within a processor node:
- Vector processing on a processor.
- Parallel processing with shared memory.
12Processor Node Configuration
14Interconnection Network
- 640 x 640 non-blocking crossbar switch
- Byte-slicing technique
- A control unit and 128 data switch units
- 320 PN cabinets and 65 IN cabinets
- Each PN cabinet holds 2 processor nodes; the
65 IN cabinets contain the interconnection
network.
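The byte-slicing idea can be illustrated with a small sketch. It assumes that consecutive bytes of a transfer are dealt round-robin across the 128 data switch units, so that each unit carries one byte-wide slice of the data path. The slides do not give the actual lane assignment, so this mapping and the function names are hypothetical.

```python
SWITCH_UNITS = 128   # data switch units, from the slide


def slice_bytes(message, lanes=SWITCH_UNITS):
    """Split a message into per-switch slices: byte i goes to lane i % lanes."""
    return [message[i::lanes] for i in range(lanes)]


def merge_bytes(slices):
    """Reassemble the original message from its per-switch slices."""
    lanes = len(slices)
    out = bytearray(sum(len(s) for s in slices))
    for i, s in enumerate(slices):
        out[i::lanes] = s   # put lane i back at positions i, i+lanes, ...
    return bytes(out)


message = bytes(range(256)) * 4
slices = slice_bytes(message)
assert merge_bytes(slices) == message
print(len(slices), "slices of", len(slices[0]), "bytes each")
```

One consequence of this layout is that a single failing switch unit damages only one byte position in every 128, which is the kind of error the ECC scheme on the later slide is meant to correct.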
16Interconnection Network Wiring
17Inter-node communication mechanism
- Node A requests the control unit to reserve a
data path from node A to node B; the control
unit reserves the data path, then replies to node
A.
- Node A begins data transfer to node B.
- Node B receives all the data, then sends the data
transfer completion code to node A.
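The three steps above can be sketched as a toy model. It assumes a path is reservable whenever the source's output and the destination's input are both idle, which is one reading of "non-blocking". The class, the method names and the "COMPLETE" string are hypothetical and do not reflect the real hardware interface.

```python
class CrossbarControl:
    """Toy control unit for an N x N non-blocking crossbar."""

    def __init__(self, ports=640):
        self.ports = ports
        self.busy_src = set()   # nodes currently sending
        self.busy_dst = set()   # nodes currently receiving

    def reserve(self, src, dst):
        """Step 1: reserve the data path src -> dst if both ends are idle."""
        if src in self.busy_src or dst in self.busy_dst:
            return False
        self.busy_src.add(src)
        self.busy_dst.add(dst)
        return True

    def release(self, src, dst):
        """Free the path once the transfer has completed."""
        self.busy_src.discard(src)
        self.busy_dst.discard(dst)


def transfer(control, src, dst, payload, memory):
    """Run one reserve / transfer / completion exchange."""
    if not control.reserve(src, dst):
        return None                  # path unavailable, caller may retry
    try:
        memory[dst] = payload        # step 2: node A sends data to node B
    finally:
        control.release(src, dst)
    return "COMPLETE"                # step 3: completion code back to node A


control = CrossbarControl()
memory = {}
print(transfer(control, 0, 1, b"data", memory))
```

In this model two transfers between disjoint node pairs never conflict; only a second transfer to a destination that is already receiving has to wait.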
18Inter-node interface with ECC codes
19Inter-node interface with ECC codes
- To resolve the error occurrence rate problem, ECC
codes are added to the transfer data.
- A receiver node detects intermittent inter-node
communication failures by checking the ECC codes,
and the erroneous byte can almost always be
corrected by the RCU within the receiver node.
- ECC is also used to recover from inter-node
communication failures caused by a data switch unit
malfunction.
- Correction continues until the switch unit is repaired.
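The slides do not say which error-correcting code the inter-node interface uses. As a stand-in, the sketch below uses a textbook Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. It only illustrates the principle of receiver-side detection and correction; the real interface corrects an erroneous byte, not a bit, and its code is not described here.

```python
def hamming74_encode(data):
    """Encode 4 data bits as 7 bits laid out [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # 1-based index of the bad bit, 0 if none
    if position:
        c[position - 1] ^= 1          # receiver-side correction
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
sent = hamming74_encode(word)
received = list(sent)
received[4] ^= 1                      # simulate a single-bit transfer error
assert hamming74_decode(received) == word
```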
20Barrier Synchronization mechanism using GBC
21Barrier synchronization mechanism using GBC
- GBC: global barrier counter
- GBF: global barrier flag
- Barrier synchronization mechanism
- The master node sets the number of nodes used for
the parallel program into the GBC within the IN's
control unit.
- The control unit resets all GBFs of the nodes
used for the program.
- Each node, on completing its task, decrements the GBC
within the control unit and polls its
GBF until the GBF is asserted.
- When GBC = 0, the control unit asserts all GBFs of
the nodes used for the program.
- All the nodes begin to process the next tasks.
- The barrier synchronization time is always
less than 3.5 µsec.
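The barrier steps above can be sketched as a software simulation, with threads standing in for nodes and a lock-protected object standing in for the control unit. This models only the counter-and-flag logic. The class and function names are hypothetical, and nothing here reflects the hardware timing.

```python
import threading
import time


class ControlUnit:
    """Toy stand-in for the interconnection network's control unit."""

    def __init__(self):
        self._lock = threading.Lock()
        self.gbc = 0     # global barrier counter
        self.gbf = {}    # node id -> global barrier flag

    def start_barrier(self, node_ids):
        # The master node sets GBC; the control unit resets every GBF.
        with self._lock:
            self.gbc = len(node_ids)
            self.gbf = {n: False for n in node_ids}

    def task_done(self, node_id):
        # A node that finished its task decrements GBC (node_id is unused
        # in this toy model). At zero, all GBFs are asserted.
        with self._lock:
            self.gbc -= 1
            if self.gbc == 0:
                for n in self.gbf:
                    self.gbf[n] = True

    def flag(self, node_id):
        with self._lock:
            return self.gbf[node_id]


def run_barrier(num_nodes):
    """Return (node id, tasks finished at release time) for every node."""
    control = ControlUnit()
    control.start_barrier(range(num_nodes))
    done, released = [], []
    log_lock = threading.Lock()

    def node(node_id):
        time.sleep(0.001 * node_id)          # simulated task of varying length
        with log_lock:
            done.append(node_id)
        control.task_done(node_id)
        while not control.flag(node_id):     # poll GBF until it is asserted
            time.sleep(0)
        with log_lock:
            released.append((node_id, len(done)))

    threads = [threading.Thread(target=node, args=(n,)) for n in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return released


released = run_barrier(8)
assert all(count == 8 for _, count in released)
```

The final assertion checks the barrier property: no node is released before all eight tasks have finished.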
22Bird's-eye View of the Earth Simulator System
24Performance
- Using the GBC feature, the MPI_Barrier synchronization
time is always less than 3.5 µsec.
- The software barrier synchronization time
increases in proportion to the number of
nodes.
25Performance
- The interconnection network is a single-stage
network, so this performance is always achieved
for communication between any two nodes.
26Performance
- The ratio to peak performance is more than 85%.
- Performance is proportional to the number of
nodes.
27Conclusion
- High-performance and high-efficiency
architectural features
- Vector processor
- Shared memory
- High-bandwidth and non-blocking interconnection
crossbar network
- Parallelization for high sustained performance
- Vector processing on a processor
- Parallel processing with shared memory within a
node
- Parallel processing among distributed nodes via
the interconnection network
- 87.5% of peak performance (35.86 TFLOPS) on LINPACK
- 64.9% of peak performance (26.58 TFLOPS) on a global
atmospheric circulation model with the spectral
method
28Applications-Solid Earth Simulation Group
- We are developing new algorithms for the
geophysical simulations as well as new grid
systems in the spherical geometry.
29Solid Earth Simulation Group
30To understand the mechanism of the variability
with time scale from a few days to decades and to
study the predictability in the atmosphere.
31To study the effects of meso-scale phenomena on
the ocean general circulation and the material
transport.
32To understand the mechanism of the variability
and to study the predictability in the coupled
atmosphere-ocean system.
33References
- http://www.thocp.net/hardware/nec_ess.htm
- http://www.es.jamstec.go.jp/esc/eng/Hardware/in.html
34Thank You