Title: Computing Hardware
1. Computing Hardware
2. Overview
- Traditional CPUs
  - AMD vs. Intel
- Specialized Processors
  - BlueGene
  - Cell Broadband Engine
  - AMD/ATI FireStream
  - Nvidia Tesla
- Network interconnects
  - Topologies
  - Vendors
    - 10Gb Ethernet
    - InfiniBand
    - Myrinet
- Solid-state storage
3. Historic 2P AMD and Intel designs
Source: http://www.amd.com/us-en/assets/content_type/DownloadableAssets/2P_S_WS_Comparison_PID_41460.pdf
4. Historic 4P AMD and Intel designs
Source: http://www.amd.com/us-en/assets/content_type/DownloadableAssets/4P_Server_Comparison_PID_41461.pdf
5. Next generation AMD and Intel
Sources:
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_15224,00.html
http://news.cnet.com/8301-13924_3-10008472-64.html?hhTest1
6. BlueGene/L
7. Source: http://www.lanl.gov/orgs/hpc/roadrunner/pdfs/Roadrunner-tutorial-session-2-web1.pdf
13. Internal Bandwidth Capacity
Source: http://www.lanl.gov/orgs/hpc/roadrunner/pdfs/Roadrunner-tutorial-session-2-web1.pdf
14. Element Interconnect Bus Data Topology
Source: http://www.lanl.gov/orgs/hpc/roadrunner/pdfs/Roadrunner-tutorial-session-2-web1.pdf
15. New Tesla T10
Source: http://hothardware.com/Articles/NVIDIA-Editors-Day-The-ReIntroduction-of-Tesla/?page2
16. GPGPU
17. Network Architecture
- High-performance network: a low-latency, high-bandwidth, scalable network.
- Latency: the time required to package data and deliver it over the network to the receiver.
- Bisection bandwidth: calculated by cutting the network topology in half and measuring how much data can be transferred across this imaginary divider. The theoretical maximum bisection bandwidth is half the total bandwidth of all the connected computers.
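As a rough worked example of the bisection-bandwidth ceiling described above, the sketch below computes the theoretical maximum for a cluster of identical nodes. The function name and the 128-node, 10 Gb/s figures are illustrative assumptions, not values from the slides.

```python
def max_bisection_bandwidth(num_nodes: int, link_gbps: float) -> float:
    """Theoretical maximum bisection bandwidth in Gb/s.

    When the cluster is cut into two equal halves, at best every node in one
    half talks to a partner in the other half at full link speed, so the
    ceiling is half the sum of all node link bandwidths.
    """
    if num_nodes < 2:
        raise ValueError("need at least two nodes to bisect a network")
    total_bandwidth = num_nodes * link_gbps
    return total_bandwidth / 2


# Hypothetical cluster (assumed numbers): 128 nodes, one 10 Gb/s link each.
print(max_bisection_bandwidth(128, 10.0))  # 640.0
```

A real network reaches this ceiling only if its topology provides full bisection bandwidth; a measured value below it indicates oversubscription somewhere in the fabric.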
18. Interconnects and Topologies
- Gigabit Ethernet (Star or Fat Tree)
- Myrinet (Clos)
- InfiniBand (Fat Tree)
- SCI (2D/3D Torus)
- Quadrics (Fat Tree)
19. Star topology
- Bisection bandwidth difficult to scale
- Easy to set up
- No path redundancy
[Figure: star topology; labels: Switch, Computer]
20. Fat tree
- Bandwidth is aggregated as you move up the tree
- Implicit limit on scaling at full bisection bandwidth
- Limited path redundancy
[Figure: fat tree; labels: 16x link, 4x link]
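To illustrate how bandwidth aggregates toward the root of a fat tree, the sketch below computes the link width needed at each level so that a switch's uplink carries the combined traffic of its children. The fan-out of 4 and the 1x leaf links are assumptions chosen so the result lines up with the 4x and 16x labels in the figure; the slides do not state the fan-out.

```python
from typing import List


def uplink_widths(fanout: int, levels: int, leaf_width: int = 1) -> List[int]:
    """Link width required at each tree level for full bisection bandwidth.

    Level 0 is the host-to-switch link. Each level above must carry the
    combined bandwidth of `fanout` links from the level below.
    """
    if fanout < 1 or levels < 1:
        raise ValueError("fanout and levels must be positive")
    widths = []
    width = leaf_width
    for _ in range(levels):
        widths.append(width)
        width *= fanout  # the parent link aggregates all child links
    return widths


# Assumed fan-out of 4: 1x host links, then 4x, then 16x toward the root.
print(uplink_widths(fanout=4, levels=3))  # [1, 4, 16]
```

Because the required width multiplies at every level, the widest link a vendor can supply sets a practical cap on how large the tree can grow while keeping full bisection bandwidth.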
21. Torus
- Constant incremental cost as you add nodes
- Automatically scales bisection bandwidth
- Redundant paths
- Wiring nightmare!
- Best for applications that can exploit the topology
- Care must be taken when assigning nodes to a job
- Built-in switch!
[Figure: Ring and 2D Torus]
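The torus properties above can be sketched numerically. The code below is a minimal illustration, assuming one link per neighbor pair, wraparound links in every dimension, and an even longest dimension; all function names are hypothetical. It shows why node placement matters (hop counts depend on coordinates) and how the number of links crossing the bisection grows with the torus.

```python
import math
from typing import Sequence


def ring_distance(a: int, b: int, size: int) -> int:
    """Shortest hop count between positions a and b on a ring of `size` nodes."""
    d = abs(a - b) % size
    return min(d, size - d)  # going the other way round may be shorter


def torus_hops(src: Sequence[int], dst: Sequence[int], dims: Sequence[int]) -> int:
    """Minimum hops between two nodes: the sum of per-dimension ring distances."""
    return sum(ring_distance(s, d, n) for s, d, n in zip(src, dst, dims))


def torus_bisection_links(dims: Sequence[int]) -> int:
    """Links cut when the torus is halved across its longest dimension.

    Every ring running along that dimension is cut twice (once in the middle,
    once at the wraparound), and there is one such ring per combination of
    the remaining coordinates.
    """
    longest = max(dims)
    if longest < 3 or longest % 2:
        raise ValueError("sketch assumes an even longest dimension of at least 4")
    others = list(dims)
    others.remove(longest)
    return 2 * math.prod(others)


print(torus_hops((0, 0), (7, 7), (8, 8)))  # 2, thanks to the wraparound links
print(torus_hops((0, 0), (4, 4), (8, 8)))  # 8, the worst case on an 8x8 torus
print(torus_bisection_links((16,)))        # 2 for a 16-node ring
print(torus_bisection_links((8, 8)))       # 16 for an 8x8 2D torus
```

A job whose nodes sit close together in torus coordinates sees far fewer hops than one scattered across the machine, which is the reason for the care needed in node assignment.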
22. Clos
- Named for Charles Clos's 1953 paper
- Full bisection bandwidth
- Scalable to arbitrary size
- Based on crossbar (Xbar) technology (old: 16-port, new: 32-port)
- Redundant and reconfigurable
- Number of input links = number of output links
- ACCRE has 2 Myrinet Clos networks with 0.5 Tb/s and 1 Tb/s
[Figure: Clos network; labels: Spreader Network, 4 link pairs, 1 link pair/host, 8 hosts/Xbar16]
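As a hedged illustration of how crossbar size bounds a Clos network, the sketch below sizes a two-level folded Clos in which half of each leaf crossbar's ports face hosts and half face the spreader (spine) layer, consistent with the figure's 8 hosts per Xbar16. The two-level layout, the single link between each leaf and each spine, and the function name are assumptions; the slides do not describe how the ACCRE networks are actually wired.

```python
def two_level_clos(radix: int) -> dict:
    """Size a two-level folded Clos built from identical `radix`-port crossbars.

    Assumptions: each leaf uses half its ports for hosts and half for
    uplinks, and every leaf has exactly one link to every spine crossbar.
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be a positive even number")
    hosts_per_leaf = radix // 2
    uplinks_per_leaf = radix - hosts_per_leaf  # equal to hosts_per_leaf
    spine_switches = uplinks_per_leaf          # one uplink per spine
    max_leaves = radix                         # each spine port feeds one leaf
    return {
        "hosts_per_leaf": hosts_per_leaf,
        "spine_switches": spine_switches,
        "max_leaves": max_leaves,
        "max_hosts": hosts_per_leaf * max_leaves,
    }


print(two_level_clos(16)["hosts_per_leaf"])  # 8
print(two_level_clos(16)["max_hosts"])       # 128
print(two_level_clos(32)["max_hosts"])       # 512
```

Growing beyond these limits means adding further levels of crossbars, which is how a Clos network scales to arbitrary size while keeping inputs and outputs balanced at every stage.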
23. Interconnects