Title: Interconnection Network Routing, Topology Design Trade-offs
1Interconnection Network Routing, Topology Design Trade-offs
- Adapted from CS 258, Spring 99
- U.C. Berkeley Notes
2Interconnection Topologies
- Class of networks scaling with N
- Logical Properties
- distance, degree
- Physical properties
- length, width
- Static vs. Dynamic Networks
- Fully connected network
- diameter 1
- degree N
- cost?
- bus => O(N), but BW is O(1) - actually worse
- crossbar => O(N^2) for BW O(N)
- VLSI technology determines switch degree
3Example Static Network: 2-D Mesh Architecture
4Dynamic Network Consists of Switches: Switch Components
- Output ports
- transmitter (typically drives clock and data)
- Input ports
- synchronizer aligns data signal with local clock domain
- essentially FIFO buffer
- Crossbar
- connects each input to any output
- degree limited by area or pinout
- Buffering
- Control logic
- complexity depends on routing logic and scheduling algorithm
- determine output port for each incoming packet
- arbitrate among inputs directed at same output
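The arbitration step can be sketched in a few lines. The notes do not say which scheduling algorithm a switch uses, so the round-robin policy below is an assumption chosen for illustration, and the names `arbitrate`, `requests` and `last_grant` are hypothetical. The routing logic is assumed to have already mapped each input port to its desired output port.

```python
def arbitrate(requests, num_inputs, last_grant):
    """Grant each contended output to one input, round-robin (assumed policy).

    requests:   dict input_port -> output_port chosen by the routing logic
    num_inputs: number of input ports on the switch
    last_grant: dict output_port -> input granted last time (updated in place)
    Returns a dict output_port -> granted input_port.
    """
    grants = {}
    for out in set(requests.values()):
        contenders = [i for i, o in requests.items() if o == out]
        start = last_grant.get(out, -1) + 1
        # Pick the contender that comes first in cyclic order after the
        # input that was served last on this output.
        winner = min(contenders, key=lambda i: (i - start) % num_inputs)
        grants[out] = winner
        last_grant[out] = winner
    return grants


state = {}
reqs = {0: 2, 1: 2, 3: 0}          # inputs 0 and 1 both want output 2
first = arbitrate(reqs, 4, state)   # input 0 wins output 2
second = arbitrate(reqs, 4, state)  # input 1 wins output 2 next time
```

Inputs that lose arbitration would stay in the input buffer and retry, which is one reason the buffering component matters.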
5Switches
- A 4X4 Crossbar Switch at a node
6More Static Networks: Linear Arrays and Rings
- Linear Array
- Diameter?
- Average Distance?
- Bisection bandwidth?
- Route A -> B given by relative address R = B - A
- Torus?
- Examples: FDDI, SCI, Fibre Channel Arbitrated Loop, KSR1
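The relative-address rule and the diameter and average-distance questions can be explored with a small sketch. It assumes nodes are numbered 0..N-1 and that the ring is bidirectional, so a message takes the shorter way around; the function names are illustrative, not from the notes.

```python
def array_route(a, b):
    """Linear array: R = B - A. Sign gives direction, magnitude gives hops."""
    return b - a


def ring_route(a, b, n):
    """Bidirectional ring of n nodes: signed hop count the shorter way around."""
    r = (b - a) % n
    return r if r <= n // 2 else r - n


def diameter_and_average(n, route):
    """Brute-force diameter and average distance over all ordered pairs."""
    dists = [abs(route(a, b)) for a in range(n) for b in range(n) if a != b]
    return max(dists), sum(dists) / len(dists)


array_stats = diameter_and_average(8, array_route)
ring_stats = diameter_and_average(8, lambda a, b: ring_route(a, b, 8))
```

For 8 nodes the linear array has diameter 7 and average distance 3.0, while the ring has diameter 4, which shows how the wrap-around link roughly halves the worst case.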
7Multidimensional Meshes and Tori
[Figure: 3D Cube and 2D Grid]
- d-dimensional array
- N = k_{d-1} x ... x k_0 nodes
- described by d-vector of coordinates (i_{d-1}, ..., i_0)
- d-dimensional k-ary mesh: N = k^d
- k = N^(1/d)
- described by d-vector of radix k coordinate
- d-dimensional k-ary torus (or k-ary d-cube)?
- Ex: Intel Paragon (2D), SGI Origin (Hypercube), Cray T3E (3DMesh)
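The d-vector of radix-k coordinates is the base-k digit expansion of the node number. A minimal sketch, assuming nodes are numbered 0..N-1 with i_0 as the least significant digit (the names `to_coords` and `to_node` are illustrative):

```python
def to_coords(node, k, d):
    """Return (i_{d-1}, ..., i_0), the d radix-k digits of the node number."""
    digits = []
    for _ in range(d):
        node, digit = divmod(node, k)
        digits.append(digit)          # least significant digit first
    return tuple(reversed(digits))    # most significant digit first


def to_node(coords, k):
    """Inverse of to_coords: fold the digits back into a node number."""
    node = 0
    for c in coords:
        node = node * k + c
    return node


# Node 11 in a 4-ary 2-D mesh (N = 4^2 = 16) sits at coordinates (2, 3).
coords = to_coords(11, 4, 2)
```

With k = 2 the same conversion yields the binary address of a hypercube node.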
8Example: k-ary 2D array
- Theorem: x,y routing is deadlock free
- Numbering
- +x channel (i,y) -> (i+1,y) gets i
- similarly for -x with 0 as most positive edge
- +y channel (x,j) -> (x,j+1) gets N+j
- similarly for -y channels
- any routing sequence: x direction, turn, y direction is increasing
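The numbering argument can be checked mechanically: if every route uses channels in strictly increasing order, no cycle of channel dependences can form. The sketch below assumes a k x k mesh with N = k*k, and it assumes the -x and -y channels are numbered by mirroring the positive ones (the channel leaving the most positive edge gets the lowest number). Function names are illustrative.

```python
def xy_route(src, dst):
    """Nodes visited when routing fully in x first, then in y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def channel_number(u, v, k):
    """Number of the channel from node u to its neighbour v (assumed scheme)."""
    n = k * k
    (x0, y0), (x1, y1) = u, v
    if y0 == y1:                                   # x channel
        return x0 if x1 > x0 else (k - 1) - x0     # -x: most positive edge gets 0
    return n + (y0 if y1 > y0 else (k - 1) - y0)   # y channels start at N


def numbers_increase(src, dst, k):
    """True if the x,y route uses strictly increasing channel numbers."""
    path = xy_route(src, dst)
    nums = [channel_number(u, v, k) for u, v in zip(path, path[1:])]
    return all(a < b for a, b in zip(nums, nums[1:]))


k = 4
nodes = [(x, y) for x in range(k) for y in range(k)]
all_increasing = all(numbers_increase(s, d, k) for s in nodes for d in nodes)
```

Here `all_increasing` is True for every source and destination pair in the 4 x 4 mesh. A route that turned from y back into x would need a lower-numbered channel after a higher one, which x,y routing never does.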
9Hypercubes
- Also called binary n-cubes. # of nodes = N = 2^n.
- O(log N) hops
- Good bisection BW
- Complexity
- Out degree is n = log N
- correct dimensions in order
- with random comm. 2 ports per processor
[Figure: hypercubes of dimension 0-D, 1-D, 2-D, 3-D, 4-D, 5-D]
10Routing in Hypercube
- N = 2^6 nodes
- S = (s_{n-1} s_{n-2} ... s_i ... s_2 s_1 s_0)
- D = (d_{n-1} d_{n-2} ... d_i ... d_2 d_1 d_0)
- E-cube routing: for i = 0 to n-1, compare s_i and d_i; route along dimension i if they differ.
- Distance = Hamming distance between S and D = the no. of dimensions by which S and D differ.
- Diameter = maximum distance = n = log2 N = dimension of the hypercube
- No. of alternate paths = n
- Fault tolerance = (n-1) = O(log2 N)
- Example routes from 000 to 111:
- 000 -> 001 -> 011 -> 111
- 000 -> 010 -> 110 -> 111
- 000 -> 100 -> 101 -> 111
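E-cube routing as described above fits in a few lines. This is a minimal sketch, assuming node addresses are n-bit integers and that bit i of the address corresponds to dimension i; the function names are illustrative.

```python
def ecube_route(src, dst, n):
    """Nodes visited when correcting differing bits from dimension 0 to n-1."""
    path = [src]
    cur = src
    for i in range(n):
        if ((cur ^ dst) >> i) & 1:   # s_i and d_i differ
            cur ^= 1 << i            # move along dimension i
            path.append(cur)
    return path


def hamming_distance(s, d):
    """Number of dimensions in which s and d differ."""
    return bin(s ^ d).count("1")


route = ecube_route(0b000, 0b111, 3)
route_bits = [format(node, "03b") for node in route]
# route_bits is ['000', '001', '011', '111']
```

The path length always equals the Hamming distance. The other two example routes from 000 to 111 correct the same three dimensions in a different order, which is where the alternate paths come from.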
11Properties
- Routing
- relative distance: R = (b_{d-1} - a_{d-1}, ..., b_0 - a_0)
- traverse r_i = b_i - a_i hops in each dimension
- dimension-order routing
- Average Distance, Wire Length?
- d x 2k/3 for mesh
- dk/2 for cube
- Degree?
- Bisection bandwidth? Partitioning?
- k^(d-1) bidirectional links
- Physical layout?
- 2D in O(N) space, short wires
- higher dimension?
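Dimension-order routing in a mesh follows directly from the relative distance vector. A minimal sketch, assuming coordinates are given as equal-length tuples and that dimensions are corrected in tuple order (which dimension goes first is a design choice the notes leave open); names are illustrative.

```python
def relative_distance(a, b):
    """R = (b_{d-1} - a_{d-1}, ..., b_0 - a_0) for coordinate tuples a and b."""
    return tuple(bi - ai for ai, bi in zip(a, b))


def dimension_order_hops(a, b):
    """List of (dimension index, step) hops, finishing one dimension at a time."""
    hops = []
    for dim, r in enumerate(relative_distance(a, b)):
        step = 1 if r > 0 else -1
        hops.extend([(dim, step)] * abs(r))   # |r_i| hops in this dimension
    return hops


r = relative_distance((1, 3), (3, 0))      # (2, -3)
hops = dimension_order_hops((1, 3), (3, 0))
```

The total hop count is the sum of |r_i|, here 5. On a torus each r_i would additionally be reduced modulo k to pick the shorter direction.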
12Trees
- Diameter and ave distance logarithmic
- k-ary tree, height d = log_k N
- address specified d-vector of radix k coordinates describing path down from root
- Fixed degree
- Route up to common ancestor and down
- R = B xor A
- let i be position of most significant 1 in R, route up i+1 levels
- down in direction given by low i+1 bits of B
- H-tree space is O(N) with O(sqrt(N)) long wires
- Bisection BW?
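The up-then-down rule can be sketched for the binary case. This assumes a binary tree (k = 2) whose leaves carry d-bit addresses describing the path from the root, and that both endpoints are leaves; `tree_route` is an illustrative name.

```python
def tree_route(a, b):
    """Return (levels_up, down_bits) for routing from leaf a to leaf b.

    down_bits lists the branch taken at each level on the way down,
    starting just below the common ancestor (0 = left, 1 = right).
    """
    r = a ^ b
    if r == 0:
        return 0, []                      # same leaf, nothing to do
    i = r.bit_length() - 1                # position of most significant 1 in R
    down = [(b >> j) & 1 for j in range(i, -1, -1)]   # low i+1 bits of B
    return i + 1, down


near = tree_route(0b010, 0b011)   # siblings: up 1 level, then take branch 1
far = tree_route(0b001, 0b110)    # opposite halves: up 3 levels, down 1, 1, 0
```

Leaves that differ only in low-order bits share a nearby ancestor, so traffic with locality stays low in the tree, while non-local traffic funnels through the root.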
13Fat-Trees
- Fatter links (really more of them) as you go up, so bisection BW scales with N
- Ex: CM-5
14Butterflies
[Figure: building block; 16 node butterfly]
- Tree with lots of roots!
- N log N (actually N/2 x log N)
- Exactly one route from any source to any dest
- R = A xor B, at level i use straight edge if r_i = 0, otherwise cross edge
- Bisection N/2 vs N^((d-1)/d)
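The straight-or-cross rule can be traced in a short sketch. It assumes a butterfly with n = log2 N levels in which level i switches on address bit i; real butterflies differ in which bit each level handles, so that mapping is an assumption, as is the name `butterfly_route`.

```python
def butterfly_route(a, b, n):
    """Edge taken at each of the n levels, with the row reached after it."""
    r = a ^ b
    row, steps = a, []
    for i in range(n):
        if (r >> i) & 1:
            row ^= 1 << i                 # cross edge flips bit i of the row
            steps.append(("cross", row))
        else:
            steps.append(("straight", row))
    return steps


steps = butterfly_route(0b0101, 0b0011, 4)
# steps is [('straight', 5), ('cross', 7), ('cross', 3), ('straight', 3)]
```

The route always has exactly n steps and ends in row b. Since each level offers only one choice for a given destination, the route between any source and destination is unique.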
15k-ary d-cubes vs d-ary k-flies
- degree d
- N switches vs N log N switches
- diminishing BW per node vs constant
- requires locality vs little benefit to locality
- Can you route all permutations?
16Relationship of Butterflies to Hypercubes
- Wiring is isomorphic
- Except that Butterfly always takes log n steps
17Topology Summary
Topology      Degree     Diameter        Ave Dist     Bisection  D (D ave) @ P=1024
1D Array      2          N-1             N/3          1          huge
1D Ring       2          N/2             N/4          2
2D Mesh       4          2(N^(1/2) - 1)  2/3 N^(1/2)  N^(1/2)    63 (21)
2D Torus      4          N^(1/2)         1/2 N^(1/2)  2N^(1/2)   32 (16)
k-ary n-cube  2n         nk/2            nk/4         nk/4       15 (7.5) @ n=3
Hypercube     n = log N  n               n/2          N/2        10 (5)
- All have some bad permutations
- many popular permutations are very bad for meshes (transpose)
- randomness in wiring or routing makes it hard to find a bad one!
18Real Machines
Machine        Topology   Cycle Time (ns)  Channel Width (bits)  Routing Delay (cycles)  Flit (data bits)
nCUBE/2        Hypercube  25               1                     40                      32
TMC CM-5       Fat-Tree   25               4                     10                      4
IBM SP-2       Banyan     25               8                     5                       16
Intel Paragon  2D Mesh    11.5             16                    2                       16
Meiko CS-2     Fat-Tree   20               8                     7                       8
CRAY T3D       3D Torus   6.67             16                    2                       16
DASH           Torus      30               16                    2                       16
J-Machine      3D Mesh    31               8                     2                       8
Monsoon        Butterfly  20               16                    2                       16
SGI Origin     Hypercube  2.5              20                    16                      160
Myricom        Arbitrary  6.25             16                    50                      16
- Wide links, smaller routing delay
- Tremendous variation
19How Many Dimensions?
- n = 2 or n = 3
- Short wires, easy to build
- Many hops, low bisection bandwidth
- Requires traffic locality
- n >= 4
- Harder to build, more wires, longer average length
- Fewer hops, better bisection bandwidth
- Can handle non-local traffic
- k-ary d-cubes provide a consistent framework for comparison
- N = k^d
- scale dimension (d) or nodes per dimension (k)
- assume cut-through
20Traditional Scaling: Latency(P)
- Assumes equal channel width
- independent of node count or dimension
- dominated by average distance
21Average Distance
- ave dist = d (k-1)/2
- Higher dimension => more channels
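To see how the average-distance formula behaves at a fixed machine size, the sketch below enumerates every way to write N as k^d and evaluates d(k-1)/2 for each. N = 1024 is chosen only as an example, and the function names are illustrative.

```python
def factorizations(n_nodes):
    """All (k, d) pairs with k**d == n_nodes and k >= 2."""
    out = []
    d = 1
    while 2 ** d <= n_nodes:
        k = round(n_nodes ** (1 / d))     # candidate radix, checked exactly below
        if k ** d == n_nodes:
            out.append((k, d))
        d += 1
    return out


def average_distance(k, d):
    """Average distance d(k-1)/2 from the slide."""
    return d * (k - 1) / 2


table = {(k, d): average_distance(k, d) for k, d in factorizations(1024)}
# table is {(1024, 1): 511.5, (32, 2): 31.0, (4, 5): 7.5, (2, 10): 5.0}
```

Going from 2 to 10 dimensions cuts the average distance from 31 to 5 hops, but each added dimension also adds channels per node, so equal channel width is not equal cost.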
22Equal cost in k-ary n-cubes
- Equal number of nodes?
- Equal number of pins/wires?
- Equal bisection bandwidth?
- Equal area? Equal wire length?
- What do we know?
- switch degree: d, diameter = d(k-1)
- total links = Nd
- pins per node = 2wd
- bisection = k^(d-1) = N/k links in each direction
- 2Nw/k wires cross the middle
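The quantities listed above can be tabulated for two design points with the same node count. This sketch assumes w is the channel width in bits (the slide uses w without defining it), and `cost_metrics` is an illustrative name.

```python
def cost_metrics(k, d, w):
    """Cost figures for a k-ary d-cube with channel width w (assumed: bits)."""
    n = k ** d
    return {
        "nodes": n,
        "diameter": d * (k - 1),
        "total_links": n * d,
        "pins_per_node": 2 * w * d,
        "bisection_links_each_direction": k ** (d - 1),   # equals N/k
        "bisection_wires": 2 * n * w // k,                # 2Nw/k, always an integer
    }


low_dim = cost_metrics(k=32, d=2, w=16)    # 1024-node 2-D design
high_dim = cost_metrics(k=2, d=10, w=16)   # 1024-node hypercube
```

At equal channel width the hypercube needs 320 pins per node against 64 for the 2-D design, and 16384 wires cross its bisection against 1024. Holding pins or bisection wires constant instead would force narrower channels in the high-dimensional network.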
23Discussion
- Rich set of topological alternatives with deep relationships
- Design point depends heavily on cost model
- nodes, pins, area, ...
- Wire length or wire delay metrics favor small dimension
- Long (pipelined) links increase optimal dimension
- Need a consistent framework and analysis to separate opinion from design
- Optimal point changes with technology
24Origin2000 System Overview
- Single 16"-by-11" PCB
- Directory state in same or separate DRAMs, accessed in parallel
- Up to 512 nodes (1024 processors)
- With 195MHz R10K processor, peak 390MFLOPS or 780 MIPS per proc
- Peak SysAD bus bw is 780MB/s, so also Hub-Mem
- Hub to router chip and to Xbow is 1.56 GB/s (both are off-board)
25Origin Network
- Each router has six pairs of 1.56 GB/s unidirectional links
- Two to nodes, four to other routers
- latency: 41ns pin to pin across a router
- Flexible cables up to 3 ft long
- Four virtual channels: request, reply, other two for priority or I/O
26Case Study: Cray T3D
- Build up info in shell
- Remote memory operations encoded in address