Interconnection Network Design Contd. - PowerPoint PPT Presentation

Slides: 29
Provided by: david2174
Learn more at: http://www.cs.ucr.edu

1
Interconnection Network Design Contd.
  • Adapted from UC Berkeley notes

2
Switching Techniques
  • Circuit switching: a control message is sent from
    source to destination and a path is reserved.
    Communication starts. The path is released when
    communication is complete.
  • Store-and-forward policy (packet switching): each
    switch waits for the full packet to arrive at the
    switch before sending it to the next switch (good
    for WAN).
  • Cut-through routing or wormhole routing: the switch
    examines the header, decides where to send the
    message, and then starts forwarding it
    immediately.
  • In wormhole routing, when the head of the message is
    blocked, the message stays strung out over the
    network, potentially blocking other messages
    (each switch needs to buffer only the piece of the
    packet that is sent between switches). The CM-5 uses
    it, with each switch buffer being 4 bits per port.
  • Cut-through routing lets the tail continue when the
    head is blocked, storing the whole message in
    an intermediate switch (requires a buffer large
    enough to hold the largest packet).

3
(No Transcript)
4
Store and Forward vs. Cut-Through
  • Advantage
  • Latency reduces from a function of (number of
    intermediate switches) x (size of the packet)
    to (time for the 1st part of the packet to
    negotiate the switches) + (packet size /
    interconnect BW)

5
Store & Forward vs Cut-Through Routing
  • h(n/b + D) vs n/b + hD
  • what if message is fragmented?
  • wormhole vs virtual cut-through
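
The gap between the two latency expressions on this slide can be made explicit. The sketch below assumes the usual reading of the symbols, which the slide does not spell out: h is the number of hops, n the message size, b the link bandwidth, and D the routing delay per hop. The subscripts sf and ct are labels introduced here for store-and-forward and cut-through.

```latex
\begin{align*}
T_{\mathrm{sf}} &= h\left(\frac{n}{b} + D\right) \\
T_{\mathrm{ct}} &= \frac{n}{b} + hD \\
T_{\mathrm{sf}} - T_{\mathrm{ct}} &= (h - 1)\,\frac{n}{b}
\end{align*}
```

Under these assumptions the saving grows with both the hop count and the message transfer time, and vanishes for a single hop.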

6
(No Transcript)
7
Example
  • Q. Compare the efficiency of store-and-forward
    (packet switching) vs. wormhole routing for
    transmission of a 20-byte packet between a
    source and destination that are d nodes apart.
    Each node takes 0.25 microsecond and the link
    transfer rate is 20 MB/sec.
  • Answer: Time to transfer 20 bytes over a link =
    20 bytes / 20 MB/sec = 1 microsecond.
  • Packet switching: nodes x (node delay +
    transfer time) = d x (0.25 + 1) = 1.25 d
    microseconds
  • Wormhole: (nodes x node delay) + transfer time
  • = 0.25 d + 1 microseconds
  • Book: For d = 7, packet switching takes 8.75
    microseconds vs. 2.75 microseconds for wormhole
    routing
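
The arithmetic in this example can be checked with a short script. This is a minimal sketch; the function and constant names are chosen here for illustration and do not come from the slides.

```python
def store_and_forward_us(hops: int, node_delay_us: float, transfer_us: float) -> float:
    """Every hop pays the node delay plus a full packet transfer."""
    return hops * (node_delay_us + transfer_us)


def wormhole_us(hops: int, node_delay_us: float, transfer_us: float) -> float:
    """Every hop pays the node delay; the packet transfer is paid once."""
    return hops * node_delay_us + transfer_us


PACKET_BYTES = 20
LINK_BYTES_PER_US = 20   # 20 MB/sec is 20 bytes per microsecond
NODE_DELAY_US = 0.25

transfer_us = PACKET_BYTES / LINK_BYTES_PER_US  # 1 microsecond

for d in (1, 3, 7):
    sf = store_and_forward_us(d, NODE_DELAY_US, transfer_us)
    wh = wormhole_us(d, NODE_DELAY_US, transfer_us)
    print(f"d={d}: store-and-forward {sf:.2f} us, wormhole {wh:.2f} us")
```

For d = 7 the two functions reproduce the book's 8.75 and 2.75 microseconds.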

8
Contention
  • Two packets trying to use the same link at same
    time
  • limited buffering
  • drop?
  • Most parallel machine networks block in place
  • link-level flow control
  • tree saturation
  • Closed system - offered load depends on delivered

9
Delay with Queuing
  • Suppose there are L links per node. Each link
    sends packets to another link at lambda
    packets/sec. The service rate (link + switch) is
    mu packets per second. What is the delay over
    a distance D?
  • Ans: There is a queue at each output link to hold
    extra packets. Model each output link as an M/M/1
    queue with L x lambda input rate and mu service
    rate.
  • Delay through each link = queuing time + service
    time = S
  • Delay over a distance D = S x D
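
The slide leaves S unevaluated. A minimal sketch follows, assuming the standard M/M/1 result that the mean time in the system (queuing plus service) is 1 / (service rate - arrival rate) when the arrival rate is below the service rate. That formula is not stated on the slide, and the function names are illustrative.

```python
def mm1_delay_per_link(links_per_node: int, lam: float, mu: float) -> float:
    """Mean delay S through one output link modeled as an M/M/1 queue.

    Arrival rate is links_per_node * lam, service rate is mu (packets/sec).
    Assumes the standard M/M/1 sojourn-time formula 1 / (mu - arrival).
    """
    arrival = links_per_node * lam
    if arrival >= mu:
        raise ValueError("queue is unstable: arrival rate must be below mu")
    return 1.0 / (mu - arrival)


def delay_over_distance(distance: int, links_per_node: int, lam: float, mu: float) -> float:
    """Total delay over `distance` links, each contributing S."""
    return distance * mm1_delay_per_link(links_per_node, lam, mu)


# Example values (made up for illustration): 4 links, 100 packets/sec each,
# service rate 1000 packets/sec, 5 links traversed.
print(delay_over_distance(5, 4, 100.0, 1000.0))
```

As the offered load L x lambda approaches mu, the per-link delay grows without bound, which is the queuing view of the contention described earlier.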

10
Congestion Control
  • Packet-switched networks do not reserve
    bandwidth; this leads to contention (connection-based
    networks limit input)
  • Solution: prevent packets from entering until
    contention is reduced (e.g., freeway on-ramp
    metering lights)
  • Options:
  • Packet discarding: if a packet arrives at a switch
    and there is no room in the buffer, the packet is
    discarded (e.g., UDP)
  • Flow control: between pairs of receivers and
    senders; use feedback to tell the sender when it is
    allowed to send the next packet
  • Back-pressure: separate wires to tell the sender to stop
  • Window: give the original sender the right to send N
    packets before getting permission to send more;
    overlaps latency of interconnection with
    overhead to send & receive packets (e.g., TCP),
    adjustable window
  • Choke packets (aka "rate-based"): each packet
    received by a busy switch in a warning state is sent
    back to the source via a choke packet. The source
    reduces traffic to that destination by a fixed
    percentage (e.g., ATM)
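
The window option can be sketched as a counter of unacknowledged packets. This is a toy model under stated assumptions, not how TCP or any particular network implements it: the class name, the one-permission-per-packet rule, and the absence of timeouts are all simplifications made here.

```python
class WindowSender:
    """Sender that may have at most `window` packets awaiting permission."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window
        self.in_flight = 0

    def can_send(self) -> bool:
        return self.in_flight < self.window

    def send(self) -> bool:
        """Try to send one packet; return False if the window is exhausted."""
        if not self.can_send():
            return False
        self.in_flight += 1
        return True

    def ack(self) -> None:
        """Receiver feedback: permission to send one more packet."""
        if self.in_flight > 0:
            self.in_flight -= 1


sender = WindowSender(window=3)
print([sender.send() for _ in range(5)])  # first 3 succeed, then the sender stalls
sender.ack()
print(sender.send())                      # one acknowledgement frees one slot
```

A larger window lets the sender keep more packets in flight while earlier ones are still crossing the network, which is the latency overlap the slide mentions.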

11
Flow Control
  • What do you do when push comes to shove?
  • Ethernet: collision detection and retry after
    delay
  • FDDI, token ring: arbitration token
  • TCP/WAN: buffer, drop, adjust rate
  • any solution must adjust to output rate
  • Link-level flow control

12
Examples
  • Short Links
  • long links
  • several flits on the wire

13
Routing
  • Recall: routing algorithm determines
  • which of the possible paths are used as routes
  • how the route is determined
  • R: N x N -> C, which at each switch maps the
    destination node nd to the next channel on the
    route
  • Issues
  • Routing mechanism
  • arithmetic
  • source-based port select
  • table driven
  • general computation
  • Properties of the routes
  • Deadlock free

14
Routing Mechanism
  • need to select output port for each input packet
  • in a few cycles
  • Simple arithmetic in regular topologies
  • ex: Dx, Dy routing in a grid
  • west (-x): Dx < 0
  • east (+x): Dx > 0
  • south (-y): Dx = 0, Dy < 0
  • north (+y): Dx = 0, Dy > 0
  • processor: Dx = 0, Dy = 0
  • Reduce relative address of each dimension in
    order
  • Dimension-order routing in k-ary d-cubes
  • e-cube routing in n-cube

15
Routing Mechanism (cont)
(Figure labels: P0, P1, P2, P3)
  • Source-based
  • message header carries series of port selects
  • used and stripped en route
  • CRC? Packet Format?
  • CS-2, Myrinet, MIT Arctic
  • Table-driven
  • message header carries an index for the next port
    at the next switch
  • o = R[i]
  • table also gives the index for the following hop
  • (o, i') = R[i]
  • ATM, HIPPI
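
The two mechanisms on this slide can be contrasted in a few lines. This is a minimal sketch with made-up port numbers and table contents: in the source-based version each switch uses the first port select in the header and strips it, while in the table-driven version each switch's table entry gives the output port and the index to use at the following hop. Function names are hypothetical.

```python
from collections import deque


def source_based_hops(port_selects: list) -> list:
    """Return (port taken, header left) at each switch along the route."""
    header = deque(port_selects)
    hops = []
    while header:
        port = header.popleft()        # used and stripped en route
        hops.append((port, list(header)))
    return hops


def table_driven_hops(tables: list, first_index: int) -> list:
    """tables[s] is the table of the s-th switch on the path.

    Each entry maps an incoming index to (output port, index for next hop).
    """
    hops = []
    index = first_index
    for table in tables:
        out_port, index = table[index]
        hops.append(out_port)
    return hops


print(source_based_hops([2, 0, 3]))
print(table_driven_hops([{5: (1, 7)}, {7: (0, 2)}, {2: (3, 0)}], 5))
```

The source-based header shrinks by one entry per hop, whereas the table-driven header stays one index wide at the cost of state in every switch.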

16
Properties of Routing Algorithms
  • Deterministic
  • route determined by (source, dest), not
    intermediate state (i.e. traffic)
  • Adaptive
  • route influenced by traffic along the way
  • Minimal
  • only selects shortest paths
  • Deadlock free
  • no traffic pattern can lead to a situation where
    no packets move forward

17
Deadlock Freedom
  • How can it arise?
  • necessary conditions
  • shared resource
  • incrementally allocated
  • non-preemptible
  • think of a channel as a shared resource that
    is acquired incrementally
  • source buffer then dest. buffer
  • channels along a route
  • How do you avoid it?
  • constrain how channel resources are allocated
  • ex: dimension order
  • How do you prove that a routing algorithm is
    deadlock free?

18
Proof Technique
  • resources are logically associated with channels
  • messages introduce dependencies between resources
    as they move forward
  • need to articulate the possible dependences that
    can arise between channels
  • show that there are no cycles in the Channel
    Dependence Graph
  • find a numbering of channel resources such that
    every legal route follows a monotonic sequence
  • => no traffic pattern can lead to deadlock
  • network need not be acyclic, only the channel
    dependence graph
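
The proof technique can be tried mechanically on small networks. The sketch below builds a channel dependence graph from a set of routes (a route holding one channel while requesting the next adds an edge) and checks it for cycles with a depth-first search. The two example networks, a unidirectional ring and a one-directional linear array, and all function names are choices made here for illustration.

```python
def channel_dependence_graph(routes: list) -> dict:
    """Map each channel to the set of channels a route may request while holding it."""
    graph = {}
    for route in routes:
        for channel in route:
            graph.setdefault(channel, set())
        for held, wanted in zip(route, route[1:]):
            graph[held].add(wanted)
    return graph


def has_cycle(graph: dict) -> bool:
    """Depth-first search with white/gray/black marking."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node) -> bool:
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


def ring_routes(k: int) -> list:
    """All routes on a unidirectional k-node ring; a channel is (from, to)."""
    return [[((s + i) % k, (s + i + 1) % k) for i in range(hops)]
            for s in range(k) for hops in range(1, k)]


def array_routes(k: int) -> list:
    """All left-to-right routes on a k-node linear array."""
    return [[(i, i + 1) for i in range(s, t)]
            for s in range(k) for t in range(s + 1, k)]


print(has_cycle(channel_dependence_graph(ring_routes(4))))
print(has_cycle(channel_dependence_graph(array_routes(4))))
```

The ring's dependence graph contains a cycle, so a deadlock is possible there, while the array's does not.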

19
Example: k-ary 2D array
  • Theorem: x,y routing is deadlock free
  • Numbering
  • +x channel (i,y) -> (i+1,y) gets i
  • similarly for -x with 0 as most positive edge
  • +y channel (x,j) -> (x,j+1) gets N+j
  • similarly for -y channels
  • any routing sequence (x direction, turn, y
    direction) is increasing
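
The numbering argument can be checked by brute force on a small array. The sketch below assumes N = k * k (the node count), numbers a +x channel leaving column i as i and a +y channel leaving row j as N + j, and reads "similarly" for the negative directions as counting from the most positive edge (k - 1 - i and N + k - 1 - j). Those readings and the function names are assumptions made here.

```python
def channel_number(src: tuple, dst: tuple, k: int) -> int:
    """Number of the channel from node src to neighboring node dst."""
    n = k * k
    (x0, y0), (x1, y1) = src, dst
    if (x1, y1) == (x0 + 1, y0):
        return x0                 # +x channel
    if (x1, y1) == (x0 - 1, y0):
        return k - 1 - x0         # -x channel, 0 at the most positive edge
    if (x1, y1) == (x0, y0 + 1):
        return n + y0             # +y channel
    if (x1, y1) == (x0, y0 - 1):
        return n + (k - 1 - y0)   # -y channel
    raise ValueError("nodes are not neighbors")


def xy_route(src: tuple, dst: tuple) -> list:
    """Hops of the x-then-y route as (from_node, to_node) pairs."""
    x, y = src
    hops = []
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        hops.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        hops.append(((x, y), (x, ny)))
        y = ny
    return hops


def all_routes_increasing(k: int) -> bool:
    """True if channel numbers strictly increase along every x,y route."""
    nodes = [(x, y) for x in range(k) for y in range(k)]
    for s in nodes:
        for t in nodes:
            nums = [channel_number(a, b, k) for a, b in xy_route(s, t)]
            if any(p >= q for p, q in zip(nums, nums[1:])):
                return False
    return True


print(all_routes_increasing(4))
```

Since every route acquires channels in strictly increasing order, no cycle of waiting channels can form under this numbering.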

20
Deadlock free wormhole networks?
  • Basic dimension order routing techniques don't
    work for k-ary d-cubes
  • only for k-ary d-arrays (bi-directional)
  • Idea add channels!
  • provide multiple virtual channels to break the
    dependence cycle
  • good for BW too!
  • Do not need to add links, or xbar, only buffer
    resources
  • This adds nodes to the CDG; remove edges?

21
Breaking deadlock with virtual channels
22
Adaptive Routing
  • R: C x N x S -> C
  • Essential for fault tolerance
  • at least multipath
  • Can improve utilization of the network
  • Simple deterministic algorithms easily run into
    bad permutations
  • fully/partially adaptive, minimal/non-minimal
  • can introduce complexity or anomalies
  • little adaptation goes a long way!

23
Interconnection Topologies
  • Class: networks scaling with N
  • Logical properties
  • distance, degree
  • Physical properties
  • length, width
  • Fully connected network
  • diameter = 1
  • degree = N
  • cost?
  • bus => O(N), but BW is O(1) - actually worse
  • crossbar => O(N^2) for BW O(N)
  • VLSI technology determines switch degree

24
(No Transcript)
25
(No Transcript)
26
Linear Arrays and Rings
  • Linear Array
  • Diameter?
  • Average Distance?
  • Bisection bandwidth?
  • Route A -> B given by relative address R = B - A
  • Torus?
  • Examples: FDDI, SCI, FiberChannel Arbitrated
    Loop, KSR1
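
The diameter and average-distance questions on this slide can be explored by enumerating node pairs. This is a minimal sketch, assuming a bidirectional ring and an average taken over all ordered pairs of distinct nodes; the slide fixes neither choice, and the function names are illustrative.

```python
def linear_distance(a: int, b: int, n: int) -> int:
    """Hops between nodes a and b on an n-node linear array."""
    return abs(b - a)


def ring_distance(a: int, b: int, n: int) -> int:
    """Hops on a bidirectional n-node ring: the shorter way around."""
    r = (b - a) % n
    return min(r, n - r)


def diameter_and_average(distance, n: int) -> tuple:
    """Maximum and mean distance over all ordered pairs of distinct nodes."""
    dists = [distance(a, b, n) for a in range(n) for b in range(n) if a != b]
    return max(dists), sum(dists) / len(dists)


for name, fn in (("linear array", linear_distance), ("ring", ring_distance)):
    diameter, average = diameter_and_average(fn, 8)
    print(f"{name}: diameter {diameter}, average distance {average:.3f}")
```

Closing the array into a ring roughly halves the diameter for the cost of one extra link.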

27
Multidimensional Meshes and Tori
(Figure labels: 3D Cube, 2D Grid)
  • d-dimensional array
  • n = k_(d-1) x ... x k_0 nodes
  • described by d-vector of coordinates (i_(d-1), ...,
    i_0)
  • d-dimensional k-ary mesh: N = k^d
  • k = d-th root of N
  • described by d-vector of radix-k coordinates
  • d-dimensional k-ary torus (or k-ary d-cube)?
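
The radix-k coordinate description amounts to writing a node number in base k. A minimal sketch follows, assuming nodes are numbered 0 to N - 1 with the highest dimension as the most significant digit; that numbering and the function names are assumptions made for illustration.

```python
def to_coords(node: int, k: int, d: int) -> tuple:
    """Return the d-vector (i_(d-1), ..., i_0) of radix-k digits of node."""
    if not 0 <= node < k ** d:
        raise ValueError("node number out of range for a k-ary d-dimensional mesh")
    digits = []
    for _ in range(d):
        digits.append(node % k)   # least significant digit first
        node //= k
    return tuple(reversed(digits))


def from_coords(coords: tuple, k: int) -> int:
    """Inverse of to_coords: fold the digits back into a node number."""
    node = 0
    for c in coords:
        node = node * k + c
    return node


k, d = 3, 3
print(k ** d)                      # N = k^d nodes
print(to_coords(11, k, d))
print(from_coords((1, 0, 2), k))
```

With this encoding, the relative address used by dimension-order routing is the digit-wise difference between the destination and source vectors.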

28
Real Machines
  • Wide links, smaller routing delay
  • Tremendous variation