Title: CSE 291A Interconnection Networks
1CSE 291A Interconnection Networks
Lecture 10: Flow Control
February 21, 2007
Prof. Chung-Kuan Cheng, CSE Dept, UC San Diego, Winter 2007
Transcribed by Thomas Weng
2Topics
- Introduction
- Bufferless Flow Control
- Buffered Flow Control
  - Cut-through
  - Wormhole
  - Virtual Channel
3Introduction
- Objective: Bandwidth and Latency
- Resources: Buffer and Channel
[Figure: a switch with its State and Buffer, connected to two Channels]
4Unit of Messages
- A message is too long to send as one unit, so it is cut up into packets.
- The basic chunk is the packet: Header, Body (content), Tail (indicates the packet is done).
- The head of the packet carries RI (Routing Information) and SN (Sequence Number).
- A packet should be reasonably long so that we minimize overhead.
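The overhead argument can be made concrete with a small sketch. It assumes a fixed per-packet cost for the header and tail (the 32-bit figure, the 4096-bit message, and the function names are all illustrative assumptions, not values from the lecture) and shows how the overhead fraction shrinks as the payload per packet grows.

```python
import math

def packets_needed(message_bits, payload_bits):
    """Number of packets needed when each packet carries payload_bits of the message."""
    return math.ceil(message_bits / payload_bits)

def overhead_fraction(payload_bits, overhead_bits):
    """Fraction of a full packet's bits that are header/tail rather than content."""
    return overhead_bits / (payload_bits + overhead_bits)

if __name__ == "__main__":
    OVERHEAD_BITS = 32   # assumed cost of header (RI, SN) plus tail; illustrative only
    MESSAGE_BITS = 4096  # assumed message size; illustrative only
    for payload in (32, 128, 1024):
        n = packets_needed(MESSAGE_BITS, payload)
        frac = overhead_fraction(payload, OVERHEAD_BITS)
        print(f"payload={payload:5d} bits  packets={n:4d}  overhead={frac:.1%}")
```

Longer packets amortize the same header over more content, which is the sense in which a packet should be "reasonably long".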
5Flit and Phit
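- A packet (headed by RI and SN, ended by a Tail) is divided into flits.
- Flit (Flow Control Digit): carries a Type field (H, T, HT, or Body) and a Virtual Channel field.
- Phit (Physical Transfer Digit): each flit is in turn divided into phits.
Unit    Typical   Range
Phit    8 bits    1-64 bits
Flit    64 bits   16-512 bits
Packet  1K        128 bits-512K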
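As a rough illustration of the packet, flit, and phit hierarchy, the sketch below labels the flits of a packet with the types listed above (H, T, HT, or Body, written here as B) and counts units for the typical sizes. Reading "1K" as 1024 bits is an assumption, and the function names are hypothetical.

```python
import math

def flit_types(num_flits):
    """Label each flit of a packet: H (head), B (body), T (tail), or HT for a one-flit packet."""
    if num_flits < 1:
        raise ValueError("a packet needs at least one flit")
    if num_flits == 1:
        return ["HT"]  # a single flit is both head and tail
    return ["H"] + ["B"] * (num_flits - 2) + ["T"]

def unit_counts(packet_bits, flit_bits, phit_bits):
    """Return (flits per packet, phits per flit), rounding up partial units."""
    return math.ceil(packet_bits / flit_bits), math.ceil(flit_bits / phit_bits)

if __name__ == "__main__":
    # Typical sizes from the table; 1K is assumed to mean 1024 bits.
    flits, phits_per_flit = unit_counts(packet_bits=1024, flit_bits=64, phit_bits=8)
    print(flits, "flits per packet,", phits_per_flit, "phits per flit")
    print(flit_types(5))
```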
6Bufferless Flow Control
- What if we have no buffer?
- Why not have a buffer? Less power and improved latency.
- Three methods:
- 1. Drop the data (easiest way)
- 2. Misroute the data (treat the data like a hot potato)
- 3. Dropless approach (reservations)
7Drop the Data
- Drop the data: the sender must learn that the data was dropped, in one of two ways.
- 1) Nack (Negative Acknowledgement): tell the sender that you dropped the data.
- 2) Ack (Acknowledgement): the sender resends if an Ack is not received within the timeout period.
[Figure: packets arriving on channels 0 and 1; one is forwarded and the other is marked Drop (dropped data)]
Reverse Channel: Ack and flow control signals are sent on a reverse channel.
8Drop the Data (cont)
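Suppose the packet sent on channel 2 was rejected.
Time-space diagram (one row pair per channel; F = forward, R = reverse, H = head, B = body, T = tail, N = Nack, A = Ack):
0 F H B B B T H B B B T
  R N A
1 F H B B B H B B B T
  R N A
2 F H B H B B B T
  R N A
3 F H B B B T
  R A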
Dropping data simplifies things, but we pay a price by resending the data. Latency is long if the data was rejected.
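The cost of a rejection can be sketched with a toy model. It assumes every drop happens at the far end of the path, that data and Ack/Nack each take the same one-way time, and that a timeout is at least one round trip. These simplifications and the function name are assumptions for illustration, not part of the lecture.

```python
def delivery_time(dropped_attempts, one_way, scheme, timeout=None):
    """Time until the sender receives an Ack, after some attempts were dropped.

    scheme "nack":    each dropped attempt costs one round trip (data out, Nack back).
    scheme "timeout": each dropped attempt costs one full timeout period.
    The final, successful attempt always costs one round trip (data out, Ack back).
    """
    round_trip = 2 * one_way
    if scheme == "nack":
        retry_cost = round_trip
    elif scheme == "timeout":
        if timeout is None or timeout < round_trip:
            raise ValueError("timeout must be at least one round trip")
        retry_cost = timeout
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return dropped_attempts * retry_cost + round_trip

if __name__ == "__main__":
    for drops in (0, 1, 2):
        print(drops,
              delivery_time(drops, one_way=5, scheme="nack"),
              delivery_time(drops, one_way=5, scheme="timeout", timeout=20))
```

In this model every rejected attempt adds at least a full round trip, and waiting for a timeout adds more than an explicit Nack would.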
9Dropless Flow Control
Dropless Flow Control: A request propagates from the source to the destination and allocates the channel. An Ack is transmitted back to the source. The packet is then sent. A tail flit is sent to de-allocate the channel.
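Time-space diagram (one row per channel; R = request, A = Ack, D = data, T = tail):
0 R A D T
1 R A D T
2 R A D T
3 R A D T
4 R A D T
Total time: $T_0 = 3 H t_r + L/b$
where H = number of hops, $t_r$ = latency per hop, L = length of packet, b = bandwidth.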
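A minimal numeric sketch of the total-time expression follows. It reads the formula as T0 = 3*H*tr + L/b (the operators are an assumption, since they are missing in the transcript). One plausible reading of the factor of 3 is that the request, the Ack, and the head of the data each cross the H hops once. The example values are made up.

```python
def dropless_latency(hops, t_r, length, bandwidth):
    """Zero-load latency for dropless (reservation) flow control: 3*H*t_r + L/b."""
    setup_and_head = 3 * hops * t_r     # request out, Ack back, data head out
    serialization = length / bandwidth  # time to stream the packet onto the channel
    return setup_and_head + serialization

if __name__ == "__main__":
    # Illustrative values: 4 hops, 2 cycles per hop, 64-bit packet, 8 bits/cycle.
    print(dropless_latency(hops=4, t_r=2, length=64, bandwidth=8))
```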
10Buffered Flow Control
1. Store & Forward
2. Cut-through
3. Wormhole
4. Virtual Channel
11Store & Forward
1. Store & Forward Flow Control: Each node receives the entire packet and then sends it out.
Time-space diagram (one row per channel):
0 H B B B T
1 H B B B T
2 H B B B T
3 H B B B T
$T_0 = H (t_r + L/b)$
12Cut-through
2. Cut-through Flow Control: Each node starts to send the packet without waiting for the whole packet to arrive. Cut-through is a more efficient approach.
1) Good performance
2) Large buffer sizes, which consume more power
Time-space diagram (one row per channel):
0 H B B B T
1 H B B B T
2 H B B B T
3 H B B B T
Suppose in the middle, we get stuck:
0 H B B B T
1 H B B B T
2 ---- Not Ready ---- H B B B T
3 H B B B T
$T_0 = H t_r + L/b$
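To compare the two buffered schemes numerically, the sketch below evaluates the store-and-forward expression, read as T0 = H*(tr + L/b), against the cut-through expression, read as T0 = H*tr + L/b. The operators are assumed, since the transcript lost them, and the example values are made up.

```python
def store_and_forward_latency(hops, t_r, length, bandwidth):
    """Each hop waits for the whole packet: H * (t_r + L/b)."""
    return hops * (t_r + length / bandwidth)

def cut_through_latency(hops, t_r, length, bandwidth):
    """The head is forwarded immediately, so serialization is paid once: H * t_r + L/b."""
    return hops * t_r + length / bandwidth

if __name__ == "__main__":
    # Illustrative values: 4 hops, 2 cycles per hop, 64-bit packet, 8 bits/cycle.
    args = dict(hops=4, t_r=2, length=64, bandwidth=8)
    print("store and forward:", store_and_forward_latency(**args))
    print("cut-through:      ", cut_through_latency(**args))
```

With these numbers the serialization time L/b is paid four times under store and forward but only once under cut-through, which is where the gap comes from when there is no contention.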
13Wormhole
3. Wormhole Flow Control: The main difference is that we have only a small buffer and don't need to store the entire packet. In terms of bandwidth performance, it's better than cut-through.
States: Idle, Wait, Active
[Figure: wormhole example on channels 0 and 1; flits H, B, B, T arriving; state labels I, W, L, U]
14Wormhole (cont)
[Figure: wormhole example continued; state labels W, A, L, U; flits H, B, B, T shown at In and at Out on channels 0 and 1]
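The three states named above (Idle, Wait, Active) can be sketched as a small state machine. The transition rules and event names below are assumptions chosen for illustration: a head flit moves an idle channel to Wait, a granted output moves it to Active, and forwarding the tail flit returns it to Idle. The lecture does not spell these rules out.

```python
from enum import Enum

class State(Enum):
    IDLE = "Idle"
    WAIT = "Wait"
    ACTIVE = "Active"

# Assumed transition table; any (state, event) pair not listed leaves the state unchanged.
TRANSITIONS = {
    (State.IDLE, "head_arrives"): State.WAIT,      # head flit needs an output channel
    (State.WAIT, "output_granted"): State.ACTIVE,  # output allocated, flits may flow
    (State.ACTIVE, "tail_sent"): State.IDLE,       # tail flit releases the channel
}

def next_state(state, event):
    """Apply one event to the channel state."""
    return TRANSITIONS.get((state, event), state)

def run(events, state=State.IDLE):
    """Return the list of states visited while processing the events in order."""
    trace = [state]
    for event in events:
        state = next_state(state, event)
        trace.append(state)
    return trace

if __name__ == "__main__":
    for s in run(["head_arrives", "output_granted", "tail_sent"]):
        print(s.value)
```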
15Virtual Channel
4. Virtual Channel: Try to split the channel in the time domain. By doing so we can fully utilize the channel, since we don't waste it by holding it while a packet is blocked.
Wormhole Virtual Channel Winner
16Virtual Channel (cont)
[Figure: nodes 1, 2, 3, 4 with packets A and B]
Input a:          an a1 a2 a3 a4
Input b:          bn b1 b2 b3 b4
Interleaved:      an bn a1 b1 a2 b2 a3 b3 a4 b4
Winner Takes All: an a1 a2 a3 a4 bn b1 b2 b3 b4
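The two orderings above can be reproduced with a short sketch. It assumes one flit crosses the shared channel per cycle and that input a wins arbitration first. The function names are hypothetical, and packets are identified by the first letter of each flit name.

```python
from itertools import zip_longest

def interleaved(a, b):
    """Alternate flits from the two inputs, one flit at a time."""
    out = []
    for x, y in zip_longest(a, b):
        if x is not None:
            out.append(x)
        if y is not None:
            out.append(y)
    return out

def winner_takes_all(a, b):
    """Send every flit of the winning packet before any flit of the other."""
    return list(a) + list(b)

def finish_cycles(order):
    """Cycle (1-based) in which each packet's last flit crosses the channel."""
    last = {}
    for cycle, flit in enumerate(order, start=1):
        last[flit[0]] = cycle  # packet id is the first letter of the flit name
    return last

if __name__ == "__main__":
    a = ["an", "a1", "a2", "a3", "a4"]
    b = ["bn", "b1", "b2", "b3", "b4"]
    print(interleaved(a, b), finish_cycles(interleaved(a, b)))
    print(winner_takes_all(a, b), finish_cycles(winner_takes_all(a, b)))
```

Under these assumptions packet b finishes in cycle 10 either way, while packet a finishes in cycle 5 with winner takes all instead of cycle 9 with interleaving.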