Title: Asynchronous Communication Mechanisms Using Self-timed Circuits
- Fei Xia, Alex Yakovlev, Delong Shang,
- Alex Bystrov, Albert Koelmans,
- David Kinniment
- Asynchronous Systems Laboratory
- University of Newcastle upon Tyne
- Async2000, Eilat, Israel
3 Objectives
- To study a class of async comms previously used in (software) systems for embedded applications for potential use in SOCs
- Salient features of this class
  - Bulk data transfer (medium, possibly varying, size frames)
  - Between independent motive powers (clock domains), hence need to eliminate mutual blocking
  - Issues of coherence and freshness of data
4 Outline
- Asynchronous Communication
- Mechanisms for Async Communication
- Three and Four Slot ACMs
- Speed-independent implementation
- Comparison with FM solutions
- Conclusions
7 Asynchronous Communication
Rita (Reader)
11 Asynchronous Communication
Is it really Asynchronous Communication?
35 Asynchronous Communication
Bounded buffer is still Synchronous Communication!
36 Asynchronous Communication
Solution?
38 Mechanisms for Async Comm
Solution 1: Writer bins the new item when buffer is full
40 Mechanisms for Async Comm
Solution 1: Reader re-reads the old item when buffer is empty
42 Mechs for Async Comm
Solution 1 implemented as a non-blocking FIFO (IEEE TC VLSI Newsletter, Fall 1998)
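The two rules of Solution 1 (Writer bins on full, Reader re-reads on empty) can be sketched in software as follows. This is a sequential illustration only, not the circuit from the cited newsletter; the class name `NonBlockingFifo`, its fields and the `initial` item are assumptions made for the example.

```python
from collections import deque


class NonBlockingFifo:
    """Bounded FIFO in which neither side ever waits (hypothetical sketch)."""

    def __init__(self, capacity, initial):
        self.capacity = capacity
        self.items = deque()
        self.last_read = initial  # item handed out again when the buffer is empty
        self.dropped = 0          # number of items binned by the Writer

    def write(self, item):
        # Writer rule: if the buffer is full, bin the new item instead of waiting.
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(item)
        return True

    def read(self):
        # Reader rule: if the buffer is empty, re-read the old item instead of waiting.
        if self.items:
            self.last_read = self.items.popleft()
        return self.last_read


fifo = NonBlockingFifo(capacity=2, initial="23.12")
results = [fifo.write(d) for d in ("27.12", "30.12", "02.01")]  # third write is binned
reads = [fifo.read() for _ in range(3)]                         # third read repeats "30.12"
```

Note that the binned item here is the newest one ("02.01"), so the Reader ends up with older data.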
44 Mechs for Async Comm
Solution 2: Writer overwrites the item when buffer is full
But this involves locking the whole buffer!
45 Mechs for Async Comm
Is a (non-blocking) FIFO buffer a proper solution for the News type of data?
46 Mechs for Async Comm
No! News may be out of date when it reaches Reader
47 Mechs for Async Comm
- Required Properties
  - Total Asynchrony: Reader and Writer, with independent motive powers, cannot wait
  - Coherence: no data corruption, thus items cannot be written/read in part
  - Freshness: Reader must read the item written most recently by Writer
48 Data Coherence
56 Data Coherence Violation
57 Data Freshness
63 Data Freshness Violation
65 Async Comm Mechanisms
How to maintain Asynchrony, Coherence and Freshness?
Control variables
ACM
69 Slot Mechanisms
- How many slots are enough?
- One cannot be both async and coherent
- Two can be made async and coherent
  - but no freshness
70 Slot Mechanisms
- Three or Four Slots are sufficient to achieve freshness
- We used algorithms due to Hugo Simpson (BAe)
71 Three-slot ACM
[Animation, slides 71-81: Writer and Reader sharing three slots s1, s2, s3 that hold dated items. Initial contents: s1 = 23.12, s2 = 27.12, s3 = 30.12. Final contents: s1 = 02.01, s2 = 03.01, s3 = 05.01.]
82 Three-slot algorithm
Writer
wr: d[n] := input
w0: l := n
w1: n := differ(l, r)
Reader
r0: r := l
rd: output := d[r]
n (new), l (last), r (read): 3-valued variables
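The three-slot statements can be read as the following sequential Python sketch. The class name `ThreeSlotACM`, the choice of the lowest free index in `differ`, and the initial values of n, l and r are assumptions for illustration. The hardware runs Writer and Reader concurrently and arbitrates between their control statements, which this sketch does not model.

```python
class ThreeSlotACM:
    """Sequential sketch of the three-slot algorithm (illustrative only)."""

    def __init__(self, initial):
        self.d = [initial, initial, initial]  # the three data slots
        self.n = 1  # new: slot the Writer will fill next
        self.l = 0  # last: slot holding the most recent complete item
        self.r = 0  # read: slot the Reader is using

    @staticmethod
    def differ(l, r):
        # Any slot index that is neither l nor r (lowest one chosen here).
        return next(i for i in range(3) if i != l and i != r)

    def write(self, item):
        self.d[self.n] = item                 # wr: d[n] := input
        self.l = self.n                       # w0: l := n
        self.n = self.differ(self.l, self.r)  # w1: n := differ(l, r)

    def read(self):
        self.r = self.l                       # r0: r := l
        return self.d[self.r]                 # rd: output := d[r]


acm = ThreeSlotACM("23.12")
acm.write("27.12")
acm.write("30.12")
first = acm.read()     # freshest item so far
again = acm.read()     # no new write, so the same item is re-read
acm.write("02.01")
acm.write("03.01")     # two writes in a row; neither touches the Reader's slot
untouched = acm.d[acm.r]
latest = acm.read()
```

Because n is always chosen to differ from both l and r, the Writer never writes into the slot the Reader has selected, which is the coherence argument in miniature.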
84 Four-slot ACM
[Figure: Writer and Reader sharing four data slots; labels s0, s1, v0, v1.]
85 Four-slot algorithm
Writer
wr: d[n, ¬s[n]] := input
w0: s[n] := ¬s[n]
w1: l := n; n := ¬r
Reader
r0: r := l
r1: v := s
rd: output := d[r, v[r]]
n (new), l (last), r (read): binary variables
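A sequential Python sketch of the four-slot statements follows. It assumes the standard form of Simpson's four-slot algorithm, in which the Writer fills the slot of pair n that s[n] does not point to and then flips s[n]. The class name `FourSlotACM` and the initial values are assumptions for illustration, and concurrency between Writer and Reader is not modelled.

```python
class FourSlotACM:
    """Sequential sketch of the four-slot algorithm (illustrative only)."""

    def __init__(self, initial):
        self.d = [[initial, initial], [initial, initial]]  # d[pair][slot]
        self.s = [0, 0]  # s[p]: slot of pair p holding that pair's latest item
        self.v = [0, 0]  # Reader's private copy of s
        self.n = 1       # new: pair the Writer uses next
        self.l = 0       # last: pair written most recently
        self.r = 0       # read: pair the Reader is using

    def write(self, item):
        self.d[self.n][1 - self.s[self.n]] = item  # wr: d[n, not s[n]] := input
        self.s[self.n] = 1 - self.s[self.n]        # w0: s[n] := not s[n]
        self.l = self.n                            # w1: l := n
        self.n = 1 - self.r                        #     n := not r

    def read(self):
        self.r = self.l                            # r0: r := l
        self.v = list(self.s)                      # r1: v := s
        return self.d[self.r][self.v[self.r]]      # rd: output := d[r, v[r]]


acm = FourSlotACM("23.12")
acm.write("27.12")
acm.write("30.12")
first = acm.read()
acm.write("02.01")      # lands in the other slot of the pair being read
held = acm.d[acm.r][acm.v[acm.r]]
second = acm.read()
acm.write("03.01")
third = acm.read()
```

All control variables here are single bits, in contrast to the 3-valued variables of the three-slot version.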
87 Implementation of ACM
[Block diagram: writer and reader each connect to the ACM through a start/done handshake; Data In and Data Out pass through steering logic. The ACM control part comprises Write control, Read control and Statement logic (mutexes, latches, selectors), with control variables n, r, l driving the steering. Handshake signals: wr-req/ack, w0-req/ack, w1-req/ack on the write side; r0-req/ack, rd-req/ack on the read side.]
89 Write Control STG
[Figure: signal transition graph over the start and done handshake signals.]
90 Write Control logic: direct translation from STG
91 3-slot ACM design
[Block diagram: write control and read control connected through a mutex (requests Rw0, Rr0; grants Gw0, Gr0), with handshakes w0-req/ack, w1-req/ack and r0-req/ack, a differ block, and registers for n, l and r.]
93 Differ and register logic
[Figure: differ and register circuit with signals l1, l2, l3, r1, r2, r3, n1, n2, n3 and the w1-req/w1-ack handshake.]
95 Write control circuit STG
96 Write control circuit from Petrify
97 Analogue simulation
98 3-slot vs 4-slot performance
Minimum time for control statements (ns)
Statements    3-slot    4-slot
w0, w1        4.19      9.39
r0 (r1)       1.38      3.47
99 Other analyses of ACM designs
- Response time analysis for Write and Read using stochastic Petri nets (tool PET by Xie and Beerel)
- The circuit response varies with the relative frequency of Write/Read, e.g. higher Write frequency increases the chance for Read to hit arbitration and hence be delayed.
100 Response time analysis
101 Other analyses of ACM designs
- Digital simulation using Verilog models for Writer, Reader and ACM
- The circuit is a coherent, fresh and non-blocking mechanism. Clear indication of data over-writing (skipping) and re-reading (olding)
102 Digital simulation
108 Other analyses of ACM designs
- Stochastic analysis of skipping and olding using Generalised SPN (GSPN) tool
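To make skipping and olding concrete, the following toy count treats the ACM as an ideal mechanism in which every read returns the most recently written item. It is not the Verilog or GSPN model from the slides; the function name `count_skips_and_olds` and the strictly periodic Writer and Reader are assumptions made for the example.

```python
def count_skips_and_olds(write_period, read_period, horizon):
    """Count skipped and re-read items for a periodic Writer and Reader.

    Item 0 stands for the initial contents; items 1, 2, ... are written at
    multiples of write_period. Reads happen at multiples of read_period.
    """
    latest = 0        # sequence number of the most recent item written
    last_read = None  # sequence number returned by the previous read
    seen = set()      # items that were read at least once
    olds = 0
    for t in range(1, horizon + 1):
        if t % write_period == 0:
            latest += 1              # Writer publishes a new item
        if t % read_period == 0:
            if latest == last_read:
                olds += 1            # same item returned again: olding
            last_read = latest
            seen.add(latest)
    # Items overwritten before any read: skipping.
    skips = sum(1 for k in range(1, latest + 1) if k not in seen)
    return skips, olds


fast_writer = count_skips_and_olds(write_period=1, read_period=3, horizon=12)  # (8, 0)
fast_reader = count_skips_and_olds(write_period=3, read_period=1, horizon=12)  # (0, 7)
```

A faster Writer produces only skipping and a faster Reader only olding, which is the qualitative behaviour the bullets above describe.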
110 Comparison with FM solutions
- Fundamental Mode designs for 4-slot were proposed by H. Simpson and E. Campbell
- Writer and Reader time (with individual motive powers) their wr, w0, w1 and r0, r1, rd operations, allowing enough time for potential metastability to settle on control variables n, r, l
111 Comparison with FM solutions
- Self-timed (I/O mode) design
  - can potentially run faster than the FM design
  - makes it possible to operate in FM with practically bounded Read and Write control actions
- Theoretical possibility of unbounded metastability => trade-off between temporal independence and data coherence
113 Conclusions
- Speed-independent VLSI (AMS 0.6 µm CMOS) implementation of 3- and 4-slot ACMs for News (reference) data transfers
- Minimum granularity of blocking: a binary variable
- Practical boundedness of Slot Acquisition Time
- For non-handshake interfaces, dones can be dropped
- What is the right size of data blocks for such ACMs?
114 VLSI design layout
116 And now
117 Hag Sameah!
Everybody to the
C