Title: Advanced VLSI Design
1Advanced VLSI Design
Timing Issues
2Architecture of the Motorola DSP 56K family
- 24-bit general purpose Digital Signal Processors
- It has a dual Harvard architecture optimized for
multiply-accumulate (MAC) operations
- It features a three-stage instruction pipeline,
which is essentially invisible to the programmer
3Harvard architecture
- Harvard architecture refers to a memory structure
in which the processor is connected to two
independent memory banks via two independent sets
of buses
- The key advantage of the Harvard architecture is
that two memory accesses can be made during any
one instruction cycle
4Major components of the central processing module
- Data Buses
- Address Buses
- Data Arithmetic Logic Unit (data ALU)
- Address Generation Unit (AGU)
- Program Control Unit (PCU)
- Memory Expansion (Port A)
- On-Chip Emulator (OnCE) circuitry
- Phase-locked Loop (PLL) based clock circuitry
5Synchronization
- A well-defined ordering of switching events is required for
the circuit to operate correctly
- In the synchronous system approach, all memory
elements are updated simultaneously using a
global clock
- Register-based clocking (robust, reliable) and
latch-based clocking
6Pipelining
- Pipelining is used to accelerate the operation of the datapath
- Computation is performed in an assembly-line-like
fashion
- The pipelined network outperforms the original circuit
with respect to speed
- Macro pipeline, micro pipeline
7Pipeline PCU MACRO LEVEL
8Pipelined datapath MICRO LEVEL
9Timing Parameters
- Assume positive edge triggered system
10Timing Definitions
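[Figure: timing diagram of a register with input D, output Q, and clock CLK. D must be stable for the setup time t_su before the clock edge and for the hold time t_hold after it; Q becomes stable t_c-q after the clock edge.]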
11Timing constraint
Minimum cycle time: $T > t_{c-q} + t_{su} + t_{logic}$
Hold time constraint: $t_{hold} < t_{c-q,cd} + t_{logic,cd}$
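The two constraints can be checked numerically. The following is a minimal sketch, assuming ideal clocks (no skew or jitter); the function names and the delay values are illustrative and not part of the slides.

```python
def min_cycle_time(t_cq, t_su, t_logic):
    """Lower bound on the clock period: T > t_cq + t_su + t_logic."""
    return t_cq + t_su + t_logic


def hold_ok(t_hold, t_cq_cd, t_logic_cd):
    """Hold constraint: t_hold < t_cq_cd + t_logic_cd (contamination delays)."""
    return t_hold < t_cq_cd + t_logic_cd


# Hypothetical delays in picoseconds.
print(min_cycle_time(t_cq=100, t_su=50, t_logic=600))   # 750
print(hold_ok(t_hold=30, t_cq_cd=60, t_logic_cd=40))    # True
print(hold_ok(t_hold=120, t_cq_cd=60, t_logic_cd=40))   # False
```

Note that the cycle-time bound uses the worst-case (propagation) delays, while the hold check uses the best-case (contamination) delays.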
12Clock Non-idealities
- Clock skew
- Spatial variation in the arrival time of a clock
transition
- It is caused by mismatches in the clock paths or clock
load
- It can be positive or negative depending on the
routing direction and the position of the clock source
- Clock skew does not result in clock period
variation
13Positive and Negative Skew
14Positive Skew
Launching edge arrives before the receiving edge
15Impact of positive clock skew
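Minimum cycle time: $T + \delta > t_{c-q} + t_{su} + t_{logic}$, where $\delta$ is the magnitude of the skew
Positive skew relaxes this constraint; the worst case for the cycle time is when the receiving edge arrives early
(negative skew)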
16Race condition
- Hold time constraint
- $t_{hold} + \delta < t_{c-q,cd} + t_{logic,cd}$
17Negative Skew
Receiving edge arrives before the launching edge
18Impact of negative clock skew
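Minimum cycle time: $T - \delta > t_{c-q} + t_{su} + t_{logic}$, where $\delta$ is the magnitude of the skew
Worst case is when the receiving edge arrives early,
which is the negative-skew case shown here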
19No Race condition
- Probability of a race condition is reduced or nil
- $t_{hold} - \delta < t_{c-q,cd} + t_{logic,cd}$
- The system never fails because of a race: new data latched into R1
never gets transferred to R2 on the same edge, since R2 has already turned off
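The positive-skew and negative-skew cases can be folded into one pair of checks by using a signed skew. This is a sketch under an assumed sign convention (`skew > 0` means the receiving edge arrives after the launching edge, `skew < 0` means it arrives before); the slides instead write the skew as a magnitude. All numbers are hypothetical.

```python
def min_cycle_time_with_skew(t_cq, t_su, t_logic, skew):
    """T + skew > t_cq + t_su + t_logic, solved for the bound on T."""
    return t_cq + t_su + t_logic - skew


def hold_ok_with_skew(t_hold, t_cq_cd, t_logic_cd, skew):
    """t_hold + skew < t_cq_cd + t_logic_cd."""
    return t_hold + skew < t_cq_cd + t_logic_cd


# Hypothetical delays in picoseconds.
print(min_cycle_time_with_skew(100, 50, 600, skew=40))    # 710: positive skew helps speed
print(min_cycle_time_with_skew(100, 50, 600, skew=-40))   # 790: negative skew costs speed
print(hold_ok_with_skew(30, 60, 40, skew=80))             # False: race with positive skew
print(hold_ok_with_skew(30, 60, 40, skew=-80))            # True: no race with negative skew
```

The example shows the trade-off stated on the slides: positive skew shortens the minimum period but tightens the hold constraint, and negative skew does the opposite.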
20Clock Non-idealities
- Clock jitter
- Temporal variation of the clock period at a
given point on the chip, i.e. the clock period
shrinks or expands on a cycle-by-cycle basis
- Absolute jitter ($t_{jitter}$): worst-case variation
of a clock edge at a given location with respect
to an ideal clock
- Worst case: $T_{clk}$ is reduced by $2 t_{jitter}$
- Cycle-to-cycle jitter ($T_{jitter}$): deviation of
a single clock period relative to the ideal clock
21Impact of Jitter---always slows down
22Clock Non-idealities
- Variation of the pulse width
- Important for level sensitive clocking
23Combined Impact
- Minimum time available (negative skew)
- $T_{clk} - \delta - 2 t_{jitter} > t_{c-q} + t_{logic} + t_{su}$
- or
- $T_{clk} > t_{c-q} + t_{logic} + t_{su} + \delta + 2 t_{jitter}$
24Hold time constraint (pos skew)
$t_{hold} + t_{jitter} + \delta < -t_{jitter} + t_{c-q,cd} + t_{logic,cd}$
- Minimum time available (positive skew)
- $T_{clk} + \delta - 2 t_{jitter} > t_{c-q} + t_{logic} + t_{su}$
- $T_{clk} > t_{c-q} + t_{logic} + t_{su} - \delta + 2 t_{jitter}$
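A small calculation can make the combined budget concrete. The sketch below assumes a signed skew (positive when the receiving edge is late, negative when it is early) and a worst-case absolute jitter `t_jitter` on each edge; the function names and the numbers are illustrative only.

```python
def min_period(t_cq, t_logic, t_su, skew, t_jitter):
    """T_clk > t_cq + t_logic + t_su - skew + 2 * t_jitter."""
    return t_cq + t_logic + t_su - skew + 2 * t_jitter


def max_skew_for_hold(t_hold, t_cq_cd, t_logic_cd, t_jitter):
    """Hold is met while skew < t_cq_cd + t_logic_cd - t_hold - 2 * t_jitter."""
    return t_cq_cd + t_logic_cd - t_hold - 2 * t_jitter


# Hypothetical delays in picoseconds.
print(min_period(100, 600, 50, skew=-40, t_jitter=10))   # 810 (negative skew)
print(min_period(100, 600, 50, skew=40, t_jitter=10))    # 730 (positive skew)
print(max_skew_for_hold(30, 60, 40, t_jitter=10))        # 50
```

Jitter always adds to the required period, whereas skew adds or subtracts depending on its sign.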
25Clock Skew and Jitter
[Figure: two clock waveforms illustrating skew $t_{SK}$ and jitter $t_{JS}$]
- Both skew and jitter affect the effective cycle
time
- Only skew affects race condition
26Sources of skew and jitter
- Clock signal generation
- Manufacturing device variations
- Interconnect variations
- Environmental variations
- Capacitive coupling
- Design clock distribution network carefully
27Latch-Based Design
L2 latch is transparent when the clock $\phi = 1$
L1 latch is transparent when the clock $\phi = 0$
[Figure: latch L1, a logic block, and latch L2 in series, clocked by $\phi$]
Latch is a soft barrier
28Performance Similar to
29Slack borrowing
- Enhanced performance due to flexible timing, with
no design changes
- A logic block can use time that is
left over from the previous logic block
- Total logic delay can be more than one clock
cycle
31Reg based vs. latch based--example
32Less Tclk
33Maximum slack possible
- Maximum time that can be borrowed is $0.5\,T_{clk}$
- So the maximum logic delay in one cycle can be $1.5\,T_{clk}$
- But for n stages the overall delay is still limited to
$n\,T_{clk}$
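The limits above can be checked with a deliberately simplified model. This sketch assumes each stage has a nominal budget of one `t_clk`, that an overrun is passed to the next stage as borrowed time, that a stage finishing early cannot hand its spare time backwards, and that nothing may remain borrowed after the last stage. These modelling choices and the numbers are assumptions for illustration, not a full latch timing analysis.

```python
def slack_borrowing_ok(stage_delays, t_clk):
    """Check the 0.5*t_clk borrowing limit and the overall n*t_clk limit."""
    borrowed = 0  # time the current stage has already lost to its predecessor
    for delay in stage_delays:
        overrun = borrowed + delay - t_clk
        borrowed = max(0, overrun)
        if 2 * borrowed > t_clk:   # more than half a cycle borrowed
            return False
    return borrowed == 0           # the last stage must finish on time


# Hypothetical delays in picoseconds with t_clk = 1000.
print(slack_borrowing_ok([1400, 600, 1000], 1000))  # True: 400 borrowed, then repaid
print(slack_borrowing_ok([1600, 400, 1000], 1000))  # False: borrows more than 0.5 t_clk
print(slack_borrowing_ok([1400, 900], 1000))        # False: total exceeds n * t_clk
```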
34Drawbacks
- A two-phase clocking scheme has to be used
- Glitches: power dissipation increases
35Asynchronous systems--Self timed approach
- Synchronous systems
- Logical ordering of events is set by the clock, which provides a
time base
- Physical timing constraint: the next edge comes only when
all blocks have reached steady state
- Problems: a CLB has to wait even though it may finish
earlier; clock distribution network
36Asynch. design: meeting constraints
- Advantage: the next block can start computation as soon as
the previous block has finished
- Problem: when should the output be latched? When is the output
a correct value?
- Remedy: the system has to meet timing constraints
37Local signals
- Logical ordering and physical timing
- START, DONE: physical timing
- REQUEST, ACKNOWLEDGE: logical ordering
38Self timed system
- The system generates its own timing signals
39Self timed system --Hand shake protocol
- Handshaking: synchronize by mutual agreement
- Advantages: timing signals are generated locally, giving less propagation
delay, high speed, and no clock routing
- Disadvantage: handshaking circuit design
40Implementation of HS protocol: 2-phase
41 4-phase protocol
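The difference between the two handshake styles is easiest to see as an event trace. The following is a behavioural sketch only, with a sender and receiver collapsed into one loop; the event names (`req+`, `ack-`, and so on) are notation chosen for this example. It assumes the usual definitions: in the 4-phase (return-to-zero) protocol both signals return to 0 after every transfer, and in the 2-phase protocol every transition on a wire is an event.

```python
def four_phase(words):
    """Return-to-zero handshake: four events per transferred word."""
    req = ack = 0
    events, received = [], []
    for data in words:
        req = 1; events.append("req+")   # sender: data is valid
        received.append(data)            # receiver captures the data
        ack = 1; events.append("ack+")   # receiver: data accepted
        req = 0; events.append("req-")   # sender: return to zero
        ack = 0; events.append("ack-")   # receiver: ready for the next word
    return received, events


def two_phase(words):
    """Transition signalling: two events per transferred word."""
    req = ack = 0
    events, received = [], []
    for data in words:
        req ^= 1; events.append(f"req->{req}")   # any req transition: data valid
        received.append(data)
        ack ^= 1; events.append(f"ack->{ack}")   # any ack transition: accepted
    return received, events


print(four_phase(["a", "b"])[1])
print(two_phase(["a", "b"])[1])
```

The 2-phase version needs half as many transitions per word, at the cost of logic that must respond to both rising and falling edges.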
42Dual rail protocol
- 1 bit of information is coded using two wires
- Request is merged with the data wires
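A minimal sketch of dual-rail coding follows. It assumes one common convention, a (true rail, false rail) pair per bit with (0, 0) as the empty "null" state and (1, 1) unused; the slides do not specify the encoding, so the rail order and the helper names are assumptions. Because every valid bit drives exactly one of its two wires, the receiver can detect that data has arrived without a separate request wire.

```python
NULL = (0, 0)  # spacer: no data on this bit yet


def encode(bits):
    """Map each bit to a (true_rail, false_rail) pair."""
    return [(1, 0) if b else (0, 1) for b in bits]


def is_complete(pairs):
    """Data is valid when every bit has exactly one rail asserted."""
    return all(t != f for t, f in pairs)


def decode(pairs):
    if not is_complete(pairs):
        raise ValueError("word is not yet valid")
    return [t for t, _ in pairs]


word = encode([1, 0, 1])
print(word)                            # [(1, 0), (0, 1), (1, 0)]
print(decode(word))                    # [1, 0, 1]
print(is_complete([(1, 0), NULL]))     # False: second bit has not arrived
```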
43Bundled data protocol
44Event Logic The Muller-C Element
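The behaviour of a two-input Muller C-element can be sketched as a small state-holding model: the output follows the inputs when they agree and holds its previous value when they differ. The class name and the initial output value are choices made for this example.

```python
class MullerC:
    """Two-input Muller C-element (behavioural model)."""

    def __init__(self, out=0):
        self.out = out

    def update(self, a, b):
        if a == b:        # both inputs agree: output copies them
            self.out = a
        return self.out   # inputs differ: output holds its last value


c = MullerC()
inputs = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print([c.update(a, b) for a, b in inputs])   # [0, 0, 1, 1, 0]
```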
45 4-Phase bundled data Protocol--FIFO
46 2-Phase bundled data Protocol--FIFO
48 4-Phase dual rail Protocol--FIFO
49 2-Phase dual rail Protocol--FIFO
[Figure: signals Ack, Done / Req, start, data]
50[Figure: signals Ack, Done / Req, start, data]
54Completion Signal Generation: no glitches
55Completion Signal in DCVSL
56Self-Timed Adder--example
57Bundled data protocol
58Memory element design
59PERFORMANCE PARAMETERS
- CLOCK LOAD
- NO OF TRANSISTORS
- CLOCKING SCHEME
60Latch versus Register
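- Latch
- stores data when the clock is low (or high, depending on the latch polarity)
- Register
- stores data when the clock rises
[Figure: latch and register symbols with D, Q, and Clk, and the corresponding D and Q waveforms against Clk]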
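The difference between the two elements can be shown with a behavioural sketch. It assumes a positive latch (transparent while the clock is 1, holding while it is 0) and a positive edge-triggered register, both stepped through the same sampled waveform; the class names and the waveform are made up for the example.

```python
class Latch:
    """Positive latch: transparent while clk = 1, holds while clk = 0."""

    def __init__(self):
        self.q = 0

    def step(self, clk, d):
        if clk:
            self.q = d
        return self.q


class Register:
    """Positive edge-triggered register: samples d only when clk rises."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def step(self, clk, d):
        if clk and not self._prev_clk:
            self.q = d
        self._prev_clk = clk
        return self.q


clk = [0, 1, 1, 0, 0, 1]
d = [1, 1, 0, 1, 0, 0]
latch, reg = Latch(), Register()
print([latch.step(c, x) for c, x in zip(clk, d)])   # [0, 1, 0, 0, 0, 0]
print([reg.step(c, x) for c, x in zip(clk, d)])     # [0, 1, 1, 1, 1, 0]
```

In the third sample the latch output follows the change on D because the clock is still high, while the register keeps the value it captured at the rising edge.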
61Storage Mechanisms
Dynamic (charge-based)
Static
[Figure: static and dynamic (charge-based) storage cells with D, Q, and CLK]
62Static Mux-Based Latch (1)
For a positive latch, $Q = \overline{CLK} \cdot Q + CLK \cdot D$
Clock load 4, two-phase clocking, 10 transistors
63Mux-Based Latch (2): less clock load
Clock load 2, two-phase clocking, 6 transistors
64Mux-Based Latch (3): less clock load, Vt degradation
Non-overlapping clocks
NMOS only
65Master-Slave (Edge-Triggered) Register
Two opposite latches trigger on the clock edge. Also called a
master-slave latch pair
66Master-Slave Register
Multiplexer-based latch pair
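The way two opposite mux-based latches form an edge-triggered register can be sketched behaviourally. The example assumes the standard mux latch equation (output equals D while transparent, otherwise its own previous value), a master that is transparent while the clock is 0, and a slave that is transparent while the clock is 1. Delays and clock overlap are ignored, and the names are chosen for this example.

```python
def mux_latch(clk, d, q):
    """Positive mux latch: Q = (not CLK) * Q + CLK * D."""
    return ((1 - clk) & q) | (clk & d)


class MasterSlaveRegister:
    def __init__(self):
        self.qm = 0   # master output
        self.q = 0    # slave output (register output)

    def step(self, clk, d):
        # Master is a negative latch: transparent while clk = 0.
        self.qm = mux_latch(1 - clk, d, self.qm)
        # Slave is a positive latch: transparent while clk = 1.
        self.q = mux_latch(clk, self.qm, self.q)
        return self.q


clk = [0, 1, 1, 0, 0, 1]
d = [1, 1, 0, 1, 0, 0]
ms = MasterSlaveRegister()
print([ms.step(c, x) for c, x in zip(clk, d)])   # [0, 1, 1, 1, 1, 0]
```

The output only changes in the samples where the clock has just risen, which is the edge-triggered behaviour the latch pair is meant to provide.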
67TIMING METRICS
- $t_{su} = t_{I1} + t_{T1} + t_{I3} + t_{I2}$
- $t_{c-q} = t_{T3} + t_{I6}$
- $t_{hold} = 0$
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION
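68Reduced Clock Load Master-Slave Register: SIZING
IMPORTANT, REVERSE CONDUCTION
[Figure: reduced clock load master-slave register built from transmission gates and inverters, with output Q, clocked by CLK and its complement]
I2 MUST BE WEAK WHEN SLAVE IS ON (REVERSE
CONDUCTION)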
69TIMING METRICS
- $t_{su} = t_{T1} + t_{I1}$
- $t_{c-q} = t_{T2} + t_{I3}$
- $t_{hold} = 0$ (or $t_{T1}$)
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION
70Avoiding Clock Overlap
(a) Schematic diagram
[Figure: clock waveforms CLK and its complement]
(b) Overlapping clock pairs
71Non overlapping phases
72TIMING METRICS
- $t_{su} = t_{T1} + t_{I1}$
- $t_{c-q} = t_{T2} + t_{I3}$
- $t_{hold} = 0$ (or $t_{T1}$)
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION
73Overpowering the Feedback Loop -Cross-Coupled
Pairs
NOR-based set-reset
74Cross-Coupled NAND
Added clock
Cross-coupled NANDs
This is no longer used in datapaths, but it is a
basic memory building cell
75Dynamic registers
76TIMING METRICS
- $t_{su} = t_{T1}$
- $t_{c-q} = t_{I1} + t_{T2} + t_{I2}$
- $t_{hold} = 0$ (or $t_{T1}$)
- EXACT VALUES CAN BE OBTAINED THROUGH SIMULATION
- IN OVERLAP--
77OVERLAPS
78Other Latches/Registers C2MOS
Keepers can be added to make circuit
pseudo-static
79Insensitive to Clock-Overlap
[Figure: C2MOS register with transistors M1 to M8 between VDD and ground, input D, internal node X, and output Q, shown for two cases]
(a) (0-0) overlap
(b) (1-1) overlap
80Dual edge registers
81Single phase clock Latches/Registers TSPC
Negative latch (transparent when CLK = 0)
Positive latch (transparent when CLK = 1)
82Including Logic in TSPC
Example logic inside the latch
AND latch
83Reduced complexity
84TSPC Register
85Pulse-Triggered Latches: An Alternative Approach
Ways to design an edge-triggered sequential cell
Master-Slave Latches
Pulse-Triggered Latch
[Figure: master-slave pair of latches L1 and L2 (Data, D, Q, Clk) next to a single pulse-triggered latch L (Data, D, Q, Clk)]
86Pulsed register-avoid race, single latch
87Pulsed Latches
88Sense amplifier based register