Title: A Novel Clock Distribution and Dynamic Deskewing Methodology
1A Novel Clock Distribution and Dynamic De-skewing
Methodology
- Arjun Kapoor, University of Colorado at Boulder
- Nikhil Jayakumar, Texas A&M University, College Station
- Sunil P. Khatri, Texas A&M University, College Station
2Introduction
- Clock distribution is critical in ICs.
- In typical ICs, the clock is distributed to several sites on the IC from one central clock signal.
- The requirement is to minimize skew between these sites.
- One of the available networks: the H-tree.
- Zero skew without considering process variations.
- With diminishing feature size and increasing die size, intra-die variations lead to increased skew across a die.
3Previous Approaches Hierarchical H-tree De-skew
- Phase detectors are located on the domain boundaries of each leg of the H-tree.
- Possible worst-case skew between 2 neighboring leaves can be as high as (2n1)D, where:
- D = guardband of the phase detector
- n = number of levels
- "A Design for Digital Dynamic Clock Deskew," Dike et al.
4Previous Approaches Mesh Deskew
- Phase detectors are used between each pair of leaf nodes of the H-tree.
- Clock skew between neighboring leaves is now D (guardband of the phase detector).
- Clock skew across the die is still high:
- mD between any 2 leaf nodes, where m = number of phase detectors between the 2 leaf nodes.
- "A Design for Digital Dynamic Clock Deskew," Dike et al.
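The mD bound for the mesh scheme is simple arithmetic, and a minimal sketch makes the growth across the die concrete. The function name and the 10 ps guardband are illustrative assumptions, not values from the slides.

```python
def mesh_worst_case_skew(m: int, guardband: float) -> float:
    """Worst-case skew between two leaves separated by m phase detectors.

    Each detector can leave up to one guardband D of residual skew,
    so the bound across m detectors is m * D.
    """
    if m < 0:
        raise ValueError("m must be non-negative")
    return m * guardband


# Assumed guardband of 10 ps (hypothetical, not from the slides).
D_PS = 10.0
print(mesh_worst_case_skew(1, D_PS))  # neighboring leaves
print(mesh_worst_case_skew(7, D_PS))  # leaves seven detectors apart
```

The bound grows linearly with the number of detectors between the two leaves, which is why skew across the die stays high even though neighboring leaves are tightly matched.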
5Our Approach
- Clock signal is returned from leaf nodes.
- Single phase detector at center of tree.
- All returned clock signals are compared with the same delayed reference signal.
- De-skewing can be done at boot-up time or dynamically during free cycles.
6Our Approach
- Use a modified buffered H-tree.
- Have buffers at each level.
- Not typically done due to process variation in buffers.
- Wire width sizing is reversed.
- Typical H-tree: width decreases with level.
- Our H-tree: width increases with level to make sure the buffer at each level sees the same load.
- We utilize clock shield wires and one phase detector.
7Network Topology
- Clock is assumed to be routed on metal 6.
- A typical H-tree requires a clock wire and 2 shield wires, one on either side.
- We use an additional return wire of the same width as the clock wire.
8The H-Tree
- Each section of the H-tree has tri-stateable inverters in both the forward and return clock networks.
- Forward network: always ON.
- Return network: only the sections on the path being deskewed are turned ON.
9Wire Widths
Sizes (in microns) derived for a 20 mm x 20 mm die; 1 GHz targeted clock frequency.
- Traditional H-tree: wire widths are larger at the center and narrower near the leaf nodes, as necessary to ensure clean signals at the leaf nodes.
- Our H-tree: wire widths are larger near the leaf nodes and narrower at the center, to ensure each buffer sees the same load.
10Deskewing Operation
- We use only one phase detector, unlike previous deskewing methods.
- The clock signal returned from each node is compared with a single reference signal.
- Single phase detector at the chip center.
- The largest skew (after deskewing) between any 2 nodes is not a function of the phase detector, so phase detector accuracy/guardband is unimportant.
- The required delay is achieved using a tune-able capacitor bank.
11Deskewing Operation
- Deskewing is performed at a slower clock rate.
- A slower clock is required for the phase detector to work.
- Minimize cross-talk.
- When the clock signal returns on the return path, the forward path should be stable.
- Ensure that half the time period of the clock > round-trip delay of the clock signal.
- The return path is grounded (acts as a shield) during non-deskew mode.
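The cross-talk condition above (half the clock period must exceed the round-trip delay) translates directly into an upper bound on the deskew-mode clock frequency. The sketch below shows that arithmetic; the function names and the 4 ns round-trip delay are assumptions for illustration, not figures from the slides.

```python
def max_deskew_clock_hz(round_trip_delay_s: float) -> float:
    """Upper bound on the deskew-mode clock frequency.

    The constraint T/2 > t_round_trip is equivalent to
    f < 1 / (2 * t_round_trip).
    """
    if round_trip_delay_s <= 0:
        raise ValueError("round-trip delay must be positive")
    return 1.0 / (2.0 * round_trip_delay_s)


def forward_path_is_stable(clock_hz: float, round_trip_delay_s: float) -> bool:
    """True if the forward path has settled before the clock edge returns."""
    half_period_s = 0.5 / clock_hz
    return half_period_s > round_trip_delay_s


# Assumed round-trip delay of 4 ns (hypothetical value).
T_ROUND_TRIP_S = 4e-9
print(max_deskew_clock_hz(T_ROUND_TRIP_S))
print(forward_path_is_stable(100e6, T_ROUND_TRIP_S))  # slow deskew clock
print(forward_path_is_stable(1e9, T_ROUND_TRIP_S))    # full-speed clock
```

Under this assumed delay, a 100 MHz deskew clock satisfies the condition while the 1 GHz operating clock does not, which is consistent with deskewing at a reduced rate.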
12Tune-able Bank at Leaf Nodes
- Capacitors are binary weighted to facilitate precise control of delay.
- A resistor is added to increase the incremental delay per capacitor.
- The value of the resistor is chosen such that the slew rate of the last segment is not appreciably changed and the incremental delay is as desired.
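To illustrate how a binary-weighted bank maps a digital code to a delay, here is a minimal sketch. It assumes a first-order RC model (added delay of roughly ln 2 times R times C), which is a textbook approximation and not the authors' circuit model; the unit capacitance and resistance values are hypothetical.

```python
import math


def bank_capacitance(code: int, c_unit: float, bits: int = 7) -> float:
    """Total switched-in capacitance for a binary-weighted bank.

    Bit i of the code switches in a capacitor of value 2**i * c_unit.
    """
    if not 0 <= code < (1 << bits):
        raise ValueError("code out of range for the bank width")
    return sum(((code >> i) & 1) * (1 << i) * c_unit for i in range(bits))


def added_delay(code: int, r_series: float, c_unit: float, bits: int = 7) -> float:
    """First-order estimate of the added delay (assumed model: ln2 * R * C)."""
    return math.log(2) * r_series * bank_capacitance(code, c_unit, bits)


# Hypothetical component values: 1 fF unit capacitor, 1 kOhm series resistor.
C_UNIT_F = 1e-15
R_SERIES_OHM = 1e3
for code in (0, 1, 64, 127):
    print(code, added_delay(code, R_SERIES_OHM, C_UNIT_F))
```

In this model the delay step per code increment is proportional to R times the unit capacitance, so a larger resistor gives a larger incremental delay per capacitor, matching the role the slide gives to the resistor.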
13The Phase detector
- Condition LAG: O is low at T1 and high at T2 -> A lags B, phase detector not tripped.
- The phase detector is said to be tripped when condition LAG does not hold.
- Delay is incrementally increased till the LAG condition FAILS to hold (phase detector trips).
- The guardband of the phase detector is hence unimportant.
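The tuning loop described above can be sketched as a simple search: step the capacitor code up until the detector trips. The toy detector model below (a fixed trip point shared by all leaves, a uniform delay step, and the example arrival times) is an assumption made for illustration; it is not the authors' circuit.

```python
from typing import Callable, List


def tune_leaf(lag_holds: Callable[[int], bool], bits: int = 7) -> int:
    """Return the first capacitor code at which the LAG condition fails."""
    for code in range(1 << bits):
        if not lag_holds(code):
            return code  # phase detector tripped
    return (1 << bits) - 1  # tuning range exhausted; keep the maximum delay


def make_detector(leaf_arrival_ps: float, step_ps: float,
                  trip_point_ps: float) -> Callable[[int], bool]:
    """Toy detector: LAG holds while the delayed return clock is early."""
    def lag_holds(code: int) -> bool:
        return leaf_arrival_ps + code * step_ps < trip_point_ps
    return lag_holds


def residual_skew(arrivals_ps: List[float], step_ps: float,
                  trip_point_ps: float) -> float:
    """Skew left between leaves after each one is tuned independently."""
    tuned = []
    for arrival in arrivals_ps:
        code = tune_leaf(make_detector(arrival, step_ps, trip_point_ps))
        tuned.append(arrival + code * step_ps)
    return max(tuned) - min(tuned)


# Hypothetical leaf arrival times (ps), delay step, and detector trip points.
ARRIVALS_PS = [0.0, 121.0, 299.0]
print(residual_skew(ARRIVALS_PS, step_ps=3.0, trip_point_ps=330.0))
print(residual_skew(ARRIVALS_PS, step_ps=3.0, trip_point_ps=340.0))
```

Shifting the trip point moves every tuned leaf by about the same amount, so in this model the residual skew is bounded by the delay step rather than by the detector's guardband.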
14Communicating with Tune-able Banks and
Tri-stateable Buffers.
- Use a 2-wire serial communication scheme.
- Use shift registers at each tune-able bank and tristate-able buffer.
- At most 6 bits are required to address each tristate-able node of a 6-level H-tree network.
- 7 bits are required for a 7-bit capacitor bank.
- First assert the reset signal (derived from the signal wires), then send a 6-bit address (to address the correct capacitance bank and return path). Next send the 7-bit data (capacitance value).
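The address-then-data ordering above can be sketched as a 13-bit frame builder. The MSB-first bit order and the helper names are assumptions, since the slides do not specify them; the reset is treated as a separate event on the signal wires and is not part of the bit list.

```python
from typing import List

ADDRESS_BITS = 6  # enough for a 6-level H-tree, per the slide
DATA_BITS = 7     # width of the capacitor bank code


def to_bits(value: int, width: int) -> List[int]:
    """Convert an integer to a list of bits, MSB first (assumed order)."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    return [(value >> i) & 1 for i in reversed(range(width))]


def build_frame(address: int, cap_code: int) -> List[int]:
    """Bits sent after reset: 6-bit address followed by 7-bit data."""
    return to_bits(address, ADDRESS_BITS) + to_bits(cap_code, DATA_BITS)


# Hypothetical example: node address 5, capacitor code 65.
frame = build_frame(0b000101, 0b1000001)
print(frame, len(frame))
```

Each frame is 6 + 7 = 13 bits long, one bit per clock cycle of the serial link.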
15Addressing Mechanism
3-level H-tree
16m-bit Decoder to Address the Tristate-able Buffers
- Serial shift registers shift in m bits of the address (m is the level in the H-tree at which the tri-state buffer is located).
- Clocking is stopped by the last flip-flop.
- Combinational logic checks if the m bits in the shift register match the address of the tri-state buffer.
- A HIT signal is generated if all m bits are in and the address is a match.
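A behavioral sketch of the decoder described above follows. It models the shift register as a list and the "clocking stopped by last flip-flop" behavior as ignoring further bits once m bits are in; the class and method names are hypothetical, and the hardware details (for example how the last flip-flop gates the clock) are not modeled.

```python
from typing import List


class AddressDecoder:
    """Behavioral model of an m-bit serial address decoder."""

    def __init__(self, address_bits: List[int]) -> None:
        self.address = list(address_bits)  # this buffer's hard-wired address
        self.shifted: List[int] = []

    def reset(self) -> None:
        """Clear the shift register so a new address can be shifted in."""
        self.shifted = []

    @property
    def full(self) -> bool:
        """True once all m bits are in; further clocking is blocked."""
        return len(self.shifted) == len(self.address)

    def clock(self, bit: int) -> None:
        """Shift in one bit unless the register is already full."""
        if not self.full:
            self.shifted.append(bit)

    @property
    def hit(self) -> bool:
        """HIT: all m bits are in and they match this buffer's address."""
        return self.full and self.shifted == self.address


# Hypothetical level-3 buffer with address 1, 0, 1.
decoder = AddressDecoder([1, 0, 1])
for bit in (1, 0, 1, 0, 0):  # extra bits after the third are ignored
    decoder.clock(bit)
print(decoder.hit)
```

Because clocking stops after m bits, a buffer at level m only looks at the first m address bits and ignores whatever follows on the serial link.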
177-bit Decoder for Selecting Capacitance Value
- Data is shifted in serially (similar to the scheme used to address the tri-state buffers).
- The HIT signal from the decoder of the last tristate-able buffer produces a reset pulse.
- Clocking is stopped by the last flip-flop (released again only when the next HIT signal arrives).
18Overall Operation of the Serial Communication
Scheme
- Follow the sequence: serial reset, transmit address, transmit data.
- Each such sequence requires 13 clock cycles.
- Each leaf node requires at most 2^7 (for a 7-bit capacitor bank) such sequences.
- With deskew done at 100 MHz, a 6-level H-tree (64 leaf nodes) would be deskewed in about 1 ms.
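The 1 ms estimate can be checked with a short calculation. The sketch reads the per-leaf sequence count as 2^7 = 128 (one sequence per code of a 7-bit bank) and takes the leaf count as 2 to the power of the number of levels, matching the 64 leaves quoted for 6 levels; the function name and parameter names are illustrative.

```python
def deskew_time_s(levels: int = 6, addr_bits: int = 6, cap_bits: int = 7,
                  clock_hz: float = 100e6) -> float:
    """Worst-case time to deskew every leaf over the serial link."""
    leaves = 2 ** levels                     # 64 leaves for 6 levels
    cycles_per_sequence = addr_bits + cap_bits  # 13 cycles per sequence
    sequences_per_leaf = 2 ** cap_bits       # at most 128 codes to try
    total_cycles = leaves * sequences_per_leaf * cycles_per_sequence
    return total_cycles / clock_hz


# 64 * 128 * 13 = 106,496 cycles at 100 MHz.
print(deskew_time_s())
```

This gives roughly 1.06 ms, in line with the "about 1 ms" figure, and it is a worst case because a leaf stops stepping as soon as its phase detector trips.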
19Experimental Results
- Simulated process variations (tox, µ, leff, VT).
- Values as suggested by "Characterization and modelling of clock skew with process variations," Zarkesh-Ha et al.
20Experimental Results
- Compared against a traditional (non-buffered) H-tree with no deskew mechanism (operating at 1 GHz).
- 7.9% lower power in our network.
- Many small buffers used.
- Wire loads involved are smaller (the improvement would be higher for higher frequencies).
21Conclusions
- We have a novel clock distribution network with dynamic de-skewing capability.
- We can de-skew nodes that are skewed by 300 ps down to 3 ps.
- We do this with a 7.9% power reduction and 34% area overhead when compared to a traditional H-tree.
22Thank you.