Title: Circuit
1Circuit Signaling Techniques for On-chip
Interconnects
- Atul Maheshwari
- Dept. of Electrical and Computer Engineering
- University of Massachusetts Amherst
- Advisor: Prof. Wayne Burleson
Funded by SRC under contracts 766 and 1075
2Outline
- Motivation
- Existing solutions
- Test chip
- Proposed Solutions
- Current-sensing: Differential, Single-ended
- Transition encoded and current-pulse
- Hybrids based on current-sensing and repeaters
- Phase coding: Open Loop, Closed Loop
3Interconnect and device scaling
Courtesy ITRS Roadmap 01
4Research in Interconnect Design
- Materials: Copper wire, low-k dielectrics
- Geometry: Wire width, Height, Layers, Shields/Returns
- Circuits: Repeaters, Boosters, Sense-Amps
- Signaling technique: Differential, Low-swing, Dynamic, Transition/Spatial encoding, Asynchronous, Multi-level, Wave pipelining
5Existing Solutions
- Dynamic bus: Implement repeaters as dynamic/domino gates
- Low Swing bus: Use VDDL for the driver/repeaters
6Limitations of Existing Techniques
- Repeaters / Buffers
  - Placement constraints
  - Source of power and noise
  - Scaling issues
- Dynamic bus
  - Increased power dissipation
  - Noise sensitivity
- Low Swing
  - Significant performance penalty
  - Increased noise susceptibility
- Boosters
  - Process variations might wipe out any gains
  - Reduced noise margin
7Contributions
- Novel techniques for on-chip interconnect
- Signaling in current: Current-sensing
  - Solved several issues with initially proposed current-sensing
  - Faster and lower power than conventionally used repeaters
- Signaling in time: Phase coding
  - Transmit multiple bits on a wire to improve bandwidth and reduce power
  - A process-variation-tolerant design based on a DLL
- Interconnect test-chip in 0.18µ CMOS
- Publications and Patents
8Chronology
- Masters Thesis
  - Differential Current-sensing
- PhD proposal
  - In-depth analysis and solution to static power dissipation
  - Hybrid circuits
  - Phase coded signaling
- PhD Defense
  - Test-chip design and measurements
  - Single-ended current-sensing and transition encoded current-sensing
  - Analysis in 90nm CMOS process (at Intel)
9Die photograph
[Figure: die photograph, 3.3 mm x 2.7 mm, with labeled blocks: Repeaters, Current-pulse Signaling, Noise, Phase coding (Open Loop), Hybrid Circuit, Differential Current-sensing, Phase coding (Closed Loop)]
10Measurement Setup
11Close-up of the probes
12Current-sensing
- Sense current instead of voltage
- Avoids charging and discharging wire capacitance
[Figure: schematic of the driver, initial cascade and current sense amp; labels: transistors M1-M6 and M8, signals IN, OUT and EQ, supplies VDD and VSS]
A. Maheshwari and W. Burleson, "Current-sensing
methods for global interconnects in VDSM CMOS",
ISVLSI, April 2001, pp. 66-70
13Improved Receiver (DCSRx)
[Figure: DCSRx receiver schematic; labels: transistors M1-M6 and M9-M16, I2, I3, signals IN, OUT, OUTR, OUTS and CLK, supplies VDD and VSS]
14Delay Comparison
20% delay benefit, 42% for 2x minimum width wires
15Performance Comparison with Scaling
[Figure: repeaters vs. current-sensing with scaling, plotted for 1-, 3-, 5-, 7- and 9-cycle wires]
A. Maheshwari, S. Srinivasaraghavan and W. Burleson, "Quantifying the impact of current-sensing on interconnect delay trends", ASIC/SOC, Sept 2002
16Power dissipation
10% input data activity
17Measured Power v/s Activity Factor
18Single-ended, static-power-free current-sensing receiver
19Delay comparison (Simulated)
50% delay benefit on average in Intel 90nm (wires with 2x min. width)
20Static power in current-sensing
[Figure: current-sensing schematic; labels: transistors M1-M6 and M8, signals IN, OUT and EQ, supplies VDD and VSS]
- Major source of static power is from driver to receiver
- For short wires (1-5mm), 60% or more of total power
21Transition Encoded Current-Sensing
- Send current only when there is a transition
- Hold the bus at GND otherwise
- Encoder and decoder overhead
[Figure: Transition Encoded Current-sensing; blocks: Encoder, Driver and wire, Current sense Amp, Decoder; labels: IN, OUT, D, Q, CLK]
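The encode/decode idea on this slide can be sketched in a few lines. This is a minimal behavioral model, not the circuit from the test chip: it assumes the encoder is an XOR of the current and previous data bit and the decoder is a toggle element, and the function names are hypothetical.

```python
def transition_encode(bits, initial=0):
    """Emit 1 only in cycles where the data bit differs from the previous bit."""
    prev = initial
    encoded = []
    for b in bits:
        encoded.append(b ^ prev)  # 1 means "send current", 0 means "hold bus at GND"
        prev = b
    return encoded


def transition_decode(pulses, initial=0):
    """Rebuild the data by toggling the output whenever a 1 is received."""
    state = initial
    decoded = []
    for p in pulses:
        state ^= p
        decoded.append(state)
    return decoded


data = [0, 1, 1, 0, 0, 0, 1]
line = transition_encode(data)
print(line)                      # [0, 1, 0, 1, 0, 0, 1]
print(transition_decode(line))   # [0, 1, 1, 0, 0, 0, 1]
```

In this model the line is active only in the cycles where the data changes, which is why static current would flow for a fraction of cycles equal to the data activity.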
22Current-pulse signaling
- Send a current pulse when data changes
- Extremely energy efficient
- Noise sensitivity
[Figure: current-pulse signaling; blocks: Encoder, Driver and wire, Current sense Amp, Decoder; labels: IN, OUT, CLK, Del]
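A rough time-step model of "send a current pulse when data changes" is given below. It assumes the pulse is formed by XOR-ing the input with a copy delayed by the Del element, and that the receiver toggles its output on each pulse; both are assumptions for illustration, and the names are hypothetical.

```python
def pulse_encode(samples, delay_steps, initial=0):
    """XOR the input with a copy delayed by delay_steps time steps."""
    pulses = []
    for t, x in enumerate(samples):
        delayed = samples[t - delay_steps] if t >= delay_steps else initial
        pulses.append(x ^ delayed)  # high for delay_steps steps after each data change
    return pulses


def pulse_decode(pulses, initial=0):
    """Toggle the output on every rising edge of the pulse train."""
    state, prev = initial, 0
    out = []
    for p in pulses:
        if p and not prev:
            state ^= 1
        prev = p
        out.append(state)
    return out


samples = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]   # oversampled input waveform
pulses = pulse_encode(samples, delay_steps=2)
print(pulses)                 # [0, 0, 1, 1, 0, 0, 1, 1, 0, 0]
print(pulse_decode(pulses))   # [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
```

The model only works when the data holds each value for longer than the delay; a missed or spurious pulse flips every later output, which is one way to read the noise-sensitivity bullet.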
23Power Comparison (Measured)
46% less than repeater insertion at 10% data activity
24Delay Comparison (Measured)
12% faster than Differential Current-sensing; 30% faster than Repeaters
25Hybrids: Exploit the benefits of both Repeaters and Current-sensing
[Figure: hybrid link from IN to OUT; wire segments LR1 and LR2 (uniform repeater insertion) and LC (current-sensing, e.g. DCS); receiver labels: transistors M1-M6 and M8, EQ, VDD, VSS]
How much of the wire should be driven by repeaters?
26Delay for Hybrid with 25% current-sensing
10% faster than Differential Current-sensing; 27% faster than Repeaters
27Application of Hybrids: A microprocessor cache bus
[Figure: cache bus with two 2mm wire segments; blocks: clocked Latches, SCSRx, Cache]
- 20% performance benefit over dynamic bus in Intel 90nm CMOS technology
- Reduction of driver size leads to 35% power savings
- Bus is 256/512 bits wide: large implications
28Noise Analysis
- Repeaters flanked by Repeaters
  - Increase in delay: 18%
- Differential Current-sensing flanked by Differential Current-sensing
  - Increase in delay: 16%
- Differential Current-sensing flanked by Repeaters
  - Increase in delay: 35%
- Current-pulse signaling with current-pulse signaling as neighbor
  - Decrease in delay: 2%
29Summary: Current-sensing
- Differential current-sensing is a faster circuit technique for on-chip interconnects
- Single-ended current-sensing is faster than differential current-sensing
- Current-pulse signaling and transition encoded signaling eliminate static power dissipation
- Current-sensing/repeater hybrids provide a solution for placement-constrained repeaters
30Die photograph
[Figure: die photograph with labeled blocks: Repeaters, Current-pulse Signaling, Noise, Phase coding (Open Loop), Hybrid Circuit, Differential Current-sensing, Phase coding (Closed Loop)]
31Phase Coding
- On chip communication getting costlier and slower
- Need for higher bandwidth buses
- Critical to power and performance
- A possible solution: send multiple bits
- Transmitting multiple bits in one transition
- Significant power and area savings
- Increased bandwidth
- Phase coding: phase determines the data
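A behavioral sketch of "phase determines the data" follows. It assumes the transmitter delays a single transition by an integer number of equal time slots, and the receiver samples the wire at delayed copies of the reference placed midway between the nominal edge positions. The slot width, the sampling positions and the function names are illustrative assumptions, not values from the test chip.

```python
def phase_encode(value, bits_per_wire, slot_ps):
    """Return the edge time (ps after the REF edge) that represents value."""
    levels = 1 << bits_per_wire
    if not 0 <= value < levels:
        raise ValueError("value does not fit in bits_per_wire")
    return value * slot_ps


def phase_decode(edge_ps, bits_per_wire, slot_ps):
    """Count the sampling points that fire before the data edge arrives."""
    levels = 1 << bits_per_wire
    # One sampling point between each pair of neighboring edge positions.
    taps = [(k + 0.5) * slot_ps for k in range(levels - 1)]
    # Each tap captures 1 if the edge has already arrived, else 0.
    captured = [edge_ps <= tap for tap in taps]
    return captured.count(False)


# Two bits on one wire: four possible edge positions, 125 ps apart.
for value in range(4):
    edge = phase_encode(value, bits_per_wire=2, slot_ps=125)
    assert phase_decode(edge, bits_per_wire=2, slot_ps=125) == value

# A timing error larger than half a slot is decoded as the wrong value.
print(phase_decode(125 + 50, 2, 125))   # 1 (still correct)
print(phase_decode(125 + 70, 2, 125))   # 2 (error)
```

In this model one transition carries `bits_per_wire` bits, and the tolerable timing error is half a slot.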
32Phase Coding Taxonomy
33Encoder
[Figure: encoder schematic; REF and a chain of blocks labeled D]
34Decoder
[Figure: decoder schematic; labels: REF, Received Signal, blocks labeled D and Q, Decoder Output]
35Open Loop Phase Coding
[Figure: open-loop phase coding; REF drives a Mux and Decoder pair per wire; labels: IN0N, OUT0N]
- Delay elements can be shared across wires
- Supply noise, Process variation etc. can result
in errors
36Measured Results: Open Loop
- 16-bit wide, 5mm long bus, 0.27µm wide, 0.27µm spacing, shielded, 1GHz
- Repeater insertion, Transition encoding used
- Encode in ½ cycle and use ½ cycle for decode
37DLL Based Design
- DLL provides better control over the phase generation
- The closed-loop architecture allows for reliable phase generation and detection
- DLL is relatively insensitive to parametric and other variations
- Use of a DLL would allow for higher levels of encoding
38Delay Element
- Capacitive load used to achieve delay separation
- The number of loads connected determines the delay
39Reliable Phase Generation
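[Figure: DLL-based phase generator; a Phase Detector compares REF with DEL8, the output of a chain of eight blocks labeled D, and drives a 4 bit shift register through up, down and freeze signals]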
- Use the generated phases for encoding or decoding
- DLL part of the design can be shared by various
phase encoders/decoders
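The closed loop can be sketched as a small control loop. This is only a behavioral model under stated assumptions: the shift register is taken to select how many capacitive loads (0 to 4) are attached to each of eight identical delay elements, the loop locks the total delay to a target, and all delay values are hypothetical.

```python
def phase_detector(total_delay_ps, target_ps, window_ps):
    """Compare the delayed edge with the target and pick a control action."""
    error = total_delay_ps - target_ps
    if error < -window_ps:
        return "up"      # line too fast: connect one more load
    if error > window_ps:
        return "down"    # line too slow: disconnect one load
    return "freeze"      # within the lock window


def lock(base_ps, step_ps, target_ps, window_ps,
         stages=8, max_loads=4, max_iters=16):
    """Step the load count until the phase detector reports freeze."""
    loads = 0
    for _ in range(max_iters):
        total = stages * (base_ps + loads * step_ps)
        action = phase_detector(total, target_ps, window_ps)
        if action == "freeze":
            return loads, total
        if action == "up" and loads < max_loads:
            loads += 1
        elif action == "down" and loads > 0:
            loads -= 1
        else:
            break  # control range exhausted
    raise RuntimeError("DLL could not lock within its control range")


# Hypothetical typical corner: 50 ps per element, 4 ps per load.
print(lock(base_ps=50, step_ps=4, target_ps=500, window_ps=8))   # (3, 496)
# Hypothetical slow corner: fewer loads give the same total delay.
print(lock(base_ps=58, step_ps=4, target_ps=500, window_ps=8))   # (1, 496)
```

In both corners the loop settles on the same total delay with a different load count, which is the sense in which a closed loop tolerates variation in this model. If the step is too coarse for the lock window, the sketch raises an error instead of locking.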
40Measured Results: Closed Loop
- 16-bit wide, 5mm long bus, 0.27µm wide, 0.27µm spacing, shielded, 1GHz
- Repeater insertion, Transition encoding used
- Encode in ½ cycle and use ½ cycle for decode
41Summary: Phase Coding
- Encoding levels of up to 4 bits/wire explored
- Higher power efficiency for wider buses
- DLL portion at receiver and transmitter can be shared
- Combine with other signaling techniques to increase power/delay benefits
- Needs multiple cycles to transmit data
- Process variation may limit the level of encoding
- Higher levels of encoding can be obtained for slower clock rates
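The last two bullets can be made concrete with a little arithmetic. The sketch below assumes the possible edge positions are spread evenly over the half cycle used for encoding, so n bits per wire need 2^n slots; the even spacing and the function name are assumptions for illustration.

```python
def slot_width_ps(clock_ghz, bits_per_wire, encode_fraction=0.5):
    """Time available per phase slot when 2**bits_per_wire slots share the encode window."""
    period_ps = 1000.0 / clock_ghz
    window_ps = period_ps * encode_fraction   # half the cycle is used for encoding
    return window_ps / (1 << bits_per_wire)


for bits in range(1, 5):
    print(bits, slot_width_ps(1.0, bits))
# 1 250.0
# 2 125.0
# 3 62.5
# 4 31.25

print(slot_width_ps(0.5, 4))   # 62.5
```

Under these assumptions, 4 bits/wire at 1GHz leaves about 31 ps per slot, so delay variation of that order would cause decoding errors. Halving the clock rate doubles the slot width, which matches the observation that slower clocks permit higher levels of encoding.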
42Future Work
- Interconnect Synthesis
- Combine current-sensing and phase coded signaling
- Efficient coding scheme for phase coding
- Analytical models for current-sensing
43Thanks
44Publications
- A. Maheshwari and W. Burleson, "Current-sensing methods for global interconnects in Very Deep Submicron (VDSM) CMOS", Proceedings of the IEEE Annual Workshop on VLSI, April 2001, pp. 66-70.
- A. Maheshwari and W. Burleson, "Current-sensing for Global Interconnects, Secondary Design Issues Analysis and Solutions", Proceedings of International Workshop on Power and Timing Modeling, Optimization and Simulation, September 2001, pp. 4.4.1-4.4.10.
- A. Maheshwari, S. Srinivasaraghavan and W. Burleson, "Quantifying the impact of current-sensing on interconnect delay trends", Proceedings of the IEEE ASIC/SOC conference, September 2002, pp. 461-465.
- A. Maheshwari and W. Burleson, "Repeater and Current-sensing hybrid circuits for on-chip interconnects", Proceedings of Great Lakes Symposium on Very Large Scale Integration Circuits, April 2003, pp. 269-272.
- A. Maheshwari and W. Burleson, "Differential current-sensing for on-chip interconnects", IEEE Transactions on VLSI Systems (TVLSI), accepted for publication.
- V. Venkatraman, A. Maheshwari and W. Burleson, "Mitigating Static power dissipation in current-sensed Interconnects", Proceedings of Great Lakes Symposium on VLSI, April 2004, pp. 224-229.
- A. Maheshwari and W. Burleson, "Current-pulse signaling for on-chip interconnects", to be submitted to ISSCC (under preparation).
- A. Maheshwari and W. Burleson, "Phase coded signaling for low power on-chip bus", to be submitted to ISSCC (under preparation).
- Several SRC reports.
45Test-chip
- 0.18µ TSMC, 6-Metal through MOSIS
- Techniques to be tested
- Repeater insertion
- Differential current-sensing
- Repeater Current-sensing Hybrids
- Current-pulse signaling
- Phase coding
- Open loop
- Closed loop
- Noise Analysis
- Current status: In fab, expected 1st June 2004
46Preliminary Comparison
- Repeaters
- Noise immune
- Consume less power than current-sensing
- Faster for extremely long wires
- Performance degradation with scaling
- Sources of noise
- Process Variation
- Current-sensing
- Faster than repeaters for single cycle wires
- Performs better as technology scales
- Static power
- Differential-signaling
- Susceptible to noise
47[Figure: test-chip blocks labeled PHCOD Closed loop, Hybrids, DCS, PHCOD Open loop, NOISE, Current Pulse, Repeaters]
48Delay comparison
49Delay comparison for 1 clock cycle
[Figure: delay of Repeaters vs. Current-sensing]
20% delay benefit on average; 45% delay benefit for wires with 2x width
50Power Comparison
Extremely power inefficient at 10% data activity due to static power
51Energy Consumption for single ended
current-sensing
[Figure: energy consumption of Dynamic bus (2x min width), Current-sensing (min width) and Current-sensing (2x min width); annotated values: 52 and 75]
52TECS Delay
30% delay benefit on average over dynamic bus; 50% delay benefit on average over transition encoded dynamic bus
53TECS Energy Consumption
70% energy savings at 10% data activity
54Static Power in Current-sensing
55Hybrid Delay
A. Maheshwari and W. Burleson, "Repeater and Current-sensing hybrid circuits for on-chip interconnects", GLSVLSI, April 2003
56Hybrids for Placement Constrained Wires
[Figure: three wiring cases between Block A and Block B, including a region marked "No devices allowed" and a case labeled Current-sensing]