Title: Huifang Qin, Kevin Cao, Prof. Jan Rabaey
1Low Voltage CMOS Design
- Huifang Qin, Kevin Cao, Prof. Jan Rabaey
- Berkeley Wireless Research Center
- 1/14/2002
2Outline
- Introduction: Go low voltage in digital CMOS design!
  - Power
  - Delay
  - Reliability
- Memory side: Results from SRAM leakage control test chip
  - Data Retention Voltage (DRV)
  - Timing overhead in mode switching
  - Leakage saving in low voltage standby
  - Low standby supply voltage generation
- Logic side: Modeling of CMOS circuit delay statistical distribution under low Vdd
  - Lognormal model fitted to inverter delay distribution
3Low Voltage CMOS Design Considerations
- Power: Power dissipation is becoming the main roadblock to further integration
  - E.g. at 1V, a 4KB 0.13um SRAM leaks 0.25mW
  -> The Piconode on-chip memory (32KB SRAM) will leak 2mW, while the power budget for a Piconode is 100uW
  -> Approximately 50x leakage saving needs to be achieved.
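The arithmetic behind these figures can be checked with a short sketch. It assumes leakage scales linearly with memory capacity, which the slide implies but does not state; the function name is illustrative only.

```python
def scaled_leakage_mw(leak_mw, size_kb, target_kb):
    """Scale a measured leakage figure to another memory size.

    Assumes leakage grows linearly with capacity (an assumption, not a
    measured result).
    """
    return leak_mw * target_kb / size_kb

total_mw = scaled_leakage_mw(0.25, size_kb=4, target_kb=32)   # 32KB on-chip SRAM
budget_mw = 0.1                                               # 100uW node budget
after_saving_uw = total_mw / 50 * 1000                        # with a 50x saving

print(f"32KB leakage at 1V: {total_mw:.2f} mW")
print(f"ratio to power budget: {total_mw / budget_mw:.0f}x")
print(f"leakage after 50x saving: {after_saving_uw:.0f} uW")
```

Under this assumption the raw leakage is 20 times the whole node budget, and a 50x saving would bring leakage down to 40uW, leaving the rest of the 100uW budget for active operation.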
4Low Voltage CMOS Design Considerations
- Delay: Also increasing
  - How about cutting it with reliable computation on unreliable platforms?
  - Proposed concept: computation in the ultra low voltage space
  - E.g. traditionally, a pipeline had to tolerate the maximum possible logic delay all the time. How about allowing it to fail with a 20% chance when the data path delay is too large, provided it is armed with a failure correction feature?
- Reliability: Important issues include
  - Memory data preservation under low supply voltage
  - Logic delay statistical distribution under low supply voltage
  - Implementation of the failure correction function on the unreliable platform
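A rough sketch of the trade-off in the pipeline example: clock the pipeline for the slowest path, or clock it faster and pay a penalty whenever a result arrives late. The delay population, the one-cycle replay penalty, and the assumption that a replay always succeeds are all hypothetical choices for illustration, not details from the slides.

```python
import random

def average_time_per_op(delays, clock_period, replay_cycles=1):
    """Average time per operation when late results are detected and replayed.

    An operation whose delay exceeds clock_period is assumed to be caught and
    corrected at a cost of replay_cycles extra clock periods, and the
    correction is assumed to always succeed (a hypothetical scheme).
    """
    failures = sum(1 for d in delays if d > clock_period)
    fail_rate = failures / len(delays)
    return clock_period * (1 + fail_rate * replay_cycles), fail_rate

random.seed(0)
# Hypothetical skewed delay population in arbitrary units, for illustration only.
delays = sorted(random.lognormvariate(0.0, 0.3) for _ in range(10_000))

worst_case_period = delays[-1]                    # tolerate the slowest path
relaxed_period = delays[int(0.8 * len(delays))]   # let roughly 20% of operations fail

t_worst, f_worst = average_time_per_op(delays, worst_case_period)
t_relaxed, f_relaxed = average_time_per_op(delays, relaxed_period)
print(f"worst-case clock: {t_worst:.3f} per op, failure rate {f_worst:.1%}")
print(f"relaxed clock:    {t_relaxed:.3f} per op, failure rate {f_relaxed:.1%}")
```

Whether the relaxed clock wins in practice depends on the real delay distribution and on the true cost of detecting and correcting a failure, which is why the delay statistics matter.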
5Existing Work and the Goal
- Research and prototypes on 0.5V circuits / systems are productive
  - "0.5V CMOS Logic Delivering 200 Million 8x8 Bit Multiplications/s", V. Dudek, R. Grube, B. Hofflinger, M. Schau, ISLPED98
  - "A 0.5V SIMOX-MTCMOS Circuit with 200ps Logic Gate", T. Douseki, S. Shigematsu, Y. Tanabe, M. Harada, H. Inokawa, T. Tsuchiya, ISSCC96
  - "0.5V 320MHz 8b Multiplexer / Demultiplexer Chips Based on a Gate Array", T. Hirota, K. Ueda, Y. Wada, K. Mashiko, H. Hamano, ISSCC98
  - "A 0.5V Power-Supply Scheme for Low Power LSIs using Multi-Vt SOI CMOS Technology", T. Fuse, A. Kameyama, M. Ohta, K. Ohuchi, 2001 Symposium on VLSI
  - "A 0.5V 200MHz 1-Stage 32b ALU using a Body Bias Controlled SOI Pass-Gate Logic", T. Fuse, Y. Oowaki et al., ISSCC97
- However, all data preservation in these 0.5V systems is still done in SRAM with a 1V supply
- Scale the SRAM supply to 0.5V as well, or even 300mV, with reliable data preservation?
- Then operate everything under 300mV with the unreliable platform design to cut delay.
6SRAM Chip and Testing
Smorgas board and SRAM chip in testing
7Switch Capacitor Converter Output
[Figure: switch capacitor converter schematic. Labels: 1V input, five capacitors C, Cp, R mem, CLK; clock phases: Charging Phase and Equalizing Phase]
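As a sanity check on the target voltage, the sketch below models an ideal divider in which five equal capacitors are charged in series from the 1V supply and then equalized in parallel. That topology is an assumption inferred from the five C labels and the two clock phases; the schematic itself is not recoverable from the text. The load resistance and capacitance values are hypothetical.

```python
import math

def ideal_output_voltage(v_in, n_caps):
    """Series-charge, parallel-equalize divider: each capacitor holds v_in / n_caps."""
    return v_in / n_caps

def droop_after(v_start, r_load, c_total, t):
    """Voltage left on c_total after discharging into r_load for time t (simple RC decay)."""
    return v_start * math.exp(-t / (r_load * c_total))

v_out = ideal_output_voltage(1.0, 5)
# Hypothetical load and capacitor values, chosen only to exercise the formula.
v_end = droop_after(v_out, r_load=10e3, c_total=5e-9, t=1e-6)
print(f"ideal output: {v_out * 1e3:.0f} mV, after 1 us of load: {v_end * 1e3:.1f} mV")
```

The ideal result of 200mV matches the standby supply used in the dual supply scheme; parasitics and the memory load would pull the real output below that value.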
8Switch Capacitor Converter Output
9Dual Supply Scheme 1V / 200mV
10Supply Switching Timing Overhead
Normal operation Vdd = 1V
Standby Vdd = 200mV
Tvdd_up = 17.3ns (vs. simulation result of 7ns)
Tvdd_down = 2.5us
- 4K byte 0.13um SRAM with a 100um wide switch
- The Vdd ramp-down time is generally not an issue in system design, and the roughly 20ns Vdd ramp-up time is still acceptable for a 50ns clock period (20MHz).
11State 0 Preservation
- DRV for this SRAM cell: 270mV
  - At 275mV, state 0 is preserved
  - At 265mV, state 0 is lost (the state flips to 1 when Vdd returns to 1V)
- The DRV from simulation is 100mV; the reason for this large discrepancy is still under investigation
Vthn = 430mV, Vthp = 340mV
12State 1 Preservation
- DRV for this SRAM cell: 200mV
  - At 205mV, state 1 is preserved
  - At 195mV, state 1 is lost (the cell state stays 0 when Vdd returns to 1V)
- The DRV from simulation is 100mV; the reason for this large discrepancy is still under investigation
- The output of the sense amp switches to 0 while the supply is lower than 400mV; however, state 1 may still be preserved.
[Figure: waveform with time markers t0 and t1 and the end of the standby period; state 1 is written back into the cell in normal operation]
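The preserved / lost voltage pairs reported for each state bracket the DRV. One way to arrive at such a bracket is a bisection over the standby voltage, sketched below. The `retains_state` hook is hypothetical: it stands for the bench procedure (write at full Vdd, drop to the standby voltage, restore Vdd, read back), and the slides do not say how the measurement was actually automated.

```python
def find_drv(retains_state, v_low, v_high, resolution=0.005):
    """Bisect the standby voltage range [v_low, v_high] for the DRV.

    retains_state(v) is a hypothetical test hook that returns True if the
    cell still holds its state after standing by at voltage v.
    Assumes the state is lost at v_low and kept at v_high.
    """
    while v_high - v_low > resolution:
        v_mid = (v_low + v_high) / 2
        if retains_state(v_mid):
            v_high = v_mid   # still retained: DRV is at or below v_mid
        else:
            v_low = v_mid    # lost: DRV is above v_mid
    return v_low, v_high

# Stand-in for the bench measurement, using the 270mV state-0 threshold.
def fake_cell(v):
    return v >= 0.270

lost_at, kept_at = find_drv(fake_cell, v_low=0.100, v_high=1.000)
print(f"state lost at {lost_at * 1e3:.0f} mV, kept at {kept_at * 1e3:.0f} mV")
```

Running the search once per stored state gives two thresholds; the cell's usable standby voltage has to sit above the higher of the two.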
13 0.13um SRAM Leakage Vs. Vdd
1K Byte SRAM Leakage Vs. Vdd (old plot from
simulation)
4K Byte SRAM Leakage Vs. Vdd (from testing)
Real leakage saving (according to testing data) is larger than predicted by simulation.
At 1V, Pleak = 0.241mA x 1V = 241uW
At 300mV, Pleak = 0.018mA x 0.3V = 5.4uW = 2.2% x 241uW
At 400mV, Pleak = 0.024mA x 0.4V = 9.6uW = 4% x 241uW
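The leakage power figures follow directly from the measured current and the supply voltage. A minimal sketch that reproduces them, using only the numbers quoted on this slide (the function name is illustrative):

```python
def leakage_power_uw(current_ma, vdd_v):
    """Leakage power in microwatts from leakage current (mA) and supply voltage (V)."""
    return current_ma * vdd_v * 1000.0

# (Vdd in volts, measured leakage current in mA), as quoted on the slide.
measurements = [(1.0, 0.241), (0.4, 0.024), (0.3, 0.018)]
baseline = leakage_power_uw(0.241, 1.0)

for vdd, current in measurements:
    power = leakage_power_uw(current, vdd)
    print(f"Vdd = {vdd:.1f} V: {power:6.1f} uW, {power / baseline:.1%} of the 1 V value")
```

The saving comes from two factors at once: the leakage current itself falls with Vdd, and the remaining current is multiplied by a smaller voltage.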
14Outline
- Introduction: Go low voltage in digital CMOS design!
  - Power
  - Delay
  - Reliability
- Memory side: Results from SRAM leakage control test chip
  - Data Retention Voltage (DRV)
  - Timing overhead in mode switching
  - Leakage saving in low voltage standby
  - Low standby supply voltage generation
- Logic side: Modeling of CMOS circuit delay statistical distribution under low Vdd
  - Lognormal model fitted to inverter delay distribution
15Variations in the Nanometer Regime
- Process Technology
  - Lithography (Leff)
  - Doping implantation (Vth)
  - Oxidation (Tox)
  - CMP (t), etc.
- Function Failure
  - Non-working device
  - Short/open circuits
  - Over timing budget
  - Logic failure
  - Intolerable bias
- Circuit Operation
  - Signal coupling (T)
  - Power supply (Vdd)
  - Chip temperature
  - Clock distribution, etc.
- Yield Loss
  - Chip speed
  - Power consumption
  - Sale price
16Technology Trend
- ITRS predicts constant process deviations
- However, process control is much more costly
- Operation-caused variations (Vdd, coupling noise,
Temp.) keep increasing
(Courtesy of S. Nassif, IBM)
17Simulation Methodology
- Monte Carlo analysis is more accurate than worst case analysis
- Capable of handling correlations
- Fast enough for simple critical path structure
- Interconnect coupling modeled
18CMOS Inv. Delay Distribution Off Gaussian
- Simulation Setup
  - Monte Carlo simulation is applied to 130nm / 70nm CMOS circuits with a SPICE model.
  - Input Gaussian random variables are Vdd, Vth, Leff, Tox, etc.
130nm CMOS Inv.: Vdd_mean = 550mV, Vth_mean = 280mV, Vdd_var = 18.3mV, Vth_var = 28mV
19CMOS Inv. Delay Distribution Off Gaussian
20Lognormal Model Fitting the Delay Distr.
- Here the log of the delay is Gaussian distributed, but not the delay itself!
CDF discrepancy between the Gaussian / Lognormal
model and the real experimental distribution.
PDF of Gaussian and Lognormal Modeling of the
Delay Distribution
21Lognormal Model Proved Efficient on 130nm / 70nm
Inverter Delay Distribution
- The lognormal model always provides higher accuracy under high / low Vdd and Vth values.
- The larger ( Vdd Vth ) is, the more lognormal the delay distribution is.
Left subplots: PDF. Right subplots: CDF discrepancy between the Gaussian / lognormal model and the real experimental distribution.
Blue lines: experimental. Red lines: Gaussian model. Green lines: lognormal model.
22Alpha-Power Law Delay Model Predicts the
Lognormal Distribution
Alpha-Power Law Delay Model
Both the real and the alpha-power law distributions show similar shapes and are fitted well by the lognormal model, with minor CDF deviation.
- Assuming alpha_d = 2, the delay is calculated by the alpha-power law delay model with input Gaussian variables Vdd and Vth.
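A minimal Monte Carlo sketch of this experiment, using only the Python standard library. Several things here are assumptions: the slide's own equation did not survive in the text, so the commonly cited alpha-power form (delay proportional to Vdd / (Vdd - Vth)^alpha) is used; the 18.3mV and 28mV figures from the simulation setup are treated as standard deviations; Vdd and Vth are drawn independently; and both models are fitted by simple moment matching.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def alpha_power_delay(vdd, vth, alpha=2.0):
    """Delay up to a constant factor: Vdd / (Vdd - Vth)**alpha (assumed form)."""
    return vdd / (vdd - vth) ** alpha

def max_cdf_gap(samples, model, transform=lambda x: x):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    ordered = sorted(samples)
    n = len(ordered)
    return max(abs((i + 1) / n - model.cdf(transform(x))) for i, x in enumerate(ordered))

random.seed(1)
delays = []
for _ in range(20_000):
    vdd = random.gauss(0.550, 0.0183)   # mean and assumed standard deviation, in volts
    vth = random.gauss(0.280, 0.028)
    if vdd > vth:                       # ignore non-physical draws
        delays.append(alpha_power_delay(vdd, vth))

logs = [math.log(d) for d in delays]
gaussian = NormalDist(fmean(delays), stdev(delays))
lognormal = NormalDist(fmean(logs), stdev(logs))   # normal distribution of log(delay)

gap_gaussian = max_cdf_gap(delays, gaussian)
gap_lognormal = max_cdf_gap(delays, lognormal, transform=math.log)
print(f"max CDF gap: Gaussian {gap_gaussian:.4f}, lognormal {gap_lognormal:.4f}")
```

The lognormal fit is evaluated by fitting a normal distribution to log(delay), which is equivalent to fitting a lognormal to the delay itself. A smaller gap for the lognormal model is consistent with the slide's observation; the remaining gap corresponds to the minor CDF deviation it mentions.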
23Why the Lognormal?
- Intuitively
  - The lognormal model is applied widely in many fields to variables that can increase without limit but cannot fall below zero. Delay is one such variable, so the lognormal is a good empirical model candidate for it.
  - Since the lognormal model accounts for distribution asymmetry, it is not surprising that it always fits the delay distribution better than the Gaussian model, which assumes rigid symmetry.
24How should we model more complicated gates / data
paths in the future?
Questions
1. The critical path is comprised of a series of adder cells, so why doesn't its delay become more Gaussian according to the Central Limit Theorem (here the delay variable is the sum of multiple cell delay variables)?
2. What would happen to a more complicated data path comprised of multiple gates: more symmetry or less?
Now the adder delay deviates even from the lognormal model
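One way to probe Question 1 is to compare a path whose cell delays vary independently with one whose cells all share the same variation, for example a common Vdd. The sketch below is a toy model under stated assumptions, not a result from the slides: cell delays are drawn from a lognormal with hypothetical parameters, and the path delay is a plain sum of 16 cells.

```python
import random
from statistics import fmean

def skewness(values):
    """Sample skewness: third central moment over variance**1.5."""
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)
    m3 = fmean((v - mean) ** 3 for v in values)
    return m3 / m2 ** 1.5

def path_delays(n_cells, n_trials, shared, rng):
    """Sum n_cells lognormal cell delays per trial.

    shared=True reuses one draw for every cell in a trial (fully correlated
    variation); shared=False draws each cell independently.
    """
    totals = []
    for _ in range(n_trials):
        if shared:
            totals.append(n_cells * rng.lognormvariate(0.0, 0.3))
        else:
            totals.append(sum(rng.lognormvariate(0.0, 0.3) for _ in range(n_cells)))
    return totals

rng = random.Random(2)
skew_independent = skewness(path_delays(16, 20_000, shared=False, rng=rng))
skew_shared = skewness(path_delays(16, 20_000, shared=True, rng=rng))
print(f"skewness, independent cells: {skew_independent:.2f}")
print(f"skewness, fully shared variation: {skew_shared:.2f}")
```

The Central Limit Theorem only pulls the sum toward a Gaussian when the terms are close to independent. If much of the variation is common to every cell on the path, the asymmetry of a single cell carries over to the whole path. Whether this explains the adder result is an open question here; it is only one candidate factor.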
25Future Work
- Memory side
  - Clarify the reason for the discrepancy between the testing and simulation results of DRV, then build an analytical model of DRV.
  - Once the memory sleeps well under low voltage, how about operation under low voltage too? More speed and reliability issues will arise.
  - Modeling of DRV can be extended to register files and other data retention elements, so that we can bring more applications into deep sleep, for example a microprocessor.
- Logic side
  - Is there a general model that can be fitted to the delay of various CMOS circuits, small and big, simple and complicated?
  - With good delay modeling, we can start working out the big picture of reliable design on an unreliable platform, starting from a simple pipeline.