Title: Designing Energy Efficient CMOS Circuits
1Designing Energy Efficient CMOS Circuits
- Vojin G. Oklobdzija
- Advanced Computer Systems Engineering Laboratory
- Electrical and Computer Engineering Department, University of California, Davis, CA 95616
- http://www.ece.ucdavis.edu/acsel
2Summary of the Presentation
- Energy Efficiency of Digital CMOS Circuits
- Power Caused Problems
- Energy-Delay Relationship
- Minimizing Energy for a given delay
- Energy-delay reduction method (20-40%)
- Determining the best structures for high-performance system
- Implications on the architecture
3Key Challenges
4MOS Transistor Scaling
From Pat Gelsinger, Intel, DAC 2004 presentation
[Figure: MOS transistor cross-section labeled GATE, SOURCE, DRAIN, BODY, Tox, Xj, D, Leff]
Technology has scaled well, and will continue
5CMOS Outlook
From Pat Gelsinger, Intel, DAC 2004 presentation
However
6CMOS Outlook
From Pat Gelsinger, Intel, DAC 2004 presentation
7Gate Oxide is Near Limit
From R. Krishnamurthy, Intel
Intel's high-k leadership is crucial for the industry
8Power Will be the Limiter
From R. Krishnamurthy, Intel
1B transistor integration capacity will exist
But the Power
Applications will demand TIPS performance
Challenge: Highest performance in the power envelope
9Delivering Performance in Power Envelope
From Pat Gelsinger, Intel, DAC 2004 presentation
Mobile, Power Envelope 20-30W
Desktop, Power Envelope 60-90W
Server, Power Envelope 100-130W
10Power Trend
[Figure: Power (W) by processor generation (386, 486, Pentium, Pentium II, Pentium 4) against the cooling capacity of a conventional system; annotation: Business As Usual is Not an Option]
From R. Krishnamurthy, Intel
C scales by 30% per generation but Vcc scales by 10-15% only! Must maintain or reduce power in future
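The scaling remark on this slide can be checked with the textbook switching-energy relation E = C * Vcc^2. That relation is an assumption here, not a formula shown on the slide, and the sketch also assumes the slide's 30 and 10-15 are percentages. The helper name `energy_scale` is hypothetical.

```python
def energy_scale(c_reduction, v_reduction):
    """Relative switching energy per transition after one generation.

    Assumes E is proportional to C * Vcc**2 (textbook dynamic-energy model).
    Both arguments are fractional reductions, e.g. 0.30 for 30%.
    """
    return (1.0 - c_reduction) * (1.0 - v_reduction) ** 2

# Slide figures: C shrinks by 30%, Vcc by only 10-15% per generation.
for dv in (0.10, 0.15):
    print(f"Vcc -{dv:.0%}: energy per transition x{energy_scale(0.30, dv):.3f}")

# Hypothetical comparison: Vcc scaling as fast as C.
print(f"Vcc -30%: energy per transition x{energy_scale(0.30, 0.30):.3f}")
```

Under these assumptions the energy per transition falls to about 0.51-0.57 of its previous value, against about 0.34 if Vcc scaled as fast as C.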
11Power Density Will Get Even Worse
- Need to Keep the Junctions Cool
- Performance (Higher Frequency)
- Lower leakage (Exponential)
- Better reliability (Exponential)
Pat Gelsinger, ISSCC 2001
12High-performance Microprocessor Power Density
[Figure (Intel Corp.): power density in Watts/cm2, log scale from 1 to 1000, versus process generation from 1.5µm down to 0.07µm, for i386, i486, Pentium, Pentium Pro, Pentium II, Pentium III and the current generation, with Hot Plate, Nuclear Reactor, Rocket Nozzle and Sun's Surface as reference levels]
- Power density is a problem for current microprocessors
13High-Performance Microprocessor Power Density
[Figure (Intel Corp.): processor thermal map]
- Local power density is a problem
- Problem areas are datapath elements
15Challenges in High-Performance Design
- Optimization for power, not speed, is our goal
- Logical Effort (LE) optimizes for speed, regardless of power
- We have developed a method which optimizes for power at a given speed
- We are developing new approaches for power efficiency (overlooked by delay optimization) applicable to:
- - Circuit structures
- - Design techniques
- - Energy-Delay Space
- - Creation of optimal Standard cell (ASIC) libraries
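For reference, the delay-only sizing that Logical Effort performs can be sketched as follows. It uses the standard Logical Effort relations (path effort F = G * B * H, equal stage effort f = F^(1/N)) and textbook values for the inverter and 2-input NAND. The function name and the example path are hypothetical, and nothing here models power, which is the gap the slide points out.

```python
import math

def le_min_delay(logical_efforts, parasitics, branching, h_path):
    """Minimum path delay (in units of tau) under the Logical Effort model.

    logical_efforts: per-stage logical effort g_i
    parasitics:      per-stage parasitic delay p_i
    branching:       per-stage branching effort b_i
    h_path:          electrical effort H = Cout / Cin of the whole path
    """
    n = len(logical_efforts)
    path_effort = math.prod(logical_efforts) * math.prod(branching) * h_path
    stage_effort = path_effort ** (1.0 / n)   # equal effort in every stage
    delay = n * stage_effort + sum(parasitics)
    return stage_effort, delay

# Hypothetical path: inverter -> 2-input NAND -> inverter, H = 12, no branching.
f, d = le_min_delay([1.0, 4.0 / 3.0, 1.0], [1.0, 2.0, 1.0], [1, 1, 1], 12.0)
print(f"stage effort = {f:.2f}, minimum delay = {d:.2f} tau")
```

The result fixes every gate size from the delay target alone; energy never enters the objective.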
16Energy-Delay Relationship
17Energy-Delay Relationship of High-Speed Adders
Best Energy
Best Delay
- Best Delay does not guarantee best Energy
18Energy-Delay Product (EDP)?
Best EDP
Best Delay
- Energy-Delay Product (EDP): Low-Power comparison point
19ED² ?
Best ED²
Best Delay
- Energy·Delay² (ED²): High-Performance comparison point
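A small sketch of why the comparison point matters: the same set of designs can have a different winner under best delay, EDP and ED2. The three (delay, energy) pairs below are invented for illustration and are not data from the presentation; `best_by_metric` is a hypothetical helper.

```python
# Hypothetical (delay, energy) pairs in arbitrary units; NOT data from the slides.
designs = {
    "A": (1.00, 10.0),   # fastest, most energy
    "B": (1.20, 6.5),
    "C": (1.60, 4.5),    # slowest, least energy
}

def best_by_metric(points, n):
    """Return the design name minimizing Energy * Delay**n."""
    return min(points, key=lambda name: points[name][1] * points[name][0] ** n)

fastest = min(designs, key=lambda name: designs[name][0])
print("Best Delay ->", fastest)
for n, label in ((1, "EDP"), (2, "ED2")):
    print(label, "->", best_by_metric(designs, n))
```

With these made-up numbers each criterion picks a different design, so a single-metric comparison hides part of the trade-off.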
20Energy-Delay Space View
Which Design Should be used?
Low-Power Target
High-Performance Target
21Energy-Delay Space View
H is changing, with Cout constant
Low-Power Target
High-Performance Target
- Begin to see characteristics of designs
22Energy-Delay Space View
H is changing, with Cout constant
Low-Power Target
High-Performance Target
- Begin to see characteristics of designs
23Energy-Delay Space View
H is changing, with Cout constant
Low-Power Target
High-Performance Target
- Best High-Performance designs are clearly seen
- Different from what would be chosen from a single-point comparison
24Energy-Delay Space View
H is changing, with Cout constant
Low-Power Target
High-Performance Target
- Also determines best design for Low-Power Target
25Energy-Delay Space View
- Must look at Energy-Delay Space of designs
26Contribution of Wire to Delay and Energy should
be examined too
- Wiring varies significantly between designs
27Contribution of Wire to Delay and Energy should
be examined too
- Accounts for up to 10% of Total Energy
28Contribution of Wire to Delay and Energy should
be examined too
No Wire
- Without wire, differences appear large
29Contribution of Wire to Delay and Energy should
be examined too
With Wire
- Wire strongly impacts selection of best adders
30Contribution of Wire to Delay and Energy should
be examined too
[Figure: Wire Resistance Impact, 130nm technology; compares Estimate, HSPICE, and HSPICE with no wire resistance, with 1 FO4 marked]
- Layout / floorplanning rearrangement could potentially save 1 FO4
31Energy
32Where does Logical Effort lead us?
Most design approaches focus here
- Is it possible to lower energy by trading delay?
or
33Design in Energy-Delay Space
[Figure: Energy versus Delay curve with the Logical Effort (LE) design point marked and the labels "Decrease Stage Effort" and "Increase Stage Effort"]
- If this is your design point, the drop is steep!
- But you should not be designing there!
- For a small sacrifice in delay the energy savings are big!
2002: Discovery of the property of the hyperbola: R.W. Brodersen, M.A. Horowitz, D. Markovic, B. Nikolic, V. Stojanovic, "Methods for True Power Minimization," International Conference on Computer-Aided Design, ICCAD-2002, Digest of Technical Papers, San Jose, CA, November 10-14, 2002, pp. 35-42.
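The claim that a small delay sacrifice near the LE point buys a large energy saving can be illustrated on a toy inverter chain. This is a sketch under stated assumptions, not the presentation's method: three stages, load of 100 input capacitances, delay from the Logical Effort model with parasitic delay 1, and energy taken as the switched capacitance of the gates (the fixed load is left out). The sizes are found by a plain grid search.

```python
def chain_delay(sizes, load, p=1.0):
    """Delay (in tau) of an inverter chain: stage efforts plus parasitics."""
    caps = list(sizes) + [load]
    return sum(caps[i + 1] / caps[i] + p for i in range(len(sizes)))

def chain_energy(sizes, p=1.0):
    """Switched gate capacitance (input + parasitic); fixed load excluded."""
    return (1.0 + p) * sum(sizes)

H = 100.0                      # load / input capacitance (assumed)
N = 3                          # number of stages (assumed)
f = H ** (1.0 / N)
le_sizes = [f ** i for i in range(N)]   # Logical Effort (minimum-delay) sizing
d_min = chain_delay(le_sizes, H)
e_le = chain_energy(le_sizes)

budget = 1.05 * d_min          # allow 5% more delay than the LE point
best = None
# Coarse grid search over the two free gate sizes (first stage fixed at 1).
for i in range(10, 101):       # c2 from 1.0 to 10.0
    for j in range(10, 501):   # c3 from 1.0 to 50.0
        sizes = [1.0, i / 10.0, j / 10.0]
        if chain_delay(sizes, H) <= budget:
            e = chain_energy(sizes)
            if best is None or e < best[0]:
                best = (e, sizes)

saving = 1.0 - best[0] / e_le
print(f"LE sizing:     sizes={[round(s, 2) for s in le_sizes]}, D={d_min:.2f}, E={e_le:.1f}")
print(f"5% more delay: sizes={best[1]}, E={best[0]:.1f}, saving={saving:.0%}")
```

In this toy model a 5% delay allowance cuts the switched gate capacitance by roughly a third, mostly by shrinking the last and largest stage.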
34Cin = minimum-size inverter, Cout = 100 Cin, H = Cout / Cin = 100
[Figure: energy-delay points obtained through gate sizing, marking LE (Dmin), ED3, ED2, EDP and ED0.5, each with its constant-metric curve]
Hardware Intensity: Victor Zyuban, IBM T.J. Watson
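The way different E * D^n metrics select different points on one sizing curve can be sketched with the slide's H = 100. The circuit below is an assumption made for illustration: a two-inverter chain whose second-stage size x is the only free variable, with parasitic delay 1 and energy counted as switched gate capacitance (fixed load excluded).

```python
H = 100.0   # Cout / Cin, as on the slide
P = 1.0     # parasitic delay per inverter (assumed)

def delay(x):
    """Delay (in tau) of inverter(size 1) -> inverter(size x) -> load H."""
    return x + H / x + 2 * P

def energy(x):
    """Switched gate capacitance of both inverters; fixed load excluded."""
    return (1.0 + P) * (1.0 + x)

xs = [1.0 + 0.01 * k for k in range(901)]   # second-stage size from 1 to 10

def best_size(n):
    """Second-stage size minimizing Energy * Delay**n along the sizing curve."""
    return min(xs, key=lambda x: energy(x) * delay(x) ** n)

for n in (0.5, 1, 2, 3):
    x = best_size(n)
    print(f"E*D^{n}: x={x:.2f}, D={delay(x):.2f}, E={energy(x):.2f}")
print(f"Dmin (LE): x={H ** 0.5:.2f}, D={delay(H ** 0.5):.2f}")
```

As the exponent n grows, the chosen size moves toward the minimum-delay (LE) sizing, so the exponent acts as a dial between low-energy and high-performance targets.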
35Is it possible to lower the Energy ?
Energy Savings
- Reduce Energy for same Delay!
- Improve Delay for same Energy!
36Achieved Energy Savings in KS and HC Adders
[Figure: energy savings for the KS and HC adders, annotated 50 and 45]
- Simulation of 64-bit static adders confirms saving!
37Reduction of Hot-Spots
Reduces Hotspots
Energy Optimized
Delay Optimized
- Energy minimization improves hotspots!
38Achieved Energy Savings in Representative Adders
- Comparison of high-performance 64-bit adders
- - Delay and Energy Optimized
39Accomplishments during the past year (June 30, 2003 to June 30, 2004)
[Figure: Initial Design versus Optimized Design in 90nm technology, showing Energy Saving and Delay Saving; worst-case energy vector with 100% input activity]
Collaboration with Intel AMR
40Summary
- Design for multi-GHz requires:
- - Early structure comparison in energy-delay space
- - Early Layout/Floorplanning
- - Optimization using energy minimization objective function
- LE does not guarantee a good design
- Our method of energy-minimization focuses on reducing power
- Same principles hold for other logic functions
41Future Direction
- Development of energy-friendly design methodology and CAD tools
- Problems to be solved
- - Accurate estimation of speed-energy relationship
- - Elimination/reduction of hot-spots
- - Lowering overall computational energy
- Future
- - Design guidelines for creating energy-efficient circuits
- - Energy-efficient elements
- - Energy-efficient ASIC Libraries
- - Tool for minimizing system, circuit, device, and interconnect power
- - Reliability
42Students
- Bart Zeydel, PhD (expected 2005)
- - high speed digital circuits: energy-delay optimization, design methodology for multi-GHz, signal-processing data-path (wireless completed, Intel)
- Hoang Dao, PhD (expected 2005)
- - optimization for speed and energy, data-path for 10GHz.
- Xiao-Yan Yu, PhD (expected 2006)
- - optimization for speed, custom data-path for 10GHz.
- Milena Vratonjic, PhD (expected 2007)
- - Work on energy-delay optimization, circuits for wireless data-path.
- Marko Aleksic, PhD (expected 2005)
- - high speed digital circuits: optimization, pipeline strategies for 10GHz, continuation from Nikola.
- Christophe Giacomotto, PhD (expected 2007)
- - high speed digital circuits: fabrication of multi-GHz datapath elements.
43Technology Transfer and Industrial Interactions
- Intel Corp.
- - 2's Complement Multiplier for Wireless Baseband (to be used in wireless products)
- - Publication: B.R. Zeydel, V.G. Oklobdzija, S. Mathew, R.K. Krishnamurthy, S. Borkar, "A 90nm 1GHz 22mW 16x16-bit 2's Complement Multiplier for Wireless Baseband", Proceedings of the 2003 Symposium on VLSI Circuits, Kyoto, JAPAN, June 12-14, 2003.
- - Energy Efficient Logic Design: VLSI Adder (to be used in next generation processors)
- - Publication: Vojin G. Oklobdzija, Bart R. Zeydel, Hoang Dao, Sanu Mathew, Ram Krishnamurthy, "Energy-Delay Estimation Technique for High-Performance Microprocessor VLSI Adders", Proceedings of the International Symposium on Computer Arithmetic, ARITH-16, Santiago de Compostela, SPAIN, June 15-18, 2003.
- Synopsys
- - Area-Time Optimal Adder
- - Publication: Aamir A. Farooqui, Vojin G. Oklobdzija, Sadiq M. Sait, "Area-Time Optimal Adder with Relative Placement Generator", International Symposium on Circuits and Systems, Bangkok, THAILAND, May 25-28, 2003.
- Fujitsu America (non-member)
- - Flip-Flops and Latches
- - Publication: N. Nedovic, V. G. Oklobdzija, W. W. Walker, "A Clock Skew Absorbing Flip-Flop", 2003 IEEE International Solid-State Circuits Conference Digest of Technical Papers, San Francisco, February 9-12, 2003.
44Relevant Publications
- Hoang Dao, Kevin Nowka, Vojin G. Oklobdzija, "Analysis of Clocked Timing Elements for DVS Effects over Process Parameter Variations", Proceedings of the International Symposium on Low Power Electronics and Design, Huntington Beach, California, August 6-7, 2001.
- Hoang Q. Dao, Vojin G. Oklobdzija, "Application of Logical Effort on Delay Analysis of 64-bit Static Carry-Lookahead Adder", 35th Annual Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, November 4-7, 2001.
- Xiao Yan Yu, Vojin G. Oklobdzija, William W. Walker, "Application of Logical Effort on Design of Arithmetic Blocks", 35th Annual Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, November 4-7, 2001.
- Hoang Q. Dao, Vojin G. Oklobdzija, "Application of Logical Effort Techniques for Speed Optimization and Analysis of Representative Adders", 35th Annual Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, November 4-7, 2001.
- Aamir A. Farooqui, Vojin G. Oklobdzija, Sadiq M. Sait, "Area-Time Optimal Adder with Relative Placement Generator", International Symposium on Circuits and Systems, Bangkok, THAILAND, May 25-28, 2003.
- Xiao Yan Yu, Vojin G. Oklobdzija, William W. Walker, "An Efficient Transistor Optimizer for Custom Circuits", International Symposium on Circuits and Systems, Bangkok, THAILAND, May 25-28, 2003.
- B.R. Zeydel, V.G. Oklobdzija, S. Mathew, R.K. Krishnamurthy, S. Borkar, "A 90nm 1GHz 22mW 16x16-bit 2's Complement Multiplier for Wireless Baseband", Proceedings of the 2003 Symposium on VLSI Circuits, Kyoto, JAPAN, June 12-14, 2003.
- Vojin G. Oklobdzija, Bart R. Zeydel, Hoang Dao, Sanu Mathew, Ram Krishnamurthy, "Energy-Delay Estimation Technique for High-Performance Microprocessor VLSI Adders", Proceedings of the International Symposium on Computer Arithmetic, ARITH-16, Santiago de Compostela, SPAIN, June 15-18, 2003.
- Hoang Q. Dao, Bart R. Zeydel, Vojin G. Oklobdzija, "Energy Minimization Method for Optimal Energy-Delay Extraction", Proceedings of the European Solid-State Circuits Conference, ESSCIRC 2003, Estoril, PORTUGAL, September 16-18, 2003.
- Hoang Q. Dao, Bart R. Zeydel, Vojin G. Oklobdzija, "Energy Optimization of High-Performance Circuits", Proceedings of the 13th International Workshop on Power And Timing Modeling, Optimization and Simulation, Torino, ITALY, September 10-12, 2003.
45Relevant Publications
- H. Q. Dao, B. R. Zeydel, V. Zyuban, V. G. Oklobdzija, "A Method for Energy Optimization of Digital Pipelined Systems", The Fourth Annual IBM Austin Conference on Energy-Efficient Design, ACEED 2005, Austin, Texas, March 1-3, 2005.
- V. G. Oklobdzija, B. R. Zeydel, H. Q. Dao, S. Mathew, R. Krishnamurthy, "Comparison of High-Performance VLSI Adders in Energy-Delay Space", IEEE Transactions on VLSI Systems, Volume 13, Issue 6, June 2005.
- B.R. Zeydel, T.T.J.H. Kluter, V. G. Oklobdzija, "Efficient Energy-Delay Mapping of Addition Recurrence Algorithms in CMOS", International Symposium on Computer Arithmetic, ARITH-17, Cape Cod, Massachusetts, USA, June 27-29, 2005.