Title: 2. Physics of Power Dissipation in CMOS FET Devices
- For an ideal MIS diode, the energy difference $\phi_{ms}$ between the metal work function $\phi_m$ and the semiconductor work function $\phi_s$ is zero:
- $\phi_{ms} = \phi_m - (\chi + E_g/2q + \psi_B) = 0$ (2.1)
- where $\chi$ is the semiconductor electron affinity (from conduction band to vacuum level), $E_g$ the band gap (from valence band to conduction band), $\phi_B$ the potential barrier between the metal and the insulator, and $\psi_B$ the potential difference between the Fermi level $E_F$ and the intrinsic Fermi level $E_i$.
The Fermi-Dirac Function
- $f_{FD}(E) = \dfrac{1}{1 + \exp((E - E_F)/kT)}$
- The Fermi-Dirac distribution function gives the probability that a certain energy state will be occupied by an electron.
- As in a gas, the electrons in a solid are in constant motion and consequently changing their energy and momentum.
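The occupation probability can be evaluated directly. The following is a minimal sketch, assuming energies are given in eV and a default temperature of 300 K; the function name `fermi_dirac` is illustrative and not part of the original slides.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def fermi_dirac(energy_ev, fermi_ev, temp_k=300.0):
    """Probability that a state at energy_ev is occupied, for Fermi level fermi_ev."""
    x = (energy_ev - fermi_ev) / (K_B_EV * temp_k)
    return 1.0 / (1.0 + math.exp(x))


# A state exactly at the Fermi level is occupied with probability 1/2.
half = fermi_dirac(0.5, 0.5)
```

States a few kT above $E_F$ are almost always empty and states a few kT below are almost always filled; the two probabilities at $E_F \pm \delta$ sum to one.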
P-type
CMOS Gate Power equations
- $P = C_L V_{DD}^2 f_{0\to1} + t_{sc} V_{DD} I_{peak} f_{0\to1} + V_{DD} I_{leakage}$
- Dynamic term: $C_L V_{DD}^2 f_{0\to1}$
- Short-circuit term: $t_{sc} V_{DD} I_{peak} f_{0\to1}$
- Leakage term: $V_{DD} I_{leakage}$
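The three terms of the gate power equation can be computed separately to see which one dominates for a given operating point. This is a small sketch under the assumption of SI units (farads, volts, hertz, seconds, amperes); the function name and argument names are illustrative.

```python
def cmos_gate_power(c_load, v_dd, f_01, t_sc, i_peak, i_leak):
    """Return (dynamic, short_circuit, leakage) power in watts.

    f_01 is the rate of 0->1 output transitions per second.
    """
    dynamic = c_load * v_dd ** 2 * f_01
    short_circuit = t_sc * v_dd * i_peak * f_01
    leakage = v_dd * i_leak
    return dynamic, short_circuit, leakage


# Example operating point (values chosen only for illustration).
dyn, sc, leak = cmos_gate_power(50e-15, 1.0, 1e8, 1e-10, 1e-4, 1e-9)
total = dyn + sc + leak
```

Because the dynamic term scales with $V_{DD}^2$ while the other two scale linearly with $V_{DD}$, lowering the supply voltage reduces the dynamic term fastest.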
- The Maxwell-Boltzmann statistics relate the equilibrium hole concentration to the intrinsic Fermi level:
- $p_0 = n_i \exp((E_i - E_F)/kT)$ (2.2)
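Equation (2.2) can be inverted to estimate the bulk potential $\psi_B = (E_i - E_F)/q$. The sketch below assumes full ionization of the acceptors, so that $p_0 \approx N_A$; that assumption, the function names, and the round value used for $n_i$ are not from the slides.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def hole_concentration(n_i, ei_minus_ef_ev, temp_k=300.0):
    """Equilibrium hole concentration p0 from Eq. (2.2); same units as n_i."""
    return n_i * math.exp(ei_minus_ef_ev / (K_B_EV * temp_k))


def bulk_potential(n_a, n_i, temp_k=300.0):
    """psi_B in volts, assuming p0 ~= N_A (full ionization, an assumption)."""
    return K_B_EV * temp_k * math.log(n_a / n_i)


# Illustrative doping: N_A = 1e17 cm^-3 with n_i taken as 1e10 cm^-3.
psi_b = bulk_potential(1e17, 1e10)
```

Feeding `psi_b` back into `hole_concentration` recovers the acceptor density, which is a quick consistency check on the two functions.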
P substrate
- The Fermi level $E_F$ in the semiconductor is now $qV$ below the Fermi level in the metal gate.
- If the applied voltage is increased sufficiently, the bands bend far enough that level $E_i$ at the surface crosses over to the other side of level $E_F$.
- This is brought about by the tendency of carriers to occupy states with the lowest total energy.
- In the present condition of inversion, the level $E_i$ bends to be closer to level $E_c$ and electrons outnumber holes at the surface.
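- In strong inversion the bands at the surface are bent by $2q\psi_B$, so $E_i$ at the surface lies below $E_F$ by the same energy, $q\psi_B$, by which it lies above $E_F$ in the bulk. Here $\psi_B$ is the potential difference between the Fermi level $E_F$ and the intrinsic Fermi level $E_i$ in the bulk.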
- The value of $V$ necessary to reach the onset of strong inversion is called the threshold voltage.
Surface Space Charge Region and the Threshold Voltage
- Poisson equation:
- $\nabla \cdot \mathbf{D} = \rho(x, y, z)$ (2.3)
- where $\mathbf{D}$, the electric displacement vector, is equal to $\varepsilon_s \mathbf{E}$ under low-frequency or static conditions; $\varepsilon_s$ is the permittivity of Si; $\mathbf{E}$ the electric field vector; and $\rho(x, y, z)$ the total electric charge density.
Threshold voltage
- $V_T = \dfrac{2d}{\varepsilon_i}\left[q\,\varepsilon_s N_A \psi_B \left(1 - e^{-2\beta\psi_B}\right)\right]^{1/2} + 2\psi_B$
- The total voltage needed to offset the effect of nonzero work function difference and the presence of the charges is referred to as the flat-band voltage $V_{FB}$.
- $V_{FB} = \phi_{ms} - Q_T d/\varepsilon_i$
Threshold voltage
- $V_T = \dfrac{2d}{\varepsilon_i}\left[q\,\varepsilon_s N_A \psi_B \left(1 - e^{-2\beta\psi_B}\right)\right]^{1/2} + 2\psi_B + V_{FB}$
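The threshold voltage expression, including the flat-band term, can be evaluated numerically. The following sketch assumes SI units, $\beta = q/kT$, a silicon substrate (relative permittivity 11.7) and an SiO2 gate insulator (relative permittivity 3.9); those material values and all names are assumptions made for illustration.

```python
import math

Q_E = 1.602176634e-19    # elementary charge, C
K_B = 1.380649e-23       # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def threshold_voltage(n_a_m3, d_ox_m, psi_b, v_fb=0.0, temp_k=300.0,
                      eps_s_rel=11.7, eps_i_rel=3.9):
    """V_T = (2d/eps_i) sqrt(q eps_s N_A psi_B (1 - exp(-2 beta psi_B))) + 2 psi_B + V_FB."""
    beta = Q_E / (K_B * temp_k)          # assumed definition: beta = q/kT
    eps_s = eps_s_rel * EPS_0            # semiconductor permittivity
    eps_i = eps_i_rel * EPS_0            # insulator permittivity
    charge = math.sqrt(Q_E * eps_s * n_a_m3 * psi_b
                       * (1.0 - math.exp(-2.0 * beta * psi_b)))
    return (2.0 * d_ox_m / eps_i) * charge + 2.0 * psi_b + v_fb


# N_A = 1e17 cm^-3 = 1e23 m^-3, 5 nm insulator, psi_B = 0.417 V (illustrative values).
vt_ideal = threshold_voltage(1e23, 5e-9, 0.417)
vt_real = threshold_voltage(1e23, 5e-9, 0.417, v_fb=-0.9)
```

The flat-band voltage enters as a pure offset, so a negative $V_{FB}$ lowers the threshold by the same amount, while a thicker insulator raises the depletion-charge term.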
2.2.3.1 Effects Influencing Threshold Voltage
- VT decreases when L (length) is decreased, varies
with Z (width), and decreases when the
drain-source voltage VDS is increased.
- Drain-induced barrier lowering (DIBL) is the basis for a number of more complex models of the threshold voltage shift.
- It refers to the decrease in threshold voltage due to the depletion region charges in the potential barrier between the source and the channel at the semiconductor surface.
- A recent model adopts a quasi-two-dimensional approach to solving the two-dimensional Poisson equation.
- $\partial E_x/\partial x$ at each point $(x, y)$ can be replaced with the average of its value at $(0, y)$ and at $(W, y)$.
Short channel effect
- The minimum value of the surface potential
increases with decreasing channel length and
increasing VDS.
2.2.3.2 Subsurface Drain-Induced Barrier Lowering (Punchthrough)
- The punchthrough voltage $V_{PT}$ is defined as the value of $V_{DS}$ at which $I_{D,st}$ reaches some specific magnitude with $V_{GS} = 0$.
- The parameter $V_{PT}$ can be roughly approximated as the value of $V_{DS}$ for which the sum of the widths of the source and the drain depletion regions becomes equal to $L$.
- If the field in the oxide, $E_{ox}$, is large enough, the voltage drop across the depletion layer suffices to enable tunneling in the drain via a near-surface trap.
- The minority carriers emitted to the incipient inversion layer are laterally removed to the substrate, completing a path for a gate-induced drain leakage (GIDL) current. In CMOS circuits this leakage current contributes to standby power.
2.3 Power Dissipation in CMOS
- The first ICs ever fabricated used a PMOS process, because a p-channel enhancement-mode MOS field-effect transistor (PMOST) with threshold voltage $V_{Tp} < 0$ is simple to fabricate.
- The higher mobility of electrons compared with holes then motivated the move to the NMOS process.
- The power dissipation problem of NMOS circuits in turn motivated the change to CMOS.
- This power advantage of CMOS over NMOS has proven to be important enough that the shortcomings of CMOS are overlooked.
- The shortcomings: the CMOS process is more complex than the NMOS process, CMOS requires guard rings to get around the latch-up problem, and CMOS circuits require more transistors than the equivalent NMOS circuits.
- The threshold voltages place a limit on the minimum supply voltage that can be used without incurring unreasonable delay penalties.
- If the threshold voltage is too low, the static component of the power due to subthreshold currents becomes significant.
2.3.1 Short-Circuit Dissipation
- The short-circuit dissipation of the gate varies with the output load and the input signal slope.
- The short-circuit dissipation decreases roughly linearly, both in absolute terms and as a fraction of the total dissipation, as the output load is increased up to a critical value; beyond that value it increases again rapidly.
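- For simplicity, a symmetrical inverter (i.e., $\beta_N = \beta_P$ and $V_{Tn} = -V_{Tp}$) and a symmetrical input signal (rise time = fall time) are considered.
- $I = \dfrac{\beta}{2}(V_{in} - V_T)^2$ for $0 \le I \le I_{max}$
- $I_{mean} = \dfrac{1}{T}\int_0^T I(t)\,dt = 2 \cdot \dfrac{2}{T}\int_{t_1}^{t_2} \dfrac{\beta}{2}\,(V_{in}(t) - V_T)^2\,dt$
- Assuming the rising and falling portions of the input voltage waveform to be linear ramps of duration $\tau$, $V_{in}(t) = \dfrac{V_{DD}}{\tau}\,t$
- $I_{mean} = 2 \cdot \dfrac{2}{T}\int_{(V_T/V_{DD})\tau}^{\tau/2} \dfrac{\beta}{2}\left(\dfrac{V_{DD}}{\tau}\,t - V_T\right)^2 dt$
- Let $\xi = \dfrac{V_{DD}}{\tau}\,t - V_T$, so that $dt = \dfrac{\tau}{V_{DD}}\,d\xi$
- $I_{mean} = \dfrac{2\beta}{T}\,\dfrac{\tau}{V_{DD}}\int_0^{V_{DD}/2 - V_T} \xi^2\,d\xi$
- $I_{mean} = \dfrac{1}{12}\,\dfrac{\beta}{V_{DD}}\,(V_{DD} - 2V_T)^3\,\dfrac{\tau}{T}$
- The short-circuit power dissipation of an unloaded inverter is $P_{SC} = V_{DD}\,I_{mean} = \dfrac{\beta}{12}\,(V_{DD} - 2V_T)^3\,\dfrac{\tau}{T}$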
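The closed-form result for the unloaded symmetrical inverter, $P_{SC} = (\beta/12)(V_{DD} - 2V_T)^3\,\tau/T$, can be cross-checked by integrating the saturation current over the input ramp numerically. The sketch below uses a simple midpoint rule; the function names, the clamp to zero for $V_{DD} \le 2V_T$, and the example values are assumptions made for illustration.

```python
def short_circuit_power(beta, v_dd, v_t, tau, period):
    """Closed form P_SC = (beta/12) (V_DD - 2 V_T)^3 tau / T for an unloaded inverter."""
    if v_dd <= 2.0 * v_t:
        # Assumption: with V_DD <= 2 V_T both devices never conduct together.
        return 0.0
    return beta / 12.0 * (v_dd - 2.0 * v_t) ** 3 * tau / period


def mean_current_numeric(beta, v_dd, v_t, tau, period, steps=20000):
    """Midpoint-rule evaluation of I_mean = (4/T) * integral of (beta/2)(V_in - V_T)^2 dt."""
    t1 = v_t / v_dd * tau   # input ramp reaches V_T
    t2 = tau / 2.0          # input ramp reaches V_DD / 2
    dt = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        t = t1 + (k + 0.5) * dt
        v_in = v_dd / tau * t
        total += 0.5 * beta * (v_in - v_t) ** 2 * dt
    return 4.0 / period * total


# Illustrative values: beta = 100 uA/V^2, 3.3 V supply, 0.7 V threshold,
# 1 ns input ramps, 10 ns period.
p_closed = short_circuit_power(1e-4, 3.3, 0.7, 1e-9, 1e-8)
p_numeric = 3.3 * mean_current_numeric(1e-4, 3.3, 0.7, 1e-9, 1e-8)
```

The cubic dependence on $V_{DD} - 2V_T$ means that the short-circuit term falls off quickly as the supply approaches the sum of the two threshold magnitudes.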
- If the inverter is lightly loaded, causing output rise and fall times that are relatively shorter than the input rise and fall times, the short-circuit dissipation increases to become comparable to dynamic dissipation.
- To minimize dissipation, an inverter should be designed so that the input rise and fall times are about equal to the output rise and fall times.
2.3.2 Dynamic Dissipation
- Assuming that the input $V_{in}$ is a square wave having a period $T$ and that the rise and fall times of the input are much less than the repetition period, the dynamic dissipation is given by
- $P_D = C_L V_{DD}^2 / T$
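- When the output is charged to $V = V_{DD}$, the energy drawn from the supply for the 0→1 transition is $E_{0\to1} = C_L V_{DD}^2$.
- Since the energy stored in a capacitor with capacitance $C_L$ and voltage $V_{DD}$ across its plates is $C_L V_{DD}^2/2$, the rest of the energy, another $C_L V_{DD}^2/2$, is converted into heat.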
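The even split between stored energy and heat can be checked numerically. The sketch below charges $C_L$ through a fixed resistance, which is a simplifying assumption standing in for the pull-up transistor (a real device is nonlinear); the function name and parameter values are illustrative.

```python
import math


def charging_energy_split(c_load, v_dd, r_on, steps=20000, n_tau=20):
    """Integrate an RC charging transient over n_tau time constants.

    Returns (energy drawn from supply, energy stored on C, energy dissipated in R).
    """
    tau = r_on * c_load
    dt = n_tau * tau / steps
    e_supply = 0.0
    e_heat = 0.0
    charge = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        i = v_dd / r_on * math.exp(-t / tau)  # charging current at time t
        e_supply += v_dd * i * dt             # energy delivered by the supply
        e_heat += i * i * r_on * dt           # energy dissipated in the resistance
        charge += i * dt                      # charge accumulated on the capacitor
    v_cap = charge / c_load
    e_stored = 0.5 * c_load * v_cap ** 2
    return e_supply, e_stored, e_heat


e_supply, e_stored, e_heat = charging_energy_split(1e-12, 1.0, 1e3)
```

Repeating the run with a different `r_on` changes how long the transient takes but not the energy split, which is why the dynamic dissipation does not depend on the transistor resistance.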
Networks of pass transistors
2.3.3 The Load Capacitance
- The overall load capacitance is modeled as the parallel combination of four capacitances: the gate capacitance $C_g$, the overlap capacitance $C_{ov}$, the diffusion capacitance $C_{diff}$, and the interconnect capacitance $C_{int}$.
2.3.3.2 The Overlap Capacitance
- $C_{gd1} = C_{gd2} = 2\,C_{ox}\,x_d\,W$
- $C_{gd3} = C_{gd4} = C_{gs3} = C_{gs4} = C_{ox}\,x_d\,W$
- The total overlap capacitance is simply the sum of all the above:
- $C_{ov} = C_{gd1} + C_{gd2} + C_{gd3} + C_{gd4} + C_{gs3} + C_{gs4}$
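Summing the six overlap terms is simple bookkeeping. The sketch below assumes, as the slide's notation suggests, that all transistors share the same width $W$ and overlap length $x_d$; with that assumption the total reduces to $8\,C_{ox}\,x_d\,W$. The function name and example values are illustrative.

```python
def overlap_capacitance(c_ox, x_d, width):
    """Total overlap capacitance C_ov for equal-width devices (an assumption).

    c_ox: gate oxide capacitance per unit area (F/m^2)
    x_d:  gate-to-diffusion overlap length (m)
    width: transistor width W (m)
    """
    unit = c_ox * x_d * width
    c_gd1 = c_gd2 = 2.0 * unit                 # driver gate-drain terms
    c_gd3 = c_gd4 = c_gs3 = c_gs4 = unit       # fan-out gate overlap terms
    return c_gd1 + c_gd2 + c_gd3 + c_gd4 + c_gs3 + c_gs4


# Illustrative values only.
c_ov = overlap_capacitance(6.9e-3, 50e-9, 1e-6)
```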
2.3.3.3 Diffusion Capacitance
- Two components: the bottomwall area capacitance and the sidewall capacitance
2.4.1 Principles of Low-Power Design
- Using the lowest possible supply voltage
- Using the smallest geometry, highest frequency devices but operating them at the lowest possible frequency
- Using parallelism and pipelining to lower the required frequency of operation
- Power management by disconnecting the power source when the system is idle
- Designing systems to have the lowest requirements on subsystem performance for the given user-level functionality
2.4.3 Fundamental Limits
- The limit from thermodynamic principles results from the need to have, at any node with an equivalent resistor $R$ to ground, the signal power $P_s$ exceed the available noise power $P_{avail}$.
- The quantum theoretic limit on low power comes from the Heisenberg uncertainty principle. In order to be able to measure the effect of a switching transition of duration $\Delta t$, it must involve an energy greater than $h/\Delta t$:
- $P \ge h/(\Delta t)^2$, where $h$ is Planck's constant.
- Finally, the fundamental limit based on electromagnetic theory results in the velocity of propagation of a high-speed pulse on an interconnect always being less than the speed of light in free space, $c_0$:
- $L/\tau \le c_0$, where $L$ is the length of the interconnect and $\tau$ is the interconnect transit time.
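The quantum and electromagnetic limits are easy to put numbers on. The sketch below evaluates $P \ge h/(\Delta t)^2$ and the minimum transit time $\tau \ge L/c_0$; the function names and the example switching time and wire length are illustrative assumptions.

```python
H_PLANCK = 6.62607015e-34  # Planck's constant, J*s
C_0 = 299792458.0          # speed of light in free space, m/s


def quantum_power_limit(delta_t):
    """Lower bound on switching power from the uncertainty principle: h / delta_t^2."""
    return H_PLANCK / delta_t ** 2


def min_transit_time(length):
    """Lower bound on interconnect transit time: L / c0."""
    return length / C_0


# Illustrative values: a 1 ps switching transition and a 1 cm interconnect.
p_min = quantum_power_limit(1e-12)
tau_min = min_transit_time(1e-2)
```

For these example inputs the quantum bound is far below the power of practical gates, while the transit-time bound for a centimeter-scale wire is already tens of picoseconds.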
2.4.4 Material Limits
- The attributes of a semiconductor material that determine the properties of a device built with the material are:
- Carrier mobility $\mu$
- Carrier saturation velocity $\sigma_s$
- Self-ionizing electric field strength $E_c$
- Thermal conductivity $K$
- Consider an SOI structure formed by surrounding the generic device with a hemispherical shell of SiO2 of radius $r_i$, which implies a two-order-of-magnitude reduction in thermal conductivity.
- The response time of the global interconnect circuit is
- $\tau = (2.3\,R_{tr} + R_{int})\,C_{int}$, where $R_{tr}$ is the output resistance of the driving transistor and $R_{int}$ and $C_{int}$ are the total resistance and capacitance, respectively, of the global interconnect.
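The interconnect response time $\tau = (2.3\,R_{tr} + R_{int})\,C_{int}$ is a one-line calculation. The sketch below assumes SI units; the function name and the example resistances and capacitance are illustrative.

```python
def interconnect_response_time(r_tr, r_int, c_int):
    """Response time of a global interconnect: (2.3 R_tr + R_int) C_int, in seconds."""
    return (2.3 * r_tr + r_int) * c_int


# Illustrative values: 1 kOhm driver, 500 Ohm wire, 1 pF wire capacitance.
tau = interconnect_response_time(1e3, 500.0, 1e-12)
```

When $R_{int}$ is small compared with $2.3\,R_{tr}$ the driver dominates the delay, so a stronger driver helps; when the wire resistance dominates, it does not.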
2.4.7 System Limits
- The architecture of the chip
- The power-delay product of the CMOS technology used to implement the chip
- The heat removal capacity of the chip package
- The clock frequency
- Its physical size
Energy characterization
- Transition-sensitive energy models
- Single energy tables
- Bit-independent modules, e.g., flip-flops
- Multiple energy tables
- Large bit-dependent modules, e.g., 32-b adders
- Large multi-element modules, e.g., register files
- Transition-sensitive energy equations
- System-level interconnect capacitance values
- Analytical energy models
- Cache and main memory
Transition-sensitive energy model
- Must first design and lay out a functional unit and then simulate it to capture switch capacitances
- Bit-independent: bus lines, pipeline registers
- One bit switching does not affect other bit slices' operations
- Bit-dependent: ALU, decoders
- Once constructed, the models can be reused in simulations of other architectures built with the same technology
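A transition-sensitive model can be pictured as a lookup table keyed by the previous and next input vectors of a module. The sketch below is a toy illustration, not the characterization flow of any particular tool: `toy_switch_capacitance` is a hypothetical stand-in for the circuit simulation step, and the 10 fF per toggled bit is an invented value.

```python
from itertools import product


def toy_switch_capacitance(prev_bits, next_bits):
    """Hypothetical stand-in for circuit simulation: 10 fF per toggled input bit."""
    toggled = bin(prev_bits ^ next_bits).count("1")
    return toggled * 10e-15


def build_table(n_bits, characterize):
    """Characterize every (previous, next) input pair once and store the result."""
    vectors = range(1 << n_bits)
    return {(p, n): characterize(p, n) for p, n in product(vectors, repeat=2)}


def switched_capacitance(table, trace):
    """Sum the table entries over consecutive input vectors of a simulation trace."""
    return sum(table[(p, n)] for p, n in zip(trace, trace[1:]))


table = build_table(2, toy_switch_capacitance)
total = switched_capacitance(table, [0b00, 0b11, 0b10, 0b10])
```

Once the table exists, an architectural simulator only performs lookups, which is why the characterized model can be reused across architectures built in the same technology. The number of table rows grows exponentially with the number of input bits, so a full table is only practical for small modules.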
Switch Capacitance Table
Table Compression
- Problem
- Results in a large uncompressed table (e.g., 16-bit adder → $2^{32}$ rows)
- Excessive simulation (e.g., $2^{32}$!)
- Solution
- Clustering algorithm. Reference: Huzefa Mehta, et al., "Module Energy Characterization using Clustering," DAC '96
- For a 16-bit adder, to keep 12% average error → 1000 simulation points, 97 rows
2:1 Multiplexer Table
Memory System Energy Model
- Parameterizable analytical energy models for the on-chip memories that capture:
- Energy dissipated by bitlines: precharge, read and write cycles
- Energy dissipated by wordlines when a particular row is being read and written
- Energy dissipated by storage cell on access
- Energy dissipated by address decoders
- Energy dissipated by peripheral circuits: cache control logic, comparators, etc.
- Off-chip main memory energy is based on per-access cost
Cache energy model example
- On-chip cache
- Energy = Ebus + Ecell + Epad
- Ecell = β × (Wl_length) × (Bl_length + 4.8) × (Nhit + 2 × Nmiss)
- Wl_length = m × (T + 8L + St)
- Bl_length = C / (m × L)
- Nhit = number of hits; Nmiss = number of misses
- C = cache size; L = cache line size in bytes
- m = set associativity; T = tag size in bits
- St = number of status bits per line
- β = 1.44e-14 (technology-based cell access cost of SRAM)
- Em = 4.95e-9 (technology-based access cost of DRAM)
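The cell-array term of the cache model is straightforward to evaluate for a given cache configuration and hit/miss count. The sketch below implements only Ecell as written on the slide; the function name, the example cache parameters, and the hit/miss counts are illustrative assumptions, and the units of the result follow whatever units the technology constant β carries.

```python
BETA_SRAM = 1.44e-14  # technology-based SRAM cell access cost from the slide


def cache_cell_energy(cache_size, line_size, assoc, tag_bits, status_bits,
                      n_hit, n_miss, beta=BETA_SRAM):
    """Ecell = beta * Wl_length * (Bl_length + 4.8) * (Nhit + 2 * Nmiss)."""
    wl_length = assoc * (tag_bits + 8 * line_size + status_bits)  # word line length
    bl_length = cache_size / (assoc * line_size)                  # bit line length
    return beta * wl_length * (bl_length + 4.8) * (n_hit + 2 * n_miss)


# Illustrative configuration: 8 KB, 2-way, 32-byte lines, 20 tag bits, 2 status bits.
e_cell = cache_cell_energy(8192, 32, 2, 20, 2, n_hit=900, n_miss=100)
```

A miss is weighted twice as heavily as a hit in this expression, so for a fixed number of accesses the cell energy grows with the miss rate.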
Architectural Level Analysis Considerations
- Very computationally efficient
- Requires predefined analytical and transition-sensitive energy characterization models
- Requires design only to RTL (with some idea as to the kind of functional units planned)
- Coarse-grain use of gated clocks implicit
- Reasonably accurate (within 5% to 15% of SPICE)
- Simulation based, so it can be used to support architectural, compiler, OS, and application level experimentation
- WattWatcher (Sente), DesignPower and PowerCompiler (Synopsys), prototype academic tools (Wattch: Princeton, SimplePower: PSU)