2. Physics of Power Dissipation in CMOS FET Devices

Provided by: NTU

1
2. Physics of Power Dissipation in CMOS FET
Devices
2

For an ideal MIS diode, the energy difference $\phi_{ms}$
between the metal work function $\phi_m$ and the
semiconductor work function $\phi_s$ is zero:
$\phi_{ms} = \phi_m - (\chi + E_g/2q + \psi_B) = 0$ (2.1)
where $\chi$ is the semiconductor electron affinity
(from conduction band to vacuum level), $E_g$ the
band gap (from valence band to conduction band),
$\phi_B$ the potential barrier between the metal and
the insulator, and $\psi_B$ the potential difference
between the Fermi level $E_F$ and the intrinsic
Fermi level $E_i$.

3
The Fermi-Dirac Function

$f_{FD}(E) = 1 / (1 + \exp((E - E_F)/kT))$
The Fermi-Dirac distribution function gives the
probability that a certain energy state will be
occupied by an electron.
As in a gas, the electrons in a solid are in
constant motion and consequently changing their
energy and momentum.
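
As a minimal sketch of the Fermi-Dirac function above, the following Python evaluates the occupation probability for energies given in eV. The function name, the default temperature of 300 K, and the overflow guard are illustrative assumptions, not part of the slides.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def fermi_dirac(energy_ev, fermi_level_ev, temperature_k=300.0):
    """Probability that a state at energy_ev is occupied by an electron."""
    x = (energy_ev - fermi_level_ev) / (BOLTZMANN_EV * temperature_k)
    # Assumed guard: exp() overflows for very large arguments, and the
    # occupation probability is numerically zero there anyway.
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


# A state exactly at the Fermi level is occupied with probability 1/2.
print(fermi_dirac(0.56, 0.56))
```

States a few kT above $E_F$ are almost empty and states a few kT below are almost full, which is why the position of $E_F$ relative to the band edges controls the carrier concentrations.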

4
P-type
5
CMOS Gate Power equations

$P = C_L V_{DD}^2 f_{0\to1} + t_{sc} V_{DD} I_{peak} f_{0\to1} + V_{DD} I_{leakage}$
Dynamic term: $C_L V_{DD}^2 f_{0\to1}$
Short-circuit term: $t_{sc} V_{DD} I_{peak} f_{0\to1}$
Leakage term: $V_{DD} I_{leakage}$
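
The three terms of the gate power equation can be evaluated separately, which makes it easy to see which one dominates for a given operating point. The sketch below is illustrative only; the function name, the dictionary layout, and the sample values are assumptions.

```python
def cmos_gate_power(c_load, vdd, f_switch, t_sc, i_peak, i_leakage):
    """Split gate power into dynamic, short-circuit and leakage terms.

    c_load    load capacitance C_L in farads
    vdd       supply voltage V_DD in volts
    f_switch  0->1 transition frequency f_{0->1} in hertz
    t_sc      short-circuit conduction time per transition in seconds
    i_peak    peak short-circuit current in amperes
    i_leakage leakage current in amperes
    """
    dynamic = c_load * vdd ** 2 * f_switch
    short_circuit = t_sc * vdd * i_peak * f_switch
    leakage = vdd * i_leakage
    return {
        "dynamic": dynamic,
        "short_circuit": short_circuit,
        "leakage": leakage,
        "total": dynamic + short_circuit + leakage,
    }


# Hypothetical operating point: 50 fF load, 3.3 V supply, 100 MHz switching.
terms = cmos_gate_power(50e-15, 3.3, 100e6, 100e-12, 50e-6, 1e-9)
for name, watts in terms.items():
    print(f"{name:14s} {watts:.3e} W")
```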

The Maxwell-Boltzmann statistics relates the
equilibrium hole concentration to the intrinsic
Fermi level
$p_0 = n_i \exp((E_i - E_F)/kT)$ (2.2)
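
Equation (2.2) can be inverted to obtain the bulk potential $\psi_B = (E_i - E_F)/q$ from the doping level. The sketch below assumes complete ionization, so that $p_0 \approx N_A$, and an intrinsic concentration of $10^{10}$ cm$^{-3}$ for silicon at 300 K; both are assumptions added here, as are the function names.

```python
import math

K_OVER_Q = 8.617333262e-5  # k/q in volts per kelvin


def bulk_potential(n_a, n_i=1.0e10, temperature_k=300.0):
    """psi_B in volts from (2.2), assuming p0 = N_A (same units as n_i)."""
    return K_OVER_Q * temperature_k * math.log(n_a / n_i)


def hole_concentration(psi_b, n_i=1.0e10, temperature_k=300.0):
    """Equilibrium hole concentration p0 from (2.2), with E_i - E_F = q*psi_B."""
    return n_i * math.exp(psi_b / (K_OVER_Q * temperature_k))


psi_b = bulk_potential(1.0e16)  # N_A = 1e16 cm^-3, a hypothetical doping
print(f"psi_B = {psi_b:.3f} V")
```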

7
P substrate (The Fermi level EF in the
semiconductor is now qV below the Fermi level in
the metal gate.)
8
P substrate
9

If the applied voltage is increased sufficiently,
the bands bend far enough that level Ei at the
surface crosses over to the other side of level
EF.
This is brought about by the tendency of carriers
to occupy states with the lowest total energy.
In the present condition of inversion the level
Ei bends to be closer to level Ec and electrons
outnumber holes at the surface.

10
$E_i$ at the surface now is below $E_F$ by an amount of
energy equal to $2\psi_B$, where $\psi_B$ is the potential
difference between the Fermi level $E_F$ and the
intrinsic Fermi level $E_i$ in the bulk.
11

The value of V necessary to reach the onset of
strong inversion is called the threshold voltage.

12
Surface Space Charge Region and the Threshold
Voltage

Poisson equation
$\nabla \cdot \mathbf{D} = \rho(x, y, z)$ (2.3)
where $\mathbf{D}$, the electric displacement vector, is
equal to $\varepsilon_s \mathbf{E}$ under low-frequency or static
conditions; $\varepsilon_s$ is the permittivity of Si, $\mathbf{E}$ the
electric field vector, and $\rho(x, y, z)$ the total
electric charge density.

14
Threshold voltage

$V_T = (2d/\varepsilon_i) \left( q \varepsilon_s N_A \psi_B (1 - e^{-2\beta\psi_B}) \right)^{0.5} + 2\psi_B$
The total voltage needed to offset the effect of
nonzero work function difference and the presence
of the charges is referred to as the flat-band
voltage $V_{FB}$.
$V_{FB} = \phi_{ms} - Q_T d/\varepsilon_i$

15
Threshold voltage

$V_T = (2d/\varepsilon_i) \left( q \varepsilon_s N_A \psi_B (1 - e^{-2\beta\psi_B}) \right)^{0.5} + 2\psi_B + V_{FB}$
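
To get a feel for the magnitudes in the threshold voltage expression, the following sketch evaluates it in SI units. It is not taken from the slides: it assumes $\beta = q/kT$, a Si/SiO2 system with relative permittivities 11.7 and 3.9, $\psi_B = (kT/q)\ln(N_A/n_i)$ with $n_i = 10^{16}$ m$^{-3}$, and a flat-band voltage supplied by the caller.

```python
import math

Q = 1.602176634e-19      # elementary charge, C
K_B = 1.380649e-23       # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
EPS_S = 11.7 * EPS0      # assumed semiconductor (Si) permittivity
EPS_I = 3.9 * EPS0       # assumed insulator (SiO2) permittivity


def threshold_voltage(n_a, d, v_fb=0.0, n_i=1.0e16, temperature_k=300.0):
    """V_T in volts for acceptor density n_a (m^-3) and insulator thickness d (m)."""
    v_thermal = K_B * temperature_k / Q
    beta = 1.0 / v_thermal                   # assumed meaning of beta: q/kT
    psi_b = v_thermal * math.log(n_a / n_i)  # bulk potential
    charge_term = math.sqrt(
        Q * EPS_S * n_a * psi_b * (1.0 - math.exp(-2.0 * beta * psi_b))
    )
    return (2.0 * d / EPS_I) * charge_term + 2.0 * psi_b + v_fb


# Hypothetical device: N_A = 1e17 cm^-3 (1e23 m^-3), 5 nm oxide, V_FB = -0.9 V.
print(f"V_T = {threshold_voltage(1.0e23, 5.0e-9, v_fb=-0.9):.3f} V")
```

A thicker insulator or heavier doping raises the first term, while a negative flat-band voltage shifts the whole result down by the same amount.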

17
2.2.3.1 Effects Influencing Threshold Voltage

VT decreases when L (length) is decreased, varies
with Z (width), and decreases when the
drain-source voltage VDS is increased.

Drain-induced barrier lowering (DIBL) is the
basis for a number of more complex models of the
threshold voltage shift.
It refers to the decrease in threshold voltage
due to the depletion region charges in the
potential barrier between the source and the
channel at the semiconductor surface.

A recent model adopts a quasi-two-dimensional
approach to solving the two-dimensional Poisson
equation.
dEx/dx at each point (x, y) can be replaced with
the average of its value at (0, y) and at (W, y).

20
Short channel effect

The minimum value of the surface potential
increases with decreasing channel length and
increasing VDS.

21
2.2.3.2 Subsurface Drain-Induced Barrier Lowering
(Punchthrough)

The punchthrough voltage $V_{PT}$ is defined as the value
of $V_{DS}$ at which $I_{D,st}$ reaches some specific
magnitude with $V_{GS} = 0$.
The parameter VPT can be roughly approximated as
the value of VDS for which the sum of the widths
of the source and the drain depletion regions
becomes equal to L.
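
The rough approximation above can be turned into a number under additional assumptions that are not in the slides: abrupt one-sided n+/p junctions whose depletion regions extend only into the substrate, a grounded source, and the textbook width $W(V) = \sqrt{2\varepsilon_s (V_{bi} + V)/(q N_A)}$. The built-in potential of 0.9 V and all names below are illustrative.

```python
import math

Q = 1.602176634e-19              # elementary charge, C
EPS_S = 11.7 * 8.8541878128e-12  # assumed Si permittivity, F/m


def depletion_width(reverse_bias, n_a, v_bi=0.9):
    """One-sided abrupt-junction depletion width in metres (assumed model)."""
    return math.sqrt(2.0 * EPS_S * (v_bi + reverse_bias) / (Q * n_a))


def punchthrough_voltage(channel_length, n_a, v_bi=0.9):
    """V_DS at which source and drain depletion widths sum to the channel length."""
    w_source = depletion_width(0.0, n_a, v_bi)
    w_drain_needed = channel_length - w_source
    if w_drain_needed <= w_source:
        # The regions already touch at V_DS = 0 in this simple model.
        return 0.0
    return Q * n_a * w_drain_needed ** 2 / (2.0 * EPS_S) - v_bi


# Hypothetical device: L = 0.5 um, N_A = 1e17 cm^-3 (1e23 m^-3).
print(f"V_PT ~ {punchthrough_voltage(0.5e-6, 1.0e23):.1f} V")
```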

23

If the field in the oxide, Eox, is large enough,
the voltage drop across the depletion layer
suffices to enable tunneling in the drain via a
near-surface trap.
The minority carriers emitted to the incipient
inversion layer are laterally removed to the
substrate, completing a path for a gate-induced
drain leakage (GIDL) current. In CMOS circuits
this leakage current contributes to standby power.

24
2.3 Power Dissipation in CMOS

The first ICs ever fabricated used a PMOS
process. This is due to the simplicity of
fabrication of a p-channel enhancement mode MOS
field-effect transistor (PMOST) with threshold
voltage $V_{Tp} < 0$.
The higher mobility of electrons compared with
holes caused the move to the NMOS process.
The industry then changed to CMOS because of the
power dissipation problem.

This advantage of CMOS over NMOS has proven to be
important enough that the shortcomings of CMOS
are overlooked.
The CMOS process is more complex than the NMOS
process, CMOS requires the use of guard rings to
get around the latch-up problem, and CMOS circuits
require more transistors than the equivalent NMOS
circuits.

27

The threshold voltages place a limit on the
minimum supply voltage that can be used without
incurring unreasonable delay penalties.
If the threshold voltage is too low, the static
component of the power due to subthreshold
currents becomes significant.

29
2.3.1 Short-Circuit Dissipation

The short-circuit dissipation of the gate varies
with the output load and the input signal slope.
The short-circuit dissipation decreases roughly
linearly, both in absolute terms and as a fraction
of the total dissipation, as the output load is
increased up to a critical value, beyond which it
increases again rapidly.

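For simplicity a symmetrical inverter (i.e., $\beta_N = \beta_P$
and $V_{Tn} = -V_{Tp}$) and a symmetrical input
signal (rise time = fall time) are considered.
$I = (\beta/2)(V_{in} - V_T)^2$ for $0 \le I \le I_{max}$
$I_{mean} = (1/T) \int_0^T I(t)\,dt$
$= 2 \cdot (2/T) \int_{t_1}^{t_2} (\beta/2)(V_{in}(t) - V_T)^2\,dt$

Assuming the rising and falling portions of the
input voltage waveform to be linear ramps,
$V_{in}(t) = (V_{DD}/\tau)\,t$
$I_{mean} = 2 \cdot (2/T) \int_{(V_T/V_{DD})\tau}^{\tau/2} (\beta/2)((V_{DD}/\tau)\,t - V_T)^2\,dt$
Let $\xi = (V_{DD}/\tau)\,t - V_T$, so that $dt = (\tau/V_{DD})\,d\xi$.

$I_{mean} = (2\beta/T)(\tau/V_{DD}) \int_0^{V_{DD}/2 - V_T} \xi^2\,d\xi$
$I_{mean} = (1/12)(\beta/V_{DD})(V_{DD} - 2V_T)^3\,\tau/T$
The short-circuit power dissipation of an
unloaded inverter is
$P_{SC} = V_{DD} I_{mean} = (\beta/12)(V_{DD} - 2V_T)^3\,\tau/T$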
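
As a sanity check on the closed form, the mean-current integral for a linear input ramp can be evaluated numerically and compared with $(\beta/12)(V_{DD} - 2V_T)^3\,\tau/T$. This is a sketch under the same assumptions as the derivation (symmetrical inverter, linear ramps of duration $\tau$, unloaded output); the function names and sample values are hypothetical.

```python
def short_circuit_power_closed_form(beta, vdd, vt, tau, period):
    """P_SC = (beta/12) * (V_DD - 2*V_T)^3 * tau / T."""
    return beta / 12.0 * (vdd - 2.0 * vt) ** 3 * tau / period


def short_circuit_power_numeric(beta, vdd, vt, tau, period, steps=10_000):
    """Midpoint-rule evaluation of V_DD * I_mean for a linear input ramp."""
    t1 = vt / vdd * tau  # ramp reaches V_T: short-circuit current starts
    t2 = tau / 2.0       # ramp reaches V_DD/2: current peaks (symmetry point)
    dt = (t2 - t1) / steps
    integral = 0.0
    for i in range(steps):
        t = t1 + (i + 0.5) * dt
        v_in = vdd * t / tau
        integral += 0.5 * beta * (v_in - vt) ** 2 * dt
    i_mean = 2.0 * (2.0 / period) * integral
    return vdd * i_mean


# Hypothetical values: beta = 100 uA/V^2, 3.3 V supply, V_T = 0.7 V,
# 1 ns input edges, 10 ns period.
args = (100e-6, 3.3, 0.7, 1e-9, 10e-9)
print(short_circuit_power_closed_form(*args))
print(short_circuit_power_numeric(*args))
```

The cubic dependence on $V_{DD} - 2V_T$ means the short-circuit term vanishes when the supply drops to the sum of the two threshold magnitudes.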

If the inverter is lightly loaded, causing output
rise and fall times that are relatively shorter
than the input rise and fall times, the
short-circuit dissipation increases to become
comparable to dynamic dissipation.
To minimize dissipation, an inverter should be
designed in such a way so that the input rise and
fall times are about equal to the output rise and
fall times.

34
2.3.2 Dynamic Dissipation

Assuming that the input Vin is a square wave
having a period T and that the rise and fall
times of the input are much less than the
repetition period, the dynamic dissipation is
given by
$P_D = C_L V_{DD}^2 / T$

36

When $V = V_{DD}$, $E_{0\to1} = C_L V_{DD}^2$.
The energy stored in a capacitor with
capacitance $C_L$ and voltage $V_{DD}$ across its plates
is $C_L V_{DD}^2/2$; the rest of the energy, another
$C_L V_{DD}^2/2$, is converted into heat.
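
The equal split between stored energy and heat does not depend on the resistance of the charging path. The sketch below illustrates this by modelling the pull-up as a fixed resistor, which is a simplifying assumption (a real PMOS device is nonlinear), and integrating the charging current numerically; all names are hypothetical.

```python
import math


def charging_energy_split(c_load, vdd, resistance, steps=50_000, time_constants=20.0):
    """Integrate an RC charging transient from 0 V towards V_DD.

    Returns (energy drawn from supply, energy dissipated in the resistor,
    energy left on the capacitor), all in joules.
    """
    tau = resistance * c_load
    dt = time_constants * tau / steps
    e_supply = 0.0
    e_resistor = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        current = (vdd / resistance) * math.exp(-t / tau)
        e_supply += vdd * current * dt
        e_resistor += current ** 2 * resistance * dt
    return e_supply, e_resistor, e_supply - e_resistor


# Two very different path resistances give the same energy split.
for r in (100.0, 10_000.0):
    print(r, charging_energy_split(1e-12, 3.3, r))
```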

37
Networks of pass transistors
39
2.3.3 The Load Capacitance
41

The overall load capacitance is modeled as the
parallel combination of 4 capacitors:
the gate capacitance $C_g$,
the overlap capacitance $C_{ov}$,
the diffusion capacitance $C_{diff}$,
and the interconnect capacitance $C_{int}$.

43
2.3.3.2 The Overlap Capacitance

$C_{gd1} = C_{gd2} = 2 C_{ox} x_d W$
$C_{gd3} = C_{gd4} = C_{gs3} = C_{gs4} = C_{ox} x_d W$
The total overlap capacitance is simply the sum
of all the above:
$C_{ov} = C_{gd1} + C_{gd2} + C_{gd3} + C_{gd4} + C_{gs3} + C_{gs4}$
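
A small sketch of how the overlap terms add up, and how they enter the parallel combination that forms the total load. It follows the slide in using one width $W$ for every transistor, which is a simplification; the function names are hypothetical.

```python
def overlap_capacitance(c_ox, x_d, width):
    """Total overlap capacitance C_ov for the six terms listed above.

    c_ox  gate-oxide capacitance per unit area (F/m^2)
    x_d   overlap length (m)
    width transistor width W (m), assumed equal for all devices
    """
    unit = c_ox * x_d * width
    c_gd1 = c_gd2 = 2.0 * unit               # driving-stage gate-drain terms
    c_gd3 = c_gd4 = c_gs3 = c_gs4 = unit     # load-stage overlap terms
    return c_gd1 + c_gd2 + c_gd3 + c_gd4 + c_gs3 + c_gs4


def load_capacitance(c_g, c_ov, c_diff, c_int):
    """Capacitors in parallel simply add."""
    return c_g + c_ov + c_diff + c_int


print(overlap_capacitance(3.45e-3, 50e-9, 1e-6))  # hypothetical process values
```

With equal widths the six terms collapse to $8\,C_{ox} x_d W$.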

44
2.3.3.3 Diffusion Capacitance

Two components: the bottom-wall area capacitance
and the sidewall capacitance

45
2.4.1 Principles of Low-Power Design

Using the lowest possible supply voltage
Using the smallest geometry, highest frequency
devices but operating them at the lowest possible
frequency
Using parallelism and pipelining to lower
required frequency of operation
Power management by disconnecting the power
source when the system is idle
Designing systems to have lowest requirements on
subsystem performance for the given user level
functionality

46
2.4.3 Fundamental Limits

The limit from thermodynamic principles results
from the need to have, at any node with an
equivalent resistor R to the ground, the signal
power Ps exceed the available noise power Pavail.
The quantum theoretic limit on low power comes
from the Heisenberg uncertainty principle. In
order to be able to measure the effect of a
switching transition of duration $\Delta t$, it must
involve an energy greater than $h/\Delta t$:
$P \ge h/(\Delta t)^2$, where $h$ is Planck's constant.

Finally the fundamental limit based on
electromagnetic theory results in the velocity of
propagation of a high-speed pulse on an
interconnect to be always less than the speed of
light in free space, c0
$L/\tau \le c_0$, where $L$ is the length of the interconnect
and $\tau$ is the interconnect transit time.
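
To show how far these two bounds sit from practical operating points, the sketch below evaluates the quantum limit $P \ge h/(\Delta t)^2$ and the minimum transit time $\tau \ge L/c_0$. The function names and the sample values are illustrative assumptions.

```python
PLANCK = 6.62607015e-34  # Planck's constant, J*s
C0 = 299_792_458.0       # speed of light in free space, m/s


def quantum_power_limit(delta_t):
    """Lower bound on switching power (W) for a transition of duration delta_t (s)."""
    return PLANCK / delta_t ** 2


def min_transit_time(length):
    """Lower bound on transit time (s) for an interconnect of the given length (m)."""
    return length / C0


print(quantum_power_limit(1e-9))  # bound for a 1 ns transition
print(min_transit_time(0.01))     # bound for a 1 cm interconnect
```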

48
2.4.4 Material Limits

The attributes of a semiconductor material that
determine the properties of a device built with
the material are
Carrier mobility µ
Carrier saturation velocity $\sigma_s$
Self-ionizing electric field strength Ec
Thermal conductivity K

Consider an SOI structure formed by surrounding
the generic device with a hemispherical shell of
SiO2 of radius $r_i$, which represents a
two-order-of-magnitude reduction in thermal
conductivity.

The response time of the global interconnect
circuit is
$\tau = (2.3 R_{tr} + R_{int}) C_{int}$, where $R_{tr}$ is the
output resistance of the driving transistor and
$R_{int}$ and $C_{int}$ are the total resistance and
capacitance, respectively, of the global
interconnect.
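
The response-time expression is simple enough to evaluate directly. The sketch below does so for hypothetical driver and wire values; the function name and numbers are assumptions for illustration.

```python
def interconnect_response_time(r_tr, r_int, c_int):
    """tau = (2.3 * R_tr + R_int) * C_int, in seconds.

    r_tr  output resistance of the driving transistor (ohms)
    r_int total interconnect resistance (ohms)
    c_int total interconnect capacitance (farads)
    """
    return (2.3 * r_tr + r_int) * c_int


# Hypothetical global wire: 1 kOhm driver, 500 Ohm and 1 pF of wire.
print(interconnect_response_time(1000.0, 500.0, 1e-12))
```

For a strong driver (small $R_{tr}$) the wire's own $R_{int} C_{int}$ product sets the floor on the response time.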

51
2.4.7 System Limits

The architecture of the chip
The power-delay product of the CMOS technology
used to implement the chip
The heat removal capacity of the chip package
The clock frequency
Its physical size

52
Energy characterization

Transition-sensitive energy models
Single energy tables
Bit independent modules, e.g., flip-flops
Multiple energy tables
Large bit dependent modules, e.g., 32-b adders
Large multi-element modules, e.g., register files
Transition-sensitive energy equations
System level interconnect capacitance values
Analytical energy models
Cache and main memory

53
Transition-sensitive energy model

Must first design and layout a functional unit
and then simulate it to capture switch
capacitances
Bit independent: bus lines, pipeline registers
One bit switching does not affect other bit
slices' operations
Bit dependent: ALU, decoders
Once constructed, the models can be reused in
simulations of other architectures built with the
same technology

54
Switch Capacitance Table
55
Table Compression

Problem
Results in large uncompressed table (e.g., 16-bit
adder → $2^{32}$ rows)
Excessive simulation (e.g., $2^{32}$!)
Solution
Clustering algorithm. Reference: Huzefa Mehta, et
al., Module Energy Characterization using
Clustering, DAC '96
For 16-bit adder, to keep 12% average error →
1000 simulation points, 97 rows
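
The scale of the compression quoted above is easy to check with a few lines of arithmetic. The sketch assumes the $2^{32}$ figure corresponds to one row per combination of 32 input bits (two 16-bit operands), which is an interpretation rather than something the slide states; the function names are hypothetical.

```python
def full_table_rows(input_bits):
    """Rows in an uncompressed table with one row per input combination."""
    return 2 ** input_bits


def compression_ratio(input_bits, compressed_rows):
    """How many uncompressed rows each compressed row stands in for."""
    return full_table_rows(input_bits) / compressed_rows


rows = full_table_rows(32)
print(rows)                       # uncompressed rows for the 16-bit adder
print(compression_ratio(32, 97))  # reduction achieved by the 97-row table
```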

56
2:1 Multiplexer Table
61
Memory System Energy Model

Parameterizable analytical energy models for the
on-chip memories that capture
Energy dissipated by bitlines: precharge, read
and write cycles
Energy dissipated by wordlines when a particular
row is being read and written
Energy dissipated by storage cell on access
Energy dissipated by address decoders
Energy dissipated by peripheral circuits: cache
control logic, comparators, etc.
Off-chip main memory energy is based on
per-access cost

62
Cache energy model example

On-chip cache
$Energy = E_{bus} + E_{cell} + E_{pad}$
$E_{cell} = \beta \, (Wl\_length) \, (Bl\_length + 4.8) \, (N_{hit} + 2 N_{miss})$
$Wl\_length = m \, (T + 8L + St)$
$Bl\_length = C / (m L)$
$N_{hit}$ = number of hits, $N_{miss}$ = number of misses
$C$ = cache size, $L$ = cache line size in bytes
$m$ = set associativity, $T$ = tag size in bits
$St$ = # of status bits per line
$\beta$ = 1.44e-14 (technology based cell access cost
of SRAM)
$E_m$ = 4.95e-9 (technology based access cost of
DRAM)
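
The cell-array term of the cache model can be coded directly from the expressions above. This is a sketch only: the function names, the example cache geometry, and the access counts are hypothetical, and the bus and pad terms are left out because the slide gives no formulas for them.

```python
BETA = 1.44e-14  # technology based cell access cost of SRAM (from the slide)


def wordline_length(m, tag_bits, line_bytes, status_bits):
    """Wl_length = m * (T + 8L + St): bits driven along one word line."""
    return m * (tag_bits + 8 * line_bytes + status_bits)


def bitline_length(cache_bytes, m, line_bytes):
    """Bl_length = C / (m * L): number of rows in the cell array."""
    return cache_bytes / (m * line_bytes)


def cell_energy(cache_bytes, line_bytes, m, tag_bits, status_bits, n_hit, n_miss):
    """E_cell = beta * Wl_length * (Bl_length + 4.8) * (N_hit + 2 * N_miss)."""
    wl = wordline_length(m, tag_bits, line_bytes, status_bits)
    bl = bitline_length(cache_bytes, m, line_bytes)
    return BETA * wl * (bl + 4.8) * (n_hit + 2 * n_miss)


# Hypothetical 8 KB direct-mapped cache, 32-byte lines, 19-bit tag, 2 status bits.
print(cell_energy(8192, 32, 1, 19, 2, n_hit=950_000, n_miss=50_000))
```

Each miss is weighted twice in the model, so the energy depends on the miss rate as well as on the cache geometry.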

69
Architectural Level Analysis Considerations

Very computationally efficient
Requires predefined analytical and
transition-sensitive energy characterization
models
Requires design only to RTL (with some idea as to
the kind of functional units planned)
Coarse grain use of gated clocks implicit
Reasonably accurate (within 5-15% of SPICE)

Simulation based so can be used to support
architectural, compiler, OS, and application
level experimentation
WattWatcher (Sente), DesignPower and
PowerCompiler (Synopsys), prototype academic
tools (Wattch, Princeton; SimplePower, PSU)
