Chapter 3 Power Estimation - PowerPoint PPT Presentation

Provided by: NTU

1
Chapter 3 Power Estimation

Simulation based
Vectors are given, circuit is known, simulation
is performed. The instantaneous currents are
averaged.
Probabilistic analysis based
Averaging of the inputs is performed first,
probabilistic measures are extracted.

3

Power is also dissipated due to glitching
activity in a circuit. Glitches occur due to
different delays through different paths of the
circuit.
A hazardous transition occurs at the output of
the AND gate due to different delays through two
different paths converging at the inputs to the
AND gate.

5

The glitches can die while propagating through a
logic gate if the width of the glitch is much
smaller than the inertial delay of the logic gate.

6
3.1 Modeling of Signals

Stochastic process: Let g(t), $-\infty < t < \infty$, be a
stochastic process that takes the values of
logical 0 or logical 1, transitioning from one to
the other at random times.
Strict-sense stationary (SSS) A stochastic
process is said to be strict-sense stationary if
its statistical properties are invariant to a
shift of the time origin. More importantly, the
mean of such a process does not change with time.

Mean ergodic: If a constant-mean process g(t) has
a finite variance and is such that g(t) and
$g(t + \tau)$ become uncorrelated as $\tau \to \infty$,
then g(t) is mean ergodic.
Definition 3.1 (Signal Probability): The signal
probability of signal g(t) is given by
$P(g) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} g(t)\,dt$

Definition 3.2 (Signal Activity): The signal
activity of a logic signal g(t) is given by
$A(g) = \lim_{T \to \infty} n_g(T) / T$
where $n_g(T)$ is the number of transitions of g(t)
in the time interval between $-T/2$ and $+T/2$.

If the primary inputs to the circuit are modeled
as mutually independent SSS mean-ergodic 0-1
processes, then the probability of signal g(t)
assuming the logic value 1 at any given time t
becomes a constant, independent of time. It is
referred to as the equilibrium signal probability
of the random quantity g(t) and is denoted by
P(g = 1), which we refer to simply as the signal
probability. Likewise, A(g) becomes the expected
number of transitions per unit time.
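
As a rough illustration of Definitions 3.1 and 3.2, the sketch below estimates signal probability and activity from a finite, uniformly sampled 0-1 waveform. The function name `probability_and_activity` and the sampling step `dt` are assumptions made for this example; a finite window only approximates the limits in the definitions.

```python
def probability_and_activity(samples, dt):
    """Estimate P(g) and A(g) from a uniformly sampled 0/1 waveform.

    samples: list of 0/1 values taken every dt time units (assumed format).
    Returns (fraction of time at logic 1, transitions per unit time).
    """
    if not samples:
        raise ValueError("need at least one sample")
    window = len(samples) * dt
    probability = sum(samples) / len(samples)
    # A transition is any change between consecutive samples.
    transitions = sum(1 for a, b in zip(samples, samples[1:]) if a != b)
    return probability, transitions / window


# Example: a waveform sampled once per time unit over an 8-unit window.
p, a = probability_and_activity([0, 1, 1, 0, 0, 1, 1, 1], dt=1.0)
```

Here the signal is high for 5 of 8 samples and changes value 3 times, so the estimates are 0.625 and 0.375 transitions per time unit.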

11
3.2 Signal Probability Calculation

Inputs Signal probabilities of all the inputs to
a circuit
Output Signal probabilities for all nodes of the
circuit
Step 1 For each input signal and gate output in
the circuit, assign a unique variable.
Step 2 Starting at the inputs and proceeding to
the outputs, write the expression for the output
of each gate as a function of its input
expressions.

Step 3 Suppress all exponents in a given
expression to obtain the correct probability for
that signal.
Reconvergent fanout can produce expressions for
the signal probability of internal nodes having
exponents greater than 1. Intuitively, in
probability expressions involving independent
primary inputs, such exponents cannot be present.

Let f be written as a canonical sum of products
of primary inputs as follows
$f = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} s_k \right)$, where $s_k$ is either $x_k$ or
$\bar{x}_k$.
Since the product terms inside the summation are
mutually exclusive and the primary inputs are
independent, we have
$P(f) = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} P(s_k) \right)$. This expression is
defined as the canonical sum of probability
products of f.

$P(x_k)\,P(\bar{x}_k) = P(x_k)\,(1 - P(x_k))$
$= P(x_k) - P^2(x_k)$
$= P(x_k) - P(x_k)$ (exponent suppressed)
$= 0$
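
One way to sketch Steps 1 to 3 in code is to store each probability expression as a polynomial whose monomials are sets of variables, so that multiplying x by x yields x and exponents never appear. The representation and the helper names (`var`, `p_and`, `p_or`, `p_not`, `evaluate`) are assumptions chosen for this illustration, not part of the original procedure.

```python
from itertools import product

# A probability expression is a multilinear polynomial stored as
# {frozenset of variable names: coefficient}. Because a monomial is a set,
# x * x collapses to x, which performs the exponent suppression of Step 3.

def var(name):
    return {frozenset([name]): 1.0}

def const(c):
    return {frozenset(): float(c)}

def add(p, q, sign=1.0):
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0.0) + sign * coeff
    return {m: c for m, c in out.items() if c != 0.0}

def mul(p, q):
    out = {}
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        mono = m1 | m2  # set union: repeated variables are kept once
        out[mono] = out.get(mono, 0.0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0.0}

def p_not(p):
    return add(const(1), p, sign=-1.0)   # 1 - P

def p_and(p, q):
    return mul(p, q)                     # P * Q

def p_or(p, q):
    return add(add(p, q), mul(p, q), sign=-1.0)  # P + Q - P * Q

def evaluate(p, probs):
    total = 0.0
    for mono, coeff in p.items():
        term = coeff
        for name in mono:
            term *= probs[name]
        total += term
    return total

# y = (a AND b) OR (a AND c): input a fans out and reconverges at the OR gate.
a, b, c = var("a"), var("b"), var("c")
y = p_or(p_and(a, b), p_and(a, c))
p_y = evaluate(y, {"a": 0.5, "b": 0.5, "c": 0.5})
```

Without suppression the OR gate would produce a term in a squared; with it, the expression reduces to ab + ac - abc, and `p_and(a, p_not(a))` reduces to the empty polynomial, that is, probability 0.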

15
3.3 Probabilistic Techniques for Signal Activity
Estimation
16
3.3.1 Switching Activity in Combinational Logic

The Boolean difference of $f_j$ with respect to $x_i$
is defined as follows
$\partial f_j / \partial x_i = f_j|_{x_i=1} \oplus f_j|_{x_i=0}$, where $\oplus$
denotes the exclusive-or operation.
The Boolean difference signifies the condition
under which output fj is sensitized to input xi.

If the primary inputs $x_i$, $i = 1, \ldots, n$, to logic
gate M are not spatially correlated, then the
signal activity at output $f_j$ is given by
$A(f_j) = \sum_{i=1}^{n} P(\partial f_j / \partial x_i)\, A(x_i)$ (3.6)
$P(\partial f_j / \partial x_i)$ signifies the probability of
sensitizing input $x_i$ to output $f_j$, while
$P(\partial f_j / \partial x_i)\, A(x_i)$ is the contribution to the
switching activity at output $f_j$ due to input $x_i$
only.

19

$\partial f_{and} / \partial x_1 = f_{and}|_{x_1=1} \oplus f_{and}|_{x_1=0}$
$= x_2 \oplus 0$
$= x_2$
$A(f_{and}) = P(x_2)\,A(x_1) + P(x_1)\,A(x_2)$
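
Equation (3.6) can be checked numerically on small gates by enumerating the truth table. The sketch below assumes independent inputs and represents a gate as a Python function over a list of bits; the names `sensitization_probability` and `output_activity` are made up for this example.

```python
from itertools import product

def sensitization_probability(f, n, i, p):
    """P(df/dx_i): probability that f(x_i=1) XOR f(x_i=0) equals 1.

    f takes a list of n bits; p[k] is the signal probability of input k.
    Inputs are assumed to be mutually independent.
    """
    others = [k for k in range(n) if k != i]
    total = 0.0
    for bits in product([0, 1], repeat=n - 1):
        x = [0] * n
        weight = 1.0
        for k, bit in zip(others, bits):
            x[k] = bit
            weight *= p[k] if bit else 1.0 - p[k]
        x[i] = 1
        high = f(x)
        x[i] = 0
        low = f(x)
        if high != low:  # output is sensitized to x_i for this assignment
            total += weight
    return total

def output_activity(f, p, a):
    """Equation (3.6): sum over inputs of P(df/dx_i) * A(x_i)."""
    n = len(p)
    return sum(sensitization_probability(f, n, i, p) * a[i] for i in range(n))

and2 = lambda x: x[0] & x[1]
xor2 = lambda x: x[0] ^ x[1]
and_activity = output_activity(and2, p=[0.5, 0.25], a=[0.2, 0.4])
xor_activity = output_activity(xor2, p=[0.5, 0.25], a=[0.2, 0.4])
```

For the AND gate this reproduces P(x2)A(x1) + P(x1)A(x2) = 0.25 * 0.2 + 0.5 * 0.4 = 0.25. For XOR every input is always sensitized, so the result is simply A(x1) + A(x2) = 0.6.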

Equation (3.6) fails to consider the effect of
simultaneous switching of signals at logic gate
inputs and, hence, can grossly overestimate
signal activity.

The output switching activity is zero

22
3.4 Statistical techniques

The circuit is simulated repeatedly using a logic
simulator and the switching activities at various
nodes are noted.
Statistical mean estimation techniques are used
in determining the stopping criteria in the Monte
Carlo simulations.

23
3.4.1 Estimating Average Power in Combinational
Circuits

Burch et al. experimentally determined that the
power consumed by a circuit over a period t has a
normal distribution.
Let p and s be the measured average and the
standard deviation of a random sample of N power
values, each measured over a time T. Then
with $(1 - \alpha) \times 100\%$ confidence we can write the
following inequality
$|p - P_{avg}| < t_{\alpha/2}\, s / \sqrt{N}$

where $t_{\alpha/2}$ is obtained from a t-distribution with
$N - 1$ degrees of freedom and $P_{avg}$ is the true
average power. In relative terms,
$|p - P_{avg}| / p < t_{\alpha/2}\, s / (p \sqrt{N}) < \epsilon$
where $\epsilon$ is the desired percentage error for the given
confidence level $(1 - \alpha) \times 100\%$.
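
A minimal sketch of this stopping rule follows. It is not the exact procedure of Burch et al.: the Python standard library has no t-distribution quantile, so the sketch substitutes the normal quantile from `statistics.NormalDist`, which is only a reasonable approximation for roughly 30 or more samples. The callback `sample_power`, the function name, and the default parameters are all assumptions for illustration.

```python
import math
import random
import statistics

def estimate_average_power(sample_power, eps=0.05, alpha=0.05,
                           min_samples=30, max_samples=100000):
    """Draw power samples until z * s / (p * sqrt(N)) < eps.

    sample_power: hypothetical callback returning one positive power sample,
    e.g. the power measured while simulating a batch of random vectors.
    Uses the normal quantile z in place of t_{alpha/2} (large-N assumption).
    """
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    samples = []
    while len(samples) < max_samples:
        samples.append(sample_power())
        n = len(samples)
        if n >= min_samples:
            p = statistics.fmean(samples)
            s = statistics.stdev(samples)
            if z * s / (p * math.sqrt(n)) < eps:
                break
    return statistics.fmean(samples), len(samples)

# Example with a synthetic, normally distributed "power" source.
rng = random.Random(0)
mean_power, num_samples = estimate_average_power(lambda: rng.gauss(10.0, 2.0))
```

With a standard deviation of about 20% of the mean, a 5% error target at 95% confidence needs on the order of 60 samples, which is why the loop stops long before `max_samples`.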

25
3.4.2 Estimating Average Power in Sequential
Circuits

The basic idea of Monte Carlo methods for
estimating the activity of individual nodes is to
simulate a circuit by applying random-pattern
inputs. The simulation is considered to have
converged when the activities of the individual
nodes satisfy a stopping criterion.

28
3.5 Estimation of Glitching Power

Static Hazard: A static hazard is defined as the
possible occurrence of a transient pulse on a
signal line whose static value is not supposed to
change.
Dynamic Hazard: A dynamic hazard is the possible
occurrence of a spurious transition during the
occurrence of a functional 0 → 1 or 1 → 0
transition.

29
Three-valued logic simulation for AND Gate
30

Logic simulation can be used to detect probable
static hazards by using a six-valued logic.
The estimate is pessimistic because some of these
hazards might not be present under certain delay
conditions.
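
The six-valued tables are not reproduced in this transcript, so the sketch below uses the simpler three-valued (0, 1, X) simulation instead: inputs that change between two vectors are set to X, and if the output evaluates to X although its initial and final values agree, a static hazard is possible. The gate helpers and the example multiplexer are assumptions for illustration, and the result is pessimistic in the same sense as described above.

```python
X = "X"  # unknown or possibly changing value

def t_not(a):
    return X if a == X else 1 - a

def t_and(a, b):
    if a == 0 or b == 0:
        return 0          # a controlling 0 fixes the output
    if a == 1 and b == 1:
        return 1
    return X

def t_or(a, b):
    if a == 1 or b == 1:
        return 1          # a controlling 1 fixes the output
    if a == 0 and b == 0:
        return 0
    return X

def may_have_static_hazard(circuit, before, after):
    """True if the output is steady but unknown while inputs change."""
    mid = {k: (before[k] if before[k] == after[k] else X) for k in before}
    return circuit(before) == circuit(after) and circuit(mid) == X

def mux(v):
    # y = (a AND b) OR (NOT a AND c): two paths from a reconverge at the OR.
    return t_or(t_and(v["a"], v["b"]), t_and(t_not(v["a"]), v["c"]))

def and2(v):
    return t_and(v["a"], v["b"])

mux_hazard = may_have_static_hazard(
    mux, {"a": 1, "b": 1, "c": 1}, {"a": 0, "b": 1, "c": 1})
and_hazard = may_have_static_hazard(
    and2, {"a": 1, "b": 0}, {"a": 0, "b": 0})
```

The multiplexer is flagged because both AND terms become X while a changes; the AND gate is not, since b = 0 holds the output at 0 throughout.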

31
Six-valued logic for Static hazard analysis
32
AND Operation with Six-Valued Logic
33

1000, 1100, 1110 corresponding to fast, medium,
and slow falling signals.
Eight-valued logic is required for logic
simulation to detect dynamic hazard.

34
Eight-valued logic for dynamic hazard analysis
35
3.5.2 Delay Models

A circuit node where two reconvergent paths with
different delays meet may have a large number of
spurious transitions.
However, even in a tree-structured circuit with
balanced paths there can be a large number of
spurious transitions due to slight variations in
delays.

40
3.5.2.1 Statistical Estimation

Delays are modeled as random variables and should
be generated from time to time along the
simulation.
Whenever we generate a new set of delays, they
correspond to another die or even the same die
but with different operating conditions such as
temperature and power supply voltage.

41
Activity a

a = F(PI, D), where PI denotes the primary input
vectors and D is a random vector consisting of all
the random variables of gate delays.
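
The dependence of activity on random delays can be illustrated with a deliberately tiny model: an XOR gate whose two inputs both toggle once, arriving through paths with random delays. The delay distribution, the inertial-delay threshold, and the function names are assumptions chosen for this sketch, not values from the slides.

```python
import random

def xor_output_transitions(arrival_a, arrival_b, inertial_delay):
    """Transitions at an XOR output when both inputs toggle exactly once.

    The steady-state output is unchanged, so every transition is a glitch.
    A pulse no wider than the inertial delay is assumed to be filtered out.
    """
    pulse_width = abs(arrival_a - arrival_b)
    return 2 if pulse_width > inertial_delay else 0

def mean_glitch_activity(num_delay_sets, rng, nominal=1.0, spread=0.1,
                         inertial_delay=0.05):
    """Average a = F(PI, D) over freshly drawn delay vectors D."""
    total = 0
    for _ in range(num_delay_sets):
        # Each new delay set stands for another die or operating condition.
        d_a = rng.uniform(nominal - spread, nominal + spread)
        d_b = rng.uniform(nominal - spread, nominal + spread)
        total += xor_output_transitions(d_a, d_b, inertial_delay)
    return total / num_delay_sets

activity = mean_glitch_activity(10000, random.Random(1))
balanced = mean_glitch_activity(100, random.Random(1), spread=0.0)
```

With perfectly matched delays (`spread=0.0`) the glitch count is zero, which matches the zero-delay view; with the assumed spread, more than half of the delay sets produce a glitch even though the nominal paths are balanced.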

44
The difference is due to the glitching activity
45

While the different nonzero-delay models do track
each other (except for one circuit, C6288, which
has a depth of about 120 levels), it is clear
that the nonzero-delay models can produce very
different results compared to the zero-delay
case. The difference is due to the glitching
activity.

46
For some circuits minimum and maximum average
power can vary widely if uncertain specifications
of primary inputs exist.
48
The delay mismatch of different paths causes
spurious transitions.
49
They are 20 times greater than those obtained
using the zero-delay model.
50
3.8 Power Dissipation in Domino CMOS

Domino logic circuits do not have direct-path
short-circuit currents except when static pull-up
devices are used to moderate the charge
redistribution problem or when clock skew is not
well dealt with.

54
Figure 3.36 CMOS gate y = (x1 + x2)x3, drawn with
signal B (fixed) on x1 and signal A (varies) on x2;
other labels in the figure: V, x3, Z1, Z0, Y, Cy
55
3.10 Power Estimation at the Circuit Level

The gate presents a variable capacitance to the
power/ground rails. The magnitude of this
capacitance depends on the logic values at the
input to the gate.
Suppose two signals A and B are to be connected to
the two equivalent inputs x1 and x2 of the gate in
Figure 3.36. If A has transitions very often while
B stays at zero, then A should be connected to x2
and B to x1, as this results in lower power
consumption than the other assignment.

56
3.11 High Level Power Estimation
57

The lower-order bits of a word are essentially
uncorrelated in space and time, with a signal
probability of 0.5 and a switching activity of
0.25, and are essentially independent of the data
distribution.
The higher-order bits show complete dependence
because they represent the sign extension in
two's-complement representation.

58
3.12 Information-Theory-Based Approaches

The output entropy of Boolean functions can be
used to predict the average minimized area of
CMOS combinational circuits.
If x is a random variable with a signal
probability p, then the entropy of x is defined
as
$H(x) = p \log_2 \frac{1}{p} + (1 - p) \log_2 \frac{1}{1 - p}$

For a discrete variable x, which can take n
different values, the entropy is defined as
$H(x) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}$, where $p_i$ is the
probability that x takes the ith value $x_i$.
Given input signal probabilities of 0.5, the
output entropy of the Boolean function can be
used to predict the area of its average minimized
implementation as
$A = K\, (2^n / n)\, H(Y)$, where A is the area of the
implementation, K is the proportionality
constant, and Y = f(X).
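
The entropy definitions and the area estimate can be sketched as follows for small functions, assuming independent inputs with probability 0.5 so that every input combination is equally likely. The constant K is technology dependent and is not given here, so the default `k=1.0` is only a placeholder; the function names are likewise assumptions.

```python
import math
from itertools import product

def entropy(probabilities):
    """H = sum of p_i * log2(1 / p_i), skipping zero-probability values."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0.0)

def output_entropy(f, n):
    """Entropy of Y = f(X) for n independent inputs with P(1) = 0.5."""
    counts = {}
    for x in product([0, 1], repeat=n):
        y = f(x)
        counts[y] = counts.get(y, 0) + 1
    return entropy(c / 2 ** n for c in counts.values())

def estimated_area(f, n, k=1.0):
    """A = K * (2^n / n) * H(Y); k is a placeholder proportionality constant."""
    return k * (2 ** n / n) * output_entropy(f, n)

and2_entropy = output_entropy(lambda x: x[0] & x[1], 2)
xor2_entropy = output_entropy(lambda x: x[0] ^ x[1], 2)
xor2_area = estimated_area(lambda x: x[0] ^ x[1], 2)
```

The XOR output is 1 half of the time, so its entropy is the maximum of 1 bit, while the AND output (1 with probability 0.25) has an entropy of about 0.811 bits and therefore a smaller predicted area.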

62
RTL: Register Transfer Level

A high-level technique to estimate power can be
applied in the following three steps:
Step 1 Determine the input/output entropies of
the combinational logic blocks by running RTL
simulation of the sequential circuit.
Step 2 From the input/output entropies, determine
the switching activity, the area, and an estimate
of average power.
Step 3 Combine with latch and clock power to
determine the total power dissipation.

There are two approaches to determining lower
bounds of maximum dynamic power in static CMOS
circuits: deterministic (automatic test
generation based) and simulation-based
approaches.

The instantaneous power dissipation due to two
consecutive input binary vectors is proportional
to
$P_i = \sum_{\text{all gates } g} T(g)\, C(g)$, where C(g)
denotes the output capacitance of gate g and T(g)
is a binary variable that indicates whether gate
g switches or not in response to the two input
vectors.

To justify the transition, i.e., to see whether
T(g) = 1 is achievable, the modified justification
mechanism of a five-valued D-algorithm (an ATG
algorithm for stuck-at faults) is used.
In CMOS circuits, the capacitive load of a logic
gate can be approximated by the fanout of the
gate.
$P_i = \sum_{\text{all gates } g} \left( g(V_1) \oplus g(V_2) \right) F(g)$, where
$V_1$, $V_2$ denote two consecutive input binary
vectors to the circuit, g() represents the
Boolean function of gate g in terms of the PIs, and
F(g) denotes the number of fanouts of gate g.
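
The fanout-weighted switching measure can be evaluated directly for a pair of vectors. The sketch below uses a made-up three-gate netlist and assumed fanout counts, and it finds the best vector pair by exhaustive search, which is only feasible for tiny circuits and is a stand-in for, not an implementation of, the ATG-based justification.

```python
from itertools import product

# Hypothetical netlist: (output, operation, inputs), in topological order.
NETLIST = [
    ("n1", "and", ("a", "b")),
    ("n2", "or", ("b", "c")),
    ("y", "xor", ("n1", "n2")),
]
FANOUT = {"n1": 1, "n2": 1, "y": 1}  # assumed fanout count per gate

OPS = {
    "and": lambda u, v: u & v,
    "or": lambda u, v: u | v,
    "xor": lambda u, v: u ^ v,
}

def simulate(vector):
    """Zero-delay evaluation of every gate for one input vector."""
    values = dict(vector)
    for out, op, (in0, in1) in NETLIST:
        values[out] = OPS[op](values[in0], values[in1])
    return values

def switched_fanout(v1, v2):
    """Sum over gates of (g(V1) XOR g(V2)) * F(g)."""
    s1, s2 = simulate(v1), simulate(v2)
    return sum((s1[g] ^ s2[g]) * FANOUT[g] for g, _, _ in NETLIST)

def best_pair(inputs=("a", "b", "c")):
    """Exhaustively search all vector pairs for the largest measure."""
    vectors = [dict(zip(inputs, bits))
               for bits in product([0, 1], repeat=len(inputs))]
    candidates = ((switched_fanout(v1, v2), v1, v2)
                  for v1 in vectors for v2 in vectors)
    return max(candidates, key=lambda item: item[0])

score, v1, v2 = best_pair()
example = switched_fanout({"a": 1, "b": 1, "c": 0}, {"a": 0, "b": 0, "c": 0})
```

In this netlist n1 = 1 forces n2 = 1, so when both switch together the XOR output stays at 0. All three gates can therefore never toggle at once and the best achievable value is 2.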

The justification mechanism in the algorithm
includes two processes backtracing and
implication.
Two composite values are said to be in conflict
if they have 0 and 1 at the same position.
Experiments show the test generation approach is
superior to the traditional simulation-based
technique in both efficiency and the quality of
the results.

68

Each gate g is associated with a stack that stores
all the composite logic values a/b that have been
assigned to g, where a and b denote g(V1) and
g(V2), respectively. The variables a and b can be
1, 0, or u (unknown). At each gate, the top of the
stack stores the most recently updated value for
the gate.

69
After assigning a rising transition (0/1) to x,
y(V2) is forced to be 1.
71
Circuit level simulation

Extract circuit netlist description from layout
Captures internal (diffusion) and external
(wiring and gate fanout) capacitances
Run an analog simulation
Characterization of device models (nfets, pfets)
Solution of a large system of equations, so very
computationally intensive (< a few thousand
transistors)
Can accurately estimate (within a few %) dynamic
and leakage power dissipation
HSPICE, Spectre (Cadence), PowerMill (Synopsys)

72
Gate level simulation

Perform logic simulation to obtain the switching
events for each net (signal)
Logic description in structural VHDL or Verilog
Zero-delay or unit-delay timing models
Determine the frequency of each net, $f_y = t_y / (2T)$,
where $t_y$ is the number of logic switches of net y
and T is the simulation time, to compute dynamic
power
$P_{dyn} = \sum_y C_y V_{DD}^2 f_y$
Pre-layout so must estimate Cy
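
A minimal sketch of this calculation, assuming the simulator has already produced a toggle count per net and that a capacitance estimate per net is available. The dictionary-based interface and the numbers in the example are assumptions, not data from the slides.

```python
def dynamic_power(toggle_counts, capacitances, vdd, sim_time):
    """Sum of C_y * VDD^2 * f_y over nets, with f_y = t_y / (2 T).

    toggle_counts: {net: number of logic switches t_y} from logic simulation.
    capacitances:  {net: estimated capacitance C_y in farads}.
    vdd in volts, sim_time in seconds; the result is in watts.
    """
    total = 0.0
    for net, toggles in toggle_counts.items():
        f_y = toggles / (2.0 * sim_time)
        total += capacitances[net] * vdd ** 2 * f_y
    return total

# Example: two nets observed over 1 microsecond at VDD = 1.0 V.
power = dynamic_power(
    toggle_counts={"n1": 40, "n2": 10},
    capacitances={"n1": 20e-15, "n2": 50e-15},
    vdd=1.0,
    sim_time=1e-6,
)
```

Net n1 contributes 20 fF * 1 V^2 * 20 MHz = 0.4 uW and net n2 contributes 50 fF * 1 V^2 * 5 MHz = 0.25 uW, for a total of 0.65 uW.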

73
Gate internal and leakage power

Use gate characterization E(g, e) and the logic
simulation event count f(g, e) to calculate each
gate's dynamic internal power (short circuit and
charging/discharging of internal capacitors)
$P_{int} = \sum_g \sum_e E(g, e)\, f(g, e)$
During simulation, record the fraction of time
T(g, s)/T that each gate g stays in a particular
state s to calculate leakage power
$P_{leak} = \sum_g \sum_s E(g, s)\, T(g, s) / T$, where E(g, s) is
the characterized leakage of gate g in state s
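
The two sums can be sketched with dictionaries keyed by (gate, event) and (gate, state). The sketch reads f(g, e) as an event rate in events per second and E(g, s) as the characterized leakage power of a state, weighted by the recorded time fraction T(g, s)/T; these readings, the key names, and all numbers are assumptions for illustration.

```python
def internal_power(event_energy, event_rate):
    """Sum over gates and events of E(g, e) * f(g, e).

    event_energy: {(gate, event): energy per event in joules}.
    event_rate:   {(gate, event): events per second from simulation}.
    """
    return sum(event_energy[key] * event_rate[key] for key in event_rate)

def leakage_power(state_leakage, state_time, sim_time):
    """Sum over gates and states of leakage(g, s) * T(g, s) / T.

    state_leakage: {(gate, state): leakage power in watts}.
    state_time:    {(gate, state): time spent in that state, in seconds}.
    """
    return sum(state_leakage[key] * state_time[key] / sim_time
               for key in state_time)

p_int = internal_power(
    event_energy={("g1", "rise"): 2e-15, ("g1", "fall"): 1e-15},
    event_rate={("g1", "rise"): 1e6, ("g1", "fall"): 1e6},
)
p_leak = leakage_power(
    state_leakage={("g1", "out0"): 1e-9, ("g1", "out1"): 3e-9},
    state_time={("g1", "out0"): 0.25e-6, ("g1", "out1"): 0.75e-6},
    sim_time=1e-6,
)
```

In the example the gate spends a quarter of the time in the low-leakage state and three quarters in the high-leakage state, giving 0.25 nW + 2.25 nW = 2.5 nW of leakage and 3 nW of internal power.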

74
Capacitance estimation

Device (diffusion and gate) capacitance
Depends on width/length of driving gates
source/drain diffusion and fanout gates
Part of characterization of cell based designs
Wiring capacitance
Depends on placement and routing
Wire load predict wire length of a net from the
number of pins incident to the net
Mapping table can be constructed from historical
data of existing designs

76
Gate level simulation considerations

Simulation vectors need to be chosen carefully
(application dependent)
Internal power really depends on operating
voltage, temperature, and process, which calls for
multidimensional characterization
Accuracy within 5-10% of HSPICE
Signal glitches may not be modeled precisely
(glitches depend on delays in the circuit)

77
Gate level probabilistic analysis

For each internal net y, determine the signal
probability of the net with respect to the given
signal probabilities of the primary inputs
From the signal probabilities, determine the
transition density D(y) of each internal net y
Compute the total power
$P = \sum_y 0.5\, C_y V_{DD}^2 D(y)$
Pre-layout so must estimate Cy

78
Determining signal probabilities

Signal probability definition
$P_1 = t_1 / (t_0 + t_1)$ and $P_0 = 1 - P_1$
Propagate the given statistical quantities from
the primary inputs to the internal signal nets
and outputs of the circuit
Propagate quantities using probabilistic signal
propagation model

79
Signal propagation model

Apply Shannon's decomposition to the n-input
Boolean function $y = f(x_1, \ldots, x_n)$
$y = x_i f_{x_i} + !x_i f_{!x_i}$, where $f_{x_i}$ ($f_{!x_i}$) is the new
Boolean function obtained by setting $x_i = 1$ ($x_i = 0$)
in $f(x_1, \ldots, x_n)$
$P(y) = P(x_i f_{x_i}) + P(!x_i f_{!x_i}) = P(x_i) P(f_{x_i}) + P(!x_i) P(f_{!x_i})$
Apply recursively (note $P(!x_i) = 1 - P(x_i)$)
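
The recursion can be written down almost literally. The sketch below assumes independent inputs and expands on one variable at a time until every input is fixed; it takes exponential time in the number of inputs, so it is meant only to illustrate the decomposition on small functions. The function name and the tuple-based interface are assumptions.

```python
def signal_probability(f, probs, assigned=()):
    """P(y) by recursive Shannon decomposition.

    f: Boolean function taking a tuple of n bits.
    probs: probs[i] is P(x_i = 1); inputs are assumed independent.
    assigned: values already fixed for x_0 .. x_{i-1}.
    """
    i = len(assigned)
    if i == len(probs):
        return float(f(assigned))  # all inputs fixed: f is a constant 0 or 1
    p_high = signal_probability(f, probs, assigned + (1,))  # cofactor x_i = 1
    p_low = signal_probability(f, probs, assigned + (0,))   # cofactor x_i = 0
    return probs[i] * p_high + (1.0 - probs[i]) * p_low

# y = (x0 AND x1) OR x2 with all input probabilities equal to 0.5.
p_y = signal_probability(lambda x: (x[0] & x[1]) | x[2], [0.5, 0.5, 0.5])
```

For this function the output is 0 only when x2 = 0 and the AND term is 0, which happens with probability 0.5 * 0.75, so P(y) = 0.625.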

80
Determine transition density

For a transition (1-to-0 or 0-to-1) on xi to
propagate to y, we need $f_{x_i} \oplus f_{!x_i} = 1$; this is
the Boolean difference of y with respect to xi,
denoted dy/dxi
P(dy/dxi) is the probability that dy/dxi
evaluates to 1 and D(xi) is the transition
density of xi
Then the total transition density of the net y is
$D(y) = \sum_i P(dy/dx_i)\, D(x_i)$

81
Gate level probabilistic analysis considerations

Computationally efficient
Must only compute signal probabilities and
transition densities for each net to evaluate
$P = \sum_y 0.5\, C_y V_{DD}^2 D(y)$
Assumes given correct signal probabilities for
primary inputs (and if wrong, large errors are
possible)
Gives only average power dissipation values

82
Architectural level simulation

Perform RTL simulation to obtain the input
activity for each major functional unit
Architectural description in behavioral VHDL or
Verilog or C, C++
Energy characterization of functional units
Transition-sensitive energy models
System busses
ALUs, register file, pipeline registers
Analytical energy models
Caches, DRAMs