Title: 7 Low-energy Computing Using Energy Recovery Techniques
17 Low-energy Computing Using Energy Recovery
Techniques
- Consider a pMOS pass transistor as shown in
Figure 7.1b.
- When the voltage at the power/clock terminal
swings from 0 to Vdd to charge the node capacitance
through a transistor channel, there is a voltage
drop in the channel due to the channel resistance.
3- The energy recovery techniques are sometimes
referred to as adiabatic or quasi-adiabatic
computing.
- One can achieve very low energy dissipation by
slowing down the speed of operation and only
switching transistors under certain conditions.
4- Consider the amount of energy dissipated when
charging capacitance C from 0 to Vdd in time T
with a linear power supply voltage.
- $RC \, (dV_c/dt) + V_c = \phi$,
- where
- $\phi = 0$ for $t < 0$,
- $\phi = (V_{dd}/T)\, t$ for $0 \le t < T$,
- $\phi = V_{dd}$ for $t \ge T$
6- The perfect controlling signal should hold a
correct and constant voltage during the
restoration process. Such a controlling signal
can be inversely generated from the next stage
signals, that is, from the inverse logic function
F2-1, if the logic function F2 itself is
reversible.
7- The solution of the above equation is given by
- $V_c = 0$ for $t < 0$,
- $V_c = \phi - (RC/T)\, V_{dd} \,(1 - e^{-t/RC})$ for $0 \le t < T$,
- $V_c = \phi - (RC/T)\, V_{dd} \,(1 - e^{-T/RC})\, e^{-(t - T)/RC}$ for $t \ge T$
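The piecewise solution can be evaluated directly. The following is a minimal sketch, not part of the original slides; the function names and the component values in the usage note are illustrative assumptions.

```python
import math

def ramp_supply(t, vdd, T):
    """Linear-ramp supply: 0 before t = 0, rising to vdd at t = T, then flat."""
    if t < 0:
        return 0.0
    if t < T:
        return vdd * t / T
    return vdd

def capacitor_voltage(t, vdd, T, r, c):
    """Closed-form Vc(t) for an RC load driven by ramp_supply."""
    tau = r * c
    if t < 0:
        return 0.0
    lag = (tau / T) * vdd  # steady-state lag of Vc behind the ramp
    if t < T:
        return ramp_supply(t, vdd, T) - lag * (1.0 - math.exp(-t / tau))
    # After the ramp ends, the remaining gap to vdd decays exponentially.
    return vdd - lag * (1.0 - math.exp(-T / tau)) * math.exp(-(t - T) / tau)
```

For example, with the assumed values R = 1 kΩ, C = 1 pF, and T = 10 ns, `capacitor_voltage(4e-9, 5.0, 10e-9, 1e3, 1e-12)` trails the supply by roughly (RC/T)Vdd = 0.5 V once the initial transient has decayed.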
8- The energy dissipation in the above charging
process can be calculated as follows
- $E_{linear} = \int_0^\infty i V_R \, dt = \int_0^T i V_R \, dt + \int_T^\infty i V_R \, dt$
- $E_{linear} = (RC/T)\, C V_{dd}^2 \,(1 - RC/T + (RC/T)\, e^{-T/RC})$
- The energy dissipation through the dissipative
medium can be made arbitrarily small by making
the transition time T arbitrarily large.
- For Figure 7.1c
- $E_{dissipated} \approx E_{linear} + E_{th}$
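To see how the loss shrinks as the ramp slows down, the closed-form expression for the linear-ramp dissipation can be compared with the (1/2) C Vdd^2 lost when the same capacitance is charged by an abrupt step. This is a sketch under assumed names; the helper functions are illustrative and not from the slides.

```python
import math

def ramp_charging_loss(r, c, vdd, T):
    """Energy dissipated in R when C is charged by a linear ramp of duration T."""
    x = (r * c) / T  # ratio of the RC time constant to the ramp time
    return x * c * vdd ** 2 * (1.0 - x + x * math.exp(-1.0 / x))

def step_charging_loss(c, vdd):
    """Energy dissipated when C is charged by an abrupt step to vdd."""
    return 0.5 * c * vdd ** 2
```

When T is much larger than RC, the loss approaches (RC/T) C Vdd^2, so doubling the ramp time roughly halves the dissipation. When T is much smaller than RC, the expression approaches the step-charging value.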
97.2 Energy Recovery Circuit Design
- To change a node's voltage with associated
capacitance C, as shown in Figure 7.2a, $V_{dd} Q$
($= C V_{dd}^2$) of energy is extracted from the Vdd
terminal. Half of the energy ($\frac{1}{2} C V_{dd}^2$) is
stored in the capacitance temporarily, and the
other half is dissipated in the path.
- Later, when this node is connected to the ground,
the stored energy is again dissipated.
- In a cycle, all $V_{dd} Q$ of energy is converted into
heat.
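The energy bookkeeping for one conventional charge/discharge cycle can be written out as follows. This is an illustrative sketch; the function name and dictionary keys are assumptions.

```python
def conventional_cycle_energy(c, vdd):
    """Energy bookkeeping for one conventional charge/discharge cycle of C."""
    charge = c * vdd                      # Q = C * Vdd drawn from the supply
    drawn = vdd * charge                  # Vdd * Q = C * Vdd^2
    stored = 0.5 * c * vdd ** 2           # held on the capacitance after charging
    lost_while_charging = drawn - stored  # dissipated in the charging path
    lost_while_discharging = stored       # dissipated when the node is grounded
    return {
        "drawn_from_supply": drawn,
        "lost_while_charging": lost_while_charging,
        "lost_while_discharging": lost_while_discharging,
        "heat_per_cycle": lost_while_charging + lost_while_discharging,
    }
```

For an assumed C = 1 pF and Vdd = 5 V, the supply delivers 25 pJ per cycle and all of it ends up as heat.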
10- The power supply/clock waveform φ can be divided
into 4 phases
- Idle phase when φ = 0
- Evaluation phase when φ goes up from zero to Vdd
- Hold phase when φ = Vdd
- Restoration phase when φ goes from Vdd down to
zero
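The four phases can be sketched as a trapezoidal waveform. The slides do not state the relative phase durations, so the equal-length phases below are an assumption made only for illustration, as is the function name.

```python
def power_clock(t, vdd, period):
    """Return (phase name, phi) at time t, assuming four equal-length phases."""
    quarter = period / 4.0
    u = t % period
    if u < quarter:
        return "idle", 0.0
    if u < 2 * quarter:
        # phi ramps linearly from zero up to vdd
        return "evaluation", vdd * (u - quarter) / quarter
    if u < 3 * quarter:
        return "hold", vdd
    # phi ramps linearly from vdd back down to zero
    return "restoration", vdd * (period - u) / quarter
```

Halfway through the evaluation phase, for instance, `power_clock(1.5, 5.0, 4.0)` gives `("evaluation", 2.5)`.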
12Adiabatic dynamic logic (ADL)
- The ADL combines adiabatic theory with
conventional dynamic CMOS.
14- The precharge phase is defined by the clock swing
from zero to Vdd, when the diode is turned on and
the output voltage Vout follows the clock swing
to Vdd - VD, where VD is the voltage drop across the
diode.
- In the evaluate phase, the clock voltage ramps
down from Vdd to 0. Notice that the diode is in
the reverse-bias condition and the output will
follow the clock down to zero if Vin is high.
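The precharge and evaluate behavior described above can be captured in a highly idealized behavioral model. It is a sketch only: it assumes a constant diode drop, an ideal pull-down path, and a sampled clock, and the function and parameter names are hypothetical.

```python
def adl_output(clock_samples, vin_high, v_diode):
    """Idealized ADL stage: trace of Vout for a sampled clock waveform."""
    vout = 0.0
    trace = []
    for phi in clock_samples:
        if phi - v_diode > vout:
            # Diode forward-biased: Vout is precharged to phi - VD.
            vout = phi - v_diode
        elif vin_high and phi < vout:
            # Diode reverse-biased and input high: Vout follows the clock down.
            vout = phi
        # Otherwise Vout holds its value (input low during evaluate).
        trace.append(vout)
    return trace
```

With an assumed clock that ramps 0 V to 5 V and back and VD = 0.7 V, the output peaks near 4.3 V, then returns to zero when the input is high and stays near 4.3 V when the input is low.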
15- When the output of the first stage is latched,
the second stage starts evaluating. The second
stage should not undergo any nonadiabatic
transition.
187.3.4 Energy Recovery SRAM Core
- Fig. 7.16 shows the energy recovery SRAM
organization. Compared to the standard CMOS SRAM,
a row driver is inserted that generates the
appropriate voltage signals to drive the memory
core.
- Sense amplifiers are replaced by the voltage
level shifters.
20- A conventional SRAM ties Vhi and Vlow to the
supply Vdd and ground, respectively.
- The dominant component of energy dissipation
arises from switching the large capacitance on
the bit lines and the word lines.
22- Vhi, Vlow, and Vword of the enabled row may be
controlled independently by global supply lines
Ghi, Glow, and Gword, respectively.
- For the unselected rows, Vhi, Vlow, and Vword are
connected to the static power supply lines Shi =
5 V, Slow = 2 V, and ground, respectively.
- Bit line precharge circuits of the standard CMOS
SRAM are not required in adiabatic SRAM.
23- Assume Vt = 1 V.
- The bit line is assumed to be precharged midway to 2 V.
- A read operation starts with the row selection
being applied and Vword being smoothly ramped
up to 3 V by Gword.
25Column-Activated Memory Core
- The only modification from the core
configuration of Figure 7.17 is that Vhi and Vlow
now run vertically and are generated by column
driver circuitry, which can be implemented analogously
to the row driver circuitry.
- The selection signals generated from the column
address decoder enable the driver for the
selected columns.
- A row driver is still required because Vword runs
horizontally.
26- The column driver circuitry ensures that Vhi is
at Shi = 5 V, Vlow is at Slow = 2 V, and Vword is
pulled down to ground by the row driver
circuitry.
- A read operation starts with Vword in the
selected row being ramped up to 2.5 V by Gword.
- In the selected columns, Vhi and Vlow are ramped
down to 3 V and 0 V, respectively.
28The column-activated approach consumes slightly
less energy than the regular row-activated
organization
29The decoder starts out in the rest state with
Vlow at Vdd and all row lines at Vdd - Vth.
After the address signals settle down, Vlow
gradually swings down to zero, and all the rows follow
except the selected row.
33- Node A is pulled up very rapidly and the stable
state is achieved immediately; hence the
short-circuit current is negligible.
- Subsequently, the two access transistors are
turned off to isolate the level shifter from the
bit lines such that A and Ā hold their states
while the bit lines return to the rest state.
34- The transmission gate construction of the buffer
ensures that the charging and discharging
processes are performed adiabatically, and
simulation results indicate that more than 90% of
the energy can be recovered.
- The major portion of charging and discharging is
performed adiabatically.
377.3.9 Optimal Voltage Selection
- The energy dissipation
- E T-gate 2 C l2 Vdd 3 / (kn kp)T(Vdd Vth)2
- The energy dissipation on a bit line
- Ebit Cbit 2 (Vdd Vth) / 2 kn (1 ln 2)T
39- Although the topology of the cell in design 1 is
identical to a standard six-transistor RAM cell,
the fact that separate Vhi and Vlow lines are
needed for each row results in additional area
overhead.
41The fraction of energy recovered at various
speeds of operation
42Approximately 90% of the energy can be recovered for
both organizations when the stimulus has a
transition time of 10 ns.
43The energy dissipation of the NAND array decoder
is far less than the NOR array decoder.
447.4 Supply Clock Generation
- The underlying idea of adiabatic clock generation
circuits is to use a resonant driver.
- In Figure 7.32, the circuit oscillates between
zero and 2Vref. The circuit starts to oscillate
when S0 is turned on and ceases oscillating when
S0 is turned off.
45- There is a pull-up path and a pull-down path that
can replenish the energy dissipated by the
resistances in the load.
46- The pull-up pMOS transistor Sp is turned on and
the pull-down nMOS transistor Sn is turned off
when the voltage at node y is higher than Vref.
- Conversely, the pull-down path is on and the
pull-up path is off when the voltage at y is below
Vref.
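The switching rule for the two replenishing paths amounts to a comparison against Vref. The sketch below is illustrative only; the function name is hypothetical, and the behavior when the node voltage equals Vref exactly is an assumption, since the slides do not specify it.

```python
def replenish_switches(v_y, v_ref):
    """Return (sp_on, sn_on) for the voltage at node y."""
    if v_y > v_ref:
        return True, False   # pull-up pMOS Sp on, pull-down nMOS Sn off
    if v_y < v_ref:
        return False, True   # pull-down path on, pull-up path off
    return False, False      # assumption: both off exactly at Vref
```

Because the two conditions are mutually exclusive, Sp and Sn are never on at the same time in this model.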
47- The control signals at Sp and Sn are 180° out of
phase.
48problems
- The finite resistance of S0 decreases the energy
efficiency substantially.
- Sp and Sn have to be generated by extra
circuitry.
- Requires an additional reference voltage source
Vref.
- Generates a single-phase clock, not enough for
general adiabatic circuits.
50- For resonant circuits, a sinusoidal waveform has
the highest energy recycling percentage.
- Any other waveform contains a base sinusoidal
component and higher order harmonics.
- The component of base frequency f0 can be
efficiently recycled.
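One way to quantify this is to compute how much of a waveform's AC power sits in its base-frequency component. The sketch below estimates that share numerically; the function name, the sample count, and the square-wave example are assumptions chosen for illustration, not results from the slides.

```python
import math

def fundamental_power_fraction(waveform, samples=4096):
    """Share of a periodic waveform's AC power carried by its base frequency.

    `waveform` maps a phase in [0, 1) to a voltage over one period.
    """
    phases = [(k + 0.5) / samples for k in range(samples)]
    values = [waveform(p) for p in phases]
    mean = sum(values) / samples
    ac = [v - mean for v in values]  # remove the DC level
    total_power = sum(v * v for v in ac) / samples
    # First-harmonic Fourier coefficients by numerical integration.
    a1 = 2.0 / samples * sum(v * math.cos(2 * math.pi * p) for v, p in zip(ac, phases))
    b1 = 2.0 / samples * sum(v * math.sin(2 * math.pi * p) for v, p in zip(ac, phases))
    return (a1 * a1 + b1 * b1) / 2.0 / total_power

sine = lambda p: math.sin(2 * math.pi * p)
square = lambda p: 1.0 if p < 0.5 else 0.0
```

For the sine, the fraction is 1. For the ideal 50% duty-cycle square wave it is about 8/π², roughly 0.81, so close to a fifth of the AC power lies in the higher order harmonics.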
52A pull-up path is used in one branch while a
pull-down path is used in the other branch.
53Both pull-up and pull-down paths are used in each
branch thus the generated waveforms are closer
to the sinusoidal curve.
54- While the sinusoidal waveform is important for
higher energy efficiency, this scheme has a
severe shortcoming.
- Within a certain period of time in every cycle,
both nMOS and pMOS transistors are on when the
voltage V1 or V2 is in the vicinity of Vdd/2.
Hence, short-circuit current dissipates a
significant amount of energy.
56- The oscillator generates two complementary phases
of nearly sinusoidal waveforms.
- P1 and P2 and N1 and N2 are used for energy
replenishment and frequency phase lock-up.
- The circuitry starts to oscillate when the
control signal enable = 1 and ceases to oscillate
when enable = 0.
- The PLL samples the clock signal at the load and
produces two control signals C1 and C2 at the
frequency of the reference clock, which in turn
forces the circuitry to oscillate at the
frequency of the reference clock.
57- The optimal rise and fall times should be close
to 25% of a clock cycle such that each replenish
transistor is on for approximately 50% of the time.
58advantages
- The replenishing transistors turn on and off
gradually so that only small interferences are
imposed on resonant circuitry, which ensures that
the waveforms at both sides of the inductor are
nearly sinusoidal.
- The pMOS and nMOS replenishing transistors in one
side are never turned on simultaneously to
prevent the short-circuit current.
59Obtained from simulations for R = R1 = R2 = 0.5 Ω
and C = C1 = C2 = 100 pF. It is essential to
minimize R in the supply clock distribution
network to obtain high energy efficiency.