Title: Low Power Techniques
- 1. Introduction
- 2. Why low power?
- 3. Future Opportunities for Low-Power
- 4. How to reduce power
1. Introduction
1) Drivers for IC progress
- Silicon is the winner, and among the many silicon technologies, CMOS is the winner.
- It will stay that way for at least the next 25 years.
- There's no show stopper (in technology)!
- e.g., minimum switching energy, power dissipation
- material, etc.
- Except for the multi-billion investment cost!
- Moore's law will keep being honored.
- Why? 1. No insurmountable obstacle exists.
- 2. People believe in it and behave accordingly.
- A huge opportunity exists, but only if we do well in exploiting
- 1) cross-breeding, co-utilization and co-development among interactable technologies
- 2) technology sharing using networks
2) Big Picture: If power reduction is THE goal, you need to visit all areas to achieve it.
Analogy: vertical engineer vs. horizontal engineer. If you want to sell a graphics chip, you need to do anything that helps achieve it, from design and application to marketing, etc.
2. Why low power?
- 1) Battery progress is very slow! 5-8x improvement / 200 yrs
- 200 years ago: 25 watt-hour/kg
- → now, lithium polymer: 200 watt-hour/kg
- In contrast, semiconductors improved 10^6x in 30 years (CPU) and 4x every 3 years (memory density)
- → Still a wild, wild frontier stretching before us!
- 2) Heat
- You don't want a big cooling tower for each IC!
- 3) Energy saving
- minimize the amount of energy consumption and the recirculation period; otherwise our earth will be EXHAUSTED.
- 4) Convenience
- too many wires around make a mess
3. Future Opportunities for Low-Power
- 1) PDA(Personal Digital Assistant)
- telephone, pager, pen-based input, schedule keeper, audio/video entertainment, fax, video camera, data security with fingerprint and/or voice recognition, speech recognition, application S/W, teleconferencing
- 2) Tablet (descendant of the current notebook)
- 3) Virtual Reality (VR) headset for games
- allows you to move around, but only if there's no wire.
- delegates complex processing to a fixed server, while
- performing only video decompression.
- 4) Military
- No chance for wires; no heavy batteries, as you are too busy.
- Information warfare
- 1) Soldier locates enemy tank using laser rangefinder with GPS
- 2) request (for airstrike) goes to control officers
- 3) aircraft nearby gets the command
- 5) Pico-cell based home network for games
- Get all available service,
- Allow all possible communications among home devices,
- But with no messy wires.
- 6) Medical Uses
- pacemaker (implanted)
- health monitor
- hearing aids
- 7) GPS (for travellers/explorers, drivers (car, ship, boat), soldiers)
- 8) RF ID (for identifying people, animals, cars)
- passive type: resonant LC circuits
- active type (no battery, draws RF power from the RF field)
- 9) Smart Cards
- cash withdrawal
- encryption, COS (card OS)
4. How to reduce Power
- By all means possible: algorithm, S/W, architectures, data representation, logic circuits, place & route, clock, process, library, material
- 1) Algorithm
- adjusting the number of taps (N) in FIR filters by measuring noise power.
- [Figure: transfer function for N = 10 vs. N = 6 (low power)]
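The tap-count adjustment above can be sketched as follows. This is only an illustration under stated assumptions: the noise-power estimate (sample variance), the threshold of 0.1, the equal-coefficient filter, and all function names are hypothetical and not taken from the slides. Only the two tap counts, N = 10 and N = 6, come from the figure.

```python
def estimate_noise_power(samples):
    """Crude noise-power estimate: variance of the samples (an assumption)."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def choose_tap_count(noise_power, threshold=0.1, n_full=10, n_low=6):
    """Use the full filter only when the measured noise power is high."""
    return n_full if noise_power > threshold else n_low


def moving_average_fir(samples, n_taps):
    """Filter with n_taps equal coefficients; return (output, MAC count)."""
    coeffs = [1.0 / n_taps] * n_taps
    output, macs = [], 0
    for i in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if i - k >= 0:
                acc += c * samples[i - k]
                macs += 1  # one multiply-accumulate per tap actually used
        output.append(acc)
    return output, macs


quiet = [1.0, 1.0, 1.0, 1.0]
noisy = [0.0, 1.0, 0.0, 1.0]
taps_quiet = choose_tap_count(estimate_noise_power(quiet))
taps_noisy = choose_tap_count(estimate_noise_power(noisy))
```

Fewer taps means fewer multiply-accumulate operations per output sample, which is where the power saving would come from in this model.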
- 2) Software: similar to the case of reducing code size or improving speed of execution
- instruction selection and ordering → compiler's job
- to minimize bus switching
- minimize memory space access (reduce cache misses)
- codesign for low power
- slow down clock
- halt clock
- lower VDD
- Shut down
3) Architectures
- Parallel architecture
- Switching power $\propto f \cdot V_{DD}^2$
- Sacrifice area for low power
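A rough way to see the area-for-power trade is to evaluate the proportionality $P \propto C \cdot f \cdot V_{DD}^2$ for a duplicated datapath. The ratios below (2.2x capacitance, half the clock frequency, 0.6x supply) are illustrative assumptions, not figures from the slides, and the function name is hypothetical.

```python
def relative_switching_power(cap_ratio, freq_ratio, vdd_ratio):
    """Power relative to a reference design, using P ∝ C * f * VDD^2."""
    return cap_ratio * freq_ratio * vdd_ratio ** 2


# Reference: single datapath at full clock and full supply.
reference = relative_switching_power(1.0, 1.0, 1.0)

# Hypothetical two-way parallel datapath: about twice the switched
# capacitance (plus routing and mux overhead), each unit clocked at
# half rate, which leaves timing slack to lower the supply voltage.
parallel = relative_switching_power(cap_ratio=2.2, freq_ratio=0.5, vdd_ratio=0.6)
```

With these assumed ratios the parallel version keeps the same throughput while its switching power drops to roughly 40% of the reference, at the cost of more than double the area.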
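- Pipelining
- i) Lowering VDD lowers the speed.
- ii) Increasing the number of pipeline stages n-fold reduces the logic complexity of each stage, so the speed (throughput) rises n-fold.
- iii) This compensates for the lost speed. (There is, however, pipelining overhead, e.g., delay mismatch between stages.)
- [Figure: pipelined datapath with latches; labels VDD, f]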
- Ways to reduce switching power on a bus
- Effective capacitance
- → activity-driven bus placement
- priority for placing bus (route, layer)
- [Figure: bus placement ordered by activity α and physical capacitance (distance from core to pads); labels: SRAM data (mostly READ operations), address bus (mostly sequential access, small), display data (large)]
- ΔV (voltage swing) reduction
- - low-swing bus
- ex. GTL (Xerox)
- CTT (Mosaid)
- JTL (Jedec)
- LVTTL, LVCTT, ...
- - Charge-recycling bus
- Bus-invert encoding
- - send inverted signals when the majority of bits are switching, and de-invert at the receiver.
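A minimal sketch of bus-invert encoding, assuming one extra "invert" line next to the data lines and a bus that starts at all zeros. The function names and the 8-bit example pattern are illustrative choices, not part of the slides.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width):
    """Return a list of (bus_value, invert_bit) pairs."""
    mask = (1 << width) - 1
    prev_bus = 0  # assumed initial state of the data lines
    encoded = []
    for word in words:
        # Invert when more than half of the data lines would toggle.
        if 2 * hamming(prev_bus, word) > width:
            bus, invert = word ^ mask, 1
        else:
            bus, invert = word, 0
        encoded.append((bus, invert))
        prev_bus = bus
    return encoded


def bus_invert_decode(encoded, width):
    """De-invert at the receiver."""
    mask = (1 << width) - 1
    return [bus ^ mask if invert else bus for bus, invert in encoded]


def raw_transitions(words):
    """Toggles on an unencoded bus that starts at zero."""
    prev, total = 0, 0
    for word in words:
        total += hamming(prev, word)
        prev = word
    return total


def encoded_transitions(encoded):
    """Toggles on the data lines plus the invert line."""
    prev_bus, prev_inv, total = 0, 0, 0
    for bus, invert in encoded:
        total += hamming(prev_bus, bus) + (prev_inv ^ invert)
        prev_bus, prev_inv = bus, invert
    return total


data = [0x00, 0xFF, 0x00, 0xFF]
coded = bus_invert_encode(data, width=8)
```

For this worst-case alternating pattern, the data lines never toggle after encoding; only the invert line does.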
4) Data representation
- Gray code vs. binary 2's (or 1's) complement
- ratio of the number of toggles
- signed magnitude vs. 2's complement
- Signed magnitude: zero crossing → only the sign bit switches
- 2's complement: zero crossing → full switching
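The toggle-count comparison between Gray code and plain binary can be checked with a short sketch. It assumes a 4-bit value stepping sequentially from 0 to 15 without wrap-around; the function names are illustrative.

```python
def to_gray(value):
    """Standard reflected binary Gray code."""
    return value ^ (value >> 1)


def count_toggles(codes):
    """Total number of bit flips between consecutive code words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(codes, codes[1:]))


sequence = list(range(16))  # sequential 4-bit count, 0 .. 15
binary_toggles = count_toggles(sequence)
gray_toggles = count_toggles([to_gray(v) for v in sequence])
toggle_ratio = binary_toggles / gray_toggles
```

Gray code flips exactly one bit per step, so for sequential data (counters, sequential addresses) it switches noticeably fewer lines than binary.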
5) Logic
- Signal gating: masking unwanted switching activities from propagating forward and causing unnecessary power dissipation.
- Additional power due to control signal generation should be small. The frequency of the control signal needs to be lower than the signal frequency.
- Logic encoding: binary vs. Gray code for counters
- State encoding
- $E(M_1)$ = expected number of switchings per transition
- $= 2(0.3 + 0.4) + 1(0.1 + 0.1) = 1.6$
- $E(M_2) = 1(0.3 + 0.4 + 0.1) + 2(0.1) = 1.0$
- - assigning don't cares to either 1 or 0 for low switching
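The expected-switching figures above can be reproduced with a small sketch. The state diagram itself is not in the transcript, so the four-state machine below (states A to D, the edge list, and both code assignments) is a hypothetical reconstruction chosen only so that the Hamming distances match the two expressions for E(M1) and E(M2).

```python
def hamming(a, b):
    """Number of state bits that differ between two codes."""
    return bin(a ^ b).count("1")


def expected_switching(transitions, encoding):
    """Sum of (transition probability x bits flipped) over all transitions."""
    return sum(prob * hamming(encoding[src], encoding[dst])
               for src, dst, prob in transitions)


# Hypothetical transitions (source, destination, probability).
TRANSITIONS = [("A", "B", 0.3), ("C", "D", 0.4), ("A", "C", 0.1), ("B", "C", 0.1)]

# Two candidate 2-bit state assignments.
M1 = {"A": 0b00, "B": 0b11, "C": 0b01, "D": 0b10}
M2 = {"A": 0b00, "B": 0b01, "C": 0b10, "D": 0b11}

e_m1 = expected_switching(TRANSITIONS, M1)
e_m2 = expected_switching(TRANSITIONS, M2)
```

Encoding M2 places the most probable transitions between codes one bit apart, which is why its expected switching is lower.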
- Precomputation logic
- saves power by masking uninfluential input signals to the combinational logic with g(x), the precomputation logic.
- I.e., for the output f(x), there may be some conditions under which f(x) is independent of some set of input signals latched in R2, which can then be disabled according to g(x).
- ex.) Binary comparator: $f(A,B) = 1$ if $A > B$
- $g(x) = A_n \oplus B_n$
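The comparator example can be checked by brute force. The sketch below assumes g is the XOR of the two most significant bits and uses hypothetical function names; it verifies that whenever g = 1 the result depends on the MSBs alone, and measures how often the lower bits could be gated for uniformly distributed inputs.

```python
def comparator(a, b):
    """f(A, B) = 1 if A > B."""
    return int(a > b)


def precompute_g(a, b, n_bits):
    """g = XOR of the most significant bits of A and B (assumed form)."""
    msb = n_bits - 1
    return ((a >> msb) ^ (b >> msb)) & 1


def check_comparator(n_bits):
    """Return (g is valid, fraction of input pairs with g = 1)."""
    total = gated = 0
    valid = True
    for a in range(1 << n_bits):
        for b in range(1 << n_bits):
            total += 1
            if precompute_g(a, b, n_bits):
                gated += 1
                # With differing MSBs, A > B exactly when A's MSB is 1.
                if comparator(a, b) != (a >> (n_bits - 1)):
                    valid = False
    return valid, gated / total


is_valid, gated_fraction = check_comparator(4)
```

Under the uniform-input assumption, the lower-order bits need not be loaded for half of all input pairs.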
- Systematic method to derive a pre-computation function, g(x), given f(x), R1 and R2
- Let $f(p_1, \ldots, p_m, x_1, \ldots, x_n)$ be a Boolean function where $p_1, \ldots, p_m$ are pre-computed inputs corresponding to R1, and $x_1, \ldots, x_n$ are gated inputs corresponding to R2.
- Let $f_{x_i}$ ($f_{\bar{x}_i}$) be the Boolean function obtained by setting $x_i = 1$ ($x_i = 0$) in f.
- Define $U_{x_i} f$ (universal quantification of f w.r.t. $x_i$) $= f_{x_i} \cdot f_{\bar{x}_i}$
- Then $U_{x_i} f = 1$ implies $f = 1$ regardless of the value of $x_i$, because $U_{x_i} f = 1$ means $f_{x_i} = f_{\bar{x}_i} = 1$ in the Shannon decomposition of f w.r.t. $x_i$:
- $f = x_i f_{x_i} + \bar{x}_i f_{\bar{x}_i}$
- Let $g_1 = U_{x_1} U_{x_2} \cdots U_{x_n} f$
- Then $g_1 = 1$ implies that $f = 1$ regardless of the values of $x_1, \ldots, x_n$. I.e., $g_1 = 1$ is one of the conditions where f is independent of the input values of $x_1, \ldots, x_n$.
- Similarly, $g_0 = U_{x_1} U_{x_2} \cdots U_{x_n} \bar{f}$; $g_0 = 1$ implies that $f = 0$ regardless of $x_1, \ldots, x_n$.
- Then $g = g_1 + g_0$ is the pre-computation function. I.e., if $g = 1$, we can disable the loading of $x_1, \ldots, x_n$ into R2 because the output f is independent of the gated inputs.
- g, computed this way, may not be the unique pre-computation function, but it contains the largest number of 1s in its truth table among all pre-computation functions.
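The derivation of g can be sketched with truth-table enumeration. Here f is passed as a Python callable taking the precomputed inputs p and the gated inputs x as tuples of bits; that interface, the function names, and the 2-bit comparator used as a test case are assumptions made for illustration.

```python
from itertools import product


def precomputation_function(f, m, n):
    """Return g as a dict from each assignment of p (m bits) to 0 or 1.

    g1 = 1 when f is 1 for every assignment of the n gated inputs
    (universal quantification of f over x1..xn); g0 does the same
    for the complement of f; g = g1 OR g0.
    """
    g = {}
    for p in product((0, 1), repeat=m):
        values = [f(p, x) for x in product((0, 1), repeat=n)]
        g1 = all(v == 1 for v in values)
        g0 = all(v == 0 for v in values)
        g[p] = int(g1 or g0)
    return g


def comparator_2bit(p, x):
    """f = 1 if A > B, with p = (A1, B1) as MSBs and x = (A0, B0) as LSBs."""
    a1, b1 = p
    a0, b0 = x
    return int(2 * a1 + a0 > 2 * b1 + b0)


g_table = precomputation_function(comparator_2bit, m=2, n=2)
```

For the 2-bit comparator the resulting table is 1 exactly when the two MSBs differ, i.e. g reduces to A1 XOR B1.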
- Example 1) Precomputation architecture based on Shannon's decomposition
- $f(x_1, \ldots, x_n) = x_i f_{x_i} + \bar{x}_i f_{\bar{x}_i}$
- Example 2) Latch-based pre-computation architecture
6) Low Power Circuits
- Use static rather than dynamic
- to avoid unnecessary precharge
- low static power
- self reverse bias for reducing subthreshold
current
- Compromise between dynamic and leakage power dissipation
- Multi-VT (threshold): speed-critical part → low VT
- power-critical part → high VT
- - by back-gate bias: routing is difficult
- - by additional implant
- Adiabatic Computing
- Power dissipation is due to the voltage drop on R → reduce it!
- by gradual rise and fall of inputs
- → use a multi-step clock
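The benefit of a multi-step clock can be illustrated with an idealized model: a capacitor C charged from 0 to VDD through a resistor in N equal voltage steps, with each step allowed to settle fully. Under those assumptions the energy lost in R is $C V_{DD}^2 / (2N)$. The function name and the normalized values are illustrative.

```python
def stepwise_charging_loss(cap, vdd, steps):
    """Energy dissipated in R when charging cap to vdd in equal steps."""
    step_v = vdd / steps
    supplied = 0.0
    for k in range(1, steps + 1):
        charge = cap * step_v            # charge delivered during step k
        supplied += charge * (k * step_v)  # drawn from a source at k * step_v
    stored = 0.5 * cap * vdd ** 2        # energy left on the capacitor
    return supplied - stored


single_step = stepwise_charging_loss(cap=1.0, vdd=1.0, steps=1)
four_steps = stepwise_charging_loss(cap=1.0, vdd=1.0, steps=4)
```

In this model, four steps cut the dissipation to a quarter of the single-step value, because the voltage drop across R during each step is four times smaller.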
- Delay vs. power supply voltage (Td vs. VDD): $T_d \propto V_{DD}^{-1}$
- Power-delay product (energy) vs. delay for various circuits
7) Power reduction in clock network
- Why bother with the clock network?
- In a synchronous circuit, the clock is generally the highest-frequency signal.
- And the clock typically drives a large load, as it has to reach many sequential elements.
- In the Alpha chip, power consumption in the clock network is 40% of the total.
- Clock gating
- Most popular method for power reduction of clock signals
- effective when some functional module (ALU, memory, FPU, etc.) is not required for some extended period.
- A gated clock suffers additional gate delay due to the gating function.
- Reduced clock swing
- Conventional vs. half-swing clocking
- Charge-sharing circuit for half-swing clock
- → $V_H \approx 0.5\,V_{DD}$ if $C_A = C_B \gg C_1, C_2, C_3, C_4$
- Simple charge-sharing circuit
- Tri-state keeper circuit
- A floating node with its potential somewhere between GND and VDD is noise-sensitive and can cause DC power dissipation in the fanin gate
- Floating bus suppressor circuit
- Blocking gate
- Fanin gates connected to a floating node (floating because its driver is powered down) can experience large short-circuit current.
- Use a blocking NAND gate
- Reduction of switching activity
- guarded evaluation
- adding latches or blocking gates before the C/L if its outputs are not used.
- Careful bus multiplexing for positively correlated data streams
- Aggressive bus multiplexing for negatively correlated data streams
8) Process
- Conflict:
- VDD reduction → reduce VT
- Standby current increases → VT not too small
- leakage reduction → junction profile, steep subthreshold slope
- switching power reduction → reduce parasitic C
- (same goal as for high speed)
- retrograde channel
- trench
- sidewall spacer for S/D implant
9) Library
- Small size, various sizes for transistor sizing
- for delay balancing
- long interconnect on a low-C layer
- to reduce glitches
- to reduce buffer size
10) Material: low-ε inter-layer dielectric; low-ρ material for interconnect → copper