Power-Efficient Architecture: A Primer and Some Research Ideas

1
Power-Efficient Architecture: A Primer and Some Research Ideas
  • Prof. Margaret Martonosi
  • Dept. of Electrical Engineering
  • Princeton University

2
CPUs in the 1990s
  • The Good News
  • Technology and architecture improvements give us
    exponential performance improvements
  • 2X performance every 18 months
  • The Bad News
  • Increases in power dissipation have a slower
    doubling rate, but are also exponential

3
Why care about Power?
  • Battery-operated devices
  • You carry the energy you use: current battery
    technology stores roughly 20 watt-hours per pound
  • Important even in non-battery devices though...
  • Heat dissipation
  • packaging
  • cost
  • Environment!?!
  • Current delivery: getting tough to deliver large
    currents into the chip. 30 W at 3 V is 10 A!
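
The two figures on this slide are simple arithmetic: current is power divided by voltage, and battery weight is energy divided by energy density. A minimal sketch follows; the function names and the 2-hour run time are illustrative assumptions, not from the slides.

```python
def supply_current_amps(power_watts: float, supply_volts: float) -> float:
    """Current the chip draws at a given power and supply voltage (I = P / V)."""
    return power_watts / supply_volts


def battery_weight_pounds(power_watts: float, hours: float,
                          wh_per_pound: float = 20.0) -> float:
    """Battery weight needed for a load, using the slide's ~20 Wh per pound."""
    return power_watts * hours / wh_per_pound


print(supply_current_amps(30, 3))    # 10.0 amps, matching the slide
print(battery_weight_pounds(30, 2))  # 3.0 pounds for a hypothetical 2-hour run
```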

4
Power Dissipation Basics
  • In current CMOS circuits
  • static power is negligible
  • leakage currents
  • turn off the clock, and power dissipation -> 0
  • dynamic power matters
  • charging/discharging capacitance in circuit
  • when bits change, power is dissipated

5
Power-Efficient CPUs: Background
  • Reducing dynamic power dissipation
  • $P_d \propto C V^2 N f$
  • Lots of work ongoing at the circuits level
  • lots of work on reducing V (square law helps)
  • almost all CPU supply voltages are < 3V now...
  • Can always reduce f: useful, but hurts
    performance!
  • Analogy: Car has best gas mileage when idling, so
    don't move?!?
  • Reducing N
  • clock gating...
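
To make the proportionality above concrete, here is a minimal sketch that compares the knobs on this slide against a normalized baseline. The slide does not define its symbols; this sketch assumes C is switched capacitance, V is supply voltage, N is switching activity, and f is clock frequency. The function names and the sample scaling factors are illustrative only.

```python
def relative_dynamic_power(c: float = 1.0, v: float = 1.0,
                           n: float = 1.0, f: float = 1.0) -> float:
    """Dynamic power relative to a baseline where every factor is 1.0."""
    return c * v ** 2 * n * f


def relative_energy_per_task(c: float = 1.0, v: float = 1.0,
                             n: float = 1.0, f: float = 1.0) -> float:
    """Energy for a fixed number of cycles: power times run time (1 / f)."""
    return relative_dynamic_power(c, v, n, f) / f


# Halving V cuts power to a quarter (the square law).
print(relative_dynamic_power(v=0.5))    # 0.25
# Halving f halves power, but the task takes twice as long,
# so the energy spent on the task does not change.
print(relative_dynamic_power(f=0.5))    # 0.5
print(relative_energy_per_task(f=0.5))  # 1.0
# Reducing N (for example via clock gating) scales power linearly.
print(relative_dynamic_power(n=0.7))    # 0.7
```

Under this simple model, lowering f alone trades performance for power without saving energy per task, which is why the slide treats it as the least attractive option.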

6
Power Optimization: More Global Approaches
  • Rather than attacking C, V, f, and N within a
    particular pre-set architecture, look at ways of
    dramatically restructuring architectures to:
  • Streamline work and bit transitions to exactly
    what application needs
  • Shorten wires
  • Replace broadcast structures with point-to-point
    structures
  • Look at ways of performing several operations in
    parallel at a slower clock rate
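
The last bullet can be illustrated with the same power model used earlier in the deck (dynamic power proportional to C, V squared, N, and f). The sketch below rests on assumptions the slide does not state: that duplicating a functional unit doubles the switched capacitance, and that running at half the clock rate lets the supply voltage drop to 0.6 of its original value. Both numbers are purely illustrative.

```python
def relative_dynamic_power(c: float, v: float, n: float, f: float) -> float:
    """Dynamic power relative to a baseline where every factor is 1.0."""
    return c * v ** 2 * n * f


# Baseline: one unit at full clock rate and full supply voltage.
single_unit = relative_dynamic_power(c=1.0, v=1.0, n=1.0, f=1.0)

# Two units in parallel, each at half the clock rate (same throughput).
# Assumed: capacitance doubles and the slower clock tolerates 0.6x voltage.
two_units = relative_dynamic_power(c=2.0, v=0.6, n=1.0, f=0.5)

print(f"single unit: {single_unit:.2f}")
print(f"two slower units: {two_units:.2f}")
```

With these assumed numbers the parallel design delivers the same throughput at roughly a third of the power; the saving comes entirely from the voltage reduction, not from the parallelism itself.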

7
Operand-Value-Based Analysis and Optimizations
  • Currently: some compiler and hardware
    optimizations are specific to operand values
  • Constant propagation, algebraic simplification
  • Null arithmetic elimination
  • Our Goals
  • Tailor computations to particular categories of
    operand values being computed on
  • e.g. narrow-bitwidth operands
  • Consider both software and hardware mechanisms
  • also both compile-time and run-time mechanisms

8
Example: Narrow-Width Operands
  • 64-bit architectures are largely motivated by
    addresses getting larger; data has not increased
    as quickly
  • Multimedia instruction set extensions like MMX
    try to parallelize operations on narrow-width
    operands
  • Works well when programmer gives sufficient type
    information to infer operand sizes
  • but programmers aren't always fastidious about
    defining variables to be as small as possible.
  • Goal: harness optimizations for narrow-width
    operands even when the programmer hasn't defined
    quantities as narrow-width.

9
Motivation: Narrow-Width Operands are Common!
[Figure: plot with axis label "Cumulative Percent Occurrence"]
  • Multimedia instruction sets (e.g. MMX) take
    advantage of them for sub-word parallelism
  • But, general-purpose apps also have many
    operations that turn out to have small (< 16 bits)
    operand sizes

10
Optimizing for Narrow-Width Operations: What to Do?
  • When performing operations with narrow-width
    operands, the upper bits are not needed in the
    computation
  • What can be done to avoid wasting work on the
    upper bits?
  • Save power through selective clock gating
  • Increase performance through MMX-style packing
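
As a rough illustration of MMX-style packing, the sketch below emulates in software how four 16-bit operands can share one 64-bit word and be added with a single wide addition. This is a generic sub-word-parallel add written for illustration, not the hardware mechanism from the talk; all names are hypothetical.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1
# Top bit of every 16-bit lane in a 64-bit word.
HIGH_BITS = 0x8000_8000_8000_8000
LOW_BITS = 0x7FFF_7FFF_7FFF_7FFF


def pack4(values: list[int]) -> int:
    """Pack four unsigned 16-bit values into one 64-bit word (lane 0 lowest)."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack4(word: int) -> list[int]:
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add16(a: int, b: int) -> int:
    """Add four 16-bit lanes at once; each lane wraps modulo 2**16."""
    # Add the low 15 bits of every lane; no carry can cross a lane boundary.
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    # Fold each lane's top bit back in without propagating a carry.
    return low_sum ^ ((a ^ b) & HIGH_BITS)


result = packed_add16(pack4([1, 2, 3, 4]), pack4([10, 20, 30, 40]))
print(unpack4(result))  # [11, 22, 33, 44]
```

One wide addition does the work of four narrow ones, which is the performance side of the trade-off; the clock-gating option instead leaves the upper bits idle to save power.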

11
Clock Gating Architecture
  • We propose selective clock gating based on the
    operand values.
  • Observe operand values going to/from registers
  • For operations with 2 narrow-width operands,
    disable upper bits of the functional unit.
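
The gating decision described above can be sketched as a simple check on both operands. This is a software model for illustration only: it assumes a 64-bit datapath, a 16-bit threshold for "narrow", and unsigned operands (so "narrow" means the upper bits are all zero). The slides do not specify these details, and the function names are hypothetical.

```python
WORD_BITS = 64    # assumed datapath width
NARROW_BITS = 16  # assumed threshold for a "narrow" operand


def is_narrow(value: int, narrow_bits: int = NARROW_BITS) -> bool:
    """True if an unsigned value fits in narrow_bits (upper bits all zero)."""
    return 0 <= value < (1 << narrow_bits)


def can_gate_upper_bits(op_a: int, op_b: int) -> bool:
    """Gate the upper part of the functional unit only if both operands are narrow."""
    return is_narrow(op_a) and is_narrow(op_b)


def gated_fraction(operand_pairs: list[tuple[int, int]]) -> float:
    """Fraction of operations in a trace for which the upper bits could be gated."""
    gated = sum(1 for a, b in operand_pairs if can_gate_upper_bits(a, b))
    return gated / len(operand_pairs)


trace = [(3, 7), (70000, 2), (65535, 1), (1 << 40, 1 << 33)]
print([can_gate_upper_bits(a, b) for a, b in trace])  # [True, False, True, False]
print(gated_fraction(trace))                          # 0.5
```

A single wide operand is enough to keep the full unit active, so the achievable savings depend on how often both operands of an operation are narrow at the same time.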

12
Operand-Based Clock Gating: Power Savings Results
[Figure: Integer Unit Power Consumption (mW)]
  • Total power saved: 50-60% of the power consumed
    by the integer execution unit
  • High-performance microprocessors: integer
    execution unit is 10-15% of overall power
  • 5-10% of total power
  • In VLIW and DSPs, this number is likely to be
    even larger
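
The total-power figure on this slide follows from multiplying the savings within the integer unit by that unit's share of chip power. A minimal sketch of the arithmetic, using only the ranges quoted on the slide (the function name is illustrative):

```python
def total_power_saving(unit_saving: float, unit_share: float) -> float:
    """Chip-level saving = saving inside a unit x that unit's share of total power."""
    return unit_saving * unit_share


# Low end: 50% saved in a unit that draws 10% of chip power.
low = total_power_saving(0.50, 0.10)
# High end: 60% saved in a unit that draws 15% of chip power.
high = total_power_saving(0.60, 0.15)

print(f"total saving between {low:.1%} and {high:.1%}")
```

The product works out to roughly 5% to 9% of total power, in line with the range quoted on the slide.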

13
More Power-Efficient Architecture Work
  • Other Value-based optimizations
  • Dynamic strength reduction
  • to eliminate unneeded ALU ops
  • Explicit Value Steering
  • Register renaming, operand bypassing, reservation
    stations are all dynamic techniques that
    support passing data output by one calculation
    directly to the inputs of another.
  • Look at more explicit, program-controlled ways of
    doing this with lower power.
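
The slides do not spell out how dynamic strength reduction would work. One plausible reading, sketched here as an assumption, is to inspect operand values at run time and substitute a cheaper operation when the values allow it: skip the ALU entirely for a multiply by 0 or 1, and use a shift for a multiply by a power of two. The function names and the returned labels are hypothetical.

```python
def is_power_of_two(x: int) -> bool:
    """True for 1, 2, 4, 8, ... (exactly one bit set)."""
    return x > 0 and (x & (x - 1)) == 0


def multiply_with_strength_reduction(a: int, b: int) -> tuple[int, str]:
    """Multiply non-negative ints, reporting which operation was actually needed."""
    # Trivial cases: no arithmetic is required at all.
    for x, y in ((a, b), (b, a)):
        if y == 0:
            return 0, "none"
        if y == 1:
            return x, "none"
    # Power-of-two operand: a shift replaces the multiplier.
    for x, y in ((a, b), (b, a)):
        if is_power_of_two(y):
            return x << (y.bit_length() - 1), "shift"
    # General case: fall back to the full multiply.
    return a * b, "multiply"


print(multiply_with_strength_reduction(6, 8))  # (48, 'shift')
print(multiply_with_strength_reduction(5, 1))  # (5, 'none')
print(multiply_with_strength_reduction(7, 9))  # (63, 'multiply')
```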

14
Power Efficiency: A Systems-Level Approach
  • Compiler/CPU interaction: Need to give compiler
    hooks into the hardware for optimizing program
    power
  • System-level power management: OS and run-time
    systems can manage power by prioritizing
    activities. Some activities need not be done as
    often if power is a concern.
  • E.g. turn off MS Word's spell-checking on a
    long-distance flight
  • Communication/Computation tradeoffs: Power really
    exacerbates all of the standard Comm/Comp
    tradeoffs. Need to study protocols and OS issues
    related to this.