Title: A Comprehensive Approach to DRAM Power Management
1. A Comprehensive Approach to DRAM Power Management
- Ibrahim Hur and Calvin Lin
- IBM Austin
- The University of Texas at Austin
2. Power Consumption
- Power has become increasingly important
- Designers consider limiting performance to stay within power budgets
- DRAM power is significant
- 45% of total system power (Lefurgy et al., IEEE Computer 2003)
3. DRAM Power Management
- Two possible goals
- Reduce DRAM power with minimal performance degradation
- Stay within some given DRAM power budget while degrading performance as little as possible
4. First Goal: Reducing DRAM Power
- Put certain ranks of DRAM into low-power mode
- Entering and exiting has overhead
- Ranks must remain in low-power mode for some minimum number of cycles
- How to enter and exit low-power mode?
- Enter and exit too frequently → increased DRAM latency
- Enter and exit too infrequently → less power savings
- We introduce a new power-down approach and a new memory scheduler
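The trade-off on this slide can be illustrated with a small simulation. This is only a sketch of a plain idle-timeout policy, not the power-down approach the talk introduces. The function name, the parameters (`timeout`, `exit_penalty`, `min_residency`), the zero service time, and the arrival trace are all assumptions made for illustration.

```python
def simulate_idle_timeout(arrivals, timeout, exit_penalty, min_residency):
    """Simulate one DRAM rank under a fixed idle-timeout power-down policy.

    arrivals      -- sorted request arrival times, in cycles
    timeout       -- idle cycles after which the rank enters low-power mode
    exit_penalty  -- cycles needed to leave low-power mode
    min_residency -- minimum cycles the rank must stay in low-power mode

    Assumption: requests take zero cycles to serve once the rank is awake.
    Returns (cycles spent in low-power mode, total added request latency).
    """
    idle_since = 0          # cycle at which the rank last became idle
    low_power_cycles = 0
    added_latency = 0
    for t in arrivals:
        if t - idle_since > timeout:
            # The rank powered down before this request arrived.
            down_at = idle_since + timeout
            wake_at = max(t, down_at + min_residency)
            low_power_cycles += wake_at - down_at
            served = wake_at + exit_penalty
        else:
            # The rank is still awake (or still finishing a wake-up).
            served = max(t, idle_since)
        added_latency += served - t
        idle_since = served
    return low_power_cycles, added_latency


# Made-up arrival trace, for illustration only.
trace = [0, 50, 400, 450, 2000]
aggressive = simulate_idle_timeout(trace, timeout=20, exit_penalty=10, min_residency=100)
relaxed = simulate_idle_timeout(trace, timeout=200, exit_penalty=10, min_residency=100)
never = simulate_idle_timeout(trace, timeout=5000, exit_penalty=10, min_residency=100)
```

On this trace the aggressive timeout spends the most cycles in low-power mode but also adds the most latency, while the relaxed timeout gives up some savings for much less delay.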
5. Second Goal: Reducing DRAM Power Arbitrarily
- Power Shifting (Felter et al., ICS 2005)
- Dynamically assign power budgets to CPU and DRAM
- What if DRAM power reduction from low-power modes is not sufficient?
- Throttle memory commands in the memory controller
- How can we throttle the system accurately?
6. Performance Effects
- Throttling degrades performance
[Figure: plot with an axis labeled T]
7. What is the Optimal Throttling Degree?
- Inaccurate throttling
- Power consumption is over the budget
- Unnecessary performance loss
[Figure: curves for Application 1 and Application 2, with points A and B marked]
8. Outline
- The Problem
- Reducing DRAM Power
- Memory Throttling
- Our approach: A comprehensive solution
- Queue-Aware Power-Down Mechanism
- Power/Performance-Aware Scheduling
- Adaptive Memory Throttling
- Results
- Conclusions
9. Adaptive Memory Throttling
- A novel approach to perform accurate memory throttling
- Embeds a linear estimation model inside the memory controller
- Simple and low cost
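10. [Diagram: memory controller with adaptive throttling]
- Processors/Caches send Reads/Writes to the MEMORY CONTROLLER, which issues commands to DRAM
- MEMORY CONTROLLER contains: Read/Write Queues, Scheduler, Memory Queue, Throttle Delay Estimator, Throttling Mechanism
- Model Builder (a software tool, active only during system design/install time): sets the parameters for the delay estimator
- Throttle Delay Estimator (input: Power Target): determines how much to throttle, every 1 million cycles
- Throttling Mechanism: decides whether to throttle or not, every cycle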
11. Throttling Mechanism
- Stall all traffic from the memory controller to DRAM for T cycles in every 10,000-cycle interval
[Figure: timeline of consecutive 10,000-cycle intervals, each split into an active phase followed by a stall phase of T cycles]
- How to calculate T (throttling delay)?
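As a minimal sketch of the mechanism described above, the per-cycle throttling decision could look like the following. The 10,000-cycle interval comes from the slide. Placing the stall at the end of each interval follows the order shown in the timeline figure but is still an assumption, and the function names are hypothetical.

```python
INTERVAL = 10_000  # cycles per throttling interval (from the slide)


def is_stalled(cycle: int, t_delay: int, interval: int = INTERVAL) -> bool:
    """Return True if controller-to-DRAM traffic is stalled at `cycle`.

    Assumption: the stall occupies the last `t_delay` cycles of each interval.
    """
    if not 0 <= t_delay <= interval:
        raise ValueError("t_delay must lie in [0, interval]")
    return (cycle % interval) >= interval - t_delay


def active_fraction(t_delay: int, interval: int = INTERVAL) -> float:
    """Fraction of cycles in which memory commands may still be issued."""
    return (interval - t_delay) / interval
```

With T = 2,500, for example, traffic is allowed during cycles 0 to 7,499 of every interval and stalled during cycles 7,500 to 9,999, so 75% of the cycles remain available.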
12. Delay Estimator
- Calculates the throttling delay, T, using a linear model
- Input: Power threshold and information about memory access behavior of the application
- Output: Throttling delay
- Calculates the delay periodically (in epochs)
- Assumes consecutive epochs have similar behavior
- Epoch length is long (1 million cycles), so overhead is small
- What are the features and the coefficients of the linear model?
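The estimator described on this slide might be sketched as below. The linear form with a constant term, the clamping of T to the interval length, and every name and number in the example are assumptions for illustration. The slide only states that a linear model maps the power threshold and memory access behavior to a delay once per epoch.

```python
from typing import List, Sequence

INTERVAL = 10_000  # cycles per throttling interval


def estimate_delay(coeffs: Sequence[float], features: Sequence[float],
                   interval: int = INTERVAL) -> int:
    """Evaluate T = coeffs[0] + sum(coeffs[i + 1] * features[i]).

    The result is rounded and clamped to [0, interval], since the stall
    cannot be negative or longer than the interval itself (an assumption).
    """
    if len(coeffs) != len(features) + 1:
        raise ValueError("need one coefficient per feature plus a constant")
    raw = coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], features))
    return max(0, min(interval, round(raw)))


def delays_per_epoch(coeffs: Sequence[float], power_threshold: float,
                     epoch_stats: Sequence[Sequence[float]]) -> List[int]:
    """Compute one delay per epoch from that epoch's access statistics.

    Because consecutive epochs are assumed to behave similarly, the delay
    computed from epoch k would be applied during epoch k + 1.
    """
    return [estimate_delay(coeffs, [power_threshold, *stats])
            for stats in epoch_stats]


# Made-up coefficients and statistics (reads, writes), for illustration only.
example_coeffs = [9000.0, -100.0, 0.01, 0.02]
example_delays = delays_per_epoch(example_coeffs, 50.0,
                                  [(100_000, 50_000), (200_000, 50_000)])
```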
13. Model Building
- An offline process performed during system design/installation
- Step 1: Perform experiments with various memory access behavior
- Step 2: Determine models and model features
- Needs human interaction during system design time
- Step 3: Compute model coefficients
- Solution of a linear system of equations
14. Model Building Step 1: Experiments
15. Model Building Step 2: Determining Features
- Model features that we determine
- Power threshold
- Number of Reads
- Number of Writes
- Bank conflict information
- Possible Models
- T1: Uses only Power threshold
- T2: Uses Power, Reads, Writes
- T3: Uses all features
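Written out, the three candidate models might take the following form. This is a sketch only: the slide names the features but not the equations, so the constant terms, the use of a single scalar B for bank conflict information, and all symbols (P for power threshold, R for reads, W for writes, and the coefficients a, b, c) are assumptions.

```latex
\begin{aligned}
T_1 &= a_0 + a_1 P \\
T_2 &= b_0 + b_1 P + b_2 R + b_3 W \\
T_3 &= c_0 + c_1 P + c_2 R + c_3 W + c_4 B
\end{aligned}
```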
16. Model Building Step 3: Determining Coefficients
- Step 1: Set up a system of equations
- Known values are measurement data
- Unknowns are model coefficients
- Step 2: Solve the system
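The two steps above can be sketched in plain Python. The example assumes a model with a constant term plus three features (power threshold, reads, writes) and exactly as many measurements as unknowns, solved by Gaussian elimination. The measurement values are made up, and the talk does not say which solver or how many measurements were actually used.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: measurements are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_coefficients(measurements):
    """Step 1: build one equation per measurement. Step 2: solve.

    Each measurement is (power, reads, writes, observed_delay); the unknowns
    are the constant term and one coefficient per feature.
    """
    a = [[1.0, p, r, w] for p, r, w, _ in measurements]
    b = [t for _, _, _, t in measurements]
    return solve_linear_system(a, b)


# Made-up measurements: (power threshold, reads in thousands,
# writes in thousands, delay T that met the threshold).
measurements = [
    (40, 100, 50, 7000),
    (50, 100, 50, 6000),
    (50, 200, 50, 7000),
    (50, 200, 150, 9000),
]
coeffs = fit_coefficients(measurements)
```

With more measurements than unknowns, a least-squares fit would be the usual substitute for an exact solve.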
17. Accuracy of the Models
[Figure: three plots, labeled R² = 0.191, R² = 0.122, and R² = 0.003]
18. Evaluation
- Used a cycle-accurate IBM Power5 simulator that the IBM design team uses
- Simulated performance and DRAM power
- 2.1 GHz, 533-DDR2
- Evaluated single thread and SMT configurations
- Stream
- NAS
- SPEC CPU2006fp
- Commercial benchmarks
19. The IBM Power5
- 2 cores on a chip
- SMT capability
- 300 million transistors
- Memory Controller (1.6% of chip area)
20. Memory System Parameters
21. Results
22. Results-2: Performance Effects
23. Conclusions
- Introduced three techniques for DRAM power management
- Adaptive Memory Throttling
- Queue-Aware Power-Down (not presented)
- Power-Aware Scheduler (not presented)
- Evaluated on a highly tuned system, IBM Power5
- Simple and accurate
- Low cost
- Some other results in the paper (not presented)
- Energy efficiency improvements from our Power-Down mechanism and Power-Aware Scheduler
- Stream: 18.1%
- SPECfp2006: 46.1%