Title: Kevin Skadron
1The Laboratory for Computer Architecture at
Virginia (LAVA)
- Kevin Skadron
- University of Virginia
- Department of Computer Science
2Why We Care About Thermal Management...
Source: Tom's Hardware Guide, http://www6.tomshardware.com/cpu/01q3/010917/heatvideo-01.html
3Dynamic Thermal Management
- Dynamically adjust execution to control temperature
- Avoid catastrophic failure (heat sink, fan)
- Permit the use of a less expensive thermal package
- Design for less than the worst case
- Package costs $1/W above 40 W
- Peak power as high as 130 W in 1-2 generations (SIA roadmap)
- Temperatures over 100 °C
4Dynamic Thermal Management
- Deal with hot spots
- Localized heating occurs much faster than chip-wide
- Chip-wide treatment is too conservative
- Prove temperature will be safely bounded
5Thermal Modeling
- Want a fine-grained model of temperature
- Power dissipation too indirect, not easy to
measure in HW
6Ohm's Law for Temperature
- V ↔ temp
- I ↔ power
- R ↔ thermal resistance
- C ↔ thermal capacitance
- RC ↔ time constant
- $\Delta V = \frac{I\,\Delta t}{C} - \frac{V\,\Delta t}{RC}$
- Lets us compute stepwise changes in temperature for any granularity at which we can get P, T, R, C
- steady-state: $V = IR$ ($T = PR$)
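The electrical analogy can be exercised numerically. The following is a minimal sketch, assuming the stepwise update ΔT = P·Δt/C − T·Δt/(RC) with T measured relative to ambient. The function name `step_temperature` and the values of R, C, P, and the time step are illustrative assumptions, not figures from the talk.

```python
def step_temperature(temp, power, r, c, dt):
    """Advance temperature (relative to ambient) by one time step of length dt.

    Implements dT = power*dt/c - temp*dt/(r*c), the thermal analogue of
    charging a capacitor C through a resistor R with current I.
    """
    return temp + power * dt / c - temp * dt / (r * c)


# Hypothetical values chosen only for illustration.
R = 2.0        # thermal resistance, K/W
C = 25e-6      # thermal capacitance, J/K (R*C = 50 microseconds)
P = 4.0        # power dissipated, W
DT = 1e-6      # time step, s

temp = 0.0
for _ in range(1000):   # 1000 steps of 1 us = 20 time constants
    temp = step_temperature(temp, P, R, C, DT)

steady_state = P * R    # steady state: T = P*R, like V = I*R
print(f"simulated: {temp:.4f} K above ambient, steady state: {steady_state:.4f} K")
```

After many time constants the simulated value should settle at the steady-state product P·R, which is a quick sanity check on any chosen step size.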
7Thermal Modeling
- Use thermal resistance and capacitance of Si
- Develop computationally efficient model based on lumped values
- $\Delta T_i = \frac{P_i\,\Delta t}{C_i} - \frac{T_i\,\Delta t}{R_i C_i}$
- Integrate in Wattch (power/performance simulator)
- Time evolution of temperature is driven by unit activities and power dissipations on a per-cycle basis
- Detect hot spots and activate thermal response
- Typical time constant: 10-100 μs
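A per-unit version of the lumped model might look like the sketch below. It assumes each block i has its own R_i and C_i and is updated independently with ΔT_i = P_i·Δt/C_i − T_i·Δt/(R_i·C_i); thermal coupling between blocks is ignored here. The `Block` class, the `hot_spots` helper, the block names, and all numeric values are hypothetical, and this is not Wattch code.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One lumped thermal node; all numeric values used below are assumptions."""
    name: str
    r: float            # thermal resistance R_i, K/W
    c: float            # thermal capacitance C_i, J/K
    temp: float = 0.0   # temperature rise T_i above the base temperature, K

    def step(self, power, dt):
        # dT_i = P_i*dt/C_i - T_i*dt/(R_i*C_i)
        self.temp += power * dt / self.c - self.temp * dt / (self.r * self.c)


def hot_spots(blocks, threshold):
    """Return the names of blocks whose temperature rise exceeds threshold."""
    return [b.name for b in blocks if b.temp > threshold]


# Two blocks with R*C = 20 microseconds each.
blocks = [Block("bpred", r=2.0, c=10e-6), Block("lsq", r=2.0, c=10e-6)]
# Hypothetical constant per-cycle power (W): bpred is busy, lsq is mostly idle.
power = {"bpred": 5.0, "lsq": 1.0}
CYCLE = 1e-9  # assumed 1 GHz clock, so one step per cycle

for _ in range(200_000):          # 200 microseconds of simulated time
    for b in blocks:
        b.step(power[b.name], CYCLE)

print(hot_spots(blocks, threshold=8.0))
```

In a simulator the constant `power` dictionary would be replaced by per-cycle power estimates for each unit, which is where the activity-driven behavior comes from.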
8Fetch Toggling
- Fetch toggling
- disable fetch every N cycles
- 4/5, 2/3, 1/2, 1/3, 1/5,
[Figure: pipeline stages IF, ID, EX, MEM, WB]
10Fetch Toggling
- Fetch toggling
- disable fetch every N cycles
- 4/5, 2/3, 1/2, 1/3, 1/5,
- How to set the fetch rate?
[Figure: two pipelines, each with stages IF, ID, EX, MEM, WB]
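One way to picture fetch toggling in code is a per-cycle gate on the fetch stage. The sketch below assumes that a rate of k/n means fetch is enabled on k out of every n cycles, with the enabled cycles spread evenly; that reading of the fractions, and the function name `fetch_enabled`, are assumptions made for illustration.

```python
from fractions import Fraction


def fetch_enabled(cycle, duty):
    """Return True if fetch is allowed on this cycle.

    Assumes duty = k/n means fetch is enabled on k out of every n cycles,
    with the enabled cycles spread evenly across the window.
    """
    k, n = duty.numerator, duty.denominator
    # Enabled when the running count floor(cycle*k/n) advances on this cycle.
    return (cycle + 1) * k // n != cycle * k // n


for duty in (Fraction(4, 5), Fraction(2, 3), Fraction(1, 2),
             Fraction(1, 3), Fraction(1, 5)):
    # "F" marks a cycle in which fetch proceeds, "-" a cycle in which it is disabled.
    pattern = "".join("F" if fetch_enabled(c, duty) else "-" for c in range(15))
    print(f"{duty}: {pattern}")
```

Spreading the disabled cycles evenly, rather than bunching them, keeps the pipeline from alternating between long bursts and long stalls at the same average rate.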
11Feedback-Control of Fetch Toggling
- Formal feedback control
- PID: $m = K_C \left( e + K_I \int e \, dt + K_d \frac{de}{dt} \right)$
- easy to compute
- toggling = f(m)
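[Figure: feedback-control loop. The error e (setpoint minus measured T) drives the controller, whose output m sets the actuator (I-fetch toggling); the resulting power P drives the thermal dynamics, producing temperature T, which a temp. sensor reports as measured T.]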
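A discrete-time version of the PID law m = K_C (e + K_I ∫e dt + K_d de/dt) could be sketched as follows. The class name, the gain values, the sign convention e = setpoint − measured, and the clamping function `toggling_rate` standing in for f(m) are all assumptions for illustration; the sketch also omits refinements such as integrator anti-windup.

```python
class PIDController:
    """Discrete PID: m = Kc * (e + Ki * integral(e) + Kd * de/dt)."""

    def __init__(self, kc, ki, kd, setpoint):
        self.kc, self.ki, self.kd = kc, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measured, dt):
        """Consume one sensor sample and return the controller output m."""
        error = self.setpoint - measured
        self.integral += error * dt
        # No derivative term on the first sample, since there is no history yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kc * (error + self.ki * self.integral + self.kd * derivative)


def toggling_rate(m):
    """Hypothetical f(m): clamp the controller output to a fetch duty cycle in [0, 1]."""
    return max(0.0, min(1.0, m))


# Proportional-only gains, assumed for a simple demonstration.
pid = PIDController(kc=10.0, ki=0.0, kd=0.0, setpoint=107.9)
for measured in (107.0, 107.85, 108.0):
    duty = toggling_rate(pid.update(measured, dt=1e-6))
    print(f"T = {measured:.2f} C -> fetch duty cycle {duty:.2f}")
```

With this sign convention, a temperature well below the set point leaves fetch fully enabled, and the duty cycle falls toward zero as the measured temperature approaches and passes the set point.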
12Other Thermal-Management Techniques
- Fetch toggling
- Fetch throttling
- Decode throttling
- Speculation control
- Frequency/voltage scaling
13Per-Structure Response
- Hot spots
- Branch predictor (probed every cycle)
- Load-store queue
- L1 D-cache (for high-BW apps)
- most major structures are a hot spot for at least one SPEC2k app
- Modified Wattch
- Sampling rate: 1000 cycles (RC of hot spots is 10-100 μs)
- Base temp. of 100 °C (SIA roadmap)
- Emergency threshold of 108 °C (Yuan/Hong SEMI-THERM '01)
- Set point of 107.9 °C
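To see how the sampling interval relates to the thermal time constant, the arithmetic can be sketched as below. The clock frequency is an assumption (the slides do not state one), and the helper names are hypothetical; the 1000-cycle interval, the 10-100 μs range, and the two temperature values come from the slide.

```python
# Constants marked "assumed" are illustrative and not taken from the talk.
CLOCK_HZ = 1.0e9          # assumed clock frequency
SAMPLE_CYCLES = 1000      # sampling interval from the slide
EMERGENCY_C = 108.0       # emergency threshold from the slide
SETPOINT_C = 107.9        # controller set point from the slide


def samples_per_time_constant(rc_seconds, sample_cycles=SAMPLE_CYCLES, clock_hz=CLOCK_HZ):
    """Number of temperature samples taken during one thermal time constant."""
    return rc_seconds * clock_hz / sample_cycles


def is_emergency(temp_c):
    """True once the sampled temperature reaches the emergency threshold."""
    return temp_c >= EMERGENCY_C


for rc in (10e-6, 100e-6):
    print(f"RC = {rc * 1e6:.0f} us: {samples_per_time_constant(rc):.0f} samples")
print(f"control margin: {EMERGENCY_C - SETPOINT_C:.1f} degrees C")
```

Under the assumed clock, the controller sees many samples within each time constant, which is what makes a set point so close to the emergency threshold plausible.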
14Thermal Modeling: Where to go from here? (i.e., lots of research questions)
- Floor-planning issues and granularity of lumped R/C values
- Thermal coupling among blocks
- Response lag in temperature sensors
- Validation techniques
- Visualization
- How to deal with large time scales?
15Thermal Management: Where to go from here? (i.e., lots more research questions)
- New mechanisms
- Characterize benchmarks
- When to use frequency/voltage scaling
- Faster HW techniques for sensing temperature changes
- Robust response despite sensor lag
- Hot spots
- Temperature effects on leakage current
- Joint control of temp., power, and performance
16Thermal Management: Where to go from here? (i.e., lots more research questions)
- New mechanisms
- When to use clock scaling
- Robust response despite sensor lag
- Temperature effects on leakage current
- Joint control of temperature, power, and
performance
17Summary
- New tools for thermal management
- Models
- Mechanisms
Source: Tom's Hardware Guide, http://www6.tomshardware.com/cpu/01q3/010917/heatvideo-01.html
18Backup slides
19Performance Loss
Performance loss reduced by 65%