Improving Energy Efficiency of Configurable Caches via Temperature-Aware Configuration Selection


1
Improving Energy Efficiency of Configurable Caches via Temperature-Aware Configuration Selection
  • Hamid Noori, Maziar Goudarzi, Koji Inoue, and Kazuaki Murakami
  • Speaker: Tohru Ishihara
  • Institute of Systems Information Technologies/KYUSHU, Japan
  • Kyushu University, Japan

2
Outline
  • Background
  • Motivation
  • Problem Definition
  • Proposed Approach
    • Architecture
    • Reconfiguration Flow
  • Experimental Results
  • Conclusions

4
Background (1/2)
The dynamic energy per cache access
The leakage power of a cache memory
5
Background (2/2)
7
Motivational Example (1/3)
8
Motivational Example (2/3)
Total dynamic energy for executing a program
Total static energy for executing a program
9
Motivational Example (3/3)
Minimum-energy cache size
11
Problem Definition (1/3)
  • Objective function: total memory energy
    • Cache dynamic energy
    • Cache static energy
    • Off-chip memory access energy
    • Energy consumption during processor stall

[Figure: CPU with I-cache and D-cache connected to main memory]
12
Problem Definition (2/3)
  • energy_memory(C, Temp, Tech)
    = energy_dynamic(C, Tech) + energy_static(C, Temp, Tech)   (1)
  • energy_dynamic(C, Tech)
    = cache_accesses(C) × energy_cache_access(C, Tech)
    + cache_misses(C) × energy_miss(C, Tech)   (2)
  • energy_miss(C, Tech)
    = energy_off_chip_stall + energy_cache_block_refill(C, Tech)   (3)
  • energy_static(C, Temp, Tech)
    = executed_clock_cycles(C) × clock_period × leakage_power(C, Temp, Tech)   (4)
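
The energy model in Eqs. (1) to (4) can be sketched in a few lines of Python, assuming the terms combine as sums and products in the usual way. The `CacheStats` container, its field names, and every number in the usage line are hypothetical and only illustrate the arithmetic; the 5 ns clock period corresponds to the 200 MHz base processor used in the experiments.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Inputs for one cache configuration C (hypothetical container)."""
    cache_accesses: int
    cache_misses: int
    executed_clock_cycles: int
    energy_cache_access: float        # joules per cache access
    energy_off_chip_stall: float      # joules per miss (off-chip access + stall)
    energy_cache_block_refill: float  # joules per miss (block refill)
    leakage_power: float              # watts at the temperature of interest


def energy_miss(s: CacheStats) -> float:
    # Eq. (3): energy spent on one cache miss
    return s.energy_off_chip_stall + s.energy_cache_block_refill


def energy_dynamic(s: CacheStats) -> float:
    # Eq. (2): access energy plus miss energy
    return (s.cache_accesses * s.energy_cache_access
            + s.cache_misses * energy_miss(s))


def energy_static(s: CacheStats, clock_period: float) -> float:
    # Eq. (4): leakage power integrated over the execution time
    return s.executed_clock_cycles * clock_period * s.leakage_power


def energy_memory(s: CacheStats, clock_period: float) -> float:
    # Eq. (1): total memory energy
    return energy_dynamic(s) + energy_static(s, clock_period)


# Illustrative numbers only; they are not measurements from the talk.
example = CacheStats(1_000_000, 10_000, 2_000_000, 0.1e-9, 5e-9, 1e-9, 0.05)
print(energy_memory(example, clock_period=5e-9))
```

Only `leakage_power` depends on temperature here, which is why the static term is the one that shifts the minimum-energy configuration as the chip heats up.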

13
Problem Definition (3/3)
  • For a given application, processor architecture,
    technology, and set of valid configurations of the
    configurable cache, find a valid cache
    configuration that results in minimum energy
    consumption at a specific temperature over the
    entire execution of the given application.
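
One straightforward way to read this problem statement is as an exhaustive search over the valid configurations at a fixed temperature. The sketch below assumes that reading; the toy energy model and all of its numbers are hypothetical (arbitrary units) and are chosen only so that leakage grows with cache size and temperature.

```python
def select_configuration(valid_configs, energy_of, temperature_c):
    """Return the valid configuration with the lowest energy at one temperature."""
    return min(valid_configs, key=lambda cfg: energy_of(cfg, temperature_c))


# Hypothetical toy model, arbitrary energy units.
# Larger caches miss less often, so their dynamic energy is lower ...
TOY_DYNAMIC = {16: 10.0, 8: 12.0, 4: 15.0}      # keyed by cache size in KB
# ... but they leak more, and leakage per KB rises with temperature.
TOY_LEAK_PER_KB = {0: 0.1, 100: 1.0}            # keyed by temperature in °C


def toy_energy(size_kb, temperature_c):
    return TOY_DYNAMIC[size_kb] + size_kb * TOY_LEAK_PER_KB[temperature_c]


sizes = [16, 8, 4]
print(select_configuration(sizes, toy_energy, 0))
print(select_configuration(sizes, toy_energy, 100))
```

In this toy model the largest cache wins when the chip is cold and the smallest wins when it is hot, which is the effect the temperature-aware selection is meant to exploit.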

15
Architecture
  • TACC
    • BCC (proposed by Zhang et al. [1])
      • Cache size (way shutdown)
      • Number of ways (way concatenation)
      • Line size
    • Thermal sensor
    • Accessible port for reading the thermal sensor

[1] C. Zhang, F. Vahid, and W. Najjar, "A Highly Configurable Cache Architecture for Embedded Systems," ACM Trans. on Embedded Computing Systems, vol. 4, no. 2, May 2005.
16
Reconfiguration Flow
18
Experimental Setup (1/2)
  • MiBench
  • SimpleScalar
    • Cache hit: one clock cycle
    • Cache miss: 100 clock cycles
    • Clock frequency of the base processor: 200 MHz
  • CACTI 4.2
    • Target technology: 70nm (Vdd = 0.9 V)
  • BCC (16KB)
    • 16KB (4-, 2-, 1-way)
    • 8KB (2- and 1-way)
    • 4KB (1-way)
  • The line size for each of the configurations can
    be 8, 16, or 32 bytes.
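
The configuration space listed above is small enough to enumerate directly. The sketch below builds it from the size and way combinations and the three line sizes; the function name and the tuple layout (size in KB, line size in bytes, ways) are illustrative choices, not part of the original tooling.

```python
from itertools import product

# Associativities available at each cache size (KB), as listed on the slide.
SIZE_TO_WAYS = {16: (4, 2, 1), 8: (2, 1), 4: (1,)}
# Line sizes (bytes) available for every size/way combination.
LINE_SIZES = (8, 16, 32)


def valid_configurations():
    """Return all (size_kb, line_bytes, ways) tuples the cache supports."""
    return [
        (size_kb, line_bytes, ways)
        for size_kb, way_options in SIZE_TO_WAYS.items()
        for ways, line_bytes in product(way_options, LINE_SIZES)
    ]


configs = valid_configurations()
print(len(configs))
```

Six size and way combinations times three line sizes give 18 configurations per cache, so evaluating all of them for each temperature point is cheap.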

19
Experimental Setup (2/2)
  • Base Configurable Cache (BCC)
    • It has the same architecture as that proposed by Zhang et al. [1]
    • It supports a limited set of configurations
    • It is configured once per application for the
      corner case (i.e., leakage at 100°C)
  • Temperature-Aware Configurable Cache (TACC)
    • TACC is configured for each execution of an
      application, considering the chip temperature at
      that time

[1] C. Zhang, F. Vahid, and W. Najjar, "A Highly Configurable Cache Architecture for Embedded Systems," ACM Trans. on Embedded Computing Systems, vol. 4, no. 2, May 2005.
20
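Energy and Performance Evaluation
  • Energy Saving (%)
  • Performance Enhancement (%)

[Formulas shown on the slide are not captured in this transcript; only the × 100 factors survive.]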
21
Data and Instruction Cache
Each entry lists cache size, line size (bytes), and number of ways.

D-cache | qsort      | djpeg      | lame       | dijkstra   | patricia   | sha        | adpcm      | crc        | fft
0°C     | 16K, 32, 2 | 16K, 32, 2 | 16K, 32, 4 | 16K, 32, 2 | 16K, 32, 2 | 16K, 32, 2 | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 4
20°C    | 8K, 32, 2  | 16K, 32, 2 | 16K, 32, 4 | 16K, 32, 2 | 16K, 32, 2 | 8K, 32, 1  | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 4
40°C    | 8K, 32, 2  | 16K, 32, 2 | 16K, 32, 4 | 8K, 32, 2  | 16K, 32, 2 | 4K, 32, 1  | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 4
60°C    | 8K, 32, 2  | 16K, 32, 2 | 16K, 32, 2 | 8K, 32, 2  | 8K, 32, 2  | 4K, 32, 1  | 4K, 16, 1  | 8K, 32, 2  | 8K, 32, 2
80°C    | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 2 | 8K, 32, 2  | 8K, 32, 2  | 4K, 32, 1  | 4K, 16, 1  | 4K, 32, 1  | 8K, 32, 2
100°C   | 4K, 32, 1  | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 4K, 32, 1  | 4K, 32, 1  | 4K, 32, 1  | 8K, 32, 2

I-cache | basicmath  | qsort      | djpeg      | lame       | dijkstra   | blowfish   | rijndael   | gsm        | fft
0°C     | 16K, 8, 4  | 16K, 8, 4  | 16K, 32, 1 | 16K, 32, 2 | 16K, 32, 1 | 16K, 16, 2 | 16K, 32, 1 | 16K, 16, 4 | 8K, 32, 1
20°C    | 16K, 16, 4 | 16K, 16, 4 | 16K, 32, 1 | 16K, 32, 2 | 16K, 32, 1 | 16K, 16, 2 | 16K, 32, 1 | 16K, 32, 2 | 8K, 32, 1
40°C    | 16K, 16, 4 | 16K, 16, 4 | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 2 | 16K, 32, 1 | 16K, 32, 2 | 8K, 32, 1
60°C    | 16K, 16, 4 | 16K, 16, 4 | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 2 | 16K, 32, 1 | 8K, 32, 2  | 8K, 32, 1
80°C    | 16K, 32, 4 | 16K, 32, 4 | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 1 | 4K, 32, 1  | 8K, 32, 1
100°C   | 16K, 32, 4 | 16K, 32, 4 | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 8K, 32, 2  | 16K, 32, 2 | 4K, 32, 1  | 8K, 32, 1
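
A table like this can drive configuration selection at run time: read the thermal sensor before an execution, then look up the matching entry. The sketch below uses the qsort data-cache column from the table above. Choosing the nearest tabulated temperature for in-between readings is an assumption made for illustration, and `config_for_temperature` is a hypothetical helper name.

```python
# (size in KB, line size in bytes, ways) for the qsort data cache,
# copied from the D-cache table; keys are temperatures in °C.
QSORT_DCACHE = {
    0: (16, 32, 2),
    20: (8, 32, 2),
    40: (8, 32, 2),
    60: (8, 32, 2),
    80: (8, 32, 2),
    100: (4, 32, 1),
}


def config_for_temperature(table, temperature_c):
    """Return the entry whose tabulated temperature is closest to the reading.

    Nearest-point lookup is an assumption; the slides do not say how
    readings between the tabulated temperatures are handled.
    """
    nearest = min(table, key=lambda t: abs(t - temperature_c))
    return table[nearest]


# A sensor reading of 25 °C falls closest to the 20 °C row.
print(config_for_temperature(QSORT_DCACHE, 25))
```

A more conservative policy would round the reading up to the next tabulated temperature, so that leakage is never underestimated.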
22
Energy Saving
23
Performance Enhancement
25
Conclusions
  1. Temperature-aware configurable caches matter for finer technologies: up to 61% (17% on average) of the instruction-cache energy consumption can be saved in 70nm technology.
  2. The data cache is more easily affected by temperature than the instruction cache. Using a configurable data cache, up to 77% (36% on average) of the energy can be saved in 70nm technology.
  3. The TACC improves performance by up to 28% (5% on average) for the instruction cache and by up to 17% (8.1% on average) for the data cache.

26
  • Thank you for your attention
  • Please send any questions to noori_at_c.csce.kyushu-u.ac.jp

27
Backup slides
30
                        | ARM7TDMI | ARM966E-S
130nm power consumption | 7.98 mW  | 62.5 mW
130nm frequency         | 133 MHz  | 250 MHz
90nm power consumption  | 7.08 mW  | 51.7 mW
90nm frequency          | 236 MHz  | 470 MHz