But... - PowerPoint PPT Presentation

About This Presentation
Title:

But...

Description:

Use timestamps as replacement metric. Uppsala ... Elbow cache, 7-step feedback, 5-bit timestamp. Uppsala Architecture Research Team ... Timestamps are useful. ... – PowerPoint PPT presentation

Slides: 28
Provided by: mathias86
Category:
Tags: timestamps


Transcript and Presenter's Notes



1
Mommy, mommy! I want a hardware cache with few
conflicts and low power consumption that is easy
to implement!
But... That's three wishes in one!!!
2
Refinement and Evaluation of the Elbow Cache
or
The Little Cache that Could
  • Mathias Spjuth

3
2-way Set Associative Cache
[Figure: animation of blocks A through H being placed in a 2-way set associative cache. Memory references: A-B-C-D-E-F-G-H]
4
Conflicts (cont.)
  • Traditional way of reducing conflicts is to use
    set associative caches.
  • Lower miss rate (than direct-mapped)
  • -- Slower access
  • -- More complexity (uses more chip-area)
  • -- Higher power consumption
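The effect of associativity on conflicts can be illustrated with a toy model. This is only a sketch: modulo indexing, LRU replacement, and the class and function names are assumptions made for illustration, not details from the slides. Both caches below hold four blocks in total.

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Toy set associative cache with LRU replacement (illustrative only)."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One OrderedDict per set; its order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block_addr):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        s = self.sets[block_addr % self.num_sets]
        if block_addr in s:
            s.move_to_end(block_addr)  # mark as most recently used
            return True
        if len(s) >= self.ways:
            s.popitem(last=False)      # evict the least recently used block
        s[block_addr] = True
        return False

def count_misses(cache, trace):
    return sum(not cache.access(addr) for addr in trace)

# Blocks 0 and 4 collide in a direct-mapped cache with 4 sets,
# but can coexist in one set of a 2-way cache with 2 sets.
trace = [0, 4, 0, 4, 0, 4]
direct_mapped_misses = count_misses(SetAssociativeCache(num_sets=4, ways=1), trace)
two_way_misses = count_misses(SetAssociativeCache(num_sets=2, ways=2), trace)
```

In this trace the direct-mapped cache misses on every reference, while the 2-way cache misses only on the first touch of each block.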

5
2-way Skewed Associative Cache
[Figure: animation of blocks A through H being placed across Cache Bank 1 and Cache Bank 2 of a 2-way skewed associative cache. Memory references: A-B-C-D-E-F-G-H]
6
2-way Skewed Associative Cache
[Figure: blocks A through H all resident across Cache Bank 1 and Cache Bank 2 of the 2-way skewed associative cache. No conflicts! Memory references: A-B-C-D-E-F-G-H]
7
Skewed associative caches
  • Use a different hashing (skewing) function for
    indexing each cache bank
  • Lower miss rate (than set associative)
  • More predictable
  • -- Slightly slower (hashing)
  • -- Cannot use LRU replacement
  • -- Cannot use VI-PT (virtually indexed,
    physically tagged) lookup
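A small sketch of per-bank indexing follows. The XOR-based function for the second bank is a hypothetical stand-in chosen for illustration; the slides do not say which skewing functions were used.

```python
INDEX_BITS = 3
NUM_SETS = 1 << INDEX_BITS
MASK = NUM_SETS - 1

def index_bank0(block_addr):
    """Bank 0: conventional indexing with the low address bits."""
    return block_addr & MASK

def index_bank1(block_addr):
    """Bank 1: low bits XORed with the next-higher bits (assumed hash)."""
    low = block_addr & MASK
    high = (block_addr >> INDEX_BITS) & MASK
    return low ^ high

# Four blocks that all collide in bank 0 are spread out in bank 1.
blocks = [0, 8, 16, 24]
bank0_slots = [index_bank0(b) for b in blocks]
bank1_slots = [index_bank1(b) for b in blocks]
```

Because blocks that conflict in one bank rarely conflict in the other, a block usually has at least one placement that avoids the collision.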

8
Elbow Cache
  • Improve the performance of a skewed associative
    cache by reallocating blocks within the cache.
  • By doing so we get a broader choice of which
    block to choose as the victim.
  • Use timestamps as replacement metric.

9
Finding the victim
  • Two methods
  • Look-ahead: Consider all possible placements
    before the first reallocation is made.
  • Feedback: Only consider the immediate placements,
    then iterate.

10
2-way Elbow Lookahead Cache
[Figure: blocks A through H in Cache Bank 1 and Cache Bank 2 of a 2-way elbow look-ahead cache, with new block X arriving. Replacement paths: F-B-A and E-D-H. Memory references: A-B-C-D-E-F-G-H-X]
11
2-way Elbow Feedback Cache
[Figure: blocks A through H in Cache Bank 1 and Cache Bank 2 of a 2-way elbow feedback cache, with new block X arriving and a temporary register holding the displaced block. Memory references: A-B-C-D-E-F-G-H-X]
12
Finding the victim (cont.)
  • Look-ahead
  • Closest to optimal
  • -- Difficult to implement (>1 transformation)
  • Feedback
  • Easy to implement (feed victim back to write
    buffer)
  • -- Needs extra space in the write buffer

13
Replacement Metrics
  • Enhanced-Not-Recently-Used (NRUE)
  • The best policy for skewed caches known so far.
  • Each block contains two extra bits, a
    recently-used and very-recently-used bit, that
    are set on access to the block.
  • These bits are regularly cleared. The
    very-recently-used bit is cleared more often.
  • First, try to find a victim with no bit set.
  • Then one with only the recently-used bit set.
  • Then use random replacement.
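The NRUE victim search described above might be sketched as follows. The clearing periods, the tie-breaking rule (first match wins), and all names are assumptions for illustration; the slides only say that the bits are cleared regularly and that the very-recently-used bit is cleared more often.

```python
import random
from dataclasses import dataclass

@dataclass
class Block:
    tag: str
    ru: bool = False   # recently-used bit
    vru: bool = False  # very-recently-used bit

    def touch(self):
        """Both bits are set on every access to the block."""
        self.ru = True
        self.vru = True

def clear_bits(blocks, tick, vru_period=4, ru_period=16):
    """Periodic clearing; both periods are made-up illustration values."""
    for blk in blocks:
        if tick % vru_period == 0:
            blk.vru = False
        if tick % ru_period == 0:
            blk.ru = False

def choose_victim(candidates, rng=random):
    # First choice: a block with neither bit set.
    for blk in candidates:
        if not blk.ru and not blk.vru:
            return blk
    # Second choice: a block with only the recently-used bit set.
    for blk in candidates:
        if blk.ru and not blk.vru:
            return blk
    # Otherwise fall back to random replacement.
    return rng.choice(candidates)

a, b = Block("A"), Block("B")
a.touch()
b.touch()
clear_bits([b], tick=4)         # clears only B's very-recently-used bit
victim = choose_victim([a, b])  # B qualifies in the second pass
```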

14
Timestamps
Increase counter on every cache allocation
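[Figure: a counter supplies the current time Tcurr (counter values shown: 10100 100000, 10100 100001, 10100 100010). Each cache block (A, B) stores a timestamp (TA, TB) alongside its data.]

$$
\mathrm{Dist}(A) =
\begin{cases}
T_{curr} - T_A & \text{if } T_{curr} > T_A \\
T_{max} + T_{curr} - T_A & \text{if } T_{curr} < T_A
\end{cases}
$$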
15
Timestamps
[Figure: timestamp ticks from 0 to Tmax]
Dist(A) > Dist(B): A is older than B
Dist(A) < Dist(B): B is older than A
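The wrap-around distance and the age comparison can be written down in a few lines. This sketch assumes that Tmax is the number of distinct values of a 5-bit timestamp (32) and that equal timestamps give a distance of 0; the slides do not state either point explicitly, and the function names are illustrative.

```python
TIMESTAMP_BITS = 5
T_MAX = 1 << TIMESTAMP_BITS  # assumed: 32 ticks before the timestamp wraps

def dist(t_curr, t_block):
    """Age of a block in ticks, allowing for one wrap-around of the counter."""
    if t_curr >= t_block:
        return t_curr - t_block
    return T_MAX + t_curr - t_block

def oldest(t_curr, stamps):
    """Return the name of the block with the largest distance."""
    return max(stamps, key=lambda name: dist(t_curr, stamps[name]))

no_wrap = dist(20, 17)   # block stamped 3 ticks ago
wrapped = dist(2, 30)    # counter wrapped past T_MAX since the block was stamped
victim = oldest(2, {"A": 30, "B": 1})
```

Here block A (stamped at 30, distance 4 after the wrap) is older than block B (stamped at 1, distance 1), so A is the replacement candidate.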
16
Implementation
  • Lookahead
  • At most one transformation (4 possible victims)
    per replacement.
  • Do the transformation and load the new data at
    the same time.
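One way to picture the 1-step look-ahead is sketched below. It is an interpretation, not the hardware design from the talk: the two index functions, the tuple layout `(address, timestamp)`, and the helper names are all assumptions. The new block has two direct victims, and each of those could instead be relocated to its slot in the other bank, which gives up to four candidates.

```python
INDEX_BITS = 2
MASK = (1 << INDEX_BITS) - 1
T_MAX = 32  # assumed wrap-around of a 5-bit timestamp

def index(bank, addr):
    """Assumed skewing: bank 0 uses the low bits, bank 1 XORs in higher bits."""
    low = addr & MASK
    return low if bank == 0 else low ^ ((addr >> INDEX_BITS) & MASK)

def dist(t_curr, t):
    return (t_curr - t) % T_MAX

def lookahead_candidates(banks, new_addr):
    """Each option is (victim_bank, victim_slot, relocated_from or None)."""
    options = []
    for b in (0, 1):
        slot = index(b, new_addr)
        options.append((b, slot, None))          # evict the resident directly
        resident = banks[b][slot]
        if resident is not None:                 # or relocate it and evict
            other = 1 - b                        # the block at its other slot
            options.append((other, index(other, resident[0]), (b, slot)))
    return options

def insert_lookahead(banks, new_addr, t_curr):
    def age(option):
        block = banks[option[0]][option[1]]
        return float("inf") if block is None else dist(t_curr, block[1])
    vb, vs, moved = max(lookahead_candidates(banks, new_addr), key=age)
    evicted = banks[vb][vs]
    if moved is not None:
        mb, ms = moved
        banks[vb][vs] = banks[mb][ms]            # the single transformation
        banks[mb][ms] = (new_addr, t_curr)
    else:
        banks[vb][vs] = (new_addr, t_curr)
    return None if evicted is None else evicted[0]

# Entries are (address, timestamp); address 1 is the oldest block.
banks = [[(4, 10), (9, 12), None, None],
         [(5, 11), (1, 3), None, None]]
options = lookahead_candidates(banks, new_addr=0)
evicted = insert_lookahead(banks, new_addr=0, t_curr=13)
```

In this example the oldest block is not a direct victim of the new block, so block 4 is moved to its alternative slot and the new block takes its place.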

17
Implementation
  • Feedback
  • Up to 7 transformations (max. 8 possible victims)
    per replacement.
  • Temporary victims are moved to the write buffer,
    before reallocation.
  • Extra control field in write buffer.
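The feedback scheme can be sketched as a loop in which the displaced block plays the role of the write-buffer entry. This is one plausible reading of the slides, not the actual design: the index functions, the rule for stopping (the displaced block is evicted once the block at its alternative slot is no older), and the way steps are counted are all assumptions.

```python
INDEX_BITS = 2
MASK = (1 << INDEX_BITS) - 1
T_MAX = 32  # assumed wrap-around of a 5-bit timestamp

def index(bank, addr):
    """Assumed skewing: bank 0 uses the low bits, bank 1 XORs in higher bits."""
    low = addr & MASK
    return low if bank == 0 else low ^ ((addr >> INDEX_BITS) & MASK)

def dist(t_curr, t):
    return (t_curr - t) % T_MAX

def age(banks, bank, slot, t_curr):
    block = banks[bank][slot]
    return float("inf") if block is None else dist(t_curr, block[1])

def insert_feedback(banks, new_addr, t_curr, max_steps=7):
    """Return the address finally evicted, or None if a free slot was found."""
    incoming = (new_addr, t_curr)
    # Immediate placements only: take the slot whose resident is oldest.
    bank, slot = max(((b, index(b, new_addr)) for b in (0, 1)),
                     key=lambda bs: age(banks, bs[0], bs[1], t_curr))
    for _ in range(max_steps):
        temp = banks[bank][slot]       # displaced block (the temp register)
        banks[bank][slot] = incoming
        if temp is None:
            return None
        other = 1 - bank
        other_slot = index(other, temp[0])
        # Stop when the block at the alternative slot is no older than temp.
        if age(banks, other, other_slot, t_curr) <= dist(t_curr, temp[1]):
            return temp[0]
        incoming, bank, slot = temp, other, other_slot
    return incoming[0]                 # step budget exhausted

# Entries are (address, timestamp); address 1 is the oldest block.
banks = [[(4, 10), (9, 12), None, None],
         [(5, 11), (1, 3), None, None]]
evicted = insert_feedback(banks, new_addr=0, t_curr=13)
```

Each iteration only compares the displaced block with the one block at its alternative location, which is what keeps the control logic simple compared with look-ahead.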

18
Feedback
[Figure: feedback datapath. A write buffer connects Bank I and Bank II with memory (readmem, writemem) through read and write ports; each write buffer entry carries a step field together with data, tag, and timestamp (TmSt).]
19
Test Configurations
  • Set associative 2-way, 4-way, 8-way, 16-way
  • Fully associative cache
  • Skewed associative, LRU
  • Skewed associative, NRUE
  • Skewed associative, 5-bit timestamp
  • Elbow cache, 1-step lookahead, 5-bit timestamp
  • Elbow cache, 7-step feedback, 5-bit timestamp

20
Test Configurations (2)
  • General configuration
  • 8 KB, 16 KB, 32 KB cache size
  • L1 data cache with 32 byte block size
  • Write-back, no-allocate-on-write, infinite
    write buffer (all writes ignored)
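For orientation, the block and set counts implied by these sizes can be computed directly. The sketch assumes 1 KB = 1024 bytes and power-of-two geometries; the function name is illustrative.

```python
def geometry(cache_bytes, block_bytes, ways):
    """Return (total blocks, sets per way, index bits) for a cache."""
    blocks = cache_bytes // block_bytes
    sets = blocks // ways
    index_bits = sets.bit_length() - 1  # exact for power-of-two set counts
    return blocks, sets, index_bits

small_two_way = geometry(8 * 1024, 32, ways=2)
large_sixteen_way = geometry(32 * 1024, 32, ways=16)
```

An 8 KB cache with 32-byte blocks holds 256 blocks, so a 2-way organization has 128 sets per way and needs 7 index bits.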

Miss Rate Reduction (MRR): $\mathrm{MRR} = (\mathrm{MR}_{ref} - \mathrm{MR}) / \mathrm{MR}_{ref}$
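As a quick worked example of the metric, the sketch below uses made-up miss rates (10% for the reference cache, 8% for the cache under test); they are not results from the talk.

```python
def miss_rate_reduction(mr_ref, mr):
    """MRR = (MR_ref - MR) / MR_ref, relative to a reference configuration."""
    if mr_ref <= 0:
        raise ValueError("reference miss rate must be positive")
    return (mr_ref - mr) / mr_ref

# Hypothetical numbers for illustration only.
mrr = miss_rate_reduction(mr_ref=0.10, mr=0.08)
```

With these inputs the reduction is about 0.2, meaning roughly 20% fewer misses than the reference. A negative value would mean the cache under test misses more often than the reference.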
21

22

23
Conclusions
  1. For a 2-way skewed cache, timestamp replacement
    gives almost the same performance as LRU.
  2. Timestamps are useful.
  3. A 2-way elbow cache has roughly the same
    performance as an 8-way set associative cache of
    the same size.

24
Conclusions (2)
  1. The lookahead design is slightly better than the
    feedback.
  2. There are drawbacks with all skewed caches
    (skewing delays, VI-PT).
  3. If the problems can be solved, the elbow cache is
    a good alternative to set associative caches.

25
Future Work
  • Power awareness
  • How does an elbow cache stand up against
    traditional set associative caches when power
    consumption is considered?

26
Links
  • UART web
  • www.it.uu.se/research/group/uart/

27
?