Title: COE 308: Computer Architecture (T041), Dr. Marwan Abu-Amara
1. COE 308 Computer Architecture (T041), Dr. Marwan Abu-Amara
- Chapter 3 Cache Memory Systems (cont.)
2. Write Operations
- Data must be identical between cache & MM ⇒ when writing to cache, we must maintain the same copy in MM
- Two mechanisms achieve this goal
  - Write-through
  - Write-back (copy back)
3. Write-Through
- Every write operation to cache is also done to MM at the same time (note: time to write to MM is longer than time to write to cache)
- Average access time of write-through with transfers from MM to cache on all misses (read & write)
- Let $w$ = fraction of write references ⇒ $(1 - w)$ = fraction of read references
- Let $t_b$ = time to transfer a block to cache
4. Write-Through (cont.)
- ⇒ $t_a = t_c + (1 - h)\,t_b + w\,(t_m - t_c)$
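One way to see where this expression comes from is to weight the read and write cases separately. The sketch below assumes the usual readings of the symbols ($t_c$ = cache access time, $t_m$ = MM access time, $h$ = hit ratio) and assumes that a write costs $t_m$ because the MM write overlaps the shorter cache write; these readings are assumptions, since the slide itself does not restate them.

```latex
\begin{align*}
t_a &= (1-w)\,\bigl[\,t_c + (1-h)\,t_b\,\bigr] + w\,\bigl[\,t_m + (1-h)\,t_b\,\bigr] \\
    &= (1-w)\,t_c + w\,t_m + (1-h)\,t_b \\
    &= t_c + (1-h)\,t_b + w\,(t_m - t_c)
\end{align*}
```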
- The equation above assumes fetch to cache on write (write allocate)
- The equation for $t_a$ with no fetch on write (non-write allocate) is as follows
5. Write-Through (cont.)
- ⇒ $t_a = t_c + (1 - w)(1 - h)\,t_b + w\,(t_m - t_c)$
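The two write-through formulas differ only in the fraction of references that trigger a block transfer. A minimal Python sketch makes the comparison concrete; the function name and the numeric timings are hypothetical and chosen only for illustration.

```python
def write_through_access_time(tc, tm, tb, h, w, write_allocate=True):
    """Average access time of a write-through cache (sketch of the slide formulas).

    tc: cache access time, tm: MM access time, tb: block transfer time,
    h: hit ratio, w: fraction of references that are writes.
    """
    if write_allocate:
        # Blocks are fetched on every miss, read or write.
        transfer_fraction = 1 - h
    else:
        # Blocks are fetched on read misses only.
        transfer_fraction = (1 - w) * (1 - h)
    return tc + transfer_fraction * tb + w * (tm - tc)


# Hypothetical timings in ns; not taken from the slides.
tc, tm, tb = 10.0, 100.0, 200.0
h, w = 0.9, 0.2

with_allocate = write_through_access_time(tc, tm, tb, h, w, write_allocate=True)
without_allocate = write_through_access_time(tc, tm, tb, h, w, write_allocate=False)
print(f"write allocate:     {with_allocate:.1f} ns")
print(f"non-write allocate: {without_allocate:.1f} ns")
```

With these assumed values the write-allocate case comes out at about 48 ns and the non-write-allocate case at about 44 ns, because write misses no longer pay for a block transfer.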
6. Write-Back (Copy Back)
- Write operation to MM is only done at block replacement time
- Average access time (block is written back to MM irrespective of whether the block has been altered: simple write-back)
- $t_a = t_c + (1 - h)\,t_b + (1 - h)\,t_b = t_c + 2\,(1 - h)\,t_b$
7. Write-Back (Copy Back) (cont.)
- Average access time (block is written back to MM only when altered; let $w_b$ = probability that a block has been altered)
- $t_a = t_c + (1 - h)\,t_b + w_b\,(1 - h)\,t_b = t_c + (1 - h)(1 + w_b)\,t_b$
- Best case: $w_b \to 0 \Rightarrow t_a = t_c + (1 - h)\,t_b$
- Worst case: $w_b \to 1 \Rightarrow t_a$ for simple write-back
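The best and worst cases can be checked numerically. The following Python sketch evaluates the write-back formula for a few values of the altered-block probability; the function name and the timings are hypothetical.

```python
def write_back_access_time(tc, tb, h, wb=1.0):
    """Average access time of a write-back cache (sketch of the slide formula).

    wb is the probability that a replaced block has been altered;
    wb = 1.0 corresponds to simple write-back (every replaced block is written).
    """
    return tc + (1 - h) * (1 + wb) * tb


# Hypothetical timings in ns; not taken from the slides.
tc, tb, h = 10.0, 200.0, 0.9

for wb in (0.0, 0.3, 1.0):
    print(f"wb = {wb:.1f}: ta = {write_back_access_time(tc, tb, h, wb):.1f} ns")
```

Setting wb to 1 gives the same value as tc + 2 (1 - h) tb, the simple write-back expression.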
8. Replacement Policy
- How to displace a block in cache if cache is full?
- Not needed for direct mapping (each MM block can go in only one cache block)
- Replacement algorithm must be implemented in H/W
- 3 types
  - Random
  - FIFO
  - LRU
- Most common is LRU, which is implemented in H/W using
  - Aging counters
  - Register stack
  - Reference matrix
  - Approximate method
9. Aging Counters
- Use a counter with each cache block
- Increment counters at hit or miss as follows
  - Hit
    - Counter of the hit block is reset to 0 (MRU)
    - Counters that originally had a smaller value than the hit block's counter are incremented by 1
    - Counters that originally had a larger value than the hit block's counter are not changed
  - Miss
    - Cache is not full
      - Counter of the incoming block is reset to 0
      - All other counters are incremented by 1
10. Aging Counters (cont.)
- Miss (cont.)
  - Cache is full
    - Block with counter at maximum value is chosen to be replaced, then its counter is reset to 0
    - All other counters are incremented by 1
- Counter with largest value identifies the LRU block
11. Aging Counters Example
4-way set-associative cache ⇒ need a 2-bit counter for each block

Address tag  Hit/Miss  C0  C1  C2  C3  Subsequent actions
(initial)              0   0   0   0
4            Miss      0   1   1   1   Block 0 filled
5            Miss      1   0   2   2   Block 1 filled
4            Hit       0   1   2   2   Block 0 accessed
6            Miss      1   2   0   3   Block 2 filled
7            Miss      2   3   1   0   Block 3 filled
5            Hit       3   0   2   1   Block 1 accessed
8            Miss      0   1   3   2   Block 0 replaced
9            Miss      1   2   0   3   Block 2 replaced
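The counter update rules can be replayed in software to check the example. The Python sketch below is an illustration, not the hardware design: the function name is hypothetical, and it assumes that empty blocks are filled in order of block number, which is what the example's fill order suggests.

```python
def simulate_aging_counters(tags, ways=4):
    """Replay a sequence of address tags through one set using aging counters."""
    blocks = [None] * ways      # tag held by each block (None = empty)
    counters = [0] * ways       # one aging counter per block
    max_count = ways - 1        # 2-bit counters for a 4-way set
    trace = []
    for tag in tags:
        if tag in blocks:
            hit = blocks.index(tag)
            old = counters[hit]
            # Only counters that were smaller than the hit block's counter age.
            for b in range(ways):
                if counters[b] < old:
                    counters[b] += 1
            counters[hit] = 0
            trace.append((tag, "Hit", tuple(counters), f"Block {hit} accessed"))
        else:
            if None in blocks:
                # Assumption: fill the lowest-numbered empty block.
                victim, action = blocks.index(None), "filled"
            else:
                # Cache full: the block with the largest counter is the LRU block.
                victim, action = counters.index(max(counters)), "replaced"
            for b in range(ways):
                if b != victim:
                    counters[b] = min(counters[b] + 1, max_count)
            counters[victim] = 0
            blocks[victim] = tag
            trace.append((tag, "Miss", tuple(counters), f"Block {victim} {action}"))
    return trace


for tag, outcome, counts, action in simulate_aging_counters([4, 5, 4, 6, 7, 5, 8, 9]):
    print(tag, outcome, *counts, action)
```

For the tag sequence 4, 5, 4, 6, 7, 5, 8, 9 the printed rows match the counter values and actions in the example table.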
12. Register Stack
- A set of n-bit registers is formed (1 for each block in the set)
- MRU is recorded at top of stack, and LRU is recorded at bottom
- Value held in 1 register is passed to the next register when a new block is accessed
- If a block hit occurs, then move its value to the top & move down (towards the bottom) all values that used to be above the value with the hit
- When cache is full & a miss occurs, replace the block with the value at the bottom
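The stack behaviour described above can be sketched with a Python list whose first element is the top (MRU) and whose last element is the bottom (LRU). The function name is hypothetical, the stack entries are block numbers, and empty blocks are assumed to be filled in order of block number.

```python
def simulate_register_stack(tags, ways=4):
    """Replay address tags through one set using a register stack for LRU."""
    blocks = [None] * ways   # tag held by each block (None = empty)
    stack = []               # block numbers: index 0 = top (MRU), last = bottom (LRU)
    events = []
    for tag in tags:
        if tag in blocks:
            # Hit: pull the block's entry out; entries that were above it shift down.
            b = blocks.index(tag)
            stack.remove(b)
            events.append("hit")
        elif len(stack) < ways:
            # Miss, cache not full (assumption: next empty block in order).
            b = len(stack)
            blocks[b] = tag
            events.append("fill")
        else:
            # Miss, cache full: the bottom of the stack names the LRU block.
            b = stack.pop()
            blocks[b] = tag
            events.append("replace")
        stack.insert(0, b)   # the accessed block becomes the MRU entry
    return blocks, stack, events


blocks, stack, events = simulate_register_stack([4, 5, 4, 6, 7, 5, 8, 9])
print(blocks)   # tags held by blocks 0..3 at the end
print(stack)    # block numbers from MRU to LRU
```

Replaying the tag sequence 4, 5, 4, 6, 7, 5, 8, 9 replaces block 0 and then block 2, the same victims the aging counters select for that sequence.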
13. Reference Matrix
- Matrix of status bits
- Use the upper triangular matrix of a B × B matrix formed without a diagonal, where B is the number of blocks to consider
- When the ith block is referenced
  - All bits in the ith row are set to 1
  - All bits in the ith column are set to 0
- LRU block is the one which has all 0s in its row & all 1s in its column
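The row and column updates can be sketched in a few lines of Python. For readability this illustration stores the full B × B matrix rather than only the upper triangle that hardware would keep, so the diagonal bit is simply cleared by the column step and the LRU block is found by looking for an all-zero row. The class and method names are hypothetical.

```python
class ReferenceMatrix:
    """Software sketch of the reference-matrix LRU scheme (full B x B matrix)."""

    def __init__(self, num_blocks):
        self.n = num_blocks
        self.bits = [[0] * num_blocks for _ in range(num_blocks)]

    def reference(self, i):
        # Set every bit in row i to 1 ...
        for col in range(self.n):
            self.bits[i][col] = 1
        # ... then clear every bit in column i (this also clears the diagonal bit).
        for row in range(self.n):
            self.bits[row][i] = 0

    def lru_block(self):
        # The LRU block is the one whose row contains only 0s
        # (the first such row if several blocks were never referenced).
        for row in range(self.n):
            if not any(self.bits[row]):
                return row
        return None


matrix = ReferenceMatrix(4)
for block in (0, 1, 2, 3):
    matrix.reference(block)
print(matrix.lru_block())   # block 0 was referenced longest ago

matrix.reference(0)
print(matrix.lru_block())   # now block 1 is the least recently used
```

After referencing blocks 0, 1, 2, 3 in that order, row 0 is all zeros; referencing block 0 again moves the all-zero row to block 1.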
14. Reference Matrix Example
4-way set-associative cache ⇒ need a 4 × 4 matrix
[Figure: sequence of 4 × 4 reference matrices of status bits, one per block reference (slides 14 to 16)]
17. Cache Example
- Consider the execution of the following program on a 4 × 6 array, A
    For (i = 0; i <= 3; i++)
        Sum = 0
        For (j = 0; j <= 5; j++)
            Sum = Sum + A(i, j)
        EndFor
        Average = Sum / 10
        For (k = 5; k >= 0; k--)
            A(i, k) = A(i, k) / Average
        EndFor
    EndFor
18. Cache Example (cont.)
- A is stored in MM in column-major order. Assume that there are 8 blocks in the cache, each one word long, and that the LRU replacement policy is used when needed. Show the cache contents for the cases (i, j) = (0, 0 → 5) and (i, k) = (0, 5 → 0) for
  - Direct mapping
  - Fully associative mapping
- Compute also the number of cache hits made, the number of block replacements, and the cache utilization in each case
19. Cache Example (cont.)
- Solution
- A is stored in MM as shown to the right
- For direct mapping
  - Hit ratio = 2 / 12 ≈ 0.17
  - Block replacements = 8 (or 8 / 12 ≈ 67%)
  - Utilization = 2 / 8 = 25%
20. Cache Example (cont.)
- Solution
- A is stored in MM as shown to the right
- For fully associative mapping
  - Hit ratio = 6 / 12 = 0.5
  - Block replacements = 0 (or 0 / 12 = 0%)
  - Utilization = 6 / 8 = 75%
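The hit, replacement, and utilization figures for both mappings can be reproduced with a short simulation. This Python sketch makes two assumptions that the slides do not state: the array starts at word address 0, and each update of A(i, k) counts as a single reference, which matches the 12 references used in the solution. All function names are hypothetical.

```python
from collections import OrderedDict

ROWS, COLS, CACHE_BLOCKS = 4, 6, 8


def address(i, j, base=0):
    """Word address of A(i, j) in column-major order (base address is assumed)."""
    return base + j * ROWS + i


def reference_stream(i):
    """Addresses touched for row i: j = 0..5 in the sum loop, then k = 5..0."""
    reads = [address(i, j) for j in range(COLS)]
    updates = [address(i, k) for k in range(COLS - 1, -1, -1)]
    return reads + updates


def simulate_direct(addresses):
    cache = [None] * CACHE_BLOCKS
    hits = replacements = 0
    for a in addresses:
        slot = a % CACHE_BLOCKS          # one-word blocks: block number = address
        if cache[slot] == a:
            hits += 1
        else:
            if cache[slot] is not None:
                replacements += 1        # an occupied block is displaced
            cache[slot] = a
    used = sum(entry is not None for entry in cache)
    return hits, replacements, used


def simulate_fully_associative(addresses):
    cache = OrderedDict()                # keys ordered from LRU to MRU
    hits = replacements = 0
    for a in addresses:
        if a in cache:
            hits += 1
            cache.move_to_end(a)         # mark as most recently used
        else:
            if len(cache) == CACHE_BLOCKS:
                cache.popitem(last=False)  # evict the LRU entry
                replacements += 1
            cache[a] = True
    return hits, replacements, len(cache)


stream = reference_stream(0)
for name, simulate in (("direct", simulate_direct),
                       ("fully associative", simulate_fully_associative)):
    hits, replacements, used = simulate(stream)
    print(f"{name}: hits {hits}/{len(stream)}, replacements {replacements}, "
          f"blocks used {used}/{CACHE_BLOCKS}")
```

Under these assumptions row 0 occupies addresses 0, 4, 8, 12, 16, 20, which collide in only two direct-mapped blocks (0 and 4) but fit in six separate blocks of the fully associative cache.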