1 CACHE MEMORY BOOK by JIM HANDY
PowerPoint presentation
student: Zarko Acimovic, zareac_at_galeb.etf.bg.ac.yu
professor: dr Veljko Milutinovic
assistant: Gvozden Marinkovic
Belgrade, 20.12.2002
2 TYPICAL PROCESSOR SYSTEM, WITH CPU, DMA DEVICE, AND BUFFERS
[Figure: block diagram. Labeled blocks: CPU, address buffers, data buffers, system bus, main memory (DRAM array, DRAM refresh logic), DMA controller, DMA device; address and data paths connect them.]
2 / 90
3 Buffers
- Buffers are inserted between the CPU and DRAM
- Buffers boost the small drive current available from the CPU
- They isolate the CPU from memory during the DRAM refresh cycle
- Their propagation delay is fixed and has not improved with advances in technology
3 / 90
4 Memories
- A typical SRAM has a write data set-up time of 5 ns and a hold time of 0 ns
- Read access time is fixed and has not improved with advances in technology
- Most of the CPU clock period is consumed by these static times
4 / 90
5 The Concept of Locality
- Nearly all code is extremely repetitive
- We can use that fact to increase computer throughput
- LOCALITY IN SPACE: most code executes out of a small, repetitive area
- LOCALITY IN TIME: a CPU is much more likely to access a memory location that it accessed 10 cycles ago than one it accessed 10000 cycles ago
5 / 90
6 Cache memory
- The repetitive portion of code should be executed out of fast memory
- Splitting the memory space
- Virtual memory: disk vs. operative memory
- Cache memory: slow vs. fast memory
6 / 90
7 CAM (Content Addressable Memory)
[Figure: CAM block diagram. The compare address feeds comparator registers 0 to 3; their match lines (Match0 to Match3) drive an encoder that produces the N-bit binary output and the MatchDetect output.]
In a content addressable memory (CAM), an address presented to the compare bus is compared simultaneously with the contents of every memory location. If any location matches, the address of that location is sent to the N-bit binary output (this translates the CPU address into a cache address).
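The lookup described above can be sketched in software. This is only a behavioral model under assumed names (`ContentAddressableMemory`, `store`, `lookup` are hypothetical); real hardware performs all comparisons at once, while the loop here just reproduces the result.

```python
class ContentAddressableMemory:
    """Toy CAM: every stored address is compared with the lookup address."""

    def __init__(self, size):
        # One comparator register per location; None marks an empty location.
        self.registers = [None] * size

    def store(self, location, address):
        self.registers[location] = address

    def lookup(self, compare_address):
        """Return (match_detect, location).

        Hardware compares all registers simultaneously; this loop only
        models the outcome of that comparison.
        """
        for location, stored in enumerate(self.registers):
            if stored is not None and stored == compare_address:
                return True, location
        return False, None


cam = ContentAddressableMemory(size=4)
cam.store(2, 0x09AF45ED)
print(cam.lookup(0x09AF45ED))  # (True, 2)
print(cam.lookup(0x00000000))  # (False, None)
```

The returned location plays the role of the cache address that the encoder would produce.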
7 / 90
8 [Figure, two panels. CACHE READ HIT: the CPU address is presented to the CAM as the compare address, the CAM drives the cache address into the cache data memory, and read data returns to the CPU; the cache controller and the system bus (main memory address) are also shown. UPDATE CYCLE: the same blocks, with the cache controller supplying the replacement address and replacement data arriving from the system bus.]
8 / 90
9 Cache Data and Cache-Tag Memories
- A fully associative cache is slow and expensive
- Sequential instructions are placed randomly
- All addresses must be checked for a hit
- First we translate the CPU address, then we go for the data
- The set-associative approach exploits spatial locality
- The CPU address goes in parallel to the TAG and DATA memories
- Tag bits determine whether we have a hit or not
9 / 90
10 Cache Tag RAM
[Figure: cache-tag RAM block diagram. An 8K x 8 bit high-speed resettable static RAM array with address pins (set bits), data pins (tag bits), Reset, Write Enable, Chip Select and Output Enable inputs, and a comparator between DATA OUT and DATA IN that drives the Match output.]
10 / 90
11 [Figure, two panels. FULLY ASSOCIATIVE: the CPU address goes to the CAM as the compare address, and the CAM supplies the cache address to the cache data memory. SET ASSOCIATIVE: the CPU address is split into a tag address and a set address for the cache-tag directory and the cache data memory. Both panels also show the cache controller, the replacement address, the data path and the system bus.]
11 / 90
12 The Cache's Interface to the CPU
- Main memory is too slow for the CPU
- Fast CPU vs. slow buffers
- CPU wait state generation
- Split transactions: the processor issues a command, then removes itself from the bus
- Ready signal: an input signal that lets the CPU know that data is ready
12 / 90
13 Wait State Generation
[Figure: wait state generator. Labeled blocks: CPU with READY input, address buffers, data buffers, address decoder, shift register clocked by the system clock with a carry output, and the system bus. The ready inputs are selected by the addresses of slower and faster devices.]
13 / 90
14 Look Aside vs. Look Through
- Look Aside: the main memory access and the cache memory access are initiated at the same time
- This is useful when a cache miss occurs
- Suited to single-processor, single-tasking systems
- Look Through: the cache intervenes in all CPU to main memory transactions
- Look Through first looks for data in the cache
- This dramatically reduces main memory bus traffic
14 / 90
15 Read Hit
[Figure: CPU, address buffers, data buffers, cache-tag, cache data memory, cache controller, system bus and main memory. Bus buffers DISABLED.]
1. Tag matches address
2. Read data from cache data memory to CPU
3. Disable system buffers
15 / 90
16 Read Miss
[Figure: CPU, address buffers, data buffers, cache-tag, cache data memory, cache controller, system bus and main memory. Bus buffers ENABLED.]
1. Tag does not match address
2. Read data from main memory to CPU via bus buffers
3. Write new data into cache data memory
4. Write new address into cache-tag
16 / 90
17 Write Hit
[Figure: CPU, address buffers, data buffers, cache-tag, cache data memory, cache controller, system bus and main memory. Bus buffers ENABLED.]
1. Tag matches write address
2. Write new data into data memory
3. Write through buffers into main memory
17 / 90
18 Write Miss Strategies
- The cache ignores write misses; data goes directly to main memory
- Another technique is to invalidate the cache
- A third technique is to write into the cache whether it is a hit or a miss
- The last course of action is used in multiprocessors
18 / 90
19 Write Miss
[Figure: CPU, address buffers, data buffers, cache-tag, cache data memory, cache controller, system bus and main memory. Bus buffers ENABLED.]
1. Tag does not match write address
2. Write through buffers into main memory
3. Cache data memory undisturbed
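The four cases (read hit, read miss, write hit, write miss) can be put together in a small model. This is a minimal sketch, assuming a direct-mapped, write-through cache with one word per line that ignores write misses; the class and method names are hypothetical and main memory is modeled as a plain dictionary.

```python
class WriteThroughCache:
    """Direct-mapped, write-through cache that ignores write misses."""

    def __init__(self, num_sets, memory):
        self.num_sets = num_sets
        self.tags = [None] * num_sets   # cache-tag memory
        self.data = [None] * num_sets   # cache data memory
        self.memory = memory            # dict: word address -> value

    def _split(self, address):
        # Assumed address layout: high part is the tag, low part the set.
        return address // self.num_sets, address % self.num_sets

    def read(self, address):
        tag, index = self._split(address)
        if self.tags[index] == tag:
            return self.data[index], "read hit"   # bus buffers disabled
        value = self.memory.get(address, 0)       # bus buffers enabled
        self.tags[index], self.data[index] = tag, value
        return value, "read miss"

    def write(self, address, value):
        tag, index = self._split(address)
        self.memory[address] = value              # always write through
        if self.tags[index] == tag:
            self.data[index] = value
            return "write hit"
        return "write miss"                       # cache left undisturbed


memory = {5: 111}
cache = WriteThroughCache(num_sets=4, memory=memory)
print(cache.read(5))        # (111, 'read miss')
print(cache.read(5))        # (111, 'read hit')
print(cache.write(5, 222))  # write hit
print(cache.write(9, 333))  # write miss (address 9 shares set 1 with address 5)
print(cache.read(5))        # (222, 'read hit')
```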
19 / 90
20 Logical vs. physical caches
- Upstream cache (on the CPU side) vs. downstream cache (on the MMU side)
- Virtual or logical cache vs. physical (downstream) cache
- Slower RAMs can be used in a logical design
- This is an advantage if the CPU and MMU are on different chips
- When the CPU and MMU reside on the same chip, a logical cache cannot be designed
- Problems arise in logical caches with address aliasing, or synonyms
- The best design is a physical design
20 / 90
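21 Logical Cache
[Figure: the CPU drives the virtual address bus; the logical cache is addressed (index bits) from the virtual address bus, ahead of the memory management unit (MMU). The MMU drives the physical address bus, which reaches the system bus through the address buffers. Data buffers connect the data path to the system bus.]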
21 / 90
22 Physical Cache
[Figure: the CPU drives the virtual address bus into the memory management unit (MMU); the physical cache is addressed (index bits) from the physical address bus, after the MMU. Address buffers and data buffers connect to the system bus.]
22 / 90
23 Associativity
- A direct-mapped cache is a one-way set-associative cache
- A DM cache is very simple and fast, but has the problem of thrashing
- Create a copy of the direct-mapped cache, with one more TAG and DATA memory
- Two different data items then do not contend for the same location
- N-way set-associative cache: N DATA and TAG memories with N comparators
- Increasing the size of the cache reduces the miss rate
- Component count is a problem with increasing associativity
- Doubling the size of a direct-mapped cache is the best solution for a better hit rate
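Thrashing is easy to reproduce in a small simulation. The sketch below assumes LRU replacement and one word per line, neither of which is specified in the slides; `count_hits` is a hypothetical helper. Both caches hold 8 lines in total.

```python
from collections import OrderedDict


def count_hits(addresses, num_sets, ways):
    """Count hits for an N-way set-associative cache with LRU replacement."""
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for address in addresses:
        index, tag = address % num_sets, address // num_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)         # mark as most recently used
        else:
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used tag
            lines[tag] = True
    return hits


# Two addresses that map to the same set, accessed alternately 10 times each.
trace = [0, 8] * 10
print(count_hits(trace, num_sets=8, ways=1))  # 0  (direct mapped: thrashing)
print(count_hits(trace, num_sets=4, ways=2))  # 18 (two-way: only 2 cold misses)
```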
23 / 90
24 Critical Timing Paths
- A direct-mapped cache has two critical timing paths (next slide)
- In a set-associative cache, the data RAMs can be enabled only after a cache hit has been detected
- A set-associative cache therefore has an extremely critical timing path
- It flows through the cache-tag RAM, comparators, controller and cache DATA (next slide)
24 / 90
25 [Figure, two panels. Direct-mapped cache: CPU, control logic (READY), cache tag RAM and cache data RAM on the address and data buses, with CRITICAL TIMING PATH 1 and CRITICAL TIMING PATH 2 marked. Set-associative cache: CPU, control logic (READY), cache tag RAM, and cache data RAMs 1 and 2 gated by OUTPUT ENABLE 1 and OUTPUT ENABLE 2, with the EXTREMELY CRITICAL TIMING PATH marked.]
25 / 90
26 Unified vs. Split Caches
- One cache for data, one for instructions
- Thrashing is the same as in a two-way set-associative cache
- Data and instruction caches can be implemented independently
- A separate cache for data with temporal locality, and a separate one for data with spatial locality
26 / 90
27 Write Through vs. Copy Back
- The Write Through strategy has three scenarios, but main memory is always updated
- First: on a Write Hit update the cache, on a Write Miss ignore the cache
- Second: another implementation invalidates the cache on a Write Miss
- Third: the cache is always updated, whether a Write Miss or a Write Hit occurs
- In the Copy Back mechanism, data is not always written to main memory
- It is implemented in multiprocessors and reduces bus traffic
- The mechanism for evicting data to main memory is very complex
- The simplest implementation of Copy Back uses a Dirty Bit
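A dirty bit can be modeled in a few lines. This is a minimal sketch, assuming a direct-mapped cache with one word per line and write allocation on a miss; `CopyBackCache` and its counter `memory_writes` are hypothetical names used only to make the bus traffic visible.

```python
class CopyBackCache:
    """Direct-mapped copy-back cache with one dirty bit per line."""

    def __init__(self, num_sets, memory):
        self.num_sets = num_sets
        self.memory = memory              # dict: address -> value
        self.lines = [None] * num_sets    # each line: [tag, value, dirty]
        self.memory_writes = 0            # counts copy-backs to main memory

    def _fill(self, address):
        """Make sure the line for `address` is present, evicting if needed."""
        tag, index = address // self.num_sets, address % self.num_sets
        line = self.lines[index]
        if line is None or line[0] != tag:
            if line is not None and line[2]:      # dirty victim: copy back
                victim_address = line[0] * self.num_sets + index
                self.memory[victim_address] = line[1]
                self.memory_writes += 1
            line = [tag, self.memory.get(address, 0), False]
            self.lines[index] = line
        return line

    def read(self, address):
        return self._fill(address)[1]

    def write(self, address, value):
        line = self._fill(address)                # write allocation on a miss
        line[1], line[2] = value, True            # main memory is not updated


memory = {}
cache = CopyBackCache(num_sets=4, memory=memory)
for value in range(100):
    cache.write(1, value)               # 100 writes to the same line
print(cache.memory_writes)              # 0
cache.read(5)                           # same set, different tag: eviction
print(cache.memory_writes, memory[1])   # 1 99
```

One hundred CPU writes cost a single main memory write, which is the bus traffic saving the slide refers to.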
27 / 90
28 Size of Block
- The bigger the block, the smaller the tag memory
- If the block is bigger, more words can be transferred during one bus cycle
- The size of the block affects write strategies
- Write allocation: on a Write Miss, get the block from memory, then write to it
- Sectoring: blocks are divided into sectors or sub-blocks
- The smallest writeable unit (usually a word) has its own Valid bit
- The Valid bit is set only for the word being written into the cache
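Sectoring can be illustrated with a single cache line that shares one tag but keeps a Valid bit per word. The sketch is only an assumption about how such a line might behave; `SectoredLine` and the four-word block size are hypothetical.

```python
class SectoredLine:
    """One cache block with a shared tag and a Valid bit per word."""

    def __init__(self, words_per_block=4):
        self.tag = None
        self.words = [None] * words_per_block
        self.valid = [False] * words_per_block

    def write_word(self, tag, word_index, value):
        if self.tag != tag:
            # New block: replace the tag and clear every Valid bit.
            self.tag = tag
            self.valid = [False] * len(self.valid)
        self.words[word_index] = value
        self.valid[word_index] = True    # only the written word becomes valid

    def read_word(self, tag, word_index):
        """Return (value, hit); a hit needs a tag match and a set Valid bit."""
        hit = self.tag == tag and self.valid[word_index]
        return (self.words[word_index] if hit else None), hit


line = SectoredLine()
line.write_word(tag=7, word_index=2, value=0xCAFE)
print(line.read_word(7, 2))  # (51966, True)
print(line.read_word(7, 0))  # (None, False): same block, word never written
```

A write miss can therefore be served without first fetching the whole block from memory.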
28 / 90
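29 Multiword block replacements
[Figure: the 30-bit CPU address is split into 28 MSBs, which address the cache tag and a 128-bit-wide cache data memory, and 2 LSBs, which drive the SELECT input of a multiplexer that passes one 32-bit word of the 128-bit line to the CPU. Address buffers, data buffers, the cache controller, the system bus and main memory (28-bit address, 128-bit data) complete the diagram.]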
29 / 90
30 Write Buffers and Block Buffers
- The CPU writes data to the cache and to the data buffer
- The CPU continues to work with the cache while the buffer writes the data out to main memory
- Main memory could be inconsistent
- Multilevel write buffer: a small, fully associative victim cache
- Byte gathering: when several bytes are written to the same address
- Buffers in the copy back mechanism: eviction of data is hidden from the CPU
30 / 90
31 Byte Gathering

CYCLE 1: Write least significant byte to address 09AF 45ED
Level  Address    Data
3      N/A        N/A
2      N/A        N/A
1      N/A        N/A
0      09AF 45ED  N/A N/A N/A Valid

CYCLE 2: Write word to address 0000 0000
Level  Address    Data
3      N/A        N/A
2      N/A        N/A
1      0000 0000  Valid Word
0      09AF 45ED  N/A N/A N/A Valid

CYCLE 3: Write second least significant byte to address 09AF...
Level  Address    Data
3      N/A        N/A
2      N/A        N/A
1      0000 0000  Valid Word
0      09AF 45ED  N/A N/A Valid Valid

CYCLE 4: Gain control over bus. Write buffer location 0 to memory.
Level  Address    Data
3      N/A        N/A
2      N/A        N/A
1      N/A        N/A
0      0000 0000  Valid Word
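The four cycles can be replayed with a small model of a gathering write buffer. This is a sketch under assumptions: the class name, the four-level depth, the byte values, and the reading of cycle 3 as a write to the same word as cycle 1 are all illustrative.

```python
class GatheringWriteBuffer:
    """FIFO write buffer that merges byte writes to the same word address."""

    def __init__(self, depth=4):
        self.depth = depth
        # Oldest entry first; each entry is [address, {byte_lane: value}].
        self.entries = []

    def write(self, address, lanes):
        """`lanes` maps a byte lane (0 = least significant) to a byte value."""
        for entry in self.entries:
            if entry[0] == address:
                entry[1].update(lanes)        # byte gathering
                return
        if len(self.entries) == self.depth:
            raise RuntimeError("write buffer full: the CPU would stall")
        self.entries.append([address, dict(lanes)])

    def drain_one(self, memory):
        """Bus granted: write the oldest entry to main memory."""
        address, lanes = self.entries.pop(0)
        memory.setdefault(address, {}).update(lanes)


write_buffer, memory = GatheringWriteBuffer(), {}
write_buffer.write(0x09AF45ED, {0: 0x11})                    # cycle 1
write_buffer.write(0x00000000, {0: 1, 1: 2, 2: 3, 3: 4})     # cycle 2
write_buffer.write(0x09AF45ED, {1: 0x22})                    # cycle 3
print(len(write_buffer.entries))                             # 2
write_buffer.drain_one(memory)                               # cycle 4
print(memory[0x09AF45ED])                                    # {0: 17, 1: 34}
```

The two byte writes occupy one buffer level and reach main memory in a single bus transaction.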
31 / 90
32 Concurrent line write-back with line buffer
[Figure: CPU, address buffers, data buffers, cache-tag memory, cache data memory, address register, block buffer, system bus and main memory; the path of the missed word is marked.]
Miss detected: output the missed address to main memory and to the address register.
32 / 90
33 Concurrent line write-back with block buffer
[Figure: same blocks as the previous slide; the missed word and the new data flow from main memory into the block buffer.]
Load main memory data into the block buffer and the CPU. The CPU can continue to execute.
33 / 90
34 Concurrent line write-back with block buffer
[Figure: same blocks as the previous slide; tag bits, set bits and dirty data flow towards main memory.]
Output the dirty block's tag bits, the address register's set bits, and the data RAM's dirty data to main memory.
34 / 90
35 Concurrent line write-back with block buffer
[Figure: same blocks as the previous slide; new data flows from the block buffer into the cache.]
Write the block buffer data into the tag and data RAMs.
35 / 90
36 Concurrent line write-back with WRITE buffer
[Figure: CPU, latching address buffers, data buffers, cache-tag memory, cache data memory, write buffer (address and data), system bus and main memory; the missed data path is marked.]
Miss detected: output the missed address to main memory.
36 / 90
37 Concurrent line write-back with WRITE buffer
[Figure: same blocks as the previous slide; the latching address buffers hold the address of the missed line while the dirty data's set bits, tag bits and data flow into the write buffer.]
Write into the write buffer:
- the CPU's set bits
- the cache-tag RAM's tag bits
- the dirty line from the cache data RAM
37 / 90
38 Concurrent line write-back with WRITE buffer
[Figure: same blocks as the previous slide; the latching address buffers hold the address of the missed line and new data arrives from main memory.]
Update the cache-tag RAM with the missed address. Update the cache data RAM with the main memory response. Feed data to the CPU.
38 / 90
39 Concurrent line write-back with WRITE buffer
[Figure: same blocks as the previous slide; the CPU issues a completely different address while the write buffer drives the system bus.]
The CPU continues to operate out of the cache. Bus buffers are turned off. The write buffer updates main memory with the stored dirty line.
39 / 90
40 Miss Rate Reduction Techniques
- Increase block size
- Increase associativity (8-way set associative is about as good as fully associative; a direct-mapped cache of size N has about the same miss rate as a 2-way set-associative cache of size N/2)
- Victim caches
- Pseudo-associative cache: one more location is checked in case of a miss, usually the one with the MSB inverted
- Hardware prefetching of instructions
- Software prefetching: data is stored either in registers or in memory
- Code optimization: loop 1 is interchanged into loop 2
1. for (j = 0; j < ...; j = j + 1) for (i = 0; i < ...; i = i + 1) X[i][j] = 2 * X[i][j];
2. for (i = 0; i < ...; i = i + 1) for (j = 0; j < ...; j = j + 1) X[i][j] = 2 * X[i][j];
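The effect of exchanging the two loops can be estimated with a tiny cache model. This is a sketch under assumptions that are not in the slide: X is stored row by row, one word per element, and the cache is direct mapped with 8 blocks of 4 words. `count_misses` is a hypothetical helper.

```python
def count_misses(order, rows, cols, block_words=4, num_blocks=8):
    """Misses of a direct-mapped cache while touching every X[i][j] once.

    X is assumed to be stored row by row, one word per element.
    """
    tags = [None] * num_blocks
    misses = 0
    if order == "j outer":
        indices = ((i, j) for j in range(cols) for i in range(rows))
    else:  # "i outer"
        indices = ((i, j) for i in range(rows) for j in range(cols))
    for i, j in indices:
        block = (i * cols + j) // block_words   # memory block of X[i][j]
        slot = block % num_blocks               # cache location it maps to
        if tags[slot] != block:
            tags[slot] = block
            misses += 1
    return misses


print(count_misses("j outer", rows=64, cols=64))  # 4096: every access misses
print(count_misses("i outer", rows=64, cols=64))  # 1024: one miss per block
```

With `i` in the outer loop, consecutive accesses fall in the same block, so spatial locality is exploited.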
40 / 90
41 Cache Miss Penalty Reduction Techniques
- Write buffers reduce the miss penalty, but bring problems with data inconsistency
- Increasing the size of blocks reduces tag size; the miss penalty is reduced by sectoring of blocks
- Early restart or critical word first: the CPU needs just one word rather than the whole block
- In a pipeline, hits under miss: when a cache miss occurs, the CPU is not stalled, but continues to read data
- Secondary cache
41 / 90
42 Hit Time Reduction Techniques
- Hit time is critical: it directly affects the CPU clock period
- Smaller caches have a smaller hit time
- Logical caches have a smaller hit time, but problems with aliasing
- Pipelining writes for fast write hits
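The last three slides act on three quantities: miss rate, miss penalty and hit time. A common way to combine them, which the slides do not state explicitly, is average access time = hit time + miss rate x miss penalty. The numbers below are hypothetical and only show the trade-off.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Times are in CPU clock cycles; miss_rate is a fraction from 0 to 1."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical caches: a small fast one and a larger, slower one.
small_fast = average_access_time(hit_time=1, miss_rate=0.10, miss_penalty=20)
large_slow = average_access_time(hit_time=2, miss_rate=0.04, miss_penalty=20)
print(round(small_fast, 2), round(large_slow, 2))  # 3.0 2.8
```

In this example the larger cache wins on average access time even though its hit time, and therefore possibly the CPU clock period, is worse.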
42 / 90
43 Maintaining Coherency in Cached Systems
- Cache coherency: data in the cache and in main memory are kept under tight control so that stale and current data are not confused
- The protocol determines the interaction between the cache, CPU, main memory and other bus masters
- The cache strategy determines the interaction between the CPU and the cache
- DMA activity
- Mechanism of snooping the bus
43 / 90
44 Snoop Line Invalidation Using Multiplexed Cache Tag RAM
[Figure: CPU, address buffers, data buffers, a multiplexer in front of the cache directory, cache data memory, system bus, main memory, DMA controller and DMA device, with their address and data paths.]
44 / 90
45 [Figure: DMA controller and DMA device on the system bus; address buffers, isolation buffers and data buffers sit between the system bus and the CPU, cache directory and cache data memory; main memory is on the system bus.]
45 / 90
46 Dual Cache-Tag
- Instead of a multiplexer, one more Cache Tag RAM is added: the Snoopy Tag
- Faster response, asynchronous mode, greater flexibility for the same component count
- One cache tag snoops the System Bus, the other one the CPU bus
- Both cache tags have the same contents
- The Snoopy Tag snoops the System Bus when some other master owns the bus
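A dual-tag arrangement can be modeled with two directories that are always written together. The sketch below assumes a direct-mapped cache with one word per line and treats any snooped write by another master as an invalidation; `DualTagCache`, `cpu_read` and `snoop_write` are hypothetical names.

```python
class DualTagCache:
    """Direct-mapped cache with a CPU-side tag RAM and a snoop tag RAM."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.cache_tags = [None] * num_sets   # checked by CPU accesses
        self.snoop_tags = [None] * num_sets   # checked by system-bus snoops
        self.data = [None] * num_sets

    def _split(self, address):
        return address // self.num_sets, address % self.num_sets

    def cpu_read(self, address, memory):
        tag, index = self._split(address)
        if self.cache_tags[index] == tag:
            return self.data[index]
        # Miss: fetch from memory and write the tag into both directories.
        self.data[index] = memory[address]
        self.cache_tags[index] = self.snoop_tags[index] = tag
        return self.data[index]

    def snoop_write(self, address):
        """Another bus master (for example a DMA device) writes `address`."""
        tag, index = self._split(address)
        if self.snoop_tags[index] == tag:     # snoop hit
            self.cache_tags[index] = self.snoop_tags[index] = None
            return True
        return False


memory = {6: "old"}
cache = DualTagCache(num_sets=4)
cache.cpu_read(6, memory)
memory[6] = "new"                  # DMA device writes main memory
print(cache.snoop_write(6))        # True
print(cache.cpu_read(6, memory))   # new
```

Because the snoop lookup uses its own tag RAM, it does not compete with the CPU for the cache directory.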
46 / 90
47 Write the new address into both the cache directory and the snoop directory while accessing main memory
[Figure: CPU, address buffers, data buffers, cache directory, snoop directory, cache data memory, system bus, main memory, DMA controller and DMA device.]
47 / 90
48 When the CPU is not using the bus, DMA/snooping and CPU/cache operations can occur simultaneously
[Figure: CPU, address buffers, data buffers, cache directory, snoop directory, cache data memory, system bus, main memory, DMA controller and DMA device; the snoop directory output is marked "Snoop Hit ??".]
48 / 90
49 Snooping address in a logical cache
[Figure: CPU, memory management unit, address buffers, data buffers, cache directory, cache data memory, comparator, system bus, main memory, DMA controller and DMA device. The address is split into a page number address and a page offset address, which provide the tag bits and set bits; the comparator output is marked "SNOOP HIT ???".]
49 / 90
50 Multiprocessors
- Tightly coupled vs. loosely coupled multiprocessors
- Coherency domain: noncacheable space in memory for communication
- The compiler generates and stores bits that tell which memory locations are cacheable
- Naïve protocols, which do not depend on compilers
- The real problem is how to handle Write Cycles
50 / 90
51 [Figure, two panels. LOOSELY COUPLED SYSTEM: CPU1, CPU2 and CPU3, each with its own main memory (Main Memory 1 to 3) and its own communication device. TIGHTLY COUPLED SYSTEM: CPU1 to CPUn, each with its own cache (Cache 1 to Cache n), sharing one main memory.]
51 / 90
52 Write Through Caches
- Coherency could be preserved if all Write Cycles are placed on the main memory bus
- All caches snoop the bus
- The bus is taken by the 10% of cycles that are Write Cycles plus 5% of the 90% that are Read Cycles (the Read Misses)
- Every processor therefore takes 14.5% of the bus, so we might have 6 processors on the bus
- The problem is wait state generation
- Processors might contend for the bus
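The 14.5 figure and the limit of 6 processors follow from simple arithmetic, assuming the figures are percentages: 10% of cycles are writes, 90% are reads, and 5% of the reads miss. The variable names below are hypothetical.

```python
write_fraction = 0.10   # share of CPU cycles that are writes (all go to the bus)
read_fraction = 0.90    # share of CPU cycles that are reads
read_miss_rate = 0.05   # share of reads that miss and need the bus

# Bus share used by one processor with a write-through cache.
bus_share = write_fraction + read_miss_rate * read_fraction
max_processors = int(1 / bus_share)   # whole processors that fit on one bus

print(f"{bus_share:.3f}", max_processors)  # 0.145 6
```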
52 / 90
53 Copy Back Caches
- Copy Back caches allow main memory and cache memory to be inconsistent
- Shared data must be the same in all caches
- At certain moments Copy Back updates main memory
- A very special CPU instruction, Purge, updates main memory
- Special tables keep track of the location and coherency of cached copies
53 / 90
54 Write Once Protocol
- The first write updates main memory
- Other CPUs snoop the bus, so they invalidate their copies
- The second write does not update main memory
- If the first CPU wrote only once and the second CPU wants to read, it updates its copy from main memory
- If the first CPU wrote several times and the second wants to read, the first CPU must somehow intervene
- The first CPU must send fresh data to the second CPU
- Indirect intervention vs. direct intervention
54 / 90
55 Redefinition of States
- We might introduce new states, but...
- We encode several new states with two bits:
1. Invalid
2. Valid-and-never-written-to-by-any-CPU
3. Valid-and-written-to-once-by-this-CPU
4. Valid-and-written-to-more-than-once-by-this-CPU
- The fourth state is sometimes called Private
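The four states fit in two bits, and the write transitions of the write-once protocol can be sketched as a small state machine. The bit assignment and the function names are assumptions for illustration; the sketch covers only local writes to a valid line and snooped writes from other CPUs.

```python
from enum import IntEnum


class LineState(IntEnum):
    # The two-bit codes are an assumed assignment, not taken from the slides.
    INVALID = 0b00
    VALID_NEVER_WRITTEN = 0b01
    WRITTEN_ONCE = 0b10
    WRITTEN_MORE_THAN_ONCE = 0b11   # sometimes called Private


def on_local_write(state):
    """Return (next_state, writes_main_memory) for a write by this CPU."""
    if state == LineState.VALID_NEVER_WRITTEN:
        return LineState.WRITTEN_ONCE, True          # first write goes to memory
    if state in (LineState.WRITTEN_ONCE, LineState.WRITTEN_MORE_THAN_ONCE):
        return LineState.WRITTEN_MORE_THAN_ONCE, False
    raise ValueError("write miss handling is outside this sketch")


def on_snooped_write(state):
    """Another CPU's write appeared on the bus: invalidate our copy."""
    return LineState.INVALID


state = LineState.VALID_NEVER_WRITTEN
state, bus_write = on_local_write(state)
print(state.name, bus_write)   # WRITTEN_ONCE True
state, bus_write = on_local_write(state)
print(state.name, bus_write)   # WRITTEN_MORE_THAN_ONCE False
```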
55 / 90
56 Indirect Intervention
- A Snoop Read Hit occurs in the first cache
- The first cache stalls the second one, and then updates main memory
- The second cache updates itself from main memory
- This is indirect intervention, because the second cache updates itself from main memory
- The bus is very busy in indirect intervention
- The direct protocol disables main memory and updates the second cache directly from the first cache (reflection)
56 / 90
57 Reflection Protocol
- Main memory might have a snoop mechanism
- A Snoop Read Hit occurs in the first cache
- The first cache disables main memory and sends fresh data to the second cache
- Memory snoops the bus and grabs a copy to update itself
- The state of the data in the first cache changes from Valid-and-written-to-more-than-once-by-this-CPU to Valid-and-never-written-to-by-any-CPU, because all caches have the same copy
57 / 90
58 Ownership protocols
- Copy Back technique
- We define four states:
1. Invalid
2. Valid-and-never-written-to-by-any-CPU
3. Valid-and-written-to-once-by-this-CPU
4. Valid-and-written-to-more-than-once-by-this-CPU
- The first processor that modifies the data is the owner of that data
- Only the owner updates main memory when its block is replaced
- Sometimes it is not necessary to invalidate copies of owned data in other caches
- Two more states are introduced in place of the fourth:
4a. Valid-and-written-to-more-than-once-by-this-CPU-unsnooped
4b. Valid-and-written-to-more-than-once-by-this-CPU-snooped
- When the data has already been invalidated in other caches, or when only the owner has a copy of the data, the data is in state 4a
- If some other cache has a copy, the data in the owner is in state 4b
58 / 90
5959 / 90
60 Second CPU is disabled
Second CPU writes. This copy is no longer valid.
60 / 90
6161 / 90
6262 / 90
63 MOTOROLA 68020 DIRECT-MAPPED LOGICAL CACHE
63 / 90
64 68020
[Figure: block diagram. Components: 68020 CPU; 68541 Memory Management Unit (MMU); IDT74FCT646A system address buffers; IDT7174 8K x 8 integrated cache-tag RAM; IDT7164 8K x 8 fast SRAM; IDT74FCT646A system data transceivers; control logic (CPU control, cache control, misc. control); system bus. Buses: local tag address bus A15-31 (17 bits), local set address bus A2-14 (13 bits), local data bus D0-31 (32 bits); system tag address bits and system set address bits.]
Block diagram of a simple 32K-byte, direct-mapped, logical write-through cache for the 68020 microprocessor
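The bus widths in the block diagram determine how an address is split: A15-31 form the tag, A2-14 select the set, and the remaining A0-1 are assumed here to select a byte within the 32-bit word. The helper name `split_68020_address` is hypothetical.

```python
def split_68020_address(address):
    """Split a 32-bit address into (tag, set_index, byte_offset)."""
    byte_offset = address & 0x3            # A0-A1 (assumed byte select)
    set_index = (address >> 2) & 0x1FFF    # A2-A14, 13 bits
    tag = (address >> 15) & 0x1FFFF        # A15-A31, 17 bits
    return tag, set_index, byte_offset


tag, set_index, byte_offset = split_68020_address(0x09AF45ED)
print(hex(tag), hex(set_index), byte_offset)

# 2**13 sets of one 32-bit word each give the 32K bytes of the caption.
print((1 << 13) * 4)  # 32768
```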
64 / 90
65 68020
- Logical cache: addresses are not translated, so the response is faster
- PAL components are used for the cache controller
- The cache is flushed by software during a context switch
- The 68020 supports unaligned write cycles
- The signals CPUSPACE, IOEN, CACHE.E and AS disable hits during I/O or coprocessor operation
- Enable and Disable flush the cache before a new session
- The Hit/Miss logic samples the tag; a Bus Retry cycle is generated if there is no cache hit
65 / 90
66From CPU
66 / 90
6767 / 90
68 Cache read miss cycle timing waveforms
68 / 90
6969 / 90
70 386 DIRECT MAPPED CACHE
70 / 90
71 386
[Figure: block diagram. Components: 386 CPU; IDT74FCT646A system address buffers; IDT7174 8K x 8 integrated cache-tag RAM; a second IDT7174 8K x 8 integrated cache-tag RAM used as the snoop tag; IDT7164 8K x 8 fast SRAM; IDT74FCT646A system data transceivers; control PALs (CPU control, cache control, misc. control); system bus. Buses: local tag address bus A15-31 (17 bits), local set address bus A2-14 (13 bits), local data bus D0-31 (32 bits); system tag address bits and system set address bits.]
71 / 90
72 386
- The 386 MMU is on chip, so the cache is physical
- Cache coherency is preserved by the dual cache tag
- The design is transparent: there is no difference between main memory and cache, except speed
- The 386 cache does not need the operating system to monitor DMA activity
- Signal CE1 disables the tag's responses to noncacheable addresses
- Snoop Hits flush both the Cache Tag and the Snoop Tag
72 / 90
7373 / 90
386 Cache Tag
7474 / 90
386 Cache Data
7575 / 90
386 Snoopy Tag
7676 / 90
7777 / 90
78 68030 DIRECT-MAPPED CACHE
78 / 90
79 68030
[Figure: block diagram. Components: 68030 CPU; IDT74FCT373A system address buffers; IDT6178 4K x 4 fast SRAM (valid bits); IDT6178 4K x 4 integrated cache-tag RAM; IDT7198 16K x 4 fast SRAM; IDT74FCT543A system data transceivers; control PALs (CPU control, cache control, misc. control); system bus. Buses: local tag address bus A17-31 (15 bits), local set address bus A2-16 (15 bits), local data bus D0-31 (32 bits).]
Block diagram of a 64K-byte, posted write-through, direct-mapped, physical secondary cache for the 68030 microprocessor
79 / 90
80 68030 Cache Tag
80 / 90
8181 / 90
68030 Cache Data
8282 / 90
8383 / 90
84 COPY-BACK SECONDARY CACHE FOR THE INTEL i486DX
84 / 90
85 i486
[Figure: block diagram. Components: i486 CPU; IDT74FCT373A system address buffers; IDT74FCT543A set bit latch; IDT7174 8K x 8 integrated cache-tag RAM; IDT71589 32K x 9 bursting self-timed SRAM; IDT6167 16K x 1 (dirty bit); IDT79R3020 write buffers; 74F579 copyback flush counter; IDT74FCT543A system data transceivers; control PALs (CPU control, cache control, misc. control); system bus. Buses: local tag address bus A17-31 (15 bits), buffered set address bus A2-16 (15 bits), local data bus D0-31 with DP0-3 (36 bits).]
Block diagram of a 128K-byte, direct-mapped, copy-back secondary cache for the i486 microprocessor
85 / 90
8686 / 90
8787 / 90
8888 / 90
89 [Figure: state diagram of the copy-back cache controller, states 0 to 7 plus IDLE. EVICTION AND COPY BACK (READ MISS with DIRTY LINE): read cache doublewords 1 to 4 and load the write buffer, waiting while WRITE BUFFER FULL and advancing on WRITE BUFFER NOT FULL; the primary cache is invalidated along the way. CACHE UPDATE (READ MISS with LINE INVALID, or after eviction): memory read of doublewords 1 to 4 with cache write, waiting while MEMORY NOT READY and advancing on MEMORY READY; the tag is updated and the dirty bit cleared before returning to IDLE.]
89 / 90
90 [Figure: state diagram of the cache flush sequence, states 0 to 6 starting from IDLE. States: TEST DIRTY BITS; INCREMENT TAG-ADDRESS COUNTER; CHECK DIRTY AND VALID BITS; and, for a DIRTY line, EVICTION AND COPYBACK, which reads cache doublewords DW1 to DW4 and loads the write buffer, waiting while WRITE BUFFER FULL and advancing on WRITE BUFFER NOT FULL. CLEAN OR INVALID lines are skipped and the sequence ends at TERMINAL COUNT. Each state is annotated with its FLSHCLK and COPYBACK output values (0 or 1).]
90 / 90