Title: Design and Verification of Adaptive Cache Coherence Protocols
1 - Design and Verification of Adaptive Cache Coherence Protocols
- Arvind and Xiaowei Shen
- MIT Lab for Computer Science
- IBM T.J. Watson Research Center
2 - Three Issues
- What memory model should be supported?
- Commit-Reconcile & Fences (CRF) (ISCA'99)
- What adaptivity can be provided?
- Cachet (ICS'99)
- How can adaptive protocols be designed and verified?
3 - Sequential Consistency
- In-order instruction execution
- Atomic loads and stores
Architects and compiler writers want to violate SC for performance
=> weaker memory models: reorder memory accesses, break atomicity of stores
4 - Weaker Memory Models
Alpha, Sparc, PowerPC, ...
Store is globally performed
Write-buffers
TSO, PSO, RMO, ...
RMOWO?
SMP, DSM
- Hard to understand and remember
- Unstable - Model of the year
5 - CRF: Our Solution
- Implementation-independent
- Precise and easy-to-understand
- Scalable and efficient implementations
- Stable
6 - Roadmap
- The Cachet adaptive protocol
7 - The CRF Model
semantic cache (sache)
- Decompose Load and Store
- Store(a,v) ≡ StoreL(a,v); Commit(a)
- Load(a) ≡ Reconcile(a); LoadL(a)
8 - CRF: LoadL and StoreL
proc
proc
LoadL(a)
StoreL(a,v)
. . .
Cell(a,v,-)
Cell(a,v,D)
shared memory
- LoadL reads from the sache if the address is cached
- StoreL writes into the sache and sets the state to Dirty
9 - CRF: Commit and Reconcile
proc
proc
Commit(a)
Reconcile(a)
Commit(a)
Reconcile(a)
. . .
Cell(a,v,D)
Cell(a,v,C)
Cell(a,v,C)
shared memory
- Commit stalls if the address is cached in the Dirty state
- Reconcile stalls if the address is cached in the Clean state
10 - CRF: Background Operations
proc
proc
proc
. . .
. . .
Cell(a,5,C)
Cell(b,8,D)
Cell(c,7,C)
Cell(b,8,C)
Cache
Writeback
Purge
- Cache (retrieve) a Clean copy from memory
- Writeback a Dirty copy to memory
- Purge a Clean copy from sache
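The interplay between these background operations and the stall conditions of Commit and Reconcile can be sketched in a few lines of Python. This is a minimal single-site model, not the authors' definition: the dictionaries `memory` and `sache`, the helper names, and the choice to let `storel` fetch the address first are all assumptions made for illustration.

```python
# Hypothetical single-site model; all names here are illustrative.
memory = {"a": 5, "b": 0}
sache = {}  # address -> (value, state), state is "Clean" or "Dirty"

def cache(a):
    """Background: bring a Clean copy of `a` from memory if it is not cached."""
    if a not in sache:
        sache[a] = (memory[a], "Clean")

def writeback(a):
    """Background: write a Dirty copy to memory; the cached copy becomes Clean."""
    if a in sache and sache[a][1] == "Dirty":
        memory[a] = sache[a][0]
        sache[a] = (sache[a][0], "Clean")

def purge(a):
    """Background: drop a Clean copy from the sache."""
    if a in sache and sache[a][1] == "Clean":
        del sache[a]

def storel(a, v):
    """StoreL: overwrite the cached cell and mark it Dirty."""
    cache(a)  # assumption: fetch first so that the address is cached
    sache[a] = (v, "Dirty")

def commit_can_complete(a):
    """Commit stalls while the address is cached in the Dirty state."""
    return not (a in sache and sache[a][1] == "Dirty")

def reconcile_can_complete(a):
    """Reconcile stalls while the address is cached in the Clean state."""
    return not (a in sache and sache[a][1] == "Clean")

storel("b", 8)
stalled_before = not commit_can_complete("b")       # Dirty copy: Commit stalls
writeback("b")                                      # background Writeback
committed = commit_can_complete("b")                # now Commit can complete
reconcile_stalled = not reconcile_can_complete("b") # Clean copy: Reconcile stalls
purge("b")                                          # background Purge
reconciled = reconcile_can_complete("b")            # now Reconcile can complete
```

In this sketch a Writeback is what unblocks a stalled Commit, and a Purge is what unblocks a stalled Reconcile.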
11 - CRF: Fences
- Instructions can be reordered except for
- data dependence
- StoreL(a,v); Commit(a)
- Reconcile(a); LoadL(a)
Commit(a1); Fence_wr(a1, a2); Reconcile(a2)
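The ordering constraints listed on this slide can be written as a pairwise check. The sketch below is a simplification under stated assumptions: instructions are encoded as tuples, `ordered` is a hypothetical helper name, "data dependence" is approximated as same-address LoadL/StoreL pairs involving at least one StoreL, and only direct constraints are checked (no transitive closure).

```python
def ordered(first, second):
    """Return True if `first` must stay before `second` in program order.

    Simplified reading of the slide; instructions are tuples such as
    ("StoreL", a, v), ("Commit", a), ("Fence_wr", a1, a2).
    """
    op1, *args1 = first
    op2, *args2 = second
    # StoreL(a,v) must precede a later Commit(a).
    if op1 == "StoreL" and op2 == "Commit":
        return args1[0] == args2[0]
    # Reconcile(a) must precede a later LoadL(a).
    if op1 == "Reconcile" and op2 == "LoadL":
        return args1[0] == args2[0]
    # Commit(a1) must precede Fence_wr(a1, a2) ...
    if op1 == "Commit" and op2 == "Fence_wr":
        return args1[0] == args2[0]
    # ... which must precede Reconcile(a2).
    if op1 == "Fence_wr" and op2 == "Reconcile":
        return args1[1] == args2[0]
    # Assumed data dependence: same address, at least one StoreL.
    if {op1, op2} <= {"LoadL", "StoreL"} and "StoreL" in (op1, op2):
        return args1[0] == args2[0]
    return False

# Without a fence, Commit(x) and Reconcile(y) may be reordered.
free = not ordered(("Commit", "x"), ("Reconcile", "y"))
# With Fence_wr(x, y) in between, both halves of the chain are ordered.
chained = (ordered(("Commit", "x"), ("Fence_wr", "x", "y"))
           and ordered(("Fence_wr", "x", "y"), ("Reconcile", "y")))
```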
12 - CRF: Definition in TRS
- CRF-Loadl Rule
- Site(sache, <t,Loadl(a)>:pmb, mpb, p) if Cell(a,v,-) ∈ sache
  → Site(sache, pmb, mpb|<t,v>, p)
- CRF-Storel Rule
- Site(Cell(a,-,-) | sache, <t,Storel(a,v)>:pmb, mpb, p)
  → Site(Cell(a,v,Dirty) | sache, pmb, mpb|<t,Ack>, p)
- CRF-Commit Rule
- Site(sache, <t,Commit(a)>:pmb, mpb, p) if Cell(a,-,Dirty) ∉ sache
  → Site(sache, pmb, mpb|<t,Ack>, p)
- CRF-Reconcile Rule
- Site(sache, <t,Reconcile(a)>:pmb, mpb, p) if Cell(a,-,Clean) ∉ sache
  → Site(sache, pmb, mpb|<t,Ack>, p)
- CRF-Cache Rule
- Sys(m, Site(sache, pmb, mpb, p) | sites) if a ∉ sache
  → Sys(m, Site(Cell(a,m[a],Clean) | sache, pmb, mpb, p) | sites)
- CRF-Writeback Rule
- Sys(m, Site(Cell(a,v,Dirty) | sache, pmb, mpb, p) | sites)
  → Sys(m[a:=v], Site(Cell(a,v,Clean) | sache, pmb, mpb, p) | sites)
- CRF-Purge Rule
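As a rough illustration of how the four processor rules rewrite a site, the sketch below treats each rule as a guarded update on the triple (sache, pmb, mpb). It is an assumption-laden simplification: `step` is a hypothetical name, the sache is a Python dict, and only the instruction at the head of pmb is considered, whereas the model itself permits reordering.

```python
from collections import deque

def step(sache, pmb, mpb):
    """Try to fire one processor rule on the head of pmb.

    sache: dict address -> (value, state)
    pmb:   deque of (tag, instruction) from processor to memory
    mpb:   list of (tag, reply) from memory to processor
    Returns True if a rule fired, False if the instruction stalls.
    """
    if not pmb:
        return False
    tag, (op, *args) = pmb[0]
    a = args[0]
    if op == "Loadl" and a in sache:
        reply = sache[a][0]                     # Loadl: return cached value
    elif op == "Storel" and a in sache:
        sache[a] = (args[1], "Dirty")           # Storel: update cell, mark Dirty
        reply = "Ack"
    elif op == "Commit" and not (a in sache and sache[a][1] == "Dirty"):
        reply = "Ack"                           # Commit: no Dirty copy cached
    elif op == "Reconcile" and not (a in sache and sache[a][1] == "Clean"):
        reply = "Ack"                           # Reconcile: no Clean copy cached
    else:
        return False                            # guard fails: stall
    pmb.popleft()
    mpb.append((tag, reply))
    return True

sache = {"a": (1, "Clean")}
pmb = deque([("t1", ("Storel", "a", 2)),
             ("t2", ("Loadl", "a")),
             ("t3", ("Commit", "a"))])
mpb = []
while step(sache, pmb, mpb):
    pass
```

In this run the Commit stays in pmb because the cell is still Dirty; it would only complete after a background Writeback, which this sketch leaves out.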
13 - CRF: A Universal Interface
RC Program
CRF Program
Translation Scheme
CRF Protocol
- A CRF protocol is automatically a protocol for
any memory model whose programs can be translated
into CRF programs
14 - Translation Schemes
- Sparc's RMO → CRF
- Load(a) ≡ Reconcile(a); LoadL(a)
- Store(a,v) ≡ StoreL(a,v); Commit(a)
- Membar#LoadLoad ≡ Fence_rr(*,*)
- Release Consistency → CRF
- Load(a) ≡ LoadL(a)
- Store(a,v) ≡ StoreL(a,v)
- Release(s) ≡ Commit(*); Fence_rw(*,s); Fence_ww(*,s); Unlock(s)
- Acquire(s) ≡ Lock(s); Fence_rr(s,*); Fence_rw(s,*); Reconcile(*)
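A translation scheme of this kind is essentially a per-instruction expansion table. The following sketch encodes the two schemes from this slide as Python functions; the tuple encoding, the function names `rmo_to_crf` and `rc_to_crf`, the instruction name `"MembarLoadLoad"`, and the use of the string `"*"` for "all addresses" are assumptions of the sketch.

```python
ALL = "*"  # this sketch's encoding of "all addresses"

def rmo_to_crf(instr):
    """Expand one RMO instruction into CRF instructions."""
    op, *args = instr
    if op == "Load":
        return [("Reconcile", args[0]), ("LoadL", args[0])]
    if op == "Store":
        return [("StoreL", args[0], args[1]), ("Commit", args[0])]
    if op == "MembarLoadLoad":
        return [("Fence_rr", ALL, ALL)]
    raise ValueError(f"no translation given for {op}")

def rc_to_crf(instr):
    """Expand one Release Consistency instruction into CRF instructions."""
    op, *args = instr
    if op == "Load":
        return [("LoadL", args[0])]
    if op == "Store":
        return [("StoreL", args[0], args[1])]
    if op == "Release":
        s = args[0]
        return [("Commit", ALL), ("Fence_rw", ALL, s),
                ("Fence_ww", ALL, s), ("Unlock", s)]
    if op == "Acquire":
        s = args[0]
        return [("Lock", s), ("Fence_rr", s, ALL),
                ("Fence_rw", s, ALL), ("Reconcile", ALL)]
    raise ValueError(f"no translation given for {op}")

def translate(program, scheme):
    """Translate a whole program by expanding each instruction in order."""
    return [crf for instr in program for crf in scheme(instr)]

rc_program = [("Acquire", "s"), ("Store", "x", 1), ("Release", "s")]
crf_program = translate(rc_program, rc_to_crf)
```

Note how ordinary loads and stores under Release Consistency expand to a single CRF instruction each, with the Commit and Reconcile work concentrated at Release and Acquire.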
15 - Roadmap
- The CRF memory model
- Store(a,v) ≡ StoreL(a,v); Commit(a)
- Load(a) ≡ Reconcile(a); LoadL(a)
- Cachet: An adaptive protocol for CRF
16 - Need for Adaptive Protocols
- Memory access patterns may differ for different programs, or for different data structures
- producer-consumer, migratory, ...
- A fixed protocol is usually optimized for some specific access patterns
- invalidate vs. update
An adaptive protocol can change its behavior based on observed access patterns
17 - Adaptive Protocols
mandatory rules
policies
- A fixed protocol is defined by a set of mandatory
rules
- Adaptivity is provided by a set of voluntary
rules which can be invoked at any time
- A policy determines how voluntary rules should be
invoked
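The split between mandatory rules, voluntary rules and a policy can be illustrated with a toy single-site run. Everything below is a hypothetical sketch rather than the Cachet rule set: the only voluntary rule is a writeback of a Dirty cell, the only mandatory rule is the writeback needed to complete a Commit, and a policy is just a function from an address to a yes/no decision.

```python
def run(program, policy):
    """Execute StoreL/Commit instructions on one site.

    policy(address) decides whether the voluntary writeback fires.
    Returns (memory, trace) where trace records which rule wrote back.
    """
    memory, sache, trace = {}, {}, []

    def writeback(a):
        value, _ = sache[a]
        memory[a] = value
        sache[a] = (value, "Clean")

    for op, a, *rest in program:
        if op == "StoreL":
            sache[a] = (rest[0], "Dirty")
        elif op == "Commit" and a in sache and sache[a][1] == "Dirty":
            writeback(a)  # mandatory: required for Commit to complete
            trace.append(("mandatory-writeback", a))
        # Voluntary rule: may fire whenever a Dirty cell exists.
        for b, (_, state) in list(sache.items()):
            if state == "Dirty" and policy(b):
                writeback(b)
                trace.append(("voluntary-writeback", b))
    return memory, trace

program = [("StoreL", "x", 1), ("StoreL", "y", 2),
           ("Commit", "x"), ("Commit", "y")]

def lazy(address):
    return False   # never invoke the voluntary rule

def eager(address):
    return True    # invoke it as soon as possible

mem_lazy, trace_lazy = run(program, lazy)
mem_eager, trace_eager = run(program, eager)
```

Both policies end with the same memory contents; they differ only in when the writebacks happen, which is the sense in which a policy affects performance but not correctness in this toy model.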
18 - Base: A Directory-less Protocol
Site A
Site B
proc
proc
StoreL(a,2)
Reconcile(a)
Commit(a)
LoadL(a)
Cell(a,1,Clean)
Cell(a,1,Clean)
Cell(a,2,Dirty)
Cell(a,2,Clean)
Cell(a,2,Clean)
Purge
Writeback
Cell(a,1)
Cell(a,2)
Base
19 - Voluntary Writeback Rule
Site A
Site B
proc
proc
Cell(a,1,Clean)
Cell(a,2,Dirty)
Cell(a,2,Clean)
Writeback
Cell(a,1)
Cell(a,2)
Policy issue: when should this rule be invoked?
- address a-1 is being written back
- address a is unlikely to be modified
Base
20 - The Writer-Push Protocol
Site A
Site B
proc
proc
StoreL(a,2)
Reconcile(a)
Commit(a)
LoadL(a)
Cell(a,1,Clean)
Cell(a,1,Clean)
Cell(a,2,Dirty)
Cell(a,2,Clean)
Cell(a,2,Pend)
Cell(a,2,Clean)
behaves as a Nop !
Writeback
Purge-Req
Cell(a,1) A, B
Cell(a,2) A
B
Writer-Push
21 - Voluntary Cache Rule
Site A
Site B
proc
proc
Cell(a,1,Clean)
Cell(a,1,Clean)
Cache
Cell(a,1) B
A
Policy issue: when should this rule be invoked?
- a request for address a-1 is received
- a cache had the address purged recently
(update protocol)
Writer-Push
22 - Cachet: Seamless Integration of Multiple Micro-protocols
Base
Migratory
Writer-Push
- Different caches may use different
micro-protocols for the same address
simultaneously
- Based on observed access patterns, a cache can
switch from one micro-protocol to another via
a voluntary downgrade or upgrade operation
23 - Downgrade and Upgrade Operations
Invalid
24 - A Limited Directory Protocol
Site A
Site B
Site C
proc
proc
proc
LoadL(a)
Cell(a,1,Clean)
Cell(a,1,Clean)
Cell(c,2,Dirty)
Cell(a,-,Pend)
Cell(a,1,Clean)
WP
Base
WP
WP
Cache-Req
Downgrade-Req
Cache-Rep
Cell(a,1) B C
A
- Downgrade a victim copy from WP to Base
Suppose no extra directory space is available
25 - Roadmap
- The Cachet adaptive protocol
- Verification of Cachet
- Soundness: Cachet implements CRF
- Liveness: each processor makes progress
What is the theorem to be proved? How can verification be made tractable?
26 - Verification of Cachet
Cachet
- Cachet is one protocol for verification
- Policy does not affect soundness/liveness
- => one verification for a set of protocols
27 - The Simulation Theorem
Any move in CACHET can be simulated by moves in
CRF
How to define the mapping function?
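To make the shape of such a simulation argument concrete, the sketch below checks it exhaustively on a deliberately tiny pair of systems. Neither system is CRF or Cachet: the specification is a hypothetical counter that increments atomically, the implementation splits each increment into a "start" and a "finish" step, and `LIMIT`, `simulates` and the mapping functions are names invented for the example.

```python
LIMIT = 3  # keeps both toy state spaces finite

def spec_moves(n):
    """Specification: an atomic increment."""
    return [n + 1] if n < LIMIT else []

def impl_moves(state):
    """Implementation: an increment takes a start step and a finish step."""
    n, pending = state
    if pending:
        return [(n + 1, False)]
    return [(n, True)] if n < LIMIT else []

def reachable(start, moves):
    """All states reachable from `start` in zero or more moves."""
    seen, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for t in moves(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

def simulates(impl_init, mapping):
    """Check: every impl move s -> t has mapping(s) ->* mapping(t) in the spec."""
    for s in reachable(impl_init, impl_moves):
        for t in impl_moves(s):
            if mapping(t) not in reachable(mapping(s), spec_moves):
                return False
    return True

def cancel_pending(state):
    """Map a half-done increment to the state before it started."""
    return state[0]

def complete_pending(state):
    """Map a half-done increment to the state after it finishes."""
    return state[0] + (1 if state[1] else 0)

def broken(state):
    """A deliberately wrong mapping, for contrast."""
    return -state[0]

ok_cancel = simulates((0, False), cancel_pending)
ok_complete = simulates((0, False), complete_pending)
ok_broken = simulates((0, False), broken)
```

Two different mappings both satisfy the check here, which is why the choice of mapping function is a design decision in its own right.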
28 - The Mapping Function
a 10
a 10
1. Send Cache-Req
2. Deliver Cache-Req
3. Receive Cache-Req, send Cache-Rep
4. Deliver Cache-Rep
5. Receive Cache-Rep
29 - Forward Draining
Specification
confluent
Implementation
Forward draining completes partially executed
operations
30 - Backward Draining
Specification
non-confluent
Implementation
Backward draining cancels partially executed
operations
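One way to picture the two kinds of draining is as a function that maps a protocol state with in-flight messages to a state with none. The sketch below is a toy under stated assumptions, not the mapping used in the Cachet proof: the message names `Cache-Req`, `Cache-Rep` and `Wb`, the choice of which messages are drained forward or backward, and the function name `drain` are all illustrative.

```python
def drain(memory, sache, in_flight):
    """Map an implementation state to a state without in-flight messages.

    memory:    dict address -> value
    sache:     dict address -> (value, state)
    in_flight: list of messages, e.g. ("Cache-Req", a), ("Cache-Rep", a, v),
               ("Wb", a, v)
    Inputs are not modified; fresh copies are returned.
    """
    memory, sache = dict(memory), dict(sache)
    for kind, a, *value in in_flight:
        if kind == "Cache-Req":
            # Backward draining: cancel the request as if it was never sent.
            continue
        if kind == "Cache-Rep":
            # Forward draining: complete the reply by installing a Clean copy.
            sache[a] = (value[0], "Clean")
        elif kind == "Wb":
            # Forward draining: complete the writeback at the memory.
            memory[a] = value[0]
    return memory, sache

# A request still in flight maps to the state where nothing was requested.
requested = drain({"a": 10}, {}, [("Cache-Req", "a")])
# A reply in flight maps to the state where the copy is already cached.
replied = drain({"a": 10}, {}, [("Cache-Rep", "a", 10)])
# A writeback in flight maps to the state where memory is already updated.
written = drain({"a": 10}, {"a": (11, "Clean")}, [("Wb", "a", 11)])
```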
31 - Combination of Forward and Backward Draining
[Figure: state-transition diagram over states s0 to s8, with edges labeled A1, A2, B1, B2]
- Backward draining A and B
- Forward draining A, backward draining B
- Forward draining B, backward draining A
32 - The Imperative-Directive Design Methodology
Integrated protocol
Imperative rules
Directive rules
- Imperative rules specify all the coherence
actions that can affect soundness
- Directive rules are used to invoke imperative
rules at appropriate times
33 - Example: Cache Miss
Site A
Home
What happens if the memory does not have a valid
copy?
Miss
What happens if the memory receives more than one
request?
34 - Example: Cache Miss
Site A
Home
Miss
liveness
soundness
The Imperative-Directive methodology can
dramatically simplify protocol verification
35 - Using TRS to Specify, Verify and Synthesize Protocols
independent of number of processors, cache size, etc.
CRF Model
TRS Verification
CACHET Protocol
TRS Synthesis
36 - Final Words
- Full verification of sophisticated protocols like
Cachet must be a part of the design process
- Good design methodology dramatically simplifies
protocol design and verification
- Imperative-Directive
- Mandatory vs. Voluntary
37 - References
- [1] "Commit-Reconcile & Fences (CRF): A New Memory Model for Architects and Compiler Writers", Xiaowei Shen, Arvind and Larry Rudolph. In Proceedings of the 26th International Symposium on Computer Architecture, Atlanta, Georgia, May 1999.
- [2] "CACHET: An Adaptive Cache Coherence Protocol for Distributed Shared-Memory Systems", Xiaowei Shen, Arvind and Larry Rudolph. In Proceedings of the 13th ACM-SIGARCH International Conference on Supercomputing, Rhodes, Greece, June 1999.
- [3] "Design and Verification of Adaptive Cache Coherence Protocols", PhD Thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, January 2000.
- http://www.csg.lcs.mit.edu/Users/xwshen/publications.html
- (The thesis subsumes references 1 and 2)
- [4] "Specification of Memory Models and Design of Provably Correct Cache Coherence Protocols", Xiaowei Shen and Arvind, MIT CSG Memo 398, June 1997.
- http://www.csg.lcs.mit.edu/pubs/csgmemo.html