Title: Spin Locks and Contention
1Spin Locks and Contention
- Companion slides for
- The Art of Multiprocessor Programming
- by Maurice Herlihy & Nir Shavit
- Modified by Rajeev Alur
- for CIS 640, University of Pennsylvania
2Muddy children's puzzle (Common Knowledge)
- A group of kids are playing. A stranger walks by and announces, "Some of you have mud on your forehead."
- Each kid can see everyone else's forehead, but not his/her own (and they don't talk to one another).
- The stranger says, "Raise your hand if you conclude that you have mud on your forehead." Nobody does.
- The stranger keeps on repeating the statement.
- If k kids have muddy foreheads, then exactly these k kids raise their hands after the stranger repeats the statement exactly k times.
Art of Multiprocessor Programming
3Muddy children's puzzle: Why does this happen?
- For every k:
- If > k kids have muddy foreheads, then in the first k-1 rounds nobody raises hands
- If k kids have muddy foreheads, then in the k-th round, exactly the muddy kids raise their hands
- This claim can be proved by induction on k
- Base case: k = 1
- Inductive case (assume for k, and prove for k + 1)
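The claim can be checked with a small simulation. The sketch below assumes every kid reasons as in the inductive proof: a kid who sees m muddy foreheads concludes "I am muddy" in round m + 1 if nobody has raised a hand earlier. The class and method names are illustrative, not from the slides.

```java
import java.util.ArrayList;
import java.util.List;

class MuddyChildren {
    // Returns the round in which hands first go up and fills 'raisers'
    // with the indices of the kids who raise their hands in that round.
    static int firstRaiseRound(boolean[] muddy, List<Integer> raisers) {
        int n = muddy.length;
        int total = 0;
        for (boolean m : muddy) if (m) total++;
        if (total == 0) throw new IllegalArgumentException("the stranger's statement p must be true");
        for (int round = 1; round <= n; round++) {
            for (int kid = 0; kid < n; kid++) {
                int seen = total - (muddy[kid] ? 1 : 0); // muddy foreheads this kid can see
                // If only 'seen' kids were muddy, hands would have gone up in round 'seen'.
                // Nobody did, so in round seen + 1 this kid concludes "I am muddy".
                if (round == seen + 1) raisers.add(kid);
            }
            if (!raisers.isEmpty()) return round;
        }
        throw new AssertionError("unreachable: the muddy kids raise hands by round n");
    }

    public static void main(String[] args) {
        boolean[] muddy = {true, false, true, true, false}; // k = 3 muddy kids
        List<Integer> raisers = new ArrayList<>();
        int round = firstRaiseRound(muddy, raisers);
        System.out.println("round " + round + ", raisers " + raisers); // round 3, raisers [0, 2, 3]
    }
}
```

With k = 3, the muddy kids each see two muddy foreheads and act in round 3, while the clean kids see three and would only act in round 4, which never comes.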
4What is the role of the stranger's statement?
- Let p stand for "> 0 kids have muddy foreheads"
- Assuming > 1 kids are muddy, the stranger announcing p does not add to anyone's information
- However, without the stranger's announcement, nobody will ever raise their hands
- So what's going on?
- Well, the base case for our proof fails, but exactly what information do kids acquire from the stranger's announcement?
5Common Knowledge
- E p: Everybody knows p
- E E p: Everybody knows that everybody knows p
- E^k p: defined similarly (k repetitions)
- C p: p is common knowledge, the limit of "Everybody knows that everybody knows ..."
- For k = 2, each kid knows p, but not E p, and after the stranger's announcement, each kid knows E p
- If k kids are muddy, before the announcement, each kid knows E^(k-1) p, but not E^k p
- Stranger makes p the common knowledge
6Mutual Exclusion: Focus so far: Correctness
- Models
- Accurate
- But idealized
- Protocols
- Elegant
- Important
- But used in practice
7New Focus: Performance
- Models
- More complicated
- Still focus on principles
- Protocols
- Elegant
- Important
- And realistic
9Kinds of Architectures
- SISD (Uniprocessor)
- Single instruction stream
- Single data stream
- SIMD (Vector)
- Single instruction
- Multiple data
- MIMD (Multiprocessors): our space
- Multiple instruction
- Multiple data
10MIMD Architectures
[Figure: shared-bus and distributed-memory organizations]
- Memory Contention
- Communication Contention
- Communication Latency
11Today: Revisit Mutual Exclusion
- Think of performance, not just correctness and progress
- Begin to understand how performance depends on our software properly utilizing the multiprocessor machine's hardware
- And get to know a collection of locking algorithms
13What Should you do if you can't get a lock?
- Keep trying (our focus)
- "spin" or "busy-wait"
- Good if delays are short
- Give up the processor
- Good if delays are long
- Always good on uniprocessor
14Basic Spin-Lock
[Figure: threads wait at a spin lock; one at a time enters the critical section (CS) and resets the lock upon exit]
- Lock introduces sequential bottleneck
- Notice: these are distinct phenomena
18Test-and-Set Primitive
- Boolean value
- Test-and-set (TAS)
- Swap true with current value
- Return value tells if prior value was true or false
- Can reset just by writing false
- TAS aka "getAndSet"
19Test-and-Set
    // Package java.util.concurrent.atomic
    public class AtomicBoolean {
        boolean value;

        // Swap old and new values
        public synchronized boolean getAndSet(boolean newValue) {
            boolean prior = value;
            value = newValue;
            return prior;
        }
    }
22Test-and-Set
    AtomicBoolean lock = new AtomicBoolean(false);
    boolean prior = lock.getAndSet(true);
Swapping in true is called "test-and-set" or TAS
24Test-and-Set Locks
- Locking
- Lock is free: value is false
- Lock is taken: value is true
- Acquire lock by calling TAS
- If result is false, you win
- If result is true, you lose
- Release lock by writing false
25Test-and-set Lock
    class TASlock {
        // Lock state is AtomicBoolean
        AtomicBoolean state = new AtomicBoolean(false);

        void lock() {
            // Keep trying until lock acquired
            while (state.getAndSet(true)) {}
        }

        void unlock() {
            // Release lock by resetting state to false
            state.set(false);
        }
    }
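The slide code can be exercised with a small harness. This is a minimal sketch that uses the real `java.util.concurrent.atomic.AtomicBoolean`; the helper `countWithLock` and the thread and iteration counts are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class TASLock {
    private final AtomicBoolean state = new AtomicBoolean(false);

    void lock() {
        // getAndSet(true) returns the prior value: true means someone else holds the lock.
        while (state.getAndSet(true)) { /* spin */ }
    }

    void unlock() {
        state.set(false);
    }

    // Hypothetical helper: nThreads threads each add perThread to a lock-protected counter.
    static int countWithLock(int nThreads, int perThread) {
        TASLock lock = new TASLock();
        int[] counter = {0}; // plain int, protected only by the lock
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(4, 100_000)); // 400000 when mutual exclusion holds
    }
}
```

A lost update would show up as a final count below nThreads * perThread.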
29Space Complexity
- TAS spin-lock has small footprint
- n-thread spin-lock uses O(1) space
- As opposed to O(n) Peterson/Bakery
- How did we overcome the Ω(n) lower bound?
- We used a combined read-write operation
30Performance
- Experiment
- n threads
- Increment shared counter 1 million times
- How long should it take?
- How long does it take?
31Mystery 1
[Figure: time vs. number of threads, TAS lock compared with the ideal curve]
What is going on?
32Test-and-Test-and-Set Locks
- Lurking stage
- Wait until lock looks free
- Spin while read returns true (lock taken)
- Pouncing state
- As soon as lock looks available
- Read returns false (lock free)
- Call TAS to acquire lock
- If TAS loses, back to lurking
33Test-and-test-and-set Lock
    class TTASlock {
        AtomicBoolean state = new AtomicBoolean(false);

        void lock() {
            while (true) {
                while (state.get()) {}           // Wait until lock looks free
                if (!state.getAndSet(true))      // Then try to acquire it
                    return;
            }
        }
    }
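A runnable version of the lurk-then-pounce loop might look as follows. It is a sketch under assumptions: the `unlock()` method is not shown on the slide and is taken to be the same as for the TAS lock, and `countWithLock` is a hypothetical test helper.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class TTASLock {
    private final AtomicBoolean state = new AtomicBoolean(false);

    void lock() {
        while (true) {
            while (state.get()) { /* lurk: spin on plain reads while the lock looks taken */ }
            if (!state.getAndSet(true)) return; // pounce: TAS only when the lock looked free
        }
    }

    // Assumption: release works as in the TAS lock (write false).
    void unlock() {
        state.set(false);
    }

    // Hypothetical helper: nThreads threads each add perThread to a lock-protected counter.
    static int countWithLock(int nThreads, int perThread) {
        TTASLock lock = new TTASLock();
        int[] counter = {0};
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(4, 100_000)); // 400000 when mutual exclusion holds
    }
}
```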
36Mystery 2
[Figure: time vs. number of threads for TAS lock, TTAS lock, and ideal]
37Mystery
- Both
- TAS and TTAS
- Do the same thing (in our model)
- Except that
- TTAS performs much better than TAS
- Neither approaches ideal
38Opinion
- Our memory abstraction is broken
- TAS & TTAS methods
- Are provably the same (in our model)
- Except they aren't (in field tests)
- Need a more detailed model
39Bus-Based Architectures
[Figure: three processor caches connected by a shared bus to memory]
- Random access memory (10s of cycles)
- Shared Bus
- Broadcast medium
- One broadcaster at a time
- Processors and memory all "snoop"
- Per-Processor Caches
- Small
- Fast: 1 or 2 cycles
- Address & state information
43Jargon Watch
- Cache hit
- I found what I wanted in my cache
- Good Thing
- Cache miss
- I had to go all the way to memory for that data
- Bad Thing
44Caveat
- This model is still a simplification
- But not in any essential way
- Illustrates basic principles
- Will discuss complexities later
45Processor Issues Load Request
[Figure: a processor puts a load request on the bus ("Gimme data")]
47Memory Responds
[Figure: memory answers over the bus ("Got your data right here"); the requesting cache now holds the data]
48Processor Issues Load Request
[Figure: a second processor puts a load request for the same data on the bus]
51Other Processor Responds
[Figure: the cache that already holds the data answers ("I got data"); both caches now hold it]
53Modify Cached Data
[Figure: one processor modifies its cached copy while another cache and memory still hold the data]
What's up with the other copies?
57Cache Coherence
- We have lots of copies of data
- Original copy in memory
- Cached copies at processors
- Some processor modifies its own copy
- What do we do with the others?
- How to avoid confusion?
58Write-Back Caches
- Accumulate changes in cache
- Write back when needed
- Need the cache for something else
- Another processor wants it
- On first modification
- Invalidate other entries
- Requires non-trivial protocol
59Write-Back Caches
- Cache entry has three states
- Invalid: contains raw seething bits
- Valid: I can read but I can't write
- Dirty: Data has been modified
- Intercept other load requests
- Write back to memory before using cache
60Invalidate
[Figure: a processor that wants to write broadcasts an invalidation on the bus ("Mine, all mine!"); the other cache holding the data reacts ("Uh, oh")]
- Other caches lose read permission
- This cache acquires write permission
- Memory provides data only if not present in any cache, so no need to change it now (expensive)
66Another Processor Asks for Data
[Figure: another processor puts a load request for the data on the bus]
67Owner Responds
[Figure: the owning cache supplies the data over the bus]
68End of the Day
[Figure: the caches and memory all hold the data]
Reading OK, no writing
69Mutual Exclusion
- What do we want to optimize?
- Bus bandwidth used by spinning threads
- Release/Acquire latency
- Acquire latency for idle lock
70Simple TASLock
- TAS invalidates cache lines
- Spinners
- Miss in cache
- Go to bus
- Thread wants to release lock
- delayed behind spinners
71Test-and-test-and-set
- Wait until lock looks free
- Spin on local cache
- No bus use while lock busy
- Problem when lock is released
- Invalidation storm
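The difference in bus use can be illustrated with a toy model of one cache line under the three-state write-back protocol described earlier. This is a deliberately simplified sketch: the class `CacheLineSim`, its counting of one bus transaction per miss or invalidation, and the round-robin spinning are all assumptions for illustration, not a model of any real machine.

```java
import java.util.Arrays;

class CacheLineSim {
    enum State { INVALID, VALID, DIRTY }

    private final State[] lines; // state of the lock's cache line in each processor's cache
    int busTransactions = 0;

    CacheLineSim(int processors) {
        lines = new State[processors];
        Arrays.fill(lines, State.INVALID);
    }

    // Processor p reads the line (the get() in the TTAS lurking loop).
    void read(int p) {
        if (lines[p] != State.INVALID) return; // cache hit: no bus traffic
        busTransactions++;                     // miss: load request goes on the bus
        for (int q = 0; q < lines.length; q++)
            if (lines[q] == State.DIRTY) lines[q] = State.VALID; // owner supplies the data
        lines[p] = State.VALID;
    }

    // Processor p writes the line (a TAS writes even when it fails to get the lock).
    void write(int p) {
        if (lines[p] == State.DIRTY) return;   // already holds write permission
        busTransactions++;                     // broadcast invalidation on the bus
        for (int q = 0; q < lines.length; q++)
            if (q != p) lines[q] = State.INVALID;
        lines[p] = State.DIRTY;
    }

    // Processor 0 holds the lock; the spinners take turns calling TAS.
    static int tasTraffic(int spinners, int rounds) {
        CacheLineSim sim = new CacheLineSim(spinners + 1);
        sim.write(0);
        for (int r = 0; r < rounds; r++)
            for (int p = 1; p <= spinners; p++) sim.write(p);
        return sim.busTransactions;
    }

    // Processor 0 holds the lock; the spinners take turns reading.
    static int ttasTraffic(int spinners, int rounds) {
        CacheLineSim sim = new CacheLineSim(spinners + 1);
        sim.write(0);
        for (int r = 0; r < rounds; r++)
            for (int p = 1; p <= spinners; p++) sim.read(p);
        return sim.busTransactions;
    }

    public static void main(String[] args) {
        System.out.println("TAS:  " + tasTraffic(2, 100));  // TAS:  201
        System.out.println("TTAS: " + ttasTraffic(2, 100)); // TTAS: 3
    }
}
```

In this model every TAS by alternating spinners steals the line and costs a bus transaction, whereas TTAS spinners miss once each and then hit in their own caches while the lock stays busy.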
72Local Spinning while Lock is Busy
[Figure: every cache and memory hold the value "busy"; spinners read their local copies]
73On Release
[Figure: the releasing cache and memory hold "free"; the other cached copies are invalid]
- Everyone misses, rereads
- Everyone tries TAS
76Problems
- Everyone misses
- Reads satisfied sequentially
- Everyone does TAS
- Invalidates others' caches
- Eventually quiesces after lock acquired
- How long does this take?
77Mystery Explained
[Figure: time vs. number of threads for TAS lock, TTAS lock, and ideal]
TTAS lock: better than TAS but still not as good as ideal
78Solution: Introduce Delay
- If the lock looks free
- But I fail to get it
- There must be contention
- Better to back off than to collide again
[Figure: timeline of spin-lock attempts separated by delays d, r1·d, r2·d]
79Dynamic Example: Exponential Backoff
[Figure: timeline of spin-lock attempts separated by delays d, 2d, 4d]
- If I fail to get lock
- wait random duration before retry
- Each subsequent failure doubles expected wait
80Exponential Backoff Lock
    public class Backoff implements Lock {
        public void lock() {
            int delay = MIN_DELAY;               // Fix minimum delay
            while (true) {
                while (state.get()) {}           // Wait until lock looks free
                if (!state.getAndSet(true))      // If we win, return
                    return;
                sleep(random() % delay);         // Back off for random duration
                if (delay < MAX_DELAY)
                    delay = 2 * delay;           // Double max delay, within reason
            }
        }
    }
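The slide leaves `sleep`, `random`, `MIN_DELAY`, and `MAX_DELAY` unspecified. A runnable sketch could fill them in as below; the nanosecond bounds, the use of `LockSupport.parkNanos` for the pause, and the `countWithLock` helper are assumptions chosen for illustration and would need tuning on a real machine.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class BackoffLock {
    // Assumed bounds; the slides note that such parameters are platform-dependent.
    static final long MIN_DELAY_NANOS = 1_000;
    static final long MAX_DELAY_NANOS = 1_000_000;

    private final AtomicBoolean state = new AtomicBoolean(false);

    void lock() {
        long limit = MIN_DELAY_NANOS;
        while (true) {
            while (state.get()) { /* wait until lock looks free */ }
            if (!state.getAndSet(true)) return;           // we win
            // We lost the race, so there is contention: pause for a random duration.
            long pause = ThreadLocalRandom.current().nextLong(limit) + 1;
            LockSupport.parkNanos(pause);
            if (limit < MAX_DELAY_NANOS) limit = 2 * limit; // double the bound, within reason
        }
    }

    void unlock() {
        state.set(false);
    }

    // Hypothetical helper: nThreads threads each add perThread to a lock-protected counter.
    static int countWithLock(int nThreads, int perThread) {
        BackoffLock lock = new BackoffLock();
        int[] counter = {0};
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(4, 100_000)); // 400000 when mutual exclusion holds
    }
}
```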
86Spin-Waiting Overhead
[Figure: time vs. number of threads for TTAS lock and backoff lock]
87Backoff: Other Issues
- Good
- Easy to implement
- Beats TTAS lock
- Bad
- Must choose parameters carefully
- Not portable across platforms
88Idea
- Avoid useless invalidations
- By keeping a queue of threads
- Each thread
- Notifies next in line
- Without bothering the others
89Anderson Queue Lock
[Figure sequence: a shared counter "next" and an array "flags", initially T F F F F F F F. The first thread calls getAndIncrement on next, finds its flag T, and takes the lock ("Mine!"). A second thread calls getAndIncrement and waits on the following flag, which is F. On release the flags read T T F F F F F F and the waiting thread proceeds ("Yow!").]
99Anderson Queue Lock
    class ALock implements Lock {
        boolean[] flags = {true, false, ..., false};   // One flag per thread
        AtomicInteger next = new AtomicInteger(0);     // Next flag to use
        ThreadLocal<Integer> mySlot;                   // Thread-local variable

        public lock() {
            mySlot = next.getAndIncrement();           // Take next slot
            while (!flags[mySlot % n]) {}              // Spin until told to go
            flags[mySlot % n] = false;                 // Prepare slot for re-use
        }

        public unlock() {
            flags[(mySlot + 1) % n] = true;            // Tell next thread to go
        }
    }
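A compilable variant of the array lock is sketched below. Two departures from the slide are assumptions of this sketch: the flags live in an `AtomicIntegerArray` (1 means "go") so that writes by one thread are visible to spinners, because elements of a plain Java `boolean[]` carry no such guarantee, and the capacity `n` is passed to the constructor. At most `n` threads may use the lock at once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class ALock {
    private final int n;                                      // capacity: max concurrent threads
    private final AtomicIntegerArray flags;                   // one flag per slot, 1 = go
    private final AtomicInteger next = new AtomicInteger(0);  // next slot to hand out
    private final ThreadLocal<Integer> mySlot = ThreadLocal.withInitial(() -> 0);

    ALock(int capacity) {
        n = capacity;
        flags = new AtomicIntegerArray(capacity); // all 0 (wait)
        flags.set(0, 1);                          // slot 0 may go first
    }

    void lock() {
        // floorMod keeps the index non-negative even if the counter overflows.
        int slot = Math.floorMod(next.getAndIncrement(), n); // take next slot
        mySlot.set(slot);
        while (flags.get(slot) == 0) { /* spin until told to go */ }
        flags.set(slot, 0);                                  // prepare slot for re-use
    }

    void unlock() {
        flags.set((mySlot.get() + 1) % n, 1);                // tell next thread to go
    }

    // Hypothetical helper: nThreads threads each add perThread to a lock-protected counter.
    static int countWithLock(int nThreads, int perThread) {
        ALock lock = new ALock(nThreads);
        int[] counter = {0};
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(2, 1_000)); // 2000 when mutual exclusion holds
    }
}
```

Because each thread spins on its own slot, a release changes only the flag that the next thread in line is watching.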
108Performance
[Figure: time vs. number of threads for TTAS lock and queue lock]
- Shorter handover than backoff
- Curve is practically flat
- Scalable performance
- FIFO fairness
109Anderson Queue Lock
- Good
- First truly scalable lock
- Simple, easy to implement
- Bad
- Space hog
- One bit per thread
- Unknown number of threads?
- Small number of actual contenders?
110CLH Lock
- FIFO order
- Small, constant-size overhead per thread
111Initially
[Figure: the queue tail points to a single node whose flag is false, meaning the lock is free]
115Purple Wants the Lock
[Figure: Purple creates a node with flag true and swaps it into tail; the node it gets back has flag false]
118Purple Has the Lock
119Red Wants the Lock
[Figure: Red creates a node with flag true, swaps it into tail, and spins on Purple's node (true); the nodes form an implicit linked list]
- Actually, it spins on cached copy
126Purple Releases
[Figure: Purple sets its node's flag to false ("Bingo!"); Red sees the change and holds the lock]
128Space Usage
- Let
- L = number of locks
- N = number of threads
- ALock
- O(LN)
- CLH lock
- O(L+N)
129CLH Queue Lock
    class Qnode {
        AtomicBoolean locked = new AtomicBoolean(true);
    }

130CLH Queue Lock
    class CLHLock implements Lock {
        AtomicReference<Qnode> tail;                  // Queue tail
        ThreadLocal<Qnode> myNode = new Qnode();      // Thread-local Qnode

        public void lock() {
            myNode.locked.set(true);
            Qnode pred = tail.getAndSet(myNode);      // Swap in my node
            while (pred.locked.get()) {}              // Spin until predecessor releases lock
        }
    }
135CLH Queue Lock
    class CLHLock implements Lock {
        public void unlock() {
            myNode.locked.set(false);                 // Notify successor
            myNode = pred;                            // Recycle predecessor's node
        }
    }
(Code in book shows how it's done using myPred reference.)
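One way to make the slide code compile is to keep the predecessor in a second thread-local, in the spirit of the myPred reference mentioned above. The sketch below is an illustration under assumptions, not the book's listing: node flags start out false, the tail starts at a dummy unlocked node, and `countWithLock` is a hypothetical test helper.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

class CLHLock {
    static class QNode {
        final AtomicBoolean locked = new AtomicBoolean(false);
    }

    // The tail starts at a dummy node whose flag is false: the lock is free.
    private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
    private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

    void lock() {
        QNode node = myNode.get();
        node.locked.set(true);             // announce: I want (or hold) the lock
        QNode pred = tail.getAndSet(node); // swap in my node
        myPred.set(pred);
        while (pred.locked.get()) { /* spin until predecessor releases lock */ }
    }

    void unlock() {
        QNode node = myNode.get();
        node.locked.set(false);            // notify successor
        myNode.set(myPred.get());          // recycle predecessor's node
    }

    // Hypothetical helper: nThreads threads each add perThread to a lock-protected counter.
    static int countWithLock(int nThreads, int perThread) {
        CLHLock lock = new CLHLock();
        int[] counter = {0};
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(2, 1_000)); // 2000 when mutual exclusion holds
    }
}
```

A thread cannot reuse its own node right after unlocking because its successor may still be spinning on it, which is why it adopts the predecessor's node instead.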
139CLH Lock
- Good
- Lock release affects successor only
- Small, constant-sized space
- Bad
- Doesn't work for uncached NUMA architectures
140NUMA Architectures
- Acronym
- Non-Uniform Memory Architecture
- Illusion
- Flat shared memory
- Truth
- No caches (sometimes)
- Some memory regions faster than others
141NUMA Machines
- Spinning on local memory is fast
- Spinning on remote memory is slow
143CLH Lock
- Each thread spins on predecessor's memory
- Could be far away
144MCS Lock
- FIFO order
- Spin on local memory only
- Small, constant-size overhead
145Initially
[Figure: the queue tail before any thread asks for the lock]
146Acquiring
[Figure: a thread allocates a Qnode and swaps it into tail]
149Acquired
[Figure: the thread holds the lock]
150Acquiring
[Figure: a second thread swaps its Qnode into tail, links itself behind its predecessor, and spins on its own node's flag until the predecessor passes the lock on ("Yes!")]
156MCS Queue Lock
    class Qnode {
        boolean locked = false;
        Qnode next = null;
    }

157MCS Queue Lock
    class MCSLock implements Lock {
        AtomicReference tail;

        public void lock() {
            Qnode qnode = new Qnode();                // Make a QNode
            Qnode pred = tail.getAndSet(qnode);       // Add my node to the tail of queue
            if (pred != null) {                       // Fix if queue was non-empty
                qnode.locked = true;
                pred.next = qnode;
                while (qnode.locked) {}               // Wait until unlocked
            }
        }
    }
162MCS Queue Unlock
    class MCSLock implements Lock {
        AtomicReference tail;

        public void unlock() {
            if (qnode.next == null) {                 // Missing successor?
                if (tail.compareAndSet(qnode, null))  // If really no successor, return
                    return;
                while (qnode.next == null) {}         // Otherwise wait for successor to catch up
            }
            qnode.next.locked = false;                // Pass lock to successor
        }
    }
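Putting the lock and unlock halves together gives the following sketch. It assumes that each thread keeps its Qnode in a thread-local so that `unlock()` can find it, that the node fields are `volatile` so spinners see updates, and that the node's `next` link is cleared on release for re-use; `countWithLock` is a hypothetical test helper.

```java
import java.util.concurrent.atomic.AtomicReference;

class MCSLock {
    static class QNode {
        volatile boolean locked = false;
        volatile QNode next = null;
    }

    private final AtomicReference<QNode> tail = new AtomicReference<>(null);
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);

    void lock() {
        QNode qnode = myNode.get();
        QNode pred = tail.getAndSet(qnode);     // add my node to the tail of the queue
        if (pred != null) {                     // queue was non-empty
            qnode.locked = true;
            pred.next = qnode;                  // link behind predecessor
            while (qnode.locked) { /* spin on my own node until unlocked */ }
        }
    }

    void unlock() {
        QNode qnode = myNode.get();
        if (qnode.next == null) {               // missing successor?
            if (tail.compareAndSet(qnode, null)) return;   // really no successor
            while (qnode.next == null) { /* wait for successor to catch up */ }
        }
        qnode.next.locked = false;              // pass lock to successor
        qnode.next = null;                      // clean up for re-use
    }

    // Hypothetical helper: nThreads threads each add perThread to a lock-protected counter.
    static int countWithLock(int nThreads, int perThread) {
        MCSLock lock = new MCSLock();
        int[] counter = {0};
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(2, 1_000)); // 2000 when mutual exclusion holds
    }
}
```

Unlike CLH, each thread spins on a flag in its own node, which is the property that matters when remote memory is slow.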
167Purple Release
[Figure: Purple releases while another thread has swapped its node into tail but has not yet linked it]
- By looking at the queue, I see another thread is active
- I have to wait for that thread to finish
[Figure: the other thread prepares to spin and then spins on its own node; once it is linked, Purple sets its flag to false and it has acquired the lock]
174Abortable Locks
- What if you want to give up waiting for a lock?
- For example
- Timeout
- Database transaction aborted by user
175Back-off Lock
- Aborting is trivial
- Just return from lock() call
- Extra benefit
- No cleaning up
- Wait-free
- Immediate return
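Since a thread waiting on a backoff lock holds no place in a queue, giving up is just a matter of returning. The sketch below adds a hypothetical `tryLock(timeout, unit)` method to a backoff lock; the method name, the delay bounds, and the use of `LockSupport.parkNanos` are assumptions for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class TimedBackoffLock {
    static final long MIN_DELAY_NANOS = 1_000;     // assumed bounds
    static final long MAX_DELAY_NANOS = 1_000_000;

    private final AtomicBoolean state = new AtomicBoolean(false);

    // Returns true if the lock was acquired, false if the timeout expired first.
    boolean tryLock(long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long limit = MIN_DELAY_NANOS;
        while (true) {
            while (state.get()) {
                if (System.nanoTime() - deadline >= 0) return false; // abort: nothing to clean up
            }
            if (!state.getAndSet(true)) return true;
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) return false;
            // Back off for a random duration, but never past the deadline.
            long pause = Math.min(remaining, ThreadLocalRandom.current().nextLong(limit) + 1);
            LockSupport.parkNanos(pause);
            if (limit < MAX_DELAY_NANOS) limit = 2 * limit;
        }
    }

    void unlock() {
        state.set(false);
    }

    public static void main(String[] args) {
        TimedBackoffLock lock = new TimedBackoffLock();
        System.out.println(lock.tryLock(10, TimeUnit.MILLISECONDS)); // true: lock was free
        System.out.println(lock.tryLock(10, TimeUnit.MILLISECONDS)); // false: held, so we time out
        lock.unlock();
        System.out.println(lock.tryLock(10, TimeUnit.MILLISECONDS)); // true again after release
    }
}
```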
176Queue Locks
- Can't just quit
- Thread in line behind will starve
- Need a graceful way out
- Timeout Queue Lock
177One Lock To Rule Them All?
- TTAS+Backoff, CLH, MCS, TOLock
- Each better than others in some way
- There is no one solution
- Lock we pick really depends on
- the application
- the hardware
- which properties are important