Title: Outline for Today
Provided by: Carla209

1
Outline for Today
  • Objectives
  • To introduce the critical section problem.
  • To learn how to reason about the correctness of
    concurrent programs.
  • To present Linux kernel synchronization
  • Administrative details

2
Reasons for Explicitly Programming with
Threads (User-level Perspective, Birrell)
  • To capture naturally concurrent activities
  • Waiting for slow devices
  • Providing human users faster response.
  • Shared network servers multiplexing among client
    requests (each client served by its own server
    thread)
  • To gain speedup by exploiting parallelism in
    hardware
  • Maintenance tasks performed in the background
  • Multiprocessors
  • Overlap the asynchronous and independent
    functioning of devices and users
  • Within a single user thread, signal handlers
    cause asynchronous control flow.

3
Concurrency from the Kernel Perspective
  • Kernel preemption: the scheduler can preempt a task
    executing in the kernel.
  • Interrupts: they occur asynchronously, invoking a
    handler that disrupts the execution flow.
  • Sleeping to wait for events.
  • Support for SMP multiprocessors: true
    concurrency of code executing on shared memory
    locations.

4
The Trouble with Concurrency in Threads...
What is the value of x when both threads leave
this while loop?
5
Range of Answers
  • Process 0
  • LD x // x currently 0
  • Add 1
  • ST x // x now 1, stored over 9
  • Do 9 more full loops // leaving x at 10
  • Process 1
  • LD x // x currently 0
  • Add 1
  • ST x // x now 1
  • Do 8 more full loops // x = 9
  • LD x // x now 1
  • Add 1
  • ST x // x = 2, stored over 10
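
The schedule above can be replayed deterministically in a single thread. The sketch below assumes that the missing loop has each process increment x ten times, which is what the step counts on the slide imply; `full_loops` and `replay_schedule` are illustrative names, and the local variables `r0` and `r1` stand in for each process's private register.

```c
static int x;   /* the shared variable */

/* n uninterrupted loop iterations by one process: load, add 1, store. */
static void full_loops(int n)
{
    for (int k = 0; k < n; k++) {
        int r = x;
        r = r + 1;
        x = r;
    }
}

/* Replays the slide's schedule; each process performs 10 increments. */
int replay_schedule(void)
{
    x = 0;
    int r0 = x;      /* P0: LD x, sees 0 */
    int r1 = x;      /* P1: LD x, sees 0 */
    r1 = r1 + 1;     /* P1: Add 1 */
    x = r1;          /* P1: ST x, x now 1 */
    full_loops(8);   /* P1: 8 more full loops, x = 9 */
    r0 = r0 + 1;     /* P0: Add 1 */
    x = r0;          /* P0: ST x, 1 stored over 9 */
    r1 = x;          /* P1: LD x for its 10th iteration, sees 1 */
    full_loops(9);   /* P0: 9 more full loops, x = 10 */
    r1 = r1 + 1;     /* P1: Add 1 */
    x = r1;          /* P1: ST x, 2 stored over 10 */
    return x;
}
```

Twenty increments were issued, yet the replay ends with x equal to 2, the low end of the range of answers.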

6
Reasoning about Concurrency
  • What unit of work can be performed without
    interruption? Indivisible or atomic operations.
  • Interleavings - possible execution sequences of
    operations drawn from all threads.
  • Race condition - final results depend on ordering
    and may not be correct.

7
The Trouble with Concurrency
  • Two threads (T1,T2) in one address space or two
    processes in the kernel
  • One counter (shared)

Each thread executes the same three instructions, each assumed atomic
(registers are private, count is shared data):
    ld r2, count
    add r1, r2, r3
    st count, r1
Interleaving over time:
    T1: ld (count); add
    (switch)
    T2: ld (count); add; st (count+1)
    (switch)
    T1: st (count+1)
Both stores write count+1, so one increment is lost.
8
Desired Atomic Sequence of Instructions
  • Atomic Sequence
  • Appears to execute to completion without any
    intervening operations

9
Unprotected Shared Data
  • void threadcode( ) {
  •   int i; long key;
  •   for (i = 0; i < 20; i++) { key = rand(); SortedInsert(key); }
  •   for (i = 0; i < 20; i++) { key = SortedRemove(); print(key); }
  • }

Private: i, key
Shared list: head -> 10 -> 20 -> 30 -> null
What can happen here?
10
Unprotected Shared Data
  • 2 concurrent SortedInserts with keys 5 and 7.

Shared list: head -> 10 -> 20 -> 30 -> null (5 and 7 are both being inserted in front of 10)
What can happen here?
11
Unprotected Shared Data
  • 2 concurrent SortedInserts with keys 5 and 7.
  • 2 concurrent SortedRemoves

Shared list: head -> 10 -> 20 -> 30 -> null
What can happen here?
12
Critical Sections
  • If a sequence of non-atomic operations must be
    executed as if it were atomic in order to be
    correct, then we need to provide a way to
    constrain the possible interleavings
  • Critical sections are defined as code sequences
    that contribute to bad race conditions.
  • Synchronization is needed around such critical
    sections.
  • Mutual Exclusion - goal is to ensure that
    critical sections execute atomically w.r.t.
    related critical sections in other threads or
    processes.

13
The Critical Section Problem
  • Each process follows this template:
  • while (1) {
  •   ...other stuff... // processes in here
      shouldn't stop others
  •   enter_region( );
  •   critical section
  •   exit_region( );
  • }
  • The problem is to implement enter_region and
    exit_region to ensure mutual exclusion with some
    degree of fairness.

Problem with this definition: it focuses on code,
not on the shared data that needs protecting!
14
Temptation to Protect Critical Sections (Badly)
  • void threadcode( ) {
  •   int i; long key;
  •   for (i = 0; i < 20; i++) { key = rand();
      Acquire(insertmutex); SortedInsert(key); Release(insertmutex); }
  •   for (i = 0; i < 20; i++) {
      Acquire(removemutex); key = SortedRemove(); Release(removemutex);
      print(key); }
  • }

Shared list: head -> 10 -> 20 -> 30 -> null
Focus on the data!
15
Temptation to Protect Critical Sections (Badly)
  • void threadcode( ) {
  •   int i; long key;
  •   for (i = 0; i < 20; i++) { key = rand();
      Acquire(listmutex); SortedInsert(key); Release(listmutex); }
  •   for (i = 0; i < 20; i++) {
      Acquire(listmutex); key = SortedRemove(); Release(listmutex);
      print(key); }
  • }

Shared list: head -> 10 -> 20 -> 30 -> null
Focus on the data!
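
A user-space sketch of the "one lock per data structure" idea, assuming POSIX threads in place of the slide's Acquire/Release. The list layout, the helper names (`sorted_insert`, `sorted_remove`, `run_list_demo`), and the choice to remove the smallest key are assumptions made for illustration; the slides do not show the bodies of SortedInsert or SortedRemove.

```c
#include <pthread.h>
#include <stdlib.h>

struct node { long key; struct node *next; };

static struct node *head;   /* shared sorted list */
static pthread_mutex_t listmutex = PTHREAD_MUTEX_INITIALIZER;

static void sorted_insert(long key)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->key = key;
    pthread_mutex_lock(&listmutex);          /* Acquire(listmutex) */
    struct node **pp = &head;
    while (*pp != NULL && (*pp)->key < key)
        pp = &(*pp)->next;
    n->next = *pp;
    *pp = n;
    pthread_mutex_unlock(&listmutex);        /* Release(listmutex) */
}

/* Removes the smallest key; returns 0 if the list is empty. */
static int sorted_remove(long *key)
{
    pthread_mutex_lock(&listmutex);
    struct node *n = head;
    if (n != NULL)
        head = n->next;
    pthread_mutex_unlock(&listmutex);
    if (n == NULL)
        return 0;
    *key = n->key;
    free(n);
    return 1;
}

static void *threadcode(void *arg)
{
    long base = *(long *)arg;
    for (int i = 0; i < 20; i++)
        sorted_insert(base + 2 * i);   /* keys of the two threads interleave */
    return NULL;
}

/* Returns the number of keys drained in sorted order, or -1 if order broke. */
int run_list_demo(void)
{
    long bases[2] = { 0, 1 };
    pthread_t t[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, threadcode, &bases[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    long prev = -1, key;
    int count = 0;
    while (sorted_remove(&key)) {
        if (key < prev)
            return -1;
        prev = key;
        count++;
    }
    return count;
}
```

Because insert and remove take the same mutex, no interleaving of the two operations can observe the list half-updated, which is the point of focusing on the data rather than on the code.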
16
Yet Another Example
  • Problem: Given arrays C[0..x, 0..y], A[0..x, 0..y],
    and B[0..x, 0..y], use n threads to update each
    element of C to the sum of A and B, and then the
    last thread returns the average value of all C
    elements.

17
Design Alternatives
  • Static partitioning of arrays
  • for (i = lowi; i <= highi; i++)
  •   for (j = lowj; j <= highj; j++) {
  •     C[i,j] = A[i,j] + B[i,j];
  •     sum = sum + C[i,j]; }

One quadrant of C per thread; sum is shared:
    lowi = 0,   highi = n/2-1, lowj = 0,   highj = n/2-1
    lowi = n/2, highi = n-1,   lowj = 0,   highj = n/2-1
    lowi = n/2, highi = n-1,   lowj = n/2, highj = n-1
    lowi = 0,   highi = n/2-1, lowj = n/2, highj = n-1

  • Static partitioning of arrays, with a private partial sum
  • for (i = lowi; i <= highi; i++)
  •   for (j = lowj; j <= highj; j++) {
  •     C[i,j] = A[i,j] + B[i,j];
  •     privatesum = privatesum + C[i,j]; }
  • sum = sum + privatesum;
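
The private-sum variant might look as follows with POSIX threads. This is a sketch under assumptions: the array size `N`, the test data, and the names `add_quadrant` and `average_of_sum` are invented for illustration, and the main thread (rather than "the last thread") computes the average after joining the workers.

```c
#include <pthread.h>

#define N 8                 /* array dimension, assumed even */
#define NTHREADS 4

static double A[N][N], B[N][N], C[N][N];
static double sum;          /* shared total */
static pthread_mutex_t summutex = PTHREAD_MUTEX_INITIALIZER;

struct quadrant { int lowi, highi, lowj, highj; };

static void *add_quadrant(void *arg)
{
    const struct quadrant *q = arg;
    double privatesum = 0.0;
    for (int i = q->lowi; i <= q->highi; i++)
        for (int j = q->lowj; j <= q->highj; j++) {
            C[i][j] = A[i][j] + B[i][j];   /* disjoint elements: no lock */
            privatesum += C[i][j];
        }
    pthread_mutex_lock(&summutex);         /* one short critical section */
    sum += privatesum;
    pthread_mutex_unlock(&summutex);
    return NULL;
}

double average_of_sum(void)
{
    struct quadrant q[NTHREADS] = {
        { 0,     N / 2 - 1, 0,     N / 2 - 1 },
        { N / 2, N - 1,     0,     N / 2 - 1 },
        { N / 2, N - 1,     N / 2, N - 1     },
        { 0,     N / 2 - 1, N / 2, N - 1     },
    };
    pthread_t t[NTHREADS];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = i;                   /* illustrative test data */
            B[i][j] = j;
        }
    sum = 0.0;
    for (int k = 0; k < NTHREADS; k++)
        pthread_create(&t[k], NULL, add_quadrant, &q[k]);
    for (int k = 0; k < NTHREADS; k++)
        pthread_join(t[k], NULL);
    return sum / (N * N);
}
```

Each thread touches the shared `sum` once instead of once per element, so the lock is taken four times in total rather than N*N times.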
18
Design Alternatives
  • Dynamic partitioning of arrays
  • while (elements_remain(i,j)) {
  •   C[i,j] = A[i,j] + B[i,j];
  •   sum = sum + C[i,j]; }

Shared data: C, sum
19
Implementation Options for Mutual Exclusion
  • Disable Interrupts
  • Use atomic operations (read-mod-write instr.)
  • Busywaiting solutions - spinlocks
  • execute a tight loop if critical section is busy
  • benefits from specialized atomic instructions
  • Blocking synchronization
  • sleep (enqueued on wait queue) while C.S. is busy
  • Synchronization primitives (abstractions, such as
    locks) which are provided by a system may be
    implemented with some combination of these
    techniques.

20
The Critical Section Problem
while (1) {
    ...other stuff...
    enter_region( );
    critical section: anything that touches a particular set
    of shared data
    exit_region( );
}
21
Critical Data
  • Goal in solving the critical section problem is
    to build synchronization so that the sequence of
    instructions that can cause a race condition is
    executed AS IF it were indivisible
  • "Other stuff" code that does not touch the
    critical data associated with a critical section
    can be interleaved with the critical section
    code.
  • Code from a critical section involving data x can
    be interleaved with code from a critical section
    associated with data y.

22
The Critical Section Problem
while (1) {
    ...other stuff...
    local_irq_save(flags);
    critical section: anything that touches a particular set
    of shared data
    local_irq_restore(flags);
}
Overkill on UP. Insufficient for SMP.
23
Disabling Preemption
while (1) {
    ...other stuff...
    preempt_disable();
    critical section: per-processor data
    preempt_enable();
}
Milder impact on UP
24
Atomic Operations (Integer)
  • Special data type: atomic_t
  • Prevents misuse and compiler optimizations
  • Only 24-bit values (it's SPARC's fault)
  • atomic_t u = ATOMIC_INIT(0);
  • Selected operations (see p. 119)
  • atomic_read
  • atomic_set
  • atomic_add
  • atomic_inc
  • atomic_sub_and_test
  • atomic_add_negative
  • atomic_inc_and_test

25
Atomic Operations (Bitwise)
  • No special data type: the operations take a pointer and a bit
    number as arguments. Bit 0 is the least significant bit.
  • Selected operations
  • test_and_set_bit
  • test_and_clear_bit
  • test_and_change_bit
  • set_bit
  • clear_bit
  • change_bit
  • test_bit

26
Uses of Atomic Operations
  • static int x = 0;
  • threadcode() {
  •   int j = 0;
  •   while (j < 10) {
  •     // 10 times per thread
  •     x = x + 1;
  •     j++; } }
  • atomic_t x = ATOMIC_INIT(0);
  • threadcode() {
  •   int j = 0;
  •   while (j < 10) {
  •     // 10 times per thread
  •     atomic_inc(&x);
  •     j++; } }
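
The kernel's atomic_t is not available in user space, so the following sketch uses C11 `<stdatomic.h>` as a stand-in: `atomic_fetch_add` plays the role of atomic_inc. The function name `run_counter_demo` and the two-thread setup are assumptions for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x;   /* user-space analogue of atomic_t x */

static void *threadcode(void *arg)
{
    (void)arg;
    for (int j = 0; j < 10; j++)     /* 10 times per thread */
        atomic_fetch_add(&x, 1);     /* indivisible read-modify-write */
    return NULL;
}

int run_counter_demo(void)
{
    pthread_t t1, t2;
    atomic_store(&x, 0);
    pthread_create(&t1, NULL, threadcode, NULL);
    pthread_create(&t2, NULL, threadcode, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return atomic_load(&x);
}
```

With the increment made indivisible, the lost-update interleavings from the earlier slides cannot occur, and the two threads always leave the counter at 20.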

27
Uses of Atomic Operations
  • static int x = 0;
  • static int j = 11;
  • threadcode() {
  •   while ((--j) != 0)
  •     // 10 times in all
  •     x = x + 1; }
  • atomic_t x = ATOMIC_INIT(0);
  • atomic_t j = ATOMIC_INIT(11);
  • threadcode() {
  •   while (!atomic_dec_and_test(&j))
  •     // 10 times in all
  •     atomic_inc(&x); }

28
Uses of Atomic Operations
while (1) {
    ...other stuff...
    // homegrown spinlock
    while (test_and_set_bit(0, &busy))
        ;
    critical section: anything that touches a particular set
    of shared data
    clear_bit(0, &busy);
}
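
The same homegrown spinlock can be approximated in user space with a C11 `atomic_flag`, whose test-and-set behaves like test_and_set_bit on a single bit. This is a sketch, not the kernel implementation; `spin_acquire`, `spin_release`, and `run_spin_demo` are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag busy = ATOMIC_FLAG_INIT;
static long shared_count;          /* protected by busy */

static void spin_acquire(void)
{
    /* Loop until the previous value of the flag was clear. */
    while (atomic_flag_test_and_set_explicit(&busy, memory_order_acquire))
        ;   /* busy-wait */
}

static void spin_release(void)
{
    atomic_flag_clear_explicit(&busy, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        spin_acquire();
        shared_count++;            /* critical section */
        spin_release();
    }
    return NULL;
}

long run_spin_demo(void)
{
    pthread_t t1, t2;
    shared_count = 0;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_count;
}
```

The acquire and release orderings ensure that updates made inside the critical section are visible to the next thread that takes the lock.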
29
Linux Kernel Spinlocks
spinlock_t busy = SPIN_LOCK_UNLOCKED;
while (1) {
    ...other stuff...
    // canned spinlock
    spin_lock(&busy);
    critical section: anything that touches a particular set
    of shared data
    spin_unlock(&busy);
}
30
Pros and Cons of Busywaiting
  • Key characteristic - the waiting process is
    actively executing instructions in the CPU and
    using memory cycles.
  • Appropriate when
  • High likelihood of finding the critical section
    unoccupied (don't take a context switch just to
    find that out) or estimated wait time is very
    short
  • You have a processor all to yourself
  • In interrupt context
  • Disadvantages
  • Wastes resources (CPU, memory, bus bandwidth)

31
Spinlock Subtleties
  • Using a spinlock in interrupt handlers: disable
    local interrupts before obtaining the lock
  • Saves (and restores) IRQ-enable state. Disables
    interrupts while holding the lock
  • spin_lock_irqsave (lockvar, flags)
  • spin_unlock_irqrestore (lockvar, flags)
  • spin_lock_irq (lockvar)
  • spin_unlock_irq(lockvar)
  • Disabling bottom halves
  • spin_lock_bh() and spin_unlock_bh()

32
Pros and Cons of Blocking
  • Sleeping processes/threads don't consume CPU
    cycles
  • Appropriate when the cost of a system call is
    justified by expected waiting time
  • High likelihood of contention for lock
  • Long critical sections
  • Disadvantage: context switch overhead

33
Semaphores
  • Well-known synchronization abstraction
  • Defined as a non-negative integer with two atomic
    operations
  • P(s), or down(s): wait until s > 0, then s--
  • V(s), or up(s): s++

36
Semaphore Usage
  • Binary semaphores can provide mutual exclusion:
    a mutex (solution to the critical section problem)
  • Counting semaphores can represent a resource with
    multiple instances (e.g. solving
    producer/consumer problem)
  • Signaling events (persistent events that stay
    relevant even if nobody listening right now)

37
The Critical Section Problem
static DECLARE_SEMAPHORE_GENERIC(mutex, 1);
or static DECLARE_MUTEX(mutex);
Fill in the boxes:
while (1) {
    ...other stuff...
    down_interruptible(&mutex);
    critical section
    up(&mutex);
}
38
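Lock Granularity: how much should one lock
protect?
[Figure: linked list with head and tail pointers and nodes 2, 4, 6, 8, 10; A and B operate on it, and a node 3 is shown separately]
39
Lock Granularity: how much should one lock
protect?
[Figure: same list as on the previous slide]
Concurrency vs. overhead. Complexity threatens
correctness.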
40
Optimistic Locking: Seqlocks
  • Sequence counter incremented on write
  • Compare counter before and after a read
  • Even counter value means data is stable
  • Odd counter value means write in progress

Writes:
    write_seqlock(&lock);
    // do write, lock is odd
    write_sequnlock(&lock);
    // write complete, lock is even
Reads:
    do {
        old = read_seqbegin(&lock);
        // reading data
    } while (read_seqretry(&lock, old));
41
Peterson's Algorithm for 2 Process Mutual
Exclusion
  • enter_region:
  •   needin[me] = true;
  •   turn = you;
  •   while (needin[you] && turn == you) { no_op };
  • exit_region:
  •   needin[me] = false;

Based on the assumption of atomic ld/st operations
42
Interleaving of Execution of 2 Threads (blue and
green)
  • enter_region:
  •   needin[me] = true;
  •   turn = you;
  •   while (needin[you] && turn == you) { no_op };
  • Critical Section
  • exit_region:
  •   needin[me] = false;
  • enter_region:
  •   needin[me] = true;
  •   turn = you;
  •   while (needin[you] && turn == you) { no_op };
  • Critical Section
  • exit_region:
  •   needin[me] = false;

43
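Blue:  needin[blue] = true;
Green: needin[green] = true;
Blue:  turn = green;
Green: turn = blue;
Blue:  while (needin[green] && turn == green)   // false, since turn == blue
Blue:  Critical Section
Green: while (needin[blue] && turn == blue) { no_op };
Green: while (needin[blue] && turn == blue) { no_op };
Blue:  needin[blue] = false;
Green: while (needin[blue] && turn == blue)     // false, since needin[blue] is false
Green: Critical Section
Green: needin[green] = false;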
44
Peterson's Algorithm for 2 Process Mutual
Exclusion
  • enter_region:
  •   needin[me] = true;
  •   turn = you;
  •   while (needin[you] && turn == you) { no_op };
  • exit_region:
  •   needin[me] = false;

mb()
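
Peterson's algorithm depends on loads and stores not being reordered, which is why a memory barrier appears on this slide. A hedged user-space sketch: C11 sequentially consistent atomics (the default for `atomic_store` and `atomic_load`) provide the required ordering without explicit barriers. The counter, the iteration count, and `run_peterson_demo` are assumptions for illustration, and `sched_yield()` stands in for no_op.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool needin[2];
static atomic_int turn;
static long counter;               /* protected by the Peterson lock */

static void enter_region(int me)
{
    int you = 1 - me;
    atomic_store(&needin[me], true);
    atomic_store(&turn, you);
    while (atomic_load(&needin[you]) && atomic_load(&turn) == you)
        sched_yield();             /* no_op, but gives up the CPU */
}

static void exit_region(int me)
{
    atomic_store(&needin[me], false);
}

static void *proc(void *arg)
{
    int me = *(int *)arg;
    for (int i = 0; i < 50000; i++) {
        enter_region(me);
        counter++;                 /* critical section */
        exit_region(me);
    }
    return NULL;
}

long run_peterson_demo(void)
{
    int ids[2] = { 0, 1 };
    pthread_t t[2];
    atomic_store(&needin[0], false);
    atomic_store(&needin[1], false);
    atomic_store(&turn, 0);
    counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, proc, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Replacing the atomics with plain variables would let the compiler or the hardware move the load of `needin[you]` ahead of the store to `needin[me]`, which breaks mutual exclusion.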
45
Barriers
  • rmb: prevents loads being reordered across the
    barrier
  • wmb: prevents reordering of stores
  • mb: both loads and stores
  • read_barrier_depends: data-dependent loads
  • SMP versions of the above compile to barrier on UP
  • barrier: prevents compiler optimizations from
    causing the reordering

46
Classic Synchronization Problems
  • There are a number of classic problems that
    represent a class of synchronization situations
  • Critical Section problem
  • Producer/Consumer problem
  • Reader/Writer problem
  • 5 Dining Philosophers
  • Why? Once you know the generic solutions, you
    can recognize other special cases in which to
    apply them (e.g., this is just a version of the
    reader/writer problem)

47
Producer / Consumer
  • Producer
  • while(whatever)
  • locally generate item
  • fill empty buffer with item
  • Consumer
  • while(whatever)
  • get item from full buffer
  • use item

48
Producer / Consumer (with Counting Semaphores)
  • Producer
  • while (whatever) {
  •   locally generate item
  •   P(emptybuf);
  •   fill empty buffer with item
  •   V(fullbuf); }
  • Consumer
  • while (whatever) {
  •   P(fullbuf);
  •   get item from full buffer
  •   V(emptybuf);
  •   use item }

Semaphores: emptybuf initially N, fullbuf
initially 0
(not Linux syntax)
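
A runnable sketch of the same scheme, assuming POSIX unnamed semaphores (`sem_wait` for P, `sem_post` for V) and one producer with one consumer, so the buffer indices need no extra lock. The buffer size, item count, and `run_pc_demo` are illustrative choices.

```c
#include <pthread.h>
#include <semaphore.h>

#define NBUF 4          /* N buffers */
#define NITEMS 100

static int buffer[NBUF];
static sem_t emptybuf, fullbuf;
static long consumed_total;

static void *producer(void *arg)
{
    (void)arg;
    int in = 0;
    for (int item = 1; item <= NITEMS; item++) {   /* locally generate item */
        sem_wait(&emptybuf);                       /* P(emptybuf) */
        buffer[in] = item;                         /* fill empty buffer */
        in = (in + 1) % NBUF;
        sem_post(&fullbuf);                        /* V(fullbuf) */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    int out = 0;
    for (int n = 0; n < NITEMS; n++) {
        sem_wait(&fullbuf);                        /* P(fullbuf) */
        int item = buffer[out];                    /* get item from full buffer */
        out = (out + 1) % NBUF;
        sem_post(&emptybuf);                       /* V(emptybuf) */
        consumed_total += item;                    /* use item */
    }
    return NULL;
}

long run_pc_demo(void)
{
    pthread_t p, c;
    sem_init(&emptybuf, 0, NBUF);   /* emptybuf initially N */
    sem_init(&fullbuf, 0, 0);       /* fullbuf initially 0 */
    consumed_total = 0;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&emptybuf);
    sem_destroy(&fullbuf);
    return consumed_total;
}
```

With several producers or consumers, the index updates would additionally need a mutex around them.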
49
What does it mean that Semaphores have
persistence? Tweedledum and Tweedledee Problem
  • Separate threads executing their respective
    procedures. The code below is intended to cause
    them to forever take turns exchanging insults
    through the shared variable X in strict
    alternation.
  • The Sleep() and Wakeup() routines operate as
    follows
  • Sleep blocks the calling thread,
  • Wakeup unblocks a specific thread if that thread
    is blocked, otherwise its behavior is
    unpredictable
  • Linux: wait_for_completion() and complete()

50
The code shown above exhibits a well-known
synchronization flaw. Outline a scenario in which
this code would fail, and the outcome of that
scenario
  • void Tweedledum() {
  •   while (1) {
  •     Sleep();
  •     x = Quarrel(x);
  •     Wakeup(Tweedledee); } }
  • void Tweedledee() {
  •   while (1) {
  •     x = Quarrel(x);
  •     Wakeup(Tweedledum);
  •     Sleep(); } }

Lost Wakeup: If dee goes first, its wakeup is lost
(since dum isn't sleeping yet), and dee then goes to
sleep. Both sleep forever.
51
Show how to fix the problem by replacing the
Sleep and Wakeup calls with semaphore P (down)
and V (up) operations.
  • void Tweedledum() {
  •   while (1) {
  •     Sleep();
  •     x = Quarrel(x);
  •     Wakeup(Tweedledee); } }
  • void Tweedledee() {
  •   while (1) {
  •     x = Quarrel(x);
  •     Wakeup(Tweedledum);
  •     Sleep(); } }

semaphore dee = 0; semaphore dum = 0;
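
One possible answer, sketched with POSIX semaphores: each Sleep() becomes a P on the caller's own semaphore and each Wakeup(other) becomes a V on the other's semaphore. Since a V is remembered even when nobody is waiting, the wakeup can no longer be lost. The bounded round count, the stand-in for Quarrel, and the turn check are assumptions added so the sketch terminates and can be verified.

```c
#include <pthread.h>
#include <semaphore.h>

#define ROUNDS 1000

static sem_t dum, dee;        /* both initially 0 */
static int x;                 /* shared; incremented in strict alternation */
static int out_of_turn;       /* counts violations of the alternation */

static void *tweedledum(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        sem_wait(&dum);               /* was Sleep() */
        if (x % 2 != 1)
            out_of_turn++;            /* dee should have gone just before */
        x = x + 1;                    /* stands in for x = Quarrel(x) */
        sem_post(&dee);               /* was Wakeup(Tweedledee) */
    }
    return NULL;
}

static void *tweedledee(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        if (x % 2 != 0)
            out_of_turn++;            /* dum should have gone just before */
        x = x + 1;                    /* stands in for x = Quarrel(x) */
        sem_post(&dum);               /* was Wakeup(Tweedledum) */
        sem_wait(&dee);               /* was Sleep() */
    }
    return NULL;
}

/* Returns the final x, or -1 if the two ever ran out of turn. */
int run_tweedle_demo(void)
{
    pthread_t a, b;
    sem_init(&dum, 0, 0);
    sem_init(&dee, 0, 0);
    x = 0;
    out_of_turn = 0;
    pthread_create(&a, NULL, tweedledum, NULL);
    pthread_create(&b, NULL, tweedledee, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&dum);
    sem_destroy(&dee);
    return out_of_turn == 0 ? x : -1;
}
```

The order in which the two threads start no longer matters: if dee posts before dum waits, the count on `dum` simply stays at 1 until dum arrives.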
52
5 Dining Philosophers
53
Template for Philosopher
  • while (food available) {
  •   /* pick up forks */
  •   eat
  •   /* put down forks */
  •   think awhile }

54
Naive Solution
  • while (food available) {
  •   P(fork[left(me)]); P(fork[right(me)]);   /* pick up forks */
  •   eat
  •   V(fork[left(me)]); V(fork[right(me)]);   /* put down forks */
  •   think awhile }

Does this work?
55
Simplest Example of Deadlock
  • Thread 0
  • P(R1)
  • P(R2)
  • V(R1)
  • V(R2)
  • Thread 1
  • P(R2)
  • P(R1)
  • V(R2)
  • V(R1)

Interleaving: Thread 0 P(R1); Thread 1 P(R2); Thread 1 P(R1) waits; Thread 0 P(R2) waits
R1 and R2 initially 1 (binary semaphores)
56
Conditions for Deadlock
  • Mutually exclusive use of resources
  • Binary semaphores R1 and R2
  • Circular waiting
  • Thread 0 waits for Thread 1 to V(R2) and Thread
    1 waits for Thread 0 to V(R1)
  • Hold and wait
  • Holding either R1 or R2 while waiting on other
  • No pre-emption
  • Neither R1 nor R2 are removed from their
    respective holding Threads.

57
Philosophy 101(or why 5DP is interesting)
  • How to eat with your Fellows without causing
    Deadlock.
  • Circular arguments (the circular wait condition)
  • Not giving up on firmly held things (no
    preemption)
  • Infinite patience with Half-baked schemes (hold
    some wait for more)
  • Why Starvation exists and what we can do about it.

58
Dealing with Deadlock
  • It can be prevented by breaking one of the
    prerequisite conditions
  • Mutually exclusive use of resources
  • Example: allowing shared access to read-only
    files (readers/writers problem)
  • circular waiting
  • Example: define an ordering on resources and
    acquire them in order
  • hold and wait
  • no pre-emption

59
Circular Wait Condition
  • while (food available) {
  •   if (me ??? 0) { P(fork[left(me)]);
      P(fork[right(me)]); }
  •   else { P(fork[right(me)]); P(fork[left(me)]); }
  •   eat
  •   V(fork[left(me)]); V(fork[right(me)]);
  •   think awhile }
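
The comparison in the slide's if statement did not survive extraction, so the sketch below uses a closely related, explicitly different rule: every philosopher takes the lower-numbered fork first, which is the "define an ordering on resources" approach to breaking circular wait. POSIX mutexes stand in for the fork semaphores; `MEALS`, `philosopher`, and `run_philosophers` are illustrative names.

```c
#include <pthread.h>

#define NPHIL 5
#define MEALS 1000

static pthread_mutex_t fork_lock[NPHIL];
static int meals[NPHIL];

static void *philosopher(void *arg)
{
    int me = *(int *)arg;
    int left = me;
    int right = (me + 1) % NPHIL;
    /* Always take the lower-numbered fork first; only the last
       philosopher ends up reaching for the forks in reversed order. */
    int first = left < right ? left : right;
    int second = left < right ? right : left;
    for (int i = 0; i < MEALS; i++) {
        pthread_mutex_lock(&fork_lock[first]);
        pthread_mutex_lock(&fork_lock[second]);
        meals[me]++;                              /* eat */
        pthread_mutex_unlock(&fork_lock[second]);
        pthread_mutex_unlock(&fork_lock[first]);
        /* think awhile */
    }
    return NULL;
}

/* Returns the total number of meals eaten once everyone has finished. */
int run_philosophers(void)
{
    pthread_t t[NPHIL];
    int ids[NPHIL];
    int total = 0;
    for (int i = 0; i < NPHIL; i++) {
        pthread_mutex_init(&fork_lock[i], NULL);
        meals[i] = 0;
        ids[i] = i;
    }
    for (int i = 0; i < NPHIL; i++)
        pthread_create(&t[i], NULL, philosopher, &ids[i]);
    for (int i = 0; i < NPHIL; i++)
        pthread_join(t[i], NULL);
    for (int i = 0; i < NPHIL; i++) {
        total += meals[i];
        pthread_mutex_destroy(&fork_lock[i]);
    }
    return total;
}
```

Because no thread ever waits for a lower-numbered fork while holding a higher-numbered one, a cycle of waiting philosophers cannot form.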

60
Hold and Wait Condition
while (food available) {
    P(mutex);
    while (forks[me] != 2) {
        blocking[me] = true;
        V(mutex);
        P(sleepy[me]);
        P(mutex);
    }
    forks[leftneighbor(me)]--;
    forks[rightneighbor(me)]--;
    V(mutex);
    eat
    P(mutex);
    forks[leftneighbor(me)]++;
    forks[rightneighbor(me)]++;
    if (blocking[leftneighbor(me)]) {
        blocking[leftneighbor(me)] = false;
        V(sleepy[leftneighbor(me)]);
    }
    if (blocking[rightneighbor(me)]) {
        blocking[rightneighbor(me)] = false;
        V(sleepy[rightneighbor(me)]);
    }
    V(mutex);
    think awhile
}
61
Starvation
  • The difference between deadlock and starvation is
    subtle
  • Once a set of processes are deadlocked, there is
    no future execution sequence that can get them
    out of it.
  • In starvation, there does exist some execution
    sequence that is favorable to the starving
    process although there is no guarantee it will
    ever occur.
  • Rollback and Retry solutions are prone to
    starvation.
  • Continuous arrival of higher priority processes
    is another common starvation situation.

62
Readers/Writers Problem
  • Synchronizing access to a file or data record in
    a database such that any number of threads
    requesting read-only access are allowed but only
    one thread requesting write access is allowed,
    excluding all readers.

63
Template for Readers/Writers
  • Reader() {
  •   while (true) {
  •     /* request r access */
  •     read
  •     /* release r access */ } }
  • Writer() {
  •   while (true) {
  •     /* request w access */
  •     write
  •     /* release w access */ } }
64
Reader/Writer Spinlocks
  • Class of reader/writer problems
  • Multiple readers OK
  • Mutual exclusion for writers
  • No upgrade from reader lock to writer lock
  • Favors readers: starvation of writers possible
  • rwlock_t
  • read_lock,read_unlock
  • read_lock_irq // also unlock
  • read_lock_irqsave
  • read_unlock_irqrestore
  • write_lock,write_unlock
  • //_irq,_irqsave,_irqrestore
  • write_trylock
  • rw_is_locked

65
Reader/Writer Semaphores
  • All reader / writer semaphores are mutexes (usage
    count 1)
  • Multiple readers, solo writer
  • Uninterruptible sleep
  • Possible to downgrade writer to reader
  • down_read
  • down_write
  • up_read
  • up_write
  • downgrade_write
  • down_read_trylock
  • down_write_trylock

66
Semaphore Solution with Writer Priority
  • int readCount = 0, writeCount = 0;
  • semaphore mutex1 = 1, mutex2 = 1;
  • semaphore readBlock = 1;
  • semaphore writePending = 1;
  • semaphore writeBlock = 1;

67
  • Reader() {
  •   while (TRUE) {
  •     other stuff
  •     P(writePending);
  •     P(readBlock);
  •     P(mutex1);
  •     readCount = readCount + 1;
  •     if (readCount == 1)
  •       P(writeBlock);
  •     V(mutex1); V(readBlock);
  •     V(writePending);
  •     access resource
  •     P(mutex1);
  •     readCount = readCount - 1;
  •     if (readCount == 0)
  •       V(writeBlock);
  •     V(mutex1); } }
  • Writer() {
  •   while (TRUE) {
  •     other stuff
  •     P(mutex2);
  •     writeCount = writeCount + 1;
  •     if (writeCount == 1)
  •       P(readBlock);
  •     V(mutex2);
  •     P(writeBlock);
  •     access resource
  •     V(writeBlock);
  •     P(mutex2);
  •     writeCount = writeCount - 1;
  •     if (writeCount == 0)
  •       V(readBlock);
  •     V(mutex2); } }

71
  • Assume the writePending semaphore was omitted in
    the solution just given. What would happen?

This is supposed to give writers priority.
However, consider the following sequence:
  • Reader 1 arrives, executes through P(readBlock)
  • Reader 1 executes P(mutex1)
  • Writer 1 arrives, waits at P(readBlock)
  • Reader 2 arrives, waits at P(readBlock)
  • Reader 1 executes V(mutex1), then V(readBlock)
  • Reader 2 may now proceed: wrong
72
Birrell paper SRC Thread Primitives
  • SRC thread primitives
  • Thread Fork (procedure, args)
  • result Join (thread)
  • LOCK mutex DO critical section END
  • Wait (mutex, condition)
  • Signal (condition)
  • Broadcast (condition)
  • Acquire (mutex), Release (mutex) //more dangerous

73
Monitor Abstraction
  • Encapsulates shared data and operations with
    mutually exclusive use of the object (an associated
    lock).
  • Associated Condition Variables with operations of
    Wait and Signal.

[Figure: monitor with an entry queue, monitor_lock, operations init, enQ, and deQ, shared data, and condition notEmpty]
74
Condition Variables
  • We build the monitor abstraction out of a lock
    (for the mutual exclusion) and a set of
    associated condition variables.
  • Wait on condition: releases the lock held by the caller;
    the caller goes to sleep on the condition's queue.
    When awakened, it must reacquire the lock.
  • Signal condition wakes up one waiting thread.
  • Broadcast wakes up all threads waiting on this
    condition.

75
Monitor Abstraction
  • EnQ: acquire(lock);
  •   if (head == null) {
  •     head = item;
  •     signal(lock, notEmpty); }
  •   else tail->next = item;
  •   tail = item;
  •   release(lock);
  • deQ: acquire(lock);
  •   if (head == null)
  •     wait(lock, notEmpty);
  •   item = head;
  •   if (tail == head) tail = null;
  •   head = item->next;
  •   release(lock);

[Figure: monitor with an entry queue, monitor_lock, operations init, enQ, and deQ, shared data, and condition notEmpty]
80
Monitor Abstraction
  • EnQ: acquire(lock);
  •   if (head == null) {
  •     head = item;
  •     signal(lock, notEmpty); }
  •   else tail->next = item;
  •   tail = item;
  •   release(lock);
  • deQ: acquire(lock);
  •   while (head == null)
  •     wait(lock, notEmpty);
  •   item = head;
  •   if (tail == head) tail = null;
  •   head = item->next;
  •   release(lock);

[Figure: monitor with an entry queue, monitor_lock, operations init, enQ, and deQ, shared data, and condition notEmpty]
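
The queue monitor maps fairly directly onto a POSIX mutex plus condition variable. The sketch below keeps the slide's structure, including the `while` around the wait; the item type, the single-consumer demo, and `run_queue_demo` are assumptions. Because enQ signals only on the empty-to-nonempty transition, as on the slide, the sketch is exercised with one consumer; with several waiting consumers, signalling on every enQ would be the safer choice.

```c
#include <pthread.h>
#include <stdlib.h>

struct item { int value; struct item *next; };

static struct item *head, *tail;                       /* shared data */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t notEmpty = PTHREAD_COND_INITIALIZER;

static void enQ(int value)
{
    struct item *it = malloc(sizeof *it);
    if (it == NULL)
        abort();
    it->value = value;
    it->next = NULL;
    pthread_mutex_lock(&lock);
    if (head == NULL) {
        head = it;
        pthread_cond_signal(&notEmpty);
    } else {
        tail->next = it;
    }
    tail = it;
    pthread_mutex_unlock(&lock);
}

static int deQ(void)
{
    pthread_mutex_lock(&lock);
    while (head == NULL)                 /* re-check after every wakeup */
        pthread_cond_wait(&notEmpty, &lock);
    struct item *it = head;
    if (tail == head)
        tail = NULL;
    head = it->next;
    pthread_mutex_unlock(&lock);
    int value = it->value;
    free(it);
    return value;
}

static void *consumer(void *arg)
{
    long *total = arg;
    for (int i = 0; i < 100; i++)
        *total += deQ();
    return NULL;
}

long run_queue_demo(void)
{
    long total = 0;
    pthread_t c;
    pthread_create(&c, NULL, consumer, &total);
    for (int v = 1; v <= 100; v++)
        enQ(v);
    pthread_join(c, NULL);
    return total;
}
```

`pthread_cond_wait` releases the mutex while sleeping and reacquires it before returning, matching the Wait semantics described on the Condition Variables slide.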
81
The Critical Section Problem
while (1) {
    ...other stuff...
    acquire (mutex);
    critical section   // conceptually inside monitor
    release (mutex);
}
82
P/V using Locks and CV (Monitor)
  • P: acquire(lock);
  •   while (Sval == 0)
  •     wait(lock, nonZero);
  •   Sval = Sval - 1;
  •   release(lock);
  • V: acquire(lock);
  •   Sval = Sval + 1;
  •   signal(lock, nonZero);
  •   release(lock);

[Figure: monitor with an entry queue, lock, operations init, P, and V, shared data Sval, and condition nonZero]
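
The same construction in POSIX terms, as a sketch: a counting semaphore built from a mutex and a condition variable, then used as a binary semaphore to protect a counter. `struct semaphore`, `semaphore_init`, and `run_semaphore_demo` are names made up for this example.

```c
#include <pthread.h>

struct semaphore {
    int Sval;
    pthread_mutex_t lock;
    pthread_cond_t nonZero;
};

static void semaphore_init(struct semaphore *s, int initial)
{
    s->Sval = initial;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonZero, NULL);
}

static void P(struct semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->Sval == 0)
        pthread_cond_wait(&s->nonZero, &s->lock);
    s->Sval = s->Sval - 1;
    pthread_mutex_unlock(&s->lock);
}

static void V(struct semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    s->Sval = s->Sval + 1;
    pthread_cond_signal(&s->nonZero);
    pthread_mutex_unlock(&s->lock);
}

static struct semaphore mutex;   /* used as a binary semaphore */
static long counter;             /* protected by mutex */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        P(&mutex);
        counter++;               /* critical section */
        V(&mutex);
    }
    return NULL;
}

long run_semaphore_demo(void)
{
    pthread_t t1, t2;
    semaphore_init(&mutex, 1);
    counter = 0;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;
}
```

Unlike a bare condition variable, the value in `Sval` persists, so a V issued before the matching P is not lost.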
83
Design Decisions / Issues
  • Locking overhead (granularity)
  • Broadcast vs. Signal
  • Nested lock/condition variable problem
  • LOCK a DO LOCK b DO while (not_ready) wait
    (b, c) // releases b, not a
    END END
  • My advice: correctness first!

Unseen, in call
84
Using Condition Variables
  • while (!required_conditions) wait (m, c);
  • Why we use while, not if: the invariant is not
    guaranteed on wakeup
  • Why use broadcast vs. signal: the need can arise if we
    are using one condition queue for many reasons.
    Waking threads have to sort it out (spurious
    wakeups). Possibly better to separate into
    multiple conditions (but more complexity to
    code).

85
5DP - Monitor Style
  • Boolean eating[5];
  • Lock forkMutex;
  • Condition forksAvail;
  • void PickupForks (int i) {
  •   forkMutex.Acquire( );
  •   while (eating[(i-1)%5] || eating[(i+1)%5])
  •     forksAvail.Wait(forkMutex);
  •   eating[i] = true;
  •   forkMutex.Release( ); }
  • void PutdownForks (int i) {
  •   forkMutex.Acquire( );
  •   eating[i] = false;
  •   forksAvail.Broadcast(forkMutex);
  •   forkMutex.Release( ); }

86
What about this?
while (food available) {
    forkMutex.Acquire( );
    while (forks[me] != 2) {
        blocking[me] = true;
        forkMutex.Release( );
        sleep( );
        forkMutex.Acquire( );
    }
    forks[leftneighbor(me)]--;
    forks[rightneighbor(me)]--;
    forkMutex.Release( );
    eat
    forkMutex.Acquire( );
    forks[leftneighbor(me)]++;
    forks[rightneighbor(me)]++;
    if (blocking[leftneighbor(me)] || blocking[rightneighbor(me)])
        wakeup ( );
    forkMutex.Release( );
    think awhile
}
87
Template for Readers/Writers
  • Reader() {
  •   while (true) {
  •     fd = open(foo, 0);
  •     read
  •     close(fd); } }
  • Writer() {
  •   while (true) {
  •     fd = open(foo, 1);
  •     write
  •     close(fd); } }
88
Template for Readers/Writers
  • Reader() {
  •   while (true) {
  •     startRead();
  •     read
  •     endRead(); } }
  • Writer() {
  •   while (true) {
  •     startWrite();
  •     write
  •     endWrite(); } }
89
R/W - Monitor Style
  • Boolean busy = false;
  • int numReaders = 0;
  • Lock filesMutex;
  • Condition OKtoWrite, OKtoRead;
  • void startRead () {
  •   filesMutex.Acquire( );
  •   while (busy)
  •     OKtoRead.Wait(filesMutex);
  •   numReaders++;
  •   filesMutex.Release( ); }
  • void endRead () {
  •   filesMutex.Acquire( );
  •   numReaders--;
  •   if (numReaders == 0)
  •     OKtoWrite.Signal(filesMutex);
  •   filesMutex.Release( ); }
  • void startWrite() {
  •   filesMutex.Acquire( );
  •   while (busy || numReaders != 0)
  •     OKtoWrite.Wait(filesMutex);
  •   busy = true;
  •   filesMutex.Release( ); }
  • void endWrite() {
  •   filesMutex.Acquire( );
  •   busy = false;
  •   OKtoRead.Broadcast(filesMutex);
  •   OKtoWrite.Signal(filesMutex);
  •   filesMutex.Release( ); }

90
Issues
  • Locking overhead (granularity)
  • Broadcast vs. Signal and other causes of spurious
    wakeups
  • Nested lock/condition variable problem
  • LOCK a DO LOCK b DO while (not_ready) wait
    (b, c) // releases b, not a
    END END
  • Priority inversions

Unseen, in call
91
Spurious Wakeups
  • while (!required_conditions) wait (m, c);
  • Why we use while, not if: the invariant is not
    guaranteed on wakeup
  • Why use broadcast: we may be using one condition queue for
    many reasons. Waking threads have to sort it
    out. Possibly better to separate into multiple
    conditions (more complexity to code)

92
Tricks (mixed syntax)
  • if (some_condition) // as a hint
  • LOCK m DO
  • if (some_condition) //the truth
  • stuff
  • END

Cheap to get the info, but you must still check for
correctness; there is always a slow way
93
More Tricks
  • General pattern: while (!required_conditions)
    wait (m, c);
  • Broadcast works because waking up too many is OK
    (correctness-wise), although it has a performance impact.

LOCK m DO deferred_signal = true END
if (deferred_signal) signal (c)
Spurious lock conflicts: caused by signals
inside the critical section and threads waking up to
test the mutex before it gets released.
94
Alerts
  • Thread state contains a flag, alert-pending
  • Exception: alerted
  • Alert (thread)
  • sets alert-pending to true, wakes up a waiting thread
  • AlertWait (mutex, condition)
  • if alert-pending, sets it to false and raises the exception
  • else waits as usual
  • Boolean b = TestAlert()
  • tests and clears alert-pending
  • TRY while (empty) AlertWait (m, nonempty);
    return (nextchar())
  • EXCEPT Thread.Alerted:
  • return (eof)

95
Using Alerts
  • sibling = Fork (proc, arg);
  • while (!done) {
  •   done = longComp();
  •   if (done) Alert (sibling);
  •   else done = TestAlert(); }

96
Wisdom
  • Do's
  • Reserve using alerts for when you don't know what
    is going on
  • Only use if you forked the thread
  • Impose an ordering on lock acquisition
  • Write down invariants that should be true when
    locks aren't being held
  • Don'ts
  • Call into a different abstraction level while
    holding a lock
  • Move the last signal beyond scope of Lock
  • Acquire lock, fork, and let child release lock
  • Expect priority inheritance, since few
    implementations provide it
  • Pack data and expect fine grain locking to work

98
Proposed Algorithm for 2 Process Mutual Exclusion
  • Boolean flag[2];
  • proc (int i) {
  •   while (TRUE) {
  •     compute
  •     flag[i] = TRUE;
  •     while (flag[(i+1) mod 2]) ;
  •     critical section
  •     flag[i] = FALSE; } }
  • flag[0] = flag[1] = FALSE;
  • fork (proc, 1, 0);
  • fork (proc, 1, 1);
  • Is it correct?

Assume they go lockstep. Both set their
own flag to TRUE. Both busywait forever on the
other's flag -> deadlock.
99
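Greedy Version (turn = me)
Blue:  needin[blue] = true;
Green: needin[green] = true;
Blue:  turn = blue;
Blue:  while (needin[green] && turn == green)   // false, since turn == blue
Blue:  Critical Section
Green: turn = green;
Green: while (needin[blue] && turn == blue)     // false, since turn == green
Green: Critical Section
Oooops!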
100
Peterson's Algorithm for 2 Process Mutual
Exclusion
  • enter_region:
  •   needin[me] = true;
  •   turn = you;
  •   while (needin[you] && turn == you) { no_op };
  • exit_region:
  •   needin[me] = false;
  • What about more than 2 processes?

101
Can we extend 2-process algorithm to work with n
processes?
102
Can we extend 2-process algorithm to work with n
processes?
[Figure: tournament tree; at each node a pair of competitors runs the 2-process entry code (needin[me] = true; turn = you), and the overall winner enters the CS]
Idea: Tournament. Details: Bookkeeping (left to the
reader)
103
Lamport's Bakery Algorithm
  • enter_region:
    choosing[me] = true
    number[me] = max(number[0..n-1]) + 1
    choosing[me] = false
    for (j = 0; j <= n-1; j++) {
      while (choosing[j] != 0) skip
      while ((number[j] != 0) and
        ((number[j] < number[me]) or ((number[j] ==
        number[me]) and (j < me)))) skip }
  • exit_region:
    number[me] = 0
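A C11 sketch of the bakery entry and exit code follows. It assumes sequentially consistent atomics for `choosing` and `number` and uses `sched_yield()` where the slide says skip; `N`, `ROUNDS`, `customer`, and `run_bakery_demo` are illustrative names, and ticket numbers are assumed never to overflow an `int` during the run.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define N 4      /* number of competing threads */
#define ROUNDS 5000

static atomic_bool choosing[N];
static atomic_int number[N]; /* 0 means "not competing" */
static long shared_count;    /* protected by the bakery lock */

static void enter_region(int me) {
    /* Take a ticket one larger than any ticket currently visible. */
    atomic_store(&choosing[me], true);
    int max = 0;
    for (int j = 0; j < N; j++) {
        int seen = atomic_load(&number[j]);
        if (seen > max) max = seen;
    }
    int mine = max + 1;
    atomic_store(&number[me], mine);
    atomic_store(&choosing[me], false);

    for (int j = 0; j < N; j++) {
        while (atomic_load(&choosing[j]))
            sched_yield(); /* skip: j is still picking its ticket */
        for (;;) {
            int theirs = atomic_load(&number[j]);
            /* j goes first on a smaller ticket, or an equal ticket and lower id. */
            bool ahead = theirs != 0 &&
                (theirs < mine || (theirs == mine && j < me));
            if (!ahead) break;
            sched_yield(); /* skip */
        }
    }
}

static void exit_region(int me) {
    atomic_store(&number[me], 0);
}

static void *customer(void *arg) {
    int me = *(int *)arg;
    for (int r = 0; r < ROUNDS; r++) {
        enter_region(me);
        shared_count++; /* critical section */
        exit_region(me);
    }
    return NULL;
}

/* Runs N threads to completion and returns the final count. */
static long run_bakery_demo(void) {
    pthread_t t[N];
    int ids[N];
    shared_count = 0;
    for (int i = 0; i < N; i++) {
        ids[i] = i;
        atomic_store(&choosing[i], false);
        atomic_store(&number[i], 0);
    }
    for (int i = 0; i < N; i++) {
        int rc = pthread_create(&t[i], NULL, customer, &ids[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    return shared_count;
}
```

Two threads can draw the same ticket because the max computation is not atomic; the `(j < me)` tie-break is what keeps them from entering together.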

104
Interleaving / Execution Sequence with Bakery
Algorithm
Thread:    0      1      2      3
Choosing:  False  False  False  False
Number:    0      0      0      0
105
Thread:    0      1      2      3
Choosing:  True   True   True   False
Number:    0      0->1   0->1   0
106
for (j = 0; j <= n-1; j++) {
  while (choosing[j] != 0) skip
  while ((number[j] != 0) and ((number[j] < number[me]) or
    ((number[j] == number[me]) and (j < me)))) skip }
Thread:    0      1      2      3
Choosing:  True   False  False  False
Number:    0      1      1      0
107
for (j = 0; j <= n-1; j++) {
  while (choosing[j] != 0) skip
  while ((number[j] != 0) and ((number[j] < number[me]) or
    ((number[j] == number[me]) and (j < me)))) skip }
Thread:    0      1      2      3
Choosing:  False  False  False  False
Number:    2      1      1      0
108
for (j = 0; j <= n-1; j++) {
  while (choosing[j] != 0) skip
  while ((number[j] != 0) and ((number[j] < number[me]) or
    ((number[j] == number[me]) and (j < me)))) skip }
Thread:    0      1      2      3
Choosing:  False  False  False  True
Number:    2      1      1      3
109
for (j = 0; j <= n-1; j++) {
  while (choosing[j] != 0) skip
  while ((number[j] != 0) and ((number[j] < number[me]) or
    ((number[j] == number[me]) and (j < me)))) skip }
Thread 3 Stuck
Thread:    0      1      2      3
Choosing:  False  False  False  True
Number:    2      1      1      3
110
for (j = 0; j <= n-1; j++) {
  while (choosing[j] != 0) skip
  while ((number[j] != 0) and ((number[j] < number[me]) or
    ((number[j] == number[me]) and (j < me)))) skip }
Thread:    0      1      2      3
Choosing:  False  False  False  True
Number:    2      0      1      3
111
for (j = 0; j <= n-1; j++) {
  while (choosing[j] != 0) skip
  while ((number[j] != 0) and ((number[j] < number[me]) or
    ((number[j] == number[me]) and (j < me)))) skip }
Thread:    0      1      2      3
Choosing:  False  False  False  False
Number:    2      0      1      3
112
Hardware Assistance
  • Most modern architectures provide some support
    for building synchronization: atomic
    read-modify-write instructions.
  • Example: test-and-set (loc, reg)
  • [ sets bit to 1 in the new value of loc;
  • returns old value of loc in reg ]
  • Other examples:
  • compare-and-swap, fetch-and-op

[ ] notation means atomic
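C11's `<stdatomic.h>` exposes operations with these semantics, which makes the return values easy to check. The wrappers below are a sketch: the names `test_and_set`, `compare_and_swap`, `fetch_and_add`, and `demo_primitives` are chosen here for illustration, and the register operand from the slide becomes the function's return value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* test-and-set: [ old = *loc; *loc = 1; ] and hand back old. */
static int test_and_set(atomic_int *loc) {
    return atomic_exchange(loc, 1);
}

/* compare-and-swap: [ if (*loc == expected) *loc = desired; ].
 * Returns true when the store happened. */
static bool compare_and_swap(atomic_int *loc, int expected, int desired) {
    return atomic_compare_exchange_strong(loc, &expected, desired);
}

/* fetch-and-op with op = add: [ old = *loc; *loc = old + n; ]. */
static int fetch_and_add(atomic_int *loc, int n) {
    return atomic_fetch_add(loc, n);
}

/* Single-threaded walk through the return values; returns the final *loc. */
static int demo_primitives(void) {
    atomic_int loc;
    atomic_init(&loc, 0);
    int first = test_and_set(&loc);               /* loc was clear, now set */
    int second = test_and_set(&loc);              /* loc was already set */
    bool swapped = compare_and_swap(&loc, 1, 5);  /* 1 -> 5 */
    bool again = compare_and_swap(&loc, 1, 7);    /* loc is 5, so no store */
    int before = fetch_and_add(&loc, 2);          /* returns the old value */
    assert(first == 0 && second == 1);
    assert(swapped && !again);
    assert(before == 5);
    return atomic_load(&loc);
}
```

In every case the caller learns the old contents of the location, which is what lets it tell whether it won a race.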
113
Busywaiting with Test-and-Set
  • Declare a shared memory location to represent a
    busyflag on the critical section we are trying to
    protect.
  • enter_region (or acquiring the lock):
  • waitloop: tsl busyflag, R0 // R0 <- busyflag;
    busyflag <- 1
  • bnz R0, waitloop // was it already set?
  • exit_region (or releasing the lock):
  • busyflag <- 0
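The same loop can be written with C11's `atomic_flag`, whose `atomic_flag_test_and_set` sets the flag and returns its previous value. This is a sketch assuming POSIX threads; `adder`, `run_spinlock_demo`, `THREADS`, and `ROUNDS` are illustrative, and the `sched_yield()` in the wait loop is a courtesy for small machines, not part of the tsl/bnz sequence.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define THREADS 4
#define ROUNDS 10000

static atomic_flag busyflag = ATOMIC_FLAG_INIT; /* clear = lock is free */
static long total;                              /* protected by busyflag */

/* waitloop: tsl busyflag, R0 ; bnz R0, waitloop */
static void enter_region(void) {
    while (atomic_flag_test_and_set(&busyflag))
        sched_yield(); /* old value was set: someone else holds the lock */
}

/* busyflag <- 0 */
static void exit_region(void) {
    atomic_flag_clear(&busyflag);
}

static void *adder(void *unused) {
    (void)unused;
    for (int r = 0; r < ROUNDS; r++) {
        enter_region();
        total++; /* critical section */
        exit_region();
    }
    return NULL;
}

/* Runs all threads to completion and returns the final total. */
static long run_spinlock_demo(void) {
    pthread_t t[THREADS];
    total = 0;
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, adder, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    return total;
}
```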

114
Better Implementations from Multiprocessor Domain
  • Dealing with contention of Test&Set spinlocks
  • Don't execute test&set so much
  • Spin without generating bus traffic
  • Test&Set with Backoff
  • Insert delay between test&set operations (not too
    long)
  • Exponential seems good (k*c^i)
  • Not fair
  • Test-and-Test&Set
  • Spin (test) on local cached copy until it gets
    invalidated, then issue test&set
  • Intuition: No point in trying to set the location
    until we know that it's not set, which we can
    detect when it gets invalidated...
  • Still contention after invalidate
  • Still not fair
  • Analogies for Energy?
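Both ideas can be combined in one acquire loop: spin on a plain read, issue the test&set only when the lock looks free, and back off exponentially after losing. The sketch below assumes C11 atomics and POSIX threads; the constants `BASE_DELAY` (k), `GROWTH` (c), and `MAX_DELAY`, and the busy-loop `delay` helper, are arbitrary illustrative choices, not tuned values.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define THREADS 4
#define ROUNDS 10000
#define BASE_DELAY 16u  /* k */
#define GROWTH 2u       /* c */
#define MAX_DELAY 4096u /* cap so the delay is "not too long" */

static atomic_int lock_word; /* 0 = free, 1 = held */
static long total;           /* protected by lock_word */

/* Burn roughly `spins` loop iterations without touching shared memory. */
static void delay(unsigned spins) {
    for (volatile unsigned i = 0; i < spins; i++) { }
}

static void acquire(void) {
    unsigned wait = BASE_DELAY;
    for (;;) {
        /* Test: read-only spin, served from the local cache until invalidated. */
        while (atomic_load_explicit(&lock_word, memory_order_relaxed) != 0) { }
        /* Test&Set: only now issue the read-modify-write. */
        if (atomic_exchange_explicit(&lock_word, 1, memory_order_acquire) == 0)
            return;
        /* Lost the race after the invalidate: wait k * c^i, then retry. */
        delay(wait);
        if (wait < MAX_DELAY) wait *= GROWTH;
    }
}

static void release(void) {
    atomic_store_explicit(&lock_word, 0, memory_order_release);
}

static void *adder(void *unused) {
    (void)unused;
    for (int r = 0; r < ROUNDS; r++) {
        acquire();
        total++; /* critical section */
        release();
    }
    return NULL;
}

/* Runs all threads to completion and returns the final total. */
static long run_ttas_demo(void) {
    pthread_t t[THREADS];
    total = 0;
    atomic_store(&lock_word, 0);
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, adder, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    return total;
}
```

Nothing here orders the waiters, so a thread can still lose repeatedly, which matches the "Not fair" bullets.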

115
Blocking Synchronization
  • OS implementation involving changing the state of
    the waiting process from running to blocked.
  • Need some synchronization abstraction known to OS
    - provided by system calls.
  • mutex locks with operations acquire and release
  • semaphores with operations P and V (down, up)
  • condition variables with wait and signal

116
Template for Implementing Blocking Synchronization
  • Associated with the lock is a memory location
    (busy) and a queue for waiting threads/processes.
  • Acquire syscall:
  • while (busy) { enqueue caller on lock's queue }
  • /* upon waking to nonbusy lock */ busy = true
  • Release syscall:
  • busy = false
  • /* wakeup */ move any waiting threads to Ready
    queue
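A user-level analogue of this template can be built from POSIX primitives. This is a sketch, not a kernel implementation: the `blocking_lock` type and its field names are invented here, a `pthread_mutex_t` stands in for whatever protects the lock's own state inside the OS, and a condition variable stands in for the queue of blocked threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define THREADS 4
#define ROUNDS 10000

typedef struct {
    pthread_mutex_t guard;  /* protects busy (kernel-internal locking in a real OS) */
    pthread_cond_t waiters; /* stands in for the lock's queue of blocked threads */
    bool busy;
} blocking_lock;

static blocking_lock the_lock = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};
static long total; /* protected by the_lock */

static void acquire(blocking_lock *l) {
    pthread_mutex_lock(&l->guard);
    while (l->busy)
        pthread_cond_wait(&l->waiters, &l->guard); /* caller blocks, off the CPU */
    l->busy = true; /* upon waking to nonbusy lock */
    pthread_mutex_unlock(&l->guard);
}

static void release(blocking_lock *l) {
    pthread_mutex_lock(&l->guard);
    l->busy = false;
    pthread_cond_broadcast(&l->waiters); /* wakeup: waiters become ready */
    pthread_mutex_unlock(&l->guard);
}

static void *adder(void *unused) {
    (void)unused;
    for (int r = 0; r < ROUNDS; r++) {
        acquire(&the_lock);
        total++; /* critical section */
        release(&the_lock);
    }
    return NULL;
}

/* Runs all threads to completion and returns the final total. */
static long run_blocking_demo(void) {
    pthread_t t[THREADS];
    total = 0;
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, adder, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    return total;
}
```

The `while (busy)` re-check matters: a woken thread may find that another thread grabbed the lock first and must wait again.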

117
  • The Alpha and MIPS 4000 processor architectures
    have no atomic read-modify-write instructions,
    i.e., no test-and-set-lock instruction (TS).
    Atomic update is supported by pairs of
    load-locked (LDL) and store-conditional (STC)
    instructions.
  • The semantics of the Alpha architecture's LDL and
    STC instructions are as follows. Executing an LDL
    Rx, y instruction loads the memory at the
    specified address (y) into the specified general
    register (Rx), and holds y in a special
    per-processor lock register. STC Rx, y stores the
    contents of the specified general register (Rx)
    to memory at the specified address (y), but only
    if y matches the address in the CPU's lock
    register. If STC succeeds, it places a one in Rx;
    if it fails, it places a zero in Rx. Several
    kinds of events can cause the machine to clear
    the CPU lock register, including traps and
    interrupts. Moreover, if any CPU in a
    multiprocessor system successfully completes a
    STC to address y, then every other processor's
    lock register is atomically cleared if it
    contains the value y.
  • Show how to use LDL and STC to implement safe
    busy-waiting.

118
  • Acquire: LDL R1 flag /* R1 <- flag */
  • BNZ R1 Acquire /* if (R1 != 0) already locked */
  • LDI R2 1
  • STC R2 flag /* try to set flag */
  • BEZ R2 Acquire /* if STC failed, retry */
  • Release: LDI R2 0
  • STC R2 flag /* reset lock, breaking "lock" on
    flag */
  • BEZ R2 Error /* should never happen */
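C has no direct LDL/STC, but `atomic_compare_exchange_weak` is a close portable analogue: it is allowed to fail spuriously, much as an STC fails when the lock register has been cleared, and compilers for load-locked/store-conditional machines typically build it from such a pair. The sketch below is an approximation under that assumption, with hypothetical names `try_acquire`, `acquire`, `release`, and `demo_ll_sc_lock`; unlike the slide, release uses a plain atomic store instead of STC.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int flag; /* 0 = free, 1 = locked */

/* One pass of the Acquire sequence; false means "branch back to Acquire". */
static bool try_acquire(void) {
    int expected = 0;
    if (atomic_load(&flag) != 0) /* LDL R1 flag ; BNZ R1 Acquire */
        return false;
    /* LDI R2 1 ; STC R2 flag ; BEZ R2 Acquire.
     * The weak form may fail even when flag is 0, like a failed STC. */
    return atomic_compare_exchange_weak(&flag, &expected, 1);
}

static void acquire(void) {
    while (!try_acquire())
        sched_yield(); /* retry */
}

static void release(void) {
    atomic_store(&flag, 0);
}

/* Single-threaded check of the lock's state transitions. */
static bool demo_ll_sc_lock(void) {
    atomic_store(&flag, 0);
    acquire();                             /* loops past any spurious failure */
    bool second = try_acquire();           /* must fail: flag is already 1 */
    bool held = atomic_load(&flag) == 1;
    release();
    bool freed = atomic_load(&flag) == 0;
    acquire();                             /* the lock can be taken again */
    release();
    assert(!second);
    return !second && held && freed;
}
```

Because a single attempt can fail for reasons unrelated to the lock's value, callers should loop, which is the job of the `BEZ R2 Acquire` branch in the slide.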

119
Synchronization Primitives (Abstractions)
  • implementable by busywaiting or blocking

120
Knowing a Critical Section when you see one.
  • We've talked about solutions for protecting
    critical sections of code.
  • Before considering other problems or usage
    patterns for synchronization primitives, let's
    get some practice with recognizing critical
    sections that need such protection.