Title: Lecture 2
1 Lecture 2
- Topic of the second lecture: basic concepts, continued
- Recap: the topics of the first lecture were
- Introduction to the course
- Brief overview of the problem area
- Start of the basic concepts: fine- and coarse-grained
parallelism, etc.
2 Topics of the second lecture
- Simpler topics
- Architectures (essentially a repetition of parts of the
first lecture)
- Client-server interaction
- Transmission modes
- Metrics
- More complex topics
- Processes, programs and communication types
- Execution order
- Program properties: safety, liveness, fairness
- Mutual exclusion (transactions)
- Virtual time
- Data races
- Memory models
3 Client/Server (1-1)
4 Client/Server (1-N)
5 Example: Web proxy server
6 Client-Server interaction (IV)
7 Peer-to-Peer Coordination
8 Mobile Code Example: Applet
9 Client-Server interaction (I)
10 Client-Server interaction (II)
11 Client-Server interaction (III)
- Asynchronous remote procedure call
12 Transmission modes
- Simplex: traffic on the channel flows in one direction
only (computer -> monitor)
- Half-duplex: traffic on the channel can flow in both
directions, but not simultaneously; sometimes in one
direction, sometimes in the other (police radio)
- Duplex: traffic flows in both directions simultaneously
(telephone)
- Frequency-division
- Time-division
- Synchronous: the channel is divided into time frames.
Each frame has at least as many time slots as there are
logical I/O lines.
- Asynchronous: n lines, m slots per frame; m is chosen
based on statistical analysis.
13 Metrics
- Bandwidth (Mbps or MHz, depending on the context)
- Latency (the time to take a message from A to B;
sometimes round-trip, A-B-A)
- Components: propagation, transmit, queueing
14 Basic Paradigms
- Process: a unit of sequential instruction execution
- Program: a collection of processes
- Process communication: two different ways to go (both
sketched below)
- Shared memory; at the language level we find
- Shared variables
- Semaphores for synchronization
- Mutual exclusion, critical code, monitors/locks
- Message passing
- Local variables for each process
- Send/receive parameters and data
- Remote procedure call
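A minimal Python sketch (my illustration, not from the slides) contrasting the two paradigms: the first pair of threads communicates through a shared variable guarded by a lock, while the second pair keeps only local state and communicates by send/receive over a queue.

    import threading, queue

    # Shared-memory style: both threads access the same variable;
    # the lock provides the mutual exclusion mentioned above.
    counter = 0
    lock = threading.Lock()

    def add_one():
        global counter
        with lock:                     # critical section
            counter += 1

    # Message-passing style: each process keeps local variables and
    # communicates only via send/receive (here: a Queue as the channel).
    channel = queue.Queue()

    def producer():
        channel.put(41)                # "send"

    def consumer(out):
        out.append(channel.get() + 1)  # "receive", then purely local work

    out = []
    threads = [threading.Thread(target=add_one),
               threading.Thread(target=add_one),
               threading.Thread(target=producer),
               threading.Thread(target=consumer, args=(out,))]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter, out)                # 2 [42]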
15 Reality is Different from the Paradigm
- In shared memory, reading and writing are not atomic,
because of queues and caching effects.
- Message passing works by point-to-point hops and
packetization; there is no direct connection.
- The OS should present to the user one of the simpler
models, so that the user may assume everything works as
in the spec.
- More often than not the implementation is buggy, or
exposes details of a native view that differs from the
spec.
- Sometimes the model is made more complicated in order to
enhance performance and reduce communication (relaxed
consistency).
16 Common Types of Parallel Systems
(Arranged by communication efficiency (bandwidth, latency)
versus scalability / level of parallelism.)
- Multi-threading on a uni-processor (your home PC)
- Multi-threading on a multi-processor (SMP)
- Tightly-coupled parallel computer
(Compaq's ProLiant, SGI's Origin 2000,
IBM's SP/2, Cray's T3D)
- Distributed system (cluster)
- Internet computing (peer-to-peer)
- Traditionally, 1-2 are programmable using shared memory,
3-4 are programmable using message passing, and in 5 the
peer processes communicate with central control only.
- However, things change! Most importantly, recent systems
in 3 move towards presenting a shared memory interface to
a physically distributed system. Is this an indication of
the future?
17 Execution Order
- Process execution is asynchronous: there is no global
beat, no global clock. Each process has a different
execution speed, which may change over time. For an
observer, on the time axis, instruction execution is
ordered in an execution order. Any order is legal.
(Sometimes different processes may observe different
global orders; TBD.)
- The execution order of a single process is called its
program order.
[Figure: instructions of P1 and P2 laid out along the time
axis.]
18 Atomicity of Instruction Execution
Consider: P1 executes INC(i) and P2 executes INC(i).
Does i always end up incremented by 2?
- The atomicity model is important for answering the
question: is my parallel program correct? (See the sketch
below.)
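The danger can be made concrete with a small Python sketch (an illustration with names of my choosing; whether the lost update actually shows up depends on the interpreter's thread switching). Each INC(i) is really a read, an add and a write, and the two threads' triples may interleave.

    import threading

    i = 0

    def inc(times):
        global i
        for _ in range(times):
            tmp = i          # read
            tmp = tmp + 1    # add
            i = tmp          # write -- the other thread may have written in between

    t1 = threading.Thread(target=inc, args=(100_000,))
    t2 = threading.Thread(target=inc, args=(100_000,))
    t1.start(); t2.start(); t1.join(); t2.join()
    # If every INC(i) were atomic the result would always be 200000;
    # with the interleaving above, updates can be lost and i may be smaller.
    print(i)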
19 Program properties, or invariants
- Typically we are interested in
- Safety: bad things cannot happen
- Liveness: the program keeps working, and the necessary
things will eventually happen
- Fairness: if several processes run in parallel, everybody
gets some resources (time and memory)
20 Program Properties: Safety Properties
- Something bad cannot happen
- Are kept throughout the computation, always true
- If one does not hold, we will know within a finite number
of steps
- Example: deadlock freedom
- There is always a process that can execute another
instruction (however, it does not necessarily execute
it).
- Example: mutual exclusion
- It is not allowed for two given code regions (in two
different processes) to execute concurrently.
- Example: if x > y holds, then x > y holds for the rest of
the execution.
- However, mutual exclusion as stated above holds even if
the program does not allow any of the processes to
execute any of the code regions!
21 Liveness Properties
- Something good must happen (in a finite number of steps)
- Guarantee progress in the computation
- Example: no starvation
- Any process that wishes to execute an instruction will
eventually be able to execute it.
- Example: the program/process eventually terminates.
- Example: one of the processes will enter its critical
section.
- (Note the difference from deadlock freedom.)
22 Fairness Properties
- Liveness properties give a relatively weak guarantee of
access to a shared resource.
- Weak fairness: if a process waits on a certain request,
then eventually it will be granted.
- "Eventually" is not good enough for OS and real-time
systems, where response time counts.
- Strong fairness: if the process performs the request
sufficiently frequently, then eventually it will be
granted.
- Linear waiting: if a process performs the request, it
will be granted before any other process is granted
twice.
- FIFO: ... before granting any other process that asked
later.
- Easy to implement in a centralized system. However, in a
distributed system it is not clear what "before" or
"later" mean.
23 Mutual Exclusion
- N processes perform an infinite loop of an instruction
sequence which is composed of a critical section and a
non-critical section.
- Mutual exclusion property: instructions from the critical
sections of two or more processes must not be interleaved
in the (global observer's) execution order.
[Figure: instructions of P1 (marked x) and P2 (marked o)
along the time axis; the parenthesized runs are the critical
sections, which must not overlap.]
24 Mutual exclusion: the solution
- The solution is by way of additional instructions,
executed by every process that is about to enter or leave
its critical section:
- The pre_protocol
- The post_protocol
- Loop
- Non_critical_section
- Pre_protocol
- Critical_section
- Post_protocol
- End_loop
25 The solution must guarantee
- A process cannot stop for an indefinite time in the
critical_section or in the protocols. The solution must
ensure that such a stop in the non_critical_section by
one of the processes does not prevent the other processes
from entering the critical section.
- No deadlock. Several processes may be executing inside
their pre_protocols; eventually, one of them will succeed
in entering the critical_section.
- No starvation. If a process enters its pre_protocol with
the intention to enter the critical section, it will
eventually succeed.
- No self exclusion. In the absence of other processes
trying to enter the critical_section, a single process
will always succeed in doing so in a very short time.
26 Solution, try 1: give them a token to decide whose turn
it is
- Integer Turn := 1
- P1:
- begin
- loop
- non_crit_1
- loop
- exit when Turn = 1
- end loop
- crit_sec_1
- Turn := 2
- end loop
- end P1
P2:
begin
loop
non_crit_2
loop
exit when Turn = 2
end loop
crit_sec_2
Turn := 1
end loop
end P2
(Note: atomic Read/Write is assumed.)
27 Solution, try 2: let's give each process a variable it
can use to announce that it is in its crit_sec
- Integer C1 := 1, C2 := 1
- P1:
- Loop
- non_crit_sec_1
- loop
- exit when C2 = 1
- end loop
- C1 := 0
- crit_sec_1
- C1 := 1
- End Loop
P2:
Loop
non_crit_sec_2
loop
exit when C1 = 1
end loop
C2 := 0
crit_sec_2
C2 := 1
End Loop
Problem: no mutual exclusion. Execution example: P1 sees
C2 = 1; P2 sees C1 = 1; P1 sets C1 := 0; P2 sets C2 := 0; P1
enters its critical section; P2 enters its critical section.
28 Solution, try 3: let's set the announcing variable before
the loop
- Integer C1 := 1, C2 := 1
- P1:
- Loop
- non_crit_sec_1
- C1 := 0
- loop
- exit when C2 = 1
- end loop
- crit_sec_1
- C1 := 1
- End Loop
P2:
Loop
non_crit_sec_2
C2 := 0
loop
exit when C1 = 1
end loop
crit_sec_2
C2 := 1
End Loop
Problem: deadlock. Execution example: P1 sets C1 := 0; P2
sets C2 := 0; P1 checks C2 forever; P2 checks C1 forever.
29 Solution, try 4: let's allow the other process to enter
its crit_sec if we fail to do so
- Integer C1 := 1, C2 := 1
- P1:
- Loop
- non_crit_sec_1
- C1 := 0
- loop
- exit when C2 = 1
- C1 := 1
- C1 := 0
- end loop
- crit_sec_1
- C1 := 1
- End Loop
P2:
Loop
non_crit_sec_2
C2 := 0
loop
exit when C1 = 1
C2 := 1
C2 := 0
end loop
crit_sec_2
C2 := 1
End Loop
Can the other process enter between Ci := 1 and Ci := 0?
Problem: starvation. Between C1 := 1 and C1 := 0, P2 may
complete a full round. Problem: livelock.
30 Dekker's algorithm: let's give processes a priority token
that gives the holder the right of way when competing
- Integer C1 := 1, C2 := 1, Turn := 1
- P1:
- Loop
- non_crit_sec_1
- C1 := 0
- loop
- exit when C2 = 1
- if Turn = 2 then
- C1 := 1
- loop
- exit when Turn = 1
- end loop
- C1 := 0
- end if
- end loop
- crit_sec_1
- C1 := 1
- Turn := 2
- End Loop
P2:
Loop
non_crit_sec_2
C2 := 0
loop
exit when C1 = 1
if Turn = 1 then
C2 := 1
loop
exit when Turn = 2
end loop
C2 := 0
end if
end loop
crit_sec_2
C2 := 1
Turn := 1
End Loop
- The algorithm is correct!
- Suppose P1 is performing inside the insisting loop:
- If C2 = 0, then P1 knows P2 wants to enter its crit_sec.
- If, in addition, Turn = 2, then P1 gives the turn to P2
and waits for P2 to finish.
- Clearly, while P1 does all this, P2 itself will not give
up, because it is its Turn.
- All characteristics of a valid solution hold (see the
sketch below).
31 Bakery Algorithm: mutual exclusion for N processes
- Loop
- non_crit_sec_i
- choosing(i) := 1
- number(i) := 1 + max(number)
- choosing(i) := 0
- for j in 1..N loop
- if j /= i then
- loop
- exit when choosing(j) = 0
- end loop
- loop
- exit when
- number(j) = 0 or
- number(i) < number(j) or
- (number(i) = number(j) and i < j)
- end loop
- end if
- end loop
- crit_sec_i
- number(i) := 0
- End loop
Shared arrays: array(1..N) of integer: Choosing, Number.
Process Pi performs the code above; integer i is the process
id.
The idea is to have processes take tickets with numbers on
them (just like in the city hall or in health care). The
other processes give the turn to the process holding the
ticket with the minimal number (it got there first). If two
tickets happen to be the same, the process with the minimal
id enters. (A Python sketch follows below.)
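A Python sketch of the same algorithm (my transcription; thread ids start at 0 and the shared arrays are plain lists). It relies on the interpreter executing each individual load and store atomically, which is exactly the assumption the bakery algorithm is designed for.

    import sys, threading

    sys.setswitchinterval(0.0005)   # make the busy-wait loops yield sooner

    N = 3
    choosing = [0] * N
    number = [0] * N
    in_crit = 0                     # shared counter touched only inside the critical section

    def bakery(i, rounds):
        global in_crit
        for _ in range(rounds):
            choosing[i] = 1
            number[i] = 1 + max(number)        # take a ticket
            choosing[i] = 0
            for j in range(N):
                if j == i:
                    continue
                while choosing[j] == 1:        # wait until j has finished picking
                    pass
                # wait while j holds a ticket with higher priority
                # (smaller number, or equal number and smaller id)
                while number[j] != 0 and (number[j], j) < (number[i], i):
                    pass
            in_crit += 1                        # critical section
            number[i] = 0                       # return the ticket

    threads = [threading.Thread(target=bakery, args=(i, 300)) for i in range(N)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(in_crit)   # N * 300 = 900 expected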
32 Changing the rules of the game: increasing the atomicity
of load/store
- C: shared variable
- Bi: Pi's private variable
- TS (Test and Set): Bi := C
- C := 1
- CS (Compare and Swap):
- if Bi /= C then
- tmp := C
- C := Bi
- Bi := tmp
- end if
Loop
non_crit_sec_i
loop
TS(Bi)
exit when Bi = 0
end loop
crit_sec_i
C := 0
End loop
Such strong operations are usually supported by the
underlying hardware/OS. (A sketch with a simulated TS
follows below.)
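Python exposes no hardware test-and-set, so the sketch below models TS with a tiny helper whose body is made indivisible by an internal lock; everything around it follows the slide's spin-lock loop (names are mine).

    import sys, threading

    sys.setswitchinterval(0.0005)

    _ts_guard = threading.Lock()   # stands in for the hardware's atomicity of TS
    C = 0                          # shared variable: 0 = free, 1 = taken

    def test_and_set():
        """Atomically return the old value of C and set C to 1."""
        global C
        with _ts_guard:
            old, C = C, 1
            return old

    counter = 0

    def worker(rounds):
        global C, counter
        for _ in range(rounds):
            while test_and_set() != 0:   # spin until we observed C == 0
                pass
            counter += 1                 # critical section
            C = 0                        # release

    threads = [threading.Thread(target=worker, args=(1_000,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)   # 2000 expected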
33 The Price of Atomic load/store, or: why not simply always
use strong operations?
- The set of C must be seen immediately by all other
processors, in case they execute competing code. Since
communication between processors goes via the main
memory, it has to cut through the cache levels. Price:
dozens to hundreds of clock cycles, and growing.
[Figure: processors 1-3 with local caches/registers holding
B0 and B2, L2/L3 caches on the load/store path, and the
shared variable C in main memory; a TS must go all the way
down to C.]
34 Semaphores
- A semaphore is a special variable.
- After initialization, only two atomic operations are
applicable.
- Busy-Wait Semaphore:
- P(S) = WAIT(S): when S > 0 then S := S - 1
- V(S) = SIGNAL(S): S := S + 1
- Another definition, Blocked-Set Semaphore:
- WAIT(S): if S > 0 then S := S - 1
- else wait on S
- SIGNAL(S): if there are processes waiting on S,
- then let one of them proceed,
- else S := S + 1
NOTE: the load/store of S is embedded atomically in both
WAIT and SIGNAL. Thus, mutual exclusion using semaphores is
easy (see the sketch below).
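Python's threading.Semaphore behaves like the blocked-set semaphore above: acquire() is WAIT(S)/P(S) and release() is SIGNAL(S)/V(S). Initialized to 1 it is a binary semaphore, which gives mutual exclusion directly (a sketch, with names of my choosing):

    import threading

    S = threading.Semaphore(1)    # binary semaphore, initialized to 1

    shared = 0

    def worker(rounds):
        global shared
        for _ in range(rounds):
            S.acquire()           # WAIT(S): if S > 0 then S := S - 1, else block on S
            shared += 1           # critical section
            S.release()           # SIGNAL(S): wake a waiter, or S := S + 1

    threads = [threading.Thread(target=worker, args=(50_000,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(shared)   # 100000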
35 Virtual Time
- "Virtual Time and Global States of Distributed Systems",
Friedemann Mattern, 1989
- The model: an asynchronous distributed system is a set of
processes having no shared memory, communicating by
message transfer.
- Message delay > 0, but it is not known in advance.
- A global observer sees the global state at certain points
in time. It can be said to take a snapshot of the global
state.
- A local observer (one of the processes in the system)
sees the local state. Because of the asynchrony, a local
observer can only gather local views into an approximate
global view.
- This is a hard hazard for many management and control
problems:
- Mutual exclusion, deadlock detection, distributed
contracts, leader election, load sharing, checkpointing,
etc.
36 Solution Approaches
- Simulating a synchronous system by an asynchronous one.
This requires a high overhead on global synchronization
of each and every step.
- Simulation of a global state: a snapshot, taken
asynchronously, which is not necessarily correct for any
specific point in time, but is in a way consistent with
the local states of all processes.
- A logical clock, which is not global, but can be used to
derive useful global information. The system works
asynchronously, but the processes make sure to maintain
their part of the clock.
37 Events
- An event is a change in the process state.
- An event happens instantly; it does not take time.
- A process is a sequence of events.
- There are 3 types of events:
- send event: causes a message to be sent
- receive event: causes a message to be received
- local event: only causes an internal change of state
- Events correspond to each other as follows:
- All events in the same process happen sequentially, one
after the other.
- Each send event has a corresponding receive event.
- This allows us to define the "happened before" relation
among events.
38 The Happened Before Relation
We say that event e happened before event e' (and denote it
by e -> e' or e < e') if one of the following properties
holds:
- Process order: e precedes e' in the same process.
- Send-receive: e is a send and e' is the corresponding
receive.
- Transitivity: there exists e'' such that e < e'' and
e'' < e'.
Example
39 Independent/Concurrent Events
Two such diagrams are called equivalent when the "happened
before" relation is the same in both. (When the global time
differs for certain events, think of the process's execution
line as a rubber band.)
Two events e, e' are said to be independent or concurrent
(denoted by e || e') if neither e < e' nor e' < e.
40 Virtual Time (Lamport, 1978)
- A logical clock is a function C: E -> T
- E: a set of events; C(e): the timestamp of e
- T: a partially ordered set such that e < e' implies
C(e) < C(e')
- (The opposite is not necessarily true, e.g. for
concurrent events.)
- Commonly T = N, and there is a local clock Ci for each
process Pi.
- To meet the requirement, the clocks perform the following
protocol:
- (1) Just before executing a local event in Pi:
Ci := Ci + d (d > 0)
- (2) Each message m, sent by an event e = send(m), is
time-stamped t(m) = C(e).
- (3) Just before Pi receives a message with timestamp
t(m): Ci := max(Ci, t(m)) + d (d > 0)
Usually d = 1. However, d may change arbitrarily and
dynamically, say, to reflect actual time. The timestamp of
e, C(e), is given after advancing the clock, i.e. after rule
(1) above has already been performed for e. (A small sketch
follows below.)
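A minimal Python sketch of the protocol with d = 1 (the class and method names are mine):

    class LamportProcess:
        """One process Pi with its local logical clock Ci."""

        def __init__(self, d=1):
            self.clock = 0
            self.d = d

        def local_event(self):
            self.clock += self.d                 # rule (1): advance just before the event
            return self.clock                    # C(e)

        def send(self):
            return self.local_event()            # rule (2): t(m) = C(e) travels with m

        def receive(self, t_m):
            self.clock = max(self.clock, t_m) + self.d   # rule (3)
            return self.clock

    # P1 performs a local event and then sends a message to P2.
    p1, p2 = LamportProcess(), LamportProcess()
    p1.local_event()           # C1 = 1
    t = p1.send()              # C1 = 2, message stamped t(m) = 2
    print(p2.receive(t))       # C2 = max(0, 2) + 1 = 3, so send < receive is preserved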
41 Logical Clocks, Cont'd
Example
[Figure: process P1 with events e11, e12, e13 (C1 = 1, 2, 3),
process P2 with events e21, e22 (C2 = 1, 2), and process P3
with event e31 (C3 = 3).]
A problem: when e and e' are concurrent, any of
C(e) < C(e'), C(e') < C(e), C(e) = C(e') may hold. Thus,
when only the timestamps of the events are known, there is a
loss of information. We do know that C(e) < C(e') implies
not(e' < e). But we do not know whether e < e' or e || e'.
In particular, the information on whether the events are
independent is the most important, and unfortunately it is
lost.
42 What is a Data-Race?
- A data-race is an anomaly of concurrent accesses by two
or more threads to a shared variable, where at least one
of the accesses is a write.
- Example (variable X is global and shared, written out in
Python below):
- Thread 1: X = 1    Thread 2: T = Y
-           Z = 2              T = X
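The same race written out in Python (an illustration; which value Thread 2 reads from X depends on how the threads happen to interleave, which is exactly the non-determinism discussed on the next slide):

    import threading

    X, Y, Z = 0, 0, 0

    def thread1():
        global X, Z
        X = 1                    # write to the shared X ...
        Z = 2

    def thread2(out):
        out["from_Y"] = Y
        out["from_X"] = X        # ... races with this unsynchronized read

    out = {}
    t1 = threading.Thread(target=thread1)
    t2 = threading.Thread(target=thread2, args=(out,))
    t1.start(); t2.start(); t1.join(); t2.join()
    print(out)                   # from_X may be 0 or 1, run to run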
43 Why are Data-Races Undesired?
- Programs which contain data-races usually demonstrate
unexpected and even non-deterministic behavior.
- The outcome might depend on the specific execution order
(a.k.a. thread interleaving).
- Re-running the program may not always produce the same
results.
- Thus, it is hard to debug and hard to write correct
programs.
44 Why are Data-Races Undesired? An Example
- First interleaving: 1. X = 0 (Thread 1); 2. T = X
(Thread 2); 3. X++ (Thread 1)
- Second interleaving: 1. X = 0 (Thread 1); 2. X++
(Thread 1); 3. T = X (Thread 2)
- T = 0 or T = 1?
45 Execution Order
- Each thread has a different execution speed, which may
change over time.
- For an external observer of the time axis, instruction
execution is ordered in an execution order.
- Any order is legal.
- The execution order of a single thread is called its
program order.
46 How Can Data-Races be Prevented? Explicit Synchronization
- Idea: in order to prevent undesired concurrent accesses
to shared locations, we must explicitly synchronize
between threads.
- The means for explicit synchronization are:
- Locks, mutexes and critical sections
- Barriers
- Binary semaphores and counting semaphores
- Monitors
- Single-Writer/Multiple-Readers (SWMR) locks
- Others
47 Synchronization: Bad Bank Account Example
- Thread 1                    Thread 2
- Deposit( amount )           Withdraw( amount )
-   balance += amount           if (balance < amount)
-                                 print( "Error" )
-                               else
-                                 balance -= amount
- Deposit and Withdraw are not atomic!
- What is the final balance after a series of concurrent
deposits and withdraws?
48 Synchronization: Good Bank Account Example
- Thread 1                    Thread 2
- Deposit( amount )           Withdraw( amount )
-   Lock( m )                   Lock( m )
-   balance += amount           if (balance < amount)
-   Unlock( m )                   print( "Error" )
-                               else
-                                 balance -= amount
-                               Unlock( m )
- Since the critical sections can never execute
concurrently, this version exhibits no data-races. (A
Python version using threading.Lock follows below.)
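The same account written in Python, with threading.Lock playing the role of the mutex m (a sketch; the amounts and thread counts are arbitrary):

    import threading

    m = threading.Lock()
    balance = 0

    def deposit(amount):
        global balance
        with m:                      # Lock(m) ... Unlock(m)
            balance += amount

    def withdraw(amount):
        global balance
        with m:
            if balance < amount:
                print("Error")
            else:
                balance -= amount

    threads = ([threading.Thread(target=deposit, args=(10,)) for _ in range(100)] +
               [threading.Thread(target=withdraw, args=(10,)) for _ in range(100)])
    for t in threads: t.start()
    for t in threads: t.join()
    # No update is ever lost and the balance never goes negative;
    # withdrawals that found too little money only printed "Error".
    print(balance)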
49 Is This Enough?
- Theoretically: YES.
- Practically: NO.
- What if the programmer accidentally forgets to place the
correct synchronization?
- How can all such data-race bugs be detected in a large
program?
50 Can Data-Races be Easily Detected? No!
- Unfortunately, the problem of deciding whether a given
program contains potential data-races is computationally
hard!
- There are a lot of execution orders: for t threads of n
instructions each, the number of possible orders is about
t^(n*t). For example, 2 threads of 10 instructions each
already give about 2^20, roughly a million, orders.
- In addition to all the different schedulings, all
possible inputs should be tested as well.
- To compound the problem, inserting detection code into a
program can perturb its execution schedule enough to make
all the errors disappear.
51 Feasible Data-Races
- Feasible data-races: races that are based on the possible
behavior of the program (i.e. the semantics of the
program's computation).
- These are the actual (!) data-races that can possibly
happen in any specific execution.
- Locating feasible data-races requires fully analyzing the
program's semantics to determine whether the execution
could have allowed a and b (accesses to the same shared
variable) to execute concurrently.
52 Apparent Data-Races
- Apparent data-races: approximations (!) of feasible
data-races that are based only on the behavior of the
explicit synchronization performed by some feasible
execution (and not on the semantics of the program's
computation, i.e. ignoring all conditional statements).
- Important, since data-races are usually a result of
improper synchronization. Thus they are easier to detect,
but less accurate.
53 Why a Memory Model?
It answers the question: which writes by a process are seen
by which reads of the other processes?
54 Memory Consistency Models
Example program:
Pi: R(V)  W(V,7)   R(V)  R(V)
Pj: R(V)  W(V,13)  R(V)  R(V)
A consistency/memory model is an agreement between the
execution environment (H/W, OS, middleware) and the
processes. The runtime guarantees to the application certain
properties on the way values written to shared variables
become visible to reads. This determines the memory model:
what is valid, and what is not.
55 Memory Model: Coherence
- Coherence is the memory model in which (the runtime
guarantees to the program that) the writes performed by
the processes to every specific variable are viewed by
all processes in the same full order.
[Figure: an example program and all of its valid executions
under Coherence.]
Note: the view of a process consists of the values it sees
in its reads and the writes it performs. Thus, if an R(V) in
P that comes later than a W(V,x) in P sees a value different
from x, then a later R(V) cannot see x.
56 Formal definition of Coherence
- Program order: the order in which instructions appear in
each process. This is a partial order on all the
instructions in the program.
- A serialization: a full order on all the instructions
(reads/writes) of all the processes which is consistent
with the program order.
- A legal serialization: a serialization in which each read
of X returns the value written by the latest write to X
in the full order.
- Let P be a program, and let P|X be the sub-program of P
which contains all the read X / write X operations on X
only.
- Coherence: P is said to be coherent if for every variable
X there exists a legal serialization of P|X. (Note: a
process cannot distinguish one such serialization from
another for a given execution.)
57 Examples
Process 2: read y,1; write x,1
Coherent. Serializations: for x: write x,1 -> read x,1; for
y: write y,1 -> read y,1.
Process 1: read x,1; write x,2
Process 2: read x,2; write x,1
Not coherent: a cycle of dependencies; cannot be serialized.
Not coherent: cannot be serialized.
58 Sequential Consistency (Lamport, 1979)
- Sequential Consistency is the memory model in which all
reads/writes performed by the processes are viewed by all
processes in the same full order.
[The two example executions on the slide are each coherent
but not sequentially consistent.]
59 Strict (Strong) Memory Models
- Sequential Consistency: given an execution, there exists
an order of all reads/writes which is consistent with all
program orders.
- Coherence: for any variable x, there exists an order of
read x / write x consistent with all program orders.
60 Formal definition of Sequential Consistency
- Let P be a program.
- Sequential Consistency: P is said to be sequentially
consistent if there exists a legal serialization of all
the reads/writes in P.
Observation: every program which is sequentially consistent
is also coherent.
Conclusion: Sequential Consistency has stronger
requirements, and we thus say that it is stronger than
Coherence.
In general: a consistency model A is said to be (strictly)
stronger than B if all executions which are valid under A
are also valid under B.
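The strength of Sequential Consistency can be made concrete with the classic store-buffering test: P1 does x := 1; r1 := y while P2 does y := 1; r2 := x, with x = y = 0 initially. The sketch below (my illustration, not from the slides) enumerates every interleaving that respects both program orders and shows that the outcome r1 = r2 = 0 never appears; a model weaker than Sequential Consistency may permit it.

    from itertools import permutations

    def run(order):
        """Execute one sequentially consistent interleaving and return (r1, r2)."""
        mem = {"x": 0, "y": 0}                        # shared variables
        regs = {"r1": None, "r2": None}               # per-process results
        prog = {
            "P1": [lambda: mem.update(x=1),           # x := 1
                   lambda: regs.update(r1=mem["y"])], # r1 := y
            "P2": [lambda: mem.update(y=1),           # y := 1
                   lambda: regs.update(r2=mem["x"])], # r2 := x
        }
        pc = {"P1": 0, "P2": 0}
        for pid in order:                             # one global full order
            prog[pid][pc[pid]]()
            pc[pid] += 1
        return regs["r1"], regs["r2"]

    orders = set(permutations(["P1", "P1", "P2", "P2"]))   # all orders consistent with program order
    print(sorted({run(o) for o in orders}))
    # [(0, 1), (1, 0), (1, 1)] -- (0, 0) is impossible under Sequential Consistency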