Title: Distributed Operating Systems CS551
1 Distributed Operating Systems CS551
- Colorado State University
- at Lockheed-Martin
- Lecture 6 -- Spring 2001
2 CS551 Lecture 6
- Topics
- Distributed Process Management (Chapter 7)
- Distributed Scheduling Algorithm Choices
- Scheduling Algorithm Approaches
- Coordinator Elections
- Orphan Processes
- Distributed File Systems (Chapter 8)
- Distributed Name Service
- Distributed File Service
- Distributed Directory Service
3 Distributed Deadlock Prevention
- Assign each process a global timestamp when it starts
- No two processes should have the same timestamp
- Basic idea: "When one process is about to block waiting for a resource that another process is using, a check is made to see which has a larger timestamp (i.e., is younger)." (Tanenbaum, DOS, 1995)
4 Distributed Deadlock Prevention
- Somehow put a timestamp on each process, representing its creation time
- Suppose a process needs a resource already owned by another process
- Determine the relative ages of both processes
- Decide whether the waiting process should preempt, wait, die, or wound the owning process
- Two different algorithms
5 Distributed Deadlock Prevention
- Allow a wait only if the waiting process is older
- Since timestamps increase along any chain of waiting processes, cycles are impossible
- Or allow a wait only if the waiting process is younger
- Here timestamps decrease along any chain of waiting processes, so cycles are again impossible
- Wiser to give older processes priority (a sketch of both rules follows)
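Both rules reduce to a comparison of creation timestamps. A minimal sketch (Python, illustrative only, not code from Galli or Tanenbaum); the 54/79 pair matches the examples on the next two slides:

```python
# Illustrative decision functions for the two schemes.
# Lower timestamp = older process (created earlier).

def wait_die(requester_ts: int, owner_ts: int) -> str:
    """Non-preemptive: an older requester waits; a younger one dies."""
    if requester_ts < owner_ts:   # requester is older
        return "wait"
    return "die"                  # requester is younger: abort, retry later

def wound_wait(requester_ts: int, owner_ts: int) -> str:
    """Preemptive: an older requester wounds (preempts) the owner."""
    if requester_ts < owner_ts:   # requester is older
        return "wound"            # preempt the younger owner
    return "wait"                 # requester is younger: safe to wait

# The 54/79 pair used in the examples on the next two slides:
assert wait_die(54, 79) == "wait" and wait_die(79, 54) == "die"
assert wound_wait(54, 79) == "wound" and wound_wait(79, 54) == "wait"
```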
6 Example: wait-die algorithm
- Process 54 (older) wants a resource held by process 79 (younger): 54 waits
- Process 79 (younger) wants a resource held by process 54 (older): 79 dies
7 Example: wound-wait algorithm
- Process 54 (older) wants a resource held by process 79 (younger): 54 preempts (wounds) 79
- Process 79 (younger) wants a resource held by process 54 (older): 79 waits
8 Algorithm Comparison
- Wait-die kills the young process
- When the young process restarts and requests the resource again, it is killed once more
- The less efficient of the two algorithms
- Wound-wait preempts the young process
- When the young process re-requests the resource, it has to wait for the older process to finish
- The better of the two algorithms
9 Figure 7.7 The Bully Algorithm. (Galli, p. 169)
10 Process Management in a Distributed Environment
- Processes in a Uniprocessor
- Processes in a Multiprocessor
- Processes in a Distributed System
- Why scheduling is needed
- Scheduling priorities
- How to schedule
- Scheduling algorithms
11 Distributed Scheduling
- Basically resource management
- Want to distribute the processing load among the processing elements (PEs) in order to maximize performance
- Consider several homogeneous processing elements on a LAN with equal average workloads
- The workload may still not be evenly distributed
- Some PEs may have idle cycles
12 Efficiency Metrics
- Communication cost
- Low if very little or no communication is required
- Low if all communicating processes are
- on the same PE
- not distant (small number of hops)
- Execution cost
- Relative speed of the PE
- Relative location of needed resources
- Type of
- operating system
- machine code
- architecture
13 Efficiency Metrics, continued
- Resource Utilization
- May be based upon
- Current PE loads
- Load status (state)
- Resource queue lengths
- Memory usage
- Other resource availability
14 Level of Scheduling
- When should a process run locally, and when should it be sent to an idle PE?
- Local Scheduling
- Allocate the process to the local PE
- Review Galli, Chapter 2, for more information
- Global Scheduling
- Choose which PE executes which process
- Also called process allocation
- Precedes the local scheduling decision
15 Figure 7.1 Scheduling Decision Chart. (Galli, p. 152)
16 Distribution Goals
- Load Balancing
- Tries to maintain an equal load throughout the system
- Load Sharing
- Simpler
- Tries to prevent any PE from becoming too busy
17 Load Balancing / Load Sharing
- Load Balancing
- Try to equalize loads at PEs
- Requires more information
- More overhead
- Load Sharing
- Avoid having an idle PE if there is work to do
- Anticipating Transfers
- Avoid PE idle wait while a task is coming
- Get a new task just before PE becomes idle
18 Figure 7.2 Load Distribution Goals. (Galli, p. 153)
19 Processor Allocation Algorithms
- Assume virtually identical PEs
- Assume PEs fully interconnected
- Assume processes may spawn children
- Two strategies
- Non-migratory
- static binding
- non-preemptive
- Migratory
- dynamic binding
- preemptive
20 Processor Allocation Strategies
- Non-migratory (static binding, non-preemptive)
- Transfer before the process starts execution
- Once assigned to a machine, the process stays there
- Migratory (dynamic binding, preemptive)
- Processes may move after execution begins
- Better load balancing
- Expensive: must collect and move the entire process state
- More complex algorithms
21 Efficiency Goals
- Optimal
- Completion time
- Resource Utilization
- System Throughput
- Any combination thereof
- Suboptimal
- Suboptimal Approximate
- Suboptimal Heuristic
22 Optimal Scheduling Algorithms
- Require the state of all competing processes
- The scheduler must have access to all related information
- Optimization is a hard problem
- Usually NP-hard for multiple processors
- Thus, consider
- Suboptimal approximate solutions
- Suboptimal heuristic solutions
23 Suboptimal Approximate Solutions
- Similar to Optimal Scheduling algorithms
- Try to find good solutions, not perfect solutions
- Searches are limited
- Include intelligent shortcuts
24 Suboptimal Heuristic Solutions
- Heuristics
- Employ rules of thumb
- Employ intuition
- May not be provable
- Generally considered to work in an acceptable manner
- Examples
- If a PE has a heavy load, don't give it more to do
- Locality of reference for related processes, data
25 Figure 7.1 Scheduling Decision Chart. (Galli, p. 152)
26 Types of Load Distribution Algorithms
- Static
- Decisions are hard-wired in
- Dynamic
- Use system state information to make decisions
- Overhead of keeping track of that information
- Adaptive
- A type of dynamic algorithm
- May work differently at different loads
27 Load Distribution Algorithm Issues
- Transfer Policy
- Selection Policy
- Location Policy
- Information Policy
- Stability
- Sender-Initiated versus Receiver-Initiated
- Symmetrically-Initiated
- Adaptive Algorithms
28 Load Dist. Algs. Issues, cont.
- Transfer Policy
- When is it appropriate to move a task? (a threshold sketch follows this slide)
- If load at the sending PE > threshold
- If load at the receiving PE < threshold
- Location Policy
- Find a receiver PE
- Methods
- Broadcast messages
- Polling: random, neighbors, recent candidates
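A minimal sketch of the threshold-based transfer policy above; the queue-length load measure and the threshold value are illustrative assumptions:

```python
# Illustrative threshold transfer policy: a PE becomes a sender when its
# load rises above a threshold and a candidate receiver when it falls
# below. The queue-length measure and value are assumed for the sketch.
THRESHOLD = 4  # assumed value, tuned per system

def is_sender(queue_length: int) -> bool:
    return queue_length > THRESHOLD    # overloaded: try to move a task out

def is_receiver(queue_length: int) -> bool:
    return queue_length < THRESHOLD    # underloaded: can accept a task
```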
29 Load Dist. Algs. Issues, cont.
- Selection Policy
- Which task should migrate?
- Simple
- Select new tasks
- Non-preemptive
- Criteria
- Cost of transfer
- should be covered by the reduction in response time
- Size of the task
- Number of dependent system calls (ones that must use the local PE)
30 Load Dist. Algs. Issues, cont.
- Information Policy
- What information should be collected?
- When? From whom? By whom?
- Demand-driven
- Get info when a PE becomes a sender or receiver
- Sender-initiated: senders look for receivers
- Receiver-initiated: receivers look for senders
- Symmetrically-initiated: either of the above
- Periodic: at fixed time intervals, not adaptive
- State-change-driven
- Nodes send info when their state changes (rather than being solicited)
31 Load Dist. Algs. Issues, cont.
- Stability
- Queuing-Theoretic view
- Stable: (arrival load + overhead load) < system capacity
- Effective: using the algorithm gives better performance than not doing load distribution
- An effective algorithm cannot be unstable
- A stable algorithm can be ineffective (overhead)
- Algorithmic Stability
- E.g., performing overhead operations but making no forward progress
- E.g., moving a task from PE to PE, only to learn that it increases the PE workload enough that it needs to be transferred again
33 Load Dist Algs: Sender-Initiated
- The sender PE thinks it is overloaded
- Transfer Policy
- Threshold (T) based on PE CPU queue length (QL)
- Sender: QL > T
- Receiver: QL < T
- Selection Policy
- Non-preemptive
- Allows only new tasks
- Long-lived tasks make this policy worthwhile
34 Load Dist Algs: Sender-Initiated
- Location Policy (3 different policies)
- Random
- Select a receiver at random
- Useless or wasted effort if the destination is loaded
- Want to avoid transferring the same task from PE to PE to PE
- Include a limit on the number of transfers
- Threshold
- Poll PEs at random
- If a receiver is found, send the task to it
- Limit the search to a poll limit
- If the limit is hit, keep the task on the current PE (see the polling sketch below)
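A sketch of the Threshold location policy under stated assumptions: poll is a stand-in RPC returning a remote PE's CPU queue length, and POLL_LIMIT and T are invented values:

```python
import random

# Sketch of the sender-initiated Threshold location policy.
POLL_LIMIT = 5
T = 4

def find_receiver(peers: list[str], poll) -> str | None:
    """Poll up to POLL_LIMIT random PEs; return the first one below T."""
    for pe in random.sample(peers, min(POLL_LIMIT, len(peers))):
        if poll(pe) < T:   # under-threshold PE found: send the task here
            return pe
    return None            # poll limit hit: keep the task on the current PE
```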
35 LDAs: Sender-Initiated
- Location Policy (3 different policies, cont.)
- Shortest
- Poll a random set of PEs
- Choose the PE with the shortest queue length
- Only a little better than the Threshold location policy
- Not worth the additional work
36 LDAs: Sender-Initiated
- Information Policy
- Demand-driven
- Info gathered after a PE identifies itself as a sender
- Stability
- At high load, a PE might not find a receiver
- The polling will be wasted
- Polling increases the load on the system
- Could lead to instability
37 LDAs: Receiver-Initiated
- The receiver is trying to find work
- Transfer Policy
- If local QL < T, try to find a sender
- Selection Policy
- Non-preemptive preferred
- But there may not be any new (unstarted) tasks
- Worth the effort anyway
38 LDAs: Receiver-Initiated
- Location Policy
- Select a PE at random
- If taking a task does not move that PE's load below the threshold, take it
- If no luck after trying Poll Limit times,
- Wait until another task completes, or
- Wait another time period
- Information Policy
- Demand-driven (see the sketch below)
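A sketch of the receiver-initiated location policy under the same assumptions: poll and take_task are stand-in RPCs, and POLL_LIMIT and T are invented values:

```python
import random

# Sketch of the receiver-initiated location policy.
POLL_LIMIT = 5
T = 4

def find_work(peers: list[str], poll, take_task) -> bool:
    """Take a task only if removing it leaves the polled PE at or above T."""
    for pe in random.sample(peers, min(POLL_LIMIT, len(peers))):
        if poll(pe) - 1 >= T:   # transfer won't push pe below the threshold
            take_task(pe)
            return True
    return False                # no luck: wait for a completion or a timer
```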
39 LDAs: Receiver-Initiated
- Stability
- Tends to be stable
- At high load, a sender should be found quickly
- Problem
- Transfers tend to be preemptive
- Tasks on the sender node have already started
40 LDAs: Symmetrically-Initiated
- Both senders and receivers can search for tasks to transfer
- Has both the advantages and disadvantages of the two previous methods
- The above-average algorithm
- Tries to keep the load at each PE at an acceptable level
- Aiming for the exact average can cause thrashing
41 LDAs: Symmetrically-Initiated
- Transfer Policy
- Each PE
- Estimates the average load
- Sets both an upper and a lower threshold, equidistant from its estimate
- If load > upper, the PE acts as a sender
- If load < lower, the PE acts as a receiver (see the sketch below)
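The transfer policy can be sketched as a three-way classification; the half-width DELTA around the average estimate is an assumed tuning value:

```python
# Sketch of the symmetric transfer policy: upper and lower thresholds sit
# an equal distance DELTA from the PE's own estimate of the average load.
DELTA = 2  # assumed half-width

def classify(load: float, avg_estimate: float) -> str:
    if load > avg_estimate + DELTA:
        return "sender"      # overloaded: search for a receiver
    if load < avg_estimate - DELTA:
        return "receiver"    # underloaded: search for a sender
    return "ok"              # inside the acceptable band: do nothing
```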
42 LDAs: Symmetrically-Initiated
- Location Policy
- Sender-initiated
- The sender broadcasts a TooHigh message and sets a timeout
- A receiver sends an Accept message, clears its timeout, increases its load value, and sets a timeout
- If the sender still wants to send when the Accept message arrives, it sends the task
- If the sender gets a TooLow message before an Accept, it sends the task
- If the sender's TooHigh timeout expires with no Accept
- The average estimate is too low
- It broadcasts a ChangeAvg message to all PEs
43 LDAs: Symmetrically-Initiated
- Location Policy
- Receiver-initiated
- The receiver sends a TooLow message and sets a timeout
- The rest is the converse of the sender-initiated algorithm
- Selection Policy
- Use a reasonable policy
- Non-preemptive, if possible
- Low cost
44 LDAs: Symmetrically-Initiated
- Information Policy
- Demand-driven
- Determined at each PE
- Low overhead
45 LDAs: Adaptive
- Stable Symmetrically-Initiated
- The previous instability was due to too much polling by the sender
- Each PE keeps lists of the other PEs, sorted into three categories
- Sender (overloaded)
- Receiver (underloaded)
- OK
- At the start, each PE has all other PEs on its receiver list
46 LDAs: Adaptive
- Transfer Policy
- Based on PE CPU queue length
- Low threshold (LT) and high threshold (HT)
- Selection Policy
- Sender-initiated: only sends new tasks
- Receiver-initiated: takes any task
- Trying for low cost
- Information Policy
- Demand-driven; polling replies maintain the lists
47 LDAs: Adaptive
- Location Policy
- Receiver-initiated
- Order of polling (see the sketch after this slide)
- Sender list: head to tail (newest info first)
- OK list: tail to head (most out-of-date info first)
- Receiver list: tail to head
- When a PE becomes a receiver (QL < LT)
- It starts polling
- If it finds a sender, a transfer happens
- Else it uses the replies to update its lists
- It continues until
- It finds a sender
- It is no longer a receiver
- It hits the Poll Limit
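A sketch of the polling order, with poll again a stand-in RPC and invented threshold names; reclassifying PEs onto the correct list as replies arrive is elided:

```python
from itertools import chain, islice

# Sketch of the receiver's polling order: sender list head-to-tail (newest
# information first), then OK and receiver lists tail-to-head (stalest
# first, since stale entries are the likeliest to have become senders).
POLL_LIMIT = 5

def find_sender(senders, ok, receivers, poll, high_threshold) -> str | None:
    order = chain(senders, reversed(ok), reversed(receivers))
    for pe in islice(order, POLL_LIMIT):
        if poll(pe) > high_threshold:   # pe is a sender: transfer from it
            return pe
        # otherwise the reply is used to move pe to the right list (omitted)
    return None
```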
48 LDAs: Adaptive
- Notes
- At high loads, activity is sender-initiated, but the sender will soon have an empty receiver list, so there is no polling
- So activity shifts to receiver-initiated
- At low loads, receiver-initiated polling usually fails
- But the overhead doesn't matter at low load
- And the lists get updated
- So sender-initiated transfers should work quickly
49 Load Scheduling Algorithms (Galli)
- Usage Points
- Charged for using remote PEs, resources
- Graph Theory
- Minimum cutset of assignment graph
- Maximum flow of graph
- Probes
- Messages to locate available, appropriate PEs
- Scheduling Queues
- Stochastic Learning
50 Figure 7.3 Usage Points. (Galli, p. 158)
51 Figure 7.4 Economic Usage Points. (Galli, p. 159)
52 Figure 7.5 Two-Processor Min-Cut Example. (Galli, p. 161)
53 Figure 7.6 A Station with Run Queues and Hints. (Galli, p. 164)
54 CPU Queue Length as Metric
- PE queue length correlates well with response time
- Easy to measure
- Caution
- When accepting a new migrating process, increment the queue length right away
- A time-out may be needed in case the process never arrives (see the sketch below)
- PE queue length does not correlate well with PE utilization
- A daemon to monitor PE utilization adds overhead
55 Election Algorithms
- The Bully algorithm (Garcia-Molina, 1982)
- A ring election algorithm
56 Bully Algorithm
- Each processor has a unique number
- One processor notices that the leader/server is missing
- It sends messages to all other processors
- Requesting to be appointed leader
- Including its own processor number
- Processors with higher (or, under the opposite convention, lower) numbers can bully the first processor
57 Figure 7.7 The Bully Algorithm. (Galli, p. 169)
58 Bully Algorithm, continued
- The initial processor need only send election messages to the higher- (or lower-) numbered processors
- Any processors that respond effectively tell the first processor that they overrule it and that it is out of the running
- Those processors then start sending election messages to the remaining top processors (a sketch follows)
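A minimal sketch of one processor's election round, assuming the higher-number-wins convention; send_election is a stand-in that returns True when the polled processor is alive and responds:

```python
# Minimal sketch of one round of the bully algorithm.

def call_election(me: int, processors: list[int], send_election) -> bool:
    """Return True if `me` wins (no live higher-numbered processor)."""
    higher = [p for p in processors if p > me]
    responders = [p for p in higher if send_election(me, p)]
    return not responders   # each responder then runs its own election

# The example on the next slides: processors 0..5, leader 5 has crashed.
alive = {0, 1, 2, 3, 4}
respond = lambda me, p: p in alive
assert call_election(2, list(range(6)), respond) is False  # 3 and 4 respond
assert call_election(4, list(range(6)), respond) is True   # 4 becomes leader
```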
59 Bully Example
- Six processors, numbered 0 through 5; the leader (5) has failed
- Processor 2 calls an election; processors 3 and 4 respond, knocking 2 out of the running
60 Bully Example, continued
- Processor 4 calls an election; processor 3 calls an election
61 Bully Example, concluded
- Processor 4 responds to 3, knocking 3 out; 5 never responds
- Processor 4 is the new leader
62 A Ring Election Algorithm
- No token
- Each processor knows its successor
- When a processor notices the leader is down, it sends an election message to its successor
- If the successor is down, it sends to the next processor
- Each sender adds its own number to the message
63 Ring Election Algorithm, cont.
- The first processor eventually receives back the election message containing its own number
- The election message is changed to a coordinator message and resent around the ring
- The highest processor number in the message becomes the new leader
- When the first processor receives the coordinator message back, it is deleted (see the sketch below)
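A minimal sketch of the ring election, simulating message passing by walking a fixed successor ring; the alive set and ring order are assumed, and the assert reproduces the example on the next slide:

```python
# Minimal sketch of the ring election algorithm.

def ring_election(starter: int, ring: list[int], alive: set[int]) -> int:
    """Circulate an election message from `starter`; return the new leader."""
    n = len(ring)
    i = ring.index(starter)
    numbers = []
    while True:
        if ring[i % n] in alive:
            numbers.append(ring[i % n])   # each live processor appends itself
        i += 1
        if ring[i % n] == starter:        # message is back at the initiator:
            break                         # it becomes a coordinator message
    return max(numbers)                   # highest number is the new leader

# The example on the next slide: ring 0..7, processor 7 down, 3 starts.
assert ring_election(3, list(range(8)), alive=set(range(7))) == 6
```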
64 Ring Election Example
- Eight processors in a ring, numbered 0 through 7; processor 7 is down
- Processor 3 starts the election; the message accumulates numbers as it travels: 3, 4 / 3, 4, 5 / 3, 4, 5, 6 / 3, 4, 5, 6, 0 / 3, 4, 5, 6, 0, 1 / 3, 4, 5, 6, 0, 1, 2
- Back at processor 3, the highest number in the message (6) becomes the new leader
65 Orphan Processes
- A child process that is still active after its parent process has terminated prematurely
- Can happen with remote procedure calls
- Wastes resources
- Can corrupt shared data
- Can create more processes
- Three solutions follow
66 Orphan Cleanup
- A process must clean up after itself after a crash
- Requires each parent to keep a list of its children
- The parent thus has access to the family tree
- Must be kept in nonvolatile storage
- On restart, each family-tree member is told of the parent process's death and halts execution
- Disadvantage: parent overhead (a sketch follows)
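A sketch of family-tree bookkeeping under stated assumptions: children are logged to a JSON file standing in for nonvolatile storage, and kill is a stand-in primitive, not Galli's design:

```python
import json, os

TREE_FILE = "family_tree.json"  # assumed nonvolatile log

def load_tree() -> dict:
    if not os.path.exists(TREE_FILE):
        return {}
    with open(TREE_FILE) as f:
        return json.load(f)

def record_child(parent: int, child: int):
    tree = load_tree()
    tree.setdefault(str(parent), []).append(child)
    with open(TREE_FILE, "w") as f:
        json.dump(tree, f)               # persist before the child runs

def cleanup_after_crash(dead_parent: int, kill):
    """Halt every descendant of the dead parent, deepest first."""
    tree = load_tree()
    for child in tree.get(str(dead_parent), []):
        cleanup_after_crash(child, kill)
        kill(child)
```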
67 Figure 7.8 Orphan Cleanup Family Trees. (Galli, p. 170)
68 Child Process Allowance
- All child processes receive a finite time allowance
- If no time is left, the child must request more time from its parent
- If the parent has terminated prematurely, the child's request goes unanswered
- With no time allowance left, the child process dies
- Requires more communication
- Slows the execution of child processes (a sketch follows)
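A sketch of the allowance mechanism; the work-slice structure, allowance value, and the request_more_time stand-in (which returns None when the parent is gone) are all assumptions:

```python
import time

ALLOWANCE = 10.0  # seconds (assumed)

def run_child(do_some_work, request_more_time) -> str:
    deadline = time.monotonic() + ALLOWANCE
    while do_some_work():                 # returns False when work is done
        if time.monotonic() >= deadline:
            grant = request_more_time()   # ask the parent for more time
            if grant is None:
                return "halt"             # unanswered: the orphan dies
            deadline = time.monotonic() + grant
    return "done"
```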
69 Figure 7.9 Child Process Allowance. (Galli, p. 172)
70 Process Version Numbers
- Each process must keep track of a version number for its parent
- After a system crash, the entire distributed system is assigned a new version number
- A child is forced to terminate if its version number is out of date
- The child may try to find its parent
- It terminates if unsuccessful
- Requires a lot of communication (a sketch follows)
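A sketch of the version check; SYSTEM_VERSION and the find_parent stand-in are illustrative assumptions:

```python
# Sketch of process version numbers: crash recovery assigns the system a
# new version, and a child holding a stale parent version must terminate
# unless it can re-locate its parent.
SYSTEM_VERSION = 2   # bumped after every system crash/restart

def child_check(parent_version: int, find_parent) -> str:
    if parent_version == SYSTEM_VERSION:
        return "continue"          # version current: parent assumed alive
    parent = find_parent()         # out of date: try to locate the parent
    return "continue" if parent is not None else "terminate"
```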
71 Figure 7.10 Process Version Numbers. (Galli, p. 174)