Transcript: Distributed Operating Systems CS551
1
Distributed Operating Systems CS551
  • Colorado State University
  • at Lockheed-Martin
  • Lecture 6 -- Spring 2001

2
CS551 Lecture 6
  • Topics
  • Distributed Process Management (Chapter 7)
  • Distributed Scheduling Algorithm Choices
  • Scheduling Algorithm Approaches
  • Coordinator Elections
  • Orphan Processes
  • Distributed File Systems (Chapter 8)
  • Distributed Name Service
  • Distributed File Service
  • Distributed Directory Service

3
Distributed Deadlock Prevention
  • Assign each process a global timestamp when it
    starts
  • No two processes should have same timestamp
  • Basic idea: "When one process is about to block
    waiting for a resource that another process is
    using, a check is made to see which has a larger
    timestamp (i.e., is younger)." Tanenbaum, DOS
    (1995)

4
Distributed Deadlock Prevention
  • Somehow put timestamps on each process,
    representing creation time of process
  • Suppose a process needs a resource already owned
    by another process
  • Determine relative ages of both processes
  • Decide if waiting process should Preempt, Wait,
    Die, or Wound owning process
  • Two different algorithms

5
Distributed Deadlock Prevention
  • Allow wait only if waiting process is older
  • Since timestamps increase in any chain of waiting
    processes, cycles are impossible
  • Or allow wait only if waiting process is younger
  • Here timestamps decrease in any chain of waiting
    processes, so cycles are again impossible
  • Wiser to give older processes priority

6
Example: wait-die algorithm
  (Diagram: two cases, with process timestamps 54 and 79; 54 is older)
  • 54 wants a resource held by 79: the older requester waits
  • 79 wants a resource held by 54: the younger requester dies
7
Example: wound-wait algorithm
  (Diagram: the same two cases)
  • 54 wants a resource held by 79: the older requester preempts (wounds) 79
  • 79 wants a resource held by 54: the younger requester waits
8
Algorithm Comparison
  • Wait-die kills young process
  • When young process restarts and requests resource
    again, it is killed once more
  • Less efficient of these two algorithms
  • Wound-wait preempts young process
  • When young process re-requests resource, it has
    to wait for older process to finish
  • Better of the two algorithms
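
A minimal sketch (not from Tanenbaum or Galli) of the two timestamp rules,
assuming a smaller timestamp means an older process; it reproduces the example
figures above, where 54 is older than 79:

    def wait_die(requester_ts, holder_ts):
        """Wait-die: an older requester waits; a younger requester dies (aborts)."""
        return "wait" if requester_ts < holder_ts else "die"

    def wound_wait(requester_ts, holder_ts):
        """Wound-wait: an older requester wounds (preempts) the holder;
        a younger requester waits."""
        return "preempt" if requester_ts < holder_ts else "wait"

    # Process 54 is older than process 79, as in the figures above.
    assert wait_die(54, 79) == "wait"       # older wants resource held by younger
    assert wait_die(79, 54) == "die"        # younger wants resource held by older
    assert wound_wait(54, 79) == "preempt"  # older preempts the younger holder
    assert wound_wait(79, 54) == "wait"     # younger waits for the older holder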

9
Figure 7.7 The Bully Algorithm. (Galli, p. 169)
10
Process Management in a Distributed Environment
  • Processes in a Uniprocessor
  • Processes in a Multiprocessor
  • Processes in a Distributed System
  • Why need to schedule
  • Scheduling priorities
  • How to schedule
  • Scheduling algorithms

11
Distributed Scheduling
  • Basically resource management
  • Want to distribute processing load among the
    processing elements in order to maximize
    performance
  • Consider having several homogeneous processing
    elements on a LAN with equal average workloads
  • Workload may still not be evenly distributed
  • Some PEs may have idle cycles

12
Efficiency Metrics
  • Communication cost
  • Low if very little or no communication required
  • Low if all communicating processes are on the
    same PE or not distant (small number of hops)
  • Execution cost
  • Relative speed of PE
  • Relative location of needed resources
  • Type of operating system, machine code, and
    architecture

13
Efficiency Metrics, continued
  • Resource Utilization
  • May be based upon
  • Current PE loads
  • Load status state
  • Resource queue lengths
  • Memory usage
  • Other resource availability

14
Level of Scheduling
  • When should a process run locally and when should
    it be sent to an idle PE?
  • Local Scheduling
  • Allocate process to local PE
  • Review Galli, Chapter 2, for more information
  • Global Scheduling
  • Choose which PE executes which process
  • Also called process allocation
  • Precedes local scheduling decision

15
Figure 7.1  Scheduling Decision Chart.
(Galli, p. 152)
16
Distribution Goals
  • Load Balancing
  • Tries to maintain an equal load throughout system
  • Load Sharing
  • Simpler
  • Tries to prevent any PE from becoming too busy

17
Load Balancing / Load Sharing
  • Load Balancing
  • Try to equalize loads at PEs
  • Requires more information
  • More overhead
  • Load Sharing
  • Avoid having an idle PE if there is work to do
  • Anticipating Transfers
  • Avoid PE idle wait while a task is coming
  • Get a new task just before PE becomes idle

18
Figure 7.2  Load Distribution Goals.
(Galli, p. 153)
19
Processor Allocation Algorithms
  • Assume virtually identical PEs
  • Assume PEs fully interconnected
  • Assume processes may spawn children
  • Two strategies
  • Non-migratory
  • static binding
  • non-preemptive
  • Migratory
  • dynamic binding
  • preemptive

20
Processor Allocation Strategies
  • Non-migratory (static binding, non-preemptive)
  • Transfer before process starts execution
  • Once assigned to a machine, process stays there
  • Migratory (dynamic binding, preemptive)
  • Processes may move after execution begins
  • Better load balancing
  • Expensive: must collect and move entire state
  • More complex algorithms

21
Efficiency Goals
  • Optimal
  • Completion time
  • Resource Utilization
  • System Throughput
  • Any combination thereof
  • Suboptimal
  • Suboptimal Approximate
  • Suboptimal Heuristic

22
Optimal Scheduling Algorithms
  • Requires state of all competing processes
  • Scheduler must have access to all related
    information
  • Optimization is a hard problem
  • Usually NP-Hard for multiple processors
  • Thus, consider
  • Suboptimal Approximate solutions
  • Suboptimal Heuristic solutions

23
SubOptimal Approximate Solutions
  • Similar to Optimal Scheduling algorithms
  • Try to find good solutions, not perfect solutions
  • Searches are limited
  • Include intelligent shortcuts

24
SubOptimal Heuristic Solutions
  • Heuristics
  • Employ rules-of-thumb
  • Employ intuition
  • May not be provable
  • Generally considered to work in an acceptable
    manner
  • Examples
  • If PE has heavy load, don't give it more to do
  • Locality of reference for related processes, data

25
Figure 7.1  Scheduling Decision Chart.
(Galli, p. 152)
26
Types of Load Distribution Algs
  • Static
  • Decisions are hard-wired in
  • Dynamic
  • Use system state information to make decisions
  • Overhead of keeping track of information
  • Adaptive
  • A type of dynamic algorithm
  • May work differently at different loads

27
Load Distribution Algorithm Issues
  • Transfer Policy
  • Selection Policy
  • Location Policy
  • Information Policy
  • Stability
  • Sender-initiated versus Receiver-Initiated
  • Symmetrically-Initiated
  • Adaptive Algorithms

28
Load Dist. Algs. Issues, cont.
  • Transfer Policy
  • When is it appropriate to move a task?
  • If load at sending PE > threshold
  • If load at receiving PE < threshold
  • Location Policy
  • Find a receiver PE
  • Methods
  • Broadcast messages
  • Polling: random, neighbors, recent candidates

29
Load Dist. Algs. Issues, cont.
  • Selection Policy
  • Which task should migrate?
  • Simple
  • Select new tasks
  • Non-Preemptive
  • Criteria
  • Cost of transfer
  • should be covered by reduction in response time
  • Size of task
  • Number of location-dependent system calls (such
    tasks should use the local PE)

30
Load Dist. Algs. Issues, cont.
  • Information Policy
  • What information should be collected?
  • When? From whom? By whom?
  • Demand-driven
  • Get info when PE becomes sender or receiver
  • Sender-initiated: senders look for receivers
  • Receiver-initiated: receivers look for senders
  • Symmetrically-initiated: either of the above
  • Periodic: at fixed time intervals, not adaptive
  • State-change-driven
  • Send info about node state (rather than solicit)

31
Load Dist. Algs. Issues, cont.
  • Stability
  • Queuing Theoretic
  • Stable: sum(arrival load + overhead) < capacity
  • Effective: using the algorithm gives better
    performance than not doing load distribution
  • An effective algorithm cannot be unstable
  • A stable algorithm can be ineffective (overhead)
  • Algorithmic Stability
  • E.g. Performing overhead operations, but making
    no forward progress
  • E.g. moving a task from PE to PE, only to learn
    that it increases the PE workload enough that it
    needs to be transferred again

33
Load Dist Algs Sender-Initiated
  • Sender PE thinks it is overloaded
  • Transfer Policy
  • Threshold (T) based on PE CPU queue length (QL)
  • Sender: QL > T
  • Receiver: QL < T
  • Selection Policy
  • Non-preemptive
  • Allows only new tasks
  • Long-lived tasks make this policy worthwhile

34
Load Dist Algs Sender-Initiated
  • Location (3 different policies)
  • Random
  • Select a receiver at random
  • Useless or wasted if destination is loaded
  • Want to avoid transferring the same task from PE
    to PE to PE
  • Include limit on number of transfers
  • Threshold
  • Start polling PEs at random
  • If receiver found, send task to it
  • Limit search to Poll-limit
  • If limit hit, keep task on current PE

35
LDAs Sender-Initiated
  • Location (3 different policies, cont.)
  • Shortest
  • Poll a random set of PEs
  • Choose PE with shortest queue length
  • Only a little better than Threshold Location
    Policy
  • Not worth the additional work

36
LDAs Sender-Initiated
  • Information Policy
  • Demand-driven
  • After identifying a sender
  • Stability
  • At high load, PE might not find a receiver
  • Polling will be wasted
  • Polling increases the load on the system
  • Could lead to instability
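
A hypothetical sketch of the sender-initiated approach with the Threshold
location policy described above; QUEUE_THRESHOLD, POLL_LIMIT, peers, and
queue_length are illustrative names, not from Galli:

    import random

    QUEUE_THRESHOLD = 3   # T: CPU queue length separating senders from receivers
    POLL_LIMIT = 5        # stop polling after this many PEs

    def choose_receiver(local_queue_len, peers, queue_length):
        """Called when a new task arrives at this PE. Returns a PE to ship the
        new task to, or None to run it locally. queue_length(pe) stands in
        for a remote poll of pe's CPU queue length."""
        if local_queue_len <= QUEUE_THRESHOLD:
            return None                      # not a sender: keep the task
        for pe in random.sample(peers, min(POLL_LIMIT, len(peers))):
            if queue_length(pe) < QUEUE_THRESHOLD:
                return pe                    # receiver found: transfer the new task
        return None                          # poll limit hit: execute locally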

37
LDAs Receiver-Initiated
  • Receiver is trying to find work
  • Transfer Policy
  • If local QL < T, try to find a sender
  • Selection Policy
  • Non-preemptive preferred (new tasks)
  • But there may not be any new tasks available
  • Still worth the effort

38
LDAs Receiver-Initiated
  • Location Policy
  • Select PE at random
  • If taking a task does not move that PE's load
    below the threshold, take it
  • If no luck after trying the Poll Limit times,
  • Wait until another task completes
  • Wait another time period
  • Information Policy
  • Demand-driven

39
LDAs Receiver-Initiated
  • Stability
  • Tends to be stable
  • At high load, a sender should be found
  • Problem
  • Transfers tend to be preemptive
  • Tasks on sender node have already started
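
For contrast, a hedged sketch of the receiver-initiated transfer and location
policies just described; QUEUE_THRESHOLD, POLL_LIMIT, and the take-from
condition are again illustrative assumptions:

    import random

    QUEUE_THRESHOLD = 3
    POLL_LIMIT = 5

    def choose_sender(local_queue_len, peers, queue_length):
        """Called when the local PE drops below the threshold (e.g. a task
        completes). Returns a PE to pull a task from, or None."""
        if local_queue_len >= QUEUE_THRESHOLD:
            return None                          # not a receiver
        for pe in random.sample(peers, min(POLL_LIMIT, len(peers))):
            # Take a task only if that does not drop the polled PE's load
            # below the threshold, i.e. it really is a sender.
            if queue_length(pe) - 1 >= QUEUE_THRESHOLD:
                return pe                        # transfer (often preemptive)
        return None                              # wait for a completion or a timer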

40
LDAs Symmetrically-Initiated
  • Both senders and receivers can search for tasks
    to transfer
  • Has both advantages and disadvantages of two
    previous methods
  • Above-average algorithm
  • Try to keep load at each PE at acceptable level
  • Aiming for exact average can cause thrashing

41
LDAs Symmetrically-Initiated
  • Transfer Policy
  • Each PE
  • Estimates the average load
  • Sets both an upper and a lower threshold
  • Both an equal distance from the estimated average
  • If load > upper, PE acts as a sender
  • If load < lower, PE acts as a receiver

42
LDAs Symmetrically-Initiated
  • Location Policy
  • Sender-initiated
  • Sender broadcasts a TooHigh message, sets timeout
  • Receiver sends Accept message, clears timeout,
    increases Load value, sets timeout
  • If sender still wants to send when Accept message
    comes, sends task
  • If sender gets TooLow message before Accept,
    sends task
  • If sender has TooHigh timeout with no Accept
  • Average estimate is too low
  • Broadcasts ChangeAvg message to all PEs
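
A rough sketch of the sender side of this message exchange; the node, timer,
and net helpers (broadcast, send, pick_task, upper_threshold) are assumptions
used for illustration, not part of the slides:

    class SenderSide:
        def __init__(self, node, timer, net):
            self.node, self.timer, self.net = node, timer, net

        def on_overload(self):                           # load > upper threshold
            self.net.broadcast({"type": "TooHigh", "from": self.node.id})
            self.timer.start("toohigh")                  # wait for an Accept

        def on_accept(self, msg):
            self.timer.cancel("toohigh")
            if self.node.load > self.node.upper_threshold:   # still want to send?
                self.net.send(msg["from"], self.node.pick_task())

        def on_toolow(self, msg):                        # a receiver volunteered
            if self.node.load > self.node.upper_threshold:
                self.net.send(msg["from"], self.node.pick_task())

        def on_toohigh_timeout(self):
            # No Accept arrived: the estimated average load is too low everywhere.
            self.net.broadcast({"type": "ChangeAvg", "from": self.node.id})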

43
LDAs Symmetrically-Initiated
  • Location Policy
  • Receiver-initiated
  • Receiver sends TooLow message, sets timeout
  • Rest is converse of sender-initiated algorithm
  • Selection Policy
  • Use a reasonable policy
  • Non-preemptive, if possible
  • Low cost

44
LDAs Symmetrically-Initiated
  • Information Policy
  • Demand-driven
  • Determined at each PE
  • Low overhead

45
LDAs Adaptive
  • Stable Symmetrically-Initiated
  • Previous instability was due to too much polling
    by the sender
  • Each PE keeps lists of the other PEs sorted into
    three categories
  • Senders (overloaded)
  • Receivers (underloaded)
  • OK
  • Initially, each PE has every other PE on its
    receiver list

46
LDAs Adaptive
  • Transfer Policy
  • Based on PE CPU queue length
  • Low threshold (LT) and high threshold (HT)
  • Selection Policy
  • Sender-initiated: only sends new tasks
  • Receiver-initiated: takes any task
  • Trying for low cost
  • Information Policy
  • Demand-driven; maintains the lists

47
LDAs Adaptive
  • Location Policy
  • Receiver-initiated
  • Order of polling
  • Senders list: head to tail (newest info first)
  • OK list: tail to head (most out-of-date first)
  • Receiver list: tail to head
  • When a PE becomes a receiver (QL < LT)
  • Starts polling
  • If it finds a sender, transfer happens
  • Else use replies to update lists
  • Continues until
  • It finds a sender
  • It is no longer a receiver
  • It hits the Poll Limit
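
A sketch of this receiver-side polling order; POLL_LIMIT, poll(), and the list
bookkeeping are illustrative assumptions:

    POLL_LIMIT = 8

    def receiver_poll(senders, ok, receivers, poll, is_still_receiver):
        """Run when the local queue length drops below LT. poll(pe) stands in
        for a remote query returning 'sender', 'receiver', or 'ok'."""
        # Senders list head to tail, then OK and receiver lists tail to head.
        order = list(senders) + list(reversed(ok)) + list(reversed(receivers))
        for polls, pe in enumerate(order):
            if polls >= POLL_LIMIT or not is_still_receiver():
                return None
            status = poll(pe)
            if status == "sender":
                return pe                      # transfer a task from this PE
            for lst in (senders, ok, receivers):
                if pe in lst:
                    lst.remove(pe)             # use the reply to update the lists
            (ok if status == "ok" else receivers).append(pe)
        return None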

48
LDAs Adaptive
  • Notes
  • At high loads, activity is sender-initiated, but
    the sender will soon have an empty receiver
    list → no polling
  • So it will switch to receiver-initiated
  • At low loads, receiver-initiated polling → failure
  • But overhead doesn't matter at low load
  • And lists get updated
  • So sender-initiated should work quickly

49
Load Scheduling Algorithms (Galli)
  • Usage Points
  • Charged for using remote PEs, resources
  • Graph Theory
  • Minimum cutset of assignment graph
  • Maximum flow of graph
  • Probes
  • Messages to locate available, appropriate PEs
  • Scheduling Queues
  • Stochastic Learning

50
Figure 7.3  Usage Points. (Galli, p. 158)
51
Figure 7.4  Economic Usage Points. (Galli,
p.159)
52
Figure 7.5 Two-Processor Min-Cut Example.
(Galli, p.161)
53
Figure 7.6  A Station with Run Queues and Hints.
(Galli, p.164)
54
CPU Queue Length as Metric
  • PE queue length correlates well with response
    time
  • Easy to measure
  • Caution
  • When accepting new migrating process, increment
    queue length right away
  • Perhaps time-out needed in case process never
    arrives
  • PE queue length does not correlate well with PE
    utilization
  • A daemon to monitor PE utilization adds overhead
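
A small sketch of the caution above: count an accepted migrating process
immediately and release the slot if it never arrives. ARRIVAL_TIMEOUT and the
class layout are assumptions:

    import threading

    ARRIVAL_TIMEOUT = 5.0   # seconds to wait for the migrating process

    class QueueLengthMetric:
        def __init__(self):
            self.queue_length = 0
            self.lock = threading.Lock()

        def accept_migration(self):
            with self.lock:
                self.queue_length += 1       # count it right away, not on arrival
            timer = threading.Timer(ARRIVAL_TIMEOUT, self._expire)
            timer.start()
            return timer                     # cancel this when the process arrives

        def process_arrived(self, timer):
            timer.cancel()                   # arrived in time; nothing to undo

        def _expire(self):
            with self.lock:
                self.queue_length -= 1       # never arrived; release the slot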

55
Election Algorithms
  • Bully algorithm (Garcia-Molina, 1982)
  • A Ring election algorithm

56
Bully Algorithm
  • Each processor has a unique number
  • One processor notices that the leader/server is
    missing
  • Sends messages to all other processes
  • Requests to be appointed leader
  • Includes its own processor number
  • Processors with higher (or, by convention, lower)
    processor numbers can bully the first processor

57
Figure 7.7 The Bully Algorithm. (Galli, p. 169)
58
Bully Algorithm, continued
  • The initial processor need only send election
    messages to higher- (or lower-) numbered processors
  • Any processors that respond effectively tell the
    first processor that they overrule it and that
    it is out of the running
  • These processors then start sending election
    messages to the other top processors
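
A compact sketch of the bully election as described above, assuming the higher
number wins; send() and the reply-timeout handling are assumed helpers:

    class BullyNode:
        def __init__(self, my_id, all_ids, send):
            self.my_id, self.all_ids, self.send = my_id, all_ids, send

        def start_election(self):
            higher = [p for p in self.all_ids if p > self.my_id]
            if not higher:
                self.announce_coordinator()        # nobody can bully this node
                return
            for p in higher:
                self.send(p, ("ELECTION", self.my_id))
            # If no ("OK", ...) reply arrives before a timeout, call
            # announce_coordinator().

        def on_election(self, from_id):
            if from_id < self.my_id:
                self.send(from_id, ("OK", self.my_id))   # bully the caller
                self.start_election()                    # and take over the election

        def announce_coordinator(self):
            for p in self.all_ids:
                if p != self.my_id:
                    self.send(p, ("COORDINATOR", self.my_id))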

59
Bully Example
  (Diagram: six processors, 0-5; the coordinator, processor 5, has crashed)
  • 2 calls election
  • 3, 4 respond
60
Bully Example, continued
  (Diagram panels)
  • 3 calls election
  • 4 calls election
61
Bully Example, concluded
  (Diagram panels)
  • 4 responds to 3
  • 4 is the new leader
62
A Ring Election Algorithm
  • No token
  • Each processor knows successor
  • When a processor notices leader is down, sends
    election message to successor
  • If successor is down, sends to next processor
  • Each sender adds own number to message

63
Ring Election Algorithm, cont.
  • The first processor eventually receives back the
    election message containing its own number
  • Election message is changed to coordinator
    message and resent around ring
  • The highest processor number in message becomes
    the new leader
  • When the first processor receives the coordinator
    message back, it removes the message from the ring
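
A hedged sketch of this ring election; send_to_successor() is an assumed
helper that forwards to the next live processor on the ring:

    def on_leader_failure(my_id, send_to_successor):
        # The leader appears dead: start an election message with just my number.
        send_to_successor(("ELECTION", [my_id]))

    def on_election(my_id, ids, send_to_successor):
        if my_id in ids:
            # The message has gone all the way around: convert it to a
            # coordinator message naming the highest number collected.
            send_to_successor(("COORDINATOR", max(ids)))
        else:
            send_to_successor(("ELECTION", ids + [my_id]))   # append and forward

    def on_coordinator(initiated_here, leader_id, send_to_successor):
        if initiated_here:
            return                           # back at its origin: delete the message
        # Record leader_id as the new coordinator, then keep forwarding.
        send_to_successor(("COORDINATOR", leader_id))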

64
Ring Election Example
  (Diagram: processors 0-7 in a ring; 7, the failed leader, is skipped)
  • 3 starts the election; the message accumulates processor numbers as it
    circulates: 3 → 3,4 → 3,4,5 → 3,4,5,6 → 3,4,5,6,0 → 3,4,5,6,0,1 → 3,4,5,6,0,1,2
  • When the message returns to 3, the highest number in it (6) becomes the
    new leader
65
Orphan Processes
  • A child process that is still active after its
    parent process has terminated prematurely
  • Can happen with remote procedure calls
  • Wastes resources
  • Can corrupt shared data
  • Can create more processes
  • Three solutions follow

66
Orphan Cleanup
  • A process must clean up after itself after a
    crash
  • Requires each parent keep list of children
  • Parent thus has access to family tree
  • Must be kept in nonvolatile storage
  • On restart, each family tree member is told of
    the parent process's death and halts execution
  • Disadvantage: parent overhead

67
Figure 7.8 Orphan Cleanup Family Trees.
(Galli, p.170)
68
Child Process Allowance
  • All child processes receive a finite time
    allowance
  • If no time left, child must request more time
    from parent
  • If parent has terminated prematurely, the child's
    request goes unanswered
  • With no time allowance, child process dies
  • Requires more communication
  • Slows execution of child processes
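
A sketch of the allowance idea, with illustrative ALLOWANCE and REPLY_TIMEOUT
values and an assumed request_more_time() call to the parent:

    import time

    ALLOWANCE = 10.0        # seconds granted per request (illustrative)
    REPLY_TIMEOUT = 2.0     # how long to wait for the parent's answer

    def child_loop(do_work, request_more_time):
        expires = time.monotonic() + ALLOWANCE
        while True:
            if time.monotonic() >= expires:
                granted = request_more_time(timeout=REPLY_TIMEOUT)  # ask the parent
                if not granted:
                    return                   # parent is gone: the orphan terminates
                expires = time.monotonic() + ALLOWANCE
            do_work()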

69
Figure 7.9 Child Process Allowance. (Galli,
p.172)
70
Process Version Numbers
  • Each process must keep track of a version number
    for its parent
  • After a system crash, the entire distributed
    system is assigned a new version number
  • Child forced to terminate if version number is
    out-of-date
  • Child may try to find parent
  • Terminates if unsuccessful
  • Requires a lot of communication
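
A sketch of the version-number check; recorded_version, current_version(), and
find_parent() are assumed names used only for illustration:

    def orphan_check(recorded_version, current_version, find_parent):
        """recorded_version is the system version the child stored at creation;
        current_version() and find_parent() are assumed lookups."""
        if recorded_version == current_version():
            return "continue"                # no crash since the child was created
        if find_parent() is not None:        # optionally try to locate the parent
            return "continue"
        return "terminate"                   # stale version, no parent: orphan dies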

71
Figure 7.10  Process Version Numbers. (Galli,
p.174)