Cluster Monitor - PowerPoint PPT Presentation

Provided by: dasanSe

1
Cluster Monitor
  • Regularly checks the health of each node of the
    cluster
  • Cluster monitor failure model
  • Techniques to detect failure
  • Cluster monitor protocol

2
Node Failure Model
  • Failsafe hardware failure
  • Failsafe software failure
  • Not supported: Byzantine hardware failure,
    Byzantine software failure, and split brain

4
Failure Detection
  • A cluster monitor uses a technique called
    heartbeats to check the health of an application
  • User-level heartbeat
  • Node-level heartbeat (are-you-alive)
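
The heartbeat idea above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names hb_table, hb_record, and hb_is_alive, the fixed MAX_NODES, and the timeout rule are all hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8   /* hypothetical cluster size limit */

/* Hypothetical table kept by the monitor: when each node was last heard. */
struct hb_table {
    long last_seen[MAX_NODES];   /* time of the last heartbeat, in ticks */
    long timeout;                /* longer silence means "suspected failed" */
};

/* Record an "I am alive" reply from a node. */
void hb_record(struct hb_table *t, size_t node, long now)
{
    assert(node < MAX_NODES);
    t->last_seen[node] = now;
}

/* A node counts as alive while its silence does not exceed the timeout. */
bool hb_is_alive(const struct hb_table *t, size_t node, long now)
{
    assert(node < MAX_NODES);
    return now - t->last_seen[node] <= t->timeout;
}
```

A real monitor would also have to choose the timeout carefully: too short and a slow node is declared dead, too long and failures are detected late.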

5
Cluster Membership
  • A cluster monitor assigns each node a nodeID that
    is distinct and stable
  • Each stable cluster membership is associated with
    a number called a generation count
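
One possible way to picture membership and its generation count is a bitmask plus a counter that is bumped on every change. The struct and function names below are hypothetical, and the one-bit-per-nodeID layout is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 32   /* hypothetical limit so the view fits in one word */

/* Hypothetical membership view: one bit per nodeID plus a generation count. */
struct membership {
    unsigned int members;      /* bit i set: node with nodeID i is a member */
    unsigned int generation;   /* incremented on every membership change */
};

bool is_member(const struct membership *m, unsigned int node_id)
{
    assert(node_id < MAX_NODES);
    return (m->members >> node_id) & 1u;
}

/* Apply a join (joined == true) or an exit; returns true if the view changed. */
bool reconfigure(struct membership *m, unsigned int node_id, bool joined)
{
    assert(node_id < MAX_NODES);
    unsigned int bit = 1u << node_id;
    unsigned int updated = joined ? (m->members | bit) : (m->members & ~bit);
    if (updated == m->members)
        return false;          /* nothing changed, same generation */
    m->members = updated;
    m->generation++;
    return true;
}
```

Two nodes that hold the same generation count can then assume they are looking at the same membership.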

7
Reconfiguration Events
  • When a node exits or joins the cluster, the
    cluster is said to suffer a reconfiguration
  • Node join: a new node has made its presence felt
    to the cluster monitor
  • Voluntary exit: a node makes a planned shutdown
  • Involuntary node exit: a node fails

9
Cascaded Reconfiguration
  • Stall delivery of the later reconfiguration until
    the earlier recovery is complete
  • Deliver the reconfiguration and let the
    application sort it out
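
The first option, stalling delivery, could look roughly like the queue below. It is a sketch only: reconfig_queue, rq_post, and rq_recovery_done are hypothetical names, events are reduced to integer ids, and the queue bound is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_CAP 16   /* hypothetical bound on pending events */

struct reconfig_queue {
    int pending[QUEUE_CAP];  /* event ids awaiting delivery, in FIFO order */
    int head, count;
    bool recovering;         /* true while the application handles an event */
};

/* Hand over the oldest pending event; -1 if stalled or nothing is pending. */
static int rq_try_deliver(struct reconfig_queue *q)
{
    if (q->recovering || q->count == 0)
        return -1;
    int ev = q->pending[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    q->recovering = true;
    return ev;
}

/* A reconfiguration occurred; returns the event delivered now, or -1. */
int rq_post(struct reconfig_queue *q, int event_id)
{
    assert(q->count < QUEUE_CAP);
    q->pending[(q->head + q->count) % QUEUE_CAP] = event_id;
    q->count++;
    return rq_try_deliver(q);
}

/* The application finished recovery; returns the next event, or -1. */
int rq_recovery_done(struct reconfig_queue *q)
{
    q->recovering = false;
    return rq_try_deliver(q);
}
```

The second option corresponds to dropping the recovering flag entirely, so that every event is delivered as soon as it is posted.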

10
Quick Rejoin
  • A node may exit and rejoin so quickly that the
    other nodes never notice the exit
  • This can lead to a very nasty situation in which
    a node has lost its in-memory state, but the
    other nodes cannot detect this
  • A good cluster monitor will make sure that if a
    node exits and joins, there will be separate
    reconfiguration events so that each node will
    know that a node left and then joined back
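
A minimal sketch of that rule, assuming the monitor learns about the rejoin through a join request: if the node is still believed to be a member, the monitor reports an exit first and the join second. The monitor struct, the event type, and on_join_request are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8   /* hypothetical cluster size limit */

enum event_kind { EV_EXIT, EV_JOIN };

struct event { enum event_kind kind; unsigned int node_id; };

struct monitor {
    bool member[MAX_NODES];   /* the monitor's current view of membership */
};

/* Handle a join request; writes 1 or 2 events to out and returns the count. */
size_t on_join_request(struct monitor *m, unsigned int node_id,
                       struct event out[2])
{
    assert(node_id < MAX_NODES);
    size_t n = 0;
    if (m->member[node_id]) {
        /* Rejoin before the exit was noticed: report the exit first. */
        out[n++] = (struct event){ EV_EXIT, node_id };
    }
    m->member[node_id] = true;
    out[n++] = (struct event){ EV_JOIN, node_id };
    return n;
}
```

Seeing the exit event tells the other nodes that the rejoining node's in-memory state can no longer be trusted.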

11
Cluster Messaging
  • Typed Message
  • Synchronous Message
  • Unicast and Broadcast Message
  • Reliable Message
  • Message Handler
  • Heartbeat Handler
  • Node Failure Handling
  • Addressing

27
Process Scheduling
  • Deals with scheduling, which is concerned with
    when to switch and which process to choose
  • Must fulfill several conflicting objectives: fast
    process response time, good throughput for
    background jobs, avoidance of process starvation,
    reconciliation of the needs of low- and
    high-priority processes, and so on
  • The set of rules used to determine when and how
    to select a new process to run is called the
    scheduling policy

28
Process Scheduling
  • Linux scheduling is based on the time-sharing
    technique
  • In Linux, process priority is dynamic: the
    scheduler increases or decreases the priority of
    a process as it runs
  • Three classes of processes
  • Interactive processes interact constantly with
    their users, and therefore spend a lot of time
    waiting for keypresses and mouse operations
  • Batch processes do not need user interaction, and
    hence they often run in the background
  • Real-time processes have very stringent
    scheduling requirements. They should never be
    blocked by lower-priority processes and should
    have a short guaranteed response time with a
    minimum variance

29
Process Scheduling
  • nice(): change the priority of a conventional
    process
  • getpriority(): get the maximum priority of a
    group of conventional processes
  • setpriority(): set the priority of a group of
    conventional processes
  • sched_getscheduler(): get the scheduling policy
    of a process
  • sched_setscheduler(): set the scheduling policy
    and priority of a process
  • sched_getparam(): get the priority of a process
  • sched_setparam(): set the priority of a process
  • sched_yield(): relinquish the processor
    voluntarily without blocking
  • sched_get_priority_min(): get the minimum
    priority value for a policy
  • sched_get_priority_max(): get the maximum
    priority value for a policy
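
A few of these calls can be tried from user space. The sketch below assumes a Linux system with the usual POSIX headers; the helper names current_nice, current_policy, and priority_span are hypothetical wrappers, and the value 99 is only a made-up error sentinel.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Nice value of the calling process, or 99 on error (hypothetical sentinel). */
int current_nice(void)
{
    errno = 0;                          /* -1 is a legal return value */
    int prio = getpriority(PRIO_PROCESS, 0);
    if (prio == -1 && errno != 0)
        return 99;
    return prio;
}

/* Scheduling policy of the calling process (pid 0 means "myself"). */
int current_policy(void)
{
    return sched_getscheduler(0);
}

/* Width of the static priority range of a policy, or -1 on error. */
int priority_span(int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);
    if (lo == -1 || hi == -1)
        return -1;
    assert(lo <= hi);
    return hi - lo;
}
```

Reading these values needs no privileges; raising a priority or switching to a real-time policy with sched_setscheduler() normally does.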

30
Process Scheduling
  • User-mode Linux processes are preemptive.
  • If a process enters the TASK_RUNNING state, the
    kernel checks whether its dynamic priority is
    greater than the priority of the currently
    running process
  • If it is, the execution of current is interrupted
    and the scheduler is invoked
  • A process may also be preempted when its time
    quantum expires. When this occurs, the
    need_resched field of the current process is set.
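  • Example: a text editor and a compiler running
    together. The editor is interactive, so each
    keypress wakes it up and it preempts the compiler
  • The Linux kernel itself is not preemptive, which
    makes it much simpler, since most synchronization
    problems involving kernel data structures are
    easily avoided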

31
How Long Must a Quantum Last?
  • If the quantum duration is too short, the system
    overhead caused by process switches becomes
    excessively high
  • If the quantum is too long, processes no longer
    appear to be executed concurrently
  • In some cases, a quantum duration that is too
    long degrades the responsiveness of the system
  • The rule of thumb adopted by Linux is to choose a
    duration as long as possible while keeping good
    system response time
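
The trade-off can be made concrete with a little arithmetic. The two helpers below are hypothetical and rest on a deliberately crude model: every quantum is followed by one process switch of fixed cost, and waiting processes are served round-robin with each one using its full quantum.

```c
#include <assert.h>

/* Fraction of CPU time lost to switching when each quantum of quantum_us
 * microseconds is followed by a switch costing switch_us microseconds. */
double switch_overhead(double quantum_us, double switch_us)
{
    assert(quantum_us > 0.0 && switch_us >= 0.0);
    return switch_us / (quantum_us + switch_us);
}

/* Worst-case wait for the CPU when nr_ahead processes each consume a full
 * quantum first (simple round-robin assumption). */
double worst_case_wait_us(double quantum_us, double switch_us, int nr_ahead)
{
    assert(nr_ahead >= 0);
    return nr_ahead * (quantum_us + switch_us);
}
```

Under this model a quantum equal to the switch cost wastes half of the CPU, while a 100 ms quantum with ten processes ahead means waiting about a second.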

32
Scheduling Algorithm
  • The Linux scheduling algorithm works by dividing
    the CPU time into epochs.
  • In a single epoch, every process has a specified
    time quantum whose duration is computed when the
    epoch begins.
  • The time quantum value is the maximum CPU time
    portion assigned to the process in that epoch
  • When a process has exhausted its time quantum, it
    is preempted and replaced by another runnable
    process
  • A process can be selected several times by the
    scheduler in the same epoch, as long as its time
    quantum has not been exhausted
  • The epoch ends when all runnable processes have
    exhausted their quanta; in this case, the
    scheduler algorithm recomputes the time-quantum
    durations of all processes and a new epoch begins
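
The epoch mechanism can be imitated in user space. This is a simplified simulation, not kernel code: sim_proc and the three functions are hypothetical, every process is assumed to stay runnable, and the process with the most ticks left is chosen each tick.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified process record for a user-space simulation. */
struct sim_proc {
    int base_quantum;   /* ticks granted at the start of each epoch */
    int counter;        /* ticks left in the current epoch */
};

/* Start a new epoch: every process gets its quantum back. */
void epoch_begin(struct sim_proc *p, size_t n)
{
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        p[i].counter = p[i].base_quantum;
}

/* Index of the process with the most ticks left, or -1 if the epoch is over. */
int pick_next(const struct sim_proc *p, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++)
        if (p[i].counter > 0 && (best < 0 || p[i].counter > p[best].counter))
            best = (int)i;
    return best;
}

/* Run one full epoch, one tick at a time; returns the ticks consumed. */
int run_epoch(struct sim_proc *p, size_t n)
{
    int ticks = 0;
    epoch_begin(p, n);
    for (int next = pick_next(p, n); next >= 0; next = pick_next(p, n)) {
        p[next].counter--;      /* the chosen process runs for one tick */
        ticks++;
    }
    return ticks;
}
```

An epoch therefore lasts as many ticks as the sum of all quanta, and no process can be selected again until epoch_begin() refills the counters.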

33
Scheduling Algorithm
  • Each process has a base time quantum, which is
    the time-quantum value assigned by the scheduler
    to the process if it has exhausted its quantum in
    the previous epoch
  • The users can change the base time quantum of
    their processes by using the nice() and
    setpriority() system calls
  • The INIT_TASK macro sets the value of the initial
    time quantum of process 0 (swapper) to
    DEF_COUNTER
  • #define DEF_COUNTER (10 * HZ / 100)

34
Scheduling Algorithm
  • To select a process to run, the Linux scheduler
    must consider the priority of each process
  • Static priority
  • Dynamic priority: the sum of the base time
    quantum and the number of ticks of CPU time left
    to the process before its quantum expires in the
    current epoch
  • There is always at least one runnable process:
    the swapper kernel thread, which has PID 0 and
    executes only when the CPU cannot execute other
    processes

35
Data Structures used by the Scheduler
  • need_resched: a flag checked by
    ret_from_sys_call() to decide whether to invoke
    the schedule() function
  • policy: the scheduling class, one of SCHED_FIFO,
    SCHED_RR, or SCHED_OTHER
  • rt_priority: the static priority of a real-time
    process; valid priorities range between 1 and 99.
    The static priority of a conventional process
    must be set to 0
  • counter: the number of ticks of CPU time left to
    the process before its quantum expires; when a
    new epoch begins, the field contains the
    time-quantum duration of the process
  • nice: determines the length of the process time
    quantum when a new epoch begins
  • cpus_allowed: a bit mask specifying the CPUs on
    which the process is allowed to run. In the 80x86
    architecture, the maximum number of processors is
    set to 32
  • cpus_runnable: a bit mask specifying the CPU that
    is executing the process, if any
  • processor: index of the CPU that is executing the
    process, if any

36
Schedule() Function
  • Its objective is to find a process in the
    runqueue list and then assign the CPU to it
  • It is invoked, directly or in a lazy way, by
    several kernel routines

37
Direct Invocation
  • The scheduler is invoked directly when the
    current process must be blocked right away
    because the resource it needs is not available.
    The kernel routine that blocks it then:
  • 1. Inserts current on the proper wait queue
  • 2. Changes the state of current either to
    TASK_INTERRUPTIBLE or to TASK_UNINTERRUPTIBLE
  • 3. Invokes schedule()
  • 4. Checks whether the resource is available; if
    not, goes to step 2
  • 5. Once the resource is available, removes
    current from the wait queue

38
Lazy invocation
  • The scheduler can also be invoked in a lazy way
    by setting the need_resched field of current to 1
  • Since a check on the value of this field is
    always made before resuming the execution of a
    User Mode process, schedule() will definitely
    be invoked at some time in the near future
  • Lazy invocation is performed in the following
    cases:
  • When current has used up its quantum of CPU time
  • When a process is woken up and its priority is
    higher than that of the current process
  • When a sched_setscheduler() or sched_yield()
    system call is issued
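
The lazy mechanism can be mimicked in a small user-space simulation. Everything here is a stand-in: sim_task, sim_timer_tick, sim_return_to_user, and sim_schedule are hypothetical names, and sim_schedule only counts calls instead of switching processes.

```c
#include <assert.h>
#include <stdbool.h>

struct sim_task {
    int counter;          /* ticks left in the quantum */
    bool need_resched;    /* request to run the scheduler later */
};

static int schedule_calls;   /* invocations of the stand-in scheduler */

/* Stand-in for schedule(): only clears the flag and counts the call. */
static void sim_schedule(struct sim_task *cur)
{
    cur->need_resched = false;
    schedule_calls++;
}

/* Timer tick: charge one tick and request a reschedule at quantum end. */
void sim_timer_tick(struct sim_task *cur)
{
    assert(cur->counter >= 0);
    if (cur->counter > 0 && --cur->counter == 0)
        cur->need_resched = true;
}

/* The check made just before resuming execution in User Mode. */
void sim_return_to_user(struct sim_task *cur)
{
    if (cur->need_resched)
        sim_schedule(cur);
}

int sim_schedule_calls(void)
{
    return schedule_calls;
}
```

The point of the pattern is that the tick handler never switches processes itself; it only leaves a note that is acted on at the next safe point.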

39
Actions performed by schedule() before a process
switch
  • The goal of the schedule() function is to replace
    the currently executing process with another one
  • Its key outcome is to set a local variable called
    next so that it points to the descriptor of the
    process selected to replace current
  • If no runnable process in the system has priority
    greater than the priority of current, at the end,
    next coincides with current and no process switch
    takes place

40
Actions performed by schedule() before a process
switch
  • The schedule() function starts by initializing a
    few local variables
  • prev = current;
  • this_cpu = prev->processor;
  • sched_data = &aligned_data[this_cpu];
  • It makes sure that prev does not hold the global
    kernel lock, and then re-enables the local
    interrupts
  • if (prev->lock_depth >= 0)
  •     spin_unlock(&kernel_flag);
  • release_irqlock(this_cpu);
  • __sti();

41
Actions performed by schedule() before a process
switch
  • A check is made to determine whether prev is a
    Round Robin real-time process that has exhausted
    its quantum.
  • If so, schedule() assigns a new quantum to prev
    and puts it at the bottom of the runqueue list
  • spin_lock_irq(&runqueue_lock);
  • if (prev->policy == SCHED_RR && !prev->counter) {
  •     prev->counter = (20 - prev->nice) / 4 + 1;
  •     move_last_runqueue(prev);
  • }
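
The refill-and-requeue step can be reproduced on a plain array. The sketch below reuses the expression (20 - nice) / 4 + 1 from the slide; rr_task, nice_to_ticks, and rr_requeue are hypothetical names, and the array stands in for the kernel's linked runqueue list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct rr_task { int nice; int counter; };

/* Quantum in ticks for a nice value in [-20, 19]: (20 - nice) / 4 + 1. */
int nice_to_ticks(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return (20 - nice) / 4 + 1;
}

/* If task i has used up its quantum, refill it and move it to the end of
 * the queue. Returns the task's index after the call. */
size_t rr_requeue(struct rr_task *q, size_t n, size_t i)
{
    assert(i < n);
    if (q[i].counter != 0)
        return i;                       /* quantum not exhausted yet */
    struct rr_task t = q[i];
    t.counter = nice_to_ticks(t.nice);
    memmove(&q[i], &q[i + 1], (n - i - 1) * sizeof q[0]);
    q[n - 1] = t;
    return n - 1;
}
```

With this formula a nice value of 0 yields 6 ticks, -20 yields 11, and 19 yields 1.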

42
Actions performed by schedule() before a process
switch
  • Examines the state of prev
  • if (prev->state == TASK_INTERRUPTIBLE &&
    signal_pending(prev))
  •     prev->state = TASK_RUNNING;
  • if (prev->state != TASK_RUNNING)
  •     del_from_runqueue(prev);
  • prev->need_resched = 0;
  • repeat_schedule:
  • next = init_tasks[this_cpu];
  • c = -1000;
  • list_for_each(tmp, &runqueue_head) {
  •     p = list_entry(tmp, struct task_struct,
    run_list);
  •     if (p->cpus_runnable & p->cpus_allowed & (1 <<
    this_cpu)) {
  •         int weight = goodness(p, this_cpu,
    prev->active_mm);
  •         if (weight > c)
  •             c = weight, next = p;
  •     }
  • }
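
The selection loop can be tried outside the kernel with a simplified stand-in for goodness(). The sketch makes several assumptions: struct cand, sim_goodness, and pick_task are hypothetical, the weight is just counter + 20 - nice for a task with ticks left, and a task that is not running anywhere is assumed to have all bits set in cpus_runnable. The bonuses and real-time handling of the real goodness() are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced task record holding only what the loop inspects. */
struct cand {
    unsigned long cpus_allowed;    /* CPUs the task may run on */
    unsigned long cpus_runnable;   /* assumed all ones when not running */
    int counter;                   /* ticks left in the current epoch */
    int nice;
};

/* Simplified weight: 0 once the quantum is used up, else counter + 20 - nice. */
int sim_goodness(const struct cand *p)
{
    if (p->counter == 0)
        return 0;
    return p->counter + 20 - p->nice;
}

/* Index of the best candidate for this_cpu, or -1 if none is eligible. */
int pick_task(const struct cand *tasks, size_t n, int this_cpu)
{
    assert(this_cpu >= 0 && this_cpu < 32);
    unsigned long cpu_bit = 1UL << this_cpu;
    int best = -1;
    int c = -1000;                 /* same starting weight as the slide */
    for (size_t i = 0; i < n; i++) {
        const struct cand *p = &tasks[i];
        if (p->cpus_runnable & p->cpus_allowed & cpu_bit) {
            int weight = sim_goodness(p);
            if (weight > c) {
                c = weight;
                best = (int)i;
            }
        }
    }
    return best;
}
```

Because the comparison is strict, the first task found wins a tie. The loop also visits every runnable task on each call, which is the scaling cost noted under the performance slide.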

43
Performance of the Scheduling Algorithm
  • The algorithm does not scale well
  • The predefined quantum is too large for high
    system loads
  • I/O-bound process boosting strategy is not
    optimal
  • Support for real-time applications is weak