Cluster Monitor

About This Presentation

Title:

Cluster Monitor

Description:

A cluster monitor uses a technique called heartbeats to check the health ... Stall delivery of the later reconfiguration until the earlier recovery is complete ... – PowerPoint PPT presentation

Slides: 44

Provided by: dasanSe

Transcript and Presenter's Notes

Title: Cluster Monitor

1
Cluster Monitor

Regularly checks the health of each node of the
cluster
Cluster monitor failure model
Techniques to detect failure
Cluster monitor protocol

2
Node Failure Model

Failsafe hardware failure
Failsafe software failure
However, it does not support
Byzantine Hardware Failure
Byzantine Software Failure
Split Brain

4
Failure Detection

A cluster monitor uses a technique called
heartbeats to check the health of an application
User-level heartbeat
Node-level heartbeat (are-you-alive)
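
As a rough illustration of heartbeat-based failure detection, the sketch below records the last time each node was heard from and treats a node as failed once no heartbeat has arrived within a timeout. The names hb_table, hb_record and hb_is_alive, the node count, and the timeout value are assumptions made for this example and do not come from the presentation.

```c
#include <stdbool.h>

#define MAX_NODES 8          /* assumed cluster size */
#define HB_TIMEOUT_MS 3000   /* assumed timeout before a node is declared failed */

/* Last time (in milliseconds) a heartbeat was seen from each node. */
struct hb_table {
    long last_seen_ms[MAX_NODES];
};

/* Record a heartbeat (or an are-you-alive reply) from a node. */
void hb_record(struct hb_table *t, int node, long now_ms)
{
    t->last_seen_ms[node] = now_ms;
}

/* A node counts as alive if it was heard from within the timeout. */
bool hb_is_alive(const struct hb_table *t, int node, long now_ms)
{
    return now_ms - t->last_seen_ms[node] <= HB_TIMEOUT_MS;
}
```

A monitor built this way would call hb_record whenever a heartbeat arrives and poll hb_is_alive periodically for every member.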

5
Cluster Membership

A cluster monitor assigns each node a nodeID
Distinct
Stable
Each stable cluster membership is associated with
a number called a generation count
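
One possible way to model membership and the generation count is sketched below: node IDs index a fixed table, and the generation count is incremented on every membership change. The struct layout, the function names, and the rule "increment by one per change" are assumptions for illustration only.

```c
#include <stdbool.h>

#define MAX_NODES 8   /* assumed upper bound on cluster size */

struct membership {
    bool member[MAX_NODES];   /* member[id] is true if nodeID id is in the cluster */
    unsigned generation;      /* incremented on every membership change */
};

/* Add a node. Returns the generation count after the call; the count is
 * unchanged if the node was already a member. */
unsigned membership_join(struct membership *m, int node_id)
{
    if (!m->member[node_id]) {
        m->member[node_id] = true;
        m->generation++;
    }
    return m->generation;
}

/* Remove a node. Returns the generation count after the call. */
unsigned membership_exit(struct membership *m, int node_id)
{
    if (m->member[node_id]) {
        m->member[node_id] = false;
        m->generation++;
    }
    return m->generation;
}
```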

7
Reconfiguration Events

When a node exits or joins the cluster, the
cluster is said to suffer a reconfiguration
Node Join
A new node has made its presence felt to the
cluster monitor
Voluntary Exit
This happens when a node makes a planned shutdown
Involuntary Node Exit
This happens when a node fails

8
Cascaded Reconfiguration
9
Cascaded Reconfiguration

Stall delivery of the later reconfiguration until
the earlier recovery is complete
Deliver the reconfiguration and let the
application sort it out

10
Quick Rejoin

It is possible for a node to exit and rejoin the
cluster very quickly
This can lead to a very nasty situation in which
a node has lost its in-memory state, but the
other nodes cannot detect this
A good cluster monitor will make sure that if a
node exits and joins, there will be separate
reconfiguration events so that each node will
know that a node left and then joined back
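
The sketch below shows one way a monitor could guarantee separate exit and join events after a quick rejoin. It assumes each node sends an incarnation number (for example a boot counter) with its heartbeat; that mechanism, and all names used here, are hypothetical and are not described in the presentation.

```c
enum event_type { EV_EXIT, EV_JOIN };

struct event {
    enum event_type type;
    int node_id;
};

/* What the monitor currently believes about one node. */
struct node_view {
    int is_member;
    unsigned incarnation;   /* assumed: changes every time the node restarts */
};

/* Process a heartbeat. Writes up to two events into out and returns how
 * many were written. A changed incarnation on a current member yields an
 * EXIT followed by a JOIN, never a silent continuation. */
int on_heartbeat(struct node_view *v, int node_id, unsigned incarnation,
                 struct event out[2])
{
    int n = 0;
    if (v->is_member && v->incarnation != incarnation) {
        out[n].type = EV_EXIT;
        out[n].node_id = node_id;
        n++;
        v->is_member = 0;
    }
    if (!v->is_member) {
        out[n].type = EV_JOIN;
        out[n].node_id = node_id;
        n++;
        v->is_member = 1;
        v->incarnation = incarnation;
    }
    return n;
}
```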

11
Cluster Messaging

Typed Message
Synchronous Message
Unicast and Broadcast Message
Reliable Message
Message Handler
Heartbeat Handler
Node Failure Handling
Addressing

27
Process Scheduling

Deals with scheduling, which is concerned with
when to switch and which process to choose
Must fulfill several conflicting objectives: fast
process response time, good throughput for
background jobs, avoidance of process starvation,
reconciliation of the needs of low- and
high-priority processes, and so on
The set of rules used to determine when and how
to select a new process to run is called
scheduling policy

28
Process Scheduling

Linux scheduling is based on the time-sharing
technique
In Linux, process priority is dynamic: the
scheduler increases or decreases the priority of
processes over time
Three classes of processes
Interactive processes interact constantly with
their users, and therefore spend a lot of time
waiting for keypresses and mouse operations
Batch processes do not need user interaction, and
hence they often run in the background
Real-time processes have very stringent
scheduling requirements. They should never be
blocked by lower-priority processes and should
have a short guaranteed response time with
minimum variance

29
Process Scheduling

nice()
Change the priority of a conventional process
getpriority()
Get the maximum priority of a group of
conventional processes
setpriority()
Set the priority of a group of conventional
processes
sched_getscheduler()
Get the scheduling policy of a process
sched_setscheduler()
Set the scheduling policy and priority of a
process
sched_getparam()
Get the priority of a process
sched_setparam()
Set the priority of a process
sched_yield()
Relinquish the processor voluntarily without
blocking
sched_get_priority_min()
Get the minimum priority value for a policy
sched_get_priority_max()
Get the maximum priority value for a policy
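
A short usage sketch for some of these calls follows. It assumes a Linux or other POSIX system; the function name print_sched_info is made up for this example, and the output depends on the machine it runs on.

```c
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <sys/resource.h>

/* Print the scheduling policy and nice value of the calling process,
 * plus the priority range of the SCHED_FIFO policy.
 * Returns 0 on success, -1 on error. */
int print_sched_info(void)
{
    int policy = sched_getscheduler(0);          /* 0 means the calling process */
    if (policy == -1)
        return -1;

    errno = 0;                                   /* -1 is also a legal nice value */
    int nice_value = getpriority(PRIO_PROCESS, 0);
    if (nice_value == -1 && errno != 0)
        return -1;

    printf("policy=%d nice=%d\n", policy, nice_value);
    printf("SCHED_FIFO priority range: %d..%d\n",
           sched_get_priority_min(SCHED_FIFO),
           sched_get_priority_max(SCHED_FIFO));
    return 0;
}
```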

30
Process Scheduling

User-mode Linux processes are preemptive.
If a process enters the TASK_RUNNING state, the
kernel checks whether its dynamic priority is
greater than the priority of the currently
running process
If it is, the execution of current is interrupted
and the scheduler is invoked
A process may also be preempted when its time
quantum expires. When this occurs, the
need_resched field of the current process is set.
Example: a text editor and a compiler running at
the same time
The Linux kernel itself is not preemptive, which
makes it much simpler, since most synchronization
problems involving kernel data structures are
easily avoided

31
How Long Must a Quantum Last?

If the quantum duration is too short, the system
overhead caused by process switches becomes
excessively high
If the quantum is too long, processes no longer
appear to be executed concurrently
In some cases, a quantum duration that is too
long degrades the responsiveness of the system
The rule of thumb adopted by Linux is to choose a
duration as long as possible, while keeping good
system response time
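
The trade-off can be made concrete with a little arithmetic: if every quantum is followed by one process switch, the share of CPU time lost to switching is switch cost / (quantum + switch cost). The helper below is a sketch of that calculation; the function name and the microsecond figures in the note after it are invented for illustration.

```c
/* Share of CPU time lost to process switches, in parts per thousand.
 * quantum_us: length of one quantum, in microseconds.
 * switch_us:  cost of one process switch, in microseconds. */
long switch_overhead_permille(long quantum_us, long switch_us)
{
    return 1000 * switch_us / (quantum_us + switch_us);
}
```

With an assumed switch cost of 10 microseconds, a 90-microsecond quantum loses 10% of the CPU to switching, while a 9990-microsecond quantum loses about 0.1%.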

32
Scheduling Algorithm

Linux scheduling algorithm works by dividing the
CPU time into epochs.
In a single epoch, every process has a specified
time quantum whose duration is computed when the
epoch begins.
The time quantum value is the maximum CPU time
portion assigned to the process in that epoch
When a process has exhausted its time quantum, it
is preempted and replaced by another runnable
process
A process can be selected several times from the
scheduler in the same epoch, as long as its time
quantum has not been exhausted
The epoch ends when all runnable processes have
exhausted their quanta; in this case, the
scheduler algorithm recomputes the time-quantum
durations of all processes and a new epoch begins
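
The epoch mechanism can be modelled in a few lines of user-space C. This is a simplified sketch, not kernel code: it assumes every process is always runnable, that each new epoch resets the counter to the base quantum, and that the process with the most ticks left is chosen. The names proc, pick and run_tick are made up for the example.

```c
struct proc {
    int base;      /* base time quantum, in ticks (assumed > 0) */
    int counter;   /* ticks left in the current epoch */
};

/* Index of the process with the most ticks left, or -1 if every
 * process has exhausted its quantum. */
int pick(const struct proc p[], int n)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (p[i].counter > 0 && (best < 0 || p[i].counter > p[best].counter))
            best = i;
    return best;
}

/* Simulate one timer tick. Starts a new epoch (and increments *epochs)
 * when all quanta are used up. Returns the index of the process that ran. */
int run_tick(struct proc p[], int n, int *epochs)
{
    int best = pick(p, n);
    if (best < 0) {
        for (int i = 0; i < n; i++)
            p[i].counter = p[i].base;   /* recompute the quanta */
        (*epochs)++;
        best = pick(p, n);
    }
    if (best >= 0)
        p[best].counter--;
    return best;
}
```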

33
Scheduling Algorithm

Each process has a base time quantum, which is
the time-quantum value assigned by the scheduler
to the process if it has exhausted its quantum in
the previous epoch
The users can change the base time quantum of
their processes by using the nice() and
setpriority() system calls
The INIT_TASK macro sets the value of the initial
time quantum of process 0 (swapper) to
DEF_COUNTER
#define DEF_COUNTER (10 * HZ / 100)

34
Scheduling Algorithm

To select a process to run, the Linux scheduler
must consider the priority of each process
Static priority
Dynamic priority
- Sum of the base time quantum and of the number
of ticks of CPU time left to the process before
its quantum expires in the current epoch
There is always at least one runnable process:
the swapper kernel thread, which has PID 0 and
executes only when the CPU cannot execute any
other process

35
Data Structures used by the Scheduler

need_resched
A flag checked by ret_from_sys_call() to decide
whether to invoke the schedule() function
policy
Scheduling class: SCHED_FIFO, SCHED_RR, or
SCHED_OTHER
rt_priority
The static priority of a real-time process;
valid priorities range between 1 and 99. The
static priority of a conventional process must be
set to 0
counter
The number of ticks of CPU time left to the
process before its quantum expires; when a new
epoch begins, the field contains the time-quantum
duration of the process
nice
Determines the length of the process time quantum
when a new epoch begins
cpus_allowed
A bit mask specifying the CPUs on which the
process is allowed to run.
In the 80x86 architecture, the maximum number of
processors is set to 32.
cpus_runnable
A bit mask specifying the CPU that is executing
the process, if any
processor
Index of the CPU that is executing the process,
if any
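
The two bit masks are typically combined with a single AND to decide whether a process may be picked on a given CPU. The sketch below uses a trimmed-down, hypothetical struct rather than the real task_struct, and it assumes that cpus_runnable has all bits set while the process is not running on any CPU.

```c
#include <stdbool.h>

/* Hypothetical subset of the per-process scheduling fields. */
struct sched_fields {
    unsigned long cpus_allowed;    /* bit i set: process may run on CPU i */
    unsigned long cpus_runnable;   /* assumed: all ones if not running,
                                      otherwise 1 << (executing CPU) */
    int processor;                 /* index of the executing CPU, if any */
};

/* True if the process may be selected to run on the given CPU. */
bool can_schedule_on(const struct sched_fields *p, int cpu)
{
    return (p->cpus_runnable & p->cpus_allowed & (1UL << cpu)) != 0;
}
```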

36
Schedule() Function

Its objective is to find a process in the
runqueue list and then assign the CPU to it
It is invoked, directly or in a lazy way, by
several kernel routines

37
Direct Invocation

The scheduler is invoked directly when the
current process must be blocked right away
because the resource it needs is not available
1. Inserts current on the proper wait queue
2. Changes the state of current either to
TASK_INTERRUPTIBLE or to TASK_UNINTERRUPTIBLE
3. Invokes schedule()
4. Checks whether the resource is available; if
not, goes to step 2
5. Once the resource is available, removes
current from the wait queue
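
The five steps can be traced in a small user-space simulation. This is only a model of the control flow: schedule_stub stands in for the real schedule(), pretends another process frees the resource on its second call, and then wakes the waiter. Every name in the block is an assumption made for the example.

```c
enum state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE };

struct task {
    enum state state;
    int on_wait_queue;
};

int resource_available = 0;
int schedule_calls = 0;

/* Stand-in for schedule(): other processes run, and on the second call
 * one of them releases the resource. The waiter is then woken up. */
void schedule_stub(struct task *current)
{
    schedule_calls++;
    if (schedule_calls >= 2)
        resource_available = 1;
    current->state = TASK_RUNNING;
}

void wait_for_resource(struct task *current)
{
    current->on_wait_queue = 1;                 /* step 1 */
    do {
        current->state = TASK_UNINTERRUPTIBLE;  /* step 2 */
        schedule_stub(current);                 /* step 3 */
    } while (!resource_available);              /* step 4 */
    current->on_wait_queue = 0;                 /* step 5 */
}
```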

38
Lazy invocation

The scheduler can also be invoked in a lazy way
by setting the need_resched field of current to 1
Since a check on the value of this field is
always made before resuming the execution of a
User Mode process, schedule() will definitely
be invoked at some time in the near future
Lazy invocation is performed in these cases
When current has used up its quantum of CPU time
When a process is woken up and its priority is
higher than that of the current process
When a sched_setscheduler() or sched_yield()
system call is issued
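
The first of these cases, quantum expiry, can be modelled as follows: the timer tick only sets a flag, and the actual call to the scheduler is deferred until the return to User Mode. The function names and the stub scheduler are hypothetical and exist only to show the deferred check.

```c
struct task {
    int need_resched;   /* set to 1 to request a reschedule */
    int counter;        /* ticks left in the quantum */
};

int schedule_calls = 0;

/* Stand-in for schedule(): counts invocations and clears the flag. */
void schedule_stub(struct task *current)
{
    schedule_calls++;
    current->need_resched = 0;
}

/* Timer tick: when the quantum runs out, only the flag is set. */
void timer_tick(struct task *current)
{
    if (current->counter > 0 && --current->counter == 0)
        current->need_resched = 1;
}

/* Check made just before resuming execution in User Mode. */
void return_to_user_mode(struct task *current)
{
    if (current->need_resched)
        schedule_stub(current);
}
```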

39
Actions performed by schedule() before a process
switch

The goal of the schedule() function is to replace
the currently executing process with another one
Its key outcome is to set a local variable called
next so that it points to the descriptor of the
process selected to replace current
If no runnable process in the system has priority
greater than the priority of current, at the end,
next coincides with current and no process switch
takes place

40
Actions performed by schedule() before a process
switch

The schedule() function starts by initializing a
few local variables
prev = current;
this_cpu = prev->processor;
sched_data = &aligned_data[this_cpu];
It makes sure that prev doesn't hold the global
kernel lock, and then reenables the local
interrupts
if (prev->lock_depth >= 0)
    spin_unlock(&kernel_flag);
release_irqlock(this_cpu);
__sti();

41
Actions performed by schedule() before a process
switch

A check is made to determine whether prev is a
Round Robin real-time process that has exhausted
its quantum.
If so, schedule() assigns a new quantum to prev
and puts it at the bottom of the runqueue list
spin_lock_irq(&runqueue_lock);
if (prev->policy == SCHED_RR && !prev->counter) {
    prev->counter = (20 - prev->nice) / 4 + 1;
    move_last_runqueue(prev);
}
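
To see what quantum the listing assigns, the formula can be evaluated on its own. The sketch assumes the expression in the listing is (20 - prev->nice) / 4 + 1 with the result in ticks, and that nice lies in the usual range of -20 to 19; the function name rr_quantum is made up.

```c
/* New quantum, in ticks, for a Round Robin process with the given nice
 * value, assuming the formula (20 - nice) / 4 + 1. */
int rr_quantum(int nice)
{
    return (20 - nice) / 4 + 1;
}
```

Under these assumptions a process with nice 0 gets 6 ticks, nice -20 gets 11 ticks, and nice 19 gets 1 tick.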

42
Actions performed by schedule() before a process
switch

Examines the state of prev
if (prev->state == TASK_INTERRUPTIBLE &&
    signal_pending(prev))
    prev->state = TASK_RUNNING;
if (prev->state != TASK_RUNNING)
    del_from_runqueue(prev);
prev->need_resched = 0;
repeat_schedule:
next = init_tasks[this_cpu];
c = -1000;
list_for_each(tmp, &runqueue_head) {
    p = list_entry(tmp, struct task_struct,
                   run_list);
    if (p->cpus_runnable & p->cpus_allowed &
        (1 << this_cpu)) {
        int weight = goodness(p, this_cpu,
                              prev->active_mm);
        if (weight > c)
            c = weight, next = p;
    }
}
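
The selection loop can be exercised outside the kernel with a simplified model. In the sketch below the runqueue is a plain array, the task struct is a hypothetical subset of the real descriptor, and goodness() is reduced to an assumed form for conventional processes: 0 when the quantum is exhausted, otherwise counter + 20 - nice. Real-time processes and the bonuses for sharing a CPU or an address space are left out.

```c
#include <stddef.h>

/* Hypothetical, reduced process descriptor. */
struct task {
    int counter;                  /* ticks left in the current epoch */
    int nice;                     /* -20 (high priority) to 19 (low priority) */
    unsigned long cpus_allowed;   /* bit i set: may run on CPU i */
};

/* Simplified goodness for conventional processes (assumed form). */
int goodness(const struct task *p)
{
    if (p->counter == 0)
        return 0;
    return p->counter + 20 - p->nice;
}

/* Pick the best candidate for this_cpu; fall back to the idle task,
 * as the listing does with init_tasks[this_cpu] and c = -1000. */
const struct task *pick_next(const struct task *runqueue, size_t n,
                             const struct task *idle, int this_cpu)
{
    const struct task *next = idle;
    int c = -1000;

    for (size_t i = 0; i < n; i++) {
        const struct task *p = &runqueue[i];
        if (p->cpus_allowed & (1UL << this_cpu)) {
            int weight = goodness(p);
            if (weight > c) {
                c = weight;
                next = p;
            }
        }
    }
    return next;
}
```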
