1
Scheduling in Linux and Windows 2000
  • CS 431
  • Sanjay Kumar (01D07041)
  • Sanjay Kumar Ram (01D07042)

2
  • Linux kernel
  • Linux process
  • Scheduling policy
  • Data Structures and Scheduling Algorithm in
    Uniprocessor System
  • Scheduling algorithm for multiprocessor
  • Scheduling in Windows 2000

3
Linux Kernel
Kernel subsystem overview
4
Linux Processes
  • Three classes of processes
  • Interactive processes
  • Batch processes
  • Real-time processes
  • Linux processes have the following states
  • Running
  • Waiting
  • Stopped
  • Zombie

5
Scheduling Policy
  • The Linux kernel is non-preemptive, but processes are preemptible
  • The set of rules used to determine when and how a new process is selected to run is called the scheduling policy
  • Linux scheduling is based on the time-sharing technique - several processes are allowed to run "concurrently"
  • CPU time is roughly divided into "slices"
  • The scheduling policy is also based on ranking processes according to their priority
  • Static priority - assigned by users to real-time processes, ranges from 1 to 99, and is never changed by the scheduler
  • Dynamic priority - the sum of the base time quantum (the base priority of the process) and of the number of ticks of CPU time left to the process before its quantum expires in the current epoch (see the sketch below)
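  • A minimal sketch of that sum (illustrative C, not kernel source):

      /* Illustrative sketch: the dynamic priority of a conventional process
       * is its base time quantum (priority) plus the ticks it has left. */
      struct task_sketch {
          int priority;   /* base time quantum / base priority */
          int counter;    /* CPU ticks left before the quantum expires */
      };

      static int dynamic_priority(const struct task_sketch *p)
      {
          return p->priority + p->counter;
      }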

6
Scheduling Policy (contd.)
  • Process Preemption
  • When the dynamic priority of the currently running process is lower than that of a process waiting in the runqueue
  • A process may also be preempted when its time
    quantum expires
  • How Long Must a Quantum Last?
  • Quantum duration too short - system overhead
    caused by task switching becomes excessively high
  • Quantum duration too long - processes no longer
    appear to be executed concurrently, degrades the
    response time of interactive applications, and
    degrades the responsiveness of the system
  • The rule of thumb adopted by Linux is to choose a duration as long as possible, while keeping good system response time

7
Data Structures and Scheduling Algorithm in
Uniprocessor System
  • The Linux scheduling algorithm works by dividing
    the CPU time into epochs
  • In a single epoch, every process has a specified
    time quantum
  • epoch ends when all runnable processes have
    exhausted their quantum
  • Each process has a base time quantum
  • the time-quantum value assigned by the scheduler
    to the process if it has exhausted its quantum in
    the previous epoch
  • users can change the base time quantum of their
    processes by using the nice( ) and setpriority( )
    system calls
  • The INIT_TASK macro sets the value of the base
    time quantum of process 0 (swapper) to
    DEF_PRIORITY, which is approx. 200ms

8
Data Structures and Scheduling Algorithm in
Uniprocessor System (contd.)
  • The first entry in the process table is the
    special init process
  • task_struct - represents the state of every task running in the system
  • scheduling policy - SCHED_OTHER, SCHED_FIFO, SCHED_RR
  • a field for the process state: running, waiting, stopped, or zombie
  • a field that indicates the process's priority
  • counter - a field that holds the number of clock ticks left in the current quantum
  • a doubly linked list (next_task and prev_task) - to keep track of all executing processes
  • mm_struct - contains a process's memory-management information
  • process ID and group ID are also stored
  • fs_struct - file-system-specific process data
  • fields that hold timing information
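  • A heavily simplified sketch of these fields (illustrative only; the real task_struct is much larger and laid out differently):

      /* Illustrative sketch only - not the real task_struct layout. */
      struct task_struct_sketch {
          volatile long state;                  /* running, waiting, stopped, zombie */
          unsigned long policy;                 /* SCHED_OTHER, SCHED_FIFO, SCHED_RR */
          long priority;                        /* base priority / base time quantum */
          long counter;                         /* clock ticks left in this epoch */
          struct task_struct_sketch *next_task; /* doubly linked list of all tasks */
          struct task_struct_sketch *prev_task;
          struct mm_struct *mm;                 /* memory-management information */
          struct fs_struct *fs;                 /* file-system specific data */
          int pid, pgrp;                        /* process ID and group ID */
          unsigned long utime, stime;           /* timing information */
      };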

9
Data Structures and Scheduling Algorithm in
Uniprocessor System (contd.)
  • Each time the scheduler is run it does the
    following
  • Kernel work - scheduler runs the bottom half
    handlers and processes the scheduler task queue
  • Current process
  • current process must be processed before another
    process can be selected to run
  • If the scheduling policy of the current process is round robin, then it is put at the back of the run queue
  • If the task is INTERRUPTIBLE and it has received
    a signal since the last time it was scheduled
    then its state becomes RUNNING
  • If the current process has timed out, then its
    state becomes RUNNING
  • If the current process is RUNNING then it will
    remain in that state
  • Processes that were neither RUNNING nor
    INTERRUPTIBLE are removed from the run queue

10
Data Structures and Scheduling Algorithm in
Uniprocessor System (contd.)
  • Process Selection
  • most deserving process is selected by the
    scheduler
  • real time processes are given higher priority
    than ordinary processes
  • when several processes have the same priority,
    the one nearest the front of the run queue is
    chosen
  • when a new process is created, the number of ticks left to the parent is split in two halves, one for the parent and one for the child (see the sketch after this list)
  • priority and counter fields are used both to
    implement time-sharing and to compute the process
    dynamic priority
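  • The fork-time split mentioned above can be sketched as follows (a simplification, not the kernel's exact fork code):

      /* Sketch: on fork, the parent's remaining ticks are split between
       * parent and child, so forking buys no extra CPU time in the epoch. */
      struct task_sketch { int counter; };

      static void split_ticks_on_fork(struct task_sketch *parent,
                                      struct task_sketch *child)
      {
          child->counter = (parent->counter + 1) >> 1; /* child gets half, rounded up */
          parent->counter >>= 1;                       /* parent keeps the other half */
      }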

11
Data Structures and Scheduling Algorithm in
Uniprocessor System (contd.)
  • schedule( ) Function
  • implements the scheduler.
  • finds a process in the runqueue list and then
    assigns the CPU to it
  • Direct invocation
  • 1. Inserts current in the proper wait queue
  • 2. Changes the state of current either to
    TASK_INTERRUPTIBLE or to TASK_UNINTERRUPTIBLE
  • 3. Invokes schedule( )
  • 4. Checks if the resource is available; if not, goes to step 2
  • 5. Once the resource is available, removes
    current from the wait queue
  • Lazy invocation
  • When current has used up its quantum of CPU time - this is done by the update_process_times( ) function
  • When a process is woken up and its priority is higher than that of the current process - this is performed by the reschedule_idle( ) function, which is invoked by the wake_up_process( ) function
  • When a sched_setscheduler( ) or sched_yield( ) system call is issued
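  • The direct-invocation steps 1-5 above follow the kernel's usual wait-queue pattern; a hedged kernel-context sketch, where resource_wq and resource_available() are placeholder names:

      /* Kernel-context sketch of direct invocation of schedule(). */
      static void wait_for_resource(void)
      {
          DECLARE_WAITQUEUE(wait, current);

          add_wait_queue(&resource_wq, &wait);       /* 1. enter the wait queue   */
          for (;;) {
              set_current_state(TASK_INTERRUPTIBLE); /* 2. mark current as asleep */
              if (resource_available())              /* 4. recheck the condition  */
                  break;
              schedule();                            /* 3. hand the CPU away      */
          }
          set_current_state(TASK_RUNNING);           /* 5. resource is available: */
          remove_wait_queue(&resource_wq, &wait);    /*    leave the wait queue   */
      }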

12
Data Structures and Scheduling Algorithm in
Uniprocessor System (contd.)
  • Actions performed by schedule( )
  • Before actually scheduling a process, the
    schedule( ) function starts by running the
    functions left by other kernel control paths in
    various queues
  • The function then executes all active unmasked
    bottom halves
  • Scheduling
  • value of current is saved in the prev local
    variable and the need_resched field of prev is
    set to 0
  • a check is made to determine whether prev is a
    Round Robin real-time process. If so, schedule( )
    assigns a new quantum to prev and puts it at the
    bottom of the runqueue list
  • if prev's state is TASK_INTERRUPTIBLE and it has pending signals, the function wakes up the process
  • schedule( ) repeatedly invokes the goodness( )
    function on the runnable processes to determine
    the best candidate
  • when the counter field of every runnable process has reached zero, schedule( ) assigns to all existing processes a fresh quantum, whose duration is the sum of the priority value plus half the counter value (see the sketch below)
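  • The epoch-end recalculation just described can be sketched like this (2.4-era style, not the exact kernel source):

      /* Kernel-context sketch: each task's new quantum is its priority plus
       * half of any unused counter, so tasks that slept during the previous
       * epoch receive a slightly larger share in the next one. */
      static void recalculate_quanta(void)
      {
          struct task_struct *p;

          for_each_task(p)
              p->counter = (p->counter >> 1) + p->priority;
      }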

13
Data Structures and Scheduling Algorithm in
Uniprocessor System (contd.)
  • goodness( ) function
  • identify the best candidate among all processes
    in the runqueue list.
  • It receives as input parameters prev (the
    descriptor pointer of the previously running
    process) and p (the descriptor pointer of the
    process to evaluate)
  • The integer value c returned by goodness( )
    measures the "goodness" of p and has the
    following meanings
  • c = -1000: p must never be selected; this value is returned when the runqueue list contains only init_task
  • c = 0: p has exhausted its quantum. Unless p is the first process in the runqueue list and all runnable processes have also exhausted their quantum, it will not be selected for execution.
  • 0 < c < 1000: p is a conventional process that has not exhausted its quantum; a higher value of c denotes a higher level of goodness.
  • c > 1000: p is a real-time process; a higher value of c denotes a higher level of goodness.
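  • A simplified sketch of goodness() consistent with these cases (illustrative, not the exact kernel source):

      /* Kernel-context sketch: real-time tasks score above 1000, conventional
       * tasks score counter + priority, exhausted tasks score 0. */
      static int goodness_sketch(struct task_struct *p, struct task_struct *prev)
      {
          int weight;

          if (p->policy != SCHED_OTHER)     /* SCHED_FIFO or SCHED_RR */
              return 1000 + p->rt_priority; /* real-time always wins */

          weight = p->counter;              /* ticks left in this epoch */
          if (weight == 0)
              return 0;                     /* quantum exhausted */

          if (p->mm == prev->mm)
              weight += 1;                  /* small bonus: no address-space switch */

          return weight + p->priority;      /* conventional: 0 < c < 1000 */
      }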

14
Scheduling algorithm for multiprocessor
  • The Linux scheduler is defined in kernel/sched.c.
  • The new scheduler has been designed to accomplish specific goals
  • Implement fully O(1) scheduling: every algorithm in the new scheduler completes in constant time, regardless of the number of running processes or any other input.
  • Implement perfect SMP scalability: each processor has its own locking and individual runqueue.
  • Implement improved SMP affinity: naturally attempt to group tasks on a specific CPU and continue to run them there, migrating tasks from one CPU to another only to resolve imbalances in runqueue sizes.

15
Scheduling algorithm for multiprocessor (contd.)
  • Provide good interactive performance: even during considerable system load, the system should react and schedule interactive tasks immediately.
  • Provide fairness: no process should find itself starved of timeslice for any reasonable amount of time. Likewise, no process should receive an unfairly high amount of timeslice.
  • Optimize for the common case of only 1-2 runnable
    processes, yet scale well to multiple processors
    each with many processes
  • Runqueues
  • It is the basic data structure in the scheduler.
  • It is defined in kernel/sched.c as struct
    runqueue.
  • It is the list of runnable processes on a given processor; there is one runqueue per processor. Each runnable process is on exactly one runqueue.
  • It additionally contains per-processor scheduling
    information.
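  • An illustrative sketch of the per-processor runqueue (the real struct runqueue in kernel/sched.c has many more fields):

      /* Simplified sketch - not the real struct runqueue definition. */
      struct prio_array;                   /* priority arrays, sketched on later slides */

      struct runqueue {
          spinlock_t lock;                 /* per-runqueue lock (SMP scalability) */
          unsigned long nr_running;        /* number of runnable tasks on this CPU */
          struct task_struct *curr;        /* task currently running on this CPU */
          struct prio_array *active;       /* tasks that still have timeslice left */
          struct prio_array *expired;      /* tasks whose timeslice has run out */
      };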

16
Scheduling algorithm for multiprocessor (contd.)
  • A group of macros is used to obtain specific
    runqueues.
  • cpu_rq(processor) returns a pointer to the
    runqueue associated with the given processor.
  • this_rq() returns the runqueue of the current
    processor.
  • task_rq(task) returns a pointer to the runqueue
    on which the given task is queued.
  • this_rq_lock() locks the current runqueue
  • rq_unlock(struct runqueue *rq) unlocks the given runqueue.
  • double_rq_lock() and double_rq_unlock() lock and unlock a pair of runqueues while avoiding deadlock (see the sketch below)
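  • A sketch of how locking two runqueues in a fixed order avoids deadlock (illustrative, building on the runqueue sketch above):

      /* Sketch: always take the two locks in the same (address) order, so
       * two CPUs locking the same pair can never deadlock against each other. */
      static void double_rq_lock_sketch(struct runqueue *rq1, struct runqueue *rq2)
      {
          if (rq1 == rq2) {
              spin_lock(&rq1->lock);       /* same runqueue: lock it once */
          } else if (rq1 < rq2) {
              spin_lock(&rq1->lock);       /* lower address first */
              spin_lock(&rq2->lock);
          } else {
              spin_lock(&rq2->lock);
              spin_lock(&rq1->lock);
          }
      }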

17
Scheduling algorithm for multiprocessor (contd.)
  • The Priority Arrays
  • Each runqueue contains two priority arrays, the
    active and the expired array.
  • Each priority array contains one queue of runnable processes per priority level.
  • Priority arrays are the data structure that
    provide O(1) scheduling.
  • They also contain a priority bitmap used to efficiently discover the highest-priority runnable task in the system.
  • Each priority array contains a bitmap field that
    has at least one bit for every priority on the
    system.

18
Scheduling algorithm for multiprocessor (contd.)
  • Finding the highest priority task on the system
    is therefore only a matter of finding the first
    set bit in the bitmap.
  • Each priority array also contains an array called
    queue of struct list_head queues, one queue for
    each priority.
  • The Linux O(1) scheduler algorithm.
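  • A sketch of the priority array and the first-set-bit lookup (illustrative; names approximate kernel/sched.c):

      /* Kernel-context sketch of a priority array and the O(1) selection step. */
      #define MAX_PRIO 140                 /* 100 real-time + 40 normal levels */

      struct prio_array {
          int nr_active;                                 /* tasks held in this array */
          unsigned long bitmap[BITS_TO_LONGS(MAX_PRIO)]; /* one bit per priority */
          struct list_head queue[MAX_PRIO];              /* one list per priority */
      };

      static struct task_struct *pick_next_task_sketch(struct prio_array *array)
      {
          int idx = find_first_bit(array->bitmap, MAX_PRIO);

          if (idx >= MAX_PRIO)
              return NULL;                 /* this array has no runnable task */
          /* head of the highest-priority non-empty queue */
          return list_entry(array->queue[idx].next, struct task_struct, run_list);
      }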

19
Scheduling algorithm for multiprocessor (contd.)
  • Recalculating Timeslices
  • The new Linux scheduler alleviates the need for a
    recalculate loop. Instead, it maintains two
    priority arrays for each processor
  • 1. active array - contains all the tasks in the associated runqueue that have timeslice left.
  • 2. expired array - contains all the tasks in the associated runqueue that have exhausted their timeslice.
  • When each task's timeslice reaches zero, its
    timeslice is recalculated before it is moved to
    the expired array.
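  • A sketch of that per-task expiry step (illustrative; dequeue_task(), enqueue_task() and task_timeslice() mirror helpers in kernel/sched.c):

      /* Kernel-context sketch: when a task's timeslice reaches zero, give it
       * a new timeslice immediately and park it on the expired array. */
      static void expire_task_sketch(struct runqueue *rq, struct task_struct *p)
      {
          dequeue_task(p, rq->active);       /* remove from the active array */
          p->time_slice = task_timeslice(p); /* recalculate the timeslice now */
          enqueue_task(p, rq->expired);      /* wait here until the array swap */
      }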

20
Scheduling algorithm for multiprocessor (contd.)
  • Recalculating all the timeslices is then as
    simple as just switching the active and expired
    arrays. Because the arrays are accessed only via
    pointer, switching them is as fast as swapping
    two pointers.
  • This swap is a key feature of the new O(1)
    scheduler. Instead of recalculating each
    process's priority and timeslice all the time,
    the O(1) scheduler performs a simple two-step
    array swap.
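  • The swap itself can be sketched in a few lines (illustrative):

      /* Sketch: "recalculating all timeslices" is just exchanging two pointers. */
      static void switch_arrays_sketch(struct runqueue *rq)
      {
          struct prio_array *tmp = rq->active;

          rq->active = rq->expired;        /* expired tasks, already holding fresh
                                              timeslices, become the active set */
          rq->expired = tmp;               /* old active array is reused as expired */
      }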

21
Scheduling algorithm for multiprocessor (contd.)
  • Sleeping and waking up

22
Scheduling algorithm for multiprocessor (contd.)
  • The load balancer

23
Scheduling in Windows 2000
  • Windows 2000 scheduler
  • Concern is with threads not processes
  • i.e. threads are scheduled, not processes
  • Threads have priorities 0 through 31

(Figure: thread priority levels - 31 down to 16 are the 16 real-time levels, 15 down to 1 are the 15 variable levels, priority 0 is used by the zero page thread, and the idle thread(s) run at the bottom.)
24
Manual Process Priority Adjustments
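  • As a hedged illustration (not part of the original slides), the same adjustments can be made programmatically with the documented Win32 calls SetPriorityClass() and SetThreadPriority():

      #include <windows.h>
      #include <stdio.h>

      int main(void)
      {
          /* Raise the process priority class, then bump the current thread within
           * that class; together they determine the thread's base priority. */
          if (!SetPriorityClass(GetCurrentProcess(), ABOVE_NORMAL_PRIORITY_CLASS))
              printf("SetPriorityClass failed: %lu\n", GetLastError());

          if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST))
              printf("SetThreadPriority failed: %lu\n", GetLastError());

          printf("class=0x%lx thread=%d\n",
                 GetPriorityClass(GetCurrentProcess()),
                 GetThreadPriority(GetCurrentThread()));
          return 0;
      }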
25
Thread Scheduling
  • Strictly priority driven
  • 32 queues (FIFO lists) of ready threads
  • One queue for each priority level
  • Queues are common to all CPUs
  • When thread becomes ready, it
  • either runs immediately, or
  • is inserted at the tail end of the Ready queue
    for its current priority
  • On a uniprocessor, highest priority Ready thread
    always runs
  • Time-sliced, round-robin within a priority level

26
Thread Scheduling Multiprocessor Issues
  • On a multiprocessor, the n highest-priority threads will always run (subject to Affinity, explained later)
  • No attempt is made to share processors fairly
    among processes, only among threads
  • Tries to keep threads on the same CPU

27
Scheduling Scenarios
  • Preemption
  • A thread becomes ready at a higher priority than
    the running thread
  • Lower-priority thread is preempted
  • Preempted thread goes to the tail of its Ready
    queue

28
Scheduling Scenarios (contd.)
  • Voluntary switch
  • When the running thread gives up the CPU
  • Waiting on a dispatcher object
  • Termination
  • Explicit lowering of priority
  • Schedule the thread at the head of the next
    non-empty Ready queue
  • Running thread experiences quantum end
  • Priority is decremented unless at thread base
    priority
  • Thread goes to tail of Ready queue for its new
    priority
  • May continue running if no equal or higher-priority threads are Ready, i.e. it gets the new quantum

29
Priority Adjustments
  • Threads in dynamic classes can have priority
    adjustments applied to them (Boost or Decay)
  • Idle, Below Normal, Normal, Above Normal and High
  • Carried out upon wait completion
  • Used to avoid CPU starvation
  • No automatic adjustments in real-time class
  • Priority 16 and above
  • Scheduling is therefore predictable with
    respect to the real-time threads
  • Note though that this does not mean that there
    are absolute latency guarantees

30
Priority Boosting
  • Priority boost takes place after a wait
  • Occurs when a wait (usually I/O) is resolved
  • Slow devices / long waits - big boost
  • Fast devices / short waits - small boost
  • Boost is applied to the thread's base priority
  • Does not go over priority 15
  • Keeps I/O devices busy
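  • As a hedged aside (not from the original slides), a thread can opt out of this automatic boost with the documented Win32 call SetThreadPriorityBoost():

      #include <windows.h>
      #include <stdio.h>

      int main(void)
      {
          BOOL disabled = FALSE;

          /* TRUE disables dynamic priority boosting for the current thread. */
          if (!SetThreadPriorityBoost(GetCurrentThread(), TRUE))
              printf("SetThreadPriorityBoost failed: %lu\n", GetLastError());

          if (GetThreadPriorityBoost(GetCurrentThread(), &disabled))
              printf("dynamic boost disabled: %s\n", disabled ? "yes" : "no");
          return 0;
      }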

31
CPU Starvation
  • Balance Set Manager looks for starved threads
  • BSM is a thread running at priority 16, waking up
    once per second
  • Looks at threads that have been Ready for 4
    seconds or more
  • Boosts up to 10 Ready threads per pass
  • Special boost applied
  • Priority boosted to 15
  • Quantum is doubled
  • Does not apply in real-time range

32
Multiprocessor Support
  • By default, threads can run on any available
    processor
  • Soft affinity (introduced in NT 4.0)
  • Every thread has an ideal processor
  • When thread becomes ready
  • if ideal is idle, it runs there
  • else, if previous processor is idle, it runs
    there
  • else, may look at next thread, to run on its
    ideal processor
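  • As a hedged illustration (not from the original slides), a thread's ideal processor can also be set explicitly with the documented Win32 call SetThreadIdealProcessor():

      #include <windows.h>
      #include <stdio.h>

      int main(void)
      {
          /* Ask the scheduler to prefer CPU 0 for the current thread. */
          DWORD previous = SetThreadIdealProcessor(GetCurrentThread(), 0);

          if (previous == (DWORD)-1)
              printf("SetThreadIdealProcessor failed: %lu\n", GetLastError());
          else
              printf("previous ideal processor: %lu\n", previous);
          return 0;
      }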

33
Multiprocessor Support (contd.)
  • Hard affinity
  • Restricts thread to a subset of the available
    CPUs
  • Can lead to
  • threads getting less CPU time than they normally would
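  • As a hedged illustration (not from the original slides), hard affinity is set with the documented Win32 call SetThreadAffinityMask():

      #include <windows.h>
      #include <stdio.h>

      int main(void)
      {
          /* Restrict the current thread to CPU 0 only (bit mask 0x1). */
          DWORD_PTR previous = SetThreadAffinityMask(GetCurrentThread(), 0x1);

          if (previous == 0)
              printf("SetThreadAffinityMask failed: %lu\n", GetLastError());
          else
              printf("previous affinity mask: 0x%llx\n",
                     (unsigned long long)previous);
          return 0;
      }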

34
References
  • http://iamexwiwww.unibe.ch/studenten/schlpbch/linuxScheduling/LinuxScheduling.html
  • http://oopweb.com/OS/Documents/tlk/VolumeFrames.html
  • http://www.samspublishing.com/articles/article.asp?p=101760&rl=1
  • http://www.oreilly.com/catalog/linuxkernel/chapter/ch10.html
  • G. M. Candea and M. B. Jones, "Vassal: Loadable Scheduler Support for Multi-Policy Scheduling", Proceedings of the 2nd USENIX Windows NT Symposium, Seattle, Washington, August 3-4, 1998