CSC 660: Advanced OS - PowerPoint PPT Presentation

1
CSC 660 Advanced OS
  • Scheduling

2
Topics
  1. Basic Concepts
  2. Scheduling Policy
  3. The O(1) Scheduler
  4. Runqueues
  5. Priority Arrays
  6. Calculating Priorities and Timeslices.
  7. Scheduler Interrupts.
  8. Sleeping and Waking.
  9. The schedule() function
  10. Multiprocessor Scheduling
  11. Soft Realtime Scheduling

3
Basic Concepts
  • Scheduler
  • Selects a process to run and allocates CPU to it.
  • Provides semblance of multitasking on single CPU.
  • Scheduler is invoked when
  • Process blocks on an I/O operation.
  • A hardware interrupt occurs.
  • Process time slice expires.
  • Kernel thread yields to scheduler.

4
Types of Processes
  • CPU Bound
  • Spend most time on computations.
  • Example: computer algebra systems.
  • I/O Bound
  • Spend most time on I/O.
  • Example: word processor.
  • Mixed
  • Alternate CPU and I/O activity.
  • Example: web browser.

5
Alternating CPU and I/O Bursts
6
Scheduling Policy
  • Scheduler executes policy, determining
  • 1. When threads can execute.
  • 2. How long threads can execute.
  • 3. Where threads can execute.

7
Scheduling Policy Goals
  • Efficiency
  • Maximize amount of work accomplished.
  • Interactivity
  • Respond as quickly as possible to user.
  • Fairness
  • Don't allow any process to starve.

8
Which goal is most important?
  • Depends on the target audience
  • Desktop: interactivity
  • But kernel shouldn't spend all its time in
    context switches.
  • Server: efficiency
  • But should offer interactivity in order to serve
    multiple users.

9
Pre-2.6 Scheduler
  • O(n) algorithm at every process switch
  • 1. Scanned list of runnable processes.
  • 2. Computed priority of each task.
  • 3. Selected best task to run.

10
The O(1) Scheduler
  • Replacement for O(n) 2.4 scheduler.
  • All algorithms run in constant time.
  • New data structures: runqueues and priority
    arrays.
  • Performs work in small pieces.
  • Additional new features
  • Improved SMP scalability, including NUMA.
  • Better processor affinity.
  • SMT scheduling.

11
Runqueues
  • List of runnable processes on a processor.
  • Each runnable process is a member of precisely
    one runqueue.
  • Runqueue data
  • Lock to prevent concurrency problems.
  • Pointers to current and idle tasks.
  • Priority arrays which contain actual tasks.
  • Statistics

12
Runqueues
  • struct runqueue {
  •   spinlock_t lock;
  •   unsigned long nr_running;
  •   unsigned long long nr_switches;
  •   unsigned long expired_timestamp, nr_uninterruptible;
  •   unsigned long long timestamp_last_tick;
  •   task_t *curr, *idle;
  •   struct mm_struct *prev_mm;
  •   prio_array_t *active, *expired, arrays[2];
  •   int best_expired_prio;
  •   atomic_t nr_iowait;
  • };

13
Priority Arrays
  • Each runqueue contains 2 priority arrays
  • Active array
  • Expired array
  • Basis for O(1) performance
  • Scheduler always runs highest priority task.
  • Round robin for multiple equal priority tasks.
  • Priority array finds highest-priority task in an O(1) operation.
  • Using two arrays allows transitions between
    epochs by switching active and expired pointers.

14
Priority Arrays
  • struct prio_array {
  •   /* # of runnable tasks in array */
  •   unsigned int nr_active;
  •   /* bitmap: which priority levels contain tasks */
  •   unsigned long bitmap[BITMAP_SIZE];
  •   /* 1 list_head per priority (140) */
  •   struct list_head queue[MAX_PRIO];
  • };

15
Finding Highest Priority Task
  • Find first bit set in bitmap.
  • sched_find_first_bit()
  • Read corresponding queue[n]
  • If one process, give CPU to that one.
  • If multiple processes, round-robin schedule all
    processes in queue for that priority.
  • idx = sched_find_first_bit(array->bitmap);
  • queue = array->queue + idx;
  • next = list_entry(queue->next, task_t, run_list);
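
The lookup above can be sketched in user space. This is a simplified illustration rather than kernel code: the struct prio_bitmap type and the helper names are hypothetical, per-priority counters stand in for the per-priority task lists, and __builtin_ctzl is a GCC/Clang builtin assumed to be available. The scan visits a fixed number of words regardless of how many tasks are runnable, which is what makes the lookup O(1).

```c
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO 140
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for prio_array: one bit and one counter per priority. */
struct prio_bitmap {
    unsigned long bits[BITMAP_WORDS];
    unsigned int count[MAX_PRIO];
};

/* Record one more runnable task at this priority and set its bit. */
static void mark_runnable(struct prio_bitmap *a, int prio)
{
    a->count[prio]++;
    a->bits[prio / BITS_PER_WORD] |= 1UL << (prio % BITS_PER_WORD);
}

/* Remove one task; clear the bit when the priority level becomes empty. */
static void mark_done(struct prio_bitmap *a, int prio)
{
    if (a->count[prio] > 0 && --a->count[prio] == 0)
        a->bits[prio / BITS_PER_WORD] &= ~(1UL << (prio % BITS_PER_WORD));
}

/* Lowest set bit = numerically lowest = highest priority.
 * Returns MAX_PRIO when no task is runnable. */
static int find_first_prio(const struct prio_bitmap *a)
{
    for (size_t w = 0; w < BITMAP_WORDS; w++) {
        if (a->bits[w])
            return (int)(w * BITS_PER_WORD) + __builtin_ctzl(a->bits[w]);
    }
    return MAX_PRIO;
}
```

After marking priorities 120, 101 and 139 as runnable, find_first_prio() reports 101; once that level is emptied with mark_done(), it reports 120.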

16
What if no runnable task exists?
  • System runs the swapper task (PID 0).
  • Each CPU has its own swapper process.

17
Running out of Timeslice
  • Remove task from active priority array.
  • Calculate new priority and timeslice.
  • Add task to expired priority array.
  • Swap arrays when active array is empty.
  • array = rq->active;
  • if (unlikely(!array->nr_active)) {
  •   rq->active = rq->expired;
  •   rq->expired = array;
  •   ...
  • }
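
A minimal user-space sketch of the epoch switch, assuming a stripped-down runqueue that only counts tasks. The struct and field names follow the slides, but rq_init(), rq_expire_task() and rq_switch_epoch() are hypothetical helpers introduced for illustration. The point is that ending an epoch costs two pointer assignments, however many tasks are queued.

```c
/* Simplified: only the task count of each array is modelled. */
struct prio_array {
    unsigned int nr_active;   /* number of runnable tasks in this array */
};

struct runqueue {
    struct prio_array *active, *expired;
    struct prio_array arrays[2];
};

/* Start with all tasks in the active array. */
static void rq_init(struct runqueue *rq, unsigned int nr_tasks)
{
    rq->arrays[0].nr_active = nr_tasks;
    rq->arrays[1].nr_active = 0;
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* A task used up its time slice: move it from active to expired. */
static void rq_expire_task(struct runqueue *rq)
{
    if (rq->active->nr_active > 0) {
        rq->active->nr_active--;
        rq->expired->nr_active++;
    }
}

/* Swap the two arrays once the active one is empty.
 * Returns 1 if a new epoch started, 0 otherwise. */
static int rq_switch_epoch(struct runqueue *rq)
{
    struct prio_array *array = rq->active;

    if (array->nr_active != 0)
        return 0;
    rq->active = rq->expired;
    rq->expired = array;
    return 1;
}
```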

18
Static and Dynamic Priorities
  • Initial priority value called the nice value.
  • Set via the nice() system call.
  • Static priority is nice value + 120.
  • Stored in current->static_prio.
  • Ranges from 100 (highest) to 139 (lowest).
  • Scheduling based on dynamic priority.
  • Bonuses and penalties according to interactivity.
  • Stored in current->prio.
  • Calculated by effective_prio() function.

19
Dynamic Priority Policy
  • Increase priority of interactive processes.
  • Favor I/O-bound over CPU-bound.
  • Need heuristic for determining interactivity.
  • Use time spent sleeping vs. runnable time.
  • Sleep average
  • Stored in current->sleep_avg.
  • Incremented when task becomes runnable.
  • Decremented for each timer tick task runs.
  • Scaled to produce priority bonus ranging 0..10.

20
Calculating Priority
  • /* Scale sleep_avg to range 0..MAX_BONUS */
  • #define CURRENT_BONUS(p) \
  •   (NS_TO_JIFFIES((p)->sleep_avg) * MAX_BONUS / \
  •    MAX_SLEEP_AVG)
  • static int effective_prio(task_t *p)
  • {
  •   int bonus, prio;
  •   bonus = CURRENT_BONUS(p) - MAX_BONUS / 2;
  •   prio = p->static_prio - bonus;
  •   return prio;
  • }
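
To see the arithmetic with concrete numbers, here is a user-space sketch under stated assumptions: sleep_avg is expressed in milliseconds, MAX_SLEEP_AVG_MS is taken to be 1000, and the function names are hypothetical. The final clamp to the 100..139 range is not shown on the slide; it is included here so the result stays within the non-realtime priority range.

```c
#define MAX_BONUS        10
#define MAX_SLEEP_AVG_MS 1000   /* assumption: sleep average saturates at 1 s */
#define MAX_RT_PRIO      100    /* best non-realtime priority */
#define MAX_PRIO         140

/* Scale the sleep average to 0..MAX_BONUS. */
static int current_bonus(unsigned int sleep_avg_ms)
{
    if (sleep_avg_ms > MAX_SLEEP_AVG_MS)
        sleep_avg_ms = MAX_SLEEP_AVG_MS;
    return (int)(sleep_avg_ms * MAX_BONUS / MAX_SLEEP_AVG_MS);
}

/* Dynamic priority = static priority minus a bonus in -5..+5,
 * clamped to the non-realtime range 100..139. */
static int effective_prio_sketch(int static_prio, unsigned int sleep_avg_ms)
{
    int bonus = current_bonus(sleep_avg_ms) - MAX_BONUS / 2;
    int prio = static_prio - bonus;

    if (prio < MAX_RT_PRIO)
        prio = MAX_RT_PRIO;
    if (prio > MAX_PRIO - 1)
        prio = MAX_PRIO - 1;
    return prio;
}
```

For a default task (static priority 120), a sleep average of 0 gives dynamic priority 125 (a penalty of 5), 500 ms leaves it at 120, and 1000 ms gives 115 (a bonus of 5).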

21
Time Slices
  • Time slice duration critical to performance.
  • Too short: high overhead from context switches.
  • Too long: loss of apparent multitasking.
  • Interactive processes and time slices
  • Interactive processes have high priority.
  • Pre-empt CPU bound tasks on kbd/ptr interrupts.
  • Long time slices slow start of new tasks.

22
Calculating Timeslice
  • Initial Timeslice
  • On fork(), parent and child divide remaining time
    evenly.
  • Stored in current->time_slice.
  • Recalculating Timeslices
  • Time slice (ms) = (140 - static priority) x 20 if
    static priority < 120
  • Time slice (ms) = (140 - static priority) x 5 if
    static priority >= 120

Description   Nice   Static Pri   Time Slice
Highest        -20   100          800 ms
Default          0   120          100 ms
Lowest          19   139            5 ms
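
The timeslice rule and the table values can be checked with a short sketch. The function names are hypothetical, and the boundary case (static priority exactly 120) is assumed to use the x 5 branch, which is what the 100 ms default in the table implies.

```c
/* Static priority is the nice value (-20..19) shifted into 100..139. */
static int nice_to_static_prio(int nice)
{
    return 120 + nice;
}

/* Base time slice in milliseconds for a given static priority.
 * Priorities better than the default (< 120) scale four times faster. */
static unsigned int task_timeslice_ms(int static_prio)
{
    int scale = (static_prio < 120) ? 20 : 5;

    return (unsigned int)((140 - static_prio) * scale);
}
```

With these definitions, nice -20 yields 800 ms, nice 0 yields 100 ms, and nice 19 yields 5 ms, matching the table.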
23
Scheduler Interrupts
  • Scheduler interrupt: scheduler_tick()
  • Invoked every 1ms by a timer interrupt.
  • Decrements task's time slice.
  • If a higher priority task exists,
  • Higher priority task is given CPU.
  • Current task remains in TASK_RUNNING state.
  • If time slice expired,
  • Moved to expired priority array.
  • If highly interactive, may be re-inserted into
    active priority array.

24
Sleeping and Waking
  • Sleeping tasks are not in runqueues.
  • Require no CPU time until awakened.
  • Why sleep?
  • Waiting for I/O.
  • Waiting for other hardware events.
  • Waiting for a kernel semaphore.

25
Sleeping
  • DECLARE_WAITQUEUE(wait, current);
  • /* q is a wait queue, wait is a q entry */
  • add_wait_queue(q, &wait);
  • while (!condition) {
  •   set_current_state(TASK_INTERRUPTIBLE);
  •   if (signal_pending(current)) {
  •     /* Handle signal */
  •   }
  •   schedule();
  • }
  • set_current_state(TASK_RUNNING);
  • remove_wait_queue(q, &wait);

26
Waking
  • wake_up() wakes up tasks on event
  • Exclusive: only wakes up one task on waitqueue
  • Non-exclusive: wakes all tasks on waitqueue

[Figure: task state diagram with states TASK_RUNNING and TASK_INTERRUPTIBLE; transitions labeled add_wait_queue, Signal, and wake_up]
27
Multiprocessor Architectures
  • Classic
  • Memory shared by all CPUs.
  • Hyperthreading
  • Single CPU executing multiple on-chip threads.
  • NUMA
  • CPUs and RAM grouped in local nodes.
  • Reduces contention for accessing RAM.
  • Fast to access local RAM.
  • Slower to access remote RAM.

28
Multiprocessor Scheduling
  • Each CPU has own runqueue.
  • Scheduler selects tasks from local runqueue.
  • CPU cache more likely to still be hot.
  • Periodic checks to balance load across CPUs.
  • Called by rebalance_tick().
  • Loops over all scheduling domains.
  • Calls load_balance() if balance interval expired.

29
load_balance()
  1. Acquires this_rq->lock spin lock.
  2. Finds busiest CPU with > 1 process.
  3. If no busiest or current CPU is busiest,
    terminates.
  4. Obtains spin lock on busiest CPU.
  5. Pull tasks from busiest CPU to local runqueue.
  6. Releases locks.
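
Steps 2 and 3 can be illustrated with a simplified sketch. It compares raw runnable-task counts only, whereas the kernel's balancer works on scheduling domains and weighted loads, so find_busiest_cpu() and its parameters are hypothetical and only show the decision, not the locking or the actual task migration.

```c
/* Pick the CPU with the most runnable tasks, provided it has more than one.
 * Returns -1 when there is nothing to pull: no CPU has more than one
 * runnable task, or the local CPU is itself the busiest. */
static int find_busiest_cpu(const unsigned int *nr_running, int ncpus,
                            int this_cpu)
{
    int busiest = -1;
    unsigned int max_load = 1;   /* a candidate must exceed one task */

    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (nr_running[cpu] > max_load) {
            max_load = nr_running[cpu];
            busiest = cpu;
        }
    }
    if (busiest == this_cpu)
        return -1;
    return busiest;
}
```

With runqueue lengths {1, 4, 2}, CPU 0 would pull from CPU 1, while CPU 1 itself finds nothing to pull.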

30
move_tasks()
  • Searches for runnable tasks in expired priority array.
  • Then scans active priority array.
  • Calls pull_task() to move task if all of the following are true
  • Task not currently being executed.
  • Local CPU is in cpus_allowed bitmask.
  • At least one of the following is true
  • Local CPU is idle.
  • Multiple attempts to move processes have failed.
  • Process is not cache hot.

31
Realtime Scheduling
  • Hard Real-time
  • Guaranteed response within defined period.
  • Used for embedded systems, e.g. car engines.
  • Example: RealTime Application Interface (RTAI)
  • Soft Real-time
  • Best effort to meet scheduling constraints.
  • Used for multimedia applications.
  • Currently provided by Linux.
  • Improved by Realtime Preemption Patch.

32
Soft Realtime Scheduling
  • Scheduling Priorities
  • RT tasks have higher priority than any non-RT task.
  • RT priorities are static, ranging 1-99, not
    dynamic.
  • If RT tasks are runnable, no other tasks can run.
  • Scheduling Policies
  • SCHED_NORMAL (non-realtime)
  • SCHED_FIFO
  • SCHED_RR

33
Realtime Policies
  • SCHED_FIFO
  • First-in First-out real-time Scheduling
  • Process uses CPU until
  • It blocks or yields the CPU voluntarily.
  • A higher priority real-time process pre-empts it.
  • SCHED_RR
  • Round Robin real-time scheduling.
  • Process runs for time slice, then waits for other
    equal priority real-time processes in runqueue.

34
Realtime Process Replacement
  • Realtime processes replaced only when
  • Pre-empted by a higher-priority RT process.
  • Process performs a blocking operation.
  • Process is stopped or killed by a signal.
  • Process invokes sched_yield() system call.
  • SCHED_RR process has exhausted its time slice.

35
Realtime System Calls
  • Scheduler Policy
  • sched_setscheduler()
  • sched_getscheduler()
  • Priority
  • sched_getparam()
  • sched_setparam()
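
A short user-space sketch of these calls on Linux. The helper names policy_name(), query_self() and try_set_fifo() are hypothetical wrappers; the sched_* functions and SCHED_* constants come from <sched.h>, where SCHED_NORMAL is exposed to user space as SCHED_OTHER. Switching to a realtime policy normally requires privileges, so try_set_fifo() is expected to fail with -1 for an ordinary user.

```c
#include <sched.h>
#include <string.h>

/* Human-readable name for a scheduling policy constant. */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/* Return the calling process's policy and store its realtime priority
 * in *prio. Returns -1 if either query fails. */
static int query_self(int *prio)
{
    struct sched_param sp;
    int policy = sched_getscheduler(0);   /* pid 0 = calling process */

    if (policy < 0 || sched_getparam(0, &sp) != 0)
        return -1;
    *prio = sp.sched_priority;
    return policy;
}

/* Try to make the calling process SCHED_FIFO at the given priority.
 * Returns 0 on success, -1 on failure (typically lack of privilege). */
static int try_set_fifo(int prio)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = prio;
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

A caller might print policy_name(query_self(&prio)) and then attempt try_set_fifo(sched_get_priority_min(SCHED_FIFO)), checking the return value before relying on realtime behaviour.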

36
Yielding the Processor
  • sched_yield() system call
  • Moves regular task to expired priority array.
  • RT tasks moved to end of priority list.
  • Kernel tasks can yield the CPU too.
  • Call yield() function.
