Process Scheduling in Multiprocessor and Multithreaded Systems - PowerPoint PPT Presentation



1
Process Scheduling in Multiprocessor and
Multithreaded Systems
  • Matt Davis
  • CS535
  • 4/7/2003

2
Outline
  • Multiprocessor Systems
  • Issues in MP Scheduling
  • How to Allocate Processors
  • Cache Affinity
  • Linux MP Scheduling
  • Simultaneous Multithreaded Systems
  • Issues in SMT Scheduling
  • Symbiotic Jobscheduling
  • SMT and Priorities
  • Linux SMT Scheduling
  • Conclusions

3
Multiprocessor Systems
  • Symmetric Multiprocessing (SMP)
  • One copy of OS in memory, any CPU can use it
  • OS must ensure that multiple processors cannot
    access shared data structures at the same time

Shared Memory Multiprocessors
4
Issues in MP Scheduling
  • Starvation
  • Number of active parallel threads < number of
    allocated processors
  • Overhead
  • CPU time used to transfer and start various
    portions of the application
  • Contention
  • Multiple threads attempt to use same shared
    resource
  • Latency
  • Delay in communication between processors and I/O
    devices

5
How to Allocate Processors
  • Allocate proportional to average parallelism
  • Other factors
  • System load
  • Variable parallelism
  • Min/Max parallelism
  • Acquire/relinquish processors based on current
    program needs
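
The proportional-allocation idea above can be sketched as follows; the job records with average/min/max parallelism fields are illustrative, not an interface from the slides:

```python
def allocate(jobs, total_cpus):
    """Give each job a CPU share proportional to its average parallelism,
    clamped to its declared min/max parallelism."""
    total_par = sum(j["avg_par"] for j in jobs)
    alloc = {}
    for j in jobs:
        # Proportional share of the machine...
        share = round(total_cpus * j["avg_par"] / total_par)
        # ...clamped so a job never gets fewer CPUs than it needs
        # or more than it can use.
        alloc[j["name"]] = max(j["min_par"], min(j["max_par"], share))
    return alloc
```

A dynamic policy would rerun this as jobs' parallelism changes, acquiring or relinquishing processors to match current program needs.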

6
Cache Affinity
  • While a program runs, data needed is placed in
    local cache
  • When job is rescheduled, it will likely access
    some of the same data
  • Scheduling jobs where they have affinity
    improves performance by reducing cache penalties

7
Cache Affinity (cont)
  • Tradeoff between processor reallocation and cost
    of reallocation
  • Utilization versus cache behavior
  • Scheduling policies
  • Equipartition: a constant number of processors
    allocated evenly to all jobs. Low overhead.
  • Dynamic: constantly reallocates jobs to maximize
    utilization. High utilization.

8
Cache Affinity (cont)
  • Vaswani and Zahorjan, 1991
  • When a processor becomes available, allocate it
    to the runnable process that last ran on that
    processor, or to a higher-priority job
  • If a job requests additional processors, allocate
    critical tasks on processor with highest affinity
  • If an allocated processor becomes idle, hold it
    for a small amount of time in case task with
    affinity comes along
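
A minimal sketch of the first rule (the process records with `last_cpu` and `priority` fields are illustrative, not part of the original algorithm's interface):

```python
def pick_next(cpu, runnable):
    """Pick a job for a newly idle CPU: prefer a runnable process that
    last ran on this CPU (cache affinity), breaking ties by priority;
    otherwise fall back to the highest-priority runnable job."""
    if not runnable:
        return None
    affine = [p for p in runnable if p["last_cpu"] == cpu]
    pool = affine or runnable
    return max(pool, key=lambda p: p["priority"])
```
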

9
Vaswani and Zahorjan, 1991
  • Results showed that utilization was dominant
    effect on performance, not cache affinity
  • But their algorithm did not degrade performance
  • Predicted that as processor speeds increase,
    significance of cache affinity will also increase
  • Later studies validated their predictions

10
Linux 2.5 MP Scheduling
  • Each processor responsible for scheduling own
    tasks
  • schedule()
  • After process switch, check if new process should
    be transferred to other CPU running lower
    priority task
  • reschedule_idle()
  • Cache affinity
  • Affinity mask stored in /proc/pid/affinity
  • sched_setaffinity(), sched_getaffinity()
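
To illustrate, an affinity mask is a bitmask in which bit i set means the task may run on CPU i; these small helpers are a sketch of that representation, not the kernel's own code:

```python
def cpus_to_mask(cpus):
    """Build an affinity bitmask from a set of allowed CPU numbers."""
    return sum(1 << c for c in cpus)

def mask_to_cpus(mask):
    """Recover the set of allowed CPUs from an affinity bitmask."""
    return {i for i in range(mask.bit_length()) if (mask >> i) & 1}
```

sched_setaffinity() and sched_getaffinity() take this mask (as a cpu_set_t in C) to pin a task to, or query, its allowed CPUs.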

11
What is SMT?
  • Simultaneous Multithreading
  • aka HyperThreading
  • Issue instructions from multiple threads
    simultaneously on a superscalar processor

12
Why SMT?
  • Technique to exploit parallelism in and between
    programs with minimal additions in chip resources
  • Operating system treats SMT processor as two
    separate processors

13
Issues With SMT Scheduling
  • Not really separate processors
  • Share same caches
  • MP scheduling attempts to avoid idle processors
  • SMT-aware scheduler must differentiate between
    physical and logical processors

14
Symbiotic Jobscheduling
  • Recent studies from the University of
    Washington, where much of the early SMT research
    originated
  • OS coschedules jobs to run on hardware threads
  • Number of coscheduled jobs ≤ SMT level
  • Occasionally swap out running set to ensure
    fairness

15
Symbiotic Jobscheduling (cont)
  • Shared system resources
  • Functional units, caches, TLBs, etc
  • Coscheduled jobs may interact well
  • Few resource conflicts, high utilization
  • Or they may interact poorly
  • Many resource conflicts, lower utilization
  • Choice of coscheduled jobs can have large impact
    on system performance

16
Symbiotic Jobscheduling (cont)
  • Improve symbiosis by coscheduling jobs that get
    along well
  • Two phases of the SOS (Sample, Optimize,
    Symbios) jobscheduler
  • Sample: gather data on current performance
  • Symbios: use the computed scheduling
    configuration

17
Symbiotic Jobscheduling (cont)
  • Sample phase
  • Periodically alter coscheduled job mix
  • Record system utilization from hardware
    performance counter registers
  • Symbios phase
  • Pick job mix that had the highest utilization
  • Trade-off between sampling often and sampling
    infrequently
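
The sample/symbios loop can be sketched as below; `measure` stands in for reading the hardware performance counters, and all names are assumptions for illustration:

```python
from itertools import combinations

def sos_best_mix(jobs, smt_level, measure):
    """Sample phase: coschedule each candidate job mix of size smt_level
    and record its utilization via measure(). Symbios phase: return the
    mix that sampled best, to be run until the next sampling period."""
    best_mix, best_util = None, float("-inf")
    for mix in combinations(jobs, smt_level):
        util = measure(mix)
        if util > best_util:
            best_mix, best_util = mix, util
    return best_mix
```

Sampling every mix is only feasible for small job sets; the real scheduler samples a subset of permutations, which is exactly the trade-off noted above.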

18
How to Measure Utilization?
  • IPC not necessarily best predictor
  • IPC can have high variations throughout process
  • High-IPC threads may unfairly take system
    resources from low-IPC threads
  • Other predictors: low conflicts, high cache hit
    rate, diverse instruction mix
  • Balance: the schedule with the lowest deviation
    in IPC between coschedules is considered best
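
The balance criterion can be sketched as picking the candidate coschedule with the lowest IPC deviation (the IPC figures are illustrative data, not measured results):

```python
from statistics import pstdev

def best_balanced(coschedules):
    """Among candidate coschedules, each given as a list of per-thread
    IPC samples, pick the one whose threads' IPCs deviate least."""
    return min(coschedules, key=pstdev)
```

A mix like [2.0, 0.5] has high throughput but lets one thread starve the other, so this metric prefers a more even mix such as [1.2, 1.1].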

19
What About Priorities?
  • Scheduler estimates the natural IPC of job
  • If a high-priority job is not meeting its
    desired IPC, it is scheduled exclusively on the
    CPU
  • Provides a truer implementation of priority
  • Normal schedulers only guarantee proportional
    resource sharing, assuming no interaction between
    jobs

20
Another Priority Algorithm
  • SMT hardware fetches instructions to issue from
    queue
  • Scheduler can bias fetching algorithm to give
    preference to high-priority threads
  • Hardware already exists, minimal modifications

21
Symbiosis Performance Results
  • Without priorities
  • Up to 17% improvement
  • Software-enforced priorities
  • Up to 20%, average 8%
  • Hardware-based priorities
  • Up to 30%, average 15%

22
Linux 2.5 SMT Scheduling
  • Immediate reschedule forced when HT CPU is
    executing two idle processes
  • HT-aware affinity: processes prefer the same
    physical CPU
  • HT-aware load-balancing: distinguishes logical
    and physical CPUs in resource allocation

23
Conclusions
  • Intelligent allocation of resources can improve
    performance in parallel systems
  • Dynamic scheduling of processors in MP systems
    produces better utilization as processor speeds
    increase
  • Cache affinity can help improve throughput
  • Symbiotic coscheduling of tasks in SMT systems
    can improve average response time

24
Resources
  • Kenneth Sevcik, Characterizations of Parallelism
    in Applications and Their Use in Scheduling
  • Raj Vaswani and John Zahorjan, The Implications
    of Cache Affinity on Processor Scheduling for
    Multiprogrammed, Shared Memory Multiprocessors
  • Allan Snavely et al., Symbiotic Jobscheduling
    with Priorities for a Simultaneous Multithreading
    Processor
  • Linux MP cache affinity,
    http://www.tech9.net/rml/linux
  • Linux Hyperthreading Scheduler,
    http://www.kernel.org/pub/linux/kernel/people/rusty/Hyperthread_Scheduler_Modifications.html
  • Daniel Bovet and Marco Cesati, Understanding the
    Linux Kernel