Title: Chapter
1 Chapter 5.2 Static Process Scheduling
2 Outline
- Part I Static Process Scheduling
- Precedence process model
- Communication system model
- Part II Current Literature Review
- "Optimizing Static Job Scheduling in a Network of Heterogeneous Computers," ICPP 2000
- "Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems," DATE 2005
- "White Box Performance Analysis Considering Static Non-Preemptive Software Scheduling," DATE 2009
- Part III Future Research Initiatives
3 Static Process Scheduling
- Given a set of partially ordered tasks, define a mapping of processes to processors before the execution of the processes.
- Cost model: CPU cost and communication cost; both must be specified in advance.
- Goal: minimize the overall finish time (makespan) on a non-preemptive multiprocessor system (of identical processors).
- Except for some very restricted cases, scheduling to optimize the makespan is NP-complete.
- Heuristic solutions are usually proposed.
4 Precedence Process Model
- This model describes scheduling for a program that consists of several sub-tasks. The schedulable unit is the sub-task.
- The program is represented by a DAG.
- Precedence constraints among tasks in a program are explicitly specified.
- Critical path: the longest execution path in the DAG, often used to compare the performance of heuristic algorithms.
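The critical path can be computed with a memoized longest-path search over the DAG. The following is a minimal sketch that ignores communication cost; the function name `critical_path_length` and the four-task example graph are hypothetical and not taken from the slides.

```python
from functools import lru_cache

def critical_path_length(exec_time, succ):
    """Length of the longest execution path in a task DAG.

    exec_time: dict mapping task -> execution time
    succ:      dict mapping task -> list of successor tasks
    """
    @lru_cache(maxsize=None)
    def longest_from(task):
        # Longest path starting at `task`: its own time plus the
        # longest path among its successors (0 if it has none).
        tail = max((longest_from(s) for s in succ.get(task, [])), default=0)
        return exec_time[task] + tail

    return max(longest_from(t) for t in exec_time)

# Hypothetical DAG: A -> B, A -> C, B -> D, C -> D
exec_time = {"A": 2, "B": 3, "C": 1, "D": 4}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
length = critical_path_length(exec_time, succ)  # path A -> B -> D
```

No schedule on any number of processors can finish sooner than this length, which is why it serves as a yardstick for heuristics.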
5 Precedence Process and Communication System Models
Communication overhead for A(P1) and E(P3) = 4 × 2 = 8
6 Precedence Process and Communication System Models (contd.)
- Scheduling goal: minimize the makespan.
- Algorithms
- List Scheduling (LS): communication overhead is not considered. Uses a simple greedy heuristic: no processor remains idle if there is some task available that it could process.
- Extended List Scheduling (ELS): the actual scheduling result of LS once communication is taken into account.
- Earliest Task First scheduling (ETF): the earliest schedulable task (with communication delay considered) is scheduled first.
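The greedy rule behind LS can be sketched as a small event-driven simulation. This is an illustrative sketch only: it assumes identical processors, no communication cost, and a caller-supplied priority list. The name `list_schedule` and the five-task example are hypothetical.

```python
import heapq

def list_schedule(exec_time, pred, priority, num_procs):
    """Greedy list scheduling without communication cost.

    exec_time: dict task -> execution time
    pred:      dict task -> list of predecessor tasks
    priority:  list of all tasks, highest priority first
    Returns (makespan, start_times).
    """
    remaining = {t: len(pred.get(t, [])) for t in exec_time}
    succ = {t: [] for t in exec_time}
    for task, parents in pred.items():
        for p in parents:
            succ[p].append(task)

    ready = [t for t in priority if remaining[t] == 0]
    running = []            # min-heap of (finish_time, task)
    free = num_procs
    now = 0
    start = {}
    while ready or running:
        # Never leave a processor idle while a ready task exists.
        while ready and free > 0:
            task = ready.pop(0)
            start[task] = now
            heapq.heappush(running, (now + exec_time[task], task))
            free -= 1
        # Advance time to the next task completion.
        now, done = heapq.heappop(running)
        free += 1
        for s in succ[done]:
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)
        ready.sort(key=priority.index)
    return now, start

# Hypothetical program: A -> B, A -> C, {B, C} -> D, plus independent E
exec_time = {"A": 2, "B": 3, "C": 1, "D": 4, "E": 5}
pred = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
makespan, start = list_schedule(exec_time, pred, ["A", "E", "B", "C", "D"], 2)
```

ELS would replay the same task-to-processor choices while adding communication delays, and ETF would instead pick the task with the earliest possible start time once delays are included; neither is implemented here.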
7 Makespan Calculation for LS, ELS, and ETF
8 Communication Process Model
- There are no precedence constraints among processes.
- Modeled by an undirected graph G: nodes represent processes, and the weight on an edge is the amount of communication (messages) between the two connected processes.
- Process execution cost may sometimes be specified to handle more general cases.
- Scheduling goal: maximize resource utilization.
9 Communication Process Model (contd.)
- The problem is to find an optimal assignment of m processes to P processors with respect to the target function.
- P: a set of processors. e_j(p_i): computation cost of executing process p_i on processor P_j.
- c_{i,j}(p_i, p_j): communication overhead between processes p_i and p_j.
- Assume a uniform communication speed between processors.
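The target function itself is not reproduced in the text above. A commonly used form, taken here as an assumption, adds the execution cost of every process on its assigned processor to the communication cost of every pair of processes placed on different processors. The sketch below evaluates that cost for one assignment; the name `assignment_cost` and all numbers are hypothetical.

```python
def assignment_cost(assign, exec_cost, comm_cost):
    """Total cost of one process-to-processor assignment.

    assign:    dict process -> processor
    exec_cost: dict process -> {processor: execution cost}
    comm_cost: dict (process, process) -> communication cost,
               charged only when the two processes are on
               different processors (uniform link speed assumed)
    """
    computation = sum(exec_cost[p][assign[p]] for p in assign)
    communication = sum(
        w for (a, b), w in comm_cost.items() if assign[a] != assign[b]
    )
    return computation + communication

# Hypothetical three-process, two-processor instance
exec_cost = {
    "p1": {"A": 5, "B": 10},
    "p2": {"A": 2, "B": 3},
    "p3": {"A": 4, "B": 4},
}
comm_cost = {("p1", "p2"): 6, ("p2", "p3"): 1}
cost = assignment_cost({"p1": "A", "p2": "A", "p3": "B"}, exec_cost, comm_cost)
```

Here the computation part is 5 + 2 + 4 and only the p2-p3 edge crosses processors, so the heavy p1-p2 communication is avoided.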
10
- This is referred to as the Module Allocation problem. It is NP-complete except for a few cases.
- For P = 2, Stone suggested a polynomial-time solution using the Ford-Fulkerson maximum flow algorithm.
- For some special graph topologies such as trees, Bokhari's algorithm can be used.
- Known results: the mapping problem for an arbitrary number of processors is NP-complete.
Problem                          Optimal polynomial-time algorithm   Suboptimal
2 processors                     Yes
2 processors with varying load   Yes
tree-structured graph            Yes
series-parallel graph            Yes
3 or more processors                                                 Yes
11 Stone's two-processor model to achieve minimum total execution and communication cost
- Example
- Partition the graph by drawing a line cutting through some edges.
- The result is two disjoint graphs, one for each processor.
- Set of removed edges: the cut set
- Cost of the cut set: the sum of the weights of the removed edges
- This is the total inter-process communication cost between processors.
- Of course, the cost of the cut set is 0 if all processes are assigned to the same node.
- Computation constraints (no more than k, distribute evenly)
- Example
- Maximum flow and minimum cut in a commodity-flow network
- Find the maximum flow from source to destination
12 Maximum Flow Algorithm in Solving the Scheduling Problem
13 Minimum-Cost Cut
Only the cuts that separate A and B are feasible
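The two-processor reduction can be sketched with a small augmenting-path maximum flow (breadth-first search, in the Ford-Fulkerson family). Processor A acts as the source and processor B as the sink. Each process gets an edge from A weighted by its execution cost on B and an edge to B weighted by its execution cost on A, so that cutting an edge charges the cost of the placement it implies. The function name `min_cut_assignment` and the example costs are hypothetical, and process names are assumed to differ from "A" and "B".

```python
from collections import defaultdict, deque

def min_cut_assignment(exec_cost, comm_cost):
    """exec_cost: process -> (cost_on_A, cost_on_B)
    comm_cost: (p, q) -> cost if p and q run on different processors
    Returns (minimum total cost, set of processes assigned to A)."""
    src, dst = "A", "B"
    cap = defaultdict(lambda: defaultdict(int))
    for p, (on_a, on_b) in exec_cost.items():
        cap[src][p] += on_b   # cut when p ends up on B's side
        cap[p][dst] += on_a   # cut when p ends up on A's side
    for (p, q), w in comm_cost.items():
        cap[p][q] += w        # undirected edge as two directed ones
        cap[q][p] += w

    def reachable():
        # BFS over edges with remaining (residual) capacity.
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if dst not in parent:
            break
        path, v = [], dst
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck
    # Processes still reachable from A lie on A's side of the minimum cut.
    return flow, {p for p in exec_cost if p in parent}

# Hypothetical instance
exec_cost = {"p1": (1, 10), "p2": (10, 1), "p3": (4, 5)}
comm_cost = {("p1", "p3"): 3, ("p2", "p3"): 2}
total, on_a = min_cut_assignment(exec_cost, comm_cost)
```

In this instance p1 and p3 stay on A and p2 goes to B: execution costs 1 + 1 + 4 plus the cut p2-p3 edge of weight 2.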
14 Generalized solution for more than two processors
- Stone uses a repetitive approach based on the two-processor algorithm to solve n-processor problems.
- Treat (n-1) processors as one super-processor.
- The processors in the super-processor are further broken down based on the results from the previous step.
15 Other Heuristics
- Other heuristics separate the optimization of computation and communication.
- Assume communication delay is the more significant cost.
- Merge processes with higher interprocess interaction into clusters of processes.
- Clusters of processes are then assigned to the processor that minimizes the computation cost.
- With the reduced problem size, the optimal solution is relatively easy to find (exhaustive search).
- A simple heuristic: merge processes if their communication cost is higher than a threshold C.
- Constraints can also be put on the total computation for a cluster, to prevent over-clustering.
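The threshold heuristic can be sketched with a union-find structure: every pair of processes whose communication cost exceeds C is merged into the same cluster. This is a minimal illustration that omits the computation-size constraint; the name `cluster_by_threshold` and the five-process graph are hypothetical and are not the graph from the slides.

```python
def cluster_by_threshold(processes, comm_cost, threshold):
    """Merge processes whose pairwise communication cost is
    strictly higher than `threshold`; return the clusters."""
    parent = {p: p for p in processes}

    def find(p):
        # Follow parent links to the cluster representative,
        # shortening the chain along the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), weight in comm_cost.items():
        if weight > threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for p in processes:
        clusters.setdefault(find(p), []).append(p)
    return sorted(clusters.values())

# Hypothetical communication graph
processes = ["a", "b", "c", "d", "e"]
comm_cost = {("a", "b"): 12, ("b", "c"): 4, ("c", "d"): 10, ("d", "e"): 3}
clusters = cluster_by_threshold(processes, comm_cost, threshold=9)
```

With the cluster count reduced from five processes to three groups, an exhaustive search over cluster-to-processor assignments becomes much cheaper.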
16 Cluster of Processes
- For C = 9, we get three clusters: (2,4), (1,6), and (3,5).
- Clusters (2,4) and (1,6) must be mapped to processors A and B.
- Cluster (3,5) can be assigned to A or B, but is assigned to A due to lower communication cost.
- Total cost = 41 (computation cost = 17 on A and 14 on B; communication cost = 10).
17 Part II Current Literature Review
18 Optimizing Static Job Scheduling in a Network of Heterogeneous Computers (Xueyan Tang and Samuel T. Chanson, IEEE, 2000)
- Summary
- Static job scheduling schemes in a network of computers with different speeds.
- Optimization techniques are proposed for workload allocation and job dispatching.
- The proposed job dispatching algorithm is an extension of the traditional round-robin scheme.
19 Optimization for Workload Allocation
- A fraction a_i of all the jobs is sent to computer c_i
- where
(Tang and Chanson, 2000)
20 Simple Weighted Workload Allocation
- The amount of workload for each computer is proportional to its processing speed.
- All computers are equally utilized.
- Does not provide the best performance.
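A short sketch of the proportional rule shows why every computer ends up equally utilized. It assumes integer relative speeds and models utilization as (share of load) / speed; the function name `weighted_allocation`, the speed values, and the total load are hypothetical.

```python
from fractions import Fraction

def weighted_allocation(speeds):
    """Workload fraction for each computer, proportional to its speed."""
    total = sum(speeds)
    return [Fraction(s, total) for s in speeds]

# Hypothetical relative speeds of four computers
speeds = [1, 1, 2, 4]
fractions_ = weighted_allocation(speeds)

# Assumed utilization model: offered load times fraction, divided by speed.
load = Fraction(4)
utilization = [a * load / s for a, s in zip(fractions_, speeds)]
```

Every entry of `utilization` equals load / sum(speeds), which is the "equally utilized" property; the optimized scheme deliberately departs from it by favoring the faster machines.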
21 Dynamic Least-Load Scheduling
- It is beneficial to allocate a disproportionately higher fraction of the workload to the more powerful computers.
- Assign each new job to the machine with the least normalized load.
- When jobs are moved from a slow machine to a fast machine, the slow machine's utilization decreases a lot, whereas the utilization of the fast machine does not increase that much.
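The least-normalized-load rule can be sketched in a few lines. The definition of normalized load used here, queued jobs divided by machine speed, is an assumption made for illustration, as are the function name `pick_least_loaded` and the sample numbers.

```python
def pick_least_loaded(queue_lengths, speeds):
    """Index of the machine with the smallest normalized load,
    where normalized load is assumed to be queue length / speed."""
    return min(
        range(len(speeds)),
        key=lambda i: queue_lengths[i] / speeds[i],
    )

# Hypothetical state: a slow machine with 1 job, a 4x faster one with 3 jobs
chosen = pick_least_loaded([1, 3], [1.0, 4.0])
```

The fast machine is chosen even though it holds more jobs, because 3 / 4.0 is below 1 / 1.0.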
22 Optimizing Technique for Job Dispatching
- Random-Based Job Dispatching
- A newly arrived job is scheduled to run on a randomly selected computer.
- Round-Robin-Based Job Dispatching
- The objective here is to smooth the inter-arrival intervals of consecutive jobs.
- For example, suppose there are 4 computers c1, c2, c3, and c4 with workload fractions 1/8, 1/8, 1/4, and 1/2 respectively.
- Dispatching scheme: c4, c3, c4, c2, c4, c3, c4, c1, c4, c3, c4, c2, c4, c3, c4, c1, ...
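The dispatching sequence in the example can be checked against both goals: each computer's share of one eight-job cycle should match its workload fraction, and the gaps between its consecutive jobs should be constant. The sketch below only verifies the sequence given in the text; it does not reproduce the paper's dispatching algorithm, and the helper names `shares` and `gaps` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# One cycle of the dispatching sequence from the example
cycle = ["c4", "c3", "c4", "c2", "c4", "c3", "c4", "c1"]

def shares(cycle):
    """Fraction of the cycle's jobs that each computer receives."""
    counts = Counter(cycle)
    return {c: Fraction(n, len(cycle)) for c, n in counts.items()}

def gaps(cycle, computer):
    """Distances between consecutive jobs sent to `computer`,
    treating the cycle as repeating forever."""
    pos = [i for i, c in enumerate(cycle) if c == computer]
    n = len(cycle)
    # A computer that appears once per cycle has a gap of one full cycle.
    return [
        (pos[(k + 1) % len(pos)] - pos[k]) % n or n
        for k in range(len(pos))
    ]

share_by_computer = shares(cycle)
c4_gaps = gaps(cycle, "c4")
c3_gaps = gaps(cycle, "c3")
```

Every computer sees perfectly regular arrivals (every 2nd job for c4, every 4th for c3, every 8th for c1 and c2), which is the smoothing that random dispatching cannot guarantee.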
23 Summary
- The key idea of optimizing the workload allocation scheme is to send a disproportionately high fraction of the workload to the most powerful computers.
- An analytical model is developed to derive the optimized allocation strategy mathematically.
- For job dispatching, an algorithm that extends round-robin to the general case is presented.
24 Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems (V. Izosimov, P. Pop, P. Eles, and Z. Peng, DATE, IEEE, 2005)
- Synopsis
- Re-execution and replication are used for tolerating transient faults.
- Processes are statically scheduled, and communications are performed using the time-triggered protocol.
25 System Architecture
- Each node has a CPU and a communication controller running independently.
- Time-Triggered Communication Protocol
26 Fault-Tolerance Mechanisms
- Re-execution
- Active Replication
27 Summary
- Addresses optimization of distributed embedded systems for fault tolerance.
- Two fault-tolerance mechanisms
- Re-execution: time redundancy
- Active replication: space redundancy
28 White Box Performance Analysis Considering Static Non-Preemptive Software Scheduling (A. Viehl, M. Pressler, and Oliver Bringmann, DATE, IEEE, 2009)
- Synopsis
- A novel approach for the integration of cooperative and static non-preemptive scheduling in formal white box analysis is presented.
29 Future Research Initiatives
- Use AI techniques for Static Scheduling
- Genetic Algorithm
- Simulated Annealing
30 References
- Randy Chow and Theodore Johnson. Distributed Operating Systems & Algorithms, pp. 156-163. Addison-Wesley, 1997.
- Xueyan Tang and Samuel T. Chanson. Optimizing Static Job Scheduling in a Network of Heterogeneous Computers. ICPP, pp. 373-382, IEEE, 2000.
- Viacheslav Izosimov, Paul Pop, Petru Eles, and Zebo Peng. Design Optimization of Time- and Cost-Constrained Fault-Tolerant Distributed Embedded Systems. Design, Automation and Test in Europe (DATE), IEEE, 2005.
- Alexander Viehl, Michael Pressler, and Oliver Bringmann. White Box Performance Analysis Considering Static Non-Preemptive Software Scheduling. Design, Automation and Test in Europe (DATE), IEEE, 2009.