Title: A Prediction-based Real-time Scheduling Advisor
1 A Prediction-based Real-time Scheduling Advisor
- Peter A. Dinda
- Carnegie Mellon University
2 Outline
- Real-time scheduling advisor model and interface
- Prediction-based implementation
- Randomized evaluation using load trace playback
3 The Problem Solved by the Real-time Scheduling Advisor
At time tnow, the application gives you a task
with compute requirements tnom, a deadline
tnow + tnom(1 + slack), a confidence level c, and a
list of hosts in a shared, unreserved distributed
computing environment. The application can run
the task on any of the hosts. Choose a host from
the list such that the task, if run on that host,
will meet the deadline with probability c or
better, if possible.
4 Model
- Task model
  - Compute-bound
  - Initiated by user actions (interactive applications)
  - Arrive aperiodically
  - Do not overlap
  - Must be started immediately (at tnow)
- Application model
  - Knows the task's compute requirements (tnom)
  - Knows the appropriate slack for the task
    - deadline = tnow + (1 + slack) tnom
  - Can run the task on one of a set of hosts
- Real-time scheduling advisor recommends the most
  appropriate host
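The deadline arithmetic in the model can be sketched in C as below. The function names rt_deadline and rt_deadline_met are hypothetical, chosen for illustration only; they are not part of the advisor's interface. Times are assumed to be in seconds.

```c
#include <stdbool.h>

/* Absolute deadline implied by the task model:
   deadline = tnow + (1 + slack) * tnom.
   Hypothetical helper, not part of the RTSA interface. */
double rt_deadline(double tnow, double tnom, double slack)
{
    return tnow + (1.0 + slack) * tnom;
}

/* True if a task started at tnow that actually ran for `actual`
   seconds finished no later than its deadline. */
bool rt_deadline_met(double tnow, double tnom, double slack, double actual)
{
    return tnow + actual <= rt_deadline(tnow, tnom, slack);
}
```

For example, a task with tnom = 2 s and slack = 0.5 that starts at tnow = 100 s has its deadline at 103 s, so it may take up to 3 s of wall-clock time on the chosen host.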
5 RTSA Interface
int RTAdviseTask(RTSchedulingAdvisorRequest &req,
                 RTSchedulingAdvisorResponse &resp);

struct RTSchedulingAdvisorRequest {
  double tnom;     // Deadline = tnow + tnom(1 + slack)
  double slack;
  double conf;     // Required certainty of meeting deadline
  Host   hosts[];  // Hosts to choose from
};

struct RTSchedulingAdvisorResponse {
  double tnom;
  double slack;
  double conf;
  Host   host;     // Most appropriate host
  RunningTimePredictionResponse runningtime;  // Confidence interval for running time on host
};
6 Prediction-based Implementation
7 Anchoring This Talk
- This talk: description and evaluation of the
  real-time scheduling advisor
- Assume this works (later talk)
- Built host load prediction system
- Developed RPS toolkit for building fast, low-overhead
  resource prediction systems
- Found appropriate predictive models for host
  load signals
- Studied statistical properties of host load
  signals
- Developed load trace playback technique for
  reconstructing load
8 Scheduling Strategies
- Prediction-based (MEAN, LAST, AR(16))
  - Operation
    - Acquire running time predictions for each host
    - Select host at random from those where the confidence
      interval is below the deadline
    - If none exist, choose the host with the lowest expected
      running time
    - Return host and running time prediction
- MEASURE
  - Return host with current lowest measured load
  - No running time prediction
- RANDOM
  - Return random host
  - No running time prediction
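The operation of the prediction-based strategies can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the advisor's actual code: HostPrediction and choose_host are hypothetical names, each prediction is assumed to carry an expected running time and the upper end of its confidence interval, and "below the deadline" is taken to mean that the upper end fits within the available time (1 + slack) * tnom. The random pick uses single-pass reservoir sampling with rand(), which has a small modulo bias that is ignored here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-host running time prediction (seconds). */
typedef struct {
    double expected;  /* expected running time */
    double upper;     /* upper end of the confidence interval */
} HostPrediction;

/* Returns the index of the chosen host, or -1 if n == 0.
   limit is the time available to the task, (1 + slack) * tnom.
   *feasible is set to 1 if the chosen host's confidence interval
   fits within limit, and to 0 otherwise. */
int choose_host(const HostPrediction *pred, size_t n, double limit,
                int *feasible)
{
    size_t count = 0, best = 0, i;
    int choice = -1;

    if (n == 0) {
        *feasible = 0;
        return -1;
    }
    for (i = 0; i < n; i++) {
        if (pred[i].upper <= limit) {
            count++;
            /* Reservoir sampling: keep host i with probability 1/count,
               so each feasible host is (nearly) equally likely. */
            if ((size_t)rand() % count == 0)
                choice = (int)i;
        }
        /* Track the fallback: lowest expected running time. */
        if (pred[i].expected < pred[best].expected)
            best = i;
    }
    *feasible = (count > 0);
    return count > 0 ? choice : (int)best;
}
```

Choosing at random among the feasible hosts, rather than always taking the fastest one, is what gives several advisors a chance to spread their tasks across hosts.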
9 Performance Metrics
- Fraction of deadlines met
  - Will the deadline be met?
  - Depends on (at least) strategy, slack, and
    resource availability
- Fraction of deadlines met when possible
  - If the strategy claims the deadline will be met, will
    the deadline be met?
  - Should depend only on strategy
  - Application can try other tnom, slack
- Number of possible hosts
  - How much randomness is introduced?
  - Helps to avoid disastrous advisor synchronization
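The first two metrics can be computed from per-task records roughly as below. The record layout (TaskOutcome) and the function compute_metrics are assumptions made for this sketch; "when possible" is interpreted, following the slide, as the set of tasks for which the strategy claimed the deadline would be met.

```c
#include <stddef.h>

/* Hypothetical per-task record from an evaluation run. */
typedef struct {
    int claimed_met;  /* 1 if the strategy claimed the deadline would be met */
    int met;          /* 1 if the task actually met its deadline */
} TaskOutcome;

typedef struct {
    double frac_met;                /* met / all tasks */
    double frac_met_when_possible;  /* met / tasks claimed feasible */
} DeadlineMetrics;

DeadlineMetrics compute_metrics(const TaskOutcome *t, size_t n)
{
    size_t met = 0, claimed = 0, claimed_and_met = 0, i;
    DeadlineMetrics m = { 0.0, 0.0 };

    for (i = 0; i < n; i++) {
        if (t[i].met)
            met++;
        if (t[i].claimed_met) {
            claimed++;
            if (t[i].met)
                claimed_and_met++;
        }
    }
    /* Guard against empty denominators; the metric stays 0.0 then. */
    if (n > 0)
        m.frac_met = (double)met / (double)n;
    if (claimed > 0)
        m.frac_met_when_possible = (double)claimed_and_met / (double)claimed;
    return m;
}
```

MEASURE and RANDOM return no running time prediction, so how claimed_met would be filled in for those strategies is left open in this sketch.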
10 Methodology
- Recreate scenario (load on a set of hosts) on
  manchester testbed using load trace playback
- Schedule and run randomized tasks
  - Random arrival times (5 to 15 seconds apart)
  - tnom randomly selected from 0.1 to 10 seconds
  - Slack randomly selected from 0 to 2
  - Randomly selected strategy
- Data-mine results
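Drawing one randomized task with the ranges above might look like the following sketch. The slide does not state the distributions, so uniform draws are an assumption here, as are the names RandomTask, Strategy, and draw_task.

```c
#include <stdlib.h>

/* The five strategies compared in this talk. */
typedef enum {
    STRAT_AR16, STRAT_MEAN, STRAT_LAST, STRAT_MEASURE, STRAT_RANDOM,
    STRAT_COUNT  /* number of strategies, not a strategy itself */
} Strategy;

/* Hypothetical description of one randomized task. */
typedef struct {
    double gap;    /* seconds since the previous task's arrival */
    double tnom;   /* nominal compute time, seconds */
    double slack;  /* fraction of tnom allowed beyond tnom */
    Strategy strategy;
} RandomTask;

/* Uniform draw in [lo, hi); assumes rand() is adequate for a sketch. */
static double uniform(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / ((double)RAND_MAX + 1.0));
}

RandomTask draw_task(void)
{
    RandomTask t;
    t.gap = uniform(5.0, 15.0);   /* arrivals 5 to 15 seconds apart */
    t.tnom = uniform(0.1, 10.0);  /* tnom from 0.1 to 10 seconds */
    t.slack = uniform(0.0, 2.0);  /* slack from 0 to 2 */
    t.strategy = (Strategy)(rand() % STRAT_COUNT);
    return t;
}
```

Calling srand with a fixed seed before the first draw_task call makes a run repeatable, which is convenient when comparing strategies on the same played-back load.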
11 4LS Scenario
- Four PSC alpha cluster hosts
  - axp0 (interactive), axp4, axp5, axp10 (batch)
  - High load, high variability
- Traces start Tuesday, August 12, 1997
- 16,000 tasks run in 36 hours
12 Terminology I Will Use
- Scheduling feasibility
  - How likely it is that a host exists on which
    the deadline can be met
  - Increases with slack, decreases with tnom
  - Also depends on variation among the hosts
- Predictor sensitivity
  - How likely it is that the deadline will be missed due
    to a bad prediction
  - Low when scheduling feasibility is high or low
  - Highest near critical slack
- Critical slack
  - Slack at which scheduling feasibility is 50%
13 Overview of Results
- AR(16) prediction-based strategy is superior
  - Fraction of deadlines met at least as good as
    MEASURE, and much improved at critical slack
  - Fraction of deadlines met when possible higher
    than all competitors and most independent of
    slack and nominal time
  - Introduces randomness similar to that of the other
    prediction-based strategies
- Performance metrics depend on slack and nominal time
14 Fraction of Deadlines Met Versus Slack
15 Fraction of Deadlines Met Versus tnom
16 Fraction of Deadlines Met Versus tnom (Near Critical Slack)
17 Fraction of Deadlines Met When Possible Versus Slack
18 Fraction of Deadlines Met When Possible Versus tnom
19 Fraction of Deadlines Met When Possible Versus tnom (Near Critical Slack)
20 Number of Possible Hosts Versus Slack
21 Number of Possible Hosts Versus tnom
22 Number of Possible Hosts Versus tnom (Near Critical Slack)
23 Conclusions
- MEASURE greatly increases the chance of meeting
  deadlines compared to RANDOM
- AR(16) increases that chance with minuscule
  additional overhead
  - Especially near critical slack and for short
    tasks
- In addition, AR(16) can tell the application,
  with high accuracy, whether the deadline will be
  met before the task is run
  - Gives the application the opportunity to negotiate
- AR(16) introduces appropriate randomness into
  its choices, reducing the chance of conflict
- The AR(16) prediction-based real-time scheduling
  advisor is a useful tool