Studies of the User-Scheduler Relationship

1
Studies of the User-Scheduler Relationship
  • Cynthia Bailey Lee
  • Advisor: Allan E. Snavely
  • Department of Computer Science and Engineering
  • San Diego Supercomputer Center
  • University of California, San Diego
  • May 19, 2008

2
Introduction
Outline: Introduction, Runtime Inaccuracy, Utility Functions, Utility Model, Scheduler
  • The job submission routine
  • Edit job script, including resources needed and
    amount of time requested
  • Submit job: typically, many questions remain
  • Did I request enough time?
  • How long will the job wait in the queue?
  • Eventually, job runs: more questions
  • I submitted to a high-priority queue: was my
    wait time actually shorter than if I hadn't?
  • By how much?
  • Was it worth it?
  • Is this a satisfying relationship for either
    party?

3
Contributions of This Work
  • Falsified the Padding Hypothesis as the sole
    explanation for users' inaccurate runtime
    requests
  • Quantified users' valuation of turnaround by
    collecting actual users' utility curves
  • Proposed a model for synthetically generating
    utility functions that draws on patterns seen in
    the actual user curves
  • A genetic algorithm-based scheduler that uses
    aggregate utility as an explicit objective
    function

4
The Padding Hypothesis
  • The inaccuracy of users' requested runtimes,
    relative to the actual runtime of jobs, is
    explained by users explicitly padding otherwise
    accurate runtime estimates in order to avoid the
    possibility of their jobs being killed by the
    scheduler.

5
Padding Hypothesis
SDSC users were asked to provide a
no-kill/no-pressure estimate, with prizes for
being accurate
  • Lessons Learned
  • Users can't provide the information most
    schedulers ask for, but
  • Maybe they can (and would want to) provide useful
    information schedulers currently don't ask for

Users are able to self-identify as more or less
accurate
6
What is a Utility Function?
[Figure: utility u(t) versus time; time-axis ticks: 8 am, 12, 1 pm, 5 pm, 8 am, 9 am]
Other factors: coordinating with other grid sites
or sensors, paper deadlines, weather and
hurricane prediction, ...
7
Real Users' Functions
  • Randomly-selected users of SDSC systems provided
    these data points for jobs they were submitting
  • Utility is in terms of the SDSC charge unit
    (SU)

8
More Real Users' Functions
9
Existing Model
Used by, e.g., Chun and Culler 2002 and Irwin,
Grit, Chase 2004
10
Proposed Model
  • To use aggregate utility as a metric, utility
    functions are needed for all jobs
  • Propose storing each function as a series of
    (time, value) pairs appended to each line of the
    Standard Workload Format, allowing
    arbitrarily shaped functions
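
A minimal sketch of how such appended pairs might be read back and evaluated. The encoding is an assumption made for illustration: the pairs are taken to follow the 18 standard SWF fields as extra whitespace-separated numbers, times are seconds since submission, and the value between points is linearly interpolated. The names `parse_utility_pairs` and `utility_at` are hypothetical.

```python
from bisect import bisect_right

SWF_FIELD_COUNT = 18  # a standard SWF job line has 18 whitespace-separated fields


def parse_utility_pairs(swf_line):
    """Return the (time, value) pairs found after the standard SWF fields.

    Assumed layout (not part of SWF itself): t0 v0 t1 v1 ... appended to the line.
    """
    fields = swf_line.split()
    extra = [float(x) for x in fields[SWF_FIELD_COUNT:]]
    if len(extra) % 2 != 0:
        raise ValueError("utility data must come in (time, value) pairs")
    return sorted(zip(extra[0::2], extra[1::2]))


def utility_at(pairs, t):
    """Piecewise-linear utility at time t.

    The curve is held flat before the first point and after the last one.
    """
    if not pairs:
        raise ValueError("no utility points given")
    times = [p[0] for p in pairs]
    if t <= times[0]:
        return pairs[0][1]
    if t >= times[-1]:
        return pairs[-1][1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    (t0, v0), (t1, v1) = pairs[i - 1], pairs[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

For example, a job whose line ends in `0 100 3600 50 7200 0` would be worth 75 SUs if it completed half an hour after submission, under the interpolation assumed here.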

Absent real data collected from users for
each job, we need a model for synthetic
generation...
11
Modeling Three Distinct Decay Patterns
  • Expected Linear
  • Expected Exponential
  • Step
  • "Expected" refers to the fact that each point is
    chosen randomly (i.e., most won't follow the
    pattern as cleanly as shown here)
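
One way the three patterns could be generated is sketched below. This is an illustration under stated assumptions, not the model from the paper: the noise model (uniform jitter around the expected curve), the exponential decay rate, the number of points, and the rule that values never increase are all choices made here. `start_value` is the maximum job value and `deadline` is the time at which the value reaches zero.

```python
import math
import random

PATTERNS = ("linear", "exponential", "step")


def synthetic_utility(pattern, start_value, deadline, n_points=5, noise=0.1, rng=None):
    """Return (time, value) pairs falling from start_value at t=0 to 0 at deadline."""
    if pattern not in PATTERNS:
        raise ValueError(f"unknown pattern: {pattern}")
    rng = rng or random.Random()
    start_value, deadline = float(start_value), float(deadline)
    pairs = [(0.0, start_value)]
    if pattern == "step":
        # Full value right up to the deadline, then nothing.
        pairs.append((deadline, start_value))
        pairs.append((deadline, 0.0))
        return pairs
    for k in range(1, n_points):
        frac = k / n_points
        if pattern == "linear":
            expected = start_value * (1.0 - frac)
        else:
            # Decay rate of 5 is arbitrary (assumption for illustration).
            expected = start_value * math.exp(-5.0 * frac)
        # Each point is drawn randomly around its expected value, then clamped
        # so the curve never rises and never goes negative (assumption).
        jittered = expected * (1.0 + rng.uniform(-noise, noise))
        value = min(pairs[-1][1], max(0.0, jittered))
        pairs.append((frac * deadline, value))
    pairs.append((deadline, 0.0))
    return pairs
```

Passing a seeded `random.Random` as `rng` makes a generated workload reproducible.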

12
Start Values and Deadlines
  • User-provided priority (queue) from the log
    controls the starting (maximum) job value
  • Distribution of actual wait times from the log
    controls the deadline (when the value goes to
    zero)

13
Metric: Aggregate Utility
  • Reflects administrator's priorities: allocation
    of funds (SUs/Monopoly money) to users at the
    beginning of the fiscal year/quarter/month/etc.
  • Reflects users' personal input: how they choose
    to spend their funds
  • Enables more comprehensive evaluation and
    comparison of all job scheduling algorithms
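
As a rough illustration of the metric, aggregate utility can be computed by evaluating each job's utility function at its turnaround time and summing the results. The job representation (a dict with `id`, `submit`, and a callable `utility`) and the function name are assumptions made for this sketch.

```python
def aggregate_utility(jobs, finish_times):
    """Sum every job's utility evaluated at its turnaround time.

    jobs: iterable of dicts with keys "id", "submit", "utility" (a callable
          mapping turnaround time to value).
    finish_times: dict mapping job id to completion time under some schedule.
    """
    total = 0.0
    for job in jobs:
        turnaround = finish_times[job["id"]] - job["submit"]
        total += job["utility"](turnaround)
    return total
```

Because the function only needs completion times, the same call can score the output of any scheduler on the same workload.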

14
Parallel Job Scheduling Explicitly by Utility Function
Finding the best solution is NP-hard
  • Tennis Court Scheduling (human-powered)
  • Still practiced occasionally at most centers
    (officially and not): a phone call to the sys
    admins gets a job a reservation or moves it to
    the front of the queue
  • Custom Heuristics
  • Sort by current value, or a combination of start
    value and slope (Chun and Culler 2002; Irwin,
    Grit, Chase 2004)
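
The simplest of these heuristics, sorting by current value, might look like the sketch below. It is a generic illustration, not the specific algorithm of either cited paper; the job representation and the name `order_by_current_value` are assumptions.

```python
def order_by_current_value(queue, now):
    """Return queued jobs so that the job worth the most right now comes first.

    Each job is a dict with "submit" (submission time) and "utility"
    (a callable mapping time since submission to value).
    """
    return sorted(
        queue,
        key=lambda job: job["utility"](now - job["submit"]),
        reverse=True,
    )
```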

15
Genetic Algorithm Scheduler
  • Individuals: permutations of the job queue
    ordering
  • Mutation: swap two randomly selected jobs
  • Reproduction: zipper-like merging of parents
    (skip duplicates)
  • Fitness: global utility of resulting schedule
    (approx.)
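
The four ingredients above can be sketched as follows. Several parts are assumptions made only to keep the example small and runnable: the fitness function runs jobs one after another on a single resource rather than simulating a parallel machine, selection keeps the better half of the population, and the mutation probability of 0.2 is arbitrary.

```python
import random


def mutate(order, rng):
    """Mutation: swap two randomly selected jobs."""
    child = list(order)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def zipper(parent_a, parent_b):
    """Reproduction: alternate between parents, skipping jobs already placed."""
    child, seen = [], set()
    for a, b in zip(parent_a, parent_b):
        for job in (a, b):
            if job not in seen:
                seen.add(job)
                child.append(job)
    return child


def fitness(order, jobs):
    """Approximate global utility: serial execution stands in for a real simulator.

    jobs maps job id to (runtime, utility), where utility is a callable of
    completion time; all jobs are assumed to be submitted at time 0.
    """
    clock, total = 0.0, 0.0
    for job_id in order:
        runtime, utility = jobs[job_id]
        clock += runtime
        total += utility(clock)
    return total


def evolve(jobs, generations=50, pop_size=20, rng=None):
    """Search over queue orderings; returns the best permutation found."""
    rng = rng or random.Random()
    ids = list(jobs)
    population = [rng.sample(ids, len(ids)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda o: fitness(o, jobs), reverse=True)
        survivors = population[: pop_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = zipper(a, b)
            if rng.random() < 0.2:  # arbitrary mutation probability
                child = mutate(child, rng)
            children.append(child)
        population = survivors + children
    return max(population, key=lambda o: fitness(o, jobs))
```

Because both parents are permutations of the same job set, the zipper merge always yields a valid permutation, so no repair step is needed after reproduction.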

16
Results
  • Schedulers compared:
  • CONS: Conservative Backfilling
  • EASY: Aggressive Backfilling
  • PRIO: Priority FIFO (typical supercomputer
    priority scheduler)
  • GA: genetic algorithm
  • Workload is SDSC-BLUE from the Parallel Workloads
    Archive (Dror Feitelson)
  • Load modified by scaling inter-arrival times
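
Scaling inter-arrival times can be sketched as below: every gap between consecutive submissions is multiplied by a constant, so a factor below 1 packs the same jobs into less time and raises the load. The function name and the assumption that submit times arrive sorted are illustrative; the scaling factors used in the study are not given on the slide.

```python
def scale_interarrivals(submit_times, factor):
    """Rescale the gaps between consecutive submissions by `factor`.

    submit_times must be sorted in ascending order; the first submission
    time is left unchanged.
    """
    if not submit_times:
        return []
    scaled = [submit_times[0]]
    for prev, cur in zip(submit_times, submit_times[1:]):
        scaled.append(scaled[-1] + (cur - prev) * factor)
    return scaled
```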

17
Accurate and Inaccurate Runtimes
[Figure panels: Normal Load; Heavy Load]
Many, many more results in the paper...
18
Current and Future Work
  • Eliciting the Utility Function
  • What would this look like in a production
    environment?
  • Interview users to better see how they think
    about the utility function
  • Quantifying the benefit
  • What is the additional benefit of providing
    additional utility function data points?
  • Who benefits? Everyone? Do users who provide more
    data points than their peers benefit individually?

19
For more information
  • Inaccurate runtime requests survey:
  • Lee, C., Y. Schwartzman, J. Hardy, and A. Snavely.
    "Are user runtime estimates inherently
    inaccurate?" Workshop on Job Scheduling
    Strategies for Parallel Processing, with
    SIGMETRICS, June 2004.
  • Survey collecting SDSC users' utility curves:
  • Lee, C. and A. Snavely. "On the User-Scheduler
    Dialogue: Studies of User-Provided Runtime
    Estimates and Utility Functions." International
    Journal of High Performance Computing
    Applications, vol. 20, 2006.
  • Genetic algorithm scheduler and model for
    generating synthetic utility curves:
  • Lee, C. and A. Snavely. "Precise and Realistic
    Utility Functions for User-Centric Performance
    Analysis of Schedulers." HPDC-16, June 2007.
  • Contact: Cynthia Lee, CL_at_SDSC.EDU