Title: Coordinated Workload Scheduling
1Coordinated Workload Scheduling
- A New Application Domain for Mechanism Design
- Elie Krevat
2Introduction
- Distributed systems becoming larger, more complex
- Nodes perform computation and storage tasks
- Workloads enter system and are distributed across
nodes - Clients run many workloads, can pay for resources
- Nodes service many workloads (not dedicated)
- System provides QoS guarantees
- Performance load balance workloads to faster
free nodes - Efficiency minimize cycles wasted when tasks
available - Fairness nodes share resources across workloads
3Benefits of Shared Storage
Why cluster? Scaling, cost, and management. Why
share? Slack sharing, economies of scale,
uniformity.
4Throughput Performance Insulation in Shared
Storage
- Each of n workloads on a server
- Executes efficiently within its portion of time
(timeslice) - Ideally gets 1/n of its standalone performance
- In practice within a fraction of the ideal
- Argon project Wachs07 provides bounds on
efficiency across workloads for one server - Problems extending to many servers
(cluster-style) - Synchronized workloads need coordination of
schedules - Performance of system limited by slowest node
5Timeslice challenges
140 ms
Server A
Workload 1
Workload 2
Workload 3
Workload 1
Workload 2
280 ms
Server B
Workload 4
Workload 1
Workload 4
100 ms
Server C
1
6
3
5
2
3
6
1
6Cluster-style Storage Systems
Data Block
Synchronized Read
1
R
R
R
R
2
3
Client
Switch
Data Fragment
1
2
3
4
4
Client now sends next batch of requests
Storage Servers
6
7Environment Assumptions
- One client per workload
- Bounded number W of workloads, N of nodes
- Constant set of workloads to be scheduled
- But mechanism might support changing set
- Communication doesnt interfere with
computation/storage tasks
8Workload Distribution Settings
- Two alternative workload distribution settings
- Setting I Free Workload Assignment
- Workloads can be freely assigned to many nodes
- Example Embarrassingly parallel distributed apps
- Problem Determine best set of nodes to assign
- Setting II Fixed Workload Assignment
- Workloads must be assigned to fixed set of nodes
- Example Cluster-style storage
- Problem Coordinate responses of nodes with
better timeslice scheduling
9Computing Environments with Monetary Incentives
- Workloads pay for resources
- Weather forecasting
- Seismic measurement simulations of oil fields
- Distributed systems sell resources
- Supercomputing centers sell resources
- Shared infrastructures
- Grid computing
- Individually-owned computers sell spare cycles
- SETI_at_Home for
- May not have single administrative domain
10Why Mechanism Design?
- Central coordinator(s) lack per-node information
- Different performance capabilities and revenue
models - Enforce cooperation and global QoS
- Efficiency and fairness not always goals of
players - Reduce scheduling problems to general mechanism
- Scheduling coordinated workloads is hard (proof
later) - Divide scheduling problems across nodes
- Design mechanism to produce coordination
11Outline
- Background and Motivation
- ?Mechanism I Free Workload Assignment
- Mechanism II Fixed Workload Assignment
- Conclusions
12Revenue Model Free Assignment
- Clients pay nodes directly after task
- Payment is per-workload
- Amount depends on many factors
- Speed of response
- Number of requests/computations per timeslot
- Clients may also pay fixed cost to central
scheduler - Workloads want the best and fastest nodes
- Central scheduler doesnt know load/speed of
nodes - Nodes are greedy and want lots of workloads
- May lie about load/speed if asked directly
- System Goal Assign workloads to nodes that will
respond fastest
13Mechanism Design VCG
- Run auction to decide which M nodes to assign
- FIFO approach to scheduling each workload
- Can also run combinatorial auction on bundles
- Nodes respond with bids
- Valuations depend on speed and current load
- Same factors that affect final payment
- Apply Vickrey-Clarke-Groves mechanism
- First auction iteration finds top M bids
- Remove Node X, recompute top M bids
- Additional auction iteration not actually
necessary - Difference between Xs bid and M1st bid is
payment - May also normalize payments to share wealth over
nodes
14Mechanism Results
- Incentive compatible
- Nodes have no incentive to lie, since if they
over-report valuation for workload theyll still
be paid true valuation - Global efficiency (i.e., best allocation for
workload) - Related to general task allocation problem
Nisan99 - k tasks allocated to n agents
- Goal is to minimize completion time of last
assignment (make-span) - Valuation of agent is negation of total time
spent on tasks - Approximation/randomized algorithms exist for CA
15Outline
- Background and Motivation
- Mechanism I Free Workload Assignment
- ?Mechanism II Fixed Workload Assignment
- Conclusions
16Revenue Model Fixed Assignment
- Nodes paid by system at every timestep
- Payment is part of mechanism payment scheme
- System wants quick resolution of workload
requests - Nodes need monetary incentives to schedule fully
coordinated workloads efficiently and fairly - All M nodes service workload in same timeslice
- Uncoordinated workloads not important
- System Goals
- Enforce coordination of workloads per timeslice
- Achieve fair distribution of resources
- Achieve efficient schedule allocations
17Coordination is hard
1
2
3
4
- Reduce Max Independent Set problem to problem of
scheduling max of fully coordinated workloads
per timeslice - For every Node xi that services a workload wi,
then wi has a dependency edge to all other
workloads serviced by xi - NP-Complete, but approximation algorithms exist
- For above example, max independent set is 1,3
18Properties of Schedule Allocations
- Two types of schedule allocations
- Basic quanta timeslice allocation a
- Longer sequence of timeslices atot
- Set of workloads serviced by node nx is Sx
- / timeslice quanta allocated per workload wi is
qi - Total quanta count Qx for each node nx
- Delay between consecutive workload schedules in
allocation atot is schedule distance di,k - k refers to schedule instance in atot
- Average schedule distance di,avg, per-node is
dx,avg - Maximum schedule distance di,max, per-node is
dx,max
19Formulas for Schedule Allocation Properties
20Possible Payment Scheme
- Node is paid max of P credits for each scheduled
time quanta - No credits for uncoordinated schedule
- For every cycle of time that workload isnt
scheduled, payment decreases by c (c ltlt P) - Node is fined F if starves workload over a period
of quanta greater than Qthr - Using derived properties of schedule allocations,
each node calculates payments
21Mechanism Design Open Research Problem
- Goal is to improve efficiency and fairness
- Nodes compute their best allocations (through
heuristics) using payment scheme that rewards
efficiency/fairness - Send valuations to central scheduler
- General mechanism determines best global
allocation - But coordination is hard optimization problem
- May be better suited only for central scheduler
- Expected properties of a mechanism
- Nodes are players
- No additional utility past payments?
- Auctioned good may be single or total allocations
- Tradeoff is ability to adapt to changing
workloads vs. better assessment of efficient
allocations over longer time
22Outline
- Background and Motivation
- Mechanism I Free Workload Assignment
- Mechanism II Fixed Workload Assignment
- ?Conclusions
23Conclusions
- Distributed systems environments provide new
applications for mechanism design - Goals of better global performance, efficiency,
fairness - Not always shared by individual nodes
- Model and analysis of 2 different distribution
settings - Free workload assignment solved with VCG
- Fixed workload assignment still open problem
- Revenue model and goals of mechanism vary
- Payment functions use derived allocation
properties - Coordination of workloads is hard optimization
problem - Motivation for further research in related areas