Scheduling Data-Intensive Workflows - PowerPoint PPT Presentation

About This Presentation
Title:

Scheduling Data-Intensive Workflows

Description:

Tim H. Wong, Daniel Zinn, Bertram Ludäscher (UC Davis). Outline. Problem motivation ... Schedule all tasks such that the total time is minimal. Problem Variants ...

Slides: 21
Provided by: chessEecs

Transcript and Presenter's Notes

Title: Scheduling Data-Intensive Workflows


1
Scheduling Data-Intensive Workflows
  • Tim H. Wong, Daniel Zinn, Bertram Ludäscher
  • (UC Davis)

2
Outline
  • Problem motivation
  • Assumptions
  • Cost model
  • Problem formalization
  • Different simplifications and their complexity
  • Prototypical Java implementation for Kepler
  • Summary

3
Motivation: Distributed Execution of Scientific Workflows
4
Motivation: Distributed Execution of Scientific Workflows
  • Process a set of data on a set of machines
  • GOAL: Minimize WF-Execution time!
  • Allocation Problem: Which actors are computed on which hosts?

5
(No Transcript)
6
Cost Model
  • Communication Time TC
  • Function Execution Time TE
  • Total Time TT = TC + TE
  • Shipping and Handling Problem: Schedule all tasks such that the total time is minimal
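A minimal sketch of how the cost of one allocation could be evaluated, assuming the total time is simply the sum of communication time and function execution time with no overlap between the two. The actor names, hosts, numbers, and the `transfer_time` helper are hypothetical and are not taken from the slides.

```python
# Illustrative sketch only: actors, hosts, and all numbers are made up.

def total_time(edges, assignment, exec_time, transfer_time):
    """Return TC + TE for one allocation of actors to hosts.

    edges:         list of (producer, consumer, data_size)
    assignment:    dict actor -> host
    exec_time:     dict (actor, host) -> execution time
    transfer_time: function (size, src_host, dst_host) -> shipping time
    """
    te = sum(exec_time[(actor, host)] for actor, host in assignment.items())
    tc = sum(transfer_time(size, assignment[src], assignment[dst])
             for src, dst, size in edges)
    return tc + te


def transfer_time(size, src_host, dst_host):
    # Assumption: shipping is free on the same host, else size / bandwidth.
    return 0.0 if src_host == dst_host else size / 5.0


edges = [("read", "filter", 10), ("filter", "plot", 2)]
exec_time = {("read", "h1"): 3, ("filter", "h2"): 4, ("plot", "h2"): 1}
assignment = {"read": "h1", "filter": "h2", "plot": "h2"}
result = total_time(edges, assignment, exec_time, transfer_time)
print(result)
```

A scheduler would compare this value across candidate assignments and keep the smallest one.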

7
Problem Variants and Complexities
  • Shipping and Handling Problem (SHP)
    Communication Cost: Non-uniform; Function Execution Cost: Non-uniform
    Complexity: NP-complete (reduction from Task Scheduling Problem, ERLA94)
  • Task Handling Problem (THP)
    Communication Cost: Zero; Function Execution Cost: Non-uniform
    Complexity: NP-complete (reduction from Multiprocessor Scheduling Problem, KA99)
  • Data Shipping Problem (DSP)
    Communication Cost: Non-uniform; Function Execution Cost: Zero
    Complexity: NP-complete (reduction from 1-Multiterminal Cut)
8
easy-DSP: Uniform Transfer Rate, Uniform Data Size
  • Given
  • Directed Acyclic Graph, Set of Colors
  • Some vertices are already colored
  • Edge Weight 1, if two adjacent vertices are of different colors
  • Edge Weight 0, otherwise
  • TASK
  • Color the rest of the vertices such that total weight is minimal!

Cost Model: Minimize Total Shipped Volume!
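The coloring objective can be made concrete with a small brute-force sketch. It tries every coloring of the uncolored vertices and keeps the one with the fewest two-colored edges, so it is exponential and only meant to illustrate the problem statement on tiny inputs. The function names and the example graph are assumptions for illustration.

```python
from itertools import product


def coloring_weight(edges, color):
    # An edge costs 1 if its endpoints have different colors, else 0.
    return sum(1 for u, v in edges if color[u] != color[v])


def best_coloring(vertices, edges, colors, precolored):
    """Exhaustively color the free vertices; return (coloring, weight)."""
    free = [v for v in vertices if v not in precolored]
    best, best_w = None, None
    for choice in product(colors, repeat=len(free)):
        color = dict(precolored)
        color.update(zip(free, choice))
        w = coloring_weight(edges, color)
        if best_w is None or w < best_w:
            best, best_w = color, w
    return best, best_w


# Hypothetical DAG: a -> b, b -> c, b -> d; only b is uncolored.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("b", "d")]
precolored = {"a": "red", "c": "blue", "d": "blue"}
best, best_w = best_coloring(vertices, edges, ["red", "blue"], precolored)
print(best, best_w)
```

In the scheduling reading, colors stand for hosts and a two-colored edge is a data item that has to be shipped between hosts.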
9
1-Multi-Terminal Cut (1-MTC)
  • Given
  • Undirected Graph G = (V, E)
  • Set of Terminals S ⊆ V
  • Edge Weights 1
  • TASK
  • Find a multi-way cut of G with a minimum number of edges

Minimize edges between different terminals!
NP-Complete for 3 or more Terminals!
10
Reduction: 1-MTC < DSP
  • Order the graph, color the terminals
[Figure: a 1-MTC instance and the DSP instance it is mapped to]
11
Reduction: 1-MTC < DSP
[Figure: the 1-MTC instance and the resulting DSP instance, all edge weights 1]
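One way to read the "order graph, color terminals" step is sketched below. It assumes that any fixed vertex order is used to direct each undirected edge from the earlier to the later vertex, which yields an acyclic graph, and that every terminal receives its own color. The function name `mtc_to_dsp` and the example instance are hypothetical.

```python
def mtc_to_dsp(vertices, undirected_edges, terminals):
    """Turn a unit-weight multi-terminal cut instance into a coloring instance."""
    # Direct every edge along a fixed vertex order, so no cycle can arise.
    order = {v: i for i, v in enumerate(vertices)}
    dag_edges = [(u, v) if order[u] < order[v] else (v, u)
                 for u, v in undirected_edges]
    # Each terminal is pre-colored with a distinct color.
    precolored = {t: i for i, t in enumerate(terminals)}
    colors = list(range(len(terminals)))
    return dag_edges, colors, precolored


# Hypothetical instance: a path s1 - a - b - s2 with terminals s1 and s2.
vertices = ["s1", "a", "b", "s2"]
undirected_edges = [("a", "s1"), ("b", "a"), ("s2", "b")]
dag_edges, colors, precolored = mtc_to_dsp(vertices, undirected_edges, ["s1", "s2"])
print(dag_edges, colors, precolored)
```

Under this reading, the edges whose endpoints end up with different colors are exactly the edges removed by the corresponding multi-way cut, so a minimum-weight coloring gives a minimum cut.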
12
(No Transcript)
13
NP-Hard ... But We Still Need to Solve It
  • Greedy Algorithm
  • Dynamic Programming Algorithm
  • Investigate Approximation Algorithms for MTC and related problems!
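The slides do not spell out the greedy algorithm, so the following is only one plausible sketch and not the authors' method. It assumes the vertices are visited in a given order and that each uncolored vertex takes the color most common among its already colored neighbors. The function name and the example graph are assumptions.

```python
from collections import Counter


def greedy_coloring(vertices, edges, colors, precolored):
    """Heuristic: give each free vertex the majority color of its colored neighbors."""
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    color = dict(precolored)
    for v in vertices:  # assumption: the caller supplies a sensible visiting order
        if v in color:
            continue
        votes = Counter(color[n] for n in neighbors[v] if n in color)
        # Fall back to the first color when no neighbor is colored yet.
        color[v] = votes.most_common(1)[0][0] if votes else colors[0]
    return color


# Hypothetical DAG: a -> b, b -> c, b -> d; only b is uncolored.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("b", "d")]
precolored = {"a": "red", "c": "blue", "d": "blue"}
color = greedy_coloring(vertices, edges, ["red", "blue"], precolored)
weight = sum(1 for u, v in edges if color[u] != color[v])
print(color, weight)
```

This runs in time linear in the size of the graph but carries no optimality guarantee, since an early choice can force extra two-colored edges later.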

14
Prototypical Implementation ...
  • Abstract workflow: only some nodes assigned
  • Concrete workflow: all nodes assigned
  • Scheduling: turns the abstract workflow into a concrete one
15
Prototypical Implementation ... in Kepler!
Abstract Workflow ...
SCHEDULING
16
Prototypical Implementation ... in Kepler!
Concrete Workflow ...
17
Future Work
  • Use heuristics about looping to guess multiplicities (then not ACYCLIC any more!)
  • Investigate approximation algorithms with error guarantees for 1-MTC -> try to apply them to DSP
  • ALSO relevant for COMAD: workflows can be compiled into a low-level conventional WF

18
Summary
  • Bad news
  • Scheduling is hard
  • DSP is hard (for BEST plans)
  • Good news
  • Finding a quite good plan is easy
  • Greedy/Dynamic Algorithms
  • Open Problems
  • Approximation Quality of simple algorithms?
  • When do they perform badly?
  • Does this occur often in real-life workflows?

19
References
20
Thank You. Questions?