Title: PSWEEP: A Lightweight Pattern for Distributed Computational Experiments
1PSWEEP A Lightweight Pattern for Distributed
Computational Experiments
- Christopher Mueller and Andrew Lumsdaine
- Open Systems Lab, Indiana University
2Introduction
- Parameter Sweeps are common cluster applications
- Approaches
  - Scripts (sh, perl, ssh, mpi)
  - Low-level applications (C, Fortran, MPI)
  - Parameter sweep applications (e.g., Nimrod)
- Problems
  - Custom solutions become tangled quickly
  - Applications are not available on all platforms
3How do we use our clusters?
4Anatomy of a Parameter Sweep
Parameters and Enumeration Order
    for i in range(rank, n, size):
        if process: load_image(i)
        elif stats: query_image(i)

        for j in 1, 2, 4, 8:
            if process: time(i, j)

            for k in motion, gaussian:
                if process: process_image(i, j, k)
                elif stats: image_stats(i, j, k)
                else:
                    print('ssh n%d run %d %d' % (i, j, k))

            if process: clear_process(k)
            elif bgi: clear_temp(k)

        if process: unload_image(i)

Resource distribution is handled by the execution environment, e.g., mpirun.
5Anatomy of a Parameter Sweep
Tasks and Experiments
6Anatomy of a Parameter Sweep
Artifacts and Errors
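7User's View
[Diagram: Parameters i, j, k take their values from (0, n), (.01, .1, 1.0), and (10, 12, 14). Experiments are labeled script, gen, and print. The enumerated states are (0, 0.01, 10), (0, 0.01, 12), (0, 0.01, 14), (0, 0.1, 10), (0, 0.1, 12), ...]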
8The PSWEEP Pattern
9Abstracting the Loops
Parameter. A Parameter is an iterator or container that supplies the values for a variable in the experiment.

Enumerator. The Enumerator takes an ordered list of parameters and lexicographically enumerates all possible combinations of their values.

State. The State contains the current value of each parameter, in order.
    i = ['house.jpg', 'lena.jpg']
    j = [1, 2, 4, 8]
    k = ['motion', 'gaussian']

    params = [i, j, k]
    e = enumerator(params)

    for state in e: process_image(state)
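A minimal sketch of the enumerator idea, assuming it can be modeled with the standard library's `itertools.product` (the actual PSWEEP enumerator may be implemented differently). The function name `enumerator` follows the slide; everything else is illustrative.

```python
from itertools import product

def enumerator(params):
    """Yield one state (a tuple) per combination of parameter values.

    The last parameter varies fastest, which is the same order that
    nested for-loops over the parameters would produce.
    """
    for state in product(*params):
        yield state

i = ["house.jpg", "lena.jpg"]
j = [1, 2, 4, 8]
k = ["motion", "gaussian"]

states = list(enumerator([i, j, k]))
print(len(states))   # 2 * 4 * 2 combinations
print(states[0])
print(states[1])
```

Each state holds the current value of every parameter in order, so a task receives everything it needs from a single argument.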
10Abstracting the Experiments
Task. A Task is any unit of work performed when a parameter value changes. A Task is subdivided into setup and cleanup operations, corresponding to the work done at the beginning and end of a block of code in a loop, respectively.

Experiment. An Experiment is a collection of tasks.
    def PrepareImage(state, img):
        # Setup
        db_load(img, './current.jpg')
        yield  # suspend the function
        # Cleanup
        delete('./current.jpg')

    def ProcessImage(state, alg):
        data = load('./current.jpg')
        img = process(data, alg(value))
        save(img, str(state) + '.jpg')

        return  # no cleanup
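The setup/cleanup split can be sketched with plain Python generators: code before `yield` is the setup, code after it is the cleanup. The helpers `start` and `finish` below are hypothetical names introduced only for this illustration, and the tasks just record what they do instead of touching image files.

```python
import inspect

log = []

def prepare_image(state, img):
    log.append("setup " + img)      # setup: runs when the value is entered
    yield                           # suspend until the value is left
    log.append("cleanup " + img)    # cleanup: runs on resume

def process_image(state, alg):
    log.append("run " + alg)        # plain function: setup only
    return                          # no cleanup

def start(task, state, value):
    """Run a task's setup; return a handle if it has a pending cleanup."""
    result = task(state, value)
    if inspect.isgenerator(result):
        next(result)                # advance to the yield
        return result
    return None

def finish(handle):
    """Run the pending cleanup, if any."""
    if handle is not None:
        next(handle, None)          # resume past the yield to the end

outer = start(prepare_image, ("house.jpg",), "house.jpg")
for alg in ("motion", "gaussian"):
    finish(start(process_image, ("house.jpg", alg), alg))
finish(outer)
print(log)
```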
11Binding Experiments to State
Bound Task Semantics. Tasks must execute in the same order they would if the parameter sweep were expanded to nested loops.
    for img in images:
        PrepareImage.setup(img)
        for alg in algs:
            ProcessImage.setup(alg)
        PrepareImage.cleanup(img)

    e = enumerator(images, algs)
    e.bind(images, PrepareImage)
    e.bind(algs, ProcessImage)

    for state in e: pass

These examples are equivalent.
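One way to check the equivalence is to record the order of setup and cleanup calls from both forms and compare them. The sketch below is an assumption about how binding could work, not the PSWEEP implementation: `bound_sweep` and `RecordingTask` are hypothetical names, and each task here has both a setup and a cleanup.

```python
from itertools import product

class RecordingTask:
    """Hypothetical task that only records when setup/cleanup run."""
    def __init__(self, name, log):
        self.name, self.log = name, log
    def setup(self, value):
        self.log.append((self.name, "setup", value))
    def cleanup(self, value):
        self.log.append((self.name, "cleanup", value))

def bound_sweep(params, tasks):
    """Enumerate states; tasks[d] is bound to params[d].

    When parameter d changes, cleanups run innermost-first for
    parameters d..last, then setups run outermost-first.
    """
    prev = None
    # Enumerate indices so repeated values still count as changes.
    for idx in product(*(range(len(p)) for p in params)):
        if prev is None:
            first = 0
        else:
            first = next(d for d in range(len(idx)) if idx[d] != prev[d])
            for d in reversed(range(first, len(idx))):
                tasks[d].cleanup(params[d][prev[d]])
        for d in range(first, len(idx)):
            tasks[d].setup(params[d][idx[d]])
        yield tuple(p[n] for p, n in zip(params, idx))
        prev = idx
    if prev is not None:
        for d in reversed(range(len(prev))):
            tasks[d].cleanup(params[d][prev[d]])

images, algs = ["house.jpg", "lena.jpg"], ["motion", "gaussian"]

# Reference ordering: explicit nested loops.
nested_log = []
prep, proc = RecordingTask("prepare", nested_log), RecordingTask("process", nested_log)
for img in images:
    prep.setup(img)
    for alg in algs:
        proc.setup(alg)
        proc.cleanup(alg)
    prep.cleanup(img)

# Same tasks driven by the bound enumerator.
sweep_log = []
tasks = [RecordingTask("prepare", sweep_log), RecordingTask("process", sweep_log)]
for state in bound_sweep([images, algs], tasks):
    pass

print(nested_log == sweep_log)
```

The final cleanups run only when the sweep is consumed to the end, which is the same behavior as a loop that is never exited early.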
12Distributing the Workload
DistributedEnumerator. DistributedEnumerator is
an Enumerator that distributes the state to
multiple instances across multiple computing
resources.
    e = RoundRobin(params)
    for state in e: pass

    States:
      p1                         p2
      house.jpg, 1, motion       house.jpg, 1, gaussian
      house.jpg, 2, motion       house.jpg, 2, gaussian
      house.jpg, 4, motion       house.jpg, 4, gaussian
      lena.jpg, 1, motion        lena.jpg, 1, gaussian
      lena.jpg, 2, motion        lena.jpg, 2, gaussian
      lena.jpg, 4, motion        lena.jpg, 4, gaussian

    e = Domain(params, images)
    for state in e: pass

    States:
      p1                         p2
      house.jpg, 1, motion       lena.jpg, 1, motion
      house.jpg, 1, gaussian     lena.jpg, 1, gaussian
      house.jpg, 2, motion       lena.jpg, 2, motion
      house.jpg, 2, gaussian     lena.jpg, 2, gaussian
      house.jpg, 4, motion       lena.jpg, 4, motion
      house.jpg, 4, gaussian     lena.jpg, 4, gaussian

    e = MasterWorker(params)
    for state in e: pass

    States:
      p1                         p2
      house.jpg, 1, motion       house.jpg, 1, gaussian
      house.jpg, 2, motion       house.jpg, 2, gaussian
      house.jpg, 4, motion       house.jpg, 4, gaussian
      lena.jpg, 1, motion        lena.jpg, 1, gaussian
      lena.jpg, 2, motion        lena.jpg, 2, gaussian
      lena.jpg, 4, motion        lena.jpg, 4, gaussian

DistributedEnumerators must ensure that Bound Task Semantics are satisfied.
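The round-robin and domain assignments shown above can be sketched without MPI by passing the process rank and the process count explicitly. This is only an illustration of how states might be assigned: the function names and signatures are assumptions, bound tasks are ignored, and a master-worker scheme is omitted because it needs inter-process communication.

```python
from itertools import islice, product

def round_robin(params, rank, size):
    """Every size-th state of the full enumeration, starting at rank."""
    return islice(product(*params), rank, None, size)

def domain(params, axis, rank, size):
    """Split the values of params[axis] across ranks; each rank
    sweeps all other parameters in full."""
    local = list(params)
    local[axis] = params[axis][rank::size]
    return product(*local)

images = ["house.jpg", "lena.jpg"]
threads = [1, 2, 4]
algs = ["motion", "gaussian"]
params = [images, threads, algs]

rr_p1 = list(round_robin(params, rank=0, size=2))
rr_p2 = list(round_robin(params, rank=1, size=2))
dom_p1 = list(domain(params, axis=0, rank=0, size=2))
dom_p2 = list(domain(params, axis=0, rank=1, size=2))

print(rr_p1[:2])
print(dom_p2[:2])
```

With two processes, round-robin gives one process every `motion` state and the other every `gaussian` state, while the domain split keeps all states for one image on the same process, which avoids loading each image on both.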
13Implementations
- Python
  - Designed around Iterators and Generators
  - DistributedEnumerator based on pyMPI
  - Ideal for managing experiments on clusters
- C++
  - Template metaprogramming techniques remove abstraction penalties
  - Ideal for applications with many nested loops
14C++ Example
Generate HTML tables for days of the week, with hours for the rows and minutes for the columns.
Task Classes

    struct table_task {
      void setup(State state) {
        std::cout << "<table title=\"";
        print_last_param()(state);
        std::cout << "\">\n";
      }
      void cleanup(State) {
        std::cout << "</table>\n";
      }
    };

    struct table_row_task {
      // As above with <tr>
    };

    struct table_data_task {
      // As above with <td>
    };

Parameter Sweep

    int main()
    {
      using boost::make_tuple;
      sweep(make_tuple("Sat", "Sun",
              make_tuple(range(24),
                make_tuple(range(0,60,10)))),
            empty_state().
              bind<0>(table_task()).
              bind<1>(table_row_task()).
              bind<2>(table_data_task()),
            print_last_param());
      return 0;
    }
15Conclusions
- PSWEEP cleanly separates concerns
  - Parameters
  - Tasks
  - Resources
- Modern languages enable flexible and high-performance implementations
16Reference
A Lightweight Pattern for Managing Distributed Computational Experiments. Christopher Mueller, Douglas Gregor, and Andrew Lumsdaine. Submitted to HPDC 2006.
http://www.osl.iu.edu/chemuell/new/psweep.php
17Questions?