Title: Fault Tolerance and Reliable Data Placement
1. Fault Tolerance and Reliable Data Placement
Zach Miller, University of Wisconsin-Madison
zmiller_at_cs.wisc.edu
2. Fault Tolerant Shell (FTSH)
- A grid is a harsh environment.
- FTSH to the rescue!
- The ease of scripting with very precise error semantics.
- Exception-like structure allows scripts to be both succinct and safe.
- A focus on timed repetition simplifies the most common form of recovery in a distributed system.
- A carefully vetted set of language features limits the "surprises" that haunt system programmers.
3. Simple Bourne script
    #!/bin/sh
    cd /work/foo
    rm -rf data
    cp -r /fresh/data .
What if /work/foo is unavailable?
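4. Getting Grid Ready
    #!/bin/sh
    # Try the cd up to three times, pausing between failures.
    ok=0
    for attempt in 1 2 3
    do
        if cd /work/foo
        then
            ok=1
            break
        fi
        echo "cd failed, trying again..."
        sleep 5
    done
    # Test a flag rather than $?, which would reflect the last sleep, not the cd.
    if [ "$ok" -ne 1 ]
    then
        echo "couldn't cd, giving up..."
        exit 1
    fi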
5. Or with FTSH
    #!/usr/bin/ftsh
    try 5 times
        cd /work/foo
        rm -rf bar
        cp -r /fresh/data .
    end
6. Or with FTSH
    #!/usr/bin/ftsh
    try for 3 days or 100 times
        cd /work/foo
        rm -rf bar
        cp -r /fresh/data .
    end
7. Or with FTSH
    #!/usr/bin/ftsh
    try for 3 days every 1 hour
        cd /work/foo
        rm -rf bar
        cp -r /fresh/data .
    end
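For readers who do not have FTSH at hand, the three "try" variants above can be approximated in Python. This is only a sketch of the idea (repeat until success, an attempt limit, or a time limit), not how FTSH is implemented. The function name try_for and its parameters are hypothetical, and a fixed pause between attempts is an assumption made for simplicity.

```python
import time


def try_for(action, max_attempts=None, max_seconds=None, every=0.0,
            clock=time.monotonic, sleep=time.sleep):
    """Run action() until it succeeds or a limit is reached.

    Returns True on success, False once the attempt or time limit is hit.
    clock and sleep are injectable so the loop can be exercised without waiting.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        try:
            action()
            return True
        except Exception:
            pass  # any exception counts as a failed attempt
        if max_attempts is not None and attempts >= max_attempts:
            return False
        if max_seconds is not None and clock() - start >= max_seconds:
            return False
        sleep(every)
```

Under these assumptions, "try for 3 days every 1 hour" would correspond roughly to try_for(action, max_seconds=3 * 86400, every=3600).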
8. Exponential Backoff Example
    # command_wrapper /path/to/command max_attempts max_time each_time initial_delay
    # zmiller_at_cs.wisc.edu 2003-08-02
    # Assumption: delay starts at initial_delay (argument 5); the slide does not show this line.
    delay=${5}
    try for ${3} hours or ${2} times
        try 1 time for ${4} hours
            ${1} ${6} ${7} ${8} ${9} ${10} ${11} ${12}
        catch
            echo "ERROR: ${1} ${6} ${7} ${8} ${9} ${10} ${11} ${12}"
            echo "sleeping for ${delay} seconds"
            sleep ${delay}
            delay=${delay} .mul. 2
            failure
        end
    catch
        echo "ERROR: all attempts failed... returning failure"
        exit 1
    end
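The same backoff pattern, sketched in Python for comparison. This is an illustration rather than a port of the wrapper: the name run_with_backoff and its parameters are hypothetical, and the per-attempt timeout (each_time) is left out to keep the sketch short.

```python
import time


def run_with_backoff(action, max_attempts, max_seconds, initial_delay,
                     clock=time.monotonic, sleep=time.sleep):
    """Retry action(), doubling the pause after every failure.

    Returns the list of delays that were slept before success.
    Raises RuntimeError when the attempt or time budget is exhausted.
    """
    start = clock()
    delay = initial_delay
    delays = []
    for attempt in range(1, max_attempts + 1):
        try:
            action()
            return delays
        except Exception as err:
            print(f"ERROR: attempt {attempt} failed: {err}")
        # Stop if this was the last attempt or the next pause would exceed the budget.
        if attempt == max_attempts or clock() - start + delay > max_seconds:
            break
        print(f"sleeping for {delay} seconds")
        sleep(delay)
        delays.append(delay)
        delay *= 2
    raise RuntimeError("all attempts failed")
```

Doubling the delay keeps early retries cheap while backing off quickly when a resource stays unavailable.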
9. Another quick example
    hosts="mirror1.wisc.edu mirror2.wisc.edu mirror3.wisc.edu"
    forany h in ${hosts}
        echo "Attempting host ${h}"
        wget http://${h}/some-file
    end
    echo "Got file from ${h}"
- File transfers may be better served by Stork
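The "forany" construct in the example above (take the first alternative that works) can be sketched in Python as follows. The helper name for_any is hypothetical, and trying the alternatives in listed order is an assumption, not a statement about FTSH's behavior.

```python
def for_any(items, action):
    """Return the first item for which action(item) completes without raising."""
    for item in items:
        try:
            action(item)
            return item
        except Exception as err:
            print(f"{item} failed: {err}")
    raise RuntimeError("no alternative succeeded")
```

As in the FTSH example, the caller learns which alternative succeeded, here through the return value.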
10. FTSH Summary
- All the usual shell constructs
- Redirection, loops, conditionals, functions, expressions, nesting, ...
- And more
- Logging
- Timeouts
- Process Cancellation
- Complete parsing at startup
- File cleanup
- Used on Linux, Solaris, Irix, Cygwin, ...
11. FTSH Summary
- Written by Doug Thain
- Available under GPL license at
- http://www.cs.wisc.edu/~thain/research/ftsh/
12. Outline
- Introduction
- FTSH
- Stork
- DiskRouter
- Conclusions
13. A Single Project...
- LHC (Large Hadron Collider)
- Comes online in 2006
- Will produce 1 exabyte of data by 2012
- Accessed by 2000 physicists, 150 institutions, 30 countries
14. And Many Others...
- Genomic information processing applications
- Biomedical Informatics Research Network (BIRN) applications
- Cosmology applications (MADCAP)
- Methods for modeling large molecular systems
- Coupled climate modeling applications
- Real-time observatories, applications, and data-management (ROADNet)
15. The Same Big Problem...
- Need for data placement
- Locate the data
- Send data to processing sites
- Share the results with other sites
- Allocate and de-allocate storage
- Clean up everything
- Do these reliably and efficiently
16. Stork
- A scheduler for data placement activities in the Grid
- What Condor is for computational jobs, Stork is for data placement
- Stork comes with a new concept
- Make data placement a first class citizen in the Grid.
17-19. The Concept
[Figure: a DAG specification feeds DAGMan, which dispatches to a Condor Job Queue and a Stork Job Queue]
DAG specification:
    DaP A A.submit
    DaP B B.submit
    Job C C.submit
    ...
    Parent A child B
    Parent B child C
    Parent C child D, E
    ...
20. Why Stork?
- Stork understands the characteristics and semantics of data placement jobs.
- Can make smart scheduling decisions, for reliable and efficient data placement.
21. Failure Recovery and Efficient Resource Utilization
- Fault tolerance
- Just submit a bunch of data placement jobs, and then go away...
- Control number of concurrent transfers from/to any storage system
- Prevents overloading
- Space allocation and de-allocation
- Make sure space is available
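The concurrency control in the list above can be pictured as one counting semaphore per storage system. The sketch below is illustrative only and says nothing about how Stork implements it. The class name TransferLimiter, the max_per_host parameter, and the transfer callable are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TransferLimiter:
    """Caps the number of simultaneous transfers touching each storage host."""

    def __init__(self, max_per_host):
        self.max_per_host = max_per_host
        self._lock = threading.Lock()
        self._slots = {}

    def _slot(self, host):
        # Create the per-host semaphore lazily, under a lock to avoid races.
        with self._lock:
            if host not in self._slots:
                self._slots[host] = threading.BoundedSemaphore(self.max_per_host)
            return self._slots[host]

    def run(self, host, transfer):
        # Blocks while max_per_host transfers are already active on this host.
        with self._slot(host):
            return transfer()
```

Transfers submitted beyond the limit simply wait for a free slot, which is what keeps a single storage system from being overloaded.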
22. Support for Heterogeneity
Protocol translation using Stork memory buffer.
23. Support for Heterogeneity
Protocol translation using Stork Disk Cache.
24. Flexible Job Representation and Multilevel Policy Support
    [
      Type = "Transfer";
      Src_Url = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat";
      Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat";
      ...
      Max_Retry = 10;
      Restart_in = "2 hours";
      ...
    ]
25. Run-time Adaptation
- Dynamic protocol selection
    [
      dap_type = "transfer";
      src_url = "drouter://slic04.sdsc.edu/tmp/test.dat";
      dest_url = "drouter://quest2.ncsa.uiuc.edu/tmp/test.dat";
      alt_protocols = "nest-nest, gsiftp-gsiftp";
    ]
    [
      dap_type = "transfer";
      src_url = "any://slic04.sdsc.edu/tmp/test.dat";
      dest_url = "any://quest2.ncsa.uiuc.edu/tmp/test.dat";
    ]
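One way to read the alt_protocols field above is as an ordered fallback list: try the protocol named in the URLs first, then each alternative pair in turn. The Python sketch below shows that reading only. It is not Stork's implementation, and the names parse_alt_protocols, transfer_with_fallback, and do_transfer are hypothetical.

```python
def parse_alt_protocols(spec):
    """Turn "nest-nest, gsiftp-gsiftp" into [("nest", "nest"), ("gsiftp", "gsiftp")]."""
    return [tuple(pair.strip().split("-", 1)) for pair in spec.split(",")]


def transfer_with_fallback(src_host, dest_host, path, protocol_pairs, do_transfer):
    """Try each (source, destination) protocol pair in order.

    do_transfer(src_url, dest_url) is a caller-supplied function that raises
    on failure. Returns the pair that succeeded.
    """
    for src_proto, dest_proto in protocol_pairs:
        src_url = f"{src_proto}://{src_host}{path}"
        dest_url = f"{dest_proto}://{dest_host}{path}"
        try:
            do_transfer(src_url, dest_url)
            return (src_proto, dest_proto)
        except Exception as err:
            print(f"{src_proto}-{dest_proto} failed: {err}")
    raise RuntimeError("no protocol pair succeeded")
```

With the first job description above, the pair list would be drouter-drouter followed by the parsed alternatives.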
26. Run-time Adaptation
- Run-time Protocol Auto-tuning
    [
      link = "slic04.sdsc.edu quest2.ncsa.uiuc.edu";
      protocol = "gsiftp";
      bs = 1024KB; //block size
      tcp_bs = 1024KB; //TCP buffer size
      p = 4;
    ]
28. DiskRouter
- A mechanism for high performance, large scale data transfers
- Uses hierarchical buffering to aid in large scale data transfers
- Enables application-level overlay network for maximizing bandwidth
- Supports application-level multicast
29. Store and Forward
[Figure: transfer from A to C through B, shown "Without DiskRouter" and "With DiskRouter" at B]
Improves performance when bandwidth fluctuation between A and B is independent of the bandwidth fluctuation between B and C.
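A toy calculation shows why buffering at B helps. Without a buffer, each interval can move only the smaller of the two link capacities. With a buffer, data that arrived during a fast A-to-B interval can drain later when B-to-C is fast. The numbers below are made up, and the out-of-phase pattern is an illustrative extreme rather than a measurement. The function names are hypothetical.

```python
def delivered_without_buffer(ab, bc):
    """Data reaching C when B cannot hold anything: per-interval minimum."""
    return sum(min(a, b) for a, b in zip(ab, bc))


def delivered_with_buffer(ab, bc):
    """Data reaching C when B stores what it cannot forward immediately."""
    buffered = 0
    delivered = 0
    for a, b in zip(ab, bc):
        buffered += a              # everything A can send arrives at B
        sent = min(b, buffered)    # B forwards as much as the B-C link allows
        buffered -= sent
        delivered += sent
    return delivered


# Capacity per interval (say, megabits); the two links peak at different times.
ab = [100, 20, 100, 20]
bc = [20, 100, 20, 100]
print(delivered_without_buffer(ab, bc))  # 80
print(delivered_with_buffer(ab, bc))     # 240
```

When the two links always fluctuate together, the two totals coincide, which is why the benefit depends on the fluctuations being independent.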
30. DiskRouter Overlay Network
[Figure: A and B connected at 90 Mb/s]
31. DiskRouter Overlay Network
[Figure: A and B connected at 90 Mb/s; DiskRouter node C with two 400 Mb/s links]
Add a DiskRouter node C, which is not necessarily on the path from A to B, to enforce use of an alternative path.
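The path choice on this slide comes down to comparing bottlenecks: a path is only as fast as its slowest link. The sketch below assumes the two 400 Mb/s labels in the figure belong to the A-C and C-B links. The function names are hypothetical.

```python
def path_bandwidth(path, links):
    """Bottleneck bandwidth of a path: the minimum over its hops."""
    return min(links[frozenset(hop)] for hop in zip(path, path[1:]))


def best_path(paths, links):
    """Pick the candidate path with the highest bottleneck bandwidth."""
    return max(paths, key=lambda p: path_bandwidth(p, links))


# Link capacities in Mb/s; the A-C and C-B assignment is an assumption.
links = {
    frozenset(("A", "B")): 90,
    frozenset(("A", "C")): 400,
    frozenset(("C", "B")): 400,
}
candidates = [["A", "B"], ["A", "C", "B"]]
print(best_path(candidates, links))  # ['A', 'C', 'B']
```

Under that assumption the detour through C offers 400 Mb/s against 90 Mb/s on the direct path.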
32. Data Mover/Distributed Cache
[Figure: Source, DiskRouter Cloud, Destination]
- Source writes to the closest DiskRouter and Destination picks it up from its closest DiskRouter
34. Conclusions
- Regard data placement as first class citizen.
- Introduce a specialized scheduler for data placement.
- Introduce a high performance data transfer tool.
- End-to-end automation, fault tolerance, run-time adaptation, multilevel policy support, reliable and efficient transfers.
35. Future work
- Enhanced interaction between Stork, DiskRouter and higher level planners
- Co-scheduling of CPU and I/O
- Enhanced authentication mechanisms
- More run-time adaptation
36. You don't have to FedEx your data anymore... We deliver it for you!
- For more information
- Stork
- Tevfik Kosar
- Email: kosart_at_cs.wisc.edu
- http://www.cs.wisc.edu/condor/stork
- DiskRouter
- George Kola
- Email: kola_at_cs.wisc.edu
- http://www.cs.wisc.edu/condor/diskrouter