Title: STORK
1STORK & NeST: Making Data Placement a First Class
Citizen in the Grid
2Need to move data around..
3While doing this..
- Locate the data
- Access heterogeneous resources
- Recover from all kinds of failures
- Allocate and de-allocate storage
- Move the data
- Clean-up everything
All of these need to be done reliably and
efficiently!
4Stork
- A scheduler for data placement activities in the
Grid
- What Condor is for computational jobs, Stork is
for data placement
- Stork's fundamental concept:
- Make data placement a first class citizen in the
Grid.
5Outline
- Introduction
- The Concept
- Stork Features
- Big Picture
- Conclusions
6The Concept
7The Concept
8The Concept
9The Concept
DAG specification:
Data A A.submit
Data B B.submit
Job C C.submit
..
Parent A child B
Parent B child C
Parent C child D, E
..
[Figure: DAGMan reads the DAG specification and feeds two queues, the Condor job queue and the Stork job queue.]
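The DAG specification on this slide mixes data placement nodes ("Data") and computational nodes ("Job"). A minimal sketch of how a workflow manager might walk such a specification in dependency order and route each node to one of two queues follows. It is an illustration only, not DAGMan's implementation: the parser, the function names, and the sample specification are assumptions, and children that are never declared are simply skipped.

```python
from collections import deque

# Hypothetical sample modeled on the slide's DAG specification.
DAG_SPEC = """\
Data A A.submit
Data B B.submit
Job C C.submit
Parent A child B
Parent B child C
"""

def parse_dag(spec):
    """Return (kinds, submit_files, children) dictionaries for a spec string."""
    kinds, submit_files, children = {}, {}, {}
    for line in spec.splitlines():
        words = line.split()
        if not words:
            continue
        keyword = words[0].lower()
        if keyword in ("data", "job"):
            name, submit = words[1], words[2]
            kinds[name] = keyword
            submit_files[name] = submit
            children.setdefault(name, [])
        elif keyword == "parent":
            # "Parent X child Y, Z": everything after "child" is a child node.
            split_at = [w.lower() for w in words].index("child")
            kids = [w.strip(",") for w in words[split_at + 1:]]
            children.setdefault(words[1], []).extend(kids)
    return kinds, submit_files, children

def dispatch(spec):
    """Route nodes to (condor_queue, stork_queue) in dependency order."""
    kinds, submit_files, children = parse_dag(spec)
    pending = {name: 0 for name in kinds}  # unfinished parents per node
    for kids in children.values():
        for kid in kids:
            if kid in pending:
                pending[kid] += 1
    ready = deque(sorted(n for n, count in pending.items() if count == 0))
    condor_queue, stork_queue = [], []
    while ready:
        node = ready.popleft()
        # Data placement nodes go to one queue, computational nodes to the other.
        target = stork_queue if kinds[node] == "data" else condor_queue
        target.append(submit_files[node])
        for kid in children.get(node, []):
            if kid in pending:
                pending[kid] -= 1
                if pending[kid] == 0:
                    ready.append(kid)
    return condor_queue, stork_queue
```

Calling `dispatch(DAG_SPEC)` returns the two queues as lists of submit files. A real workflow manager would also wait for each node to finish before releasing its children, which this sketch does not model.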
10Why Stork?
- Stork understands the characteristics and
semantics of data placement jobs.
- Can make smart scheduling decisions, for reliable
and efficient data placement.
11Understanding Job Characteristics & Semantics
- Job_type: transfer, reserve, release?
- Source and destination hosts, files, protocols to
use?
- Determine concurrency level
- Can select alternate protocols
- Can select alternate routes
- Can tune network parameters (TCP buffer size, I/O
block size, # of parallel streams)
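One way to picture "can select alternate protocols" is a loop that retries a transfer a bounded number of times and then moves on to the next protocol in a preference list. The sketch below is a hypothetical illustration, not Stork's scheduler: `transfer_with_fallback`, its parameters, and the caller-supplied `attempt` function are all assumed names.

```python
def transfer_with_fallback(src, dest, protocols, attempt, max_retries=2):
    """Try each protocol in order, retrying each a bounded number of times.

    `attempt(protocol, src, dest)` is a caller-supplied (hypothetical)
    function that performs one transfer and raises OSError on failure.
    Returns the protocol that succeeded and the list of recorded failures.
    """
    errors = []
    for protocol in protocols:
        for try_number in range(1, max_retries + 1):
            try:
                attempt(protocol, src, dest)
                return protocol, errors
            except OSError as exc:
                # Remember why this attempt failed, then retry or fall back.
                errors.append((protocol, try_number, str(exc)))
    raise RuntimeError(f"all protocols failed: {errors}")
```

The same pattern could cover alternate routes or tuned network parameters by iterating over candidate configurations instead of protocol names.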
12Support for Heterogeneity
Protocol translation using Stork memory buffer.
13Support for Heterogeneity
Protocol translation using Stork Disk Cache.
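The two translation modes on these slides can be sketched with plain read and write callables standing in for the source and destination protocols. This is an assumption-laden illustration rather than Stork's code: the function names and the block size are hypothetical, and in-memory byte streams stand in for real protocol endpoints.

```python
import io
import tempfile

def translate_via_memory(read_chunk, write_chunk, block_size=64 * 1024):
    """Relay data from one protocol's reader to another's writer, block by block."""
    total = 0
    while True:
        block = read_chunk(block_size)
        if not block:
            break
        write_chunk(block)
        total += len(block)
    return total

def translate_via_disk_cache(read_chunk, write_chunk, block_size=64 * 1024):
    """Stage the whole object in a temporary file, then push it onward."""
    with tempfile.TemporaryFile() as cache:
        translate_via_memory(read_chunk, cache.write, block_size)
        cache.seek(0)
        return translate_via_memory(cache.read, write_chunk, block_size)

# Usage with in-memory stand-ins for the two endpoints.
source = io.BytesIO(b"x" * 200_000)
dest = io.BytesIO()
copied = translate_via_memory(source.read, dest.write)
```

The memory-buffer variant needs both endpoints available at the same time, while the disk-cache variant decouples the two halves of the transfer at the cost of local disk space.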
14Flexible Job Representation
- Type = "Transfer"
- Src_Url = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat"
- Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat"
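To show what a scheduler can learn from such a job description, the sketch below holds the slide's three fields in a dictionary and splits each URL into protocol, host, and path. It assumes the URLs follow the usual `scheme://host/path` form. The dictionary layout and the helper names `endpoints` and `needs_translation` are hypothetical and are not part of Stork.

```python
from urllib.parse import urlsplit

# The three fields from the slide, held in a plain dictionary (an assumption;
# this is not Stork's own job description format).
job = {
    "Type": "Transfer",
    "Src_Url": "srb://ghidorac.sdsc.edu/kosart.condor/x.dat",
    "Dest_Url": "nest://turkey.cs.wisc.edu/kosart/x.dat",
}

def endpoints(job):
    """Split both URLs into (protocol, host, path) triples."""
    result = {}
    for key in ("Src_Url", "Dest_Url"):
        parts = urlsplit(job[key])
        result[key] = (parts.scheme, parts.hostname, parts.path)
    return result

def needs_translation(job):
    """True when source and destination speak different protocols."""
    ends = endpoints(job)
    return ends["Src_Url"][0] != ends["Dest_Url"][0]
```

Here the source uses `srb` and the destination uses `nest`, so `needs_translation(job)` is true and the transfer would need protocol translation.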
15Failure Recovery and Efficient Resource
Utilization
- Fault tolerance
- Just submit a bunch of data placement jobs, and
then go away..
- Control number of concurrent transfers from/to
any storage system
- Prevents overloading
- Space allocation and de-allocations
- Make sure space is available
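Capping concurrent transfers per storage system can be sketched as a small counter table guarded by a lock. This is a hypothetical illustration of the idea, not Stork's mechanism: the class name, the method names, and the single shared limit are assumptions.

```python
import threading
from collections import defaultdict

class TransferThrottle:
    """Cap the number of simultaneous transfers touching each storage host."""

    def __init__(self, max_per_host):
        self._max = max_per_host
        self._lock = threading.Lock()
        self._active = defaultdict(int)  # host -> transfers in progress

    def try_start(self, src_host, dest_host):
        """Claim one slot on both hosts, or claim nothing and return False."""
        hosts = {src_host, dest_host}
        with self._lock:
            if any(self._active[h] >= self._max for h in hosts):
                return False
            for h in hosts:
                self._active[h] += 1
            return True

    def finish(self, src_host, dest_host):
        """Release the slots claimed by a successful try_start call."""
        with self._lock:
            for h in {src_host, dest_host}:
                self._active[h] -= 1
```

A scheduler loop would call `try_start` before launching a transfer, leave the job queued when it returns False, and call `finish` once the transfer ends or fails.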
16Outline
- Introduction
- The Concept
- Stork Features
- Big Picture
- Conclusions
17USER
JOB DESCRIPTIONS
Abstract DAG
18USER
JOB DESCRIPTIONS
Abstract DAG
Concrete DAG
19USER
JOB DESCRIPTIONS
Abstract DAG
Concrete DAG
DATA PLACEMENT SCHEDULER
COMPUTATION SCHEDULER
20USER
JOB DESCRIPTIONS
Abstract DAG
Concrete DAG
POLICY ENFORCER
DATA PLACEMENT SCHEDULER
COMPUTATION SCHEDULER
C. JOB LOG FILES
D. JOB LOG FILES
21USER
JOB DESCRIPTIONS
Abstract DAG
Concrete DAG
POLICY ENFORCER
DATA PLACEMENT SCHEDULER
COMPUTATION SCHEDULER
C. JOB LOG FILES
D. JOB LOG FILES
DATA MINER
NETWORK MONITORING TOOLS
FEEDBACK MECHANISM
22USER
JOB DESCRIPTIONS
Abstract DAG
Concrete DAG
DAGMAN
MATCHMAKER
STORK
CONDOR/ CONDOR-G
C. JOB LOG FILES
D. JOB LOG FILES
DATA MINER
NETWORK MONITORING TOOLS
FEEDBACK MECHANISM
23Conclusions
- Regard data placement as individual jobs.
- Treat computational and data placement jobs
differently.
- Introduce a specialized scheduler for data
placement.
- Provide end-to-end automation, fault tolerance,
run-time adaptation, multilevel policy support,
reliable and efficient transfers.
24Future work
- Enhanced interaction between Stork and higher
level planners
- Better coordination of CPU and I/O
- Interaction between multiple Stork servers and
job delegation from one to another
- Enhanced authentication mechanisms
- More run-time adaptation
25Related Publications
- Tevfik Kosar and Miron Livny. Stork: Making Data
Placement a First Class Citizen in the Grid. In
Proceedings of 24th IEEE Int. Conference on
Distributed Computing Systems (ICDCS 2004),
Tokyo, Japan, March 2004.
- George Kola, Tevfik Kosar and Miron Livny. A
Fully Automated Fault-tolerant System for
Distributed Video Processing and Off-site
Replication. To appear in Proceedings of 14th ACM
Int. Workshop on Network and Operating Systems
Support for Digital Audio and Video (NOSSDAV
2004), Kinsale, Ireland, June 2004.
- Tevfik Kosar, George Kola and Miron Livny. A
Framework for Self-optimizing, Fault-tolerant,
High Performance Bulk Data Transfers in a
Heterogeneous Grid Environment. In Proceedings
of 2nd Int. Symposium on Parallel and Distributed
Computing (ISPDC 2003), Ljubljana, Slovenia,
October 2003.
- George Kola, Tevfik Kosar and Miron Livny.
Run-time Adaptation of Grid Data Placement
Jobs. In Proceedings of Int. Workshop on
Adaptive Grid Middleware (AGridM 2003), New
Orleans, LA, September 2003.
26You don't have to FedEx your data anymore..
Stork delivers it for you!
- For more information
- Email: condor-admin_at_cs.wisc.edu
- http://www.cs.wisc.edu/condor/stork
27NeST
- NeST (Network Storage Technology)
- A lightweight, portable storage manager for data
placement activities on the Grid
- Allocation: NeST negotiates "mini storage
contracts" between users and server.
- Multi-protocol: Supports Chirp, GridFTP, NFS,
HTTP
- Chirp is NeST's internal protocol.
- Secure: GSI authentication
- Lightweight: Configuration and installation can
be performed in minutes, and does not require
root.
28Why storage allocations?
- Users need both temporary storage, and long-term
guaranteed storage.
- Administrators need a storage solution with
configurable limits and policy.
- Administrators will benefit from NeST's automatic
reclamation of expired storage allocations.
29Storage allocations in NeST
- Lot: abstraction for storage allocation with an
associated handle
- Handle is used for all subsequent operations on
this lot
- Client requests lot of a specified size and
duration. Server accepts or rejects client
request.
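The request-and-accept exchange described above can be sketched as a tiny in-memory server that hands out opaque handles and reclaims lots once their duration has passed. This is an illustrative model only, not NeST's API: `LotServer`, `request_lot`, `reclaim_expired`, and the capacity check are assumed names and behavior.

```python
import time
import uuid

class LotServer:
    """Toy model of lot negotiation: accept or reject, then reclaim on expiry."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.lots = {}  # handle -> (size_bytes, expires_at)

    def reserved(self):
        """Total bytes currently promised to live lots."""
        return sum(size for size, _ in self.lots.values())

    def request_lot(self, size_bytes, duration_s, now=None):
        """Return a handle if the request is accepted, or None if rejected."""
        now = time.time() if now is None else now
        if size_bytes <= 0 or duration_s <= 0:
            return None
        if self.reserved() + size_bytes > self.capacity:
            return None
        handle = uuid.uuid4().hex  # opaque handle used for later operations
        self.lots[handle] = (size_bytes, now + duration_s)
        return handle

    def reclaim_expired(self, now=None):
        """Drop lots whose duration has passed and return their handles."""
        now = time.time() if now is None else now
        expired = [h for h, (_, end) in self.lots.items() if end <= now]
        for handle in expired:
            del self.lots[handle]
        return expired
```

The optional `now` argument exists only to make the sketch easy to exercise without waiting for real time to pass.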
30Lot types
- User / Group
- User: single user (user controls ACL)
- Group: shared use (all members control ACL)
- Best effort / Guaranteed
- Best effort: server may purge data if necessary.
Good fit for derived data.
- Guaranteed: server honors request duration.
- Hierarchical: Lots within lots (sublots)
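The difference between best-effort and guaranteed lots shows up when the server runs short of space. The sketch below picks which lots to purge while never touching guaranteed ones. It is a hypothetical policy for illustration: the `Lot` record, the function name, and the largest-first ordering are assumptions, and the slides do not say how NeST chooses victims.

```python
from dataclasses import dataclass

@dataclass
class Lot:
    handle: str
    size: int          # bytes held by the lot
    guaranteed: bool   # False means best effort

def choose_purge_victims(lots, bytes_needed):
    """Pick best-effort lots to purge until bytes_needed is covered.

    Returns a list of handles, or None when purging every best-effort lot
    would still not free enough space. Guaranteed lots are never selected.
    """
    victims, freed = [], 0
    best_effort = sorted(
        (lot for lot in lots if not lot.guaranteed),
        key=lambda lot: lot.size,
        reverse=True,  # assumed policy: largest lots first
    )
    for lot in best_effort:
        if freed >= bytes_needed:
            break
        victims.append(lot.handle)
        freed += lot.size
    return victims if freed >= bytes_needed else None
```

Because purged data must be reproducible, this kind of policy fits derived data, as the slide notes.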
31Lot operations
- Create, Delete, Update
- MoveFile
- Moves files between lots
- AddUser, RemoveUser
- Lot level authorization
- List of users allowed to request sub-lots
- Attach / Detach
- Performs NeST lot to path binding
32Functionality GT4 GridFTP
Sample Application
(GSI-FTP)
Disk Module
Disk Storage
33Functionality GridFTP NeST
(Lot operations, etc.)
(File transfers)
(chirp)
(GSI-FTP)
NeST Module
(File transfer) (chirp)
Chirp Handler
Disk Storage
34NeST with Stork
GT4
NeST
(Lot operations, etc.) (chirp)
(File transfers) (GSI-FTP)
NeST Module
(File transfer) (chirp)
Chirp Handler
Disk Storage
35NeST Sample Work DAG
Condor-G
Stork
Stork
36Connection Manager
- Used to control connections to NeST
- Allows connection reservations
- Reserve # of simultaneous connections
- Reserve total bytes of transfer
- Reservations have durations & expire
- Reservations are persistent
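The reservation rules listed above (a connection count, a byte budget, and a duration after which the reservation expires) can be modeled in a few lines. This is a sketch under stated assumptions, not NeST's Connection Manager: the class layout and the method names `reserve` and `charge` are hypothetical, and persistence is not modeled.

```python
import time

class ConnectionManager:
    """Toy model of connection reservations with byte budgets and expiry."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.reservations = {}  # name -> reservation record

    def _drop_expired(self, now):
        expired = [n for n, r in self.reservations.items() if r["expires_at"] <= now]
        for name in expired:
            del self.reservations[name]

    def reserve(self, name, connections, total_bytes, duration_s, now=None):
        """Reserve connections and a byte budget; False if capacity is short."""
        now = time.time() if now is None else now
        self._drop_expired(now)
        held = sum(r["connections"] for r in self.reservations.values())
        if name in self.reservations or held + connections > self.max_connections:
            return False
        self.reservations[name] = {
            "connections": connections,
            "bytes_left": total_bytes,
            "expires_at": now + duration_s,
        }
        return True

    def charge(self, name, nbytes, now=None):
        """Deduct transferred bytes; False if expired, unknown, or over budget."""
        now = time.time() if now is None else now
        self._drop_expired(now)
        record = self.reservations.get(name)
        if record is None or nbytes > record["bytes_left"]:
            return False
        record["bytes_left"] -= nbytes
        return True
```

Making reservations persistent, as the slide requires, would mean writing `self.reservations` to stable storage on every change and reloading it at startup.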
37Release Status
- http://www.cs.wisc.edu/condor/nest
- v0.9.7 expected soon
- v0.9.7 Pre 2 released
- Just bug fixes; features frozen
- v1.0 expected later this year
- Currently supports Linux, will support other
OSes in the future.
38Roadmap
- Performance tests with Stork
- Continue hardening code base
- Expand supported platforms
- Solaris & other UNIX-en
- Bundle with Condor
- Connection Manager
39Questions?
- More information available at
http://www.cs.wisc.edu/condor/nest