Title: Enabling Data Intensive Science with Tactical Storage Systems
1 Enabling Data Intensive Science with Tactical Storage Systems
- Prof. Douglas Thain
- University of Notre Dame
- http://www.cse.nd.edu/~dthain
2 The Cooperative Computing Lab
- Our model of computer science research
- Understand how users with complex, large-scale applications need to interact with computing systems.
- Design novel computing systems that can be applied by many different users: basic CS research.
- Deploy code in real systems with real users, suffer real bugs, and learn real lessons: applied CS.
- Application Areas
- Astronomy, Bioinformatics, Biometrics, Molecular Dynamics, Physics, Game Theory, ... ???
- External Support: NSF, IBM, Sun
http://www.cse.nd.edu/~ccl
3 Two Talks in One
- Paper at Supercomputing
- Applications of Tactical
4 Abstract
- Users of distributed systems encounter many practical barriers between their jobs and the data they wish to access.
- Problem: Users have access to many resources (disks), but are stuck with the abstractions (cluster NFS) provided by administrators.
- Solution: Tactical Storage Systems allow any user to create, reconfigure, and tear down abstractions without bugging the administrator.
5 The Standard Model
6 The Standard Model
7 Problems with the Standard Model
- Users encounter partitions in the WAN.
- Easy to access data inside cluster, hard outside.
- Must use different mechanisms on different links.
- Difficult to combine resources together.
- Different access modes for different purposes.
- File transfer: preparing the system for its intended use.
- File system: access to data for running jobs.
- Resources go unused.
- Disks on each node of a cluster.
- Unorganized resources in a department/lab.
- A global file system can't satisfy everyone!
8 What if...
- Users could easily access any storage?
- I could borrow an unused disk for NFS?
- An entire cluster can be used as storage?
- Multiple clusters could be combined?
- I could reconfigure structures without root?
- (Or bugging the administrator daily.)
- Solution: Tactical Storage System (TSS)
9 Outline
- Problems with the Standard Model
- Tactical Storage Systems
- File Servers, Catalogs, Abstractions, Adapters
- Applications
- Remote Database Access for BaBar Code
- Remote Dynamic Linking for CDF Code
- Logical Data Access for Bioinformatics Code
- Expandable Database for MD Simulation
- Improving the OS for Grid Computing
10 Tactical Storage Systems (TSS)
- A TSS allows any node to serve as a file server or as a file system client.
- All components can be deployed without special privileges but with security.
- Users can build up complex structures.
- Filesystems, databases, caches, ...
- Two Independent Concepts
- Resources: The raw storage to be used.
- Abstractions: The organization of storage.
11 (Figure: an application on top of an adapter, with an unspecified layer, marked ???, between the adapter and seven file systems.)
12 Components of a TSS
- 1 File Servers
- 2 Catalogs
- 3 Abstractions
- 4 Adapters
13 1 - File Servers
- Unix-Like Interface
- open/close/read/write
- getfile/putfile to stream whole files
- opendir/stat/rename/unlink
- Complete Independence
- choose friends
- limit bandwidth/space
- evict users?
- Trivial to Deploy
- run server + setacl
- no privilege required
- can be thrown into a grid system
- Flexible Access Control
(Figure: file server A and file server B speak the Chirp protocol on top of a file system; each server has its own owner.)
14 Related Work
- Lots of file services for the Grid
- GridFTP, NeST, SRB, RFIO, SRM, IBP, ...
- (Adapter interfaces with many of these!)
- Why have another file server?
- Reason 1: Must have precise Unix semantics!
- Apps distinguish ENOENT vs EACCES vs EISDIR.
- FTP always returns error 550, regardless of the underlying error.
- Reason 2: TSS focused on easy deployment.
- No privilege required, no config files, no rebuilding, flexible access control, ...
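Why precise error codes matter can be shown with a minimal sketch. The helper name `classify_open_failure` is hypothetical and the behavior shown assumes a POSIX system; the point is only that a Unix program branches on the specific errno, which is impossible if the file service collapses every failure into one code.

```python
import errno
import os
import tempfile

def classify_open_failure(path):
    """Try to open path for reading and report which Unix error occurred."""
    try:
        with open(path, "rb"):
            return "ok"
    except OSError as e:
        # Applications commonly take different actions per errno value.
        if e.errno == errno.ENOENT:
            return "ENOENT: no such file, perhaps create it"
        if e.errno == errno.EACCES:
            return "EACCES: file exists but access is denied"
        if e.errno == errno.EISDIR:
            return "EISDIR: path names a directory"
        return "other: " + errno.errorcode.get(e.errno, str(e.errno))

# Demonstration on a scratch directory (POSIX behavior assumed).
with tempfile.TemporaryDirectory() as scratch:
    missing = classify_open_failure(os.path.join(scratch, "nope"))
    is_dir = classify_open_failure(scratch)
```

A remote file server that reports only a generic failure would force all three branches into the same recovery path.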
15 Access Control in File Servers
- Unix Security is not Sufficient
- No global user database possible/desirable.
- Mapping external credentials to Unix gets messy.
- Instead, Make External Names First-Class
- Perform access control on remote, not local, names.
- Types: Globus, Kerberos, Unix, Hostname, Address
- Each directory has an ACL
- globus:/O=NotreDame/CN=DThain RWLA
- kerberos:dthain@nd.edu RWL
- hostname:*.cs.nd.edu RL
- address:192.168.1.* RWLA
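A per-directory ACL of this kind can be sketched as a list of (subject pattern, rights) pairs. This is an illustration under stated assumptions, not the Chirp implementation: the `type:name` subject syntax, the shell-style wildcard matching, and the rule that rights from all matching entries are combined are assumptions made for the example, and `rights_for` and `allowed` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory form of one directory's ACL.
ACL = [
    ("globus:/O=NotreDame/CN=DThain", "RWLA"),
    ("kerberos:dthain@nd.edu", "RWL"),
    ("hostname:*.cs.nd.edu", "RL"),
    ("address:192.168.1.*", "RWLA"),
]

def rights_for(subject, acl):
    """Union of rights from every entry whose pattern matches subject.

    Assumption: matching entries are combined; the source does not say.
    """
    granted = set()
    for pattern, rights in acl:
        if fnmatchcase(subject, pattern):
            granted.update(rights)
    return granted

def allowed(subject, right, acl=ACL):
    """True if the externally authenticated subject holds the given right."""
    return right in rights_for(subject, acl)
```

Note that the check is made against the external name itself; no mapping to a local Unix account is involved.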
16 Problem: Shared Namespace
(Figure: a file server with the ACL globus:/O=NotreDame/* RWLAX.)
17 Solution: Reservation (V) Right
(Figure: a file server with the ACL /O=NotreDame/CN=* V(RWLA); the V right allows mkdir only.)
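The reservation right can be sketched as follows, assuming that a subject holding V(RWLA) may only create a directory, and that the new directory starts with an ACL naming the creator alone with the rights listed in parentheses. The `Dir` class, `parse_rights`, and `mkdir` are hypothetical names, and the inherit-on-W branch is likewise an assumption made for contrast.

```python
from fnmatch import fnmatchcase

class Dir:
    """Hypothetical directory node: an ACL plus child directories."""
    def __init__(self, acl):
        self.acl = acl          # list of (subject pattern, rights string)
        self.children = {}

def parse_rights(rights):
    """Split 'V(RWLA)' into (has_reserve, reserved_rights, plain_rights)."""
    if rights.startswith("V(") and rights.endswith(")"):
        return True, rights[2:-1], ""
    return False, "", rights

def mkdir(parent, name, subject):
    """Create a child directory if subject holds V or W on parent."""
    for pattern, rights in parent.acl:
        if not fnmatchcase(subject, pattern):
            continue
        reserve, reserved, plain = parse_rights(rights)
        if reserve:
            # Fresh ACL: only the creator, with the rights inside V(...).
            child = Dir([(subject, reserved)])
        elif "W" in plain:
            # Assumed behavior for plain write access: inherit parent ACL.
            child = Dir(list(parent.acl))
        else:
            continue
        parent.children[name] = child
        return child
    raise PermissionError(f"{subject} may not mkdir here")

# Every Notre Dame user may reserve a private directory, nothing more.
root = Dir([("/O=NotreDame/CN=*", "V(RWLA)")])
home = mkdir(root, "alice", "/O=NotreDame/CN=Alice")
```

In this sketch users share one namespace without being able to read or overwrite each other's directories, which is the problem the previous slide raises.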
18 2 - Catalogs
(Figure: two catalog servers receive periodic UDP updates and publish their listings over HTTP as XML, TXT, or ClassAds.)
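The catalog idea, periodic updates in and a listing out, can be sketched without any networking. This is a minimal model under stated assumptions: the JSON packet format, the `lifetime` default, and the `Catalog` class are hypothetical, chosen only to show why periodic updates let stale servers disappear on their own.

```python
import json
import time

class Catalog:
    """Hypothetical catalog: remembers the latest report from each server."""
    def __init__(self, lifetime=300.0):
        self.lifetime = lifetime   # seconds before a silent server is dropped
        self.entries = {}          # server name -> (arrival time, report dict)

    def receive(self, datagram, now=None):
        """Handle one update packet, assumed here to be a JSON object."""
        report = json.loads(datagram.decode("utf-8"))
        arrived = time.time() if now is None else now
        self.entries[report["name"]] = (arrived, report)

    def listing(self, now=None):
        """Return reports from servers heard from within the lifetime."""
        now = time.time() if now is None else now
        return [r for (t, r) in self.entries.values()
                if now - t <= self.lifetime]

catalog = Catalog(lifetime=300.0)
catalog.receive(b'{"name": "serverA", "free_mb": 500}', now=0.0)
catalog.receive(b'{"name": "serverB", "free_mb": 900}', now=200.0)
names_early = sorted(r["name"] for r in catalog.listing(now=250.0))
names_late = sorted(r["name"] for r in catalog.listing(now=400.0))
```

Because updates are fire-and-forget, a server that crashes simply stops reporting and ages out of the listing; no explicit deregistration is needed.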
19 3 - Abstractions
- An abstraction is an organizational layer built on top of one or more file servers.
- End Users choose what abstractions to employ.
- Working Examples
- CFS: Central File System
- DSFS: Distributed Shared File System
- DSDB: Distributed Shared Database
- Others Possible?
- Distributed Backup System
- Striped File System (RAID/Zebra)
20 CFS: Central File System
(Figure: three applications, each running on an adapter with the CFS abstraction, share the files on a single file server.)
21 DSFS: Dist. Shared File System
(Figure: two applications, each running on an adapter with the DSFS abstraction, share three file servers. The servers hold files and pointers (ptr) to files; a file may have pointers to multiple copies.)
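The pointer idea behind DSFS can be sketched with in-memory stand-ins. This is an illustration, not the actual DSFS code: `FileServer`, the `data-N` key scheme, and the first-reachable-copy read policy are all assumptions made for the example.

```python
class FileServer:
    """Hypothetical stand-in for one file server: a name -> content store."""
    def __init__(self, name):
        self.name = name
        self.files = {}

class DSFS:
    """One server holds the directory of pointers; data lives elsewhere."""
    def __init__(self, directory, data_servers):
        self.directory = directory
        self.servers = {s.name: s for s in data_servers}
        self.counter = 0

    def write(self, path, data, copies=1):
        pointers = []
        for server in list(self.servers.values())[:copies]:
            self.counter += 1
            key = f"data-{self.counter}"
            server.files[key] = data
            pointers.append((server.name, key))
        # The "file" users see is only a list of pointers.
        self.directory.files[path] = pointers

    def read(self, path):
        for server_name, key in self.directory.files[path]:
            server = self.servers.get(server_name)
            if server is not None and key in server.files:
                return server.files[key]   # first reachable copy wins
        raise FileNotFoundError(path)

meta, s1, s2 = FileServer("meta"), FileServer("s1"), FileServer("s2")
fs = DSFS(meta, [s1, s2])
fs.write("/results/run1", b"trajectory", copies=2)
s1.files.clear()                       # simulate losing one data server
survivor = fs.read("/results/run1")
```

Keeping pointers separate from data is what lets the same namespace span many disks and tolerate the loss of one copy.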
22 DSDB: Dist. Shared Database
(Figure: two applications, each running on an adapter with the DSDB abstraction, insert into and query a database server that holds a file index, create files on two file servers, and then access those files directly.)
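The division of labor in DSDB, an index in a database and file bodies on file servers, can be sketched as below. This is a hedged model only: the `DSDB` class, the fewest-files placement rule, and the `objN` key scheme are hypothetical.

```python
class DSDB:
    """Index records in a database; keep the file bodies on file servers."""
    def __init__(self, servers):
        self.servers = servers      # dict: server name -> {key: content}
        self.index = []             # stands in for the database server
        self.next_id = 0

    def insert(self, metadata, data):
        # Assumed placement rule: the server currently holding fewest files.
        name = min(self.servers, key=lambda n: len(self.servers[n]))
        key = f"obj{self.next_id}"
        self.next_id += 1
        self.servers[name][key] = data                  # create the file
        self.index.append({**metadata, "server": name, "key": key})
        return name, key

    def query(self, **conditions):
        """Return index records whose fields equal all given conditions."""
        return [r for r in self.index
                if all(r.get(k) == v for k, v in conditions.items())]

    def fetch(self, record):
        # Direct access: the data never passes through the database.
        return self.servers[record["server"]][record["key"]]

db = DSDB({"a": {}, "b": {}})
first = db.insert({"molecule": "bpti", "temp": 300}, b"run at 300K")
second = db.insert({"molecule": "bpti", "temp": 310}, b"run at 310K")
hits = db.query(temp=310)
body = db.fetch(hits[0])
```

Only small metadata records reach the database; bulk data moves directly between the application and the file servers.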
23 4 - Adapter
- Like an OS Kernel
- Tracks procs, files, etc.
- Adds new capabilities.
- Enforces owner's policies.
- Delegated Syscalls
- Trapped via ptrace interface.
- Action taken by Parrot.
- Resources charged to Parrot.
- User Chooses Abstraction
- Appears as a filesystem.
- Option: Timeout tolerance.
- Option: Consistency semantics.
- Option: Servers to use.
- Option: Auth mechanisms.
(Figure: system calls from the application are trapped via ptrace by the adapter, Parrot, which keeps a process table and a file table and drives the abstractions CFS, DSFS, and DSDB as well as HTTP, FTP, RFIO, NeST, SRB, gLite, ???.)
24 (Figure: an application on top of an adapter, with an unspecified layer, marked ???, between the adapter and seven file systems.)
25 Performance Summary
- Nothing comes for free!
- System calls: order of magnitude slower.
- Memory bandwidth overhead: extra copies.
- However:
- TSS can take full advantage of bandwidth (unlike NFS).
- TSS can drive network/switch to limits.
- Typical slowdown on real apps: 5-10 percent.
- Allows one to harness resources that would go unused.
- Observation: Most users are constrained by functionality.
26 Outline
- Problems with the Standard Model
- Tactical Storage Systems
- File Servers, Catalogs, Abstractions, Adapters
- Applications
- Remote Database Access for BaBar Code
- Remote Dynamic Linking for CDF Code
- Logical Data Access for Bioinformatics Code
- Expandable Database for MD Simulation
- Improving the OS for Grid Computing
27 Remote Database Access
Credit: Sander Klous @ NIKHEF
- HEP Simulation Needs Direct DB Access
- App linked against Objectivity DB.
- Objectivity accesses filesystem directly.
- How to distribute application securely?
- Solution: Remote Root Mount via TSS
- parrot -M /=/chirp/fileserver/rootdir
- DB code can read/write/lock files directly.
(Figure: sim.exe and libdb.so run under Parrot with the CFS abstraction and reach, across the WAN with GSI authentication, a TSS file server whose file system holds the script and DB data.)
28 Remote Application Loading
Credit: Igor Sfiligoi @ Fermi National Lab
- Modular Simulation Needs Many Libraries
- Devel. on workstations, then ported to grid.
- Selection of library depends on analysis tech.
- Constraint: Must use HTTP for file access.
- Solution: Dynamic Link with TSS+HTTP
- /home/cdfsoft -> /http/dcaf.fnal.gov/cdfsoft
(Figure: the application runs under Parrot, which fetches liba.so, libb.so, and libc.so through HTTP proxies from an HTTP server's file system, selecting several MB from 60 GB of libraries.)
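The redirection from a local path to a remote one can be pictured as a prefix rewrite applied to every path the application opens. The sketch below is an illustration, not Parrot's code: the `MOUNTS` table, the `redirect` function, and the longest-prefix rule are assumptions.

```python
# Hypothetical mount list: logical prefix -> physical prefix.
MOUNTS = {
    "/home/cdfsoft": "/http/dcaf.fnal.gov/cdfsoft",
}

def redirect(path, mounts=MOUNTS):
    """Rewrite path using the longest matching mount prefix."""
    best = None
    for prefix in mounts:
        # Match whole path components only, so /home/cdfsoftware is untouched.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return path                      # not mounted: leave untouched
    return mounts[best].rstrip("/") + path[len(best.rstrip("/")):]

lib = redirect("/home/cdfsoft/lib/liba.so")
other = redirect("/home/cdfsoftware/x")
rooted = redirect("/etc/passwd", {"/": "/chirp/fileserver/rootdir"})
```

The last call shows the same rewrite used as a remote root mount: every path, including the dynamic linker's library lookups, lands on the remote server without changing the application.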
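29 Technical Problem
- HTTP is not a filesystem! (No directories)
- Advantages: Firewalls, caches, admins.
(Figure: the application runs on Parrot with its HTTP module and talks to an HTTP server that exports a directory tree: root, with etc, home, and bin, and with alice, cms, and babar beneath.)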
30 Technical Problem
- Solution: Turn the directories into files.
- Can be cached in ordinary proxies!
(Figure: the same application, Parrot, HTTP module, and HTTP server, with a "make httpfs" step applied to the server's directory tree.)
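Turning directories into files can be sketched as a one-time pass over the exported tree that writes a listing file into every directory. This is a minimal sketch under stated assumptions: the file name `.dirlist`, the one-entry-per-line format, and both function names are hypothetical, and the client side reads from the local disk where a real client would issue an HTTP GET.

```python
import os
import tempfile

INDEX_NAME = ".dirlist"   # hypothetical name for the generated listing file

def make_listings(root):
    """Write one listing file per directory under root; return how many."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        lines = [name + "/" for name in sorted(dirnames)]
        lines += [name for name in sorted(filenames) if name != INDEX_NAME]
        with open(os.path.join(dirpath, INDEX_NAME), "w") as f:
            f.write("".join(line + "\n" for line in lines))
        count += 1
    return count

def list_dir(root, relpath):
    """What a client would do: fetch the listing file and parse it."""
    with open(os.path.join(root, relpath, INDEX_NAME)) as f:
        return f.read().splitlines()

# Build a tiny tree and publish it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "home", "alice"))
    os.makedirs(os.path.join(root, "bin"))
    open(os.path.join(root, "bin", "ls"), "w").close()
    written = make_listings(root)
    top = list_dir(root, "")
    bin_listing = list_dir(root, "bin")
    empty = list_dir(root, os.path.join("home", "alice"))
```

Because each listing is an ordinary static file, an unmodified HTTP proxy can cache it like any other document.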
35 Logical Access to Bio Data
- Many databases of biological data in different formats around the world
- Archives: Swiss-Prot, TrEMBL, NCBI, etc...
- Replicas: Public, Shared, Private, ???
- Users and applications want to refer to data objects by logical name, not location!
- Access the nearest copy of the non-redundant protein database, don't care where it is.
- Solution: EGEE data management system maps logical names (LFNs) to physical names (SFNs).
Credit: Christophe Blanchet, Bioinformatics Center of Lyon, CNRS IBCP, France, http://gbio.ibcp.fr/cblanchet, Christophe.Blanchet@ibcp.fr
36 Logical Access to Bio Data
(Figure: BLAST runs under Parrot, which consults the EGEE File Location Service for nr.data and then reads a copy from a gLite server, a Chirp server, or an FTP server, using RFIO, gLite, HTTP, or FTP.)
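Resolving a logical name to the nearest physical copy can be sketched as a lookup followed by a cost comparison. Everything concrete here is an assumption for illustration: the replica URLs, the example host names, the `COST` table, and the `resolve` function are hypothetical and do not reflect the EGEE service's actual interface.

```python
from urllib.parse import urlparse

# Hypothetical replica catalog: logical file name -> physical copies.
REPLICAS = {
    "nr.data": [
        "ftp://ftp.far.example.org/db/nr.data",
        "http://mirror.example.net/db/nr.data",
        "chirp://node7.near.example.edu/db/nr.data",
    ],
}

# Hypothetical cost table: lower is nearer. Unknown hosts count as far away.
COST = {"node7.near.example.edu": 1, "mirror.example.net": 20}
FAR_AWAY = 1000

def resolve(lfn, replicas=REPLICAS, cost=COST):
    """Map a logical name to the physical name with the lowest cost."""
    candidates = replicas.get(lfn)
    if not candidates:
        raise FileNotFoundError(lfn)
    return min(candidates,
               key=lambda sfn: cost.get(urlparse(sfn).hostname, FAR_AWAY))

nearest = resolve("nr.data")
```

The application only ever names `nr.data`; which server and which protocol deliver it is decided at open time.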
37 Appl: Distributed MD Database
- State of Molecular Dynamics Research
- Easy to run lots of simulations!
- Difficult to understand the big picture.
- Hard to systematically share results and ask questions.
- Desired Questions and Activities
- What parameters have I explored?
- How can I share results with friends?
- Replicate these items five times for safety.
- Recompute everything that relied on this machine.
- GEMS: Grid Enabled Molecular Sims
- Distributed database for MD simulation at Notre Dame.
- XML database for indexing, TSS for storage/policy.
38 GEMS Distributed Database
Credit: Jesus Izaguirre and Aaron Striegel, Notre Dame CSE Dept.
(Figure: a database server and two catalog servers.)
39 Active Recovery in GEMS
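One way to picture active recovery is a periodic pass that compares the replicas recorded for each item against the servers currently visible, and plans copies until the target count is restored. The sketch below is an assumption-laden illustration, not the GEMS algorithm: the `repair` function, its data structures, and the choice of copy source are hypothetical.

```python
def repair(replicas, alive, target=5):
    """Plan copies so every item has `target` replicas on live servers.

    replicas: dict item -> set of server names believed to hold a copy
    alive:    set of server names currently listed in the catalog
    Returns a list of (item, source, destination) copy actions.
    """
    actions = []
    for item, holders in sorted(replicas.items()):
        live = sorted(holders & alive)
        if not live:
            continue   # nothing left to copy from; item must be recomputed
        spare = sorted(alive - holders)
        needed = max(target - len(live), 0)
        for dest in spare[:needed]:
            actions.append((item, live[0], dest))
    return actions

plan = repair(
    replicas={"run1": {"a", "b"}, "run2": {"c"}},
    alive={"a", "d", "e"},
    target=3,
)
```

Here `run1` has lost the copy on server b and is re-replicated from a, while `run2` has no surviving copy and is left for recomputation.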
40 GEMS and Tactical Storage
- Dynamic System Configuration
- Add/remove servers, discovered via catalog
- Policy Control in File Servers
- Groups can Collaborate within Constraints
- Security Implemented within File Servers
- Direct Access via Adapters
- Unmodified Simulations can use Database
- Alternate Web/Viz Interfaces for Users.
41 Outline
- Problems with the Standard Model
- Tactical Storage Systems
- File Servers, Catalogs, Abstractions, Adapters
- Applications
- Remote Database Access for BaBar Code
- Remote Dynamic Linking for CDF Code
- Logical Data Access for Bioinformatics Code
- Expandable Database for MD Simulation
- Improving the OS for Grid Computing
42 OS Support for Grid Computing
- Distributed computing in general suffers because of limitations in the operating system.
- How can we improve the OS in the long term?
- Resource allocation
- Cannot reserve space -> jobs crash
- Hard to clean up procs -> unreliable systems
- Security and permissions
- No ACLs -> hard to share data
- Only root can setuid -> hard to secure services.
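43 Allocation in the Filesystem
(Figure: a directory tree rooted at root with the entries jobs, logs, job23, ftp, coredump, and ftp.log.)
44 Allocation in the Filesystem
(Figure: the same tree reduced to root, jobs, logs, and ftp.log.)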
45 (Figure: a hierarchy of user identities rooted at root, with the nodes httpd, kerberos, alice, bob, anon1, anon2, student, and two nodes named visitor.)
- kerberos: given to the login server.
- alice: created by krb5 login.
- student: created at run-time.
- The web server can create distinct anonymous accounts. No need for a global nobody.
- These two users are completely different: root:kerberos:alice:visitor and root:kerberos:bob:visitor
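The hierarchical naming on this slide can be sketched with identities written as paths from the root. This is an illustration only: the colon separator is an assumption (the transcript has lost its punctuation), and `create_child` and `is_ancestor` are hypothetical helpers.

```python
def create_child(parent, name):
    """Mint a new identity directly beneath an existing one."""
    if not name or ":" in name:
        raise ValueError("child name must be a single non-empty component")
    return parent + ":" + name

def is_ancestor(a, b):
    """True if identity a is a proper ancestor of identity b."""
    return b.startswith(a + ":")

kerberos = create_child("root", "kerberos")
alice = create_child(kerberos, "alice")
bob = create_child(kerberos, "bob")
visitor_of_alice = create_child(alice, "visitor")
visitor_of_bob = create_child(bob, "visitor")
```

Because the full path is the identity, two accounts that share the short name visitor never collide, and any service can create accounts beneath its own name without a global user database.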
46 Approach by Degrees
- What can we do as an ordinary user?
- Simulate OS functionality within Parrot.
- Drawback: Performance / Assurance.
- What can we do as root?
- Setuid toolkit to manage system on request.
- Drawback: Limitations in Policy / Expr.
- What can we do by modifying the OS?
- Modify kernel/FS to support new features.
- Drawback: Deployment.
47 Tactical Storage Systems
- Separate Abstractions from Resources
- Components
- Servers, catalogs, abstractions, adapters.
- Completely user level.
- Performance acceptable for real applications.
- Independent but Cooperating Components
- Owners of file servers set policy.
- Users must work within policies.
- Within policies, users are free to build.
48 Parting Thought
- Many users of the grid are constrained by functionality, not performance.
- TSS allows end users to build the structures that they need for the moment without involving an admin.
- Analogy: building blocks for distributed storage.
49 Acknowledgments
- Science Collaborators
- Christophe Blanchet
- Sander Klous
- Peter Kunszt
- Erwin Laure
- John Poirier
- Igor Sfiligoi
- CS Collaborators
- Jesus Izaguirre
- Aaron Striegel
- CS Students
- Paul Brenner
- James Fitzgerald
- Jeff Hemmes
- Paul Madrid
- Chris Moretti
- Phil Snowberger
- Justin Wozniak
50 For more information...
- Cooperative Computing Lab
- http://www.cse.nd.edu/~ccl
- Cooperative Computing Tools
- http://www.cctools.org
- Douglas Thain
- dthain@cse.nd.edu
- http://www.cse.nd.edu/~dthain
51 Performance: System Calls
52 Performance: Applications
(Figure legend: parrot only)
53 Performance: I/O Calls
54 Performance: Bandwidth
55 Performance: DSFS
56 SP5 Performance on EDG Testbed