Title: Overview of Lustre
1Overview of Lustre
- ECE, U of MN
- Changjin Hong (Prof. Tewfik's group)
- hongcj92_at_ece.umn.edu
- Monday, Aug. 19, 2002
2Outline
- Reference
- Lustre Cluster
- Lustre System Components
- Distributed Lock Manager
- Object Based Storage
- Conclusion (security issues)
3Reference
- Lustre: A SAN File System for Linux
- http://www.lustre.org/docs/lustre/luswhite.pdf
- Several presentation materials from Dr. Peter J. Braam
4A Lustre Cluster
[Figure: a Lustre cluster. Scale labels in the diagram: 10,000s; 10s of nodes; 1,000s]
5Key Design Issue: Scalability
- I/O throughput
- How to avoid bottlenecks
- Metadata scalability
- How can 10,000s of nodes work on files in the same folder?
- Cluster Recovery
- If something fails, how can transparent recovery happen?
- Management
- Adding, removing, and replacing systems; data migration and backup
6System Components
7Interaction between systems
- Client <-> MDS: CMD protocol; (directory) metadata handling, inode updates, concurrency
- MDS <-> OST: pre-allocation, file creation, recovery purpose, file status
- Client <-> OST: OS protocol; file I/O, allocation of blocks, striping, security enforcement
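The slide lists striping as part of the client-to-OST file I/O path but does not say how a file is laid out. As a minimal sketch, assuming plain round-robin (RAID-0 style) striping, a byte offset in a file can be mapped to one of the file's objects and an offset inside that object. The function name `locate` and the parameters `stripe_size` and `stripe_count` are illustrative, not taken from the slides.

```python
def locate(offset: int, stripe_size: int, stripe_count: int):
    """Map a file byte offset to (object index, offset inside that object).

    Assumes round-robin striping: stripe 0 goes to object 0, stripe 1 to
    object 1, and so on, wrapping around after stripe_count objects.
    """
    stripe_no = offset // stripe_size          # which stripe of the file
    obj_index = stripe_no % stripe_count       # which object holds that stripe
    # Full stripes already stored in this object, plus the offset in the stripe.
    obj_offset = (stripe_no // stripe_count) * stripe_size + offset % stripe_size
    return obj_index, obj_offset


MIB = 1 << 20
first = locate(0, MIB, 4)
second_stripe = locate(MIB, MIB, 4)
wrapped = locate(5 * MIB + 7, MIB, 4)
```

With a 1 MiB stripe size and four objects, consecutive 1 MiB chunks land on consecutive objects, so a large sequential read can be served by several OSTs in parallel.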
8Client File System
- A directory tree, subdivided into filesets, for cluster-wide Unix file sharing semantics
- CMD protocol
- Transaction-based
- Authenticated access
- Write-behind caching for MD updates with strict
data/metadata coherency
9Metadata Service (MDS)
- All access to a file is governed by the MDS, which directly or indirectly authorizes access.
- Controls the namespace and manages inodes
- Load-balanced cluster service for scalability (a well-balanced API, a stackable framework for logical MDS, replicated MDS)
- Journaled, batched metadata updates
10Object Storage Targets (OST)
- Keep file data objects
- File I/O service: access to the objects
- Block allocation for data objects, leading to distribution and scalability
- OST s/w modules
- OBD server, Lock server
- Obj. storage driver, OBD filter
- Portal API
11VAXCluster DLM adapted
12Distributed Lock Manager
- For generic and rich lock service
- Lock resources: resource database
- Organize resources in trees
- High performance
- The node that acquires a resource manages its tree
13Big Picture
- Resource Tree and namespace
[Figure: resource manager with resource trees (R) under Obj.1, Obj.2, Obj.3, Obj.4; <namespace> entries Name1, Name2, Name3, Name4; distributed resource directory / hash function (LDWV) / lock directory; Apps.]
14Mechanism in resource dB
- Hash binary string N -> get h
- Look up the system in the lock directory weight vector (LDWV): h -> find system K
- Systems
- may occupy 0, 1 or more slots in LDWV
- Number of slots is the lock directory weight
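The lookup above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not name the hash function or say how h is reduced to a slot, so SHA-1 and a modulo over the vector length are stand-ins, and the names `build_ldwv`, `resource_hash`, and `lookup_system` are hypothetical.

```python
import hashlib


def build_ldwv(weights):
    """Expand {system: weight} into the slot vector.

    A system with lock directory weight w occupies w slots, so weight 0
    means the system holds no part of the resource directory.
    """
    slots = []
    for system, weight in sorted(weights.items()):
        slots.extend([system] * weight)
    return slots


def resource_hash(name: bytes) -> int:
    # SHA-1 is an arbitrary stand-in; the slides do not specify the hash.
    return int.from_bytes(hashlib.sha1(name).digest()[:8], "big")


def lookup_system(name: bytes, ldwv):
    """Hash the resource name N to h, then use h to pick a slot (system K)."""
    h = resource_hash(name)
    return ldwv[h % len(ldwv)]  # modulo reduction is an assumption


weights = {"node-a": 2, "node-b": 1, "node-c": 0}
ldwv = build_ldwv(weights)
owner = lookup_system(b"Name1", ldwv)
```

Because every node computes the same hash over the same vector, any node can find the directory system for a resource name without asking a central server, and a system with a larger weight receives a proportionally larger share of names.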
15Lustre DLM features
- Low concurrency
- Want write-back caching
- High concurrency
- Want load balancing in cluster
- Subdivide directories etc with hashes
- Want the server to execute requests, to limit lock revocations -> ops. on the MD cluster in a client-server RPC model
- Deadlock detection
16Object Based Storage
17Object Based Storage
- Object Based Storage Device
- More intelligent than block device
- Speak storage at inode level
- create, unlink, read, write, getattr, setattr
- Iterators, security, almost arbitrary processing
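To make the inode-level interface concrete, here is a toy in-memory device offering the six operations named above. It is a sketch for illustration only, not the Lustre OBD API: the class name `InMemoryObjectDevice`, the method signatures, and the `size` attribute are assumptions.

```python
class InMemoryObjectDevice:
    """Toy object device: callers address objects by id, never by block."""

    def __init__(self):
        self._data = {}    # object id -> bytearray with the object's bytes
        self._attrs = {}   # object id -> dict of user-set attributes
        self._next_id = 1

    def create(self, obj_id=None):
        # Callers may request a specific id; otherwise the device picks one.
        if obj_id is None:
            obj_id = self._next_id
        if obj_id in self._data:
            raise FileExistsError(obj_id)
        self._data[obj_id] = bytearray()
        self._attrs[obj_id] = {}
        self._next_id = max(self._next_id, obj_id + 1)
        return obj_id

    def unlink(self, obj_id):
        del self._data[obj_id]
        del self._attrs[obj_id]

    def write(self, obj_id, offset, payload: bytes):
        buf = self._data[obj_id]
        if len(buf) < offset:                  # zero-fill any hole
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(payload)] = payload
        return len(payload)

    def read(self, obj_id, offset, length):
        return bytes(self._data[obj_id][offset:offset + length])

    def getattr(self, obj_id):
        attrs = dict(self._attrs[obj_id])
        attrs["size"] = len(self._data[obj_id])
        return attrs

    def setattr(self, obj_id, **attrs):
        self._attrs[obj_id].update(attrs)


dev = InMemoryObjectDevice()
oid = dev.create()
dev.write(oid, 0, b"hello")
dev.setattr(oid, owner="hong")
```

The point of the sketch is that space allocation happens inside the device (`write` grows the object as needed), which is what distinguishes this interface from a block device.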
18Components of OB Storage
- Storage Object Device Drivers
- Class drivers: attach driver to interface
- Targets, clients: remote access
- Direct drivers: manage physical storage
- Logical drivers: intelligent storage management
- Object storage application (OSA)
- (cluster) file systems
- Advanced storage: parallel I/O, snapshots
- Specialized apps: caches, databases, file servers
19System Interface
- Modules
- Load the kernel modules to get drivers of a certain type
- Name devices to be of a certain type
- Build stacks of devices with assigned types
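The three steps above (load a driver type, name devices of that type, stack them) can be mimicked in a few lines. This is a user-space analogy under loud assumptions, not kernel code and not the Lustre configuration interface: `load_module`, `build_stack`, `MemoryStore`, and `StatsFilter` are all hypothetical names.

```python
DRIVER_TYPES = {}  # type name -> factory taking the lower device


def load_module(type_name, factory):
    """Stand-in for loading a kernel module: registers a driver type."""
    DRIVER_TYPES[type_name] = factory


class MemoryStore:
    """Direct driver: bottom of the stack, holds the data itself."""

    def __init__(self, lower=None):
        self.objects = {}

    def write(self, key, value):
        self.objects[key] = value

    def read(self, key):
        return self.objects[key]


class StatsFilter:
    """Logical driver: adds behaviour, then delegates to the device below."""

    def __init__(self, lower):
        self.lower = lower
        self.writes = 0

    def write(self, key, value):
        self.writes += 1
        self.lower.write(key, value)

    def read(self, key):
        return self.lower.read(key)


def build_stack(type_names):
    """Instantiate devices bottom-up; each one wraps the device below it."""
    device = None
    for name in type_names:
        device = DRIVER_TYPES[name](device)
    return device


load_module("memory", MemoryStore)
load_module("stats", StatsFilter)
stack = build_stack(["memory", "stats"])
stack.write("obj1", b"data")
```

Because every driver exposes the same read/write interface, a logical driver can be inserted or removed without the layers above or below it changing.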
20Layering of Object Drivers
21Interaction of Obj. Storage s/w modules
22Benefits-clustering/SM
- Suitable for use in a SAN file system
- Shared at the level of an individual block
- The object namespace is divided into object groups. Being able to create objects with given object IDs is very advantageous. Good for snapshots!
- Hot file migration
23Conclusion
- Object Based Storage
- Disk operations are processed at the higher level of individual files and file inodes, rather than at the low-level h/w disk block level.
- Security Issues
- Auxiliary service in cluster
- LDAP, PKI, Kerberos
- Purpose
- CFS/ MDS/ OST
- Authenticate to each other
- Set up session keys
24Etc.
- GSS-API for authentication and integrity checks
- Remote DMA
- Layering must NEVER bypass security processing
- Request processing: authentication is checked by a higher-level layer in the networking stack