SRM in GGF - PowerPoint PPT Presentation


Title: SRM in GGF


1
Storage Resource Management: Providing Uniform
Access to Storage Systems (including MSSs)
Arie Shoshani, LBNL (on behalf of the SRM collaboration)
http://sdm.lbl.gov/srm-wg
2
Storage Resource Managers (SRMs)
Definition: SRMs are middleware components whose
function is to provide dynamic space allocation
and file management of shared storage components on
the Grid
3
History
  • 4 years of Storage Resource Management (SRM)
    activity
  • Experience with system implementations v.1.x -
    2001
  • MSS: HPSS (LBNL, ORNL, BNL), Enstore (Fermi),
    JASMine (Jlab), Castor (CERN), MSS (NCAR), SE
    (RAL)
  • Disk systems: DRM (LBNL), dCache (Fermi), jSRM
    (Jlab)
  • SRM v2.x spec was finalized - 2003
  • Several implementations of v2.x completed or
    in progress
  • Jlab, Fermi, CERN, LBNL
  • Started GSM GGF-BOF at GGF8 (June 2003)
  • Last SRM collaboration meeting: Sept. 2004
  • SRM v3.x spec (for GGF) being finalized - 2005

4
Uniformity of Interface → Compatibility of SRMs
[Diagram: client users/applications reach storage through Grid middleware and a row of SRMs that front the storage systems: Enstore, dCache, JASMine, Castor, Unix-based disks, SE.]
5
Current Storage Resource Management Active
Working Group
  • CERN: Olof Barring, Jean-Philippe Baud, James Casey, Peter Kunszt
  • Rutherford lab: Jens Jensen, Owen Synge
  • Jefferson Lab: Bryan Hess, Andy Kowalski, Chip Watson
  • Fermilab: Don Petravick, Timur Perelmutov
  • LBNL: Junmin Gu, Arie Shoshani, Alex Sim, Kurt Stockinger
  • Univa: Rich Wellner
6
Basic Issues
  • Suppose you want to run a job on your local
    machine
  • Need to allocate space
  • Need to bring all input files from a storage
    system
  • Need to ensure correctness of files transferred
  • Need to monitor and recover from errors
  • What if files don't fit the space? Need to manage
    file streaming
  • Need to remove files to make space for more files
  • Now, suppose that the machine and storage space
    are a shared resource
  • Need to do the above for many users
  • Need to enforce quotas
  • Need to ensure fairness of space allocation and
    scheduling

7
Basic Issues
  • Now, suppose you want to do that on the WAN
  • Need to access a variety of storage systems
  • Mostly remote systems, need to have access
    permission
  • Need to have special software to access mass
    storage systems
  • Now, suppose you want to run distributed jobs on
    the WAN
  • Need to allocate remote spaces
  • Need to move (stream) files to remote sites
  • Need to manage file outputs and their movement to
    destination site(s)

8
Types of storage resource managers
  • Disk Resource Manager (DRM)
  • Manages one or more disk resources
  • Tape Resource Manager (TRM)
  • Manages access to a tertiary storage system (e.g.
    HPSS)
  • Hierarchical Resource Manager (HRM = TRM + DRM)
  • An SRM that stages files from tertiary storage
    into its disk cache
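
The staging behavior of an HRM can be illustrated with a minimal sketch. The class name, the dictionary-backed "tape" and the `stage_count` counter are assumptions made for illustration; they are not part of any SRM specification or implementation.

```python
class HRM:
    """Toy hierarchical resource manager: a disk cache in front of tertiary storage."""

    def __init__(self, tape):
        self.tape = dict(tape)   # stands in for the TRM-managed tertiary store
        self.disk_cache = {}     # stands in for the DRM-managed disk cache
        self.stage_count = 0     # how many times a file was staged from tape

    def get(self, name):
        """Return the file contents, staging from tape only on a cache miss."""
        if name not in self.disk_cache:
            self.disk_cache[name] = self.tape[name]  # stage tape -> disk
            self.stage_count += 1
        return self.disk_cache[name]
```

A second `get` for the same file is served from the disk cache, which is the point of combining the two managers.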

9
Peer-to-Peer Uniform Interface
10
General Analysis Scenario
11
Standards for Grid Storage Management
  • Main concepts
  • Allocate spaces
  • Get/put files from/into spaces
  • Pin files for a lifetime
  • Release files and spaces
  • Get files into spaces from remote sites
  • Manage directory structures in spaces
  • SRMs communicate as peer-to-peer
  • Negotiate transfer protocols
  • No logical name space management (rely on
    GGF-GFS)

12
Where do SRMs belong in the Grid architecture?
[Figure: layered Grid architecture diagram. Labels recoverable from the slide, top to bottom:
  • Collective 2: Request Interpretation and Planning Services, Workflow or Request Management Services, Community Authorization Services, Application-Specific Data Discovery Services, Consistency Services (e.g., Update Subscription, Versioning, Master Copies)
  • Collective 1: Data Filtering or Transformation Services, General Data Discovery Services, Storage Management (Brokering), Compute Scheduling (Brokering), Data Transport Services, Monitoring/Auditing Services, Data Federation Services
  • Resource: Storage Resource Manager, Compute Resource Management, Data Filtering or Transformation Services, Database Management Services, File Transfer Service (GridFTP), Monitoring/Auditing
  • Connectivity: Communication Protocols (e.g., TCP/IP stack), Authentication and Authorization Protocols (e.g., GSI)
  • Fabric: Mass Storage System (HPSS), Other Storage Systems, Compute Systems, Networks]
This figure is based on the Grid Architecture paper
by the Globus Team
13
SRMs support data movement between storage
systems
[Figure: layered Grid architecture diagram (Collective, Resource, Connectivity and Fabric layers) with an added Data Movement element.]
This figure is based on the Grid Architecture paper
by the Globus Team
14
SRM Functional Concepts
  • Manage Spaces dynamically
  • Reservation, lifetime
  • Negotiation
  • Manage files in spaces
  • Request to put files in spaces
  • Request to get files from spaces
  • Lifetime, pinning of files, release of files
  • No logical name space management (done by replica
    location services)
  • Access remote sites for files
  • Bring files from other sites and SRMs as
    requested
  • Use existing transport services (GridFTP, https,
    ...)
  • Transfer protocol negotiation
  • Manage multi-file requests
  • Manage request queues
  • Manage caches
  • Manage garbage collection
  • Directory Management
  • Unix semantics: srmLs, srmMkdir, srmMv, srmRm,
    srmRmdir

15
Concepts: Types of Files
  • Volatile: temporary files with a lifetime
    guarantee
  • Files are pinned and released
  • Files can be removed by SRM when released or when
    lifetime expires
  • Permanent
  • No lifetime
  • Files can only be removed by creator (owner)
  • Durable: files with a lifetime that CANNOT be
    removed by SRM
  • Files are pinned and released
  • Files can only be removed by creator (owner)
  • If lifetime expires: invoke administrative
    action (e.g. notify owner, archive and release)
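
The removal rules for the three file types can be written down as a small sketch. The class and function names below are hypothetical, and times are plain numbers in seconds; the sketch only encodes the rules listed on this slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FileType(Enum):
    VOLATILE = "volatile"
    DURABLE = "durable"
    PERMANENT = "permanent"


@dataclass
class ManagedFile:
    name: str
    file_type: FileType
    expires_at: Optional[float] = None  # absolute time in seconds; None means no lifetime
    pinned: bool = False


def srm_may_remove(f: ManagedFile, now: float) -> bool:
    """True if the SRM itself (not the owner) may remove the file."""
    if f.file_type is not FileType.VOLATILE:
        return False  # durable and permanent files are removed only by their owner
    expired = f.expires_at is not None and now >= f.expires_at
    return expired or not f.pinned  # lifetime ran out, or the file was released


def needs_admin_action(f: ManagedFile, now: float) -> bool:
    """Durable files past their lifetime trigger an administrative action, not removal."""
    return (f.file_type is FileType.DURABLE
            and f.expires_at is not None
            and now >= f.expires_at)
```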

16
Concepts: Types of Spaces
  • Types
  • Volatile
  • Space can be reclaimed by SRM when lifetime
    expires
  • Durable
  • Space can be reclaimed by SRM only if it does NOT
    contain files
  • Can choose to archive files and release space
  • Permanent
  • Space can only be released by owner or
    administrator
  • Assignment of files to spaces
  • Files can only be assigned to spaces of the same
    type
  • Spaces can be reserved
  • No limit on number of spaces
  • Space reference handle is returned to client
  • Total space of each type is subject to SRM
    and/or VO policies
  • Default spaces
  • Files can be put into SRM spaces without explicit
    reservation
  • Defaults are not visible to client
  • Compacting space
  • Release all unused space: space that has no
    files, or files whose lifetime expired
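
Space compaction can be sketched as follows. This is an illustration under stated assumptions: the `Space` record, the tuple layout of its files and the choice to drop expired files during compaction are all hypothetical, and a real SRM would apply type-specific policies (for example, durable files are not removed by the SRM).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Space:
    reserved_bytes: int
    # One (size_in_bytes, expires_at) pair per file; expires_at is an absolute time in seconds.
    files: List[Tuple[int, float]] = field(default_factory=list)


def compact(space: Space, now: float) -> int:
    """Drop expired files, shrink the reservation to what live files use, return bytes released."""
    live = [(size, expires_at) for size, expires_at in space.files if expires_at > now]
    used = sum(size for size, _ in live)
    released = space.reserved_bytes - used
    space.files = live
    space.reserved_bytes = used
    return released
```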

17
Concepts Directory Management
  • Usual unix semantics
  • srmLs, srmMkdir, srmMv, srmRm, srmRmdir
  • A single directory for all file types
  • No directories for each type
  • File assignment to types is virtual
  • File can be placed in SRM-managed directories by
    maintaining a mapping to the client's directory
  • Access control services
  • Support owner/group/world permission
  • Can only be assigned by owner
  • When file requested by user, SRM should check
    permission with source site

18
Examples of Directory Structures (user defined)
[Figure: two user-defined directory trees over directories D1 to D4, with files F1, F2, ... labeled by type (V = volatile, D = durable, P = permanent). Tree (1): mixed file types. Tree (2): organized by file type.]
  • Supported function: ChangeFileType
  • Advantage of (1): no need to move files when
    file types are changed

19
Concepts: Space Reservations
  • Negotiation
  • Client asks for space: C-guaranteed, MaxDesired
  • SRM returns: S-guaranteed <= C-guaranteed,
    best effort <= MaxDesired
  • Type of space
  • Can be specified
  • Subject to limits per client (SRM or VO policies)
  • Default: volatile
  • Lifetime
  • Negotiated: C-lifetime requested
  • SRM returns: S-lifetime <= C-lifetime
  • Reference handle
  • SRM returns space reference handle
  • User can provide srmSpaceTokenDescription to
    recover handles
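
A minimal sketch of the negotiation, assuming the SRM never grants more than the client asked for and is limited by its free space and a site-wide maximum lifetime. The function name, its parameters (`free_bytes`, `max_lifetime`) and the use of a random hex string as the space reference handle are illustrative assumptions, not the SRM interface.

```python
import uuid
from dataclasses import dataclass


@dataclass
class SpaceGrant:
    guaranteed: int    # S-guaranteed, in bytes
    best_effort: int   # total the SRM will try to provide, in bytes
    lifetime: int      # S-lifetime, in seconds
    token: str         # space reference handle


def reserve_space(c_guaranteed: int, max_desired: int, c_lifetime: int,
                  free_bytes: int, max_lifetime: int) -> SpaceGrant:
    """Grant at most what was requested, capped by what the SRM can offer."""
    if c_guaranteed > max_desired:
        raise ValueError("guaranteed size cannot exceed the maximum desired size")
    s_guaranteed = min(c_guaranteed, free_bytes)
    best_effort = min(max_desired, free_bytes)
    s_lifetime = min(c_lifetime, max_lifetime)
    return SpaceGrant(s_guaranteed, best_effort, s_lifetime, uuid.uuid4().hex)
```

The client then works with the granted values and keeps the token to refer to the space in later calls.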

20
Concepts: Transfer Protocol Negotiation
  • Negotiation
  • Client provides an ordered list
  • SRM returns the highest-ranked protocol on the
    list that it supports
  • Example
  • Protocols list: bbftp, gridftp, ftp
  • SRM returns: gridftp
  • Advantages
  • Easy to introduce new protocols
  • User controls which protocol to use
  • Default: SRM policy choice
  • How is it returned?
  • The protocol of the Transfer URL (TURL)
  • Example: bbftp://dm.slac.edu/temp/run11/File678.txt
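
The negotiation rule is simple enough to sketch directly: walk the client's ordered list and pick the first protocol the SRM supports. The function names and the `default` parameter (standing in for the SRM's policy choice when the client gives no usable list) are assumptions for illustration.

```python
from typing import Iterable, Optional, Set


def negotiate_protocol(client_preferences: Iterable[str],
                       supported: Set[str],
                       default: Optional[str] = None) -> Optional[str]:
    """Return the first client-preferred protocol the SRM supports, else the default."""
    for protocol in client_preferences:
        if protocol in supported:
            return protocol
    return default


def make_turl(protocol: str, host: str, path: str) -> str:
    """The chosen protocol is reported back as the scheme of the transfer URL."""
    return f"{protocol}://{host}{path}"
```

With the client list bbftp, gridftp, ftp and an SRM that supports only gridftp and ftp, the sketch selects gridftp, as in the slide's example.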

21
Concepts: Multi-file requests
  • Can srmRequestToGet multiple files
  • Required: file URLs
  • Optional: space file type, space handle, protocol
    list
  • Optional: total retry time
  • Provide Site URL (SURL)
  • URL known externally, e.g. in replica catalogs
  • e.g. srm://sleepy.lbl.gov:4000/tmp/foo-123
  • Get back transfer URL (TURL)
  • Path can be different than in SURL: SRM internal
    mapping
  • Protocol chosen by SRM
  • e.g. gridftp://dm.lbl.gov:4000/home/level1/foo-123
  • Managing request queue
  • Allocate space according to policy, system load,
    etc.
  • Bring in as many files as possible
  • Provide information on each file brought in or
    pinned
  • Bring additional files as soon as files are
    released
  • Support file streaming
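
File streaming through a limited cache can be modeled with a short sketch: stage as many queued files as fit, and stage more each time the client releases one. The class name, its methods and the strict request-order policy are assumptions; a real SRM would also weigh policy and system load, and would need to handle a file larger than the whole cache, which this toy model would leave queued forever.

```python
from collections import deque
from typing import Deque, Dict, List


class StreamingRequest:
    """Toy model of one multi-file get request served through a small cache."""

    def __init__(self, file_sizes: Dict[str, int], cache_bytes: int) -> None:
        self.sizes = dict(file_sizes)
        self.pending: Deque[str] = deque(file_sizes)  # not yet staged, in request order
        self.pinned: List[str] = []                   # staged and held for the client
        self.free = cache_bytes

    def stage(self) -> List[str]:
        """Bring in as many queued files as fit; return the newly pinned names."""
        staged = []
        while self.pending and self.sizes[self.pending[0]] <= self.free:
            name = self.pending.popleft()
            self.free -= self.sizes[name]
            self.pinned.append(name)
            staged.append(name)
        return staged

    def release(self, name: str) -> List[str]:
        """The client is done with a file: free its space and stage more files."""
        self.pinned.remove(name)
        self.free += self.sizes[name]
        return self.stage()
```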

22
SRM Methods
  • Space management: srmReserveSpace, srmReleaseSpace, srmUpdateSpace, srmCompactSpace
  • FileType management: srmChangeFileType
  • Status/metadata: srmGetRequestStatus, srmGetFileStatus, srmGetRequestSummary, srmGetRequestID, srmGetFilesMetaData, srmGetSpaceMetaData
  • File movement: srmPrepareToGet, srmPrepareToPut, srmCopy
  • Lifetime management: srmReleaseFiles, srmPutDone, srmExtendFileLifeTime
  • Terminate/resume: srmAbortRequest, srmAbortFile, srmSuspendRequest, srmResumeRequest
23
SRM v3.x Basic vs. Advanced Features
Feature                          BASIC            ADVANCED
File movement
  PrepareToGet                   yes              yes
  PrepareToPut                   yes              yes
  Copy                           no               yes
Request capabilities
  Multi-file streaming           yes              yes
  Trans. prot. negotiation       yes              yes
  File lifetime negotiation      no               yes
File types
  Volatile                       yes              yes
  Permanent                      yes (for MSS)    yes
  Durable                        no               yes
24
Features in Basic vs. Advanced SRM
BASIC ADVANCED
  • Space reservations
  • Space-time negotiation
  • Space types
  • Remote access
  • gridFTP
  • Other SRMs
  • User-specified Directory
  • Volatile
  • Permanent
  • Durable
  • Terminate/suspend
  • Abort file
  • Abort request
  • Suspend/resume request

no
yes
no
yes
no
yes
no
yes
no
yes
yes
yes
no
yes
yes
yes
yes
yes
no
yes
25
Use of SRMs for Robust directory-to-directory
file replication
Use Case
26
Massive Robust File Replication
  • Multi-File Replication: why is it a problem?
  • Tedious task: many files, repetitious
  • Lengthy task: long time, can take hours, even
    days
  • Error prone: need to monitor transfers
  • Error recovery: need to restart file transfers
  • Stage and archive from MSS: limited concurrency,
    down time, transient failures
  • Use of FTP: no large windows / multiple streams
  • Security: both for local MSS and the network
  • Firewalls: transfer from/to MSS must be internal
    to the site
  • Specialized MSS: HPSS at NERSC, ORNL, ...
  • Legacy MSS: MSS at NCAR

27
Main Idea
  • Leverage Storage Resource Manager (SRM)
    technology
  • Supported by SRM middleware project
  • Leverage experience with other SciDAC
    projects (PPDG)
  • What do you get?
  • SRMs queue multi-file requests
  • SRMs allocate space and release space
    automatically
  • SRMs request files from remote SRMs
  • Recover from network failures
  • SRMs invoke GridFTP: use large windows,
    parallel streams
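
Recovery from transient failures amounts to retrying each file transfer independently, so one bad file does not restart the whole request. The sketch below assumes a caller-supplied `transfer` function and a hypothetical `TransientError` exception type; neither is part of any SRM or GridFTP API.

```python
import time
from typing import Callable, Dict, Iterable


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (network drop, MSS down time)."""


def replicate(files: Iterable[str], transfer: Callable[[str], None],
              max_attempts: int = 3, backoff_seconds: float = 0.0) -> Dict[str, str]:
    """Transfer each file, retrying transient failures; return a per-file status."""
    status: Dict[str, str] = {}
    for name in files:
        for attempt in range(1, max_attempts + 1):
            try:
                transfer(name)
                status[name] = "done"
                break
            except TransientError:
                status[name] = "failed"
                if attempt < max_attempts:
                    time.sleep(backoff_seconds * attempt)  # wait longer after each failure
    return status
```

The per-file status map is the kind of information a monitoring tool could display while a long replication runs.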

28
DataMover: HRMs used in ESG for Robust Multi-file
replication
Anywhere
DataMover
BNL
HRM (performs writes)
HRM (performs reads)
LBNL/ ORNL
29
(No Transcript)
30
Web-Based File Monitoring Tool
  • Shows
  • Files already transferred
  • Files during transfer
  • Files to be transferred
  • Also shows for each file
  • Source URL
  • Target URL
  • Transfer rate

31
File tracking helps to identify bottlenecks
Shows that archiving is the bottleneck
32
File tracking shows recovery from transient
failures
Total: 45 GB
33
File tracking shows network slowdown and recovery
Total: 53 GB
34
Multi-file Transfer plot from BNL to LBNL
(27/02/04)
1 Request ACCEPTED
2 File SpaceReserved
3 GridFTP Start
4 GridFTP End
5 HPSS MIGRATION_REQUEST
6 HPSS ARCHIVE_START
7 HPSS ARCHIVED
8 File Released
35
Multi-file Transfer plot from BNL to LBNL
(10/02/04)
1 Request ACCEPTED
2 File SpaceReserved
3 GridFTP Start
4 GridFTP End
5 HPSS MIGRATION_REQUEST
6 HPSS ARCHIVE_START
7 HPSS ARCHIVED
8 File Released
9 File SpaceClaimed
10 HPSS Archivig_Error
36
Summary
  • Storage Resource Management: essential for Grid
  • SRM is a functional definition
  • Adaptable to different frameworks (WS, OGSA,
    WSRF, ...)
  • Multiple implementations interoperate
  • Permits special-purpose implementations for unique
    products
  • Permits interchanging one SRM product with another
  • SRM implementations exist and some are in
    production use
  • Particle Physics Data Grid
  • Earth System Grid
  • More coming
  • Cumulative experience in GGF-WG
  • Specifications: SRM v3.0 complete

37
Extra Slides
38
Space Reservation Functional Spec
  • srmReserveSpace
  • In: TUserID userID,
  • TSpaceType typeOfSpace,
  • String userSpaceTokenDescription,
  • TSizeInBytes sizeOfTotalSpaceDesired,
  • TSizeInBytes sizeOfGuaranteedSpaceDesired,
  • TLifeTimeInSeconds lifetimeOfSpaceToReserve,
  • TStorageSystemInfo storageSystemInfo
  • Out: TSpaceType typeOfReservedSpace,
  • TSizeInBytes sizeOfTotalReservedSpace,
  • TSizeInBytes sizeOfGuaranteedReservedSpace,
  • TLifeTimeInSeconds lifetimeOfReservedSpace,
  • TSpaceToken referenceHandleOfReservedSpace,
  • TReturnStatus returnStatus

39
Request-to-Get Files Functional Spec
  • srmPrepareToGet
  • In: TUserID userID,
  • TGetFileRequest arrayOfFileRequest,
  • string arrayOfTransferProtocols,
  • string userRequestDescription,
  • TStorageSystemInfo storageSystemInfo,
  • TLifeTimeInSeconds TotalRetryTime
  • Out: TRequestToken requestToken,
  • TReturnStatus returnStatus,
  • TGetRequestFileStatus arrayOfFileStatus

40
TGetFileRequest typedef Functional Spec
  • typedef struct { TSURLInfo fromSURLInfo,
  • TLifeTimeInSeconds lifetime, // pin time
  • TFileStorageType fileStorageType,
  • TSpaceToken spaceToken,
  • TDirOption dirOption
  • } TGetFileRequest
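
To make the shape of a get request concrete, the fields above can be mirrored in a small Python sketch. The snake_case names, the simplified types, the `FileStatus` record and the "queued" state are assumptions made for illustration; the function performs no I/O and is not the SRM interface itself.

```python
import uuid
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class TGetFileRequest:
    from_surl: str                       # fromSURLInfo, simplified to a plain URL string
    lifetime: int                        # pin time, in seconds
    file_storage_type: str = "volatile"  # fileStorageType
    space_token: Optional[str] = None    # spaceToken
    dir_option: Optional[str] = None     # dirOption, left opaque in this sketch


@dataclass
class FileStatus:
    surl: str
    state: str  # hypothetical state label


def srm_prepare_to_get(user_id: str,
                       requests: Sequence[TGetFileRequest],
                       protocols: Sequence[str]) -> Tuple[str, List[FileStatus]]:
    """Queue every requested file and hand back a request token plus per-file status."""
    request_token = uuid.uuid4().hex
    statuses = [FileStatus(r.from_surl, "queued") for r in requests]
    return request_token, statuses
```

A client would keep the request token and poll for status until each file has a transfer URL.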

41
Detailed sequence of actions for each file being
replicated
Anywhere
srmCopy (sourceURL: hpss.lbnl.gov/xyz/file_x,
targetURL: mss.ncar.gov/uvw/file_y)
Get list of files from directory
Request files