CLRC e-Science Centre - PowerPoint PPT Presentation

1
CLRC e-Science Centre
SRB
Kerstin Kleese-van Dam, k.kleese_at_dl.ac.uk
2
Special thanks to
George Kremenek - kremenek_at_sdsc.edu
Alasdair Earl - aearl_at_ph.ed.ac.uk
3
Contents
  • Introduction
  • Architecture description
  • What is good
  • What needs improving
  • What can it be used for

4
Introduction
  • More and more information is available today. It
    can be
    • Random Information (e.g. news items)
    • Scientific Data
    • Commercial or Administrative Data
    • Data about Data (metadata describing the content
      of the actual data)
  • The information is generally available via/from
    • Web-sites, Filesystems, Databases, Tape Libraries
      or on Paper and other non-digital media.

5
Introduction (2)
  • How do you find the information?
    • Search Engines, Catalogue Systems or Hard Work
      (big bucket)
  • How do you evaluate the information?
    • Combine, Compare, Present
  • How do you manage the information?
    • Preservation, Sharing, Replicating,
      Transferring, Securing
6
Where does SRB fit into this Scenario?
  • SRB - the Storage Resource Broker - can
    • Integrate distributed, heterogeneous storage
      devices
    • Make data access transparent for the user
    • Help to share, replicate, transfer and preserve
      data
  • SRB cannot
    • Replace metadata catalogues
    • Provide high level information services

7
How does SRB fit into a Grid Environment?
  • SRB can be used to
    • Manage information required internally by Portals
    • Integrate data across various media
    • Integrate data across sites
  • SRB can be used
    • For a particular site
    • In a research collaboration
    • In a wider Grid community
8
General Facts
  • Storage Resource Broker - SRB
  • Developed by the San Diego Supercomputer Center
    (SDSC) from the mid 1990s for the US
    government's National Partnership for Advanced
    Computational Infrastructure (NPACI).
  • Initial release 1997
  • Latest version V1.1.8 - released February 2001
  • In the US approximately 200 TB of data are shared
    via SRB between 30 participating universities.
  • Used by the HPCPortal developed by Mary Thomas's
    group at SDSC.

9
The SRB/MCAT Core Team
  • SDSC Team
    • Reagan Moore, Arcot Rajasekar, Michael Wan,
      George Kremenek, Charlie Coward, Sheau Yen Chen,
      Roman Olschanowski
  • SRB Expertise at SDSC
    • Michael Wan (SRB client/server, drivers,
      srbBrowser)
    • Arcot Rajasekar (MCAT, DB drivers)
    • George Kremenek (SRB Client Modules, Security,
      DAM, application design)
    • Charlie Coward (Windows Servers and Browser)
    • Sheau Yen Chen (administration)
    • Roman Olschanowski (testing)

10
What is SRB?
  • SRB is an Intelligent Data Access System
  • SRB provides protocol transparency to diverse and
    distributed storage systems
  • SRB provides location transparency to distributed
    datasets
  • SRB provides access transparency to remote users
  • Extends File Systems
  • Extends Database Systems
  • Extends I/O protocol

11
SRB Access
  • SRB can be accessed in three ways
    • High-level graphical Java interface - SRB
      Browser
    • Application Programming Interface - SRB API
      (high and low level)
    • Unix shell Command Line Interface - SRB Scommands

12
SRB Concepts(1)
  • Provide Scalability (Hosts, Resource Types,
    Resources, Collections, Data Objects - size and
    number, User Groups)
  • Provide Uniform Interfaces (to Resources,
    Collections and Datasets, authentication across
    SRB Space)
  • Replication of Datasets
  • Access Control Lists
  • Ticket-based Access
  • Authentication and Encryption (text password,
    encrypted password, SEA and GSI)
  • Server-side proxy Operations
  • Metadata-based Discovery

13
SRB Concepts(2)
  • Provide Logical Abstractions
    • srbSpace - an abstract storage space
    • Resource Types - resource defined by properties
    • Resources - resource identified by name and type
      • multiple resources tied together as a single
        resource
    • Collections - abstraction over directory
      structure
      • distributed, curated
    • Datasets - identified by properties
    • Users - authenticated across hosts/networks
    • Domain - abstraction over physical domains
    • Metadata Schema/Attributes

14
What is MCAT?
  • Cataloging System
  • Metadata Repository
    • Digital Object Metadata
      • type, format, lineage, usage methods,
        domain-specific attributes, collection info, etc.
    • System-level Metadata
      • access control, audit trails, location,
        replication, resource types, user groups, etc.
    • Schema-level Metadata
      • ontology, relationships among attributes/schemas,
        semantics of attributes, etc.
  • Uniform Access and Federation interface

15
Contents
  • Introduction
  • Architecture description
  • What is good
  • What needs improving
  • What can it be used for

16
SRB V1.x Features
  • Multi-platform (clients and servers)
    • SunOS/Solaris, AIX, Cray C90, SGI, OSX
  • API and command line interfaces
    • Low-level and high-level APIs
  • Storage systems supported
    • Oracle, DB2, Sybase, HPSS, UNIX FS, W2000/NT FS
  • Support for distributed servers, GSI
    authentication, password encryption

17
The Storage Resource Broker
18
How does SRB work?
  • The SRB Server spawns an SRB Agent, which
    authenticates the User/Application (SRB Client) by
    comparing the supplied credentials with
    information stored in MCAT
  • The SRB Agent finds the file location in MCAT
  • The SRB Agent checks the user request against the
    permissions stored in MCAT
  • The SRB Agent contacts the user with the result of
    his/her request
  • The SRB Agent communicates with the user through
    a port specific to this client session; it can
    handle one or more requests from the client.
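
The request handling described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyMcat` and `handle_request` are hypothetical names, the catalogue is a set of in-memory dictionaries rather than a relational database, and plain-text password comparison stands in for whichever authentication scheme is configured. It is not the SRB implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMcat:
    """Hypothetical in-memory stand-in for MCAT (the real one is a database)."""
    passwords: dict = field(default_factory=dict)    # user -> password
    locations: dict = field(default_factory=dict)    # logical name -> (resource, path)
    permissions: dict = field(default_factory=dict)  # (user, logical name) -> allowed actions


def handle_request(mcat, user, password, action, logical_name):
    """Mimic the agent's steps: authenticate, locate, check permissions, reply."""
    # Authenticate the client against information held in the catalogue.
    if mcat.passwords.get(user) != password:
        return {"ok": False, "reason": "authentication failed"}
    # Find the file location in the catalogue.
    location = mcat.locations.get(logical_name)
    if location is None:
        return {"ok": False, "reason": "unknown data object"}
    # Check the request against the stored permissions.
    if action not in mcat.permissions.get((user, logical_name), set()):
        return {"ok": False, "reason": "permission denied"}
    # Report the result back to the client.
    return {"ok": True, "location": location}
```

Every decision in the sketch is taken from catalogue data alone, which is the role the slide gives to MCAT.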

19
The SRB Process Model
[Diagram labels: Application, (Host, port), SRB Master (port), SRB agents, MCAT]
20
How does SRB handle remote Data Access?
  • Steps 1-3 are the same as in the simple case:
    spawn an SRB Agent on the local machine to
    authenticate the user, check the user request and
    locate the file
  • The SRB Agent contacts a remote SRB Agent via the
    SRB Server on the remote machine where the data is
    stored
  • The second SRB Agent returns the pointer to the
    data item to the first SRB Agent, which passes it
    on to the user
  • The SRB Client can then interact with the data
    item directly, as described before; however, all
    communication still runs via the first SRB Agent
    and the machine it is situated on
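
A minimal sketch of the relay pattern in this slide, assuming a toy model in which each agent knows its own host, a shared catalogue maps logical names to hosts, and agents can reach each other through a `peers` table. `ToyAgent` and all of its fields are hypothetical. The real SRB goes through the remote SRB Server and passes back a pointer to the data item; both are simplified away here, and the payload itself is returned.

```python
class ToyAgent:
    """Hypothetical agent that serves local data or relays to the owning host."""

    def __init__(self, host, local_store, catalogue, peers):
        self.host = host
        self.local_store = local_store  # logical name -> bytes held on this host
        self.catalogue = catalogue      # logical name -> owning host (stand-in for MCAT)
        self.peers = peers              # host -> ToyAgent on that host
        self.hops = []                  # record of what this agent did

    def read(self, logical_name):
        owner = self.catalogue[logical_name]
        if owner == self.host:
            self.hops.append(f"{self.host}: local read")
            return self.local_store[logical_name]
        # The data lives elsewhere: ask the agent on the owning host and
        # pass its answer back, so the client only ever talks to this agent.
        self.hops.append(f"{self.host}: forward to {owner}")
        return self.peers[owner].read(logical_name)
```

Because the first agent stays in the path for every call, the sketch also shows why remote access costs extra interim steps.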

21
Remote SRB Operation
[Diagram labels: Application, two SRB servers, two SRB agents, MCAT; arrows numbered 1 to 6]
22
SRB Space
  • The SRB Space consists of
    • A number of SRB Servers (possibly across
      multiple sites)
    • Many heterogeneous Storage Resources linked to
      SRB Servers via SRB Media Drivers
    • One MCAT System
    • Many Users
  • The SRB Space provides a single view on all the
    data within the Space.

23
SRB Space
24
MCAT Metadata Catalog
  • Stores metadata about
    • Users, Data sets, Resources, Methods
  • Provides collection abstraction
  • Stores detailed access control information
  • Maintains audit trail information on data sets
  • Implemented as a relational database with
    referential integrity constraints (currently uses
    Oracle, DB2, Sybase)

25
MCAT Architecture
26
Federated Catalog Architecture
[Diagram labels: MAPS, MCAT, CATALOG, Semantics Definitions, Local Routines, Internal Catalogs, External CATALOG Interface, MAPS Interface, Local Interface, CAT-1, CAT-2]
27
New MCAT Features
  • Meta-Schema to hold System and User metadata
    schema information
  • Extensible metadata schema
  • Distributed metadata schema
  • Metadata exchange Interface Protocol
  • MAPS - Metadata Attribute Presentation Structure
    • query, update and result structures
    • Close to Z39.50

28
New MCAT Features (contd.)
  • Core Schema Implemented
    • MCAT Core - Data, Resources, Users and Methods
    • Dublin Core
    • IV Core - Image Visualization attributes
  • Web-based Prototype User Interface
    • extensible schema functions
    • query, insert and update of metadata
    • integrated presentation of metadata and data

29
SRB Data Replication Support
  • Replication via Resource Set definition
  • Replication support integrated into write
    function
  • srbObjReplicate API can be used for post facto
    replication
  • Synchronous replication across all sites. Can
    choose any k out of n
  • Can choose specific replica on read operation
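
One possible reading of "any k out of n" is that a write places copies on k of the n physical resources grouped in a resource set, and a later read may name the replica it wants. The sketch below follows that reading only. The function names, the use of dictionaries as storage resources, and the "first k" selection policy are all assumptions for illustration; this is not the `srbObjReplicate` API.

```python
def replicated_write(name, data, stores, k):
    """Synchronously place copies of `data` on k of the n stores.

    Returns the indices of the stores that now hold a replica.
    Taking the first k stores is an arbitrary policy chosen for illustration.
    """
    if not 1 <= k <= len(stores):
        raise ValueError("k must be between 1 and the number of stores")
    written = []
    for index, store in enumerate(stores[:k]):
        store[name] = data  # the function returns only after every copy is written
        written.append(index)
    return written


def read_replica(name, stores, replica):
    """Read one specific replica, identified by its store index."""
    return stores[replica][name]
```

For example, with three stores and k = 2, the object ends up on the first two stores and can be read from either of them by index.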

30
Data Replication Example
[Diagram labels: Application, MCAT, three SRB servers; sites SAIC, SDSC, Caltech, NCSA; logical resources LogRsrc1, LogRsrc2; storage HPSS, HPSS, Oracle, DB2, Unix]
31
Ticket-based Access Control
  • Owner can request ticket for a data set
  • Ticket can be issued for a data set or a
    collection
  • Ticket controls access by
    • time-period (start and expire timestamps)
    • number of accesses (count)
    • user names (any, single or group users)
  • Non-registered Users can also access using
    tickets
  • Useful for sharing data and access through the
    web
  • Tickets generated and stored in MCAT
  • Currently supports read-only tickets
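
The three ticket controls listed above (time period, access count, user names) can be illustrated with a small validity check. This is a sketch under assumptions: the `Ticket` fields, the `use_ticket` function and the path-prefix rule for collection tickets are hypothetical and do not reflect how MCAT actually stores or evaluates tickets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ticket:
    target: str                        # data set or collection the ticket covers
    start: float                       # start timestamp
    expire: float                      # expiry timestamp
    remaining: int                     # number of accesses still allowed
    users: Optional[frozenset] = None  # None means any user may present it


def use_ticket(ticket, user, target, now):
    """Return True and consume one access if the ticket permits this read."""
    if not ticket.start <= now <= ticket.expire:
        return False  # outside the time period
    if ticket.remaining <= 0:
        return False  # access count used up
    if ticket.users is not None and user not in ticket.users:
        return False  # ticket restricted to other users
    # Assumed rule: a collection ticket covers everything below the collection.
    covered = target == ticket.target or target.startswith(ticket.target.rstrip("/") + "/")
    if not covered:
        return False
    ticket.remaining -= 1
    return True
```

Only read access is modelled, in line with the read-only tickets mentioned on the slide.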

32
SRB API
  • Programmatic API
    • High-level API
    • Low-level API
    • SRB Manager API
  • Command Line Interface - Scommands
  • Graphical User Interface - srbBrowser
  • Web Utilities

33
SRB API Interface
[Diagram labels: Application, MCAT, SRB Master]
34
High and Low-level API
  • Low-level API
    • talks to resource drivers
    • no registration of data sets in MCAT
    • no authentication through MCAT
    • User provides all information
  • High-level API
    • Uses low-level API to access resources
    • Registers data management information in MCAT
    • Uses MCAT for authentication and meta information
    • Uses MCAT for resource and data discovery
    • Access/store data in remote SRB
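
The layering of the two API levels can be sketched as follows. The class names `LowLevel` and `HighLevel`, the dictionary-based "drivers" and the generated `/vault/...` paths are hypothetical; the sketch only shows the division of labour described above, with the low level needing every detail from the caller and the high level recording and looking up those details in a catalogue. Authentication is left out.

```python
class LowLevel:
    """Talks to a storage driver directly; the caller supplies resource and path."""

    def __init__(self, drivers):
        self.drivers = drivers  # resource name -> dict acting as that resource's storage

    def write(self, resource, path, data):
        self.drivers[resource][path] = data

    def read(self, resource, path):
        return self.drivers[resource][path]


class HighLevel:
    """Uses the low-level calls and keeps data management information in a catalogue."""

    def __init__(self, low, catalogue):
        self.low = low
        self.catalogue = catalogue  # logical name -> (resource, path), stand-in for MCAT

    def put(self, logical_name, data, resource):
        path = f"/vault/{len(self.catalogue)}"  # physical path chosen by the system
        self.low.write(resource, path, data)
        self.catalogue[logical_name] = (resource, path)  # register the data set

    def get(self, logical_name):
        resource, path = self.catalogue[logical_name]  # discovery via the catalogue
        return self.low.read(resource, path)
```

Data written through `LowLevel` alone never appears in the catalogue, which mirrors the "no registration of data sets in MCAT" point.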

35
System Manager API
  • srbChkMdasAuth(conn, userName, userAuth, domain)
  • srbChkMdasSysAuth(conn, userName, userAuth,
    domain)
  • srbRegisterUser(conn, userName, domain, password,
    userType, userAddress, userPhone, userEmail)
  • srbRegisterUserGrp(conn, userGrpName,
    userGrpPassword, userGrpType,
    userGrpAddress, userGrpPhone, userGrpEmail)

36
srbBrowser - A SRB Graphical Interface
  • A Java GUI
  • Interfaces with SRB servers using the client
    API library.
  • Performs most SRB operations - cp, replicate,
    import, export, metadata query, etc.

[Diagram labels: USER, Windows or Java GUI (obtain user's metadata information via SRB; invoke SRB operations), SRB Agent, MCAT, Proxy operation]
37
SRB Command Line Interface
[Diagram labels: Environment File, USER, SRB shell commands (Sls, Scp, Scat, Sput, Sget, ...), MCAT, SRB Agent, Proxy operation]
38
Scommands
  • Sinit - initialize S-environment
  • Sexit - clean up
  • Sman - get manpage for Scommand
  • Scat - display srbObject on screen
  • Sput - copy local file into srbSpace
  • Sget - copy srbObject to local space
  • Sappend - append to srbObject
  • Srename - change srbObject name
  • Srm - remove srbObject
  • Schmod - change/grant access to srbObject
  • Scd - change collection
  • Spwd - display current collection
  • Sls - list collection
  • Smkdir - make new collection
  • Srmdir - remove old collection
  • SgetD - get srbObject information
  • SgetR - get resource information
  • SgetU - get user information
  • SmodD - modify srbObject info
  • SmodU - modify user info
  • Stoken - get native type information
  • Scopy - copy srbObject in another
    collection and under another name
  • Sreplicate - clone object in new resource -
    same internal id
  • Smove - move srbObject to new collection or
    resource

39
Scommands (contd.)
  • ingestUser - adding a new user or group
  • ingestResource - adding a new resource
  • ingestLogicalResource - making a new resource
    grouping
  • addLogicalResource - adding to a resource
    grouping
  • ingestLocation - adding new location
    information
  • ingestToken - adding new native types
    (e.g. resourceType, objectType, userType,
    domainName, ActionType, ...)

40
Scommands
  • Sls
    • Sls -h -L number -Y number -r -f
      collection ...
    • Sls -L number -Y number srbObj
  • Sput
    • Sput -p -D dataType -R resourceName
      -P pathName localFileName ...
      TargetName
    • Sput -p -D dataType -R resourceName
      -P pathName -i TargetName
  • Sget
    • Sget -C_n -p srbObj ... localFile
  • Sreplicate
    • Sreplicate -Cn -p -R resourceName
      -P pathName srbObj ...
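
As a small illustration of the first Sput usage line, the helper below assembles an argument list in the order shown on the slide. It is a sketch only: `build_sput` is a hypothetical helper, the resource, file and target names are made up, and treating `-D` and `-R` as optional is an assumption, since the transcript does not show which flags are optional. The command is built and printed, not executed.

```python
import shlex


def build_sput(local_files, target, data_type=None, resource=None):
    """Assemble an Sput argument list following the usage line on the slide."""
    argv = ["Sput"]
    if data_type is not None:
        argv += ["-D", data_type]   # -D dataType
    if resource is not None:
        argv += ["-R", resource]    # -R resourceName
    argv += list(local_files)       # localFileName ...
    argv.append(target)             # TargetName
    return argv


# Hypothetical names, shown only to illustrate the argument order.
print(shlex.join(build_sput(["run42.dat"], "results", resource="unix-dl")))
```

Building the list separately from running it keeps quoting correct if a file name contains spaces.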

41
SRBIO
  • open
  • creat
  • read
  • write
  • close
  • lseek
  • fopen
  • fread
  • fwrite
  • fclose
  • fseek
  • fflush
  • fgetc
  • fgets
  • fputc
  • fputs
  • getc
  • putc
  • ungetc
  • rewind
  • vfprintf
  • fprintf
  • fscanf

42
Contents
  • Introduction
  • Architecture description
  • What is good
  • What needs improving
  • What can it be used for

43
Useful features
  • Easy interfaces to access data held in SRB
  • Transparent access independent of location or
    type
  • Support for replication of data
  • Support for logical structuring of data
  • Database support to locate data
  • Ticket system
  • Enhanced access right structure
  • Modular SRB Media Drivers
  • Useful to users and system administrators

44
Contents
  • Introduction
  • Architecture description
  • What is good
  • What needs improving
  • What can it be used for

45
Current Obstacles
  • Only one MCAT catalogue - single point of
    failure, performance, ownership
  • All MCAT metadata is visible to everyone
  • Data Access at remote sites - too many interim
    steps
  • Documentation not up-to-date
  • Installation not straightforward - patches
    needed, dependent on other software
  • Licence required

46
Contents
  • Introduction
  • Architecture description
  • What is good
  • What needs improving
  • What can it be used for

47
Grid Applications within CLRC
  • Various Portals to access experimental, data and
    computing facilities within CLRC and outside.
  • Issues
    • Data held widely distributed across the site and
      in community owned facilities
    • Data required where it is not stored
    • Data located through service that is not local
      to data holding

48
Planned Structure of CLRC - Services
[Diagram labels: Problem Solving Environments, CLRC Authentication, Computing Applications, Experimental Facilities, HPCPortal, Remote systems, Local systems]
49
Integrated Solution for Earth Science
[Diagram labels: Data Storage, DataPortal, RasDaMan, Disk, Tape, BADC Catalogue, SRB, HPCPortal]
50
General CLRC DataPortal Architecture
[Diagram labels: CLRC DataPortal Server, XML wrapper (two), Local metadata, Common metadata catalogue database, Local data, Facility 1]
51
Server Architecture
[Diagram labels: USER, Key, User input interpreter, User output generator, Internal, http, pre-set XSL Script, Query Generator module, Response Generator, XML Schema, XML Parser, External agent, Central metadata repository, Wrapper for other Catalogues, XML File (two), Ascii file]
52
Architecture for integrating existing Catalogues
[Diagram labels: DataPortal Server, Request file(s), XML Wrapper, Response Generator, SQL input translator, XML output generator, Local Metadata Catalogue, RasDaMan, SRB, External agent]
53
Local Integration of SRB
54
Remote Integration of SRB
[Diagram labels: User, Locating Data, DataPortal, HPCPortal, Job submission, RasDaMan, BADC, CSAR, EPCC, two SRB Servers each with an MCAT, six SRB Agents, Data location, Data itself]
55
Conclusions
  • SRB is a useful tool in the GRID context
  • It has many plus points
  • But there is still a lot to do
  • There is nothing comparable out there!
56
Where can you get more information?
  • For an SRB license send mail to K.kleese_at_dl.ac.uk
  • For general information see the UK Grid Support
    Centre: http://www.grid-support.ac.uk/
  • For specific questions register with the
    Centre: http://www.grid-support.ac.uk/form.html
  • For information on e-science research within CLRC
    see the CLRC e-Science Centre:
    http://www.e-science.clrc.ac.uk/