Grid Computing

Transcript and Presenter's Notes
1
Grid Computing
  • Chip Watson
  • Jefferson Lab
  • Hall B Collaboration Meeting
  • 1-Nov-2001

2
What is the Grid?
  • Some wax philosophical and say it is an
    unlimited capacity for computing. Like the power
    grid, you just plug in and use it; you don't care
    who provides it.
  • Difficulty: metering your use of resources, and
    charging for them. We aren't there yet.
  • A simpler view: it is a large computer center,
    with a geographically distributed file system and
    batch system.
  • This view assumes you have a right to use each
    piece of the distributed system, subject perhaps
    to local accounting constraints.
3
Key Aspects of the Grid
  • Data Grid: a location-independent file system.
  • If you know the logical name of a data set, you
    can find it. (Normal access controls apply.)
  • Files can migrate around the grid to optimize
    usage, and may exist in multiple locations.
  • Computational Grid: submit a job to the grid.
  • You describe the requirements of your job, and
    grid middleware finds an appropriate place to run
    it.
  • Jobs can be batch, or even interactive.

4
Other Important Aspects
  • Single Sign-On
  • You log on to the grid once, and you can use
    the distributed resources for a certain period of
    time (sort of like the AFS file system)
  • Analog: an all-day metro ticket

5
Distributed Computing Model
  • In the old model, a lab has a large computer
    center, provisioned for all demanding data
    storage, analysis, and simulation requirements.
  • In the current model, only a fraction resides
    at the lab.
  • This is already widely used in HEP experiments
  • a large experiment may enlist a major computing
    partner site, e.g. IN2P3 for BaBar
  • In the new model, many sites, large and small,
    participate.
  • Some sites may be special based upon capacity or
    specialized capabilities (such as robotic
    storage).
  • The LHC will use a 3-tier model, with a large
    central facility (tier 0) distributing data to
    moderately large national centers (tier 1), which
    in turn service small nodes (tier 2)
  • What is a reasonable distribution for Hall D???

6
Why desert a working model?
  • Easier to get additional funds
  • State matching funds
  • Also NSF or other funding agency
  • Easier to involve students
  • A room full of computers is more attractive than
    an account on a machine 1000 km away
  • Opportunity for innovation
  • Easier to play with a local machine than to get
    root access on a machine 1000 km away

7
Case Study: The Lattice Portal
A prototype virtual computer center for Jefferson
Lab (under development)
8
Contents
  • Components of the virtual computer center
  • Data management
  • Batch system
  • Interactive system
  • Architectural components
  • Information Services using XML (Java servlets)
  • Replica Catalog
  • Data Grid Server (file cache / transfer agent)
  • Batch Server
  • Authentication using X.509, SSL
  • Java client packages

9
A Virtual Computer Center: Data Management
  • Global Logical File System (possibly
    constrained to a project)
  • Logical names for files (location independent)
  • Active files possibly cached at multiple sites
  • Inactive files in off-line storage (tape silo,
    multi-site)
  • Data Grid Node
  • Manages a cache of logical files, perhaps using
    multiple file servers; exports files locally via
    NFS
  • Maps logical name to local (physical) file name
  • Supports file transfers between managed and
    unmanaged storage, and between grid nodes
    (queuing of transfer requests)
  • Replica Catalog
  • Tracks which logical files exist at which data
    grid nodes
  • Contains some file meta-data to allow file
    selection by attributes as well as by name

10
In picture form
[Diagram: a client program, via a client library, contacts in turn a
MetaData Catalog host, a Replica Catalog host, and a Data Grid Server
with its file server (file host).]
  • Get file names from meta data (energy, target,
    magnet settings)
  • Contact replica catalog to locate desired file.
    Get referral to a Data Grid Server
  • Get file state (on disk), additional info,
    referral to transfer agent
  • Get the file (parallel streams)
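
As a sketch, the four steps above might map onto client-side
interfaces like the following. Every type and method name here is
invented for illustration; these are not the actual Lattice Portal
client packages.

    import java.util.List;

    interface MetaDataCatalog {
        // 1. Select logical file names by physics attributes.
        List findFiles(String attributeQuery);  // e.g. energy, target, magnet
    }

    interface ReplicaCatalog {
        // 2. Locate a logical file; the result is a referral (URL)
        //    to a Data Grid Server holding a copy.
        String locate(String logicalName);
    }

    interface DataGridServer {
        // 3. File state (on disk or not) plus a transfer-agent referral.
        String stat(String logicalName);
        // 4. Fetch the file using parallel streams.
        void get(String logicalName, String localPath, int parallelStreams);
    }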

11
A Virtual Computer Center: Batch System
  • Global Queue(s)
  • A user submits a batch job, perhaps using a web
    interface, to the virtual computer center (a.k.a.
    meta-facility). Based upon the locations of the
    executable, the input data files, and the
    necessary compute resources, the job is assigned
    to a particular compute grid node (cluster of
    machines).
  • Compute Grid Node
  • Set of co-located compute resources managed by a
    batch system. Typically co-located with a data
    grid node, e.g. Jefferson Lab's Computer Center.

12
Virtual Computer Center: Interactive
  • Conventional remote login is expected to be less
    common, as all capabilities are remotely
    accessible. Nevertheless...
  • Interactive Services
  • ssh login to machine of desired architecture and
    operating system
  • interactive access to small clusters for serial
    and parallel jobs (or fast turnaround on local
    batch system)

13
Implementation?
  • As with any distributed system, there are many
    ways to construct a meta-facility or grid
  • CORBA (distributed object system)
  • DCOM (Windows only)
  • Custom protocols over TCP/IP or UDP/IP
  • Grid Middleware
  • Globus (from ANL)
  • Legion (UVA)
  • Web Services
  • . . . or some combination of the above

14
What are Web Services?
  • Web Services are functions or methods that can be
    accessed across the web.
  • Think of this as a better RPC (remote procedure
    call) system. Why better?

15
Why Web Services?
  • Use of industry standards
  • HTTP, HTTPS, XML, SOAP, WSDL, UDDI, ...
  • Support for many languages
  • Compiled and scripted
  • Self-describing protocols
  • easier management of versioning, evolution
  • Support for authentication
  • Strong Industry Support
  • Microsoft's .NET initiative
  • Sun's ONE (Open Net Environment)
  • IBM contributions to Apache / SOAP
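
To give a flavor of "a better RPC", here is a minimal client call in
the style of the era's Apache SOAP library (the SOAP libraries noted
later in this deck). The service URN, method name, and endpoint URL
are all hypothetical.

    import java.net.URL;
    import java.util.Vector;
    import org.apache.soap.Constants;
    import org.apache.soap.rpc.Call;
    import org.apache.soap.rpc.Parameter;
    import org.apache.soap.rpc.Response;

    public class LocateFile {
        public static void main(String[] args) throws Exception {
            Call call = new Call();
            call.setTargetObjectURI("urn:ReplicaCatalog");  // hypothetical
            call.setMethodName("locate");
            call.setEncodingStyleURI(Constants.NS_URI_SOAP_ENC);

            Vector params = new Vector();
            params.addElement(new Parameter("logicalName", String.class,
                                            "/halld/sim/run0001.evt", null));
            call.setParams(params);

            // rpcrouter is Apache SOAP's standard RPC endpoint servlet.
            Response resp = call.invoke(
                new URL("http://portal.example.org/soap/servlet/rpcrouter"), "");
            if (!resp.generatedFault())
                System.out.println(resp.getReturnValue().getValue());
        }
    }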

16
A three-tier web services architecture
[Diagram: a web browser or application makes authenticated connections
to a web server (the portal), which hosts web services plus an
XML-to-HTML servlet. The portal's web services call web and grid
services on remote web servers, which front local backend services
(batch, file, etc.), a storage system, and grid resources, e.g. a
Condor batch system.]
17
Web Services Details: Data Grid
  • Replica Catalog and Data Grid Node (common)
  • List contents of a directory
  • Navigate to another directory, or follow a soft
    link
  • Mkdir: make a new directory
  • Link: make a new link
  • Delete a logical file, directory, or link
  • Properties: set / retrieve properties of a file,
    directory, or link (including protection / access
    control)
  • Replica Catalog specific
  • Create a new logical file
  • Add/Remove/Select/Delete replica: manipulate
    references to where a file is stored
  • Data Grid Node specific
  • Allocate space for an incoming file
  • Copy a file to/from unmanaged space, or another
    grid node
  • Locate: get a reference to the physical file for
    transfer or local access
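
One way to picture this operation set is as a single Java interface.
This is only an illustrative binding; the deployed service would be
described in WSDL, and none of these signatures are taken from the
actual implementation.

    import java.util.Properties;

    interface GridDirectoryService {
        // Common to the replica catalog and a data grid node
        String[] list(String directory);
        void mkdir(String path);
        void link(String path, String target);
        void delete(String path);               // file, directory, or link
        Properties getProperties(String path);  // incl. access control
        void setProperties(String path, Properties p);

        // Replica catalog specific
        void createLogicalFile(String logicalName);
        void addReplica(String logicalName, String nodeUrl);
        void removeReplica(String logicalName, String nodeUrl);

        // Data grid node specific
        void allocate(String logicalName, long bytes); // incoming file space
        void copy(String source, String destination);  // unmanaged or other node
        String locate(String logicalName);             // physical file reference
    }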

18
Web Services Details: Batch System
  • User Job Operations
  • Submit
  • Resource requirements (CPU, memory, disk, net, ...)
  • Dependencies on other jobs / events
  • Executables, libraries, etc., input files, output
    files
  • Cancel
  • Suspend / Resume
  • List by queue, owner, site, ...
  • View allocation, usage
  • Operator Operations
  • On systems, queues, jobs
  • On the quota / allocation system
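
A sketch of how the job operations and resource requirements above
might look as Java types; the field and method names are invented,
not the portal's actual API.

    class JobRequest {
        String executable;    // plus libraries, etc.
        String[] inputFiles;  // logical names, resolved via the replica catalog
        String[] outputFiles;
        int cpus;             // resource requirements
        long memoryMB;
        long diskMB;
        String[] dependsOn;   // jobs / events this job waits for
    }

    interface BatchService {
        // User job operations
        String submit(JobRequest job);   // returns a job id
        void cancel(String jobId);
        void suspend(String jobId);
        void resume(String jobId);
        String[] list(String queue, String owner, String site);
        String viewAllocation(String owner);  // allocation, usage

        // Operator operations on systems, queues, jobs, and the
        // quota/allocation system would extend this interface.
    }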

19
Technology Choice: XML
  • Advantages
  • Self-describing data (contains meta data)
  • Facilitates heterogeneous systems
  • Robust against evolution
  • (none of the fragile versioning that distributed
    object systems encounter)
  • A new server generates additional tags, which are
    ignored by an old client
  • A new client detects the absence of new tags and
    knows it is talking to an old server (and/or
    supplies defaults)
  • Capable of defining all key concepts and
    operations for both client-server and
    client-portal communications
  • Technologies
  • XML: eXtensible Markup Language
  • SOAP: Simple Object Access Protocol (a modern RPC
    system)
  • WSDL: Web Services Description Language (an IDL)
  • UDDI: Universal Description, Discovery and
    Integration
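
The versioning robustness can be made concrete with a few lines of
DOM parsing: a client reads only the tags it knows, so extra tags
from a newer server are simply never visited, and a missing tag gets
a default. The element names here are invented.

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    public class FileInfoClient {
        public static void main(String[] args) throws Exception {
            DocumentBuilder db =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(args[0]);  // a server's XML reply

            // A tag this client knows: read it directly.
            String name = doc.getElementsByTagName("logicalName")
                             .item(0).getFirstChild().getNodeValue();

            // A tag added by a newer server: an old client never asks for
            // it; a new client checks for absence and supplies a default,
            // so either generation can talk to either server.
            NodeList sizeTags = doc.getElementsByTagName("sizeBytes");
            long size = (sizeTags.getLength() == 0) ? -1 :
                Long.parseLong(sizeTags.item(0).getFirstChild().getNodeValue());

            System.out.println(name + " size=" + size);
        }
    }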

20
Technology Choices: Java Servlets
  • Java Advantages
  • Rapid code development
  • No memory leaks
  • Easy-to-use interfaces to SQL databases, XML
    libraries
  • Rich library of low-level components (containers,
    etc.)
  • Web Servlet Advantages
  • Java (see above)
  • Scalability (see e-commerce)
  • Modular web services
  • One servlet can invoke another, e.g. to translate
    XML to HTML
  • Minor Web Inconvenience
  • Asynchronous notification of clients of web
    services is awkward
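
A minimal servlet in the pattern this slide describes: one servlet
emits XML, which another servlet or an XSL transform (see the
Technologies Employed slide) can turn into HTML. The servlet and tag
names are illustrative only.

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class ReplicaListServlet extends HttpServlet {
        public void doGet(HttpServletRequest req, HttpServletResponse res)
                throws IOException {
            res.setContentType("text/xml");
            PrintWriter out = res.getWriter();
            out.println("<?xml version=\"1.0\"?>");
            out.println("<replicas file=\"" + req.getParameter("file") + "\">");
            out.println("  <site>siteA</site>");  // would come from the catalog
            out.println("</replicas>");
        }
    }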

21
PPDG Collaboration: JLAB/SRB
  • Web services as a general portal to a variety of
    back-end storage systems (JLAB, SRB, ...)
  • And other services: batch
  • The project should define the abstractions at the
    web services level: define all metadata for
    interacting with a storage system
  • Define XML to describe digital objects and
    collections/directories (ALL)
  • Metadata to describe the logical namespace of the
    grid (SRB, JLAB, GridFTP attributes)
  • Standard structure for organizing as XML
  • Define (WSDL?) operations of browse, query,
    manage (ALL)
  • Listing files available through the interface
  • Caching, replication, pinning, staging, resource
    allocation, etc.
  • Back-end implementations
  • JASMine (JLAB)
  • SRB (SDSC)
  • (SRM, Globus)
  • Implement a demonstration web services client
    (JLAB)
  • Web services clients should be able to interact
    with any of these

22
JLAB MSS: JASMine
  • Features
  • Stand-alone cache manager
  • Pluggable policies
  • Implemented in Java
  • Distributed, scalable
  • Pluggable security
  • Authentication / authorization
  • To be integrated with GSI
  • Scheduling of drives
  • Can manage tape, tape and disk, or disk alone

[Diagram: JASMine-managed mass storage sub-systems, with a 2 TB farm
cache, 15 TB of experiment cache pools, and a 0.5 TB LQCD cache pool.]
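
"Pluggable policies" suggests an interface boundary roughly like the
following; this is a guess at the shape, not JASMine's actual API.

    interface CachePolicy {
        // Pick a cached file to evict when the pool needs space.
        String selectVictim(String[] cachedFiles);
        // Decide whether a requested file may enter this pool.
        boolean admit(String logicalName, long sizeBytes);
    }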
23
Example demo client
  • Similar to a graphical ftp client, but...
  • Each half can attach to a grid node
  • Cache-managed filesystem
  • User's home directory
  • Other file systems at the web server
  • Replica catalog
  • Local MSS, if it is separate from the replica
    system
  • Can move files in/out of managed store
  • Negotiates compatible protocols between grid
    nodes
  • E.g., http, SRB, gridFTP, ftp, bbftp, JASMine, etc.

24
Technologies Employed
  • Apache web server
  • Tomcat servlet engine, SOAP libraries
  • Java Server Pages (JSP)
  • XML data format
  • XSL style sheets for presentation
  • X.509 certificate authentication
  • Web interface to a simple certificate authority
    to issue certificates valid within the
    meta-facility (signed by Jefferson Lab)
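
The XML-to-HTML presentation step (XSL style sheets) can be done with
JAXP's standard XSLT support, roughly as below; the stylesheet and
input file names are placeholders.

    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class XmlToHtml {
        public static void main(String[] args) throws Exception {
            // Load the stylesheet, then transform one XML document to HTML.
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource("portal.xsl"));
            t.transform(new StreamSource("replicas.xml"),
                        new StreamResult(System.out));
        }
    }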

25
Data Grid
  • Capabilities planned
  • Replicated data (multi-site), global tree-
    structured name space (like the Unix file system)
  • Replica catalog, replicated multi-site
  • using MySQL as the back end, probably using
    MySQL's replication ability (for fault tolerance)
  • Browse by attributes as well as by name
  • Parallel file transfers:
  • bbftp, gridftp, ...
  • JPARSS: 100% Java parallel file transfers (w/ 3rd
    party transfer, authentication)
  • Drag-and-drop between sites
  • Policy-based replication (auto-migrate between
    sites)
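
With MySQL as the back end, a replica lookup reduces to a simple JDBC
query. The table and column names below are invented for
illustration; the actual catalog schema is not shown in this deck.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class WhereIs {
        public static void main(String[] args) throws Exception {
            Class.forName("com.mysql.jdbc.Driver");  // era-appropriate driver
            Connection c = DriverManager.getConnection(
                    "jdbc:mysql://localhost/replica_catalog", "reader", "");
            PreparedStatement ps = c.prepareStatement(
                    "SELECT site, physical_path FROM replica WHERE logical_name = ?");
            ps.setString(1, args[0]);
            ResultSet rs = ps.executeQuery();
            while (rs.next())
                System.out.println(rs.getString("site") + " "
                                 + rs.getString("physical_path"));
            c.close();
        }
    }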

26
Status
  • Prototype
  • Browse contents of a prototype disk cache / tape
    storage file system
  • Move files between managed and unmanaged storage
    on data node
  • Move files (including entire directories) between
    desktop and data node
  • Displays whether a file is currently in the disk cache
  • Can request move from tape to disk (not released)
  • Soon
  • 3rd party file transfers (between 2 servers)

27
Near Term
  • Convert from raw XML to SOAP (this month)
  • Deploy the disk cache manager to FSU and MIT (4Q01)
  • Abstract the disk-to-tape migration of the current
    system to support WAN site-to-site migration of
    files, wrapping e.g. gridftp or another parallel
    transfer protocol (1Q02)

28
Conclusions
  • Grid Capabilities are starting to emerge
  • Jefferson Lab will have a functioning data grid
    in FY02
  • Jefferson Lab will have a functioning
    meta-facility in FY03