Course Introduction (cont.) and Introduction to Globus - PowerPoint PPT Presentation

Title:

Course Introduction (cont.) and Introduction to Globus

Description:

What basic services must be provided by a grid infrastructure? Introduction to Globus ... The Globus toolkit provides a range of basic Grid services ... – PowerPoint PPT presentation

Provided by: carlk152

Transcript and Presenter's Notes

Title: Course Introduction (cont.) and Introduction to Globus


1
Course Introduction (cont.) and Introduction to
Globus
  • Joint work of USC Information Sciences Institute
  • and Argonne National Laboratory

2
Last time: Defined computational grids
  • Emerging computational and networking
    infrastructure
  • pervasive, uniform, and reliable access to
    remote data, computational, sensor, and human
    resources
  • Enable entirely new approaches to applications
    and problem solving
  • remote resources the rule, not the exception

3
Example: Aeronautic Design
Collaboration
Simulation
Instrumentation
Design data
4
Why Now?
  • The Internet as infrastructure
  • Increasing bandwidth, advanced services
  • Advances in storage capacity
  • Terabyte store is 150,000
  • Increased availability of compute resources
  • clusters, supercomputers, etc.
  • Advanced applications
  • simulation based design, advanced scientific
    instruments, ...

5
Tomorrow's Infrastructure: Not Just Faster and
More Reliable
O(10^9) nodes
Caching
Resource Discovery
QoS
  • Application-centric: heterogeneous, mobile
    end-systems; many embedded capabilities; rich
    services; user-level quality of service

6
Today
  • How does grid computing differ from traditional
    distributed computing?
  • Where do grids get their names?
  • What basic services must be provided by a grid
    infrastructure?
  • Introduction to Globus

7
A Grid Application Scenario
  • A distributed simulation involving 10
    supercomputers at 10 different locations
  • How do you know where they are?
  • How do you identify yourself to each?
  • How do you get permission to use them?
  • How do you submit remote jobs?
  • How do you get access to resources on all the
    machines simultaneously?
  • What happens if a machine fails?
  • How are input/output files managed?

8
Basic Grid Services
  • Security
  • Authentication: both client and server
  • Authorization: what privileges does the client
    have?
  • Access control: sites want local control of
    operations that remote users are allowed to
    perform
  • Confidential data transfer using encryption

9
Basic Grid Services (cont.)
  • Resource management
  • Mechanism for submitting jobs to remote locations
  • Local policies for use, management, resource
    configuration
  • Scheduling of important resources
  • Coordinating scarce, expensive resources
    (e.g., cooperating supercomputers)
  • Advance reservations to guarantee:
  • Quality of service
  • Completion of operations (e.g., reserve disk
    space for a large data transfer)

10
Basic Grid Services (cont.)
  • Information Services
  • Register and query information about grid
    resources
  • Where are all the Cray T3Es in the grid?
  • Where is a storage system with 250 gigabytes of
    free space that transfers data at 1 gigabit/sec?
  • Centerpiece for many Grid components
  • Performance measurement services
  • What is the current bandwidth of the link from
    jupiter.isi.edu to apogee.sdsc.edu?
  • Dynamic environment: assume the information
    service contains old information

11
Basic Grid Services (cont.)
  • Efficient Data Transfers
  • Secure (authentication, encryption)
  • Parallel transfers
  • Partial file transfers
  • Third-party transfers
  • Reliable transfers
  • Replica Management Service
  • Large (petabyte-scale) datasets
  • Multiple stored and cached copies
  • Select the best copy with best performance
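
The parallel and partial transfer items above both come down to addressing a file by byte range. A minimal sketch, assuming a file of known size is split into contiguous (offset, length) chunks, one per stream; the function name and the chunking policy are illustrative assumptions, not the Globus transfer API:

```python
def split_ranges(size, streams):
    """Split the byte range [0, size) into up to `streams` contiguous chunks.

    Returns a list of (offset, length) pairs. Each pair could be fetched
    as a partial transfer; fetching them concurrently gives a parallel transfer.
    """
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams >= 1")
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)  # spread the remainder evenly
        if length == 0:
            break  # fewer bytes than streams: stop early
        ranges.append((offset, length))
        offset += length
    return ranges
```

A reliable transfer can reuse the same ranges: only the chunks that failed need to be requested again.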

12
Basic Grid Services (cont.)
  • Fault detection
  • Detect and report failure of component of a
    computation
  • Limited by ability to distinguish between network
    partition and system failure
  • Goal: make low-level operations reliable
  • No libraries for checkpoint and restart
  • Can't checkpoint a socket
  • Only application knows how to checkpoint and
    restart
  • Likewise, storage system must do logging

13
Major Grid Computing Infrastructure Projects
  • The Globus Project
  • Bag of services model for grid computing
  • USC Information Sciences Institute and Argonne
    National Laboratory (Chicago)
  • We will use Globus for most of the examples in
    this class
  • The Legion Project
  • Object-oriented approach to grid computing
  • The Condor Project
  • Schedule computations on pool of resources

14
Today
  • How does grid computing differ from traditional
    distributed computing?
  • Where do grids get their names?
  • What basic services must be provided by a grid
    infrastructure?
  • Introduction to Globus

15
Grid Services Architecture
High-energy physics data analysis
Collaborative engineering
On-line instrumentation
Applications
Regional climate studies
Parameter studies
16
The Globus Approach
  • The Globus toolkit provides a range of basic Grid
    services
  • Security, information, fault detection,
    communication, resource management, ...
  • These services are simple and orthogonal
  • Can be used independently, mix and match
  • Programming model independent
  • For each there are well-defined APIs
  • Standards are used extensively
  • E.g., LDAP, GSS-API, X.509, ...

17
Grid Security Infrastructure
  • Single sign-on, run anywhere if authorized
  • Standards based (GSS, SSL, X.509)
  • GSS-API Interface
  • Identity/credential mapping at each resource
  • Limited delegation of rights
  • Integrated into wide variety of tools
  • Globus Resource Management
  • Secure shell, FTP,
  • Storage Resource Broker

18
Authentication Model
  • Authentication is done on a user basis
  • Single authentication step allows access to all
    grid resources
  • No communication of plaintext passwords
  • Most sites will use conventional account
    mechanisms
  • You must have an account on a resource to use
    that resource
  • Sites may use generic Grid accounts
  • Not common, but Globus can deal with it

19
Grid Security Infrastructure
  • Based on public key technology
  • Standard X.509 certificate, same as certificates
    used for the Web
  • Each user has
  • a Grid user id (called a Subject Name)
  • /C=US/O=Globus/O=University of Southern
    California/OU=Information Sciences
    Institute/CN=Ann Chervenak
  • a private key (like a password)
  • a certificate signed by a Certificate Authority
    (CA)
  • A gridmap file at each site specifies the grid-id
    to local-id mapping
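
The grid-id to local-id mapping can be sketched as a simple lookup. This is a minimal sketch, assuming one entry per line consisting of a quoted subject name followed by a local account; the local account name `annc` and the function names are hypothetical:

```python
import shlex

SUBJECT = ("/C=US/O=Globus/O=University of Southern California"
           "/OU=Information Sciences Institute/CN=Ann Chervenak")

# Hypothetical gridmap contents: quoted subject name, then a local account.
SAMPLE_GRIDMAP = f'''
# grid-id -> local-id
"{SUBJECT}" annc
'''

def parse_gridmap(text):
    """Parse gridmap-style lines into {subject_name: local_id}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honors the quotes around the subject name
        if len(parts) != 2:
            raise ValueError(f"malformed gridmap line: {line!r}")
        subject, local_id = parts
        mapping[subject] = local_id
    return mapping

def map_to_local_id(mapping, subject):
    """Return the local account for a subject name, or None if it is not listed."""
    return mapping.get(subject)
```

A subject name that is missing from the file maps to None, which matches the rule that a user needs an account on a resource to use it.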

20
Certificate Based Authentication
  • User has a certificate, signed by a trusted
    certificate authority (CA)
  • Certificate contains user's name and public key
  • Globus project operates a CA
  • User's private key is used to encode a challenge
    string
  • Public key is used to decode the challenge
  • If you can decode it, you know the user
  • Treat your private key carefully!!
  • Private key is stored in encrypted form
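
The challenge step can be illustrated with a toy example. This is a sketch only, assuming textbook RSA with tiny primes (n = 61 * 53); it is insecure and is not how GSI or SSL implement the exchange, and all names are hypothetical:

```python
import secrets

# Toy RSA key pair (tiny textbook primes, for illustration only).
N, E, D = 3233, 17, 2753  # n = 61 * 53; e * d = 1 (mod 3120)

def sign_challenge(challenge, private_exp=D, modulus=N):
    """User side: encode the challenge with the private key."""
    return pow(challenge, private_exp, modulus)

def verify_response(challenge, response, public_exp=E, modulus=N):
    """Resource side: decode with the public key taken from the certificate."""
    return pow(response, public_exp, modulus) == challenge

def authenticate(sign):
    """Issue a fresh random challenge and check the signed response."""
    challenge = secrets.randbelow(N - 2) + 2  # value in [2, N - 1]
    return verify_response(challenge, sign(challenge))
```

Only the holder of the private exponent can produce a response that decodes back to the challenge, so no password ever crosses the network.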

21
User Proxies
  • Minimize exposure of user's private key
  • A temporary credential for use by our
    computations
  • We call this a user proxy certificate
  • Allows process to act on behalf of user
  • User-signed user proxy certificate stored in
    local file
  • Proxy's private key is not encrypted
  • Rely on file system security, proxy certificate
    file must be readable only by the owner
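
Because the proxy's private key is unencrypted, the permission check matters. A minimal sketch of such a check, assuming a POSIX file system; the function name is hypothetical and this is not the check the Globus tools themselves perform:

```python
import os
import stat

def is_private_to_owner(path):
    """Return True if neither group nor others have any permission on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A file with mode 0600 passes this check, while mode 0644 fails it because group and others can read the file.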

22
Delegation
  • Remote creation of a user proxy
  • Allows remote process to act on behalf of the
    user
  • Avoids sending passwords or private keys across
    the network

23
Single sign-on via grid-id
[Figure labels: credential; assignment of credentials to user proxies; Globus credential; mutual user-resource authentication; Site 2; mapping to local ids; authenticated interprocess communication; GSSAPI: multiple low-level mechanisms; certificate]
24
Resource Management
  • Globus Resource Allocation Manager (GRAM)
  • Uniform interface to resource management
  • Globus Arch. for Reservation and Allocation
  • Co-allocation of compute resources
  • Immediate and advance reservation of network and
    computers in prototype form
  • Fault detection service
  • Network measurement tools
  • Code management and distribution infrastructure

25
Resource Management
  • Resource Specification Language (RSL) is used to
    communicate requirements
  • The Globus Resource Allocation Manager (GRAM) API
    allows programs to be started on remote
    resources, despite local heterogeneity
  • A layered architecture allows application-specific
    resource brokers and co-allocators to be defined
    in terms of GRAM services

26
Resource Management Architecture
[Figure labels: Application; RSL; RSL specialization; Information Service (queries, info); ground RSL; simple ground RSL; GRAM (three instances); local resource managers: LSF, EASY-LL, NQE]
27
Resource Specification Language
  • Common notation for exchange of information
    between components
  • RSL provides two types of information
  • Resource requirements Machine type, number of
    nodes, memory, etc.
  • Job configuration Directory, executable, args,
    environment
  • API provided for manipulating RSL

28
RSL Syntax
  • Elementary form: parenthesis clauses
  • (attribute op value [value ...])
  • Operators supported
  • <, <=, =, >=, >, !=
  • Some supported attributes
  • executable, arguments, environment, stdin,
    stdout, stderr, resourceManagerContact,
    resourceManagerName
  • Unknown attributes are passed through
  • May be handled by subsequent tools

29
Constraints
  • For example
  • &(count>=5) (count<=10)
  • (max_time=240) (memory>=64)
  • (executable=myprog)
  • Create 5-10 instances of myprog, each on a
    machine with at least 64 MB memory that is
    available to me for 4 hours
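
How a tool might check such constraints against a machine description can be sketched as follows. This is a minimal sketch, assuming the relational operators <, <=, =, >=, >, != and flat clauses only; the parser, the machine dictionaries, and the function names are hypothetical and are not the RSL manipulation API that Globus provides:

```python
import operator
import re

OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       ">=": operator.ge, ">": operator.gt, "!=": operator.ne}

# One "(attribute op value)" clause; two-character operators are tried first.
CLAUSE = re.compile(r"\(\s*(\w+)\s*(<=|>=|!=|<|>|=)\s*([^()\s]+)\s*\)")

def parse_clauses(rsl):
    """Return a list of (attribute, operator, value) triples."""
    return CLAUSE.findall(rsl)

def _coerce(value):
    """Treat numeric values as integers, everything else as strings."""
    try:
        return int(value)
    except ValueError:
        return value

def satisfies(machine, clauses):
    """Check the clauses whose attribute the machine describes.

    Attributes the machine does not know are skipped, mirroring the
    rule that unknown attributes are passed through to later tools.
    """
    for attr, op, value in clauses:
        if attr in machine and not OPS[op](machine[attr], _coerce(value)):
            return False
    return True
```

For the example above, a machine described as `{"memory": 128}` satisfies the request and one described as `{"memory": 32}` does not.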

30
Multirequest
  • A multirequest allows us to specify multiple
    resource needs, for example
  • +(&(count=5)(memory>=64)
  • (executable=p1))
  • (&(network=atm) (executable=p2))
  • Execute 5 instances of p1 on a machine with at
    least 64 MB of memory
  • Execute p2 on a machine with an ATM connection
  • Multirequests are central to co-allocation

31
Co-allocation
  • Simultaneous allocation of a resource set
  • Handled via optimistic co-allocation based on
    free nodes or queue prediction
  • In the future, advance reservations will also be
    supported
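
One way to read "optimistic co-allocation based on free nodes" is an all-or-nothing attempt: claim nodes at every site, and release everything if any site cannot satisfy its part. A minimal sketch under that assumption; the site names, data structures, and function name are hypothetical:

```python
def co_allocate(requests, free_nodes):
    """Try to allocate all subjobs at once; roll back if any site falls short.

    requests:   {site: nodes_needed}
    free_nodes: {site: nodes_currently_free}, updated in place on success
    Returns the granted allocation, or None if co-allocation failed.
    """
    granted = {}
    for site, needed in requests.items():
        if free_nodes.get(site, 0) >= needed:
            free_nodes[site] -= needed
            granted[site] = needed
        else:
            # Release what was already claimed so no site is left half-used.
            for claimed_site, count in granted.items():
                free_nodes[claimed_site] += count
            return None
    return granted
```

With advance reservations, the free-node check would be replaced by a reservation made ahead of time at each site.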

32
A Co-allocation Multirequest
+(&(resourceManagerContact=
"flash.isi.edu:754:/C=US/…/CN=flash.isi.edu-fork")
(count=1) (label="subjob A")
(executable=my_app1))
(&(resourceManagerContact=
"sp139.sdsc.edu:8711:/C=US/…/CN=sp097.sdsc.edu-lsf")
(count=2) (label="subjob B")
(executable=my_app2))
33
Job Submission Interfaces
  • Globus Toolkit includes several command line
    programs for job submission
  • globus-job-run Interactive jobs
  • globus-job-submit Batch/offline jobs
  • globusrun Flexible scripting infrastructure
  • Others are building better interfaces
  • General purpose
  • Condor-G, PBS, GRD, Hotpage, etc
  • Application specific
  • ECCE, Cactus, Web portals

34
Grid Information Services
  • Publish and retrieve information about system
    elements
  • Used for discovery, configuration, scheduling
  • Distributed collection of information servers and
    index nodes
  • LDAP V3 as wire protocol and API

35
Examples of Useful Information
  • Characteristics of a compute resource
  • IP address, software available, system
    administrator, networks connected to, OS version,
    load
  • Characteristics of a network
  • Bandwidth and latency, protocols, logical
    topology
  • Characteristics of the Globus infrastructure
  • Hosts, resource managers

36
Grid Information Service
  • Provide access to static and dynamic information
    regarding system components
  • A basis for configuration and adaptation in
    heterogeneous, dynamic environments
  • Requirements and characteristics
  • Uniform, flexible access to information
  • Scalable, efficient access to dynamic data
  • Access to multiple information sources
  • Decentralized maintenance

37
The Globus Toolkit: Metacomputing Directory Service
  • Store information in a distributed directory
  • Directory stored in collection of LDAP servers
  • Directory can be updated by
  • Information providers and tools
  • Applications (i.e., users)
  • Backend tools which generate info on demand
  • Information dynamically available to
  • Tools
  • Applications

38
Directory Service Functions
  • White Pages
  • Look up the IP number, amount of memory, etc.,
    associated with a particular machine
  • Yellow Pages
  • Find all the computers of a particular class or
    with a particular property
  • Temporary inconsistencies are often considered
    okay
  • In a distributed system, you often do not know
    the state of a resource until you actually use it
  • Information is often used as hints
  • Information itself can contain ttl, etc.
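
The white pages and yellow pages functions, together with the idea of entries carrying a ttl, can be sketched with an in-memory directory. This is an illustration only, assuming entries are plain attribute dictionaries; it is not the MDS or LDAP interface, and the class and attribute names are hypothetical:

```python
import time

class Directory:
    """In-memory directory whose entries expire after a time-to-live."""

    def __init__(self):
        self._entries = {}  # host -> (attributes, expiry_time)

    def publish(self, host, attrs, ttl, now=None):
        now = time.time() if now is None else now
        self._entries[host] = (dict(attrs), now + ttl)

    def white_pages(self, host, now=None):
        """Look up one machine by name; expired entries count as unknown."""
        now = time.time() if now is None else now
        entry = self._entries.get(host)
        if entry is None or entry[1] < now:
            return None
        return entry[0]

    def yellow_pages(self, predicate, now=None):
        """Find all machines whose unexpired attributes match a predicate."""
        now = time.time() if now is None else now
        return sorted(host for host, (attrs, expiry) in self._entries.items()
                      if expiry >= now and predicate(attrs))
```

Even an unexpired answer is only a hint: the resource may have changed since the entry was published.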

39
MDS Approach
Application
  • Based on LDAP
  • Lightweight Directory Access Protocol v3 (LDAPv3)
  • Standard data model
  • Standard query protocol
  • Globus specific schema
  • Host-centric representation
  • Globus specific tools
  • GRIS, GIIS
  • Data discovery, publication,

[Figure labels: Middleware; LDAP API; GRIS; GIIS; SNMP; NWS; NIS; LDAP]
40
Grid Resource Information Service
  • Server which runs on each resource
  • Given the resource DNS name, you can find the
    GRIS server (well known port 2135)
  • Provides resource specific information
  • Much of this information may be dynamic
  • Load, process information, storage information,
    etc.
  • GRIS gathers this information on demand
  • White pages lookup of resource information
  • Ex.: How much memory does machine have?
  • Yellow pages lookup of resource options
  • Ex.: Which queues on machine allow large jobs?

41
Grid Index Information Service
  • GIIS describes a class of servers
  • Gathers information from multiple GRIS servers
  • Each GIIS is optimized for particular queries
  • Ex. 1: Which Alliance machines are >16-processor
    SGIs?
  • Ex. 2: Which Alliance storage servers have >100 Mbps
    bandwidth to host X?
  • Akin to web search engines
  • Organization GIIS
  • The Globus Toolkit ships with one GIIS
  • Caches GRIS info with long update frequency
  • Useful for queries across an organization that
    rely on relatively static information

42
Referral Service
  • Links together multiple GRIS and/or GIIS servers
    into a single LDAP namespace
  • Referral servers contain no actual content

43
Data Grid Services
  • Access to remote data
  • Uniform access to diverse, remote storage
    management systems
  • Cache management
  • Transport services
  • Standards based (GSI, FTP protocol)
  • Client API, Extensible server, support for third
    party transfer
  • Replica Management

44
Data Intensive Issues Include
  • High-speed, reliable access to remote data
  • Automated discovery of best copy of data
  • Manage replication to improve performance
  • Co-schedule compute, storage, network
  • Enforce access control on data

45
The Globus Data Grid
  • Two major components
  • 1. Data Transport and Access
  • Common protocol
  • Secure, efficient, flexible, extensible data
    movement
  • Family of tools supporting this protocol
  • 2. Replica Management Architecture
  • Simple scheme for managing
  • multiple copies of files
  • collections of files

46
Motivation for a Common Data Access Protocol
  • Existing distributed data storage systems
  • DPSS, HPSS focus on high-performance access,
    utilize parallel data transfer, striping
  • DFS focus on high-volume usage, dataset
    replication, local caching
  • SRB connects heterogeneous data collections,
    uniform client interface, metadata queries
  • Problems
  • Incompatible protocols
  • Each require custom client
  • Partitions available data sets and storage
    devices
  • Each protocol has subset of desired functionality

47
A Common, Secure, Efficient Data Access Protocol
  • Common, extensible transfer protocol
  • Decouple low-level data transfer mechanisms from
    the storage service
  • Advantages
  • New, specialized storage systems are
    automatically compatible with existing systems
  • Existing systems have richer data transfer
    functionality
  • Interface to many storage systems
  • HPSS, DPSS, file systems
  • Plan for SRB integration

48
A Universal Access/Transport Protocol
  • Suite of communication libraries and related
    tools that support
  • GSI security
  • Third-party transfers
  • Parameter set/negotiate
  • Partial file access
  • Reliability/restart
  • Logging/audit trail
  • All based on a standard, widely deployed protocol
  • Integrated instrumentation
  • Parallel transfers
  • Striping (cf DPSS)
  • Policy-based access control
  • Server-side computation
  • later

49
And the Universal Protocol is GSI-FTP
  • Why FTP?
  • Ubiquity enables interoperation with many
    commodity tools
  • Already supports many desired features, easily
    extended to support others
  • Well understood and supported
  • We use the term GSI-FTP to refer to
  • Transfer protocol which meets requirements
  • Family of tools which implement the protocol
  • Note: GSI-FTP > FTP
  • Note that despite name, GSI-FTP is not restricted
    to file transfer!

50
Replica Management
  • Maintain a mapping between logical names for
    files and collections and one or more physical
    locations
  • Important for many applications
  • Example: CERN HLT data
  • Multiple petabytes of data per year
  • Copy of everything at CERN (Tier 0)
  • Subsets at national centers (Tier 1)
  • Smaller regional centers (Tier 2)
  • Individual researchers will have copies

51
Our Approach to Replica Management
  • Identify replica cataloging and reliable
    replication as two fundamental services
  • Layer on other Grid services GSI, transport,
    information service
  • Use LDAP as catalog format and protocol, for
    consistency
  • Use as a building block for other tools
  • Advantage
  • These services can be used in a wide variety of
    situations

52
Replica Manager Components
  • Replica catalog definition
  • LDAP object classes for representing
    logical-to-physical mappings in an LDAP catalog
  • Low-level replica catalog API
  • globus_replica_catalog library
  • Manipulates replica catalog add, delete, etc.
  • High-level reliable replication API
  • globus_replica_manager library
  • Combines calls to file transfer operations and
    calls to low-level API functions create,
    destroy, etc.

53
Replica Catalog Structure: A Climate Modeling
Example
[Figure: replica catalog tree; recoverable labels follow]
Replica Catalog
Logical Collection: CO2 measurements 1998
Logical Collection: CO2 measurements 1999
Filename: Jan 1998, Filename: Feb 1998
Logical File Parent
Location: jupiter.isi.edu
  Filename: Mar 1998, Filename: Jun 1998, Filename: Oct 1998
  Protocol: gsiftp
  UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
Location: sprite.llnl.gov
  Filename: Jan 1998, Filename: Dec 1998
  Protocol: ftp
  UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi
Logical File: Jan 1998
Logical File: Feb 1998
Size: 1468762
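
The logical-to-physical mapping in this example can be sketched as a lookup that builds one URL per location holding the file. This is a minimal sketch, assuming the physical URL is the location's UrlConstructor joined with the URL-quoted filename; that joining rule, the dictionary layout, and the function name are assumptions, not the globus_replica_catalog API:

```python
from urllib.parse import quote

# Locations and filenames as listed in the climate modeling example.
CATALOG = {
    "CO2 measurements 1998": [
        {"url_constructor": "gsiftp://jupiter.isi.edu/nfs/v6/climate",
         "filenames": ["Mar 1998", "Jun 1998", "Oct 1998"]},
        {"url_constructor": "ftp://sprite.llnl.gov/pub/pcmdi",
         "filenames": ["Jan 1998", "Dec 1998"]},
    ],
}

def physical_urls(collection, logical_file, catalog=CATALOG):
    """Return one physical URL for each location that holds the logical file."""
    urls = []
    for location in catalog.get(collection, []):
        if logical_file in location["filenames"]:
            base = location["url_constructor"].rstrip("/")
            urls.append(base + "/" + quote(logical_file))
    return urls
```

A logical file held at several locations yields several URLs, which is the input a replica selection service would choose from.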
54
Replica Catalog Services as Building Blocks
Examples
  • Combine with information service to build replica
    selection services
  • E.g. find best replica using performance info
    from NWS and MDS
  • Use of LDAP as common protocol for info and
    replica services makes this easier
  • Combine with application managers to build data
    distribution services
  • E.g., build new replicas in response to frequent
    accesses
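
A replica selection service built from these pieces might rank copies by estimated transfer time. A minimal sketch, assuming the information service supplies a latency and a bandwidth figure per replica and that transfer time is latency plus size over bandwidth; the cost model, site names, and numbers are hypothetical:

```python
def estimated_seconds(size_bytes, latency_s, bandwidth_bps):
    """Simple cost model: startup latency plus time to move the bits."""
    return latency_s + size_bytes * 8 / bandwidth_bps

def select_replica(size_bytes, replicas):
    """Pick the replica with the lowest estimated transfer time.

    replicas: {replica_name: (latency_s, bandwidth_bps)}
    """
    return min(replicas,
               key=lambda name: estimated_seconds(size_bytes, *replicas[name]))
```

Under this model the best copy depends on file size: a high-bandwidth but distant replica wins for large files, while a nearby low-bandwidth one can win for small files.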

55
Relationship to Metadata Catalogs
  • Metadata services describe data contents
  • Have defined a simple set of object classes
  • Must support a variety of metadata catalogs
  • MCAT being one important example
  • Others include LDAP catalogs, HDF
  • Community metadata catalogs
  • Agree on set of attributes
  • Produce names needed by replica catalog
  • Logical collection name
  • Logical file name

56
A Model Architecture for Data Grids
[Figure labels: Application; Attribute Specification; Metadata Catalog; Logical Collection and Logical File Name; Replica Catalog; Multiple Locations; Replica Selection; Selected Replica; MDS; NWS; Performance Information and Predictions; gsiftp commands; Replica Location 1, 2, 3; Disk Cache; Tape Library; Disk Array; Disk Cache]
57
Fault Detection: Globus Heartbeat Monitor
  • Detect and report failure of component of a
    computation
  • Limited by ability to distinguish between network
    partition and system failure
  • Optionally used within Globus Toolkit to monitor
    status of system processes
  • Can also be used to construct special fault
    monitors for applications
  • Example: NetSolve
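
The detection idea can be sketched as a monitor that records the last heartbeat from each registered process and reports those that have been silent too long. This is an illustration under stated assumptions, not the Globus Heartbeat Monitor interface; the class and method names are hypothetical:

```python
class HeartbeatMonitor:
    """Track last-heartbeat times and report processes that went silent."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._last_seen = {}  # process name -> time of last heartbeat

    def register(self, name, now):
        self._last_seen[name] = now

    def unregister(self, name):
        self._last_seen.pop(name, None)

    def heartbeat(self, name, now):
        if name in self._last_seen:  # ignore processes that never registered
            self._last_seen[name] = now

    def suspected_failed(self, now):
        """Names whose last heartbeat is older than the timeout."""
        return sorted(name for name, seen in self._last_seen.items()
                      if now - seen > self.timeout_s)
```

The result is deliberately called "suspected": a missing heartbeat looks the same whether the host crashed or the network partitioned.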

58
Fault Detection (cont.)
  • Goal: make low-level operations reliable
  • No libraries for checkpoint and restart
  • Can't checkpoint a socket
  • Only application knows how to checkpoint and
    restart
  • Likewise, storage system must do logging

59
Heartbeat Monitor
[Figure labels: application-level fault handler; system monitoring tools; process and host heartbeat (Host 1, Host 2); process status inquiry; register/unregister]
60
Grid Enabled Tools
  • Message Passing Interface
  • Multi-method communication, specialized
  • CAVERNsoft
  • Shared state for collaborative environments
  • Condor, Nimrod-G
  • High-throughput computing
  • User level tools
  • FTP, SSH

61
Thursday, September 7
  • How does grid computing differ from traditional
    distributed computing?
  • Where do grids get their names?
  • Grid hardware
  • Grid applications

62
Distributed Computing: A Quick Review
  • Andrew Tanenbaum:
  • "A distributed system is a collection of
    independent computers that appear to the users of
    the system as a single computer."

63
Distributed Systems Hardware
  • Distributed in the local area
  • Memory organization
  • Shared-memory multiprocessors
  • Single virtual address space shared by all CPUs
  • Multicomputers with private memories
  • Separate address spaces
  • Interconnection network organization
  • Bus-based
  • A single shared network, backplane, bus or cable
  • Switch-based
  • Individual connections between machines

64
Simplest Hardware: A Bus-based Shared-Memory
Multiprocessor
Processor
Processor
Processor
Memory
Cache
Cache
Cache
Bus
  • Shared memory
  • Caches must be kept consistent
  • Bus bandwidth limits the system to 64 processors

65
Bus-based Distributed Shared-Memory
(DSM) Multiprocessor
Memory
Memory
Memory
Memory
Cache
Cache
Cache
Cache
Processor
Processor
Processor
Processor
Bus
  • Each processor contains a portion of shared
    memory
  • Local accesses fast, remote accesses slow
  • NUMA: non-uniform memory access

66
Switch-Based Multicomputer: Workstation Cluster
Work-station
Work-station
Ethernet Switch
Work-station
Work-station
Work-station
Work-station
  • Workstations share resources:
    file servers, printers, storage archives
  • Schedule jobs
  • Use idle workstations

67
Hardware: What is different in a grid?
  • Heterogeneous hardware environment
  • computing platforms
  • network connections
  • storage systems and caches
  • Wide-area distribution
  • Wide-area network latency and bandwidth
  • Resources in different administration domains
  • Dynamic environment
  • Resources enter and leave grid

68
Software Issues in Distributed Operating
Systems
  • Communication models
  • Client-Server Model
  • Remote procedure call
  • Group communication
  • In a grid
  • Algorithms must tolerate wide-area latency for
    message transfers
  • Avoid large numbers of messages
  • Typically perform larger transfers, initiate
    remote jobs rather than procedure calls
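
The advice to avoid large numbers of messages follows from simple arithmetic. A minimal sketch, assuming each message pays a fixed latency and the payload moves at a fixed bandwidth; the cost model and the numbers in the note below are illustrative assumptions:

```python
def transfer_seconds(num_messages, total_bytes, latency_s, bandwidth_bytes_per_s):
    """Total time to move `total_bytes` split across `num_messages` messages."""
    return num_messages * latency_s + total_bytes / bandwidth_bytes_per_s
```

With an assumed 50 ms wide-area latency and 1 MB/s of bandwidth, moving 1 MB as 1000 small messages costs about 51 seconds under this model, while one large transfer costs about 1.05 seconds.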

69
Software Issues in Distributed Operating
Systems
  • Synchronization
  • Clock synchronization
  • Election algorithms determine a coordinator
  • Atomic transactions
  • In a grid
  • With wide-area latencies, typically perform
    synchronization on larger grain
  • Can implement atomic operations

70
Software Issues in Distributed Operating Systems
  • Processes and Processors
  • Threads
  • Allocating Processors
  • Scheduling and co-scheduling resources
  • Fault tolerance
  • In a grid: scheduling, allocation, fault
    tolerance issues get more complicated in the wide
    area environment

71
Software Issues in a Distributed Operating
System
  • Distributed file systems
  • File service that reads and writes file, controls
    access
  • Creating, deleting, managing directories
  • Naming
  • Sharing
  • Caching and consistency
  • Replication and updates
  • In a grid, same issues complicated by wide area
    distribution, different administrative domains,
    enormous data sets

72
Software Issues for a Distributed Operating
System
  • Distributed Shared Memory
  • Generally applies to machines in a LAN
  • Each processor contains memory corresponding to
    part of the shared memory address space
  • Each processor caches data from other processors
  • Many consistency algorithms
  • In a grid: EASIER! Globus does not support a
    shared address space
  • Legion has a single shared object space

73
Summary: Heterogeneity makes things harder in a
grid
  • Heterogeneous software and hardware
  • Different administrative domains
  • Different policies for use and management of
    local resources
  • Must do coordinated scheduling
  • Different security policies
  • Dynamic environment
  • Must discover resources
  • Robust in the presence of network, resource
    failures

74
Today
  • How does grid computing differ from traditional
    distributed computing?
  • Where do grids get their names?
  • Grid Hardware
  • Grid Applications

75
Where do computational grids get their names?
  • A computational grid is a hardware and software
    infrastructure that provides dependable,
    consistent, pervasive, and inexpensive access to
    high-end computational capabilities.
  • Name (and definition) imply an analogy to the
    electric power grid
  • Power: inexpensive, universally available
  • Enabled new devices and industries

76
An Infrastructure Analogy: The Electric Power Grid
  • Revolutionary development: transmission and
    distribution of electricity
  • Before: power accessible in crude forms
  • human work
  • horses
  • water power
  • steam engines
  • Today: cheap, reliable power universally
    available

77
Electric Power Grid (cont.)
  • Power to billions of devices
  • Efficient
  • Low-cost
  • Reliable
  • North America: 10,000 generators linked to
    billions of outlets
  • Heterogeneous components, distributed ownership
  • Interconnections between regions:
    share reserve capacity, trade excess power

78
Electric Power Grid (cont.)
  • Required more than just technology
  • Regulatory, political and institutional
    development
  • Infrastructure for monitoring and management
  • Huge social impact
  • Fundamentally changed work and home life
  • Huge environmental impact
  • Consume resources, generate pollution, global
    warming,

79
Another Infrastructure Analogy: Railroads and
the Rise of Chicago
  • Early 1800s: Chicago was a small field of onions
    on a very large lake
  • Impact of railroad infrastructure
  • Trains used for shipment of goods
  • Chicago a cache for agricultural products
  • New financial institutions, technologies and
    industries
  • Board of Trade
  • Stockyards, refrigerated rail cars
  • Midwest's native ecosystems destroyed
  • Bison replaced by cattle
  • Prairie replaced by wheat and corn fields

80
New Infrastructure Has Serious Social
Consequences
  • More examples: highway system, telephone
    network, banking system
  • What changes will the Grid infrastructure bring
    about?
  • Enable unimagined applications
  • Likely to have positive and negative effects
  • Are we ready to deal with the rate of change?
  • Processing power, bandwidth, storage all growing
    exponentially

81
Based on Infrastructure Analogies: Desired
Characteristics of Grids
  • Pooling of resources
  • Compute cycles, data, people, sensors
  • Dependable service
  • Predictable
  • Sustained performance
  • Often high-performance

82
Grid Characteristics (cont.)
  • Consistent service
  • Standard services available
  • Via standard interfaces
  • Enable application development
  • Pervasive
  • Services always available
  • Inexpensive
  • Otherwise not widely accepted and used

83
Application Examples
  • Online instrumentation
  • Distributed supercomputing
  • Collaborative engineering
  • High-throughput computing
  • Remote job submission, meta-queueing

84
Online Instrumentation
Advanced Photon Source
wide-area dissemination
desktop VR clients with shared controls
real-time collection
archival storage
tomographic reconstruction
DOE X-ray grand challenge: ANL, USC/ISI, NIST,
U.Chicago
85
Grid Applications: Distributed Supercomputing
  • Solve problems that cannot be solved using a
    single system
  • Example application: distributed, interactive
    simulation involving 100,000s of entities
  • Difficult issues
  • Co-scheduling of scarce, expensive resources
  • Algorithms that scale to many nodes and tolerate
    latency
  • Achieving and sustaining high performance across
    heterogeneous systems

86
Globus Example: Distributed Supercomputing
  • SF-Express: distributed, interactive simulation
  • 100K vehicles (2002 goal) using 13 computers,
    1386 nodes, 9 sites
  • Largest DIS ever done
  • Globus mechanisms for
  • Resource allocation
  • Distributed startup
  • I/O and configuration
  • Fault detection

NCSA Origin
Caltech Exemplar
CEWES SP
Maui SP
P. Messina et al., Caltech
87
Grid Applications: High-Throughput Computing
  • Schedule large numbers of loosely-coupled or
    independent tasks
  • Tie together idle workstations
  • Put unused cycles to work
  • Example applications: chip design, solving
    cryptographic problems
  • Systems
  • Condor: manages pool of hundreds of workstations
    around the world
  • Entropia: startup company

88
High-Throughput Computing: SETI@home
89
Grid Applications: Data-Intensive Computing
  • Geographically distributed data repositories,
    digital libraries and databases
  • Up to petabytes of data
  • Example applications: high-energy physics
    experiments, climate modeling, human genome
    project databases
  • Challenging Issues
  • High-performance data transfers in wide-area
    environments
  • Management of caching and replication

90
Globus Data-Intensive Computing
How do midwest flood frequencies under 2xCO2
scenario compare with historical data?
91
Grid Applications: Collaborative Computing
  • Enabling and enhancing human interactions
  • Virtual shared spaces
  • Shared resources: data archives, simulations
  • Example applications: collaborative design or
    collaborative exploration of data sets
  • Challenges
  • Real-time requirements of human perception
  • Rich interactions

92
Globus Example: Collaborative Engineering
  • Manipulate shared virtual space
  • Simulation components
  • Multiple flows: Control, Text, Video, Audio,
    Database, Simulation, Tracking, Haptics, Rendering
  • Uses Globus communication
CAVERNsoft: UIC Electronic Visualization
Laboratory
93
Grids Changing Science
  • NSF National Earthquake Engineering Center
  • Integrated instrumentation, collaboration,
    simulation environment
  • National Environmental
  • High-energy Physics Grid (GriPhyN)
  • CERN Data Grid

94
Current and Future Applications
  • Interesting applications exist today
  • More sophisticated applications will follow
  • Characteristics
  • Appetite for resources (CPU, memory, storage)
  • Synchronization
  • Only satisfied by multiple systems
  • Need high availability of resources

95
Who will use grids?
  • 1. Governments
  • Disaster response, national defense, national
    collaboratory, strategic
    computing reserve
  • 2. Private grids for institutions
  • Relatively low-cost, small-scale
  • Central management
  • Example: hospitals and medical personnel

96
Who will use grids? (cont.)
  • 3. Virtual Grid:
    multi-institution collaboration
  • Large, fluid, highly-distributed community
  • Hundreds of researchers and students around the
    world
  • Share instruments, data archives, software,
    computers
  • 4. Public Grid
  • Enormous community
  • Consumers, service providers, resource providers,
    network providers

97
Summary
  • Grids will change the way we do science and
    engineering
  • Transition of services and application to
    production use
  • Future will see increased sophistication and
    scope of services, tools, and applications