Title: Grid
1Grid and Globus
2Part 1 Grid
3A Story: Late 80s, early 90s
- Gigabit testbeds program
- CASA (southwest)
- MAGIC and BLANCA (Midwest)
- AURORA and NECTAR (northeast)
- VISTANET (southeast)
- Focus on connection and bandwidth
4Supercomputing 95
- I-WAY, the first large-scale Grid experiment
- Consisted of a Grid of 17 sites connected by vBNS
- Over 60 applications ran on the I-WAY during SC95
- Users could use single authentication and job
submission across multiple sites or they could
work directly with end-users
5Mid to late 90s
- Academic software projects (e.g., Legion, Globus)
- Application experiments (e.g., IPG, ASCI,
National Technology Grid)
To be continued
6What is Grid
7Electrical power grid vs. Grid
- Electricity vs. knowledge, transactions (from HPCC, high performance computers)
- Electrical network vs. WAN, LAN
- Appliances (applications) vs. terminals (applications)
8Electrical power grid
- Dependable: electricity is always there. When the switch is
thrown, the light always comes on.
- Consistent: the uniform interface means that all appliances
use the same connector.
- Pervasive: electrical service is virtually everywhere,
throughout the world.
9Grid
- Dependable: can provide performance and functionality
guarantees
- Consistent: uniform interfaces to a wide variety of
resources
- Pervasive: ability to plug in from anywhere
10What is Grid ?
- Enabling the coordinated use of geographically
distributed resources in the absence of
central control, omniscience, strong trust
relationships
- Source: Ian Foster (Globus), "What, exactly, is the Grid?"
- http://access.ncsa.uiuc.edu/CoverStories/WhatisGrid/
11What is special ?
- High performance
- Bandwidth, speed, architecture
- Shared resources
- Computer, software, data, instrument
- Single system image
- single, coherent virtual machine
- Knowledge generating
12Taxonomy
- Grid
- Computational Grid: Distributed Supercomputing, High Throughput
- Data Grid
- Service Grid: On Demand, Collaborative, Multimedia
13Computational Grid
- The computational Grid category denotes
systems that have a higher aggregate
computational capacity available for single
applications than the capacity of any constituent
machine in the system.
14Data Grid
- The data Grid category is for systems that
provide an infrastructure for synthesizing new
information from data repositories such as
digital libraries or data warehouses that are
distributed in a wide area network.
- Example: TIDE, the Tele-Immersive Data Explorer (USA)
15Service Grid
- The service Grid category is for systems that
provide services that are not provided by any
single machine.
16Distributed Supercomputing
- Distributed Particle Physics Research
- California Institute of Technology, USA
- CERN
17High Throughput
- SETI@home: the Search for Extraterrestrial Intelligence
- IBM ASCI White: 12.3 teraFLOPS
18On Demand
- Advanced Networking for Telemicroscopy
- National Center for Microscopy and Imaging Research, UCSD, USA
- SDSC, USA
- Research Center for Ultra-High Voltage Electron Microscopy, Osaka Univ., Japan
- NLANR, USA
- Lawrence Berkeley National Laboratory, USA
19Collaborative
- CyberCAD: Internet Distributed Interactive Collaborative Design
- Temasek Polytechnic, Singapore
- National University of Singapore
- Indiana University, USA
20Multimedia
- MediaZine: A Combination of Television, WWW,
Telecommunications and 3D Computer Graphics
- Fraunhofer Institute Graphische Datenverarbeitung (IGD), Germany
- Centre for Advanced Media Technology (CAM Tech), Singapore
21Grid vs Traditional distributed computing
- Heterogeneous hardware environment
- computing platforms
- network connections
- storage systems and caches
- Wide-area distribution
- Wide-area network latency and bandwidth
- Resources in different administration domains
- Dynamic environment
- Resources enter and leave grid
22Research areas
- Architecture (Jini)
- Communication and remote access to data
- Information services
- Resource management
- Task scheduling
- Performance
- User management
- Security
- User Environment (programming model)
- Applications
23Communication models
- Client-server model, remote procedure call, group communication
- In a grid:
- Algorithms must tolerate wide-area latency for message transfers
- Avoid large numbers of messages
- Typically perform larger transfers, initiate
remote jobs rather than procedure calls
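The advice to avoid large numbers of messages can be illustrated with a simple cost model. This is a minimal sketch, assuming messages are sent one after another and each pays a fixed latency; the latency, bandwidth and data sizes are made-up illustrative numbers, not measurements.

```python
def transfer_time(num_messages, total_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to move total_bytes split evenly across num_messages.

    Each message pays one latency; the payload pays size / bandwidth.
    """
    if num_messages < 1:
        raise ValueError("num_messages must be at least 1")
    return num_messages * latency_s + total_bytes / bandwidth_bytes_per_s


# Illustrative numbers only (assumptions, not measurements).
LATENCY_S = 0.05            # 50 ms wide-area latency per message
BANDWIDTH = 10_000_000      # 10 MB/s
TOTAL_BYTES = 100_000_000   # 100 MB of data to move

many_small = transfer_time(10_000, TOTAL_BYTES, LATENCY_S, BANDWIDTH)
few_large = transfer_time(10, TOTAL_BYTES, LATENCY_S, BANDWIDTH)

print(f"10,000 small messages: {many_small:.1f} s")  # 510.0 s
print(f"10 large messages:     {few_large:.1f} s")   # 10.5 s
```

Under these assumptions the payload time is the same in both cases (10 s); the difference comes entirely from the per-message latency, which is why larger transfers are preferred in the wide area.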
24Synchronization
- Clock synchronization, election algorithms (to determine a coordinator), atomic transactions
- In a grid:
- With wide-area latencies, typically perform synchronization at a coarser grain
- Can implement atomic operations
25Processes and Processors
- Threads, allocating processors, scheduling and co-scheduling resources, fault tolerance
- In a grid: scheduling, allocation and fault tolerance
issues get more complicated in the wide-area
environment
26Distributed file systems
- File service (read, write, access control), creating, deleting and managing directories,
naming, sharing, caching and consistency, replication and updates
- In a grid: the same issues, complicated by wide-area
distribution, different administrative
domains and enormous data sets
27Some projects
- Globus
- Legion
- IPG (Information Power Grid)
- TeraGrid
- European DataGrid
- NHPCE (National High Performance Computing Environment)
- Peer to Peer computing
28Globus
- Ian Foster (ANL/UC) and Carl Kesselman (USC/ISI)
- Bag of services model
- Globus Metacomputing Toolkit (GMT): GRAM, Nexus, MDS, GSI, HBM, GASS
- Globus Ubiquitous Supercomputing Testbed Organization (GUSTO)
- http://www.globus.org
29Legion
- Andrew Grimshaw (UVA)
- Single, coherent virtual machine model
- Everything is an object (all hardware and software components)
- Interface and basic functionality: a set of core object types
- Users can define their own classes
- Centurion and the NPACI-net testbeds
- http://legion.virginia.edu/
30Information Power Grid
- NASA
- Large-scale science and engineering
- Globus providing the Grid Common Services
- Computing resources: 800 CPU nodes in half a
dozen SGI Origin 2000s and several workstation clusters
- WAN interconnects of at least 100 Mbit/s
- Storage resources: 50-100 terabytes of archival
information/data storage
- http://www.ipg.nasa.gov/
31TeraGrid (August 2001)
- NSF (NCSA, SDSC, ANL, CIT), IBM
- Billed as the world's largest, fastest, most comprehensive
distributed infrastructure
- Built from four individual clusters
- $53 million in funding
- 13.6 teraflops of computing power
- 450 terabytes of storage
- 40 gigabits/second network
- http://www.teragrid.org/
32European DataGrid
- CERN: the European Organisation for Nuclear
Research (20 European countries, 2,700 staff, 6,000 users)
- Project objectives
- Middleware for fabric and Grid management
- Large-scale testbed
- Production-quality demonstrations
- http://www.eu-datagrid.org
33NHPCE
- National High Performance Computing Environment
- Beijing, Shanghai, Hefei, Xi'an, Changsha, Chengdu
- Infrastructure, Gridware
- Weather forecasting, bioinformation, industry
applications, information services
- http://www.grid.org.cn/
34Peer-to-peer computing
- Intel (Marquam)
- The sharing of computer resources and services by
direct exchange between systems
- Information, processing cycles, cache storage,
and disk storage for files
- Collaboration, edge services, distributed
computing and resources, intelligent agents
- http://www.peer-to-peerwg.org/
35Lists
- http://www.gridforum.org/info/Initiatives.html
- http://www.globus.org/about/related.html
- http://www.gridcomputing.com/
36Part 2 Globus
37Globus Project
- 1996, Argonne National Lab.
- Basic research in grid-related technologies
- Development of Globus Toolkit
- Construction of production grids and testbeds
- Application experiments
- Globus Toolkit 1.1.3: Jun 2000
- Globus Toolkit 1.1.4: Nov 2000
- TeraGrid uses Globus as core software
38Globus Layered Architecture
- Applications
- High-level Services and Tools: GlobusView, Testbed Status, DUROC, globusrun, MPI, Nimrod/G, Condor, HPC
- Core Services: GRAM, Nexus, Metacomputing Directory Service, Globus Security Interface, Heartbeat Monitor, I/O, GASS
- Local Services: Condor, MPI, TCP, UDP, LSF, NQE, PBS, AIX, Linux, Solaris
39Globus Toolkit Services
- Security
- Grid Security Infrastructure (GSI)
- Information Service
- Grid Information System (GIS)
- Resource Management
- Globus Resource Allocation Manager (GRAM)
- Data Management
- Globus Access to Secondary Storage (GASS)
- Communication
- NEXUS, Globus_io
- Process Monitoring
- Heart Beat Monitor (HBM)
401. Security
- GSI: Grid Security Infrastructure
- Uses GSSAPI (a Generic Security Services API
implementation based on SSL)
- Single sign-on for all resources
- Mutual authentication
41GSI User Management
- Grid user ID, certificate and key
- Example
- grid-cert-request
- grid-proxy-init
- Enter PEM pass phrase:
- Gridmap file
- "/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup" rpg
42GSI - Secure Remote Startup
1. Exchange certificates
2. Check gridmap file
3. Look up service
4. Run service program
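The gridmap check in step 2 can be sketched as a lookup from an authenticated certificate subject to a local account. This is a minimal illustration, not the Globus implementation: it assumes each gridmap line holds a double-quoted subject name (written with `=` separators) followed by one local user name, and the helper names `parse_gridmap` and `local_user_for` are hypothetical.

```python
import shlex


def parse_gridmap(text):
    """Return a dict mapping certificate subject names to local user names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the double quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"malformed gridmap line: {line!r}")
        subject, local_user = parts
        mapping[subject] = local_user
    return mapping


def local_user_for(subject, mapping):
    """Return the local account for an authenticated subject, or None."""
    return mapping.get(subject)


# Assumed file content, based on the gridmap entry shown on the slide.
GRIDMAP = '"/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup" rpg\n'

users = parse_gridmap(GRIDMAP)
print(local_user_for("/O=Globus/O=NPACI/OU=SDSC/CN=Rich Gallup", users))  # rpg
print(local_user_for("/O=Globus/CN=Unknown User", users))                 # None
```

A subject that is not listed maps to no local account, so the request would be refused before any service is looked up or started.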
432. Information Service
- GIS: Grid Information System
- Based on LDAPv3
- Critical for Grid
- Resource Discovery: What resources are available?
- Resource Selection: What is the state of the grid?
- Application Configuration and Adaptation: How to optimize resource use?
44GIS Structure: Classic
- Centralized
- All sites use the same server, mds.globus.org
45GIS Structure: Standard (1)
- Decentralized: GRIS and GIIS
- Resource Description Services: Grid Resource Information Service (GRIS)
- One GRIS server runs on each resource
- Provides the state of the local resource
46GIS Structure: Standard (2)
- Aggregate Directory Services: Grid Index Information Service (GIIS)
- Gathers information from multiple GRIS servers
- Each GIIS is optimized for particular queries
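The split between per-resource GRIS servers and an aggregating GIIS can be sketched with in-memory objects. This is only an illustration of the structure: the real services answer LDAP queries, and the class layout, host names and attribute names (`cpus`, `free_memory_mb`) below are assumptions.

```python
class GRIS:
    """Reports the state of one local resource (hypothetical stand-in)."""

    def __init__(self, name, state):
        self.name = name
        self.state = dict(state)

    def query(self):
        # A real GRIS would answer an LDAP search; here we return a copy.
        return {"name": self.name, **self.state}


class GIIS:
    """Index that gathers records from several GRIS servers."""

    def __init__(self, gris_servers):
        self.gris_servers = list(gris_servers)

    def search(self, predicate):
        # Pull the current state from every registered GRIS, then filter.
        records = [g.query() for g in self.gris_servers]
        return [r for r in records if predicate(r)]


index = GIIS([
    GRIS("hostA", {"cpus": 64, "free_memory_mb": 512}),
    GRIS("hostB", {"cpus": 8, "free_memory_mb": 128}),
    GRIS("hostC", {"cpus": 32, "free_memory_mb": 32}),
])

# Resource discovery and selection: at least 16 CPUs and 64 MB free.
matches = index.search(lambda r: r["cpus"] >= 16 and r["free_memory_mb"] >= 64)
print([r["name"] for r in matches])  # ['hostA']
```

A client asks one GIIS instead of contacting every resource; a production index would also cache GRIS answers, which this sketch leaves out.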
473. Resource Management
48Major Components (1)
- RSL: Resource Specification Language
- Information exchange between components
- Describes resource requirements and job configuration
- Example
- & (count>=5) (count<=10) (max_time=240) (memory>=64) (executable=myprog)
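To make the RSL example concrete, the sketch below parses a flat conjunction of relations and checks it against a resource offer. It is a simplified illustration, not the Globus RSL parser: it assumes each relation is written as `(attribute operator value)`, handles no nesting or disjunction, and treats attributes missing from the offer (such as `executable`) as job configuration rather than constraints. The function names and the offer dictionary are hypothetical.

```python
import operator
import re

# Relational operators supported by this sketch.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

# Two-character operators are listed first so ">=" is not read as ">".
RELATION = re.compile(r"\(\s*(\w+)\s*(>=|<=|!=|=|>|<)\s*([^()\s]+)\s*\)")


def parse_rsl(text):
    """Parse a flat conjunction such as '&(count>=5)(memory>=64)'."""
    return [(attr, op, value) for attr, op, value in RELATION.findall(text)]


def satisfies(offer, relations):
    """Check whether a numeric resource offer meets every constraint on it."""
    for attr, op, value in relations:
        if attr not in offer:
            continue  # not a constraint on the offer (e.g. executable)
        if not OPS[op](offer[attr], int(value)):
            return False
    return True


request = "&(count>=5)(count<=10)(max_time=240)(memory>=64)(executable=myprog)"
relations = parse_rsl(request)

print(satisfies({"count": 8, "max_time": 240, "memory": 128}, relations))  # True
print(satisfies({"count": 4, "max_time": 240, "memory": 128}, relations))  # False
```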
49Major Components (2)
- GRAM: Globus Resource Allocation Manager
- Processes requests for execution on remote resources
- Allocates local resources required by remote requests
- Provides an API to submit jobs, query job status
and cancel jobs
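The three operations listed above (submit, status query, cancel) can be sketched with a toy job manager. This is a hypothetical stand-in, not the GRAM API: the class, method names and state names are assumptions, and jobs that do not fit are simply left pending rather than queued and rescheduled.

```python
import itertools


class JobManager:
    """Toy stand-in for a resource allocation manager (not the GRAM API)."""

    def __init__(self, free_cpus):
        self.free_cpus = free_cpus
        self.jobs = {}
        self._ids = itertools.count(1)

    def submit(self, executable, count):
        """Accept a request and allocate local CPUs if enough are free."""
        job_id = next(self._ids)
        if count <= self.free_cpus:
            self.free_cpus -= count
            state = "ACTIVE"
        else:
            state = "PENDING"
        self.jobs[job_id] = {"executable": executable, "count": count, "state": state}
        return job_id

    def status(self, job_id):
        """Return the current state of a job."""
        return self.jobs[job_id]["state"]

    def cancel(self, job_id):
        """Cancel a job and release any CPUs it was holding."""
        job = self.jobs[job_id]
        if job["state"] == "ACTIVE":
            self.free_cpus += job["count"]
        job["state"] = "CANCELLED"


manager = JobManager(free_cpus=8)
first = manager.submit("myprog", 5)
second = manager.submit("myprog", 5)
print(manager.status(first), manager.status(second))  # ACTIVE PENDING

manager.cancel(first)
print(manager.status(first), manager.free_cpus)       # CANCELLED 8
```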
50Major Components (3)
- DUROC: Dynamically Updated Request Online Co-allocator
- Co-allocation: simultaneous allocation of a resource set
- Example
- +(&(count=5)(memory>=64)(executable=p1))(&(network=atm)(executable=p2))
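Co-allocation means the whole resource set is obtained together or not at all. The sketch below shows that all-or-nothing behaviour with a rollback on failure. It is a minimal illustration under simplifying assumptions (sequential allocation, no dynamic updates to the request), and the `Resource` class and `co_allocate` function are hypothetical, not DUROC's interface.

```python
class Resource:
    """A pool of identical nodes managed by one site (hypothetical)."""

    def __init__(self, name, free_nodes):
        self.name = name
        self.free_nodes = free_nodes

    def allocate(self, count):
        if count > self.free_nodes:
            return False
        self.free_nodes -= count
        return True

    def release(self, count):
        self.free_nodes += count


def co_allocate(requests):
    """Allocate every (resource, count) pair, or none of them."""
    granted = []
    for resource, count in requests:
        if resource.allocate(count):
            granted.append((resource, count))
        else:
            # One subrequest failed: undo the allocations made so far.
            for done_resource, done_count in granted:
                done_resource.release(done_count)
            return False
    return True


site_a = Resource("siteA", 8)
site_b = Resource("siteB", 1)

print(co_allocate([(site_a, 5), (site_b, 1)]))  # True
print(co_allocate([(site_a, 3), (site_b, 1)]))  # False, siteB has no nodes left
print(site_a.free_nodes)                        # 3, the partial grant was rolled back
```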
51Major Components (4)
- GARA: General-purpose Architecture for
Reservation and Allocation
- Second-generation resource management service
- Advance reservation of resources
- Currently a research prototype
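Advance reservation amounts to admitting a request only if capacity is still available over the whole requested time window. The following is a minimal sketch of that admission check for a single resource; it is not GARA's interface, and the `ReservationTable` class, the abstract time units and the capacity numbers are assumptions.

```python
class ReservationTable:
    """Advance reservations for a single resource with fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.reservations = []  # (start, end, amount), end is exclusive

    def _usage_at(self, t):
        return sum(a for s, e, a in self.reservations if s <= t < e)

    def reserve(self, start, end, amount):
        """Admit the reservation only if capacity holds over [start, end)."""
        if start >= end:
            raise ValueError("start must be before end")
        # Usage only rises at reservation start times, so checking the new
        # start plus every existing start inside the window is sufficient.
        checkpoints = [start] + [s for s, _, _ in self.reservations if start < s < end]
        if any(self._usage_at(t) + amount > self.capacity for t in checkpoints):
            return False
        self.reservations.append((start, end, amount))
        return True


table = ReservationTable(capacity=10)
print(table.reserve(0, 4, 6))  # True
print(table.reserve(2, 6, 6))  # False, would need 12 units at time 2
print(table.reserve(4, 8, 6))  # True, the first reservation has ended by time 4
```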
52Interesting Things
- Broker and Meta-Scheduler
- No implementation in Globus Toolkit
- Nimrod/G and AppLeS are built on GRAM and GIS
- PBS, LSF
- Future
- Reservation, failure management, security, XML
534. Data Management
- GASS: Globus Access to Secondary Storage
- Lets GRAM access remote data
- GridFTP
- For high-performance, reliable data transfer in
the Grid environment
- Replica Management
- A mapping from a logical file name to one or more physical locations
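The replica mapping can be sketched as a small catalog that resolves a logical name to the physical copies registered for it. This is an illustrative data structure only, not the Globus replica catalog; the class, the selection rule and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


class ReplicaCatalog:
    """Maps a logical file name to the physical locations of its copies."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, physical_url):
        urls = self._replicas.setdefault(logical_name, [])
        if physical_url not in urls:
            urls.append(physical_url)

    def lookup(self, logical_name):
        return list(self._replicas.get(logical_name, []))

    def select(self, logical_name, preferred_host):
        """Pick a replica on preferred_host if one exists, else the first."""
        replicas = self.lookup(logical_name)
        if not replicas:
            raise KeyError(logical_name)
        for url in replicas:
            if urlparse(url).hostname == preferred_host:
                return url
        return replicas[0]


catalog = ReplicaCatalog()
catalog.register("run42.dat", "gsiftp://host-a.example.org/data/run42.dat")
catalog.register("run42.dat", "gsiftp://host-b.example.org/mirror/run42.dat")

print(catalog.lookup("run42.dat"))
print(catalog.select("run42.dat", "host-b.example.org"))
```

An application names only the logical file; which physical copy is fetched (for example with GridFTP) is decided at lookup time.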
545. Communication (1)
- NEXUS
- 1. Uni-API multimethod communication
- Where? What? When?
- Unicast vs. multicast
- Reliable vs. unreliable
- Quality of service
- Encrypted vs. unencrypted
- Compressed vs. uncompressed
- Methods selected automatically or manually
55Communication (2)
- 2. Communication Links (CLs)
- 3. Remote Service Request (RSR)
- 4. MPI-G implementation based on NEXUS
56Communication (3)
- Globus_io
- Compatibility, security, QoS
- Easier Win32 portability
- Connection properties query
- MPI-G implementation based on Globus_io
- Future
- NEXUS -> Globus_io
57An Example Installation
58Useful Sites
- www.globus.org
- discuss@globus.org
59End
60
- BBS: HPCC
- FTP://202.38.76.228/incoming/Grid