Title: Grid Architecture Developments
1Grid Architecture Developments
- Aslam Parvez Memon, Shakil Akhtar
- Shaheed Zulfikar Ali Bhutto Institute of Science and Technology (SZABIST), Karachi, Pakistan
- http://www.szabist.edu.pk
- December 21, 2006
2Outlines
- Defining Grid Computing
- What is Grid and Grid Computing
- Grid Technology
- Why Grid
- Study of Grid Computing
- Grid Technology Problem Space
- Virtual Organization
- Virtual Organization Problem Space
- Grid Architecture
- Some Solutions
- Globus Toolkit
- Working With Grids
- Key Concepts of GT4
- Major Grid Projects
- Research Bodies
- Global Community
- Research Areas
- The Research Processes
3Ian Foster's 3-point checklist
- A Grid is a system that is able to
- coordinate resources that are not subject to centralized control
- use standard, open, general-purpose protocols and interfaces
- deliver nontrivial qualities of service.
4Defining Grid Computing
- There are several competing definitions for "The Grid" and Grid computing
- These definitions tend to focus on
- Implementation of distributed computing
- A common set of interfaces, tools and APIs
- Some stress the inter-institutional aspect of grids and Virtual Organizations
- The virtualization of resources (abstraction of resources)
5What is Grid and Grid Computing?
- Grid computing must provide these basic functions
- resource discovery and information collection and publishing
- data management on and between resources
- process management on and between resources
- common security mechanism underlying the above
- process and session recording/accounting
6Grid Technology
- Emerging enabling technology.
- Natural evolution of distributed systems and the Internet.
- Middleware supporting a network of systems to facilitate sharing, standardization and openness.
- Infrastructure and application model dealing with sharing of compute cycles, data, storage and other resources.
- Publicized by prominent industries as on-demand computing, utility computing, etc.
- A move towards delivering computing to the masses, similar to other utilities (electricity and voice communication).
- Currently used for high performance computing; however, the trend is towards Service Oriented Applications (SOA).
7Why Grid?
- What can the grid do that existing technology cannot do?
- Grid infrastructure and application architecture form a global computing framework that facilitates sharing of resources and scheduling of jobs by matching their needs with the available pool of compute and storage resources.
- Compute cycles can be tapped on demand from sources other than your own.
- Wasted cycles from idle sources can be put to use in applications that need them.
- Grid is molding computing into a utility, similar to the utilities we are used to, such as electricity and telephone.
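The idea of matching job needs against an available pool of resources can be sketched as below. This is only an illustration under stated assumptions: the `Resource` and `Job` classes, their fields, and the first-fit policy are hypothetical and do not describe any particular grid scheduler.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    """A hypothetical compute node offered to the grid."""
    name: str
    free_cpus: int
    free_storage_gb: int


@dataclass
class Job:
    """A hypothetical job with simple resource needs."""
    name: str
    cpus: int
    storage_gb: int


def match(job: Job, pool: List[Resource]) -> Optional[Resource]:
    """Return the first resource that satisfies the job (first-fit), or None."""
    for res in pool:
        if res.free_cpus >= job.cpus and res.free_storage_gb >= job.storage_gb:
            # Reserve the capacity so that later jobs see the reduced pool.
            res.free_cpus -= job.cpus
            res.free_storage_gb -= job.storage_gb
            return res
    return None


pool = [Resource("lab-pc", 2, 50), Resource("cluster-a", 16, 500)]
jobs = [Job("render", 8, 100), Job("index", 2, 20), Job("huge", 64, 10)]

placement: Dict[str, Optional[str]] = {}
for job in jobs:
    chosen = match(job, pool)
    placement[job.name] = chosen.name if chosen else None
```

Here the idle `lab-pc` picks up the small job, while the job that no resource can satisfy stays unplaced. A real scheduler would also weigh policy, priority and cost.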
8Study of Grid Computing
- Components: core, system-defined and user-defined
- Infrastructure
- Application model
- Standards
- Application Programming Interfaces
- Technology Support (enabling technologies)
- Job submission and associated functions
- Service creation and deployment and related
functions
9Grid Technology Problem Space
- Grid technologies and infrastructures support the sharing and coordinated use of diverse resources in dynamic, distributed virtual organizations.
- Grid technologies are distinct from technology trends such as Internet, enterprise, distributed and peer-to-peer computing. But these technologies can benefit from growing into the problem space addressed by grid technologies.
10The Grid Problem
- Flexible, secure, coordinated sharing of computation among dynamic collections of individuals, institutions, and resources
- Enable communities (virtual organizations) to share geographically distributed resources as they pursue common goals, assuming the absence of
- central location
- central control
- omniscience
- existing trust relationships
11Virtual Organization
- The Grid's virtual organization (VO) concept provides seamless access to federated heterogeneous resources (computers, mobile devices, network bandwidth, storage, databases, scientific instruments, servers, etc.) by creating the illusion of a supercomputing infrastructure. A grid user can have on-demand access to such resources, distributed across various organizations in different geographical locations, yet within a controlled and secure resource-sharing environment.
12Elements of the Problem
- Resource sharing
- Computers, storage, sensors, networks, ...
- Sharing is always conditional: issues of trust, policy, negotiation, payment, ...
- Coordinated problem solving
- Beyond client-server: distributed data analysis, computation, collaboration, ...
- Dynamic, multi-institutional virtual orgs
- Community overlays on classic org structures
- Large or small, static or dynamic
13The Programming Problem
- Applications require resources (compute power, storage, data, instruments, displays) at many sites for many users.
- Some requirements
- Abstractions and models to increase speed/robustness/etc. of development
- Tools to ease application development, diagnose common problems, and ease deployment
- Code/tool sharing to allow reuse of code components developed by others
14Grid must support computational workflows
- Locate suitable computers
- Authenticate with appropriate sites
- Allocate resources on those computers
- Initiate computation on those computers
- Configure those computations
- Select appropriate communication methods
- Compute with suitable algorithms
- Access data files, return output
- Respond appropriately to resource changes
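A rough sketch of such a workflow, assuming a very simplified model: sites are plain callables, "authentication" is a check against a list of trusted sites, and a resource change is signalled by a hypothetical `ResourceLost` exception. None of these names come from a real toolkit.

```python
from typing import Callable, Dict, List, Tuple


class ResourceLost(Exception):
    """Raised when a site disappears while a computation is running."""


def run_workflow(sites: Dict[str, Callable[[int], int]],
                 trusted: List[str],
                 data: int) -> Tuple[int, List[str]]:
    """Try suitable sites in order and move on when a resource is lost."""
    log: List[str] = []
    # Locate and authenticate: keep only sites we hold credentials for.
    candidates = [name for name in sites if name in trusted]
    for name in candidates:
        log.append(f"allocate:{name}")
        try:
            result = sites[name](data)      # initiate and compute
        except ResourceLost:
            log.append(f"lost:{name}")      # respond to the resource change
            continue
        log.append(f"done:{name}")
        return result, log                  # return output
    raise RuntimeError("no suitable site completed the computation")


def flaky(x: int) -> int:
    raise ResourceLost("node rebooted")


def steady(x: int) -> int:
    return x * x


sites = {"site-a": flaky, "site-b": steady, "site-c": steady}
result, log = run_workflow(sites, trusted=["site-a", "site-b"], data=7)
```

The untrusted `site-c` is never contacted, and the loss of `site-a` is handled by moving on to `site-b`.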
15Grid Requirements
- identity authentication
- authorization policy
- resource/service discovery
- resource allocation
- (co-)reservation, workflow
- remote data access
- rapid data transfer
- monitoring
- intrusion detection
- resource management
- accounting
- fault management
- system evolution
- and more
16Grid Computing - Functions
- Grid computing must typically provide these basic functions (Foster/Kesselman)
- resource discovery and information collection and publishing
- data management on and between resources
- process management on and between resources
- common security mechanism underlying the above
- In addition, it should include
- process and session recording/accounting
17Grid Architecture
- Architecture identifies the fundamental system components, specifies the purpose and function of these components, and indicates how these components interact with each other.
- Grid architecture is a protocol architecture, with protocols defining the basic mechanisms by which VO users and resources negotiate, establish, manage and exploit sharing relationships.
- Grid architecture is also a standards-based, open services architecture that facilitates extensibility, interoperability, portability and code sharing.
- APIs and toolkits are also being developed.
19Layered Grid Architecture
- Fabric Layer - provides the local services of a resource
- computational, storage, network
- Connectivity Layer - core communication and authentication protocols
- Enables exchange of data between fabric layer resources
- Security and authentication are important here
20Layered Grid Architecture (cont.)
- Resource Layer - enables resource sharing
- Builds on the connectivity layer to control and access resources (e.g., data servers)
- Collective Layer - coordinates interactions across multiple resources
- Ties multiple resources and services together
- (e.g., metacatalogues)
- Application Layer - user applications use the collective, resource, and connectivity layers to perform grid operations in a virtual organization
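The layering can be illustrated with a toy model in which each layer only talks to the one below it. All class names, the in-memory "file store", and the user-set authentication are assumptions made for this sketch, not Globus interfaces.

```python
from typing import Dict, List, Set, Tuple


class Fabric:
    """Fabric layer: a local resource, here an in-memory file store."""
    def __init__(self, files: Dict[str, str]):
        self.files = dict(files)

    def read(self, path: str) -> str:
        return self.files[path]              # KeyError if the file is absent


class Connectivity:
    """Connectivity layer: communication plus authentication."""
    def __init__(self, fabric: Fabric, allowed_users: Set[str]):
        self.fabric = fabric
        self.allowed_users = allowed_users

    def request(self, user: str, path: str) -> str:
        if user not in self.allowed_users:
            raise PermissionError(f"{user} is not authenticated")
        return self.fabric.read(path)


class ResourceLayer:
    """Resource layer: controlled access to one resource (a data server)."""
    def __init__(self, name: str, conn: Connectivity):
        self.name = name
        self.conn = conn

    def fetch(self, user: str, path: str) -> str:
        return self.conn.request(user, path)


class Collective:
    """Collective layer: coordinates several resources, like a tiny catalogue."""
    def __init__(self, resources: List[ResourceLayer]):
        self.resources = resources

    def locate_and_fetch(self, user: str, path: str) -> Tuple[str, str]:
        for res in self.resources:
            try:
                return res.name, res.fetch(user, path)
            except KeyError:
                continue                      # not on this server, try the next
        raise FileNotFoundError(path)


def application(collective: Collective, user: str, path: str) -> str:
    """Application layer: a user program built on the layers below."""
    server, content = collective.locate_and_fetch(user, path)
    return f"{path} from {server}: {content}"


server_a = ResourceLayer("server-a", Connectivity(Fabric({"a.dat": "alpha"}), {"alice"}))
server_b = ResourceLayer("server-b", Connectivity(Fabric({"b.dat": "beta"}), {"alice"}))
grid = Collective([server_a, server_b])
report = application(grid, "alice", "b.dat")
```

The application never needs to know which server holds the file; the collective layer finds it, and the connectivity layer rejects users it cannot authenticate.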
21Some Solutions
- Middleware toolkits: not all speak (or spoke) Globus
- Condor
- Globus Toolkit
- Legion/Avaki
- Codine (now Sun Grid Engine)
- Unicore
- Higher Level Toolkits (build on Globus)
- JavaCoG
- GridPortal Toolkit, Grid Portal Development Toolkit (GPDK)
- Condor-G
- SGE
22The Globus Toolkit
- Open-source reference software base for developing Grid infrastructure and applications
- Implements GGF standards
- Service-oriented
- Services can be decoupled from any fixed resource
- A service consumes resources, but how it does so is not what matters most
- A better base abstraction for managing dependability and end-to-end quality of service
Slide courtesy of Ian Foster, presentation at Comdex04
23Globus Protocols - Connectivity Layer
- Grid Security Infrastructure (GSI)
- Authentication/authorization, message protection across institutions
- Single sign-on, delegation, identity mapping
- Public key technology
- Certificate authorities, certificate key management
Ian Foster et al., The Anatomy of the Grid
24Globus Protocols - Collective Layer
- Metadirectory services
- Resource brokers
- Condor
- Co-reservation/co-allocation services
- Workflow management services
Ian Foster et al., The Anatomy of the Grid
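Co-allocation means that a request spanning several resources is granted as a whole or not at all. A minimal sketch, assuming free capacity is tracked as a simple dictionary of CPU counts per site (the function name and data layout are hypothetical):

```python
from typing import Dict


def co_allocate(free: Dict[str, int], request: Dict[str, int]) -> bool:
    """Reserve every requested share, or nothing at all."""
    # Phase 1: check that every site can satisfy its share.
    if any(free.get(site, 0) < need for site, need in request.items()):
        return False
    # Phase 2: commit the reservation on all sites.
    for site, need in request.items():
        free[site] -= need
    return True


free_cpus = {"site-a": 8, "site-b": 4}
ok_first = co_allocate(free_cpus, {"site-a": 6, "site-b": 4})
ok_second = co_allocate(free_cpus, {"site-a": 2, "site-b": 1})
```

The second request fails because `site-b` is exhausted, and nothing is taken from `site-a` either. In a distributed setting the check and the commit would need a real reservation protocol between sites; here both happen in one process.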
25Globus Protocols - Resource Layer
- Grid Resource Allocation Management (GRAM)
- Remote allocation, control of compute resources
- Furnishes information on the state of the resources to the Metacomputing Directory Service (MDS)
- GridFTP
- High-performance data access and transport
- Grid Resource Information Service (GRIS)
- Access to structure and state info (MDS)
- All built on connectivity layer
Ian Foster et al., The Anatomy of the Grid
26Grid Security Infrastructure (GSI)
- Public key cryptography
- Encryption relies on two keys, related mathematically so that if either key encrypts a message, the other must be used to decrypt it
- One key is public, the other is kept private
- A user proves their identity by encrypting a message with the private key: if the public key can decrypt it, the user is indeed holding the private key
- No password is ever exchanged
Ian Foster et al., The Anatomy of the Grid
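The proof-of-identity step can be shown with textbook RSA on tiny numbers. This is a toy for illustration only: the primes and exponent are the usual classroom values, it is completely insecure, and it is not how GSI is implemented (GSI relies on X.509 certificates and standard cryptographic libraries). The function names are assumptions.

```python
# Toy textbook RSA with tiny primes: insecure, for illustration only.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)


def sign(challenge: int, private_exp: int) -> int:
    """'Encrypt' the challenge with the private key."""
    return pow(challenge, private_exp, n)


def verify(challenge: int, signature: int, public_exp: int) -> bool:
    """'Decrypt' with the public key and compare with the original challenge."""
    return pow(signature, public_exp, n) == challenge


challenge = 1234             # must be smaller than n
signature = sign(challenge, d)
genuine = verify(challenge, signature, e)

# Someone without the right private key cannot produce a valid response.
forged = verify(challenge, sign(challenge, d + 2), e)
```

Only the challenge and the response travel over the network; the private exponent `d` never leaves its owner, which is why no password needs to be exchanged.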
27Working With Grids
- A user enrolls himself or his machine with the grid system.
- The user establishes his identity with a certificate authority (CA); this process may require channels other than the Internet.
- The CA takes steps to make sure that the user is in fact who he claims to be.
- The CA makes a special certificate available to the software that needs to check the identity of the user and his requests to the grid system.
- Steps 1-4 may be repeated for the donor machine(s). A user must keep his security credentials secure.
- The user installs the software provided by the grid system in order to use the grid and/or donate the machine. The software may be configured automatically or manually by the user; this configuration is required for
- Grid node management
- Machine identification information
- Implementing constraints on resource access, such as time, type, etc.
- Providing user IDs on other machines that exist on the grid.
28Working With Grids (cont.)
- A user is required to log in to a grid system using a user ID that is enrolled in the grid.
- The user can use grid system IDs or operating system IDs, whichever is enrolled with the grid, but a grid system ID is recommended for two reasons
- It eliminates the need for matching IDs from machine to machine
- A user can access the entire grid as one large virtual computer using a common ID across the grid.
- Globus, as mentioned above, uses a proxy login model, which keeps a user logged in for a specified amount of time, even if the user logs off and logs back on to the operating system, and even if the machine is rebooted.
- Once the user is logged in, he can query the grid or submit jobs using localization interfaces.
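The proxy login model can be pictured as a credential whose validity depends only on timestamps, not on the operating-system session. The `ProxyCredential` class and `submit_job` function below are hypothetical and greatly simplified; they are not the Globus proxy certificate format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyCredential:
    """Hypothetical short-lived credential tied to a grid-wide user ID."""
    grid_id: str
    issued_at: float     # seconds since some reference time
    lifetime_s: float

    def is_valid(self, now: float) -> bool:
        return self.issued_at <= now < self.issued_at + self.lifetime_s


def submit_job(proxy: ProxyCredential, job: str, now: float) -> str:
    """Accept a job only while the proxy is still within its lifetime."""
    if not proxy.is_valid(now):
        raise PermissionError("proxy expired; log in to the grid again")
    return f"{job} accepted for {proxy.grid_id}"


# A 12-hour proxy issued at time 0.
proxy = ProxyCredential("alice@grid", issued_at=0.0, lifetime_s=12 * 3600)
accepted = submit_job(proxy, "job-1", now=3600.0)
still_valid_after_13h = proxy.is_valid(13 * 3600.0)
```

Because validity is a function of the clock alone, logging off or rebooting the machine does not end the grid login, while an expired proxy forces a fresh login.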
29Key Concepts for GT4
- OGSA, WSRF, and GT4
- These are the basic architecture components for GT4
- Open Grid Services Architecture (OGSA)
- Web Services
- OGSA, WSRF, and GT4 are based on standard Web Services technologies such as SOAP and WSDL.
- You need to be familiar with the Web Services architecture and languages.
- The Web Services Resource Framework
- WSRF is the core of GT4.
- Based on WS-Resources, Web Services, and grid computing
- Java, XML
- To use GT4, you need to be able to program in Java and to understand basic XML.
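The WS-Resource idea is that a Web service itself stays stateless while state lives in separately addressed resources. The sketch below mimics that pattern in plain Python for brevity; GT4 itself is programmed in Java, and the `CounterService` class and its methods are assumptions, not the WSRF or GT4 API.

```python
import itertools
from typing import Dict


class CounterService:
    """Service front end; all state lives in resources addressed by key."""

    def __init__(self) -> None:
        self._resources: Dict[str, Dict[str, int]] = {}
        self._ids = itertools.count(1)

    def create_resource(self) -> str:
        """Create a stateful resource and return its key."""
        key = f"counter-{next(self._ids)}"
        self._resources[key] = {"value": 0}
        return key

    def add(self, key: str, amount: int) -> int:
        """Operate on the resource named by the key carried in the request."""
        resource = self._resources[key]
        resource["value"] += amount
        return resource["value"]

    def get_property(self, key: str, name: str) -> int:
        """Read one property of a resource."""
        return self._resources[key][name]

    def destroy(self, key: str) -> None:
        """End the resource's lifetime."""
        del self._resources[key]


service = CounterService()
a = service.create_resource()
b = service.create_resource()
service.add(a, 5)
service.add(a, 2)
service.add(b, 10)
values = (service.get_property(a, "value"), service.get_property(b, "value"))
service.destroy(b)
```

Every call names the resource it acts on, so two clients using the same service never share state unless they share a key.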
30OGSA Key Requirements
- Interoperability and Support for Dynamic and Heterogeneous Environments
- Resource Sharing Across Organizations
- Job Execution
- Data Services
- Security
- Optimization
- Quality of Service (QoS) Assurance
- Administrative Cost Reduction
- Scalability
- Availability
- Ease of Use and Extensibility
31OGSA Defines Basic Capabilities
- Infrastructure Services
- Execution Management Services
- Data Services
- Resource Management Services
- Security Services
- Information Services
- Security Considerations
32GT Architecture
- GT4 comprises both a set of service implementations (server code) and associated client libraries.
- GT4 provides both Web services (WS) components and non-WS components
- All GT4 WS components use WS-Interoperability-compliant transport and security mechanisms
- can interoperate with each other and with other WS components.
- All GT4 components support X.509 certificates
- both WS and non-WS
- a client can use the same credentials to authenticate with any GT4 WS or non-WS component.
33GT4 Services
- Nine GT4 services implement Web services (WS) interfaces
- Job management (GRAM)
- Reliable File Transfer
- Delegation
- Monitoring and Discovery System (MDS)
- MDS-Index, MDS-Trigger, and MDS-Archive
- Community Authorization (CAS)
- OGSA-DAI data access and integration
- Grid TeleControl Protocol (GTCP)
- remote instrumentation control
34OGSA, WSRF, GT4 Relationship Diagram
35WS Software stack used by GT4 WSRF
- HTTP Server
- Apache HTTP Server
- Application Server
- Apache Tomcat
- SOAP Engine
- Apache AXIS
- Supports wsdl2java tool - builds Java proxies and skeletons from WSDL docs.
- Web Service
- User App
36GT4 Roadmap
37Major Grid Projects
- Earth System Grid, www.earthsystemgrid.org
- Virtual Observatory, http://skyview.gsfc.nasa.gov/
- European Data Grid, http://cern.ch/eu-datagrid
- GriPhyN Project, http://www.griphyn.org/
- PPDG, http://www.ppdg.net/
- HEPGRID, http://www.buyya.com/hepgrid/
- Virtual Laboratory Grid, http://www.jhu.edu/virtlab/virtlab.html
- NEESGRID, http://www.neesgrid.org/
- GEODISE, http://www.geodise.org/
- Fusion Grid, http://www.fusiongrid.org/
- IPG Grid, http://www.nas.nasa.gov/About/IPG/ipg.html
- ActiveSheets, http://www.csse.monash.edu.au/davida/nimrod/activesheets.htm
- China National Grid (CNGrid), http://www.cngrid.org
- China Science Grid (CSGrid), http://www.ssc.net.cn/en/showinfo.asp?categoryid=84
- China Semantic Grid, http://kg.ict.ac.cn/
- Shanghai City Information Grid, http://www.gridtoday.com/05/0131/104536.html
38Research Bodies and Consortiums
- Global Grid Forum (GGF), http://www.ggf.org
- OASIS, http://www.oasis.org
- DMTF, http://www.dmtf.org
- CIM, http://www.dmtf.org/standards/cim
- WBEM, http://www.dmtf.org/standards/wbem
- W3C, http://www.w3.org
- Globus Alliance, http://www.globus.org
- GridBus Project, http://www.gridbus.org/
- Condor Project Home Page, http://www.cs.wisc.edu/condor/
- Legion Project, http://legion.virginia.edu/
- Unicore, http://www.unicore.org
39Global Community
Slide Courtesy of Ian Foster
40Applications
- Proofs of concept
- Academic prototype applications
- Long-term applications
- Grid based Agro-MIS
- Grid based LRMIS
- Grid Based LLDP
- And more..
41Research Areas
- Protocols and Standards Development
- Grid Security
- Service Oriented Architecture
- Resource Management
- Scheduling
- Grid Operating Environments
- Grid Software Development Environments
- Quality of Service
- Grid Localization
- Grid Simulations
- Toolkits and Portals Development
42The Process
- Literature Review
- Research Question/Hypothesis
- Analysis (Generic Specifications)
- Design (Technical Specification)
- Implementation
- Testing
- Documentation
- Presentation