Grid Services - PowerPoint PPT Presentation

About This Presentation
Title:

Grid Services

Slides: 68
Provided by: BenT54
Category:
Tags: emu | grid | services

Transcript and Presenter's Notes

1
Grid Services
  • Presented by
  • Karan Bhatia

2
Hype Curve
3
Overview
  • Grid Computing Background
  • Definition
  • Opportunities
  • Markets
  • Technical Challenges
  • Security Infrastructure
  • Resource Management
  • Service Interoperability
  • Summary

4
Grid Computing is
  • "Coordinated resource sharing and problem
    solving in dynamic, multi-institutional virtual
    organizations." (Foster, Kesselman, Tuecke)
  • Co-ordinated - multiple resources working in
    concert, e.g. disk and CPU, or instruments and
    database, etc.
  • Resources - compute cycles, databases, files,
    application services, instruments.
  • Problem solving - focus on solving scientific
    problems
  • Dynamic - environments that are changing in
    unpredictable ways
  • Virtual Organization - resources spanning
    multiple organizations and administrative
    domains, security domains, and technical domains

5
Grid Computing is (Industry)
  • "about finding distributed, underutilized compute
    resources (systems, desktops, storage) and
    provisioning those resources to users or
    applications requiring them." (The Grid Report,
    Clabby Analytics)
  • Distributed - all the resources lying around in
    departments or server rooms.
  • Underutilized - typical utilization of big iron
    is 5 to 10%. Organizations save money by
    increasing utilization versus purchasing new
    resources.
  • Resources - servers and server cycles,
    applications, data resources
  • Provisioning - predict and schedule resource use
    depending on load.

6
Types of Grids
  • Compute Grids
  • SETI@home, Entropia, United Devices, Condor
  • Data Grids
  • Storage Resource Broker (SRB), Avaki, BIRN, GEON
  • Collaboration Grids
  • Instrumentation (telescience), applications
  • Enterprise Grids
  • Majority of commercial interest
  • Partner Grids
  • B2B, Academic/Govt Grids
  • Service Grids
  • Utility Computing, On Demand, pervasive,
    autonomic, etc

7
A Grid is
  • the next generation Internet,
  • all about free cycles à la SETI@home,
  • a distributed object system,
  • a new programming model,
  • a replacement for high performance computing,

8
Example: TeleScience Grid
9
Grid Resources - Networks
10
Grid Resources - Compute
11
Top 500.org
12
(No Transcript)
13
Another Grid Example: Google
  • Queries
  • 150 M queries/day (2000/s)
  • 100 countries
  • 3.3 B documents
  • Hardware
  • 15,000 Linux systems in 6 data centers
  • 15 Tflop/s and 1000 TB total capacity
  • 40-80 1U/2U servers/cabinet
  • 100 Mb/s Ethernet switches per cabinet with gigabit
    uplinks
  • Growth from 4000 systems (18 M queries/day)

14
Grid Resources - Data
  • SDSC Resources
  • HPSS
  • SDSC's central long-term data storage system,
  • one of the world's largest IBM High Performance
    Storage System (HPSS) units,
  • currently holds more than a petabyte (a million
    gigabytes) of data in approximately 21 million
    files,
  • It has the capacity to store six petabytes of
    data. Files are added at an average rate of
    10,000 gigabytes per month.
  • Storage-Area Network (SAN)
  • A 72-processor Sun Microsystems SunFire 15K
    high-end server and 11 Brocade switches (1,400
    ports)
  • 225,000 gigabytes of networked disk storage for
    data-oriented applications.
  • 1 TB of data 2500

15
Protein Data Bank (PDB)
16
Putting it all together: TeraGrid
17
Grid Market
18
Grid Companies
  • IBM
  • on demand solutions
  • Sun Microsystems
  • N1 initiative
  • Oracle
  • 10g
  • Dell
  • HP
  • utility computing
  • Platform Computing
  • LSF, metaclustering
  • United Devices
  • Desktop grids
  • DataSynapse
  • Akamai
  • Google?
  • Sony online entertainment?
  • Where's Microsoft?

19
Grid Organizations
  • Global Grid Forum (GGF)
  • Organization for the Advancement of Structured
    Information Standards (OASIS)
  • Distributed Management Task Force (DMTF)
  • World Wide Web Consortium (W3C)
  • Globus Alliance
  • NSF Middleware Initiative (NMI)
  • NASA IPG
  • DOE Science Grid
  • EU DataGrid
  • NSF TeraGrid

20
Technical Challenges for Grid Computing
21
Challenges: Security
  • Grids traverse organizational boundaries
  • Different administration domains have different
    authentication mechanisms
  • Resources have different use agreements and
    sharing priorities
  • Single sign-on
  • Multiple passwords difficult to manage
  • Rights delegation
  • Trust
  • Authentication of users
  • Authorization of users
  • Resource access

22
Security
  • Public Key Infrastructure
  • Public key A.public
  • Private key A.private
  • Supports Encryption
  • Message to B
  • A computes m' = F(m, B.public) and sends m' to B
  • B receives m' and recovers m = F(m', B.private)
  • Digital Signatures
  • Signed message to B
  • A sends (m, F(m, A.private))
  • Receiver verifies with A.public that m is from A
    and has not been tampered with
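
The two operations on this slide can be sketched with textbook RSA. This is a minimal illustration, not part of the talk. It assumes the usual convention that a message for B is encrypted with B's public key and that A signs with A's private key. The tiny key values (p = 61, q = 53) and all function names are assumptions chosen for readability. Real systems use large keys, padding, and sign a hash of the message rather than the message itself.

```python
from typing import Tuple

# Toy RSA parameters (assumed for illustration only; far too small to be secure).
N = 61 * 53          # modulus n = 3233
E = 17               # public exponent
D = 2753             # private exponent: (E * D) % 3120 == 1


def F(m: int, exponent: int) -> int:
    """Textbook RSA transform: m ** exponent mod N, for 0 <= m < N."""
    return pow(m, exponent, N)


# Each party holds a (public, private) pair; for brevity A and B reuse the
# same toy numbers here, which real parties would never do.
A_PUBLIC, A_PRIVATE = E, D
B_PUBLIC, B_PRIVATE = E, D


def encrypt_for_b(m: int) -> int:
    """Anyone can encrypt for B using B's public key."""
    return F(m, B_PUBLIC)


def decrypt_as_b(c: int) -> int:
    """Only B, holding B's private key, can recover the message."""
    return F(c, B_PRIVATE)


def sign_as_a(m: int) -> Tuple[int, int]:
    """A sends the message together with a signature made with A's private key."""
    return (m, F(m, A_PRIVATE))


def verify_from_a(m: int, signature: int) -> bool:
    """Anyone can check the signature using A's public key."""
    return F(signature, A_PUBLIC) == m
```

Changing the message after signing makes `verify_from_a` return False, which is the tamper check the slide refers to.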

23
Grid Security Infrastructure (GSI)
  • A central concept in GSI authentication is the
    certificate.
  • Every user and service on the Grid is identified
    via a certificate, a text file containing the
    following information
  • a subject name identifying the person or object
    that the certificate represents,
  • the public key belonging to the subject,
  • the identity of a Certificate Authority (CA) that
    has signed the certificate to certify that the
    public key and the identity both belong to the
    subject,
  • the digital signature of the named CA.

24
Proxy Certificate
  • A proxy consists of a new certificate with a new
    public and private key.
  • The new certificate contains the owner's identity
    modified slightly to indicate that it is a proxy.
  • The new certificate is signed by the owner rather
    than a CA.
  • This is called a self-signed certificate.
  • The certificate also includes a time notation
    after which the proxy should no longer be
    accepted by others.
  • Proxies have limited lifetimes in order to
    minimize the security vulnerability.
  • Because the proxy isn't valid for very long, it
    doesn't have to be kept quite as secure as the
    owner's private key.
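
The proxy rules above can be sketched as a small data model. This is only a structural sketch under stated assumptions: the `Certificate` fields, the `/CN=proxy` subject suffix, and both function names are illustrative, and signature checking is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Certificate:
    subject: str          # who the certificate represents
    issuer: str           # who signed it (a CA, or the owner for a proxy)
    not_after: datetime   # time after which it must be rejected


def make_proxy(owner: Certificate, now: datetime, lifetime: timedelta) -> Certificate:
    """Create a short-lived proxy whose subject marks it as a proxy of the owner."""
    # A proxy should never outlive the certificate that signed it.
    expires = min(now + lifetime, owner.not_after)
    return Certificate(subject=owner.subject + "/CN=proxy",
                       issuer=owner.subject,
                       not_after=expires)


def accept_proxy(proxy: Certificate, owner: Certificate, now: datetime) -> bool:
    """Structural checks only; a real verifier also validates both signatures."""
    return (proxy.issuer == owner.subject
            and proxy.subject.startswith(owner.subject + "/")
            and now <= proxy.not_after
            and now <= owner.not_after)
```

With a 12-hour lifetime, the proxy is accepted one hour after creation and rejected after 13 hours, so a stolen proxy is only useful for a short time.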

25
Mutual Authentication
26
Additional Challenges
  • Certificate Management
  • MyProxy
  • Role-based Access Control
  • CAS, VOMS
  • Authorization services
  • Integration with applications Portals

27
Challenges: Resource Management
  • Resources loosely-coupled
  • Higher network latencies
  • Planned and unplanned disruptions
  • How to provide QoS guarantees?
  • Case Study: Entropia Desktop Grids
  • Additional trust/security issues

28
Entropia Inc.
  • 1997: Scott Kurowski developed GIMPS (Great
    Internet Mersenne Prime Search)
  • First generation network
  • Jan 2000: Kurowski and Chien start Entropia Inc.
  • FightAIDS@Home with Art Olson, Scripps Research
  • Second generation network
  • July 2002: DCGrid 5.0 released
  • Third generation network

29
Entropia 1: GIMPS
  • Over 1.5 Billion CPU hours served
  • 300,000 machines, over 4 years operational
  • Every PC and hardware config imaginable (proc,
    memory, disk, etc.)
  • Every networking hookup imaginable
  • Found 35th, 36th, 37th, 38th, and 39th Mersenne
    Primes

30
Entropia 2: FightAIDS@Home
  • Sept 2000 launch
  • Internet-Based
  • 54,657 total machines
  • 10,770,506 total hours of computation
  • 27,881 peak billions of calculations/sec

31
Entropia 3: DCGrid
  • Enterprise focus
  • Tremendous resources available in enterprise
  • Complements other HPC resources
  • Computing Platform
  • Arbitrary application (open scheduling model)
  • Security, unobtrusiveness, manageability
    guaranteed
  • Focus on
  • Pharmaceuticals, Chemicals, and Materials
  • Financial Services

32
DCGrid Architecture
33
Commoditization of Hardware
[Figure labels: Vector Processors, PC Grids, Beowulf Clusters]
34
Price/Performance
[Figure: performance (TFLOPS) plotted against cost]
35
Server vs. Desktop Grids
  • Server environment
  • Fixed IP, always connected
  • Always-on operation
  • Moderate number of systems (10s to 100s)
  • Dedicated use, trusted systems
  • Desktop environment
  • Dynamic, temporary IP, intermittent connection
  • Off evenings, off weekends, off lunch
  • Large numbers of systems (100s to 1000s, or more?)
  • Shared resources, potentially untrusted users
  • These differences give rise to desktop Grid
    challenges

36
Typical PC-Grid Environment
37
PC-Grid Challenges
  • Provide a stable compute environment for apps
  • Isolate app from variable desktop environment
  • Operate in environment of dynamic use
  • Unobtrusiveness and Fault Tolerance are key!
  • Provide simple application integration
  • Support ANY Application without modification
  • Provide centralized management console
  • Zero additional management costs

38
Workflow
39
Stable Compute Environment
  • Entropia Proprietary Sandbox
  • Binary-level protection
  • System virtualization (registry, file system,
    network)
  • Open Scheduling Infrastructure
  • Intelligent scheduling (match resources to
    subjobs requirements)
  • Manage subjob redundancy/fault tolerance
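
As a rough sketch of "match resources to subjob requirements" combined with redundancy: the code below is not Entropia's scheduler, which is proprietary. The `Machine` and `Subjob` fields, the least-loaded tie-break, and the `copies` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Machine:
    name: str
    memory_mb: int
    disk_mb: int


@dataclass
class Subjob:
    name: str
    memory_mb: int
    disk_mb: int


def schedule(subjobs: List[Subjob], machines: List[Machine],
             copies: int = 2) -> Dict[str, List[str]]:
    """Assign each subjob to up to `copies` distinct machines that meet its needs."""
    load = {m.name: 0 for m in machines}   # number of subjobs placed per machine
    plan: Dict[str, List[str]] = {}
    for job in subjobs:
        # Matching step: keep only machines with enough memory and disk.
        fits = [m for m in machines
                if m.memory_mb >= job.memory_mb and m.disk_mb >= job.disk_mb]
        # Prefer the least-loaded machines; the sort is stable, so ties keep input order.
        fits.sort(key=lambda m: load[m.name])
        chosen = fits[:copies]             # redundancy: run the subjob more than once
        for m in chosen:
            load[m.name] += 1
        plan[job.name] = [m.name for m in chosen]
    return plan
```

Running each subjob on two machines means a desktop that is switched off mid-run does not lose the result, at the cost of duplicated work.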

40
Manage Dynamic Use
  • PC primary use must be respected!
  • Entropia Proprietary Sandbox
  • Guaranteed to run at idle priority
  • Limit application capability
  • Monitor page faults, network access
  • Management
  • Provide time-of-use windows
  • Different levels of unobtrusiveness
  • Gathers 95% of cycles
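
A time-of-use window policy could look roughly like this. It is a sketch under assumptions: the `Window` representation, the `user_active` flag, and the function names are illustrative and do not describe the Entropia product.

```python
from datetime import time
from typing import List, Tuple

Window = Tuple[time, time]  # (start, end), end exclusive


def in_window(now: time, window: Window) -> bool:
    """True if `now` falls inside the window, including windows that cross midnight."""
    start, end = window
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 18:00 to 08:00.
    return now >= start or now < end


def may_run(now: time, windows: List[Window], user_active: bool) -> bool:
    """Grid work runs only inside an allowed window and while the PC is idle."""
    return (not user_active) and any(in_window(now, w) for w in windows)
```

An administrator who wants stricter unobtrusiveness would configure narrower windows; a more permissive level would allow daytime use whenever the user is idle.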

41
Application Integration
  • Support any Win32 binary
  • Language Neutral (C, C++, Fortran, Java, C#, etc.)
  • Compiler/library Neutral

[Figure labels: App A, App B, App C; Client1, Client2; qsub, qstat; Application Preparation Tools; Open Grid Platform; Run Applications]
42
Manageability
43
Application Performance
[Figure panels: HMMER, GOLD, AUTODOCK, DOCK]
44
Scheduling Performance
45
Challenges: Service Interoperability
  • Trying to force homogeneity on users is futile.
    Everyone has their own preferences, sometimes
    even dogma.
  • The Internet provides the model

46
Typical Application
47
Typical Application
  • Implementations are provided by a mix of
  • Application-specific code
  • Off the shelf tools and services
  • Tools and services from the Globus Toolkit
  • Tools and services from the Grid community
    (compatible with GT)
  • Glued together by
  • Application development
  • System integration

48
How it Really Happens (without the Grid)
49
How it Really Happens (with the Grid)
50
Theory -> Practice
51
What You Get in the Globus Toolkit
  • OGSI (3.x) / WSRF (4.x) Core Implementation
  • Used to develop and run OGSA-compliant Grid
    Services (Java, C/C++)
  • Basic Grid Services
  • Popular among current Grid users; common
    interfaces to the most typical services; includes
    both OGSA and non-OGSA implementations
  • Developer APIs
  • C/C++ libraries and Java classes for building
    Grid-aware applications and tools
  • Tools and Examples
  • Useful tools and examples based on the developer
    APIs

52
Components in Globus Toolkit 3.0
  • Components shown: GSI, WU GridFTP, JAVA WS Core (OGSI), Pre-WS GRAM, WS-Security, RFT (OGSI), OGSI C Bindings, WS GRAM (OGSI), RLS
  • Grouped under: Data Management, Security, WS Core, Resource Management, Information Services
53
Components in Globus Toolkit 3.2
  • Components shown: JAVA WS Core (OGSI), Pre-WS GRAM, WS GRAM (OGSI), OGSI C Bindings, OGSI Python Bindings (contributed), pyGlobus (contributed)
  • Grouped under: Data Management, Security, WS Core, Resource Management, Information Services
54
Planned Components in GT 4.0
  • Components shown: pyGlobus (contributed), Authz Framework
  • Grouped under: Data Management, Security, WS Core, Resource Management, Information Services
55
Grid and Web Services Convergence
  • The definition of WSRF means that the Grid and
    Web services communities can move forward on a
    common base.

56
Grid Services Example
  • (from sotomayor tutorial)
  • MathService API
  • add(int x)
  • subtract(int x)
  • getvalue()
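
Before looking at the WSDL, it may help to see the service logic on its own. The sketch below is a plain in-process Python class, not a Grid service. The method names mirror the slide, and the choice to start the value at 0 is an assumption.

```python
class MathService:
    """Stateful service: each instance keeps its own running value."""

    def __init__(self) -> None:
        self._value = 0   # assumed starting value

    def add(self, x: int) -> None:
        """Add x to the stored value; returns nothing."""
        self._value += x

    def subtract(self, x: int) -> None:
        """Subtract x from the stored value; returns nothing."""
        self._value -= x

    def get_value(self) -> int:
        """Return the current value."""
        return self._value
```

The point of the example is the per-instance state: two clients with two instances see independent values, which is what the factory and lifetime machinery later in the talk has to manage.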

Note 1: How is this different than
  • Web Services?
  • Corba?
  • COM/DCOM?
Note 2: This is too simple! What about
  • co-ordination/workflows
  • personalization
  • presentation
  • security
57
OGSI (or what is a grid service?)
  • Using web service infrastructure
  • MathService is defined by WSDL (like IDL)

<?xml version="1.0" encoding="UTF-8"?>
...
<types>
  <xsd:schema targetNamespace="http://www.gt3tutorial.org/namespaces/0.2/core/gwsdl/Math"
      attributeFormDefault="qualified"
      elementFormDefault="qualified"
      xmlns="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="add">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="value" type="xsd:int"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="addResponse">
      <xsd:complexType/>
    </xsd:element>
    ...
</types>
<message name="AddInputMessage">
  <part name="parameters" element="tns:add"/>
</message>
<message name="AddOutputMessage">
  <part name="parameters" element="tns:addResponse"/>
</message>
...
<gwsdl:portType name="MathPortType" extends="ogsi:GridService">
  <operation name="add">
    <input message="tns:AddInputMessage"/>
    <output message="tns:AddOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="subtract">
    <input message="tns:SubtractInputMessage"/>
    <output message="tns:SubtractOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="getValue">
    <input message="tns:GetValueInputMessage"/>
    <output message="tns:GetValueOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
</gwsdl:portType>
</definitions>
58
Basic Concepts
59
The GridService PortType
  • a grid service is a web service that implements
    the GridService PortType

<portType name="GridService">
  <operation name="setServiceData"> snip </operation>
  <operation name="destroy"> snip </operation>
  <operation name="requestTerminationAfter"> snip </operation>
  <operation name="requestTerminationBefore"> snip </operation>
  <operation name="findServiceData"> snip </operation>
</portType>

<gwsdl:portType name="GridService">
  <sd:serviceData maxOccurs="unbounded" minOccurs="1"
      modifiable="false" mutability="constant"
      name="interface" nillable="false"
      type="xsd:QName"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="0"
      modifiable="false" mutability="mutable"
      name="serviceDataName" nillable="false"
      type="xsd:QName"/>
  <sd:serviceData maxOccurs="1" minOccurs="1"
      modifiable="false" mutability="mutable"
      name="factoryLocator" nillable="true"
      type="ogsi:LocatorType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="0"
      modifiable="false" mutability="extendable"
      name="gridServiceHandle" nillable="false"
      type="ogsi:HandleType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1"
      modifiable="false" mutability="mutable"
      name="gridServiceReference" nillable="false"
      type="ogsi:ReferenceType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1"
      modifiable="false" mutability="static"
      name="findServiceDataExtensibility" nillable="false"
      type="ogsi:OperationExtensibilityType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1"
      modifiable="false" mutability="static"
      name="setServiceDataExtensibility" nillable="false"
      type="ogsi:OperationExtensibilityType"/>
  <sd:serviceData maxOccurs="1" minOccurs="1"
      modifiable="false" mutability="mutable"
      name="terminationTime" nillable="false"
      type="ogsi:TerminationTimeType"/>
  <sd:staticServiceDataValues>
    <ogsi:findServiceDataExtensibility inputElement="ogsi:queryByServiceDataNames"/>
    <ogsi:setServiceDataExtensibility inputElement="ogsi:setByServiceDataNames"/>
    <ogsi:setServiceDataExtensibility inputElement="ogsi:deleteByServiceDataNames"/>
  </sd:staticServiceDataValues>
</gwsdl:portType>
60
GridService PortType
  • FindServiceData()
  • QueryByServiceDataNames()
  • GetServiceData()
  • SetByServiceDataNames()
  • DeleteByServiceDataNames()
  • RequestTerminationAfter()
  • RequestTerminationBefore()
  • Destroy()

61
Capabilities of a Grid Service
  • 2-level naming (GSH vs. GSR)
  • Factories
  • Lifetime management
  • Service Data Elements
  • Event Notification
  • ServiceGroups

62
GSH versus GSR
  • A GSH (Grid Service Handle) is a unique name for
    a Grid Service Instance
  • A GSR (Grid Service Reference) is a possibly
    temporary mechanism for accessing the Grid Service
    Instance
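
The two-level naming can be pictured as a lookup table from a stable name to the current access details. A minimal sketch, with the caveat that the `HandleResolver` class here, its method names, and the example URLs are assumptions for illustration and not the OGSI interface:

```python
from typing import Dict


class HandleResolver:
    """Maps a permanent handle (GSH) to the current reference (GSR)."""

    def __init__(self) -> None:
        self._references: Dict[str, str] = {}

    def bind(self, gsh: str, gsr: str) -> None:
        # Rebinding the same handle replaces the old, now stale, reference.
        self._references[gsh] = gsr

    def resolve(self, gsh: str) -> str:
        """Return the reference currently bound to the handle (KeyError if unknown)."""
        return self._references[gsh]
```

A client stores only the handle. If the instance moves to another host, the binding changes and the client resolves the same handle again to get a fresh reference.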

63
Factories
  • Create new instances of services dynamically
  • Individualized Instances
  • lifetime management techniques
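
A factory with soft-state lifetime management might be sketched as follows. All names are hypothetical, time is a plain number supplied by the caller, and `request_termination_after` is simplified to "never terminate earlier than t", which is only an approximation of the OGSI operation of the same name.

```python
import itertools
from typing import Dict


class Instance:
    def __init__(self, handle: str, termination_time: float) -> None:
        self.handle = handle
        self.termination_time = termination_time

    def request_termination_after(self, t: float) -> None:
        # Extend the lifetime: the instance should not terminate before t.
        self.termination_time = max(self.termination_time, t)


class Factory:
    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.instances: Dict[str, Instance] = {}

    def create(self, now: float, lifetime: float) -> Instance:
        """Create a new, individually named instance with an initial lifetime."""
        inst = Instance(f"instance-{next(self._ids)}", now + lifetime)
        self.instances[inst.handle] = inst
        return inst

    def sweep(self, now: float) -> None:
        """Destroy instances whose termination time has passed (soft state)."""
        expired = [h for h, i in self.instances.items() if i.termination_time <= now]
        for h in expired:
            del self.instances[h]
```

A client that is still interested keeps extending the termination time; an instance whose client has disappeared simply expires, so no explicit cleanup message is needed.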

64
Service Data Elements
  • Generalized State
  • useful for describing capability
  • Get/Set model similar to JavaBeans Properties
  • Can specify initial values in WSDL
  • Integrated with Notification mechanism
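
The get/set model with change notification can be sketched in a few lines. This is an illustration only: the `ServiceData` class and its method names are assumptions, and the initial-values dictionary stands in for values that the slide says can be declared in WSDL.

```python
from typing import Any, Callable, Dict, List


class ServiceData:
    def __init__(self, initial: Dict[str, Any]) -> None:
        self._values = dict(initial)   # initial values, as WSDL could declare them
        self._listeners: List[Callable[[str, Any], None]] = []

    def find(self, name: str) -> Any:
        """Read one service data element by name."""
        return self._values[name]

    def set(self, name: str, value: Any) -> None:
        """Write one element and notify every subscriber of the change."""
        self._values[name] = value
        for listener in self._listeners:
            listener(name, value)

    def subscribe(self, listener: Callable[[str, Any], None]) -> None:
        """Register a callback invoked as listener(name, new_value)."""
        self._listeners.append(listener)
```

Because every write goes through `set`, subscribers never have to poll for changes to the published state.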

65
Service Data Elements: GridService
  • Interface
  • ServiceDataName
  • FactoryLocator
  • GridServiceHandle
  • GridServiceReference
  • TerminationTime

66
Notifications
  • Source
  • implements NotificationSourcePortType
  • sends a notification message (XML Element) to
    Sinks
  • Sink
  • implements NotificationSinkPortType
  • sends a notification subscription request to
    source
  • causes a GridService Instance of portType
    NotificationSubscription to be created
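
The source, sink, and subscription roles can be sketched with three small classes. The class and method names below are assumptions that loosely follow the slide, and everything runs in one process instead of over the network.

```python
from typing import List


class Sink:
    def __init__(self) -> None:
        self.received: List[str] = []

    def deliver_notification(self, message: str) -> None:
        """Called by the source with an XML message."""
        self.received.append(message)


class Subscription:
    """Created by the source for each subscribe request; destroying it unsubscribes."""

    def __init__(self, source: "Source", sink: Sink) -> None:
        self._source = source
        self.sink = sink

    def destroy(self) -> None:
        self._source.subscriptions.remove(self)


class Source:
    def __init__(self) -> None:
        self.subscriptions: List[Subscription] = []

    def subscribe(self, sink: Sink) -> Subscription:
        sub = Subscription(self, sink)
        self.subscriptions.append(sub)
        return sub

    def notify(self, message: str) -> None:
        # Iterate over a copy so a sink may unsubscribe while being notified.
        for sub in list(self.subscriptions):
            sub.sink.deliver_notification(message)
```

Modelling the subscription as its own object means it can be given a lifetime and destroyed like any other service instance.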

67
ServiceGroups
  • A grid service that maintains information about
    other grid services
  • Can be used to implement a classic registry model
  • Can be used for dataset replication
  • A grid service can belong to more than one
    Service Group
  • Membership in a ServiceGroup can be homogeneous
    or heterogeneous
  • Service group portTypes are optional
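
Used as a registry, a service group is essentially a set of member handles that can be queried. A minimal sketch, assuming members are identified by a handle string and described by the portType names they implement (both assumptions for illustration):

```python
from typing import Dict, List, Set


class ServiceGroup:
    def __init__(self, name: str) -> None:
        self.name = name
        self._members: Dict[str, Set[str]] = {}   # handle -> portType names

    def add(self, handle: str, port_types: Set[str]) -> None:
        """Register a member; a service may be added to several groups."""
        self._members[handle] = set(port_types)

    def remove(self, handle: str) -> None:
        """Drop a member; unknown handles are ignored."""
        self._members.pop(handle, None)

    def find(self, port_type: str) -> List[str]:
        """Registry lookup: sorted handles of members that implement port_type."""
        return sorted(h for h, pts in self._members.items() if port_type in pts)
```

A group that only ever holds one portType is homogeneous; the `all_services` style of group in the test below is heterogeneous.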

68
Grid Services Summary
  • Extends Web Services to support Transient
    Services
  • WSDL 1.2 expected to include extensions
  • Requires support for factories, lifetime
    management, soft-state management, and
    notifications
  • Java implementation pretty solid
  • Security implementation still shaky

69
Other Challenges
  • Developing user interfaces
  • Data Management
  • Scheduling/co-scheduling of resources
  • Failure management
  • Application development
  • Performance
  • Many others

70
What I hope you got from this talk
  • Grid Computing is about
  • Co-ordinated use of different resources
  • Provisioning resources for increased utilization
  • Scaling to large numbers of resources, services
    and users
  • Many systems being built
  • Many Applications being developed

71
Pop Quiz
  • What is the definition of Grid Computing?
  • What kinds of resources are we talking about?
  • What are the main technical challenges in
    building grids?
  • Why should you care?
  • What is a proxy certificate? And why is it not
    encrypted?