Title: Service Oriented Architecture and Globus Toolkit
1Service Oriented Architecture and Globus Toolkit
- Ravi K Madduri
- Argonne National Laboratory
- University of Chicago
2Agenda
- Principles of Service Oriented Architecture
- The Globus Toolkit
- Web Services Basics
- Grid Services
- What do people punt on?
- Intro to Globus Security, Service Registries
- Workflows we created
- Lessons learned
3Principles of Service Oriented Architecture
- Guiding principles define the ground rules for
development, maintenance, and usage of the SOA
- Reuse, granularity, modularity, composability,
componentization and interoperability
- Standards compliance (both common and
industry-specific)
- Services identification and categorization,
provisioning and delivery, and monitoring and
tracking
4Architectural Principles
- Service encapsulation: Many web services are
consolidated to be used under the SOA.
- Service loose coupling: Services maintain a
relationship that minimizes dependencies and only
requires that they maintain an awareness of each
other
- Service contract: Services adhere to a
communications agreement, as defined collectively
by one or more service description documents
- Service abstraction: Beyond what is described in
the service contract, services hide logic from
the outside world
5Architectural Principles
- Service reusability: Logic is divided into
services with the intention of promoting reuse
- Service composability: Collections of services
can be coordinated and assembled to form
composite services
- Service autonomy: Services have control over the
logic they encapsulate
6Architectural Principles
- Service optimization: All else equal,
high-quality services are generally considered
preferable to low-quality ones
- Service discoverability: Services are designed
to be outwardly descriptive so that they can be
found and assessed via available discovery
mechanisms
- Service relevance: Functionality is presented at
a granularity recognized by the user as a
meaningful service
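The contract, abstraction, loose coupling, and composability principles above can be illustrated with a minimal Python sketch. All names here (`TemperatureService`, `StaticTemperatureService`, `FahrenheitService`) and the readings are hypothetical and chosen only for illustration; none of them comes from the Globus Toolkit.

```python
from typing import Protocol


class TemperatureService(Protocol):
    """Hypothetical service contract: the only thing clients may rely on."""

    def current_celsius(self, city: str) -> float: ...


class StaticTemperatureService:
    """One implementation; its lookup table is hidden behind the contract."""

    def __init__(self) -> None:
        self._readings = {"Chicago": 20.0, "Argonne": 15.0}  # made-up data

    def current_celsius(self, city: str) -> float:
        return self._readings[city]


class FahrenheitService:
    """Composite service built on the contract, not on one implementation."""

    def __init__(self, source: TemperatureService) -> None:
        self._source = source

    def current_fahrenheit(self, city: str) -> float:
        return self._source.current_celsius(city) * 9 / 5 + 32


composite = FahrenheitService(StaticTemperatureService())
print(composite.current_fahrenheit("Chicago"))
```

Because `FahrenheitService` depends only on the contract, any other implementation of `current_celsius` could be substituted without changing the composite.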
7Globus Software dev.globus.org
[Diagram: Globus Projects]
- Projects: OGSA-DAI, GT4, MPICH-G2, Data Rep, Replica Location, Java Runtime, MyProxy, Delegation, GridWay, GridFTP, MDS4, CAS, C Runtime, GSI-OpenSSH, Incubator Mgmt, Reliable File Transfer, GRAM, Python Runtime, C Sec, GT4 Docs
- Categories: Security, Execution Mgmt, Info Services, Common Runtime, Data Mgmt, Other
8Web Service Basics
- Web services are a basic distributed computing
technology that lets us construct client-server
interactions
Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
9Web Service Basics 2
- Web services are platform independent and
language independent
- Client and server programs can be written in different
languages, run in different environments, and still interact
- Web services describe themselves
- Once you have located a service, you can ask it how to use it
- Web services are ideal for loosely coupled
systems
- Unlike CORBA, EJB, etc.
10WSDL: Web Services Description Language
Defines the expected messages for a service and their
input or output parameters. An interface groups
together a number of messages (operations).
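As a rough illustration of how an interface groups operations and their messages, the sketch below reads a heavily trimmed WSDL 1.1 document with Python's standard XML parser. The `MathService` document, its `http://example.org/math` namespace, and the helper `list_operations` are assumptions made for this example; real WSDL files also carry types, bindings, and service sections that are omitted here.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}  # WSDL 1.1 namespace

# Hypothetical, trimmed WSDL 1.1 document (types and bindings omitted).
MATH_WSDL = """\
<definitions name="MathService"
    targetNamespace="http://example.org/math"
    xmlns:tns="http://example.org/math"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="AddRequest"/>
  <message name="AddResponse"/>
  <message name="GetValueRequest"/>
  <message name="GetValueResponse"/>
  <portType name="MathPortType">
    <operation name="add">
      <input message="tns:AddRequest"/>
      <output message="tns:AddResponse"/>
    </operation>
    <operation name="getValue">
      <input message="tns:GetValueRequest"/>
      <output message="tns:GetValueResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return {operation name: (input message, output message)}."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            inp = op.find("wsdl:input", WSDL_NS)
            out = op.find("wsdl:output", WSDL_NS)
            operations[op.get("name")] = (inp.get("message"), out.get("message"))
    return operations


for name, (request, response) in list_operations(MATH_WSDL).items():
    print(name, request, response)
```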
11Real Web Service Invocation
Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
12Web Services Server Applications
- Web service: software that exposes a set of
operations
- SOAP engine: handles SOAP requests and responses
(Apache Axis)
- Application server: provides living space for
applications that must be accessed by different
clients (Tomcat)
- HTTP server: also called a Web server, handles
HTTP messages
Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
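The split between the Web service and the SOAP engine can be sketched in a few lines of Python. This is a toy model, not how Apache Axis works internally: `MathService`, `build_request`, `soap_engine`, and the `http://example.org/math` namespace are hypothetical, and the HTTP and application-server layers are left out entirely.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/math"  # hypothetical service namespace


def build_request(operation, **params):
    """Client side: wrap an operation call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


class MathService:
    """The Web service itself: plain operations, no SOAP knowledge."""

    def add(self, a, b):
        return int(a) + int(b)


def soap_engine(request_xml, service):
    """SOAP engine role: decode the request, call the service, encode the reply."""
    body = ET.fromstring(request_xml).find(f"{{{SOAP_NS}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]
    if operation.startswith("_"):
        raise ValueError("only public operations are exposed")
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    result = getattr(service, operation)(**params)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    reply_body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    reply = ET.SubElement(reply_body, f"{{{APP_NS}}}{operation}Response")
    reply.text = str(result)
    return ET.tostring(envelope, encoding="unicode")


response_xml = soap_engine(build_request("add", a=2, b=3), MathService())
print(response_xml)
```

The service class never touches XML, which is the point of placing a SOAP engine between it and the HTTP server.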
13Let's talk about state
- Plain Web services are stateless
Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
14However, Many Grid Applications Require State
Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
15Keep the Web Service and the State Separate
- Instead of putting state in a Web service, we
keep it in a resource
- Each resource has a unique key
Borja Sotomayor, http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
16Resources Can Be Anything Stored
The address of a WS-Resource is called an endpoint
reference (EPR)
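A minimal sketch of keeping state out of the service, assuming a simplified endpoint reference that holds only a service URL and a resource key. The classes `EndpointReference`, `ResourceHome`, and `CounterService` and the URL are hypothetical; a real EPR is an XML structure defined by WS-Addressing.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointReference:
    """Simplified stand-in for an EPR: service address plus resource key."""

    service_url: str
    resource_key: str


class ResourceHome:
    """Holds the state, one resource per unique key."""

    def __init__(self):
        self._resources = {}

    def create(self, initial):
        key = uuid.uuid4().hex
        self._resources[key] = initial
        return key

    def load(self, key):
        return self._resources[key]

    def store(self, key, value):
        self._resources[key] = value


class CounterService:
    """Stateless service: every call names the resource it operates on."""

    def __init__(self, url, home):
        self.url = url
        self._home = home

    def create_counter(self):
        return EndpointReference(self.url, self._home.create(0))

    def add(self, epr, amount):
        value = self._home.load(epr.resource_key) + amount
        self._home.store(epr.resource_key, value)
        return value


service = CounterService("http://example.org/services/CounterService", ResourceHome())
first, second = service.create_counter(), service.create_counter()
service.add(first, 5)
service.add(first, 2)
service.add(second, 10)
print(service.add(first, 0), service.add(second, 0))
```

Two clients holding different EPRs talk to the same service but see independent state.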
17Web Services So Far
- Basic client-server interactions
- Stateless, but with associated resources
- Self describing using WSDL
- But what we'd really like is a common way to
- Name and do bindings
- Start and end services
- Query, subscription, and notification
- Share error messages
18Standard Interfaces
- Service information
- State representation
- Resource
- Resource Property
- State identification
- Endpoint Reference
- State Interfaces
- GetRP, QueryRPs, GetMultipleRPs, SetRP
- Lifetime Interfaces
- SetTerminationTime
- ImmediateDestruction
- Notification Interfaces
- Subscribe
- Notify
- ServiceGroups
Web Service
Client
19WSRF and WS-Notification
- Naming and bindings (basis for virtualization)
- Every resource can be uniquely referenced, and
has one or more associated services for
interacting with it
- Lifecycle (basis for fault-resilient state
management)
- Resources created by services following the factory
pattern
- Resources destroyed immediately or at a scheduled time
- Information model (basis for monitoring and
discovery)
- Resource properties associated with resources
- Operations for querying and setting this info
- Asynchronous notification of changes to
properties
- Service Groups (basis for registries and collective
services)
- Group membership rules and membership management
- Base Fault type
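The lifecycle and notification ideas listed above can be modeled with a small in-memory sketch. It is not the WSRF or WS-Notification wire protocol: `Resource`, `ResourceFactory`, the `res-N` keys, and the numeric timestamps are all assumptions made for illustration.

```python
class Resource:
    """Toy resource: properties, a termination time, and subscribers."""

    def __init__(self, **properties):
        self._properties = dict(properties)
        self._subscribers = []
        self.termination_time = None  # None means no scheduled destruction

    def get_property(self, name):
        return self._properties[name]

    def set_property(self, name, value):
        old = self._properties.get(name)
        self._properties[name] = value
        for callback in self._subscribers:
            callback(name, old, value)  # notify consumers of the change

    def subscribe(self, callback):
        self._subscribers.append(callback)


class ResourceFactory:
    """Factory pattern: the service creates and destroys resources by key."""

    def __init__(self):
        self.resources = {}
        self._next_id = 0

    def create(self, **properties):
        self._next_id += 1
        key = f"res-{self._next_id}"
        self.resources[key] = Resource(**properties)
        return key

    def destroy(self, key):
        """Immediate destruction."""
        del self.resources[key]

    def set_termination_time(self, key, when):
        """Scheduled destruction."""
        self.resources[key].termination_time = when

    def sweep(self, now):
        """Remove every resource whose termination time has passed."""
        expired = [
            key
            for key, res in self.resources.items()
            if res.termination_time is not None and res.termination_time <= now
        ]
        for key in expired:
            del self.resources[key]
        return expired


factory = ResourceFactory()
key = factory.create(status="pending")
events = []
factory.resources[key].subscribe(lambda name, old, new: events.append((name, old, new)))
factory.resources[key].set_property("status", "running")
factory.set_termination_time(key, when=100.0)
expired = factory.sweep(now=150.0)
print(events, expired)
```

Scheduled destruction gives the fault resilience mentioned above: if a client disappears without cleaning up, the resource still goes away once its termination time passes.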
20WSRF vs XML/SOAP
- The definition of WSRF means that the Grid and
Web services communities can move forward on a
common base
- Why Not Just Use XML/SOAP?
- WSRF and WS-N are just XML and SOAP
- WSRF and WS-N are just Web services
- Benefits of following the specs
- These patterns represent best practices that have
been learned in many Grid applications
- There is a community behind them
- Why reinvent the wheel?
- Standards facilitate interoperability
21WS Core Enables Frameworks: E.g., Resource Management
[Layered diagram]
- Applications of the framework (compute, network, storage provisioning, job reservation and submission, data management, application service QoS, ...)
- WS-Agreement (agreement negotiation)
- WS Distributed Management (lifecycle, monitoring, ...)
- WS-Resource Framework and WS-Notification (resource identity, lifetime, inspection, subscription, ...)
- Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, ...)
An evolution of Open Grid Services Infrastructure (OGSI)
22Globus and Web Services
User Applications
Globus WSRF Web Services
Registry and Admin
Globus Container (e.g., Apache Axis)
WS-A, WSRF, WS-Notification
WSDL, SOAP, WS-Security
Globus Core: Java, C (fast, small footprint),
Python
24Globus Security
- Extensible authorization framework based on Web
services standards
- SAML-based authorization callout
- Security Assertion Markup Language, OASIS
standard
- Often used for Web browser authentication
- Very short-lived bearer credentials
- Integrated policy decision engine
- XACML (eXtensible Access Control Markup Language)
policy language, per-operation policies, pluggable
25Delegation Service
- Higher level service
- Authentication protocol independent
- Refresh interface
- Delegate once, share across services and
invocation
[Diagram labels: Hosting Environment containing Service1, Service2, Service3, Resources, and the Delegation Service; Client; arrows labeled Delegate, Refresh, and EPR]
Rachana Ananthakrishnan
26Delegation
- Secure Conversation
- Can delegate as part of protocol
- Extra round trip with delegation
- Types: Full or Limited delegation
- Delegation Service is the preferred way of delegating
- Secure Message and Secure Transport
- Cannot delegate as part of protocol
Rachana Ananthakrishnan
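The "delegate once, share across services" idea and the refresh interface can be modeled roughly as follows. This is a toy sketch, not the Globus Delegation Service API: the class and method names are hypothetical, the EPR is reduced to an opaque string, and credentials are plain strings standing in for real proxy credentials.

```python
import itertools


class DelegationService:
    """Toy delegation service: stores delegated credentials behind opaque keys."""

    def __init__(self):
        self._credentials = {}
        self._ids = itertools.count(1)

    def delegate(self, credential):
        epr = f"delegation-{next(self._ids)}"  # stand-in for a real EPR
        self._credentials[epr] = credential
        return epr

    def refresh(self, epr, credential):
        # Every service holding this EPR sees the refreshed credential.
        self._credentials[epr] = credential

    def lookup(self, epr):
        return self._credentials[epr]


class Service:
    """A service in the same hosting environment that uses the shared credential."""

    def __init__(self, name, delegation):
        self.name = name
        self._delegation = delegation

    def invoke(self, epr):
        return f"{self.name} acting with {self._delegation.lookup(epr)}"


delegation = DelegationService()
services = [Service(n, delegation) for n in ("Service1", "Service2", "Service3")]
epr = delegation.delegate("proxy-v1")  # the client delegates once
before = [s.invoke(epr) for s in services]
delegation.refresh(epr, "proxy-v2")  # one refresh reaches all services
after = [s.invoke(epr) for s in services]
print(before, after)
```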
27Globus's Use of Security Standards
[Diagram annotations for the three security options shown: "Supported, but slow"; "Supported, but insecure"; "Fastest, so default"]
28Monitoring and Discovery System (MDS4)
- Grid-level monitoring system
- Aid user/agent to identify host(s) on which to
run an application
- Warn on errors
- Uses standard interfaces to provide publishing of
data, discovery, and data access, including
subscription/notification
- WS-ResourceProperties, WS-BaseNotification,
WS-ServiceGroup
- Functions as an hourglass to provide a common
interface to lower-level monitoring tools
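The hourglass role can be sketched as a narrow common interface sitting over monitoring sources that each report in their own format. Everything below is hypothetical: the `Index` class, the two adapters, the host names, and the native formats are invented for illustration and do not reflect the MDS4 API or any specific monitoring tool.

```python
def idle_cpu_adapter(raw):
    """Adapter for a hypothetical tool that reports {'cpus_idle': n}."""
    return {"free_cpus": raw["cpus_idle"]}


def text_adapter(raw):
    """Adapter for a hypothetical tool that reports a 'free=n' string."""
    return {"free_cpus": int(raw.split("=", 1)[1])}


class Index:
    """Narrow waist of the hourglass: one schema and one query call."""

    def __init__(self):
        self._sources = {}  # host -> (fetch function, adapter)

    def register(self, host, fetch, adapter):
        self._sources[host] = (fetch, adapter)

    def query(self, predicate):
        """Return hosts whose normalized metrics satisfy the predicate."""
        matches = []
        for host, (fetch, adapter) in self._sources.items():
            if predicate(adapter(fetch())):
                matches.append(host)
        return sorted(matches)


index = Index()
index.register("hostA", lambda: {"cpus_idle": 8}, idle_cpu_adapter)
index.register("hostB", lambda: "free=2", text_adapter)
index.register("hostC", lambda: "free=16", text_adapter)

# A user or agent picks hosts on which to run an application.
candidates = index.query(lambda metrics: metrics["free_cpus"] >= 4)
print(candidates)
```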
29Taverna
A sample caGrid workflow
caGrid Scavenger with semantic/metadata based
caGrid service query
30Sample Workflow with caDSR
- Scientific value
- To find all the UML packages related to a given
context (caCore).
- Not a real scientific experiment.
- Simple.
- Important in caGrid.
- Steps
- Query the Project object.
- Transform the data.
- Query the Packages object and get the result.
Workflow input
caGrid services
Shim services
Workflow output
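The three steps above, with a shim between the two queries, can be sketched as a small pipeline. The functions are local stubs with made-up records; they do not call caDSR or any caGrid service, and the identifiers (`project-1`, `pkg.alpha`, and so on) are placeholders.

```python
def query_projects(context):
    """Stub for the first data-service query (made-up records)."""
    projects = [
        {"id": "project-1", "context": "caCore"},
        {"id": "project-2", "context": "other"},
        {"id": "project-3", "context": "caCore"},
    ]
    return [p for p in projects if p["context"] == context]


def shim_extract_ids(project_records):
    """Shim service: reshape one service's output into the next one's input."""
    return [record["id"] for record in project_records]


def query_packages(project_id):
    """Stub for the second data-service query (made-up records)."""
    packages = {
        "project-1": ["pkg.alpha", "pkg.beta"],
        "project-3": ["pkg.gamma"],
    }
    return packages.get(project_id, [])


def run_workflow(context):
    """Workflow input -> query -> shim -> query -> workflow output."""
    project_ids = shim_extract_ids(query_projects(context))
    return [pkg for pid in project_ids for pkg in query_packages(pid)]


result = run_workflow("caCore")
print(result)
```

The shim carries no scientific logic; it exists only because the output of one service does not match the input format of the next.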
31Protein sequence information query
- Scientific value
- To query protein sequence information from three
caGrid data services: caBIO, CPAS and GridPIR.
- To analyze a protein sequence from different data
sources.
- Steps
- Query CPAS to get the id, name, and value of the
sequence.
- Query caBIO and GridPIR using the id or name
obtained from CPAS.
32Microarray clustering
- Scientific value
- A common routine to group genes or experiments
into clusters with similar profiles.
- To identify functional groups of genes.
- Steps
- Querying and retrieving the microarray data of
interest from a caArrayScrub data service at
Columbia University
- Preprocessing, or normalizing, the microarray data
using the GenePattern analytical service at the
Broad Institute at MIT
- Running hierarchical clustering using the
geWorkbench analytical service at Columbia
University
Workflow in/output
caGrid services
Shim services
others
Wei Tan, Ravi Madduri, Kiran Keshav, Baris E.
Suzek, Scott Oster, Ian Foster. Orchestrating
caGrid Services in Taverna. ICWS 08.
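To make the normalize-then-cluster steps concrete, here is a small local stand-in written in plain Python. It does not reproduce the GenePattern or geWorkbench services: z-score normalization, Euclidean distance, and average linkage are assumptions chosen for illustration, and the four expression profiles are made up.

```python
import math
from statistics import mean, pstdev


def normalize(profile):
    """Z-score one profile so clustering compares shape, not scale.

    Assumes the profile is not constant (pstdev would be zero).
    """
    mu, sigma = mean(profile), pstdev(profile)
    return [(x - mu) / sigma for x in profile]


def hierarchical_clusters(profiles, k):
    """Agglomerative clustering with average linkage, stopped at k clusters."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Average Euclidean distance over all cross-cluster pairs.
        return mean(math.dist(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > k:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Made-up expression profiles: two rising genes and two falling genes.
raw = {
    "geneA": [1, 2, 3, 4],
    "geneB": [10, 20, 30, 40],
    "geneC": [4, 3, 2, 1],
    "geneD": [40, 30, 20, 10],
}
normalized = {name: normalize(values) for name, values in raw.items()}
groups = hierarchical_clusters(normalized, k=2)
print(groups)
```

Normalizing first is what lets `geneA` and `geneB` group together despite a tenfold difference in scale.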
33Execution trace
Execution result as xml
1936 gene expressions
34Lymphoma type prediction
- Scientific value
- Using gene-expression patterns associated with
DLBCL and FL to predict the lymphoma type of an
unknown sample.
- Using SVM (Support Vector Machine) to classify
data, and predicting the tumor types of unknown
examples.
- (Major) steps
- Querying training data from experiments stored in
caArray.
- Preprocessing, or normalizing, the microarray data.
- Adding training and testing data into the SVM service
to get the classification result.
Fig. from MA Shipp. Diffuse large B-cell
lymphoma outcome prediction by gene-expression
profiling and supervised machine learning. Nature
medicine, 2002
35Querying, Preprocessing, Classifying and Predicting
36Lymphoma type prediction
Classification errors are highlighted.
Acknowledgement: Juli Klemm, Xiaopeng Bian,
Rashmi Srinivasa (NCI); Jared Nedzel (MIT)
37Lessons Learned
- Service abstraction not applicable to everything
- Virtual Organization concepts still good
- Web services are one way to create service-oriented
architectures, but not always the best
way
- Make implementation agnostic of tools underneath
- True value in ability to create workflows
38Service-Oriented Science
- People create services (data or functions)
- which I discover (and decide whether to use)
- and compose to create a new function ...
- and then publish as a new service.
- I find someone else to host services, so I
don't have to become an expert in operating
services and computers!
- I hope that this someone else can manage
security, reliability, scalability, ...!
Service-Oriented Science, Science, 2005
39Questions?