Title: Building Science Gateways
1Building Science Gateways
- Marlon Pierce
- Community Grids Laboratory
- Indiana University
2What Is a Web Portal?
- Web container that aggregates content from multiple sources into a single display.
- Start Pages
- Typically consume RSS/Atom news feeds.
- More powerful versions these days support Flickr, calendars, games, etc.
- Gadgets, widgets
- Examples: iGoogle, Netvibes, My Yahoo!
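The aggregation idea above can be sketched in a few lines. This is only an illustration, not how any of the named portals work: the two inline RSS documents, the example.org links, and the function names `aggregate` and `render` are all made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Two tiny inline RSS 2.0 documents stand in for remote feeds (made-up data).
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Job queue status</title><link>http://example.org/a/1</link></item>
</channel></rss>"""
FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>New data set posted</title><link>http://example.org/b/1</link></item>
<item><title>Maintenance window</title><link>http://example.org/b/2</link></item>
</channel></rss>"""


def aggregate(feeds):
    """Merge items from several RSS documents into one list of (source, title, link)."""
    merged = []
    for xml_text in feeds:
        channel = ET.fromstring(xml_text).find("channel")
        source = channel.findtext("title")
        for item in channel.findall("item"):
            merged.append((source, item.findtext("title"), item.findtext("link")))
    return merged


def render(entries):
    """Render the merged entries as one plain-text 'start page'."""
    return "\n".join(f"[{src}] {title} <{link}>" for src, title, link in entries)


if __name__ == "__main__":
    print(render(aggregate([FEED_A, FEED_B])))
```

A real start page would fetch the feeds over HTTP and render HTML, but the shape is the same: many sources in, one display out.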
3Grid Computing Overview
- Grid computing software is designed to integrate large supercomputing facilities.
- TeraGrid, Open Science Grid, EGEE, etc.
- This is done via network services.
- Key Service Components
- Authentication and authorization framework (MyProxy)
- Remote process access and control (GRAM, Condor)
- Remote file, I/O access (GridFTP)
- Additional Services
- Information services, replica management, database federation, storage management, schedulers, etc.
- Example Grid Software Stacks: CTSS and VDT
4TeraGrid Supercomputing Resources (GPIR)
5Science Portals and Gateways
- Science Gateways adapt Web portal technology to build user interfaces to the Grid.
- Science portals resemble standard portals, but must also
- Support access to computing and storage resources.
- Allow users remote, Unix-like access to these resources.
- Provide access to science applications and data sets.
- And we must provide value-added services as well as user interfaces.
6My 2002 octopus SOA diagram, from the archives.
[Diagram: Browser Interface connects over HTTP(S) to Portlets and Client Stubs, which call WSDL-described services over SOAP/HTTP. Services shown: DB Service, Job Submission/Monitoring and File Services, Visualization Service. Behind them: JDBC to databases, and the operating and queuing systems on Hosts 1 to 3.]
7Terminology
- Portlet: a standard Java component that generates HTML and can also act as a client to a remote service.
- Lives in a portal container.
- I will also use this term generically.
- Web Service: a remotely invocable function on the Internet.
- SOAP: the XML message envelope for carrying commands over HTTP.
- WSDL: describes the service's API in XML.
- REST: a variation of this approach.
- Lots more info: http://grids.ucs.indiana.edu/ptliupages/presentations/I590WebService.ppt
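To make the SOAP envelope concrete, here is a minimal sketch that builds one and reads it back. The envelope namespace is the standard SOAP 1.1 one. The application namespace `urn:example:gateway`, the operation name, and the helper names `build_envelope` and `read_call` are assumptions made for the illustration; they do not come from any of the services in this talk.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:gateway"  # hypothetical application namespace


def build_envelope(operation, params):
    """Wrap one remote call (operation name plus parameters) in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_call(xml_text):
    """Server-side view: pull the operation name and parameters back out."""
    body = ET.fromstring(xml_text).find(f"{{{SOAP_NS}}}Body")
    call = body[0]  # the first child of Body is the call element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    return operation, {child.tag: child.text for child in call}


if __name__ == "__main__":
    message = build_envelope("submitJob", {"executable": "/bin/ls", "host": "example.org"})
    print(message)
    print(read_call(message))
```

In practice a toolkit generates this plumbing from the WSDL, so client code only sees an ordinary function call.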
8But Why?
- Three-tiered Service Oriented Architecture is the network equivalent of the famous Model-View-Controller design pattern.
- View: the user interface components.
- Controller: Web service middleware.
- Model: the backend resources.
- Independence of tiers gives flexibility.
- Services can be reused with alternative user interfaces.
- Workflow composers like Taverna
- User interfaces can work with different service implementations.
- Drawback: reliability and robustness are issues.
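The flexibility argument can be shown with a toy example: two views share one service interface, and either view works with either service implementation. Every class and function name here is hypothetical, and the in-memory dictionary merely stands in for a backend resource.

```python
from typing import Protocol


class JobService(Protocol):
    """The middle-tier contract that every view is written against."""

    def status(self, job_id: str) -> str: ...


class InMemoryJobService:
    """One service implementation; the dict stands in for the backend (Model)."""

    def __init__(self, jobs):
        self._jobs = dict(jobs)

    def status(self, job_id):
        return self._jobs.get(job_id, "UNKNOWN")


class AlwaysDoneService:
    """A second implementation, for example a test double."""

    def status(self, job_id):
        return "DONE"


def text_view(service, job_id):
    """A plain-text user interface."""
    return f"{job_id}: {service.status(job_id)}"


def html_view(service, job_id):
    """An alternative user interface over the same service."""
    return f"<li>{job_id}: <b>{service.status(job_id)}</b></li>"


if __name__ == "__main__":
    for service in (InMemoryJobService({"job-1": "RUNNING"}), AlwaysDoneService()):
        print(text_view(service, "job-1"))
        print(html_view(service, "job-1"))
```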
9Two Approaches to the Middle Tier
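[Diagram: two stacks compared. Fat Client: the Portal Client and Grid Client sit together and speak the Grid Protocol (SOAP) to a Grid Service in front of the Backend Resource. Thin Client: the Portal Client speaks HTTP/SOAP to a Web Service, whose Grid Client speaks the Grid Protocol (SOAP) to a Grid Service in front of the Backend Resource.]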
10Disloc output converted to KML and plotted.
11GeoFEST Finite Element Modeling portlet and
plotting tools
12What's In the Screenshots?
- GeoFEST and Disloc Portlets
- Live on gf7.ucs.indiana.edu
- Manage the user's display: Web forms, links to output, graphics.
- Save user session state persistently.
- QuakeTables Fault DB Web Service
- Lives on gf2.ucs.indiana.edu
- Contains geometric fault models.
- GeoFEST and Disloc Execution Web Services
- Live on gf19.ucs.indiana.edu
- Generate input files from fault models.
- Run and manage codes.
13Best Practice for Scientific Web Services
- There are many tools to choose from.
- .NET, Apache Axis, Sun WS, Ruby on Rails, etc.
- Make them self-contained.
- If possible, generate input files within the service.
- Or have an input file generating service.
- Remember that they may be used by other people with other client tools.
- Communicate data files with URLs.
- Be very careful about exposing the state of the service.
- Don't assume persistent connections.
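A rough sketch of these practices, under stated assumptions: the service builds its own input file from a structured request, keeps nothing between calls, and hands back URLs rather than file contents. The function names, file names, and the "key value" input format are invented for the example. Local `file://` URLs stand in for the HTTP URLs a deployed service would return, and counting lines stands in for running a real code.

```python
import tempfile
from pathlib import Path


def generate_input(fault_model):
    """Build the input-file text inside the service, from a structured request."""
    return "\n".join(f"{key} {fault_model[key]}" for key in sorted(fault_model)) + "\n"


def run_service(fault_model, work_dir):
    """Handle one self-contained request; everything the caller needs is in the reply."""
    work_dir = Path(work_dir).resolve()
    input_path = work_dir / "model.input"
    output_path = work_dir / "model.output"
    input_path.write_text(generate_input(fault_model))
    # Placeholder for running the real application: just count the input lines.
    line_count = len(input_path.read_text().splitlines())
    output_path.write_text(f"lines={line_count}\n")
    # Reply with URLs to the data files, not the data itself.
    return {"input_url": input_path.as_uri(), "output_url": output_path.as_uri()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(run_service({"depth": 10, "strike": 45}, scratch))
```

Because the reply carries URLs, any client (a portlet, a workflow tool, a script) can fetch the files later without holding a connection open.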
14Components for Portals
- Open Grid Computing Environments Examples. See http://www.collab-ogce.org/
15Components for Science Portals
- OGCE is founded on the principle that portals should be built out of reusable parts.
- Key standard in our first phase: the JSR 168 portlet specification.
- Portlets can run in multiple containers.
- uPortal, Sakai, GridSphere, Liferay, etc.
- Allows us to build Grid-specific components and deploy them alongside other goodies: Sakai collaboration tools, contributed portlets, etc.
- Future: OpenSocial-compliant Google Gadgets
16OGCE GPIR portlet can interoperate with TeraGrid
and your own GPIR services.
17Manage TeraGrid MyProxy credentials with the OGCE
ProxyManager portlets.
18OGCE file management client portlets interact
with TeraGrid GridFTP servers.
19General-purpose batch and interactive job submission to GRAM and WS-GRAM is supported.
20Dashboard Portlet
The dashboard portlet allows users to track jobs
on the selected resource. The user can view
either his own set of jobs or get information on
all submitted jobs.
22Queue forecasting portlets work with the NWS
QBETS to predict wait times and deadlines.
23PURSe portlets manage user requests for portal
accounts and Grid credentials.
24Condor and Condor-G
25OGCE IFrame Portlet can be used to integrate
external sites.
26Client Libraries for Grid Computing
27Two Major Grid Client Efforts
- The Java CoG Kit
- Supports several versions of Globus and SSH.
- Condor-G
- Has a Web Service interface (BirdBath) and Java client libraries.
- Supports Globus (v2 and v4) and several other Grid middleware systems.
- You can build either portlets or Web services with either of these.
- OGCE portlets primarily use CoG.
- We prefer Condor-G based Web services for long-running jobs.
28CoG Abstraction Layers
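[Diagram: CoG abstraction layers. Top: Development Support, Portals, and Applications (Nano materials, Bioinformatics, Disaster Management). Bottom: the providers behind the abstraction layer: GT2, GT3 (X), GT4 WS-RF, Condor, Unicore, SSH, Others.]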
29Task
[Class diagram labels: Task, Task Handler, Task Specification, Service, Security Context, Service Contact.]
The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).
Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.
30Coupling CoG Tasks
- The CoG abstractions also simplify creating coupled tasks.
- Tasks can be assembled into task graphs with dependencies.
- "Do Task B after successful Task A"
- Graphs can be nested.
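The task-graph idea can be sketched as follows. This is not the Java CoG Kit API: the `Task` and `TaskGraph` classes below are hypothetical stand-ins written for this illustration. A task runs only after all of its predecessors succeeded, and a graph exposes the same `run` method as a task, so one graph can be a node in another.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Task:
    """A named unit of work; action() returns True on success."""

    def __init__(self, name, action):
        self.name = name
        self.action = action

    def run(self, log):
        ok = bool(self.action())
        log.append((self.name, ok))
        return ok


class TaskGraph:
    """Tasks plus 'run B only after A succeeded' dependencies; nestable."""

    def __init__(self, name):
        self.name = name
        self._nodes = {}
        self._deps = {}

    def add(self, node, after=()):
        """Register a Task or TaskGraph; 'after' names already-added predecessors."""
        self._nodes[node.name] = node
        self._deps[node.name] = set(after)

    def run(self, log):
        succeeded = set()
        # static_order() yields every node after all of its predecessors.
        for name in TopologicalSorter(self._deps).static_order():
            # Skip a node if any predecessor failed or was itself skipped.
            if self._deps[name] <= succeeded and self._nodes[name].run(log):
                succeeded.add(name)
        ok = succeeded == set(self._nodes)
        log.append((self.name, ok))
        return ok


if __name__ == "__main__":
    log = []
    transfer = TaskGraph("transfer")
    transfer.add(Task("stage_in", lambda: True))
    transfer.add(Task("stage_out", lambda: True), after=["stage_in"])
    workflow = TaskGraph("workflow")
    workflow.add(transfer)  # a nested graph
    workflow.add(Task("run_job", lambda: True), after=["transfer"])
    workflow.run(log)
    print(log)
```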
31Problems with Grid Client Development
- Grid portlets typically wrap each single Grid capability in a separate portlet.
- Problem is that Grid portlets need to combine these operations.
- Portlets are entire web applications, so we need a component model for portlets: reusable portlet parts.
- Even with the CoG Abstraction Layer, we must still do a lot of coding to build new applications.
- To address these problems we have adopted JavaServer Faces (JSF).
- Provides several nice Model-View-Controller features.
- JSF provides an extensible framework (tag libraries) for making reusable components.
- Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).
32GTLAB Example
[Tag markup lost in extraction; only these attribute values survive.]
- First tag: hostname gf1.ucs.indiana.edu, port 7512, lifetime 2, username mnacar, password (value not shown)
- Second tag: hostname cobalt.ncsa.teragrid.org, provider GT4, executable /bin/ls, stdout tmp/result, stderr tmp/error
34Managing Scientific Workflows
35Scientific Workflows
- Portal interfaces encode scientific use cases.
- If you have a rich set of services, it is a lot of work to make portlets for all possible use cases.
- And power users will always want something more.
- Example: our CICC project has dozens of chemical informatics Web services.
- http://www.chembiogrid.org/wiki/
- Workflow composers can simplify this.
- Allow users to encode and execute their own use cases.
36Web Services and Workflows
- Perform a similarity search on the NIH DTP Human Tumor data.
- Filter the results based on pharmacokinetic properties (FILTER).
- Convert to 3D (OMEGA).
- Dock into a pre-defined protein (FRED).
- Visualize (JMOL).
Taverna workflow connects remote services.
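The chaining that a workflow composer performs can be sketched as a list of steps, each consuming the previous step's output. The functions below are local stand-ins with made-up data and made-up scoring. They are not the real FILTER, OMEGA, or FRED services, and all names are hypothetical.

```python
from functools import reduce


def similarity_search(query):
    """Stand-in for a remote search service; the compound records are made up."""
    library = [
        {"id": "cmpd-1", "score": 0.91, "weight": 320.0},
        {"id": "cmpd-2", "score": 0.42, "weight": 610.0},
        {"id": "cmpd-3", "score": 0.88, "weight": 450.0},
    ]
    return [c for c in library if c["score"] >= query["min_score"]]


def property_filter(candidates):
    """Stand-in for a property filter: keep compounds under a weight cutoff."""
    return [c for c in candidates if c["weight"] <= 500.0]


def to_3d(candidates):
    """Stand-in for 3D conversion: just tag each record."""
    return [{**c, "has_3d": True} for c in candidates]


def dock(candidates):
    """Stand-in for docking: rank by an invented score, best first."""
    scored = [{**c, "dock_score": round(c["score"] * 10, 1)} for c in candidates]
    return sorted(scored, key=lambda c: c["dock_score"], reverse=True)


def run_workflow(steps, data):
    """Feed each step's output into the next step, in order."""
    return reduce(lambda value, step: step(value), steps, data)


if __name__ == "__main__":
    steps = [similarity_search, property_filter, to_3d, dock]
    for hit in run_workflow(steps, {"min_score": 0.5}):
        print(hit)
```

A composer such as Taverna lets the user assemble the `steps` list graphically and swaps each local function for a call to a remote service.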
37OGCE's XBaya Workflow Composer
38Future of Science Gateways
39Updating the Octopus
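[Diagram: the octopus updated. Browser Interface connects over HTTP(S) to Social Gadgets and AJAX clients, which call services over RSS and JSON/HTTP. Most service interfaces are now labeled REST, with one still labeled WSDL. Services shown: DB Service, Job Submission/Monitoring and File Services, Visualization Service. Behind them: JDBC to databases, and the operating and queuing systems on Hosts 1 to 3.]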
41Microformats, KML, and GeoRSS feeds used to
deliver SAR data to multiple clients.
42More Information
- Contact me: mpierce_at_cs.indiana.edu
- See what I'm up to: http://communitygrids.blogspot.com/
- OGCE software: http://collab-ogce.org/
- QuakeSim: http://www.quakesim.org/
- CICC: http://www.chembiogrid.org/wiki/
- Lots of people worked on all of these.