Title: Computational Science Portals: The SDSC Grid Portal Toolkit (GridPort)
1 Computational Science Portals: The SDSC Grid Portal Toolkit (GridPort)
- Mary Thomas
- Scientific Computing Department
- SDSC
- (mthomas_at_sdsc.edu)
2 GridPort Team
- Mary Thomas (project manager): background in physics (lasers, high-temperature plasma diagnostics) and computer science (Grid computing, high-performance Java, a little HPC)
- Software Development Team:
- Steve Mock: interactive/Globus/apps support
- Kurt Mueller: informational/SRB/database
- Maytal Dahan: accounts/Globus/apps
- Student Interns:
- Cathie Mills, Ray Regno, Chris Garsha, Kathy Seyama
3 Why Use Portals for Computational Science?
- Computational science environment is complex
- Users have access to a variety of distributed resources (compute, storage, etc.)
- Interfaces to these resources vary and change often
- Policies at sites sometimes differ
- Using multiple resources can be cumbersome
- Portals can provide simple interfaces
- Portals are web based, and that has advantages:
- Users know and understand the web
- Can serve as a layer in the middle-tier infrastructure of the Grid
- Users can be isolated from resource-specific details
- Single web interface isolates system changes/differences
- Not an end-all solution
- several issues/challenges here
4 Interactive HotPage View
5 Informational Services
- Vertical portal to NPACI Resources and Services
- News/events within NPACI
- Documentation, training, news, consulting
- Simple tools
- application search, systems information
- generation of batch scripts for all compute resources
- Network Weather Service (NWS)
- Provides dynamic information
- real-time information for each machine (or summaries) such as:
- Status Bar: live updates/operational status/utilization
- Machine Usage: summary of machine status, load, queues
- Queue Summaries: displays currently executing and queued jobs
- Node Maps: graphical map of running applications mapped to nodes
- Network Weather Service: connectivity information between a user's local host and grid resources
6 Interactive Services
- Users have direct access to accounts on resources
- single entry point to all NPACI resources on which a user has accounts/allocations
- Requires portal account and authentication
- secure access to compute and storage resources (GSI)
- Standard menus for each machine
- allows user to perform common Unix tasks
- create, submit, monitor, cancel or delete jobs
- view output
- compile and execute code
- manipulate and view files, navigate through file systems
- use system commands: chmod, mv, ls, cat, mkdir, cp, rm
- perform file transfer
- upload/download/archive files
- archiving and retrieving data between local host and HPC system
- managing accounts and allocations (via Webnewu)
7 The GridPort Toolkit
- Based on the architecture developed for the NPACI HotPage
- Focus on computational scientists and application developers
- Composed of a set of simple, modular services and tools
- Supports development of application-level, customized science portals
- Facilitates seamless web-based access to distributed compute resources and grid services
- Built with commodity technologies
- Sits on top of the middle tier of the Grid
- An interface to these services for the web
8 Web Server to HPC Resource Architecture
9 Applications running on GridPort
- Current applications in production
- NPACI/PACI/NCSA HotPage (live demo of HotPage)
- https://hotpage.npaci.edu, https://genie.npaci.edu
- LAPK Portal: Pharmacokinetic Modeling (live demo of Pharmacokinetic Modeling Portal)
- https://gridport.npaci.edu/LAPK
- GAMESS (General Atomic and Molecular Electronic Structure System)
- an ab initio molecular program originally developed through the National Resource for Computational Chemistry
- https://gridport.npaci.edu/GAMESS
- Application portals under development (Fall deployment)
- Telescience (Ellisman), https://gridport.npaci.edu/Telescience
- QMView: computational 3-D molecular modeling/visualization
- National Biomedical Computation Resource: Cardiac Physiology modeling project
10 GridPort Toolkit Design Concepts
- Key design idea
- Any site should be able to host a user portal
- Any user should be able to create their own user portal if they have accounts and a certificate
- Key Requirements
- Base software design on infrastructure provided by the World Wide Web
- use commodity technologies wherever possible
- avoid shell programs/applications/applets
- GridPort Toolkit should not require that additional services be run on the HPC systems
- reduce complexity -- there are enough of these already
- so, leverage existing grid research and development
- GSI certificate (considering Kerberos, secure ID)
11 Grid Security at all Layers
- GSI authentication for all portal services
- transparent access to the grid via GSI infrastructure
- Security between the client -> web server -> grid
- SSL/RC4-40 128-bit key / SSL RSA X.509 certificate
- authentication tracked with cookies coupled to server database/session tracking
- Single login to portal services provides access to all NPACI Resources where GSI is available
- with full account access privileges for specific host
- use cookies to track state; exploring other mechanisms
- Globus used for client requests on resources
- Use GSI-enabled SSH/FTP as a backup
- Use when need to avoid overhead of Globus gatekeeper
- useful for limited portal services if Globus is down/unavailable
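The cookie-plus-server-database tracking described on this slide could look roughly like the Python sketch below. It is an illustration only, not GridPort code (GridPort itself is Perl/CGI): the class name `SessionStore`, the one-hour lifetime, and the in-memory dictionary standing in for the server database are all assumptions.

```python
import secrets
import time

SESSION_TTL_SECONDS = 3600  # assumed lifetime; the slides give no value


class SessionStore:
    """Server-side table mapping opaque cookie tokens to login state."""

    def __init__(self, ttl=SESSION_TTL_SECONDS):
        self.ttl = ttl
        self._sessions = {}  # stands in for the server database

    def login(self, username, now=None):
        """Record a login and return the token to place in the cookie."""
        now = time.time() if now is None else now
        token = secrets.token_hex(16)  # random and unguessable
        self._sessions[token] = {"user": username, "expires": now + self.ttl}
        return token

    def lookup(self, token, now=None):
        """Return the username for a valid token, or None."""
        now = time.time() if now is None else now
        entry = self._sessions.get(token)
        if entry is None:
            return None
        if entry["expires"] <= now:
            del self._sessions[token]  # expired: forget the session
            return None
        return entry["user"]

    def logout(self, token):
        """Drop the session; unknown tokens are ignored."""
        self._sessions.pop(token, None)
```

The cookie carries only the random token; the user identity and expiry stay on the server, so a client cannot alter its own login state by editing the cookie.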
12 GridPort Based on Commodity Web Technologies
- Use of commodity web technologies -> Portability
- contribute to a plug-n-play grid
- Requirements
- Communicator and IE (4.0 or greater)
- HTTP, HTTPS, SSL, HTML/JavaScript, Perl/CGI, SSH, FTP
- Netscape or Apache servers
- Because it is based on simple technology, this software is easily ported to, and used by, other sites.
- Needs to also be easy to modify and adapt to local site policies and requirements
- Goal is to design a toolkit that is simple to implement, support, and develop
13 GridPort Architecture
14 Portal Services Provided Via GridPort
- Current features (always adding more)
- Authentication: login/logout to grid services
- jobs
- web-based batch script builders
- submit jobs to queues
- monitor jobs and track them
- files
- dir listing, file transfer/archival
- file upload/download
- command execution
- any UNIX commands
- accounts
- webnewu
- unix commands (reslist)
15 What Can You Do With GridPort Client Tools?
- The GridPort Client Tools (GCT) provide application developers with the ability to create their own portals.
- Key Goal: Clients just need to learn some basic HTML, then focus on science.
- Features
- Application website can be located on any server.
- Connection to portal services is through the GCT
- https://portals.npaci.edu/client/tools/FUNCTIONS
- Clients do NOT
- Have to install complex code to get started
- no webservers, no Globus, no SSH, no SSL, no PKI, etc.
- Have to write complex interface scripts to access these services (we've done that already)
- Takes advantage of the Service Portal (SP)
- Full security model from the GridPort Toolkit
- Connect to ALL PACI resources (expanding to workstations and clusters in 2001)
- Portal user account services (personalization, etc.)
16 What Can You Do With GridPort Client Tools?
- Limitations
- GCT V1.0: very low-level set of features because of the complexities of managing the HTTP/HTML/CGI environment
- How do you control webpage redirection?
- Solutions
- Extend current set of client scripts to allow more client variable controls
- Develop Portal-to-Portal communication (avoids complexities of webpage redirection)
- Client and server-side XML
- Goal
- Users create portals with Genie-level capabilities (2002)
17 GridPort model: two kinds of portals
- Application Portals (AP)
- Client HTML pages written by users
- perform computational science tasks
- Service Portals (SP)
- used by APs to get the job done
- connect clients to Grid services, provide security, etc.
- How?
- Client AP calls CGI code residing on SP
- Client uses simple HTTP <FORM ACTION=URL>
- Clients assign values to pre-defined hidden tags
- Pass data to the CGI scripts
- User login, machine, file name
- URLs to redirect to when tasks are done
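To make the AP-to-SP hand-off concrete, the Python sketch below builds the form-encoded data that a browser would send when an Application Portal page submits its form to a Service Portal CGI script. The field names are the ones used in the login form example on slide 19. The function name `build_login_request`, the `example.org` URLs, and the credentials are placeholders invented for illustration. Nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs


def build_login_request(username, passphrase, app_name, success_url, error_url):
    """Return the form-encoded body an Application Portal page would submit."""
    fields = {
        "AUTH_USERNAME": username,
        "AUTH_PASSPHRASE": passphrase,
        # Hidden fields: identify the calling portal and say where the
        # Service Portal should send the user once the task is done.
        "PORTAL_APP_NAME": app_name,
        "AUTH_LOGIN_SUCCESS_URL": success_url,
        "AUTH_LOGIN_ERROR_URL": error_url,
    }
    return urlencode(fields)


body = build_login_request(
    "demo_user",                      # placeholder credentials
    "not-a-real-passphrase",
    "GCT_EXAMPLES",
    "https://example.org/login_success.html",  # placeholder URLs
    "https://example.org/login_error.html",
)

# The CGI side would decode the same body back into named fields.
decoded = parse_qs(body)
```

Because the redirect URLs travel with every request, the Service Portal can return the user to the right client page without needing per-portal configuration for it.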
18 GridPort Client Tools: How Do They Work?
FORM/CGI action
19 GridPort Client Tools: Using Form Elements
<FORM ACTION="https://portals.npaci.edu/client/tools/auth/cgi-bin/login.cgi" target="demoDataFrame">
Username: <INPUT type="text" name="AUTH_USERNAME">
Passphrase: <INPUT type="password" name="AUTH_PASSPHRASE">
Use MyProxy: <INPUT type="checkbox" name="AUTH_TYPE_MYPROXY" value="true">
<!-- REQUIRED: USED BY SERVICE PORTAL TO TRACK YOUR PORTAL. MUST BE UNIQUE AND THE SAME FOR ALL REQUESTS FROM YOUR PORTAL -->
<INPUT type="hidden" name="PORTAL_APP_NAME" value="GCT_EXAMPLES">
<!-- REQUIRED: THIS IS WHERE THE SERVICE PORTAL WILL RETURN THE USER AFTER LOGIN. THIS CAN BE A WELCOME MESSAGE -->
<INPUT type="hidden" name="AUTH_LOGIN_SUCCESS_URL" value="https://portals.npaci.edu/client/examples/auth/login_success.html">
<!-- REQUIRED: THIS IS WHERE THE SERVICE PORTAL WILL RETURN THE USER IN THE CASE OF A LOGIN ERROR -->
<INPUT type="hidden" name="AUTH_LOGIN_ERROR_URL" value="https://portals.npaci.edu/client/examples/auth/login_error.html">
</FORM>
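The form marks three hidden fields as REQUIRED. As a rough illustration of what that implies for the receiving script, the Python sketch below checks a submitted form for missing fields and picks the redirect target. This is not the actual `login.cgi` logic, which the slides do not show. The helper names `missing_fields` and `next_url` are hypothetical, and treating the username and passphrase as required is an assumption.

```python
# The three hidden fields are marked REQUIRED on the slide; requiring the
# username and passphrase as well is an assumption of this sketch.
REQUIRED_FIELDS = (
    "AUTH_USERNAME",
    "AUTH_PASSPHRASE",
    "PORTAL_APP_NAME",
    "AUTH_LOGIN_SUCCESS_URL",
    "AUTH_LOGIN_ERROR_URL",
)


def missing_fields(form):
    """Return the required field names that are absent or blank in `form`."""
    return [name for name in REQUIRED_FIELDS
            if not str(form.get(name, "")).strip()]


def next_url(form, login_ok):
    """Pick the redirect target that the client page supplied."""
    key = "AUTH_LOGIN_SUCCESS_URL" if login_ok else "AUTH_LOGIN_ERROR_URL"
    return form[key]
```

A client page that omits `AUTH_LOGIN_ERROR_URL` would leave the service with nowhere to send the user after a failed login, which is presumably why the slide flags these fields as mandatory.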
20 Building an Application Portal
21 How to Start Using the Tools
- All the tools use the services provided by the NPACI Service Portal
- Read on-line documentation; Download (soon)
- https://portals.npaci.edu/client/examples/download
- Client Tools are accessed at
- https://portals.npaci.edu/client/tools
- For a list of tools, see
- http://portals.npaci.edu/client/examples
22 What is Needed
- Get a user account on a PACI-funded resource
- Get a user portal account
- http://portals.npaci.edu/accounts
- You need a website: your own local site
- Access to website filespace
- HTML
- Perl/CGI
- JavaScript (just a little, if you like it)
- Download GCT examples
- Modify any links/data options that you want
- Experiment
23 Categories of GridPort Client Tools
- Authentication
- Login
- Logout
- Check authentication state
- Jobs
- Submit jobs to queues
- Cancel jobs
- Execute commands (command-line-like interface)
- Files
- Upload from local host
- Download to local host
- FTP move file
- View Portal filespace (?)
- Commands
- pwd
- cd
- whoami
- Etc.
24 GridPort Keywords: Authentication
25 GridPort Keywords: Jobs
26 GridPort Keywords: Files (general)
27 GridPort Keywords: Files (transfer)
28 GridPort Keywords: Files (manipulation)
29 GridPort Keywords: Commands
30 User Portal Collaboration
- Collaboration among
- NPACI (SDSC), Alliance (NCSA), NASA/IPG (Ames) and others (expanding)
- Developing interfaces to Grid services such as
- GSI, Globus, NWS, etc.
- GridPort, Java (GPDK)
- Arriving at agreements/plans to accept certs from collaborator CAs
- Formalizing policies/mechanisms/contacts
- Limiting to small list
- Targeting specific list of resources to share
- Collaborators choosing applications to run
- Deployment completed by SC00
31 References
- GridPort Toolkit Website
- https://gridport.npaci.edu
- Contact: Mary Thomas (mthomas_at_sdsc.edu)
- NPACI Genie User Portal
- https://genie.npaci.edu
- Account
- http://portals.npaci.edu/accounts
- Contact: Kurt Mueller (kurt_at_sdsc.edu)
- Download Example Portal (frames based)
- http://portals.npaci.edu/client/examples/demoApp/
- Contact: Steve Mock (mock_at_sdsc.edu)
32 LAPK Job Submit and Job History
33 Laboratory for Applied Pharmacokinetics (LAPK) Portal
- Users are doctors, so they need an extremely simple interface
- Must be portable: run from many countries
- Need to hide details such as
- Type of resources (T3E), file storage, batch script details, compilation, UNIX command line
- Uses gridport.npaci.edu portal services/capabilities
- File upload/download between local host/portal/HPC systems
- Job Submit
- submission (builds batch script, moves files to resource, submits jobs)
- Job tracking in the background: portal tracks jobs on system and moves results back over to portal storage when done
- Job cancel/delete
- Job History: maintains relevant job information
- Major Success!!! LAPK users can now run multiple jobs at one time using the portal.
- Not possible before because developers had to keep codes and scripts simple enough for doctors to use on the T3E
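The background job tracking described for the LAPK portal (follow each job on the system, then move results back to portal storage when it finishes) can be sketched in Python as below. This is a simplified model, not the portal's implementation: the state names, the `JobTracker` class, and the `query_state` / `fetch_results` callbacks are assumptions standing in for the real queue queries and file transfers.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: str
    state: str = "queued"          # assumed states: queued -> running -> done
    results_fetched: bool = False


@dataclass
class JobTracker:
    jobs: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # (job_id, event) records

    def submit(self, job_id):
        """Register a newly submitted job and note it in the history."""
        self.jobs[job_id] = Job(job_id)
        self.history.append((job_id, "submitted"))

    def poll(self, query_state, fetch_results):
        """One background pass: refresh states, pull results of finished jobs."""
        for job in self.jobs.values():
            if job.results_fetched:
                continue  # already moved back to portal storage
            new_state = query_state(job.job_id)
            if new_state != job.state:
                job.state = new_state
                self.history.append((job.job_id, new_state))
            if job.state == "done":
                fetch_results(job.job_id)
                job.results_fetched = True
                self.history.append((job.job_id, "results stored"))


# Usage with stand-in callbacks: a dict plays the batch system,
# a list records which results were "moved".
states = {"job-1": "running", "job-2": "done"}
fetched = []
tracker = JobTracker()
tracker.submit("job-1")
tracker.submit("job-2")
tracker.poll(states.get, fetched.append)
```

Keeping one tracker entry per job is what lets several jobs run at once without the user managing them by hand, and the history list corresponds to the Job History feature.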
34 Status of GridPort Toolkit
- Version 1.0 is in beta
- applications are running using the software
- currently, all applications must be hosted on the npaci.edu domain (security/cookies)
- complete
- file transfer tool
- generalized dynamic batch script
- Version 2.0 in planning stages
- solve challenge of tracking user states across multiple webserver domains
- enhancing security and authentication
- customization of HotPage
- scheduling jobs based on best available system
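The slides do not define "best available system", so the Python sketch below is only one possible reading. Among machines that are up and have enough free nodes, it picks the one with the fewest queued jobs and breaks ties by more free nodes. The function name, the dictionary keys, and the machine names are all hypothetical.

```python
def pick_best_system(systems, nodes_needed):
    """Return the name of the eligible system with the shortest queue.

    `systems` maps a machine name to a dict with keys
    'up' (bool), 'free_nodes' (int) and 'queued_jobs' (int).
    Returns None when no machine can run the job right now.
    """
    eligible = [
        # Sort key: fewest queued jobs first, then most free nodes.
        (info["queued_jobs"], -info["free_nodes"], name)
        for name, info in systems.items()
        if info["up"] and info["free_nodes"] >= nodes_needed
    ]
    if not eligible:
        return None
    return min(eligible)[2]


# Hypothetical status snapshot, of the kind the machine-usage and
# queue summaries could supply.
snapshot = {
    "machine_a": {"up": True, "free_nodes": 64, "queued_jobs": 12},
    "machine_b": {"up": True, "free_nodes": 32, "queued_jobs": 3},
    "machine_c": {"up": False, "free_nodes": 128, "queued_jobs": 0},
}
```

With this snapshot, a 16-node job would go to `machine_b`, while a 48-node job would fall back to `machine_a` because it is the only machine that is up with enough free nodes.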