Title: Science Gateways on the TeraGrid
1. Science Gateways on the TeraGrid
- Von Welch, NCSA
- vwelch_at_ncsa.uiuc.edu
- (with thanks to Nancy Wilkins-Diehr, SDSC for
many slides)
2. The TeraGrid Strategy
- Building a distributed system of unprecedented scale
  - 40 teraflops compute
  - 1 petabyte storage
  - 10-40 Gb/s networking
- Creating a unified user environment across heterogeneous resources
  - User software environment, user support resources
  - Created an initial community of over 500 users, 80 PIs
- Integrating new partners to introduce new capabilities
  - Additional computing, visualization capabilities
  - New types of resources: data collections, instruments
Make it extensible!
3. The TeraGrid Team
- Two major components
  - 9 Resource Providers (RPs), who provide resources and expertise
    - Seven universities
    - Two government laboratories
    - Expected to grow
  - The Grid Integration Group (GIG), which provides leadership in grid integration among the RPs
    - Led by a Director, who is assisted by an Executive Steering Committee, Area Directors, and a Project Manager
    - Includes participation by staff at each RP
- Funding now provided for people, not just networks and hardware!
4. TeraGrid Resource Partners
5. TeraGrid Resources
Sites: ANL/UC, Caltech, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC
Compute Resources: Itanium2 (0.5 TF), IA-32 (0.5 TF), Itanium2 (0.8 TF), Itanium2 (0.2 TF), IA-32 (2.0 TF), Itanium2 (10 TF), SGI SMP (6.5 TF), IA-32 (0.3 TF), XT3 (10 TF), TCS (6 TF), Marvel (0.3 TF), Hetero (1.7 TF), Itanium2 (4.4 TF), Power4 (1.1 TF), IA-32 (6.3 TF), Sun (Vis)
Online Storage: 20 TB, 155 TB, 32 TB, 600 TB, 1 TB, 150 TB, 540 TB, 50 TB
Mass Storage: 1.2 PB, 3 PB, 2.4 PB, 6 PB, 2 PB
Data Collections: Yes, Yes, Yes, Yes, Yes
Visualization: Yes, Yes, Yes, Yes, Yes
Instruments: Yes, Yes, Yes
Network (Gb/s, Hub): 30 CHI, 30 LA, 10 CHI, 30 CHI, 10 ATL, 30 CHI, 10 CHI, 30 LA, 10 CHI
Partners will add resources and TeraGrid will add
partners!
6. Science Gateways: A new initiative for the TeraGrid
- Increasing investment by communities to build their own cyberinfrastructure (CI).
- Heterogeneity
  - Resources: different architectures at local, national and international levels
  - Users: from HPC expert to K-12 student; they should all benefit from CI.
  - Software stacks, policies.
- How can centers/institutions provide, operate, and maintain CI in this heterogeneous world?
- Working with Gateways, TeraGrid will start to answer that question by providing generic CI services to communities.
  - Integration and interoperability.
7. What are Gateways?
- Gateways will
  - engage communities that are not traditional users of the supercomputing centers
  - by providing community-tailored access to TeraGrid services and capabilities
- Three examples
  - Web-based portals that front-end Grid services, which provide TeraGrid-deployed applications used by a community.
  - Coordinated access points enabling users to move seamlessly between TeraGrid and other grids.
  - Application programs running on users' machines but accessing services in TeraGrid (and elsewhere).
- All take advantage of existing community investment in software, services, education, and other components of cyberinfrastructure.
8. Grid Portal Gateways
- The Portal, accessed through a browser or desktop tools
  - Provides Grid authentication and access to services
  - Provides direct access to TeraGrid-hosted applications as services
- The required support services
  - Searchable metadata catalogs
  - Information space management
  - Workflow managers
  - Resource brokers
  - Application deployment services
  - Authorization services
- Builds on NSF and DOE software
  - Uses NMI Portal Framework, GridPort
  - NMI Grid Tools: Condor, Globus, etc.
  - OSG, HEP tools: Clarens, MonaLisa
9. Initial Focus on 10 Gateways
10. Expanding User Base
A new generation of users that access TeraGrid via Science Gateways, scaling well beyond the traditional user with a shell login account.
Projected user community size by each science gateway project.
Impact on society from gateways enabling decision support is much larger!
11. So how will we meet all these needs?
- With RATS! (Requirements Analysis Teams)
- Organized RATS
  - Collection, analysis and consolidation of requirements to jumpstart the work
  - And milestones
12. Rats de Paris
13. Traditional HPC Model
- All users have accounts at each site/resource
  - NxN matrix of users and sites
- Users access resources through low-level interfaces
  - E.g. Unix shells, FTP sessions
- Resource takes care of all the security
  - AAAA: Authentication, Authorization, Auditing, Accounting
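To make the users-by-sites matrix concrete, here is a rough back-of-the-envelope sketch. It is not how TeraGrid accounts were actually tallied; the function name is hypothetical, and the figures (500 users, 9 sites) are used purely for illustration.

```python
def traditional_account_entries(num_users: int, num_sites: int) -> int:
    """Traditional model: one account entry per (user, site) pair."""
    return num_users * num_sites


# Illustrative numbers only.
users, sites = 500, 9
print(traditional_account_entries(users, sites))  # 4500

# Each additional user costs one new account at every site.
added = traditional_account_entries(users + 1, sites) - traditional_account_entries(users, sites)
print(added)  # 9
```

The point of the sketch is only that the state every site must maintain grows with the product of users and sites.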
14. Traditional HPC Usage
[Diagram labels: Authn, Audit, Accounting, OS (Authz)]
15. Science Gateway Motivation
- Shell-level access to resources is great for power users, but has a steep learning curve
  - Many SG users just need a domain-specific interface, e.g. they are not developing or deploying application codes
- Each resource/site has to maintain state about every user
  - Scalability problems for large/dynamic user communities
- No abstraction: users must adapt to all changes in resources
16. SG Security Model
- SG acts as an interface between the community and its resources
  - Much like a traditional Grid portal, it provides a domain-specific interface
  - However, unlike portals, it exists as a trusted entity in its own right, allowing the resource to outsource AAAA functionality to the SG
- Resource runs all commands in a community account, which constrains what the community can do
  - The account can be constrained to a few community applications
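One way to picture "constrained to a few community applications" is an allowlist check that runs before anything executes in the community account. This is a minimal sketch under stated assumptions, not a TeraGrid mechanism: the application paths and the function name are hypothetical, and a real deployment would also need enforcement at the OS level.

```python
import shlex

# Hypothetical allowlist: the only executables this community account may run.
ALLOWED_APPLICATIONS = {
    "/opt/community/bin/simulate",
    "/opt/community/bin/analyze",
}


def authorize_command(command_line: str) -> list[str]:
    """Return the parsed argv if the executable is allowlisted, else raise PermissionError."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED_APPLICATIONS:
        raise PermissionError(f"not permitted in community account: {command_line!r}")
    return argv


print(authorize_command("/opt/community/bin/simulate --steps 10"))
```

Returning a parsed argument list, rather than a string, lets the caller run the application without a shell, so shell metacharacters in the request are never interpreted.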
17. Conceptual Model
18. SG AAAA Model
[Diagram labels: Job-level Audit, Authn, User-level Audit, Community-level Authz, Accounting, User-level Authz]
- Security functions held by the resource are now split between resource and Science Gateway
- However, there is a strong need to communicate between the two
  - Resource will want full audit information and user information to investigate suspicious activity
  - SG needs accounting information to do allocations and reporting (e.g. who is using the SG)
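The exchange between the two sides can be sketched as a join: the resource knows CPU usage per job but sees only the community account, while the SG knows which user submitted each job. The record layouts, field names, and the existence of a shared job identifier are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical SG-side user-level audit: job identifier -> gateway user.
sg_audit = {"job-1": "alice", "job-2": "bob", "job-3": "alice"}

# Hypothetical resource-side accounting: every job ran under the community account.
resource_accounting = [
    {"job_id": "job-1", "account": "community", "cpu_hours": 4.0},
    {"job_id": "job-2", "account": "community", "cpu_hours": 1.5},
    {"job_id": "job-3", "account": "community", "cpu_hours": 2.5},
]


def usage_by_user(audit, accounting):
    """Attribute the community account's CPU hours to individual gateway users."""
    totals = defaultdict(float)
    for record in accounting:
        # Jobs the SG has no record of are grouped under "unknown".
        user = audit.get(record["job_id"], "unknown")
        totals[user] += record["cpu_hours"]
    return dict(totals)


print(usage_by_user(sg_audit, resource_accounting))
```

The same join run in the other direction gives the resource a user name for a suspicious job, which is why both sides need the information to flow.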
19. Outstanding Challenges
- How to identify a job between SG and resource?
  - "/bin/foo run at 15:38:13 (my time)" is not very accurate
- Standard template for resource/SG agreement
  - Akin to certificate policy
- Acceptance of group accounts
  - Convince folks it's OK to outsource
- Restricted accounts
  - Cookbook to restrict account to certain applications
  - Sandboxing of users from each other
- Community administrators
  - Those who set up group account
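The job-identification challenge can be illustrated with a small sketch. Matching on command name and approximate time is ambiguous when two users run the same application close together; one possible approach, an assumption here and not something the slides prescribe, is for the SG to attach a unique tag that the resource echoes in its logs. All names and log fields below are hypothetical.

```python
import uuid


def new_job_tag() -> str:
    """Create a unique tag for the SG to attach to a job at submission."""
    return str(uuid.uuid4())


def find_by_time(resource_log, command, submitted_at, tolerance_s=5.0):
    """Fragile matching: same command, start time within a tolerance (clocks may differ)."""
    return [
        entry for entry in resource_log
        if entry["command"] == command
        and abs(entry["started_at"] - submitted_at) <= tolerance_s
    ]


def find_by_tag(resource_log, job_tag):
    """Exact matching on the tag the SG supplied."""
    return [entry for entry in resource_log if entry["job_tag"] == job_tag]


tag_a, tag_b = new_job_tag(), new_job_tag()
resource_log = [
    {"command": "/bin/foo", "started_at": 1000.0, "job_tag": tag_a},
    {"command": "/bin/foo", "started_at": 1002.0, "job_tag": tag_b},
]

print(len(find_by_time(resource_log, "/bin/foo", 1001.0)))  # 2: ambiguous
print(len(find_by_tag(resource_log, tag_a)))                # 1: unambiguous
```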
20. Outstanding Challenges (cont)
- Each SG forms its own VO
  - TeraGrid provides resources
  - SG provides the users
- I've mostly talked about the SG/TeraGrid relationship
  - But how SGs will manage their users is open
  - Authentication, authorization, contact information (the whole list Jill just gave)
  - Users distributed over multiple domains
  - Wanting to get into the 1000s of users
  - Different communities for each SG
- TeraGrid would like to help as much as possible here as well
21. Questions?