Title: NPACI Software Roadmap
1NPACI Software Roadmap
- Carl Kesselman
- Chief Software Architect
- National Partnership for Advanced Computational Infrastructure
- Director, Center for Grid Technologies, Information Sciences Institute
- University of Southern California
- carl_at_isi.edu
2NPACI Software Objectives
- Support strategic goals of the partnership
- Big science
- Big infrastructure
- Big users
- Value added production quality tools and infrastructure
- Not research and development program
- NPACI supports hardening, integration and transition
3Role of TeraGrid
- TeraGrid central to NPACI strategic goals
- Thus NPACI software roadmap must reflect this
- Provide a view of DTF as a system
- Heterogeneity and distribution should be an advantage!
- Integrate DTF into wider Grid environment
- Enable diverse, new, application classes
- High-throughput, data analysis, interactive data
exploration
4The Grid Vision
- Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations
- On-demand, ubiquitous access to computing, data, and services
- New capabilities constructed dynamically and transparently from distributed services
- When the network is as fast as the computer's
internal links, the machine disintegrates across
the net into a set of special purpose
appliances (George Gilder)
5The Emergence of Grids
- NPACI partners have been at the forefront
- Core Grid technologies developed by NPACI
partners
6TeraGrid Application Exemplars
- Traditional supercomputing made simpler
- remote access to data archives and computers
- Distributed data archive access and correlation
- Remote rendering and visualization
- Remote sensor and instrument coupling
7The Grid World Current Status
- Dozens of major Grid projects in scientific and technical computing/research and education
- Considerable consensus on key concepts and technologies
- Open source Globus Toolkit a de facto standard for major protocols and services
- Far from complete or perfect, but out there, evolving rapidly, and large tool/user base
- Industrial interest emerging rapidly
- Opportunity: convergence of eScience and eBusiness requirements and technologies
8Musts
- Must be based on emergent community-defined standards
- NMI, GGF
- Must support existing NPACI applications, and most especially, Alpha projects
- Legacy, transition
- Must incorporate current and future NPACI value-added technologies
- NWS and Globus are the first step
- DataCutter underway
- SRB next, APST looks easy, NetSolve too
- Must be easy (inreach and outreach)
- Documentation
- Packaging
9NPACI Software Approach
- Build on past investment
- Focus on integration across software
- NPACKage
- Refocus of existing NPACI Grid technologies for TeraGrid
- Build on NMI foundation
- Contribute to NMI foundation
- Applications via alpha projects
- Already layered on NPACI technologies
10NPACKage
- Goal: A complete technology framework for NPACI value-added tools
- Deployed at all partner sites
- Packaged for use by external community
- DTF: leverage, leverage, leverage
- Sharpen the focus on integration and deployment of NPACI
- Globus and NWS have begun the process
- Architecture and methodology for putting together NPACI technology software components
- DTF/TeraGrid inspired
- Core and Basic services are Globus
- Monitoring is NWS
- Need to put together a methodology for rapid integration and deployment
11Generic Grid Architecture
Application
12NPACI Value Added Software Stack
NPACI Grid Applications (Charm, Telescience, Porous flow)
NPACI Grid Tools (Science Portals, APST, ...)
NPACI Grid Environments (Legion-G, SRB, DataCutter, ...)
NPACI Grid Infrastructure
NPACI Cluster Value Added
13NSF Middleware Initiative
- NSF Funded Project to build national middleware infrastructure
- USC/ISI, SDSC, U. Wisc., ANL, NCSA, I2
- Software Integration (NMI Software Releases)
- Interoperability
- Testing
- Install, Configure, Manage
- University Campus Infrastructure Integration
- Campus Authentication / GSI
- Enterprise Directories / GSI and MDS
- Use NMI as TeraGrid Baseline
- Specialize for TeraGrid unique aspects (e.g. Viz resources)
14NPACI Clusters Rocks Clustering Toolkit
- Open Source Toolkit for building Linux clusters
- NPACI Partners: SDSC Cluster Group, UCB Millennium
- Turnkey system for deploying parallel development platforms
- Supports IA-32 now. Node-only for IA-64. Complete IA-64-only support to be released in 2 weeks
- 385 Unique Institution Downloads since (Mar 2002)
- Universities (85), Companies (95), US Gov (13), other (200)
- 5 releases of Rocks since SC2000. We're on the Linux Curve
- Vehicle to transfer the TeraGrid tools to larger Community
- Rocks team will ensure that Rocks NMI is a solid and complete Grid-enabled cluster solution
15More on Clusters
- Ganglia: Cluster monitoring toolkit from UCB.
- Started in Rocks, now independently distributed Rocks.
- SDSC putting real-time Ganglia cluster data into Globus MDS 2.1 schema for grid-enabled cluster monitoring
- We're engaging a larger (even international) community
- Linux Competency Center in Singapore provides PVFS support
- International community looking to us as gathering point for cluster software.
- Companies see us as a neutral meeting point, too (Cray, Compaq)
- Impacting large ITRs
- CMS Test cluster at SDSC and Caltech
- Large Tier 1 LHC cluster in Germany (Karlsruhe)
- Important success story for NPACI money impacting a large community and other government agencies
16Globus Toolkit
- Provides core middleware services
- Resource management
- Resource monitoring and discovery
- Security (authentication and authorization)
- Low level data-management
- Data movement, data-location
- All other NPACI software layers on Globus
services
17Performance Monitoring and Prediction
- Need a system for gathering performance data and making predictions.
- Requirements
- On-line: use the freshest data possible to make predictions
- Extensible: incorporate new data, prediction methods, presentation formats
- High-performance: predictions must be available in enough time to act on them
- The Network Weather Service
- Data measurement (sensors)
- Predictions (fast forecasting models)
- Presentation (data and forecast reports via information infrastructure)
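The "fast forecasting models" idea above can be illustrated with a minimal sketch: run several cheap predictors over the measurement history and use whichever has had the lowest cumulative error so far. This is only an illustration of that general approach, not the Network Weather Service's code or API; the function names, the three models, and the window size are assumptions made for the example.

```python
from statistics import mean, median

# Three deliberately cheap models (hypothetical choices for this sketch).
def last_value(history):
    return history[-1]

def running_mean(history):
    return mean(history)

def window_median(history, window=5):
    return median(history[-window:])

FORECASTERS = {
    "last_value": last_value,
    "running_mean": running_mean,
    "window_median": window_median,
}

def forecast(measurements):
    """Return (prediction, model_name) for the next measurement.

    Each model is replayed over the history; the model with the lowest
    cumulative absolute error makes the final prediction.
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    errors = {name: 0.0 for name in FORECASTERS}
    for i in range(1, len(measurements)):
        prefix, actual = measurements[:i], measurements[i]
        for name, model in FORECASTERS.items():
            errors[name] += abs(model(prefix) - actual)
    best = min(errors, key=errors.get)
    return FORECASTERS[best](measurements), best

if __name__ == "__main__":
    # Bandwidth-like samples with one transient spike.
    print(forecast([10, 10, 10, 50, 10, 10, 10]))
```

Because every model is a constant-time or short-window computation, the selection step stays cheap enough to rerun whenever a fresh measurement arrives, which is the "on-line" and "high-performance" requirement in miniature.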
18The Storage Resource Broker
- Established NPACI leadership in data-intensive computing
- Focus on collection management
- Attribute based discovery (MCAT), uniform data access, replication, security
- Production use for number of large collections
- NPACKage integration
- Tighter integration with underlying Grid services
19DataCutter
- A Component-based framework for manipulating multi-dimensional datasets in a distributed environment (the Grid)
- Indexing Service: Multilevel hierarchical indexes based on R-tree indexing method.
- Filtering Service: Distributed C++ component framework
- Application processing is implemented as a set of interacting components (filters).
- filters: logical unit of user-defined computation
- streams: how filters communicate
- Evaluation of component-based models for data-intensive applications
- Scheduling of data flow, placement decisions
20A Motivating Scenario
Application
// process relevant raw readings
// generate 3D view
// compute features of 3D view
// find similar features in reference db
// display new view and similar cases
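A rough sketch of how the scenario steps could map onto the filter and stream model: each step becomes a filter, and each filter consumes one stream and produces another. Python generators stand in for streams here. This is not DataCutter's API (DataCutter is a separate framework); every function name, the record layout, and the threshold and tolerance parameters are assumptions made for illustration.

```python
def select_relevant(readings, threshold):
    """Filter 1: pass through only readings at or above the threshold."""
    for r in readings:
        if r["value"] >= threshold:
            yield r

def build_view(readings):
    """Filter 2: stand-in for view generation; groups values by slice."""
    view = {}
    for r in readings:
        view.setdefault(r["slice"], []).append(r["value"])
    yield view

def compute_features(views):
    """Filter 3: reduce each view to one feature (mean value) per slice."""
    for view in views:
        yield {s: sum(v) / len(v) for s, v in sorted(view.items())}

def find_similar(feature_stream, reference_db, tolerance):
    """Filter 4: list reference cases whose features are within tolerance."""
    for features in feature_stream:
        matches = [
            name for name, ref in reference_db.items()
            if ref.keys() == features.keys()
            and all(abs(ref[s] - features[s]) <= tolerance for s in features)
        ]
        yield features, matches

def run_pipeline(raw, reference_db, threshold=0.5, tolerance=0.125):
    """Connect the filters; each assignment wires one stream to the next."""
    stream = select_relevant(raw, threshold)
    stream = build_view(stream)
    stream = compute_features(stream)
    return list(find_similar(stream, reference_db, tolerance))

if __name__ == "__main__":
    raw = [
        {"slice": 0, "value": 1.0}, {"slice": 0, "value": 0.5},
        {"slice": 1, "value": 0.25}, {"slice": 1, "value": 0.75},
    ]
    db = {"case_a": {0: 0.75, 1: 0.75}, "case_b": {0: 0.25, 1: 0.25}}
    for features, matches in run_pipeline(raw, db):
        print(features, matches)
```

Keeping each step a separate filter is what leaves room for the placement decisions mentioned under DataCutter: in a distributed setting, the first filter could run next to the data archive so that only relevant readings cross the network.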
21The AppLeS Parameter Sweep Template (APST)
- Parameter Sweep Applications
- Arise in many fields of science
- Bio-informatics, Neuroscience, CFD, Computer Graphics, Discrete-event simulations, Protein folding, database searches
- APST provides
- Scheduling (Data and Computation)
- Grid deployment (supports Globus, Condor, NetSolve, ssh, batch systems, GridFTP, scp, GASS, MDS, NWS, Ganglia, IBP, ...)
- Simple XML-based user interface
- Used in production (Salk, JCSG, UCSD/SDSC, OSU)
- Being adopted by an increasing number of groups
- Version 1.1.beta to be released 5/1/02
- http://grail.sdsc.edu/projects/apst
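To make "parameter sweep" concrete, here is a minimal sketch that expands a set of parameter ranges into independent tasks and assigns them to hosts with a simple greedy rule (each task goes to the host that would finish it earliest). It is not APST's scheduler or its XML interface; the function names, the uniform task cost, and the relative host speeds are assumptions made for the example.

```python
from itertools import product

def sweep_tasks(parameters):
    """Expand {name: [values]} into one dict per parameter combination."""
    names = sorted(parameters)
    value_lists = (parameters[n] for n in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

def schedule(tasks, host_speeds, task_cost=1.0):
    """Greedy assignment: each task goes to the host that finishes it first.

    host_speeds maps a host name to a relative speed (hypothetical units);
    returns (plan, makespan), where plan maps each host to its task list.
    """
    ready = {host: 0.0 for host in host_speeds}   # time each host is free
    plan = {host: [] for host in host_speeds}
    for task in tasks:
        host = min(ready, key=lambda h: ready[h] + task_cost / host_speeds[h])
        ready[host] += task_cost / host_speeds[host]
        plan[host].append(task)
    return plan, max(ready.values())

if __name__ == "__main__":
    tasks = sweep_tasks({"seed": [1, 2, 3], "rate": [0.1, 0.2]})
    plan, makespan = schedule(tasks, {"fast": 2.0, "slow": 1.0})
    for host, assigned in plan.items():
        print(host, len(assigned), "tasks")
    print("estimated makespan:", makespan)
```

Since the tasks in a sweep are independent, even this naive rule balances load across unequal hosts; a production scheduler would also account for data transfer, which the "Data and Computation" bullet above points to.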
22Synaptic Transmission at Ciliary Ganglion
- Predict behavior of biochemically and geometrically complex system
- 5500 MCell/APST runs on Keck-1 in 1 hour on 32 nodes (64 CPUs)
- Neuroscience manuscript in preparation
23PACI HotPage
- Access portal to all resources
- Information Portal to all users
- Secure access for authorized users
- PACI Grid Software used
- Globus Toolkit (GRAM, GSI, GRIS, GIIS), SRB, MyProxy
- Built with the GridPort Toolkit
- Services provided
- Resource information/status, job control, data
collection management, command execution
24Job Submission on the NPACI HotPage
25NPACI Grid Computing Portals
- NPACI HotPage
- Access portal to all NPACI compute resources and hundreds of users
- Provides secure interactive access for authorized users
- Built with GridPort Toolkit
- Employs NPACI Metacomputing Thrust Projects
- Globus Toolkit (GRAM, GSI, MDS/GRIS/GIIS), SRB, MyProxy
- Used to generate production application portals
- NBCR (GAMESS, Amber), LAPK, Norman, etc.
- Ready to integrate TeraGrid/DTF resources; interoperable with other organizations.
- Collaborations with
- Alliance, the NASA/IPG, Lawrence Berkeley Laboratories, PNNL, the DOE, and the Global Grid Forum
- New directions include web services and OGSA, integration of NWS and other grid projects.
26Summary
- NPACI software roadmap focused on integration and deployment
- Driven by TeraGrid and Alpha project experiences
- Integration with community, promote NPACI
leadership and value added