Title: NSF's Evolving Cyberinfrastructure Program
1. NSF's Evolving Cyberinfrastructure Program
- Guy Almes <galmes_at_nsf.gov>
- Office of Cyberinfrastructure
- Oklahoma Supercomputing Symposium 2005
- Norman
- 5 October 2005
2. Overview
- Cyberinfrastructure in Context
- Existing Elements
- Organizational Changes
- Vision and High-performance Computing planning
- Closing thoughts
3. Cyberinfrastructure in Context
- Due to the research university's mission:
- each university wants a few people from each key research specialty
- therefore, research colleagues are scattered across the nation and around the world
- Enabling their collaborative work is key to NSF
4. Cyberinfrastructure in Context (continued)
- Traditionally, there were two approaches to doing science:
- theoretical / analytical
- experimental / observational
- Now the use of aggressive computational resources has led to a third approach:
- in silico simulation / modeling
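To make the third approach concrete, here is a minimal, generic sketch in Python of in silico modeling; the damped-oscillator model and all parameter values are illustrative assumptions, not anything from the talk.

    # Integrate a damped oscillator numerically instead of solving it
    # analytically or measuring it experimentally (arbitrary parameters).
    dt, steps = 0.001, 10_000
    x, v = 1.0, 0.0          # initial position and velocity
    k, c = 4.0, 0.1          # spring constant and damping coefficient
    for _ in range(steps):
        a = -k * x - c * v   # model: unit mass, F = -kx - cv
        v += a * dt          # explicit Euler update
        x += v * dt
    print(f"x(t={dt * steps:.1f}) = {x:.4f}")

Even this toy example shows the pattern shared by large simulations: a model, a discretization, and a time-stepping loop whose fidelity depends on available computing power.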
5. Cyberinfrastructure Vision
- "A new age has dawned in scientific and engineering research, pushed by continuing progress in computing, information, and communication technology, and pulled by the expanding complexity, scope, and scale of today's challenges. The capacity of this technology has crossed thresholds that now make possible a comprehensive cyberinfrastructure on which to build new types of scientific and engineering knowledge environments and organizations and to pursue research in new ways and with increased efficacy."
- NSF Blue Ribbon Panel report, 2003
6. Historical Elements
- Supercomputer Center program from the 1980s
- NCSA, SDSC, and PSC leading centers ever since
- NSFnet program of 1985-95
- connect users to (and through) those centers
- 56 kb/s to 1.5 Mb/s to 45 Mb/s within ten years
- Sensors: telescopes, radars, environmental instruments, but treated in an ad hoc fashion
- Middleware of growing, yet underestimated, importance
7. (No transcript: figure slide)
8. Explicit Elements
- Advanced Computing
- Variety of strengths, e.g., data-intensive, compute-intensive
- Advanced Instruments
- Sensor networks, weather radars, telescopes, etc.
- Advanced Networks
- Connecting researchers, instruments, and computers together in real time
- Advanced Middleware
- Enabling the potential for sharing and collaboration
- Note the synergies!
9. CRAFT: A Normative Example
(Diagram: sensor network coupled to HEC resources)
- Participants: Univ. of Oklahoma, NCSA and PSC, Internet2, UCAR Unidata Project, National Weather Service
10. Current Projects within OCI
- Office of Cyberinfrastructure
- HEC X
- Extensible Terascale Facility (ETF)
- International Research Network Connections
- NSF Middleware Initiative
- Integrative Activities: Education, Outreach and Training
- Social and Economic Frontiers in Cyberinfrastructure
11. TeraGrid: One Component
- A distributed system of unprecedented scale
- 30 TF, 1 PB, 40 Gb/s net
- Unified user environment across resources
- User software environment; user support resources
- Integrated new partners to introduce new capabilities
- Additional computing, visualization capabilities
- New types of resources: data collections, instruments
- Built a strong, extensible team
- Created an initial community of over 500 users, 80 PIs
- Created User Portal in collaboration with NMI
(courtesy Charlie Catlett)
12. Key TeraGrid Resources
- Computational
- very tightly coupled clusters
- LeMieux and Red Storm systems at PSC
- tightly coupled clusters
- Itanium2 and Xeon clusters at several sites
- data-intensive systems
- DataStar at SDSC
- memory-intensive systems
- Maverick at TACC and Cobalt at NCSA
- experimental
- MD-Grape system at Indiana and BlueGene/L at SDSC
13. Key TeraGrid Resources (continued)
- Online and Archival Storage
- e.g., more than a PB online at SDSC
- Data Collections
- numerous
- Instruments
- Spallation Neutron Source at Oak Ridge
- Purdue Terrestrial Observatory
14. TeraGrid DEEP Examples
- Aquaporin Mechanism: animation pointed to by the 2003 Nobel chemistry prize announcement (Klaus Schulten, UIUC)
- Atmospheric Modeling (Kelvin Droegemeier, OU)
- Reservoir Modeling (Joel Saltz, OSU)
Advanced Support for TeraGrid Applications
- TeraGrid staff are embedded with applications to create:
- Functionally distributed workflows
- Remote data access, storage and visualization
- Distributed data mining
- Ensemble and parameter sweep run and data management (see the sketch below)
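To make the last item concrete, here is a minimal sketch in Python of the bookkeeping behind a parameter sweep; the parameter names, values, and directory layout are hypothetical illustrations, not TeraGrid tooling.

    import itertools
    import json
    import os

    # Hypothetical parameter grid for an ensemble of simulation runs.
    grid = {
        "viscosity": [0.1, 0.5, 1.0],
        "resolution": [128, 256],
    }

    # One working directory per parameter combination, with the chosen
    # parameters recorded so results can be matched to inputs later.
    for i, values in enumerate(itertools.product(*grid.values())):
        params = dict(zip(grid.keys(), values))
        run_dir = os.path.join("runs", "run_%03d" % i)
        os.makedirs(run_dir, exist_ok=True)
        with open(os.path.join(run_dir, "params.json"), "w") as f:
            json.dump(params, f, indent=2)
        # A real sweep would now submit one batch job per directory.

The data-management half of the problem (archiving each run directory and tracking which jobs completed) is as much a part of the service as launching the jobs.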
- Lattice-Boltzmann Simulations (Peter Coveney, UCL; Bruce Boghosian, Tufts)
- Groundwater/Flood Modeling (David Maidment, Gordon Wells, UT)
(courtesy Charlie Catlett)
15. Cyberresources: Key NCSA Systems
- Distributed Memory Clusters
- Dell (3.2 GHz Xeon) 16 Tflops
- Dell (3.6 GHz EM64T) 7 Tflops
- IBM (1.3/1.5 GHz Itanium2) 10 Tflops
- Shared Memory Clusters
- IBM p690 (1.3 GHz Power4) 2 Tflops
- SGI Altix (1.5 GHz Itanium2) 6 Tflops
- Archival Storage System
- SGI/Unitree (3 petabytes)
- Visualization System
- SGI Prism (1.6 GHz Itanium2 + GPUs)
(courtesy NCSA)
16. Cyberresources: Recent Scientific Studies at NCSA
(Figure panels: Weather Forecasting, Computational Biology, Molecular Science, Earth Science)
(courtesy NCSA)
17. Computing: One Size Doesn't Fit All
- Trade-offs among:
- Interconnect fabric
- Processing power
- Memory
- I/O
(Figure omitted; courtesy SDSC)
18. Computing: One Size Doesn't Fit All (continued)
(Figure slide; courtesy SDSC)
19. SDSC Resources
- COMPUTE SYSTEMS
- DataStar
- 2,396 Power4 pes
- IBM p655 and p690
- 4 TB total memory
- Up to 2 GB/s I/O to disk
- TeraGrid Cluster
- 512 Itanium2 pes
- 1 TB total memory
- Intimidata
- Early IBM BlueGene/L
- 2,048 PowerPC pes
- 128 I/O nodes
- DATA ENVIRONMENT
- 1 PByte SAN
- 6 PB StorageTek tape library
- DB2, Oracle, MySQL
- Storage Resource Broker
- HPSS
- 72-CPU Sun Fire 15K
- 96-CPU IBM p690s
- Support for community data collections and databases
- Data management, mining, analysis, and preservation
- SCIENCE and TECHNOLOGY STAFF, SOFTWARE, SERVICES
- User Services
- Application/Community Collaborations
- Education and Training
- SDSC Synthesis Center
- Community SW, toolkits, portals, codes
(courtesy SDSC)
20. Pittsburgh Supercomputing Center: Big Ben System
- Cray Red Storm XT3
- based on the Sandia Red Storm system
- working with Cray, SNL, ORNL
- Approximately 2,000 compute nodes
- 1 GB memory/node
- 2 TB total memory
- 3D toroidal mesh interconnect
- 10 Teraflops
- MPI latency < 2 µs (neighbor), < 3.5 µs (full system)
- Bisection BW 2.0/2.9/2.7 TB/s (x, y, z)
- Peak link BW 3.84 GB/s
- 400 sq. ft. floor space
- < 400 kW power
- Now operational
- NSF award in Sept. 2004
- In Oct. 2004, Cray announced the XT3, a commercial version of Red Storm
(courtesy PSC)
21. I-Light, I-Light2, and the TeraGrid Network Resource
(Figure slide; courtesy IU and PU)
22. Purdue and Indiana Contributions to the TeraGrid
- The Purdue Terrestrial Observatory portal to the TeraGrid will deliver GIS data from IU and real-time remote sensing data from the PTO to the national research community
- Complementary large facilities, including large Linux clusters
- Complementary special facilities, e.g., the Purdue NanoHub and Indiana University MD-GRAPE systems
- Indiana and Purdue computer scientists are developing new portal technology that makes use of the TeraGrid (GIG effort)
(courtesy IU and PU)
23. New Purdue RP Resources
- 11-teraflops Community Cluster (being deployed)
- 1.3 PB tape robot
- Non-dedicated (opportunistic) resources, defining a model for sharing university resources with the nation
(courtesy IU and PU)
24. PTO: Distributed Datasets for Environmental Monitoring
(Figure slide; courtesy IU and PU)
25. TeraGrid as Integrative Technology
- A likely key to all foreseeable NSF HPC capability resources
- Working with OSG and others, work even more broadly to encompass both capability and capacity resources
- Anticipate requests for new RPs
- Slogans:
- "Learn once, execute anywhere"
- "Whole is more than sum of parts"
26. TeraGrid as a Set of Resources
- TeraGrid gives each RP an opportunity to shine
- Balance: the value of innovative/peculiar resources vs. the value of the slogans
- Opportunistic resources, SNS, and the MD-GRAPEs as interesting examples
- Note the stress on the allocation process
27. 2005 IRNC Awards
- Awards:
- TransPAC2 (U.S.-Japan and beyond)
- GLORIAD (U.S.-China-Russia-Korea)
- TransLight/PacificWave (U.S.-Australia)
- TransLight/StarLight (U.S.-Europe)
- WHREN (U.S.-Latin America)
- Example use: Open Science Grid, involving partners in the U.S. and Europe, mainly supporting high-energy physics research based on the LHC
28. NSF Middleware Initiative (NMI)
- Program began in 2001
- Purpose: to design, develop, deploy, and support a set of reusable and expandable middleware functions that benefit many science and engineering applications in a networked environment
- Program encourages open-source development
- Program funds mainly development, integration, deployment, and support activities
29. Example NMI-funded Activities
- GridShib: integrating Shibboleth campus attribute services with Grid security infrastructure mechanisms
- UWisc Build and Test facility: community resource and framework for multi-platform build and test of grid software
- Condor: mature distributed computing system installed on 1000s of CPU pools and 10s of 1000s of CPUs (a minimal submit sketch follows)
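To give a flavor of how jobs reach a Condor pool, here is a minimal, hypothetical submit description; the executable and file names are illustrative placeholders, not taken from the talk.

    # Minimal hypothetical Condor submit description.
    # "run_sim" and the file names are illustrative placeholders.
    universe   = vanilla
    executable = run_sim
    arguments  = --steps 1000
    output     = sim.$(Process).out
    error      = sim.$(Process).err
    log        = sim.log
    # Queue ten independent instances of the job.
    queue 10

Submitted with condor_submit, this queues ten jobs, and Condor matches each to an idle machine in the pool; that matchmaking is what lets pools of 1000s of CPUs be harvested for throughput computing.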
30. Organizational Changes
- Office of Cyberinfrastructure
- formed on 22 July 2005
- had been a division within CISE
- Cyberinfrastructure Council
- chair is the NSF Director; members are the ADs
- Vision Document started
- HPC Strategy chapter drafted
- Advisory Committee for Cyberinfrastructure
31. Cyberinfrastructure Components
- Collaboration & Communication Tools & Services
- Data Tools & Services
- High Performance Computing Tools & Services
32. Vision Document Outline
- Call to Action
- Strategic Plans for
- High Performance Computing
- Data
- Collaboration and Communication
- Education and Workforce Development
- Complete document by 31 March 2006
33. Strategic Plan for High Performance Computing
- Covers the 2006-2010 period
- Enable petascale science and engineering by creating a world-class HPC environment
- Science-driven HPC Systems Architectures
- Portable, Scalable Applications Software
- Supporting Software
- Inter-agency synergies will be sought
34. Coming HPC Solicitation
- A solicitation will be issued this month
- One or more HPC systems
- One or more RPs
- Rôle of TeraGrid
- Process driven by science user needs
- Confusion about capacity/capability
- Workshops:
- Arlington, 9 September
- Lisle, 20-21 September
35. HPC Platforms (2000-2005)
(Figure: Tightly Coupled Platforms; ETF Integrating Framework)
36. Cyberinfrastructure Vision
NSF will lead the development and support of a
comprehensive cyberinfrastructure essential to
21st century advances in science and engineering.