Title: High Productivity Computing Systems Program
1. High Productivity Computing Systems Program
CASC HPC Technology Update
Robert Graybill, March 24, 2005
2. Outline
- High-End Computing University Research Activities
- HECURA Status
- High Productivity Computing Systems Program
- Phase II Update
- Vendor Teams
- Council on Competitiveness
- Productivity Team
- Phase III Concept
- Other Related Computing Technology Activities
3. HECURA: High-End Computing University Research Activity
- Strategy
  - Fund universities in high-end computing research targeting the Nation's key long-term needs
- Implementation
  - Coordinated FY04 solicitation led by NSA (DARPA $2M per year)
    - $1M sent to DOE
    - $1M sent to NSF
- Participating Agencies
  - DARPA, DOE, NSF
- Status
  - DOE FY04/05: fund FASTOS
  - NSF FY04: fund Domain-Specific Compilation Environment
  - NSF FY05: in process
Potential High-End Computing Research Areas
4. Outline
- High-End Computing University Research Activities
- HECURA Status
- High Productivity Computing Systems Program
- Phase II Update
- Vendor Teams
- Council on Competitiveness
- Productivity Team
- Phase III Concept
- Other Related Computing Technology Activities
5. High Productivity Computing Systems
- Goal
  - Provide a new generation of economically viable high productivity computing systems for the national security and industrial user community (2010)
- Impact
  - Performance (time-to-solution): speed up critical national security applications by a factor of 10X to 40X
  - Programmability (idea-to-first-solution): reduce the cost and time of developing application solutions
  - Portability (transparency): insulate research and operational application software from the system
  - Robustness (reliability): apply all known techniques to protect against outside attacks, hardware faults, and programming errors
HPCS Program Focus Areas
- Applications
  - Intelligence/surveillance, reconnaissance, cryptanalysis, weapons analysis, airborne contaminant modeling, and biotechnology
Fill the critical technology and capability gap between today (late-1980s HPC technology) and the future (quantum/bio computing)
6. HPCS Program Phases I-III
[Program timeline chart (CY 2002-2010): (Funded Five) Phase I Industry Concept Study leads into (Funded Three) Phase II R&D, then (Fund up to Two) Phase III Full-Scale Development. Industry milestones 1-7 include the Concept Review, System Design Review, PDR, CDR, and Technology Assessment Review. Productivity assessment by MIT LL, DOE, DoD, NASA, and NSF runs throughout; products progress from Productivity Concepts and Metrics through the Productivity Framework Baseline and Experimental Productivity Framework to HPCS intermediate products, early pilot platforms, and pilot systems. Mission Partner procurement decisions follow the program reviews.]
7. Phase II Program Goals
- Phase II Overall Productivity Goals
  - Execution (sustained performance): 1 Petaflop/sec (scalable to greater than 4 Petaflop/sec). Reference: Functional Workflow 3
  - Development: 10X over today's systems. Reference: Functional Workflows 1, 2, 4, 5
- Productivity Framework
  - Establish experimental baseline
  - Evaluate emerging vendor execution and development productivity concepts
  - Provide a solid reference for evaluation of vendors' Phase III designs
  - Provide a technical basis for Mission Partner investment in Phase III
  - Early adoption or phase-in of execution and development metrics by mission partners
- Subsystem Performance Indicators (vendor-generated goals from Phase I)
  - 3.2 PB/sec bisection bandwidth
  - 64,000 GUPS (RandomAccess)
  - 6.5 PB/sec data-streams bandwidth
  - 2 PF/s Linpack
(HPCchallenge)
Documented and validated through simulations, experiments, prototypes, and analysis
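To make the GUPS (giga-updates per second) indicator concrete, here is a minimal single-threaded sketch in the spirit of the RandomAccess kernel. It is illustrative only: the official HPCchallenge benchmark uses a specific polynomial random-number stream and strict verification rules, while this sketch substitutes a 64-bit linear congruential generator and skips verification.

```python
import time

def random_access_gups(log2_table_size=20, n_updates=1 << 20):
    """Illustrative sketch of the RandomAccess (GUPS) kernel.

    Performs random read-modify-write (XOR) updates to a large table
    and reports the update rate in giga-updates per second. The LCG
    below is a stand-in for the benchmark's official RNG.
    """
    size = 1 << log2_table_size
    table = list(range(size))      # the table being updated
    mask = size - 1
    a = 1
    start = time.perf_counter()
    for _ in range(n_updates):
        # 64-bit LCG generates the pseudo-random update stream
        a = (a * 6364136223846793005 + 1442695040888963407) & 0xFFFFFFFFFFFFFFFF
        table[a & mask] ^= a       # random XOR update
    elapsed = time.perf_counter() - start
    return n_updates / elapsed / 1e9
```

Pure Python will report a tiny fraction of the 64,000-GUPS goal; the point of the sketch is that GUPS stresses random memory access rather than arithmetic throughput.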
8. HPCS I/O Challenges
- 1 trillion files in a single file system
- 32K file creates per second
- 10K metadata operations per second
  - Needed for checkpoint/restart files
- Streaming I/O at 30 GB/sec full duplex
  - Needed for data capture
- Support for 30K nodes
- Future file systems need low-latency communication
An envelope on HPCS Mission Partner requirements
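The file-create target can be checked on any system with a toy microbenchmark like the one below. It is a single-process sketch of my own devising; real metadata benchmarks run many clients in parallel against a shared parallel file system, which is what the 32K-creates/sec envelope actually implies.

```python
import os
import tempfile
import time

def file_create_rate(n_files=2000):
    """Measure sustained file creates per second in one process.

    Creates n_files empty files in a scratch directory and divides
    by wall-clock time. A local-disk, single-client number; parallel
    file systems are measured with many concurrent clients.
    """
    with tempfile.TemporaryDirectory() as scratch:
        start = time.perf_counter()
        for i in range(n_files):
            # create-and-close is the metadata operation being timed
            with open(os.path.join(scratch, f"f{i}"), "w"):
                pass
        elapsed = time.perf_counter() - start
    return n_files / elapsed
```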
9. Phase II Accomplishments
- Unified and mobilized broad government agency buy-in (vision, technical goals, funding, and active reviewers)
- Driving HPC vendors' and industry users' vision of high-end computing: "To out-compete, we must out-compute!"
- Completed Program Milestones 1-4
  - SDR: established a credible technical baseline, assessed program goals, and identified challenges
  - Technology Assessment Review
- Established Productivity as a key evaluation criterion, rather than only Performance, through HPCS Productivity Team efforts
- Released execution-time HPCchallenge and in-the-large applications benchmarks
- Completed early development-time experiments
- Early commercial buy-in: Parallel Matlab announcement
- FY04 HEC-URA awards completed through DOE and NSF
- Developed draft Phase III strategy
10. HPCS System Architectures: Cray / Sun / IBM
Addressing time-to-solution: experimental codes, large multi-module codes, porting codes, running codes, administration
R&D in new languages: Chapel (Cray), X10 (IBM), Fortress (Sun)
11. HPCS Vendor Innovations (Non-Proprietary Version)
- Super-sized (scaled-up) HPC development environments, runtime software, file I/O, and streaming I/O to support 10K to 100K processors
- Intelligent continuous processing optimization (CPO)
- Application-optimized configurable heterogeneous computing
- Workflow-based productivity analysis
- High-bandwidth module/cabinet interconnect fabric
- Capacitive-proximity chip/module interconnect: breaks bandwidth cost/performance barriers
- Developed prototype high productivity languages
- On track for 10X improvement in HPC productivity
HPCS disruptive technology will result in revolutionary HPC industry products in 2010; HPCS technology has already influenced vendors' 2006/2007 products
12. Near-Term Meetings
- Petascale Applications Workshop
  - March 22-23, Chicago (Argonne National Lab)
- Next HPCS Productivity Team/Task Group meeting
  - June 28-30, 2005, Reston, VA (general productivity session plus individual team meetings)
- Second Annual Council on Competitiveness Conference: "HPC: Supercharging U.S. Innovation and Competitiveness"
  - July 13, 2005, Washington, DC
- Milestone V Industry Reviews (two days each)
  - Week of July 25th (Sun, Cray) and August 2, 3, or 4 (IBM)
  - Standard review plus special emphasis on productivity
13. Outline
- High-End Computing University Research Activities
- HECURA Status
- High Productivity Computing Systems Program
- Phase II Update
- Vendor Teams
- Council on Competitiveness
- Productivity Team
- Phase III Concept
- Other Related Computing Technology Activities
14. HPC Industrial Users Survey: Top-Level Findings
- High Performance Computing Is Essential to Business Survival
- Companies Are Realizing a Range of Financial and Business Benefits from Using HPC
- Companies Are Failing to Use HPC as Aggressively as They Could Be
- Business and Technical Barriers Are Inhibiting the Use of Supercomputing
- Dramatically More Powerful and Easier-to-Use Computers Would Deliver Strategic, Competitive Benefits
15. Blue-Collar Computing
[Chart: the current market for HPC (programmer-productivity "heroes" at DoD, NSF, and DoE) sits at the high end of computing power, storage, and capability, while the ideal market ("Blue-Collar HPC") extends to far larger numbers of users, tasks, and applications. Axes: amount of computing power/storage/capability vs. number of users, tasks, and applications. Annotations: increased productivity gains in industry and engineering, increased gains in scientific discovery, easy pickings, competitive necessity, and business ROI.]
16. HPC ISV Phase I Survey: Early Findings (Results in July '05)

  Biosciences             66
  CAE                    112
  Chemistry               30
  Climate                  2
  DCCD                     1
  EDA                     21
  Financial                7
  General Science        105
  General Visualization    6
  Geosciences             21
  Middleware              79
  Weather                  3
  Unknown                  7
  Grand Total            460

- So far we have identified 460 ISV packages supplied by 279 organizations.
- Some are middleware, and some may be cut as we refine the data.
- Domestic/foreign sources will be identified.
- The issue is that very few of them will scale to peta-scale systems.
17. Productivity Framework
- Captures the major elements that go into evaluating a system
- Builds on current HPC acquisition processes
18. Productivity Team
Working areas: development experiments, existing-code analysis, benchmarks, and workflows/models/metrics. Team leads and participating organizations (participant counts in parentheses):
- Vic Basili (UMD): Cray(3), Sun(5), IBM(5), ARSC, UDel, Pitt, UCSB(2), UMD(8), MissSt, ISI(3), Vanderbilt(2), Lincoln(4), LLNL, MIT(2), MITRE, NSA(2), PSC, SDSC(2)
- Doug Post (LANL): Cray(2), Sun(5), IBM(6), ARL, UMD, Oregon, MissSt, DOE, HPCMO, LANL(5), ISI, Vanderbilt(2), Lincoln(4), ANL, MITRE, NASA, ORNL(2), SAIC, Sandia, NSA
- David Koester (MITRE): Cray(2), Sun(6), IBM(3), UIUC(2), UMD(3), UTK(2), UNM, ERDC, GWU, HPCMO, ISI(2), LANL(3), LBL, Lincoln(4), MITRE, UMN, NSA(2), ORNL, OSU, Sandia, SDSC(3)
- Jeremy Kepner (Lincoln): Cray(4), Sun(7), IBM(6), ARL, UMD(4), Oregon, MissSt, LANL, ISI, Lincoln(4), MITRE, UMN, NASA(2), DOE
19. Productivity Research Teams
- Benchmark Working Group (Lead: David Koester, MITRE)
- Test Spec Working Group (Lead: Ashok Krishnamurthy, OSU)
- Execution Time Working Group (Lead: Bob Lucas, USC ISI)
- Workflows/Models/Metrics Working Group (Lead: Jeremy Kepner, Lincoln)
- Existing Codes Working Group (Lead: Doug Post, LANL)
- Development Time Working Group (Lead: Vic Basili, UMD)
- High Productivity Language Systems Working Group (Lead: Hans Zima, JPL)
A distributed team involving a large cross-section of the HPC community
20General Productivity Formula
? productivity utility/ U utility user
specified T time to solution time C total
cost
CS software cost CO operation cost CM
machine cost
- Utility is value user places on getting a result
at time T - Software costs include time spent by users
developing their codes - Operating costs include admin time, electric and
building costs - Productivity formula is tailored by each user
through use of functional work flows - Developing Large multi-module codes
- Developing Small Codes
- Running applications
- Porting codes
- Administration
U
U
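As an illustrative sketch of how the formula is exercised, the snippet below evaluates productivity for one hypothetical scenario. All dollar figures are invented for illustration and are not program numbers.

```python
def productivity(utility_at_t, software_cost, operation_cost, machine_cost):
    """Evaluate the productivity formula Psi = U(T) / C.

    C is the total cost of ownership: software + operation + machine.
    """
    total_cost = software_cost + operation_cost + machine_cost
    return utility_at_t / total_cost

# Hypothetical scenario: a result worth $12M delivered at time T,
# against $2M software, $1M operations, and $3M machine cost.
psi = productivity(12e6, 2e6, 1e6, 3e6)
print(psi)  # utility per dollar of total cost
```

Because U(T) typically decays as delivery slips, the same costs yield lower productivity for a late result, which is how the formula captures time-to-solution rather than raw performance.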
21. Level 1 Functional Workflows Enable Time-to-Solution Analysis
[Workflow diagram with five flows:]
- (1) Writing Large Multi-Module Codes, (2) Writing Small Codes, and (3) Running Codes: formulate questions, develop approach, develop code, V&V, production runs, analyze results, decide/hypothesize
- (4) Porting Code: identify differences, change code, optimize
- (5) Administration: HW/SW upgrade, security management, resource management, problem resolution
- Mission Partners may create their own HPC usage scenarios from these basic workflow elements
- Items in red represent areas of highest HPC-specific interest
22. Small Code Level 2 Workflow Example: Markov Model
[State diagram built from classroom (UCSB) data: states Formulate, Program, Compile, Debug, Test, Optimize, and Run, with each transition labeled by its probability and mean time. Transition labels from the figure: 1.0/0s, 1.0/355s, 1.0/49s, .95/5s, .002/5s, .048/9s, 1.0/629s, .266/5s, 1.0/30s, .699/4s, .035/3s.]
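A Markov workflow model of this kind can be simulated directly to estimate expected time to solution. The sketch below uses an invented transition table with the same state names; the probabilities and times are illustrative and do not reproduce the UCSB classroom data.

```python
import random

# Hypothetical transition table: state -> [(next_state, prob, seconds)].
# Probabilities out of each state sum to 1.0; numbers are invented.
WORKFLOW = {
    "formulate": [("program", 1.0, 0.0)],
    "program":   [("compile", 1.0, 355.0)],
    "compile":   [("debug", 0.3, 49.0), ("test", 0.7, 49.0)],
    "debug":     [("compile", 1.0, 60.0)],
    "test":      [("program", 0.4, 30.0), ("run", 0.6, 30.0)],
    "run":       [("done", 1.0, 600.0)],
}

def simulate_time_to_solution(rng):
    """Walk the chain from 'formulate' to 'done', accumulating time."""
    state, total = "formulate", 0.0
    while state != "done":
        r, acc = rng.random(), 0.0
        for nxt, prob, secs in WORKFLOW[state]:
            acc += prob
            if r <= acc:           # sample the next transition
                total += secs
                state = nxt
                break
    return total

rng = random.Random(42)
mean_seconds = sum(simulate_time_to_solution(rng) for _ in range(1000)) / 1000
```

With real measured transitions, the same walk yields the expected development time for a small code, which feeds the T in the productivity formula.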
23. HPCS Benchmark Spectrum
Execution and development indicators span system bounds:
- Local (execution bounds): DGEMM, STREAM, RandomAccess, 1D FFT
- Global (8 HPCchallenge benchmarks): Linpack, PTRANS, RandomAccess, 1D FFT
- (40) micro-kernel benchmarks: discrete math, graph analysis, linear solvers, signal processing, simulation, I/O (the HPCS spanning set of kernels)
- (10) compact applications: 3 scalable compact apps (pattern matching, graph analysis, signal processing), 3 petascale/s simulation (compact) applications, others, classroom experiment codes
- 9 simulation applications
  - Current: UM2000, GAMESS, OVERFLOW, LBMHD/GTC, RFCTH, HYCOM
  - Near-future: NWChem, ALEGRA, CCSM
- Spans existing, emerging, and future applications (reconnaissance, simulation, intelligence)
- The spectrum of benchmarks provides different views of the system
- HPCchallenge pushes spatial and temporal boundaries and sets performance bounds
- Applications drive system issues and set legacy-code performance bounds
- Kernels and compact apps support deeper analysis of execution and development time
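Of the local kernels, STREAM is the simplest to sketch. The Triad loop below follows STREAM's accounting convention of three 8-byte accesses per element; it is a pure-Python illustration, so the reported bandwidth will be far below what a C implementation measures on the same machine.

```python
import time

def stream_triad(n=1_000_000, scalar=3.0):
    """Illustrative sketch of the STREAM Triad kernel: a[i] = b[i] + s*c[i].

    Returns (bandwidth in GB/sec, first element of the result).
    Bandwidth counts one load of b, one load of c, and one store of a
    per element, each 8 bytes, per STREAM's Triad accounting.
    """
    b = [1.0] * n
    c = [2.0] * n
    start = time.perf_counter()
    a = [b[i] + scalar * c[i] for i in range(n)]
    elapsed = time.perf_counter() - start
    gbytes = 3 * 8 * n / 1e9
    return gbytes / elapsed, a[0]
```

Triad deliberately does almost no arithmetic per byte moved, which is why it bounds sustainable memory bandwidth rather than floating-point rate.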
24. HPCchallenge Bounds Performance
HPCS challenge points: HPCchallenge benchmarks, http://icl.cs.utk.edu/hpcc/
- HPCchallenge
  - Pushes spatial and temporal boundaries
  - Defines architecture performance bounds
25. HPCchallenge Website: Kiviat Diagram Example (AMD Configurations)
- Not all TOP500 systems are created equal!
The HPCS/Mission Partner Productivity Team is providing an HPC system analysis framework
26. Development Time Activities (1): Victor R. Basili, Team Lead
- Created the infrastructure for conducting experimental studies in the field of high performance computing program development
- Designed and conducted classroom studies
  - A total of 7 HPC classes were studied; data from 15 assignments was collected and analyzed
- Designed and conducted observational studies (studying HPC experts working on small assignments)
  - 2 observational studies have been conducted and analyzed
- Designed and conducted case studies (studying HPC experts working on real projects)
  - Conducted 2 case studies, 1 of which has completed
- Developed a refined experimental design for experiments in 2005
27. Development Time Activities (2)
- Developed a downloadable instrumentation package
  - Looking for expert volunteers to download and use the package
- Built knowledge about how to conduct experiments in the HPC environment
- Tested and evaluated data collection tools
  - Hackystat
  - Eclipse
- Developed new hypotheses
- Developed and analyzed a list of HPCS folklore
- Developed and analyzed a list of common HPCS defects
28. Measuring Development Time
- Real applications: 2 case studies
- Small projects: 7 HPC classes studied (15 projects, 100 students)
- Validity: 2 observational studies
- HPC center tutorials and classroom studies
- Cost: new data collection tools (Hackystat, Eclipse); developed a downloadable package
- Developing a new methodology for conducting these tests
- Comparing programming models and languages
- Measuring performance achieved, effort, and expertise
- Workflow steps and time spent in each step
29. Outline
- High-End Computing University Research Activities
- HECURA Status
- High Productivity Computing Systems Program
- Phase II Update
- Vendor Teams
- Council on Competitiveness
- Productivity Team
- Phase III Concept
- Other Related Computing Technology Activities
30. HPCS Draft Phase III Program
[Program timeline chart (CY 2002-2011): (Funded Five) Phase I Industry Concept Study and (Funded Three) Phase II R&D lead into Phase III system development and demonstration. Industry milestones 1-7 include the Concept Review, System Design Review, SCR, PDR, CDR, DRR, and Technology Assessment Review; software deliverables run from the SW Dev Unit through SW Rel 1, 2, and 3, with an HPLS plan, early demo, and final demo. Mission Partner activities include system and development commitments, language development, peta-scale application development, peta-scale procurements, and unit deliveries. Productivity assessment by MIT LL, DOE, DoD, NASA, and NSF continues throughout.]
31. Outline
- High-End Computing University Research Activities
- HECURA Status
- High Productivity Computing Systems Program
- Phase II Update
- Vendor Teams
- Council on Competitiveness
- Productivity Team
- Phase III Concept
- Other Related Computing Technology Activities
32. Related Technologies
"Systems That Know What They're Doing"
- Intelligent Systems: Architectures for Cognitive Information Processing (ACIP)
- High-End Application-Responsive Computing: High Productivity Computing Systems Program (HPCS)
- Mission-Responsive Architectures: Polymorphous Computing Architectures Program (PCA)
- Power Management: Power Aware Computing and Communications Program (PAC/C)
Also shown: HECURA, OneSAF Objective System, XPCA