Title: Computing and Computational Sciences Directorate Overview
1 Computing and Computational Sciences Directorate Overview
Computer Science and Mathematics Division Overview
Briefing for RAMS Program
Jeff Nichols, PhD, Director, Computer Science and Mathematics Division
Oak Ridge National Laboratory
December 10, 2007
2 Outline
- Computing History at ORNL
- Current Computing Architecture
- Multi-agency strategy
- Systems
- Storage
- Networks
- Data Analytics
- Facilities
- Software (CS Research, Math and Applications)
- Roadmap for the Future
3 World leadership in computing and neutrons
critical to addressing national challenges
4 Oak Ridge National Laboratory
Mission: Conduct basic and applied research and development to create scientific knowledge and technological innovations that enable the solution of compelling national problems
Spallation Neutron Source and Center for Nanophase Materials Sciences
- Managed by UT-Battelle since April 2000
- Key capabilities
- Neutron science
- Ultrascale computing
- Systems biology
- Materials science at the nanoscale
- Advanced energy technologies
- National security
- 4,100 staff
- FY05 budget: $1B
[Chart: FY05 budget breakdown across Science, Energy, Environment, National Security, Leadership Computing Facility, Capital/construction, WFO, and Other/DOE]
5 Computing and Computational Sciences Directorate
Associate Laboratory Director: Thomas Zacharia; Linda Malone, Executive Secretary
Operations: Chris Kemper, Director
Strategic Programs: Gil Weigand, Director
Computational Sciences and Engineering Division: Brian Worley, Director
Computer Science and Mathematics Division: Jeff Nichols, Director
Information Technology Services Division: Becky Verastegui, Director
Office of the Chief Information Officer: Scott Studham, Chief Information Officer
National Center for Computational Sciences: Jim Hack, Director
Leadership Computing Facility: Buddy Bland, Project Director
Staff Support: Debbie McCoy, Assistant to the ALD; Nancy Wright, Organizational Specialist
Matrix Support: Willy Besancenez, Procurement; Jim Joyce, Recruiting; Ursula Henderson, Chuck Marth, Wanda McCrosky, Kim Milburn, Finance Managers; Haifa Abbot, HR Manager; Kyle Spence, Business Manager; Jill Turner, HR Assistant
Joint Institute for Computational Sciences: Thomas Zacharia, Director
6 ORNL: scientific and technical computing for many years
[Timeline of historical systems: 360/195, 3033, 4341, KSR-1, SP2, and others]
7 We have a three-pronged strategy for sustained leadership and programmatic impact
- Provide the nation's most powerful open resource for capability computing
- Follow a well-defined path for maintaining national leadership in this critical area
- Deliver cutting-edge science relevant to the missions of key federal agencies
- Synergy of requirements and technology
- Unique opportunity for multi-agency collaboration for science
8 DOE's Leadership Computing Facility
- Delivered a series of increasingly powerful and diverse computer systems, beginning with the KSR, Intel Paragon, Compaq AlphaServer, IBM Power3 and Power4, SGI Altix, Cray X1, XT3, and XT4
- Worked directly with application teams to port, scale, and tune codes, with great success going from hundreds to tens of thousands of processors
- Operational excellence in managing the systems to deliver science for government, academia, and industry
- First DOE lab with a cyber security program accredited at the moderate level of controls, allowing export-controlled and proprietary work
9 NSF's National Institute for Computational Sciences
- The ORNL/UT Joint Institute for Computational Sciences is building the most powerful NSF center from the ground up
- Goes operational in February 2008!
- Series of computers culminating in a 1 PF system in 2010
- Must build networks, storage, servers, web, user support, computational science liaisons, grid infrastructure, ...
Operational Assessment Review, August 27, 2007
10 DARPA High Productivity Computing Systems: $250M Award to Cray-ORNL Team
- Impact
- Performance (time-to-solution): speed up critical national security applications by a factor of 10X to 40X
- Programmability (idea-to-first-solution): reduce cost and time of developing application solutions
- Portability (transparency): insulate research and operational application software from the system
- Robustness (reliability): apply all known techniques to protect against outside attacks, hardware faults, and programming errors
HPCS Program Focus Areas
- Applications
- Intelligence/surveillance, reconnaissance, cryptanalysis, weapons analysis, airborne contaminant modeling, and biotechnology
Fill the critical technology and capability gap from today (late-80s HPC technology) to the future (quantum/bio computing)
Slide courtesy of DARPA
11 Multipurpose Research Facility (MRF): ORNL / DoD Collaboration
- ORNL MRF
- 40,000 ft2 space, 25 MW
- Peta/exa-scale technology partnerships
- Multi-million dollar peta/exa scale system
software center - System design, performance and benchmark studies
- Large-scale system reliability, availability and
serviceability (RAS) improvements - Wide-area network performance investigations
- Advanced mathematics and domain science studies
12 Jaguar
- 119 teraflops
- 11,708 dual-core 2.6 GHz processors
- 46 terabytes of memory
- 600 terabytes of scratch disk space
- Will be upgraded to quad-core processors in December
- Peak performance will be over 275 TF
- System memory over 60 TB
13 Phoenix: Cray X1E, 18.5 TF
- Being used by Boeing for engineering validation
studies - Outstanding system for climate modeling and
fusion reactor design
- Phoenix Cray X1E
- Largest Cray vector system in the world
- 1,024 vector processors
- 2 TB shared memory
14 NICS Systems
- Cray Baker system, Spring 2009
- 10,000 Opteron multi-core processors
- 100 TB of memory
- 2.3 PB of disk space
- Upgrade processors in 2010 to reach 1 petaflop
- Initial delivery, February 2008
- 4,512 Opteron quad-core processors
- 170 teraflops, faster than today's largest system at ORNL
15 Blue Gene/P Comes to ORNL
- Oak Ridge National Laboratory and IBM have teamed up to bring the next generation of the IBM Blue Gene supercomputer, the Blue Gene/P, to ORNL
- The new system was accepted in late September; it features 8,192 compute cores and is capable of more than 27 trillion calculations a second, or 27 teraflops
"Selected chemistry and materials applications especially have shown strong performance on the Blue Gene. We look forward to seeing researchers produce cutting-edge science on this system."
Thomas Zacharia, ORNL's Associate Laboratory Director for Computing and Computational Sciences
Chemistry and materials applications show promise
16 ORNL Institutional Cluster
- 20 TF clusters
- Combination of dual- and quad-core Xeon
- 2300 cores
- Capacity computing for ORNL staff
- Add new processors each year and remove oldest
processors after 3 years
17 New Storage Device Online at the NCCS
More storage for data-intensive applications
- The NCCS recently received an upgrade to its High Performance Storage System (HPSS) with the addition of the Sun StorageTek SL8500 modular library system
- The HPSS helps researchers manage the massive amounts of data necessary to tackle Grand Challenge science
- HPSS is now able to store more overall data, giving one of the nation's top supercomputing centers even more ammunition with which to tackle today's Grand Challenge science
18 Facilities Designed for Exascale Computing
- Open Science Center (40,000 ft2)
- Upgrading building power to 15 MW
- 210 MW substation, upgradeable to 280 MW
- Deploying a 6,600 ton chiller plant
- Tripling UPS and generator capability
- National Security Center (40,000 ft2)
- Capability computing for national defense
- 25 MW of power and 8,000-ton chiller
- New Computer Facility (100,000 ft2)
- 100,000 ft2 raised floor, expandable with modular build-out to over 250,000 ft2
- Open and secure operations
- Lights-out facility
19 State-of-the-art Infrastructure - Power
[Diagram: external power grid (Ft. Loudon, Bull Run, and Kingston feeds) supplying local distribution to the Computational Sciences Building and Multipurpose Research Facility through a 210-MW substation, upgradeable to 280 MW, plus a 4000 substation]
20 State-of-the-art Infrastructure - Networking: ORNL owns fiber optic cable to Chicago, Nashville, and Atlanta to allow increased bandwidth as needed
21 Hardware - ORNL LCF Roadmap to Exascale (2006-2017)
Phase 4
- 1 EF (LCF-5)
Phase 3
- 100 PF > 250 PF (LCF-4)
- 20 PF > 40 PF (LCF-3)
Phase 2
- 600 TF Cray Granite Prototype (DARPA HPCS)
- 1 PF Cray Baker (NSF-1)
- 1 PF > 3-6 PF Cray Baker (LCF-2)
- Disruptive Technology?
Phase 1
- Cray Baker Prototype (DARPA)
- 170 TF Cray XT4 (NSF-0)
- 50 TF > 100 TF > 250 TF Cray XT4 (LCF-1)
- 18.5 TF Cray X1E (LCF-0)
Facilities: ORNL Computational Sciences Building, ORNL Multipurpose Research Facility, and ORNL Multi-Agency Computer Facility (100,000 to 250,000 ft2)
22 Software Partnerships to Ensure Success
Partnerships span DOE, NSF, and DoD:
- Scientific Computing Group
- Cray Supercomputing Center of Excellence
- Lustre Center of Excellence
- DOE Core program (FASTOS, etc.)
- INCITE Applications Teams
- SciDAC Application Teams
- SciDAC Centers for Enabling Technology
- SciDAC Institutes
- NSF Application Teams
- NSF PetaApps
- Neutron Sciences TeraGrid gateway
- CTSS TeraGrid software
- Supporting XRAC, LRAC, and MRAC allocations
- Data analytics
- Partnership on exascale systems and applications
- Collaboration on system architecture, system software, and algorithms
- System software for future architectures
- HPCS software centers
- HPCS prototypes
23 Computer Science and Mathematics Division
The Computer Science and Mathematics Division
(CSMD) is ORNL's premier source of basic and
applied research in high-performance computing,
applied mathematics, and intelligent systems.
Basic and applied research programs are focused
on computational sciences, intelligent systems,
and information technologies.
- Mission
- Our mission includes basic research in
computational sciences and application of
advanced computing systems, computational,
mathematical and analysis techniques to the
solution of scientific problems of national
importance. We seek to work collaboratively with
universities throughout the world to enhance
science education and progress in the
computational sciences. - Vision
- The Computer Science and Mathematics Division
(CSMD) seeks to maintain our position as the
premier location among DOE laboratories and to
become the premier location worldwide where
outstanding scientists, computer scientists and
mathematicians can perform interdisciplinary
computational research.
24 Computer Science and Mathematics: J. A. Nichols, Director; L. M. Wolfe, Division Secretary
ADVISORY COMMITTEE: Jerry Bernholc, David Keyes, Thomas Sterling, Warren Washington
S. W. Poole, Chief Scientist and Director of
Special Programs
Center For Engineering Science Advanced Research
(CESAR) J. Barhen, Director
Center for Molecular Biophysics J. C. Smith,
Director
J. Cobb, Lead, TeraGrid
Complex Systems J. Barhen P. Boyd Y. Y.
Braiman C. W. Glover W. P. Grice T. S. Humble5 N.
Imam D. L. Jung S. M. Lenhart H. K. Liu Y. Liu L.
E. Parker N. S. V. Rao D. B. Reister7 D. R. Tufano
Computer Science Research G. A. Geist C.
Sonewald P. K. Agarwal D. E. Bernholdt M. L.
Chen J. J. Dongarra W. R. Elwasif C. Engelmann T.
V. Karpinets8 G. H. Kora8 J. A. Kohl R.
Krishnamurthy5a A. Longo2 X. Ma2 T. J.
Naughton5a H. H. Ong B.-H. Park8 N. F.
Samatova S. L. Scott J. Schwidder A. Shet5a C. T.
Symons5a G. R. Vallee5 S. Vazhkudai W. R. Wing M.
Wolf2 S. Yoginath5a
Statistics and Data Science J. A. Nichols
(acting) L. E. Thurston R. W. Counts B. L.
Jackson G. Ostrouchov L. C. Pouchard D. D.
Schmoyer D. A. Wolf
Computational Earth Sciences J. B. Drake
C. Sonewald M. L. Branstetter D. J. Erickson M.
W. Ham F. M. Hoffman G. Mahinthakumar7 R. T.
Mills P. H. Worley
Computational Mathematics E. F. D'Azevedo L. E. Thurston V. Alexiades R. K. Archibald4 B. V. Asokan5 V. K. Chakravarthy R. Deiterding4 G. I. Fann S. N. Fata5 L. J. Gray R. Hartman-Baker5 J. Jia5 A. K. Khamayseh M. R. Leuze S. Pannala P. Plechac2
Computational Materials Science T. C.
Schulthess L. C. Holbrook J. A. Alford5 G.
Alvarez S. Dag5 R. M. Day5a Y. Gao2 A. A.
Gorin I. Jouline2 T. Kaplan P. R. Kent8 J-Q.
Lu5 T. A. Maier S. Namilae5 D. M. Nicholson P.
K. Nukala Y. Osetskiy B. Radhakrishnan G. B.
Sarma M. Shabrov5 S. Simunovic X. Tao5 X-G.
Zhang J. Zhong C. Zhou5
Computational Chemical Sciences R. J.
Harrison L. E. Thurston J. Alexander1 E.
Apra J. Bernholc A. Beste5 D. Crawford2 T.
Fridman8 B. C. Hathorn A. Kalemos5 V. Meunier W.
Lu2 M. B. Nardelli2 C. Pan W. A. Shelton S.
Sugiki5 B. G. Sumpter
Future Technologies J. S. Vetter T. S.
Darland S. R. Alam R. F. Barrett M. D. Beck N.
Bhatia5a J. A. Kuehn C. B. McCurdy5a J. S.
Meredith K. J. Roche P. C. Roth O. O.
Storaasli W. Yu
Computational Astrophysics A. Mezzacappa P.
Boyd C. Y. Cardall6 E. Endeve5,6 H. R. Hix6 E.
Lentz5,6 B. Messer6
Operations Council
Finance: U. F. Henderson
Technical Information and Communications: B. A. Riley6
Human Resource: M. J. Palermo
Facility 5600/5700: B. A. Riley6
Organizational Management: N. Y. Wright6
Computer Security: T. K. Jones6
Recruiter: J. K. Johnson
Quality Assurance: R. W. Counts6
ESH/Safety Officer: R. J. Toedte6
CSB Computing Center Manager: M. W. Dobbs6
1 Co-op; 2 Joint Faculty; 3 Wigner Fellow; 4 Householder Fellow; 5 Postdoc, 5a Postmaster; 6 Dual Assignment; 7 Part-time; 8 JICS
3/6/2007
25 Computer Science Research
Perform basic research and develop software and tools to make high-performance computing more effective and accessible for scientists and engineers.
- Heterogeneous Distributed Computing: PVM, Harness, Open MPI (NCCS)
- Holistic Fault Tolerance: CIFTS
- CCA: changing the way scientific software is developed and used
- Cluster computing management and reliability: OSCAR, MOLAR
- Building a new way to do Neutron Science: SNS portal
- Building tools to enable the LCF science teams: Workbench
- Data-Intensive Computing for Complex Biological Systems: BioPilot
- Earth System Grid: turning climate datasets into community resources (SDS, Climate)
- Robust storage management from supercomputer to desktop: FreeLoader
- UltraScienceNet: defining the future of national networking
- Miscellaneous: electronic lab notebooks, CUMULVS, bilab, smart dust
26 Special Projects / Future Technologies
- Sponsors include
- SciDAC
- Performance Engineering Research Institute
- Scientific Data Management
- Petascale Data Storage Institute
- Visualization (VACET)
- Fusion
- COMPASS
- DOE Office of Science
- Fast OS - Molar
- Software Effectiveness
- DOD
- HPC Mod Program
- NSA
- Peta-SSI FASTOS
- Research Mission - performs basic research in
core technologies for future generations of
high-end computing architectures and system
software, including experimental computing
systems, with the goal of improving the
performance, reliability, and usability of these
architectures for users. - Topics include
- Emerging architectures
- IBM Cell (i.e., the PlayStation 3 processor)
- Graphics Processors (e.g., Nvidia)
- FPGAs (e.g., Xilinx, Altera, Cray, SRC)
- Cray XMT multithreaded architecture
- Operating systems
- Hypervisors
- Lightweight Kernels for HPC
- Programming Systems
- Portable programming models for heterogeneous
systems - Parallel IO
- Improving Lustre for Cray
- Performance modeling and analysis
- Improving performance on today's systems
- Modeling performance on tomorrow's systems (e.g., DARPA HPCS)
- Tools for understanding performance
27 Complex Systems
Mission: Support DOD and the Intelligence Community
Theory - Computation - Experiments
- Examples of current research topics
- Missile defense: C2BMC (tracking and discrimination), NATO, flash hyperspectral imaging
- Modeling and simulation: sensitivity and uncertainty analysis, global optimization
- Laser arrays: directed energy, ultraweak signal detection, terahertz sources, underwater communications, SNS laser stripping
- Terascale embedded computing: emerging multicore processors for real-time signal processing applications (Cell, HyperX, ...)
- Anti-submarine warfare: ultra-sensitive detection, coherent sensor networks, advanced computational architectures for nuclear submarines, Doppler-sensitive waveforms, synthetic aperture sonar
- Quantum optics: cryptography, quantum teleportation
- Computer science: UltraScience Net (40-100 Gb/s per λ)
- Intelligent systems: neural networks, mobile robotics
Sponsors: DOD (AFRL, DARPA, MDA, ONR, NAVSEA), DOE (SC), NASA, NSF, IC (CIA, DNI/DTO, NSA)
UltraScience Net
28 Statistics and Data Science
- Chemical and Biological Mass Spectrometer Project
- Discrimination of UXO
- Forensics - Time Since Death
- A Statistical Framework for Guiding Visualization
of Petascale Data Sets - Statistical Visualization on Ultra High
Resolution Displays - Local Feature Motion Density Analysis
- Statistical Decomposition of Time Varying
Simulation Data - Site-wide Estimation of Item Density from Limited
Area Samples - Network Intrusion Detection
- Bayesian Individual Dose Estimation
- Sparse Matrix Computation for Complex Problems
- Environmental Tobacco Smoke (ETS)
- Explosive Detection Canines
- Fingerprint Uniqueness Research
- Chemical Security Assessment Tool (CSAT)
- Sharing a World of Data: Scaling the Earth System Grid to Petascale
- Group Violent Intent Modeling Project
- ORCAT: a desktop tool for the intelligence analyst
- Data model for end-to-end simulations with Leadership Class Computing
29 Computational Mathematics
- Development of multiresolution analysis for
integro-differential equations and Y-PDE - Boundary integral modeling of Functionally Graded
Materials (FGMs) - Large-scale parallel Cartesian structured
adaptive mesh refinement - Fast Multipole / non-uniform FFTs
- New large-scale first principles electronic
structure code - New electronic structure method
- Fracture of 3-D cubic lattice system
- Adventure system
- Eigensolver with Low-rank Upgrades for
Spin-Fermion Models
30 The Science Case for Peta- (and Exa-) scale Computing
- Energy
- Climate, materials, chemistry, combustion,
biology, nuclear fusion/fission - Fundamental sciences
- Materials, chemistry, astrophysics
- There are many others
- QCD, accelerator physics, wind, solar,
engineering design (aircraft, ships, cars,
buildings) - What are key system attribute issues?
- Processor speed
- Memory (capacity, B/W, latency)
- Interconnect (B/W, latency)
31 Computational Chemical Sciences
- Application areas
- Chemistry, materials, nanoscience, electronics
- Major techniques
- Atomistic and quantum modeling of chemical
processes - Statistical mechanics and chemical reaction
dynamics - Strong ties to experiment
- Catalysis, fuel cells (H2), atomic microscopy,
neutron spectroscopy - Theory
- Electronic structure, statistical mechanics,
reaction mechanisms, polymer chemistry, electron
transport - Development
- Programming models for petascale computers
- Petascale applications
Computational Chemical Sciences is focused on the
development and application of major new
capabilities for the rigorous modeling of large
molecular systems found in, e.g., catalysis,
energy science and nanoscale chemistry.
Funding sources: OBES, OASCR, NIH, DARPA
32 Science Case Example - Chemistry
- 250 TF
- Accurate large-scale, all-electron, density functional simulations
- Adsorption on a transition metal oxide surface (an important catalytic phenomenon)
- Benchmarking of density functionals (importance of Hartree-Fock exchange)
- 1 PF
- Dynamics of few-electron systems
- Model of few-electron systems interacting with intense radiation to a guaranteed finite precision
- Sustained PF
- Treatment of the adsorption problem with larger unit cells to avoid any source of error
- 1 EF
- Extension of the interaction with intense
radiation to more realistic systems containing
more electrons
Design catalysts, understand radiation interaction with materials, and quantitatively predict adsorption processes
33 System Attribute Example - Memory B/W
- Application drivers
- Multi-physics, multi-scale applications stress memory bandwidth as much as they do node memory capacity
- Intuitive coding styles often produce poor memory access patterns
- Poor data structures, excessive data copying, indirect addressing
- Algorithm drivers
- Unstructured grids, linear algebra
- Example: AMR codes work on blocks or patches, encapsulating small amounts of work per memory access to make the codes readable and maintainable
- This sort of architecture requires very good memory BW to achieve good performance
- Memory B/W suffers in the multi-core future
- Will apps just have to get used to not having any?
- Methods to (easily) exploit multiple levels of hierarchical memory are needed
- Cache blocking, cache blocking, cache blocking (see the sketch below)
- Gather-scatter
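A minimal sketch in C of the cache-blocking idea referenced above, using a matrix transpose as a stand-in kernel; the dimension N and block size BS are illustrative values, not figures from the slide.

#include <stdio.h>

#define N  1024   /* matrix dimension (illustrative) */
#define BS   64   /* block size chosen so a pair of tiles fits in cache */

static double a[N][N], b[N][N];

/* Naive transpose: the write stream b[j][i] walks down a column,
 * touching a new cache line on nearly every iteration for large N. */
static void transpose_naive(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            b[j][i] = a[i][j];
}

/* Blocked transpose: work proceeds BS x BS tile by tile, so each tile
 * of a and b stays resident in cache while it is being used. */
static void transpose_blocked(void) {
    for (int ii = 0; ii < N; ii += BS)
        for (int jj = 0; jj < N; jj += BS)
            for (int i = ii; i < ii + BS; i++)
                for (int j = jj; j < jj + BS; j++)
                    b[j][i] = a[i][j];
}

int main(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = (double)(i * N + j);
    transpose_naive();
    transpose_blocked();
    printf("b[1][0] = %g\n", b[1][0]);   /* expect 1 */
    return 0;
}

The same tiling idea applies to the AMR block/patch loops mentioned above: keeping a patch's working set inside cache is what recovers effective memory bandwidth.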
34 System Attribute Example - Interconnect Latency
- Application drivers
- Biology
- Stated goal of 1 ms simulation time per wall clock day requires 0.1 ms wall clock time per time step using current algorithms
- Others: chemistry, materials science
- Algorithm drivers
- Explicit algorithms using nearest-neighbor or systolic communication
- Medium- to fine-grain parallelization strategies (e.g., various distributed-data approaches in computational chemistry)
- Speed of light puts a fundamental limit on interconnect latency
- Yet raw compute power keeps getting faster, increasing the imbalance
- Path forward
- Need new algorithms to meet performance goals
- The combination of SW and HW must make it possible to fully overlap communication and computation (see the sketch below)
- Specialized hardware for common ops?
- Synchronization: global and semi-global reductions
- Vectorization/multithreading of communication?
35 Analysis of applications drives architecture
- System design driven by usability across a wide range of applications
- We understand the applications' needs and their implications for architectures to deliver science at the exascale
[Chart: application-importance scores for system attributes (maximum possible score 97, minimum 19; reported scores 96, 91, 88, 84, 82, 77, 61, 56, 44, 40, 40, 38). Attributes scored: Node Peak Flops, Memory Bandwidth, Interconnect Latency, Memory Latency, Interconnect Bandwidth, Node Memory Capacity, Disk Bandwidth, Local Storage Capacity, WAN Network Bandwidth, Mean Time to Interrupt, Disk Latency, Archival Storage Capacity]
36 Engaging application communities to determine system requirements
Community engaged | Process of engagement | Outcome
Users | Enabling Petascale Science and Engineering Applications Workshop, December 2005, Atlanta, GA | Science breakthroughs requiring sustained petascale; community benchmarks
Application developers | Sustained Petaflops Science Requirements, Application Design and Potential Breakthroughs Workshop, November 2006, Oak Ridge, TN | Application walk-throughs; identification of important HPC system characteristics
Application teams | Weekly conference calls, November 2006 - January 2007 | Benchmark results; model problems
Application teams | Cray Application Projection Workshop, January 2007, Oak Ridge, TN | Projection of benchmarks on XT4 to model problems on a sustained-petaflops system
37 We are Getting a Better Handle on Estimating I/O Requirements
[Charts: projected disk bandwidth and disk capacity requirements]
38 Software Implementations
- Fortran is still winning
- NetCDF and HDF5 use is widespread, but not their parallel equivalents
- Widespread use of BLAS and LAPACK
39 Plan for delivering Exascale systems in a decade
Mission: Deploy and operate the computational resources needed to tackle global challenges
Vision: Maximize scientific productivity and progress on the largest-scale computational problems
Application focus: climate change, terrestrial sequestration of carbon, sustainable nuclear energy, bio-fuels and bio-energy, clean and efficient combustion, nanoscale materials, energy, ecology, and security
Approach: provide world-class computational resources and specialized services for the most computationally intensive problems; provide a stable hardware/software path of increasing scale to maximize productive applications development; field a series of increasingly powerful systems for science
Roadmap:
- FY2009: Cray Baker, 1 PF (AMD multi-core socket G3), with upgrade to 3-6 PF
- FY2011: 20 PF system with upgrade to 40 PF
- FY2014: 100 PF system with upgrade to 250 PF, based on disruptive technologies
- FY2017: 1 EF system with upgrade to 2 EF
40 ORNL design for a 20 PF system
- 264 compute cabinets
- 54 router cabinets
- 25,000 uplink cables
- Up to 20m long
- 3.125 Gbps signaling
- Potential for optical links
- Fat Tree Network
- 20 PF peak
- 1.56 PB memory
- 234 TB/s global bandwidth
- 6224 SMP nodes, 112 IO nodes
- Disk
- 46 PB, 480 GB/s
- 7,600 square feet
- 14.6 MW
41 Performance projected from 68 TF Cray XT4 to 20 PF Cray Cascade
[Chart: projected performance and bottlenecks]
42 Processor Trends
[Chart: processor performance trends]
43 How do Disruptive Technologies change the game?
- Assume 1 TF to 2 TF per socket
- Assume 400 sockets/rack: 400 to 800 TF/rack
- 25 to 50 racks for a 20 PF system
- Memory technology investment needed to get the bandwidth without using 400,000 DIMMs
- Roughly 150 kW/rack
- Water-cooled racks
- Lots of potential partners: Sun, Cray, Intel, AMD, IBM, DoD
44 Building a 100-250 PF system
- Assume 5 TF to 10 TF per socket
- Assume 400 sockets/rack: 2 PF to 4 PF/rack
- 50 to 100 racks for a 100-250 PF system
- Memory technology investment needed to get the bandwidth without using 800,000 DIMMs
- Various 3D technologies/solutions could be available
- Partial photonic interconnect
- Roughly 150 kW/rack
- Water-cooled racks
- Liquid-cooled processors
- Hybrid is certainly an option
- Potential partners: Sun, Cray, Intel, IBM, AMD, nVidia??
45 Building an Exaflop System
- Assume 20 TF to 40 TF per socket
- Assume 400 sockets/rack: 8 PF to 16 PF/rack
- 125 to 250 racks for a 1-2 EF system (see the rack-count sketch below)
- Memory technology investment needed to get the bandwidth without using 1,000,000 DIMMs
- Various 3D technologies will be available, plus new memory design(s)
- All-photonic interconnect (no copper)
- Roughly 250 kW/rack
- Water-cooled racks
- Liquid-cooled processors
- Hybrid is certainly an option
- Potential partners: Sun, Cray, Intel, IBM, AMD, nVidia??
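A minimal sketch of the rack-count arithmetic behind the three preceding slides, under the stated per-socket and per-rack assumptions; the helper name racks_needed is illustrative, and the middle case shows the ends of the 100-250 PF range rather than reproducing the slide's quoted 50-100 rack figure.

#include <stdio.h>

/* Racks needed for a target peak, given per-socket performance (in TF)
 * and sockets per rack, using the assumptions stated on slides 43-45. */
static int racks_needed(double target_tf, double tf_per_socket, int sockets_per_rack)
{
    double tf_per_rack = tf_per_socket * sockets_per_rack;
    return (int)((target_tf + tf_per_rack - 1.0) / tf_per_rack);   /* round up */
}

int main(void)
{
    const int sockets_per_rack = 400;

    /* 20 PF with 1-2 TF sockets: 50 down to 25 racks (slide 43) */
    printf("20 PF:      %d-%d racks\n",
           racks_needed(20000.0, 2.0, sockets_per_rack),
           racks_needed(20000.0, 1.0, sockets_per_rack));

    /* 100 PF at 5 TF sockets and 250 PF at 10 TF sockets (slide 44 quotes 50-100) */
    printf("100-250 PF: %d-%d racks\n",
           racks_needed(100000.0, 5.0, sockets_per_rack),
           racks_needed(250000.0, 10.0, sockets_per_rack));

    /* 1-2 EF at 8 PF/rack (20 TF sockets): 125-250 racks (slide 45) */
    printf("1-2 EF:     %d-%d racks\n",
           racks_needed(1000000.0, 20.0, sockets_per_rack),
           racks_needed(2000000.0, 20.0, sockets_per_rack));
    return 0;
}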
46 What investments are needed?
- Memory bandwidth on processors
- Potential 2-10X
- Need 2-4X just to maintain the memory BW/FLOP ratio
- Potential joint project with AMD and Sun to increase effective bandwidth
- Another possibility is 3D memory - assume $15M with partners
- Interconnect performance
- Working with IBM, Intel, Mellanox, Sun
- Define 8X, 16X, 32X (4X, 12X already defined)
- Define ODR in IBTA
- Already showing excellent latency characteristics
- All-optical (exists currently)
- Systems software, tools, and applications
- Proprietary networks: possible 10X, but large investment and risk; assume $20M
- Packaging density and cooling will continue to require investment
- Optical cabling/interconnect is a requirement, yet always in the future
47 Strengths in five key areas will drive project success with reduced risk
Applications
- Multidisciplinary application development teams
- Partnerships to drive application performance
- Science base and thought leadership
People
- Exceptional operational expertise and experience
- In-depth applications expertise in house
- Strong partners
- Proven management team
Software
- Broad system software development partnerships
- Experienced performance/optimization tool development teams
- Partnerships with vendors and agencies to lead the way
Systems
- Driving architectural innovation needed for exascale
- Superior global and injection bandwidths
- Purpose-built for scientific computing
- Leverages DARPA HPCS technologies
Facility
- Power (reliability, availability, cost)
- Space (current and growth path)
- Global network access capable of 96 x 100 Gb/s
[Diagram: path from petaflops to exaflops]
48 Questions?