Planned Machines: ASCI Purple, ALC and M

1
Planned Machines: ASCI Purple, ALC and M&IC MCR
  • Presented to SOS7
  • Mark Seager
  • seager@llnl.gov
  • 925-423-3141
  • ICCD ADH for Advanced Technology
  • Lawrence Livermore National Laboratory

This work was performed under the auspices of the
U.S. Department of Energy by the University of
California, Lawrence Livermore National
Laboratory under Contract No. W-7405-Eng-48.
2
Q1: What is unique in structure and function of
your machine?
  • Purple's unique structure is fat SMPs with 16
    rails of Federation interconnect
  • MCR & ALC's unique structure is the shared global
    file system
  • However, the most important point is that
    applications are highly mobile between Purple,
    MCR & ALC, White, Q and other clusters of SMP
    systems

3
Purple's unique structure is fat SMPs with 16
rails of interconnect
Diagram labels: Fibre Channel 2 I/O network; 16 Federation links per SMP in four switch planes; system data and control networks; 191 parallel batch/interactive/visualization nodes
  • Purple System
  • 100 TF/s; 30-45 TF/s delivered on sPPM & UMT2000
  • 50 TB memory, 2.0 PB of disk @ 108 GB/s delivered
  • 197 x 64-way Armada SMPs w/ 16 Federation links
  • 4 Login/network nodes
  • Login/network nodes for login/NFS
  • 8x10 Gb/s for parallel FTP on each Login
  • All external networking is 1-10 Gb/s Ethernet
  • Clustered I/O services for cluster wide file
    system
  • Fibre Channel 2 I/O attach does not extend
  • Programming/Usage Model
  • Application launch over all compute nodes up to
    8,192 tasks
  • 1 MPI task/CPU and shared memory, full 64-bit
    support
  • Scalable MPI (MPI_Allreduce, buffer space)
  • Likely usage
  • Multiple MPI tasks/node with 4-16 OpenMP threads/MPI task
  • Single STDIO interface
  • Parallel I/O to single file, multiple serial I/O
    (1 file/MPI task)
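
The task and thread counts in the programming model can be checked with a little arithmetic. The sketch below is illustrative only: it assumes 64 CPUs per SMP node, one CPU per thread, and the 8,192-task launch limit quoted above. The function names `tasks_per_node` and `nodes_needed` are hypothetical helpers, not part of any LLNL tool.

```python
# Figures taken from the slide; everything else here is an illustrative assumption.
CPUS_PER_NODE = 64   # each SMP is described as 64-way
MAX_TASKS = 8192     # largest application launch quoted on the slide


def tasks_per_node(threads_per_task: int, cpus_per_node: int = CPUS_PER_NODE) -> int:
    """MPI tasks that fit on one node when each task runs this many threads."""
    if threads_per_task <= 0 or cpus_per_node % threads_per_task != 0:
        raise ValueError("threads per task must evenly divide the CPUs per node")
    return cpus_per_node // threads_per_task


def nodes_needed(mpi_tasks: int, threads_per_task: int = 1) -> int:
    """Whole nodes needed so that every thread of every task gets its own CPU."""
    per_node = tasks_per_node(threads_per_task)
    return -(-mpi_tasks // per_node)  # ceiling division


# Pure MPI (1 task per CPU) at the launch limit.
print("pure MPI:", nodes_needed(MAX_TASKS), "nodes for", MAX_TASKS, "tasks")

# Hybrid layouts in the 4-16 threads-per-task range.
for threads in (4, 8, 16):
    print(threads, "threads/task ->", tasks_per_node(threads), "MPI tasks/node")
```

Under these assumptions, a pure-MPI launch at the 8,192-task limit fills 128 nodes, and the hybrid layouts place between 4 and 16 MPI tasks on each node.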

4
Unique feature of ALC & MCR is the Lustre Lite shared
file system
Cluster wide file system leverages DOE/NNSA ASCI
PathForward Open Source Lustre development
5
Q2: What characterizes your applications?
Examples are intensities of message passing,
memory utilization, computing, I/O, and data.
  • Applications characterized as multi-physics
    package simulations
  • All applications compute/comms intensive
  • Each package pushes performance envelope along a
    different dimension
  • Some packages are MPI latency dominated
  • Some packages are MPI BW dominated
  • Memory BW is a critical factor, but expensive
    memory subsystems don't perform much better than
    commodity ones
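
The distinction between latency-dominated and bandwidth-dominated packages can be illustrated with the common latency/bandwidth cost model, in which a message of n bytes takes roughly latency + n / bandwidth. This is a minimal sketch, not a measurement. The model is an assumption brought in for illustration, and the latency and bandwidth values are hypothetical, not Purple or MCR figures.

```python
def transfer_time(nbytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Estimated time for one message under the simple latency + size/bandwidth model."""
    return latency_s + nbytes / bandwidth_bytes_per_s


def crossover_bytes(latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Message size at which the latency term equals the bandwidth term."""
    return latency_s * bandwidth_bytes_per_s


def dominated_by(nbytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> str:
    """Classify a message size as latency- or bandwidth-dominated."""
    if nbytes < crossover_bytes(latency_s, bandwidth_bytes_per_s):
        return "latency"
    return "bandwidth"


# Hypothetical interconnect: 5 microseconds latency, 1 GB/s bandwidth.
LATENCY_S = 5e-6
BANDWIDTH = 1e9

for size in (8, 1_000, 1_000_000):
    print(size, "bytes:", dominated_by(size, LATENCY_S, BANDWIDTH))
```

In this model, a package that exchanges many small messages (for example, 8-byte reductions) is limited by latency, while one that exchanges large buffers is limited by bandwidth.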

6
Q3: What prior experience guided you to this
choice?
  • Mission and Applications
  • Budgets
  • Politics
  • Delivered performance
  • Balanced risk and cost performance

7
Strategic Approach: straddle multiple curves to
balance risk and opportunity of new disruptive
technologies
  • Three complementary curves
  • Delivers to today's stockpile's demanding needs
  • Production environment
  • For "must have" deliverables now
  • Delivers transition for next generation
  • Near production environment
  • Provides cycles for science
  • Provides cycles for stockpile
  • Leading to next generation production systems
  • These are the capacity systems in a strategic
    capacity/capability mix
  • Delivers affordable path to petaFLOP/s
  • Research environment, leading transition to
    petaflop systems?
  • Are there other paths to a breakthrough regime by
    2006-7?

Chart: performance versus time (today through FY05). Any given technology curve is ultimately limited by Moore's Law.
Technology curves: mainframes (RIP); vendor integrated SMP clusters (IBM SP, HP SC); IA32/IA64/AMD Linux; cell-based (IBM BG/L).
Price points: $10M/TF (White); $7M/TF (Q); $2M/TF (Purple C); $1.2M/TF (MCR); $500K/TF; $170K/TF.
Straddle strategy for stability and preeminence.
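
The labeled price points in the chart can be compared directly. The sketch below assumes the figures are US dollars per teraFLOP/s and uses only the four values that the chart ties to a named machine. The dictionary `COST_PER_TF` and the helper `relative_to` are hypothetical names introduced for illustration.

```python
# Cost per TF for the machines named on the chart (assumed to be US dollars).
COST_PER_TF = {
    "White": 10.0e6,
    "Q": 7.0e6,
    "Purple C": 2.0e6,
    "MCR": 1.2e6,
}


def relative_to(baseline: str, costs: dict = COST_PER_TF) -> dict:
    """How many times cheaper per TF each machine is than the baseline machine."""
    base = costs[baseline]
    return {name: base / cost for name, cost in costs.items()}


for name, factor in relative_to("White").items():
    print(f"{name}: {factor:.2f}x cheaper per TF than White")
```

Read this way, Purple C comes out about 5x cheaper per TF than White, and MCR about 8x cheaper.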
8
Q4: Other than your own machine, for your needs
what are the best and worst machines? And why?
  • Clusters of SMPs with a full node OS make system
    administration and programming much easier, but
    scalability is an issue
  • Vectors suck
  • 10x potential speed-up from vectorization on Cray
    Y-MP class machines yielded only 1.5-2x in
    delivered performance boost to stockpile codes
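
One way to read the gap between a 10x potential and a 1.5-2x delivered speed-up is through Amdahl's law. That framing is an assumption added here for illustration, not something stated on the slide. The sketch solves for the fraction of run time that would have to vectorize to produce each delivered figure. Both function names are hypothetical.

```python
def amdahl_speedup(vector_fraction: float, vector_speedup: float) -> float:
    """Overall speed-up when only part of the run time is accelerated (Amdahl's law)."""
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / vector_speedup)


def implied_fraction(delivered_speedup: float, vector_speedup: float) -> float:
    """Fraction of run time that must be vectorized to reach the delivered speed-up."""
    return (1.0 - 1.0 / delivered_speedup) / (1.0 - 1.0 / vector_speedup)


# 10x potential speed-up on vectorized code, 1.5x to 2x delivered overall.
for delivered in (1.5, 2.0):
    fraction = implied_fraction(delivered, 10.0)
    print(f"{delivered}x delivered -> about {fraction:.0%} of run time vectorized")
```

Under that assumption, the delivered range corresponds to roughly 37% to 56% of the run time benefiting from vectorization.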