Building Beowulfs for High Performance Computing

1
Building Beowulfs for High Performance Computing
  • Duncan Grove
  • Department of Computer Science
  • University of Adelaide

2
Three Computational Paradigms
  • Data Parallel
  • Regular grid based problems
  • Parallelising compilers, eg HPF
  • Eg physicists running lattice gauge calculations
  • Message Passing
  • Unstructured parallel problems.
  • MPI, PVM (a minimal MPI sketch follows this list)
  • Eg chemists running molecular dynamics
    simulations.
  • Task Farming
  • High throughput computing - batch jobs
  • Queuing systems
  • Eg chemists running Gaussian.
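A minimal sketch of the message-passing style in C with MPI (one of the libraries listed above). The value sent and the message tag are arbitrary illustrative choices, not code from any application mentioned here.

    /* Minimal message-passing sketch with MPI. Purely illustrative:
     * the data value and tag are arbitrary assumptions. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id      */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total processes in job */

        if (rank == 0) {
            /* "master" distributes one value to every other process */
            double work = 3.14;
            int dest;
            for (dest = 1; dest < size; dest++)
                MPI_Send(&work, 1, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
        } else {
            /* "workers" receive the value from process 0 */
            double work;
            MPI_Status status;
            MPI_Recv(&work, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
            printf("rank %d of %d received %g\n", rank, size, work);
        }

        MPI_Finalize();
        return 0;
    }

Compiled with an MPI wrapper compiler (eg mpicc) and launched with mpirun, rank 0 sends one value to every other rank.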

3
Anatomy of a Beowulf
  • Cluster of networked PCs
  • Intel Pentium II or Compaq Alpha
  • Switched 100Mbit/s Ethernet or Myrinet
  • Linux
  • Parallel and batch software support

4
Why build Beowulfs?
  • Science/$
  • Some problems take lots of processing
  • Many supercomputers are used as batch processing
    engines
  • Traditional supercomputers are wasteful for high throughput computing
  • Beowulfs
  • useful computational cycles at the lowest
    possible price.
  • Suited to high throughput computing
  • Effective at an increasingly large set of
    parallel problems

5
A Brief Cluster History
  • Caltech Prehistory
  • Berkeley NOW
  • NASA Beowulf
  • Stone SouperComputer
  • USQ Topcat
  • UIUC NT Supercluster
  • LANL Avalon
  • SNL Cplant
  • AU Perseus?

6
Beowulf Wishlist
  • Single System Image (SSI)
  • Unified process space
  • Distributed shared memory
  • Distributed file system
  • Performance easily extensible
  • Just add more bits
  • Is fault tolerant
  • Is simple to administer and use

7
Current Sophistication?
  • Shrinkwrapped solutions or do-it-yourself
  • Not much more than a nicely installed network of
    PCs
  • A few kernel hacks to improve performance
  • No magical software for making the cluster
    transparent to the user
  • Queuing software and parallel programming
    software can create the appearance of a more
    unified machine

8
Stone SouperComputer
9
Iofor
  • Learning platform
  • Program development
  • Simple benchmarking
  • Simple performance evaluation of real applications
  • Teaching machine
  • Money lever

10
iMacwulf
  • Student lab by day, Beowulf by night?
  • MacOS with Appleseed
  • LinuxPPC 4.0, soon LinuxPPC 5.0
  • MacOS/X

11
Gigaflop harlotry
  Machine                      Cost (US$)          Processors   Peak Speed
  Cray T3E                     10s of millions     1084         1300 Gflop/s
  SGI Origin 2000              10s of millions     128          128 Gflop/s
  IBM SP2                      10s of millions     512          400 Gflop/s
  Sun HPC                      1s of millions      64           50 Gflop/s
  TMC CM5                      5 million (1992)    128          20 Gflop/s
  SGI PowerChallenge           1 million (1995)    20           20 Gflop/s
  Beowulf cluster (Myrinet)    1 million           256          120 Gflop/s
  Beowulf cluster              300K                256          120 Gflop/s
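The point of the table is price/performance. A rough sketch in C that turns the rows with concrete price tags into dollars per Gflop/s; the costs are the rounded estimates from the table, so the output is order-of-magnitude only.

    /* Rough price/performance comparison using the approximate figures
     * from the table above (costs are order-of-magnitude estimates). */
    #include <stdio.h>

    struct machine {
        const char *name;
        double cost_musd;     /* cost in millions of US dollars (approx.) */
        double peak_gflops;   /* peak speed in Gflop/s                    */
    };

    int main(void)
    {
        struct machine m[] = {
            { "TMC CM5 (1992)",                  5.0,  20.0 },
            { "SGI PowerChallenge (1995)",       1.0,  20.0 },
            { "Beowulf cluster (Myrinet)",       1.0, 120.0 },
            { "Beowulf cluster (Fast Ethernet)", 0.3, 120.0 },
        };
        int i, n = sizeof m / sizeof m[0];

        for (i = 0; i < n; i++)
            printf("%-32s %10.0f $/Gflop/s\n",
                   m[i].name, m[i].cost_musd * 1.0e6 / m[i].peak_gflops);
        return 0;
    }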

12
The obvious, but important
  • In the past
  • Commodity processors way behind supercomputer
    processors
  • Commodity networks way, way, way behind
    supercomputer networks
  • In the now
  • Commodity processors only just behind
    supercomputer processors
  • Commodity networks still way, way behind
    supercomputer networks
  • More exotic networks still way behind
    supercomputer networks
  • In the future
  • Commodity processors will be supercomputer
    processors
  • Will the commodity networks catch up?

13
Hardware possibilities
14
OS possibilities
15
Network technologies and topologies
  • So many choices! -> interfaces, cables, switches, hubs, routers
  • ATM, ethernet, fast ethernet, gigabit ethernet,
    firewire, HiPPI, serial HiPPI, Myrinet, SCI
  • latency, bandwidth, availability, price! VIA?
  • Issues: price, performance, price/performance (network), price/performance (entire system). (A simple latency/bandwidth model is sketched below.)
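A common first-order way to compare such networks is the model t = latency + bytes/bandwidth. The sketch below applies it to two hypothetical interconnects with ballpark Fast-Ethernet-like and Myrinet-like figures; the latency and bandwidth values are assumptions, not measurements from any system in this talk.

    /* First-order message-time model: t = latency + bytes / bandwidth.
     * The latency and bandwidth numbers below are illustrative
     * assumptions, not measurements of the networks named on the slide. */
    #include <stdio.h>

    static double transfer_time(double latency_s, double bandwidth_MBps,
                                double bytes)
    {
        return latency_s + bytes / (bandwidth_MBps * 1.0e6);
    }

    int main(void)
    {
        double sizes[] = { 64, 1024, 65536, 1048576 };  /* message bytes */
        int i;

        for (i = 0; i < 4; i++) {
            double b = sizes[i];
            printf("%8.0f B  fast-ethernet-like: %.6f s  myrinet-like: %.6f s\n",
                   b,
                   transfer_time(100e-6, 12.5, b),   /* ~100us, ~12.5 MB/s */
                   transfer_time(10e-6, 100.0, b));  /* ~10us,  ~100 MB/s  */
        }
        return 0;
    }

Small messages are dominated by latency, large messages by bandwidth, which is why both numbers (and their price) matter when choosing an interconnect.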

16
Disk subsystems?
  • 1) I/O is a problem in parallel systems
  • 2) A data server is itself an interesting idea
  • Beowulf Bulk Data Server
  • cf. slow, expensive tape silos...
  • Eg our chemistry Beowulf will have 0.7TB
  • Could easily put 50GB of cheap disk per node
  • > 1TB of on-line storage with 20 nodes... (a back-of-the-envelope sketch follows this list)
  • RAID. Software or hardware?
  • Distributed/parallel file systems? NOW
  • Home directories not on compute nodes are a performance hit
  • Cached NFS? Coda (open source AFS replacement)
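A back-of-the-envelope check, as promised above, of the aggregate-storage claim. The raw figure uses the slide's 20 nodes at 50GB each; the RAID-5 line assumes a 5-disk parity group purely to show the usable-capacity overhead, not as part of any actual design.

    /* Aggregate on-line storage: nodes x per-node disk, raw and with an
     * assumed single-parity (RAID-5-style) overhead. */
    #include <stdio.h>

    int main(void)
    {
        int    nodes            = 20;    /* from the slide           */
        double disk_per_node_gb = 50.0;  /* from the slide           */
        int    disks_per_group  = 5;     /* assumed RAID-5 group size */

        double raw_tb   = nodes * disk_per_node_gb / 1000.0;
        double raid5_tb = raw_tb * (disks_per_group - 1) / disks_per_group;

        printf("raw on-line storage:        %.2f TB\n", raw_tb);
        printf("usable with %d-disk RAID-5: %.2f TB\n",
               disks_per_group, raid5_tb);
        return 0;
    }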

17
Advantages of Open Source
  • Linux is immature, eg lacking a caching file system, but has good HPC tools
  • Recent announcements
  • SGI has released XFS as open source
  • Sun has released its HPC solutions as open source
  • Linux can make use of all of these! Tried and true HPC code comes to free, open source Linux on cheap machines

18
Perseus
  • Machine for chemistry simulations
  • Mainly high throughput computing
  • In excess of $300K
  • 128 nodes, for < $2K per node
  • Dual processor PII450
  • At least 256MB RAM
  • Some nodes up to 1GB
  • 6GB local disk each
  • 5x24 (2x4) port Intel 100Mbit/s switches

19
Perseus Initial Phase
  • Prototype
  • 16 dual processor PII
  • 100Mbit/s switched Ethernet
  • For sale!

20
Software on perseus
  • Software to support the three computational
    paradigms
  • Data Parallel
  • Portland Group HPF
  • Message Passing
  • MPICH, LAM/MPI, PVM
  • High throughput computing
  • Condor, GNU Queue
  • Gaussian94, Gaussian98

21
Expected performance
  • Loki, 1996
  • 16 Pentium Pro processors, 10Mbit/s Ethernet
  • 3.2 Gflop/s peak, achieved 1.2 real Gflop/s on
    Linpack benchmark
  • Perseus, 1999
  • 256 PentiumII processors, 100Mbit/s Ethernet
  • 115 Gflop/s peak
  • 40 Gflop/s on Linpack benchmark? (see the back-of-the-envelope sketch after this list)
  • Compare with top 500!
  • Would get us to about 200 currently
  • Other Australian machines?
  • NEC SX/4 @ BOM at 102
  • Sun HPC at 181, 182, 255
  • Fujitsu VPP @ ANU at 400
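The peak figures above follow from processors x clock x floating-point operations per cycle. A back-of-the-envelope sketch assuming one flop per cycle for both the Pentium Pro and the Pentium II; Loki's 200MHz clock is inferred from its quoted 3.2 Gflop/s peak.

    /* Peak and achieved-fraction figures for the systems on this slide,
     * assuming 1 floating-point op per clock per processor. */
    #include <stdio.h>

    static double peak_gflops(int procs, double clock_mhz,
                              double flops_per_cycle)
    {
        return procs * clock_mhz * flops_per_cycle / 1000.0;
    }

    int main(void)
    {
        double loki_peak    = peak_gflops(16, 200.0, 1.0);   /* Loki, 1996    */
        double perseus_peak = peak_gflops(256, 450.0, 1.0);  /* Perseus, 1999 */

        printf("Loki peak:    %6.1f Gflop/s (1.2 on Linpack is ~%2.0f%%)\n",
               loki_peak, 100.0 * 1.2 / loki_peak);
        printf("Perseus peak: %6.1f Gflop/s (40 on Linpack would be ~%2.0f%%)\n",
               perseus_peak, 100.0 * 40.0 / perseus_peak);
        return 0;
    }

On these numbers the 40 Gflop/s Linpack target corresponds to roughly the same fraction of peak (about a third) that Loki achieved.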

22
Preliminary performance results
23
Reliability in large systems
  • Build it right! Racks and bolts and cable ties.
  • Is heat going to be a problem?
  • Daemon to monitor cluster (a minimal sketch follows this list)
  • Normal stuff
  • cpu, network, memory, disk utilisation and
    performance
  • switch performance (SNMP)
  • More exotic stuff
  • case and cpu fan speeds
  • motherboard and cpu temperatures
  • More advanced tools? Web interfaces?...
  • Kickstart installations
  • Monitoring software
  • Packaging
  • Node job control
  • Parallel interactive shell?!
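A minimal sketch of the "normal stuff" half of such a monitoring daemon on Linux, polling /proc for load average and free memory. The 60-second interval and plain-text output are illustrative assumptions, not a description of any tool named on this slide.

    /* Minimal node-monitoring loop: polls Linux /proc for the load
     * averages and free memory, then sleeps. Illustrative only. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char line[256];
        for (;;) {
            double l1 = 0, l5 = 0, l15 = 0;
            long mem_free_kb = 0;
            FILE *f;

            if ((f = fopen("/proc/loadavg", "r")) != NULL) {
                fscanf(f, "%lf %lf %lf", &l1, &l5, &l15);
                fclose(f);
            }
            if ((f = fopen("/proc/meminfo", "r")) != NULL) {
                while (fgets(line, sizeof line, f))
                    if (strncmp(line, "MemFree:", 8) == 0)
                        sscanf(line + 8, "%ld", &mem_free_kb);
                fclose(f);
            }
            printf("load %.2f %.2f %.2f  memfree %ld kB\n",
                   l1, l5, l15, mem_free_kb);
            fflush(stdout);
            sleep(60);           /* poll once a minute */
        }
        return 0;
    }

A real daemon would also push these figures to a central collector, alongside the SNMP switch statistics and fan/temperature readings listed above.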

24
Beodoh!
  • Load balancing
  • Effects of machines' capabilities
  • Desktop machines vs. dedicated machines
  • Resource allocation
  • Scalability - switch fabric limited?
  • Task migration, I/O, fault tol., security!
  • Break-in on iofor!
  • Upgrading is still problematic, eg the latest upgrade; probably only do this every couple of years
  • Maintenance requirements, heterogeneity problems,
    ownership hurdles

25
Beowhere-now?
  • This is mainly an integration problem. We hope to
    be able to make contributions to...
  • System packaging
  • Distributed shared memory
  • System documentation
  • System monitoring and control tools (web)
  • Fault tolerance
  • Load-balancing
  • Performance models
  • Traffic monitoring
  • Versioning!!!!
  • Write a comprehensive, detailed Beowulf HOWTO; everyone else's is bad
  • Build perseus2
  • Real benchmarks on actual applications, a production machine, etc.
  • Which ones are integration, which ones research?

26
Summary Slide
  • Beowulf computing for chemists
  • The current system is mainly for high throughput computing: slow networks, queue-managed batch jobs
  • They can do parallel in a box with SMP
  • The future? Fast networks for their highly parallel problems

27
Top 500
  • The top two machines, ASCI Red and ASCI Blue, are custom built: 2.1TF and 1.6TF respectively
  • #3 (a T3E) 891 Gflop/s, #10 510 Gflop/s, #50 150 Gflop/s, #100 62 Gflop/s, #200 40 Gflop/s, #500 25 Gflop/s (up from 19 Gflop/s)
  • Vendor breakdown (systems in Top 500, systems in Top 10, highest rank):

    Vendor      In Top 500   In Top 10   Highest rank
    SGI            182           7            2
    IBM            118           1            8
    Sun             95           0           54
    H/P             39           0          150
    Fujitsu         23           0           26
    NEC             18           0           29
    Hitachi         12           1            4
    Compaq           5           0           49
    Intel            4           1            1
    Self-made        3           0          129
    Others           1           0            -

  • Self-made clusters: Cplant #129 at 54 Gflop/s, Avalon #160 at 48 Gflop/s, Parnass2 #362 at 29 Gflop/s