Title: Cluster Computing: A COTS Solution for Providing HPC Capabilities
1. Cluster Computing: A COTS Solution for Providing HPC Capabilities
- Mark Baker
- University of Portsmouth
- http://www.dcs.port.ac.uk/mab/Talks/
2. Overview of Talk
- Aims and Objectives
- Beowulf background
- Problems and Issues
- The TFCC
- The Future
3. Aims and Objectives
- Provide a broad introduction and overview of Cluster Computing efforts; leave details to later speakers.
- Discuss the state-of-the-art.
- Introduce the TFCC and its efforts, including the CC White Paper.
- Detail some emerging and future technologies.
4. Beowulf-class Systems
- Bringing high-end computing to a broader problem domain - new markets
- Order of magnitude price/performance advantage
- Commodity enabled - no long development lead times
- Low vulnerability to vendor-specific decisions - companies are ephemeral, Beowulfs are forever!!!
- Rapid response to technology tracking
- User-driven configuration - potential for application-specific ones
- Industry-wide, non-proprietary software environment
5. Beowulf Foundations
- Concept is simple
- Ideas have precedents and have been applied in other domains.
- Execution is straightforward and almost trivial in implementation.
- All facets of underlying principle and practice are incremental
- Driven by a convergence of market trends and technologies achieving critical mass - ignition!
- Impact is revolutionising high-end computing
6. Beowulf-class Systems
- Cluster of PCs
- Intel x86
- DEC Alpha
- Mac Power PC
- Pure Mass-Market COTS
- Unix-like O/S with source
- Linux, BSD, Solaris
- Message passing programming model (a minimal MPI sketch follows at the end of this list)
- PVM, MPI, BSP, homebrews
- Single user environments
- Large science and engineering applications
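To make the message-passing programming model above concrete, here is a minimal sketch in C using standard MPI calls. It is illustrative only; it assumes an MPI implementation (e.g. MPICH or LAM) is installed on the cluster, with mpicc and mpirun as the assumed build and launch commands.

/* hello_mpi.c - minimal illustration of the message-passing model.
 * Build (assumed wrapper):  mpicc hello_mpi.c -o hello_mpi
 * Run across nodes:         mpirun -np 4 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime             */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id (0..size-1)     */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes started */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                        /* shut the MPI runtime down         */
    return 0;
}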
7. Decline of Heavy Metal
- No market for high-end computers
- minimal growth in last five years
- Extinction
- KSR, TMC, Intel, Meiko, Cray, Maspar, BBN, Convex
- Must use COTS
- fabrication costs skyrocketing
- development lead times too short
- US Federal Agencies Fleeing
- NSF, DARPA, DOE, NIST
- Currently no new good IDEAS
8. Enabling Drivers
- Drastic reduction in vendor support for HPC
- Component technologies for PCs match those for workstations (in terms of capability)
- PC-hosted software environments similar in sophistication and robustness to mainframe OSs
- Low-cost network hardware and software enable balanced PC clusters
- MPPs establish a low level of expectation
- Cross-platform parallel programming model (MPI/PVM/HPF)
9. The Marketplace - www.top500.org
10. Taxonomy
11. Taxonomy
- Tightly Coupled: MPP
- Loosely Coupled: Cluster
12. The Paradox
- Time to optimise a parallel application is generally thought to take an order of magnitude (x10) more time than for an equivalent sequential application.
- Time required to develop a parallel application for solving grand challenge problems is equal to half the life of a parallel supercomputer.
13. Beowulf Project - A Brief History
- Started in late 1993
- NASA Goddard Space Flight Center
- NASA JPL, Caltech, academic and industrial collaborators
- Sponsored by NASA HPCC Program
- Applications: single-user science station
- data intensive
- low cost
14. Beowulf Project - A Brief History
- General focus
- single-user (dedicated) science and engineering applications
- out-of-core computation
- system scalability
- Ethernet drivers for Linux
15. The Beowulf System at JPL (Hyglac)
- 16 Pentium Pro PCs, each with 2.5 Gbyte disk, 128 Mbyte memory, Fast Ethernet card.
- Connected using a 100Base-T network, through a 16-way crossbar switch.
- Theoretical peak performance: 3.2 GFlop/s.
- Achieved sustained performance: 1.26 GFlop/s (see the worked figures below).
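As a quick sanity check of the figures above (hedged: the slide does not state the clock rate; 200 MHz Pentium Pro nodes completing roughly one floating-point operation per cycle are assumed here):

\[
  \text{Peak} \approx 16 \times 200\,\mathrm{MHz} \times 1\ \tfrac{\text{flop}}{\text{cycle}} = 3.2\ \mathrm{GFlop/s},
  \qquad
  \frac{1.26\ \mathrm{GFlop/s}}{3.2\ \mathrm{GFlop/s}} \approx 39\%\ \text{sustained efficiency}.
\]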
16. Beowulf media profile
17. Beowulf Accomplishments
- Many Beowulf-class systems installed
- Experience gained in the implementation and application
- Many applications (some large) routinely executed on Beowulfs
- Basic software fairly sophisticated and robust
- Supports dominant programming/execution paradigm
- Single most rapidly growing area in HPC
- Ever larger systems in development (Cplant at SNL)
- Now recognised as mainstream
18. Overall Hardware Issues
- All necessary components available in the mass market (M2COTS)
- Powerful computational nodes (SMPs)
- Network bandwidth impacts high-volume, communication-intensive applications
- Network latency impacts random-access applications (with short messages)
- Many applications work well with 1 bps per 1 flops
- x10 improvements in both bandwidth and latency (see the simple message-cost model below)
- Price/performance advantage of x10 in many cases
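One simple way to see why both latency and bandwidth matter, and when each dominates, is the usual linear cost model for a single message. The numbers below are illustrative assumptions for Fast-Ethernet-class hardware, not measurements from the slide.

\[
  T(n) = \alpha + \frac{n}{\beta}
  \qquad (\alpha = \text{per-message latency},\ \beta = \text{bandwidth}).
\]

With assumed values \(\alpha \approx 100\,\mu\mathrm{s}\) and \(\beta \approx 10\ \mathrm{MB/s}\), a 1 KB message costs roughly \(100\,\mu\mathrm{s} + 100\,\mu\mathrm{s}\), so latency and bandwidth contribute about equally; a 1 MB message costs about 100 ms and is dominated by the \(n/\beta\) term.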
19. Software: The Beowulf Grendel Suite
- Targets effective management of clusters
- Embraces NIH (Nothing In-House)
- Surrogate customer for Beowulf community
- Borrow software products from research projects
- Capabilities required
- communication layers
- numerical libraries
- program development tools and debuggers
- scheduling and runtime
- external I/O and secondary/mass storage
- general system admin
20. Technology Drivers
- Reduced recurring costs: approx. 10% of MPPs
- Rapid response to technology advances
- Just-in-place configuration and reconfigurable
- High reliability if system designed properly
- Easily maintained through low cost replacement
- Consistent portable programming model
- Unix, C, Fortran, Message passing
- Applicable to a wide range of problems and algorithms
21. Software Stumbling Blocks
- Linux cruftiness
- Heterogeneity
- Scheduling and protection in time and space
- Task migration
- Checkpointing and restarting
- Effective, scalable parallel file system
- Parallel debugging and performance optimization
- System software development frameworks and
conventions
22. Linux cruftiness
- Kernel changes too much
- bugs come, go, and reappear with each release
- SMP support
- performance is currently lacking
- unreliable, frequent crashes under moderate stress
- many SMP chipsets have cache consistency bugs
23. Coping with Heterogeneity: The Problem
- Multiple generations of processors
- Different clock speeds
- Different cache sizes
- Different software architectures
- Non-uniform node configurations
- Disk capacity versus bandwidth
- Non-uniform network configurations
- Mix of workload types across systems
- System space-partitioning
24. Coping with Heterogeneity: The Solution
- Global configurations must reflect the diverse strengths of subsystems.
- Task scheduling must deal with differences in node performance (load balancing) - see the sketch after this list.
- Finer granularity of tasks required to balance throughput variations.
- Distributed file managers must contend with non-uniform node-disk capacities and complex access handling patterns.
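As one illustrative and deliberately simple way of handling differences in node performance, the C sketch below statically partitions work items in proportion to a per-node speed weight. The node count, speed figures and names are assumptions made up for the example; a real scheduler would measure speeds and rebalance at run time.

/* weighted_partition.c - toy static load balancing across heterogeneous nodes.
 * Work items are divided in proportion to a per-node "speed" weight
 * (e.g. a small benchmark score); all figures here are illustrative.
 */
#include <stdio.h>

#define NODES 4

int main(void)
{
    double speed[NODES] = { 200.0, 200.0, 450.0, 450.0 }; /* assumed relative node speeds */
    int total_items = 1000;                               /* work items to distribute     */
    int share[NODES];
    double sum = 0.0;
    int assigned = 0, i;

    for (i = 0; i < NODES; i++)
        sum += speed[i];

    /* give each node a share proportional to its speed */
    for (i = 0; i < NODES; i++) {
        share[i] = (int)(total_items * speed[i] / sum);
        assigned += share[i];
    }
    share[NODES - 1] += total_items - assigned;  /* hand any rounding remainder to the last node */

    for (i = 0; i < NODES; i++)
        printf("node %d: %d items\n", i, share[i]);
    return 0;
}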
25. Coping with Heterogeneity: The Solution
- User optimizations have to be effective across variations in system scale and configuration.
- Need an annotated virtualisation of a "sea of resources" and runtime adaptive allocation.
- Affects algorithm, language, compiler, and runtime software elements.
26. System Software Development Framework
- Establish a shared framework for constructing independently developed, co-operating system services
- Define interfaces
- locating system resources
- accessing resource information
- modifying the state of system resources
- Conventions: common directory structure for libraries and configuration files
- API: standardised set of library calls (at instantiation)
- Protocols: interfaces between programs; for specific services there will be specific protocols
- format for returned information from node status monitors (a hypothetical example follows below)
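As an illustration of what such a convention might look like, the hypothetical C structure below fixes a versioned layout for the information returned by a node status monitor. The field names are invented for this sketch and are not taken from any actual Beowulf interface.

/* node_status.h - hypothetical record returned by a per-node status monitor.
 * A fixed, versioned layout like this is one way to give independently
 * developed, co-operating system services a common convention to code against.
 */
#include <stdint.h>

#define NODE_STATUS_VERSION 1

struct node_status {
    uint32_t version;        /* layout version, NODE_STATUS_VERSION            */
    uint32_t node_id;        /* position of the node within the cluster        */
    uint32_t uptime_s;       /* seconds since the node was booted              */
    uint32_t load_percent;   /* recent CPU utilisation, 0-100                  */
    uint64_t mem_free_kb;    /* free physical memory, in kilobytes             */
    uint64_t disk_free_kb;   /* free space on the node's local disk, kilobytes */
};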
27. Beowulf and The Future
- 2 Gflop/s peak processors
- 1000 per processor (already there!)
- 1 Gbps at < 250 per port
- New backplane performance: PCI-X, NGIO, SIO/FutureIO (InfiniBand - http://www.infinibandta.org)
- Light-weight communications: < 10 µs latency (VIA)
- Optimized math libraries
- 1 GByte main memory per node
- 24 GBytes disk storage per node
- De facto standardised middleware
28. Next Generation Beowulf
- Today: 3M per peak Tflop/s
- Before year 2002: 1M per peak Tflop/s
- Performance efficiency is a serious challenge
- System integration
- does vendor support of massive parallelism have to mean massive markup?
- System administration: boring but necessary
- Maintenance without vendors - how?
- New kinds of vendors for support
- Heterogeneity will become a major aspect
29. The TFCC: A brief introduction
- The IEEE Computer Society sponsored the formation of the Task Force on Cluster Computing (TFCC) in February 1999.
- We proposed that the TFCC would:
- Act as an international forum to promote cluster computing research and education, and participate in setting up technical standards in this area.
- Be involved with issues related to the design, analysis and implementation of cluster systems as well as the applications that use them.
30. The TFCC: A brief introduction
- Sponsor professional meetings, bring out publications, set guidelines for educational programs, and help co-ordinate academic, funding agency, and industry activities.
- Organise events and hold a number of workshops that span the range of activities sponsored by the Task Force.
- Publish a bi-annual newsletter to help keep abreast of events occurring within this field.
- See: http://www.dcs.port.ac.uk/mab/tfcc
31. Background - IEEE Task Forces
- A TF is expected to have a finite term of existence, normally a period of 2-3 years; continued existence beyond that point is generally not appropriate.
- A TF is expected to either increase its scope of activities such that establishment of a TC is warranted, or be disbanded.
32. What is a Cluster? The need for a definition!?
- A cluster is a type of parallel or distributed system that consists of a collection of interconnected whole computers used as a single, unified computing resource.
- Where "whole computer" is meant to indicate a normal, whole computer system that can be used on its own: processor(s), memory, I/O, OS, software subsystems, applications.
33. What's in a definition though!!
- That was the first approximation - thanks to Greg Pfister (In Search of Clusters, PHR)
- A cluster may be hard to define, but you know one when you see one...
- Much discussion though; see http://www.eg.bucknell.edu/hyde/tfcc/
34. Why we wanted a separate TFCC
- It brings together all the technologies used with Cluster Computing into one area - so instead of tracking four or five IEEE TCs there is one...
- Cluster Computing is NOT just Parallel, Distributed, OSs, or the Internet; it is potentially a heady mix of them all, and consequently different.
- The TFCC will be an appropriate forum for publications and activities associated with Cluster Computing.
35. Those Involved
- Vice-chairs
- David Culler, University of California, Berkeley, USA
- Andrew Chien, University of California, San Diego, USA
- Technical Areas (some of the executive committee)
- Network Technologies: Salim Hariri (Arizona, USA)
- OS Technologies: Thomas Sterling (Caltech, USA)
- Parallel I/O: Erich Schikuta (Wien, Austria)
- Programming Environments: Tony Skjellum (MPI Softech, USA)
- Java Technologies: Geoffrey Fox (NPAC, Syracuse, USA)
- Algorithms and Applications: Marcin Paprzycki (USM, USA) and David Bader (UNM, USA)
- Analysis and Profiling Tools: Dan Reed (UIUC, USA)
- High Throughput Computing: Miron Livny (Wisconsin, USA)
- Performance Evaluation: Jack Dongarra (UTK/ORNL)
36. Affiliations - Journals
- Cluster Computing, Baltzer Science Publishers, ISSN 1386-7857, www.baltzer.nl/cluster/
- Concurrency: Practice and Experience, Wiley & Sons, ISSN 1040-3108, www.infomall.org/Wiley/CPE/
37. TFCC Book Donation Programme
- High Performance Cluster Computing: Architectures and Systems, R. Buyya (ed.), Prentice Hall, 1999 (50)
- High Performance Cluster Computing: Programming and Applications, R. Buyya (ed.), Prentice Hall, 1999 (50)
- In Search of Clusters, 2nd ed., G. Pfister, Prentice Hall, 1998 (25)
- Metacomputing: Future Generation Computing Systems, W. Gentzsch (ed.), Elsevier, 1999 (55)
- Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, B. Wilkinson and C.M. Allen, Prentice Hall, 1998 (25)
- Morgan Kaufmann Publishers (??)
- Cluster Computing: The Journal of Networks, Software Tools and Applications, Baltzer Science Publishers, ISSN 1386-7857 (40)
38. TFCC Web Site
39. TFCC Web Sites
- www.dgs.monash.edu.au/rajkumar/tfcc/
- www.dcs.port.ac.uk/mab/tfcc/
- www-unix.mcs.anl.gov/buyya/tfcc/
- HPCC Book: http://www.phptr.com/ptrbooks/ptr_0130137847.html
40. TFCC - Events
- Forthcoming Conferences
- IEEE International Conference on Cluster Computing (Cluster'2000), TFCC's 2nd Annual Meeting, November 2000, Chemnitz, Germany.
- IEEE International Conference on Cluster Computing (Cluster'2001), TFCC's 3rd Annual Meeting, Oct 2001, Los Angeles, USA.
- Past Events
- IWCC99, Melbourne, Australia
- TFCC BOF at SC99, Nov 1999, Portland, USA.
- HPDC-8, Aug 1999, Redondo Beach, USA.
41. TFCC - Events
- Forthcoming Events
- International Workshop on Personal Computer based Networks Of Workstations (PC-NOW'2000), May 2000, Cancun, Mexico.
- HPCN 2000 Cluster Computing Workshop, May 2000, Amsterdam, The Netherlands.
- Asia-Pacific Symposium on Cluster Computing (APSCC'2000), May 2000, Beijing, China.
- Technical Session on Cluster Computing - Technologies, Environments, and Applications (CC-TEA'2000), June 2000, USA.
- Workshop on Cluster Computing for Internet Applications (CCIA2000), July 2000, Japan.
- EuroPar'2000 Cluster Computing Workshop, Aug 2000, Germany.
42. TFCC Mailing Lists
- Currently three email lists have been set up:
- tfcc-l@bucknell.edu - a mailing list open to anyone interested in the TFCC; see the TFCC page for info on how to subscribe.
- tfcc-exe@npac.syr.edu - a closed executive committee mailing reflector.
- tfcc-adv@npac.syr.edu - a closed advisory committee mailing reflector.
43. TFCC - Future Plans
- Publish the White Paper (RFC out)
- Associate the TFCC with other like-minded efforts.
- Further introduce CC-related courses and expand the book donation programme.
- Publicise the TFCC more widely.
- Create a portal for UK-based projects to interchange ideas and promote discussion.
44. TFCC White Paper: A brief taste
- This White Paper is meant to provide an authoritative review of all the hardware and software technologies that could be used to make up a cluster now or in the near future.
- Technologies range from the network level, through the operating system and middleware levels, up to the application and tools level.
- The White Paper also tackles the increasingly important area of High Availability, as well as Education, which is considered a crucial area for the future success of cluster computing.
45. The White Paper
- 1. Introduction
- 2. Network technologies
- 3. Operating Systems
- 4. Single System Image (SSI)
- 5. Middleware
- 6. Parallel I/O
- 7. High Availability
- 8. Numerical Libraries and Tools for Scalable Parallel Computing
- 9. Applications
- 10. Embedded/Real-Time Systems
- 11. Education
46. White Paper: Operating Systems
- Cluster OSs are similar in many ways to conventional workstation OSs.
- How one chooses an OS depends on one's view of clustering.
- Can argue that each node of a cluster must contain a full-featured OS such as Unix, with all the positives and negatives that implies.
- At the other extreme, researchers are asking the question, "Just how much can I remove from the OS and have it still be useful?" - these systems are typified by the work going on in the Computational Plant project at Sandia National Laboratories.
- Others are examining the possibility of on-the-fly adaptation of the OS layer, reconfiguring the available services through dynamic loading of code into the cluster operating system.
47. White Paper: Single System Image (SSI)
- A cluster consists of a collection of interconnected commodity stand-alone computers that can act together as a single, integrated computing resource.
- Each node of a cluster is a complete system with its own hardware and software resources.
- However, the nodes offer a view of a single system to the user through a hardware or software mechanism that enables them to exhibit a property popularly called an SSI.
- SSI is the illusion of a single system built from distributed resources.
48. White Paper: Middleware
- The purpose of middleware is to provide services to applications running in distributed heterogeneous environments.
- The choice of which middleware may best meet an organization's needs is difficult, as the technology is still being developed and it may be some time before it reaches maturity.
- The risk of choosing one solution over another can be reduced if:
- the approach is based on the concept of a high-level interface (see the toy sketch below)
- the concept of a service is associated with each interface
- the product conforms to a standard and supports its evolution
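As a toy illustration of a high-level interface with a service behind it, the hypothetical C header below exposes only the calls applications use, so different back-end implementations can be swapped without touching application code. Every name in it is invented for this sketch; it does not describe any particular middleware product.

/* name_service.h - hypothetical high-level middleware interface.
 * Applications code against these calls only; different back ends
 * (different directory or naming services) can sit behind them.
 */
#ifndef NAME_SERVICE_H
#define NAME_SERVICE_H

/* Look up the node currently providing a named service.
 * Returns 0 on success and fills node_name (up to len bytes), -1 on failure. */
int ns_lookup(const char *service, char *node_name, int len);

/* Advertise that this node provides the named service.
 * Returns 0 on success, -1 on failure. */
int ns_register(const char *service);

#endif /* NAME_SERVICE_H */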
49. White Paper: Numerical Tools and Libraries
- A good base of software is available to developers now, both publicly available packages and commercially supported packages.
- These may not in general provide the most effective software; however, they do provide a solid base from which to work.
- It will be some time before truly transparent, complete, and efficient numerical software is available for cluster computing.
- Likewise, effective program development and analysis tools for cluster computing are becoming available but are still in the early stages of development.
50. Conclusions: Future Technology Trends
- Systems On a Chip (SOC) - the new Transputers!
- GHz processors
- VLIW
- 64-bit processors - applications that can use this address space
- Gbit DRAM
- micro-disks on a board
- Optical fibre and wave-division multiplexing
51. Conclusions: Future Enablers
- Very high bandwidth backplanes
- Low-latency/high bandwidth COTS switches
- SMP on a chip
- Processor In Memory (PIM)
- Open Source software
- GRID-based technologies (meta-problems)
52. Conclusions: Software Stumbling Blocks
- Scheduling and security
- Load balancing and task migration
- Checkpointing and restarting
- Effective and scalable PFS
- Parallel debugging and performance optimization
- System software development (frameworks/conventions)
- Programming paradigms: MP instead of DSM
- SPMD - efficient irregular problems
53. The Future
- Common standards and Open Source software
- Better
- Tools, utilities and libraries
- Design with minimal risk to accepted standards
- Higher degree of portability (standards)
- Wider range and scope of HPC applications
- Wider acceptance of HPC technologies and
techniques in commerce and industry. - Emerging GRID-based environments
54. Ending
- I would like to thank
- Thomas Sterling, for the use of some of the materials used.
- Recommend you monitor TFCC activities
- www.dcs.port.ac.uk/mab/tfcc
- Join the TFCC's mailing list
- Send me a reference to your projects
- Join in the TFCC's efforts (sponsorship, organise meetings, contribute to publications)
- White Paper: constructive comments please