High Performance Cluster Computing

Transcript and Presenter's Notes

1
High Performance Cluster Computing
By Rajkumar Buyya, Monash University, Melbourne
rajkumar@ieee.org
http://www.dgs.monash.edu.au/rajkumar
2
Agenda
  • Overview of Computing
  • Motivations & Enabling Technologies
  • Cluster Architecture & its Components
  • Clusters Classifications
  • Cluster Middleware
  • Single System Image
  • Representative Cluster Systems
  • Berkeley NOW and Solaris-MC
  • Resources and Conclusions

3
Announcement: formation of
  • IEEE Task Force on Cluster Computing (TFCC)
  • http://www.dgs.monash.edu.au/rajkumar/tfcc/
  • http://www.dcs.port.ac.uk/mab/tfcc/

4
Computing Power and Computer Architectures
5
Need for more Computing Power: Grand Challenge Applications
  • Solving technology problems using computer
    modeling, simulation and analysis

Life Sciences
Aerospace
Mechanical Design & Analysis (CAD/CAM)
6
How to Run Applications Faster?
  • There are 3 ways to improve performance:
  • 1. Work Harder
  • 2. Work Smarter
  • 3. Get Help
  • Computer Analogy:
  • 1. Use faster hardware, e.g. reduce the time per
    instruction (clock cycle).
  • 2. Use optimized algorithms and techniques.
  • 3. Use multiple computers to solve the problem; that is,
    increase the number of instructions executed per clock
    cycle.

7
Sequential Architecture Limitations
  • Sequential architectures are reaching physical
    limitations (speed of light, thermodynamics).
  • Hardware improvements like pipelining,
    superscalar execution, etc., are non-scalable and require
    sophisticated compiler technology.
  • Vector processing works well for certain kinds of
    problems.

8
Why Parallel Processing NOW?
  • The technology of parallel processing is mature and can
    be exploited commercially; there is significant R&D work
    on the development of tools & environments.
  • Significant developments in networking technology
    are paving the way for heterogeneous computing.

9
History of Parallel Processing
  • Parallel processing can be traced to a tablet dated around 100 BC.
  • The tablet has 3 calculating positions.
  • We can infer that the multiple positions were used for
    reliability and/or speed.

10
Motivating Factors
  • The aggregate speed with which complex calculations
    are carried out by millions of neurons in the human
    brain is amazing, even though an individual neuron's
    response is slow (milliseconds) - this demonstrates the
    feasibility of parallel processing.

11
Taxonomy of Architectures
  • Simple classification by Flynn
  • (No. of instruction and data streams)
  • SISD - conventional
  • SIMD - data parallel, vector computing
  • MISD - systolic arrays
  • MIMD - very general, multiple approaches.
  • Current focus is on MIMD model, using general
    purpose processors or multicomputers.

12
MIMD Architecture
[Diagram: MIMD architecture - Processors A, B and C each take their own
instruction stream and data input stream and produce their own data
output stream.]
  • Unlike SISD and MISD machines, an MIMD computer works
    asynchronously.
  • Shared memory (tightly coupled) MIMD
  • Distributed memory (loosely coupled) MIMD

13
Shared Memory MIMD machine
[Diagram: Processors A, B and C connected to a Global Memory System.]
  • Communication: the source PE writes data to global memory
    and the destination PE retrieves it (see the sketch below).
  • Easy to build; conventional OSs for SISD machines can
    easily be ported.
  • Limitations: reliability & expandability. A failure of any
    memory component or processor affects the whole system.
  • Increasing the number of processors leads to memory
    contention.
  • Ex.: Silicon Graphics supercomputers....
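To make the shared-memory communication pattern above concrete, here is a minimal sketch in C (not from the original slides) that uses POSIX threads to stand in for the processing elements: one thread writes a value into a shared global variable and the other retrieves it, with a mutex and condition variable providing the coordination.

    /* Shared-memory MIMD in miniature: two "PEs" (threads) communicate
       through global memory. Illustrative sketch only. */
    #include <pthread.h>
    #include <stdio.h>

    static int shared_value;                 /* the "global memory" cell */
    static int ready;                        /* set when data is written */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

    static void *source_pe(void *arg)        /* source PE writes data to GM */
    {
        pthread_mutex_lock(&lock);
        shared_value = 42;
        ready = 1;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    static void *destination_pe(void *arg)   /* destination PE retrieves it */
    {
        pthread_mutex_lock(&lock);
        while (!ready)
            pthread_cond_wait(&cond, &lock);
        printf("read %d from shared memory\n", shared_value);
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t src, dst;
        pthread_create(&src, NULL, source_pe, NULL);
        pthread_create(&dst, NULL, destination_pe, NULL);
        pthread_join(src, NULL);
        pthread_join(dst, NULL);
        return 0;
    }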

14
Distributed Memory MIMD
[Diagram: Processors A, B and C connected by IPC channels.]
  • Communication: IPC over a high-speed network.
  • The network can be configured as a tree, mesh,
    cube, etc.
  • Unlike shared-memory MIMD:
  • easily/readily expandable
  • highly reliable (any CPU failure does not affect
    the whole system)

15
Main HPC Architectures..1a
  • SISD - mainframes, workstations, PCs.
  • SIMD Shared Memory - Vector machines, Cray...
  • MIMD Shared Memory - Sequent, KSR, Tera, SGI,
    SUN.
  • SIMD Distributed Memory - DAP, TMC CM-2...
  • MIMD Distributed Memory - Cray T3D, Intel,
    Transputers, TMC CM-5, plus recent workstation
    clusters (IBM SP2, DEC, Sun, HP).

16
Main HPC Architectures..1b.
  • NOTE: Modern sequential machines are not purely
    SISD - advanced RISC processors use many
    concepts from vector and parallel architectures
    (pipelining, parallel execution of instructions,
    prefetching of data, etc.) in order to achieve one
    or more arithmetic operations per clock cycle.

17
Parallel Processing Paradox
  • The time required to develop a parallel application
    for solving a Grand Challenge Application is equal to
  • the half-life of parallel supercomputers.

18
The Need for Alternative Supercomputing Resources
  • Vast numbers of under-utilised workstations are
    available to use.
  • Huge numbers of unused processor cycles and
    resources could be put to good use in a
    wide variety of application areas.
  • Reluctance to buy supercomputers, due to their cost
    and short life span.
  • Distributed compute resources fit better into
    today's funding model.

19
Scalable Parallel Computers
20
Design Space of Competing Computer Architecture
21
Towards Inexpensive Supercomputing
  • It is
  • Cluster Computing..
  • The Commodity Supercomputing!

22
Motivation for using Clusters
  • Surveys show utilisation of CPU cycles of desktop
    workstations is typically <10%.
  • Performance of workstations and PCs is rapidly
    improving
  • As performance grows, percent utilisation will
    decrease even further!
  • Organisations are reluctant to buy large
    supercomputers, due to the large expense and
    short useful life span.

23
Motivation for using Clusters
  • The communications bandwidth between workstations
    is increasing as new networking technologies and
    protocols are implemented in LANs and WANs.
  • Workstation clusters are easier to integrate into
    existing networks than special parallel computers.

24
Motivation for using Clusters
  • The development tools for workstations are more
    mature than the contrasting proprietary solutions
    for parallel computers - mainly due to the
    non-standard nature of many parallel systems.
  • Workstation clusters are a cheap and readily
    available alternative to specialised High
    Performance Computing (HPC) platforms.
  • Use of clusters of workstations as a distributed
    compute resource is very cost effective -
    incremental growth of system!!!

25
Rise & Fall of Computing Technologies
[Chart: 1970 - Mainframes giving way to Minis; 1980 - Minis giving way to
PCs; 1995 - PCs giving way to Network Computing.]
26
What is a cluster?
  • Cluster
  • a collection of nodes connected together
  • Network: faster, closer connection than a typical
    network (LAN)
  • Looser connection than symmetric multiprocessor
    (SMP)

27
1990s Building Blocks
  • Building block = complete computers (HW & SW)
    shipped in 100,000s: killer micro, killer DRAM,
    killer disk, killer OS, killer packaging, killer
    investment
  • Interconnecting Building Blocks => Killer Net
  • High Bandwidth
  • Low latency
  • Reliable
  • Commodity (ATM, ...)

28
Why Clusters now?
  • Building block is big enough
  • Workstation performance is doubling every 18
    months.
  • Networks are faster
  • Higher link bandwidth
  • Switch based networks coming
  • Interfaces simple & fast
  • Striped files preferred (RAID)
  • Demise of Mainframes, Supercomputers, MPPs

29
Architectural Drivers (cont.)
  • Node architecture dominates performance
  • processor, cache, bus, and memory
  • design and engineering => performance
  • Greatest demand for performance is on large
    systems
  • must track the leading edge of technology without
    lag
  • MPP network technology => mainstream
  • system area networks
  • A complete system on every node is a powerful enabler
  • very high speed I/O, virtual memory, scheduling, ...

30
...Architectural Drivers
  • Clusters can be grown: incremental scalability
    (up, down, and across)
  • An individual node's performance can be improved by
    adding additional resources (new memory
    blocks/disks)
  • New nodes can be added or nodes can be removed
  • Clusters of Clusters and Metacomputing
  • Complete software tools
  • Threads, PVM, MPI, DSM, C, C++, Java, Parallel
    C++, Compilers, Debuggers, OS, etc.
  • Wide class of applications
  • Sequential and grand challenge parallel
    applications

31
Example Clusters: Berkeley NOW
  • 100 Sun UltraSparcs
  • 200 disks
  • Myrinet SAN
  • 160 MB/s
  • Fast comm.
  • AM, MPI, ...
  • Ether/ATM switched external net
  • Global OS
  • Self Config

32
Basic Components
[Diagram: basic NOW building block - a Sun Ultra 170 workstation with
memory (M), a Myricom NIC on the I/O bus, and a MyriNet link at 160 MB/s.]
33
Massive Cheap Storage Cluster
  • Basic unit
  • 2 PCs double-ending four SCSI chains of 8 disks
    each

Currently serving Fine Art at http://www.thinker.org/imagebase/
34
Cluster of SMPs (CLUMPS)
  • Four Sun E5000s
  • 8 processors
  • 4 Myricom NICs each
  • Multiprocessor, Multi-NIC, Multi-Protocol
  • NPACI => Sun 450s

35
Millennium PC Clumps
  • Inexpensive, easy-to-manage cluster
  • Replicated in many departments
  • Prototype for very large PC cluster

36
Adoption of the Approach
37
So What's So Different?
  • Commodity parts?
  • Communications Packaging?
  • Incremental Scalability?
  • Independent Failure?
  • Intelligent Network Interfaces?
  • Complete System on every node
  • virtual memory
  • scheduler
  • files
  • ...

38
OPPORTUNITIES & CHALLENGES
39
Opportunity of Large-scale Computing on NOW
40
Windows of Opportunities
  • MPP/DSM
  • Compute across multiple systems in parallel.
  • Network RAM
  • Use idle memory in other nodes: page across other
    nodes' idle memory.
  • Software RAID
  • A file system supporting parallel I/O,
    reliability, and mass storage.
  • Multi-path Communication
  • Communicate across multiple networks: Ethernet,
    ATM, Myrinet

41
Enabling Technologies
  • Efficient communication hardware and software
  • Global co-ordination of multiple workstation
    Operating Systems

42
Efficient Communication
  • The key Enabling Technology
  • Communication overhead components:
  • bandwidth
  • network latency and
  • processor overhead
  • Switched LANs allow bandwidth to scale
  • Network latency can be overlapped with
    computation
  • Processor overhead is the real problem - it
    consumes CPU cycles

43
Efficient Communication (Contd...)
  • SS10s connected by Ethernet:
  • 456 µs processor overhead
  • With ATM:
  • 626 µs processor overhead
  • Target:
  • MPP communication performance - low latency and
    scalable bandwidth
  • CM-5 user-level network overhead: 5.7 µs (see the cost
    sketch below)
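To see why per-message processor overhead dominates for small messages, here is a toy calculation in C under a simple additive cost model (the model and the latency/bandwidth figures are assumptions of this sketch; only the two overhead numbers come from the slide above): time per message ≈ overhead + latency + size/bandwidth.

    /* Toy message-cost calculator: T = overhead + latency + size/bandwidth.
       Note: 1 MB/s moves roughly 1 byte per microsecond, so the units line up. */
    #include <stdio.h>

    static double msg_time_us(double overhead_us, double latency_us,
                              double bytes, double bandwidth_mb_per_s)
    {
        return overhead_us + latency_us + bytes / bandwidth_mb_per_s;
    }

    int main(void)
    {
        double bytes = 100.0;        /* a small message              */
        double latency_us = 10.0;    /* assumed network latency      */
        double bw = 160.0;           /* assumed link bandwidth, MB/s */

        printf("456 us overhead (Ethernet case):   %.1f us per message\n",
               msg_time_us(456.0, latency_us, bytes, bw));
        printf("5.7 us overhead (user-level case): %.1f us per message\n",
               msg_time_us(5.7, latency_us, bytes, bw));
        return 0;
    }

With the same assumed latency and bandwidth, the 100-byte message costs roughly 467 µs in the first case versus about 16 µs in the second, which is the point of the slide: the fixed per-message CPU cost, not the wire, limits small-message performance.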

44
  • Cluster Computer and its Components

45
Clustering Today
  • Clustering gained momentum when 3 technologies
    converged:
  • 1. Very high-performance microprocessors
  • workstation performance = yesterday's supercomputers
  • 2. High-speed communication
  • comm. between cluster nodes > between processors
    in an SMP
  • 3. Standard tools for parallel/distributed
    computing & their growing popularity.

46
Cluster Computer Architecture
47
Cluster Components...1a: Nodes
  • Multiple High Performance Components
  • PCs
  • Workstations
  • SMPs (CLUMPS)
  • Distributed HPC Systems leading to Metacomputing
  • They can be based on different architectures and
    run different OSs

48
Cluster Components...1b: Processors
  • There are many (CISC/RISC/VLIW/Vector..)
  • Intel Pentiums, Xeon, Merced
  • Sun SPARC, ULTRASPARC
  • HP PA
  • IBM RS6000/PowerPC
  • SGI MIPS
  • Digital Alphas
  • Integrate Memory, processing and networking into
    a single chip
  • IRAM (CPU + Mem) (http://iram.cs.berkeley.edu)
  • Alpha 21364 (CPU, Memory Controller, NI)

49
Cluster Components 2: OS
  • State of the art OS
  • Linux (Beowulf)
  • Microsoft NT (Illinois HPVM)
  • SUN Solaris (Berkeley NOW)
  • IBM AIX (IBM SP2)
  • HP UX (Illinois - PANDA)
  • Mach (Microkernel based OS) (CMU)
  • Cluster Operating Systems (Solaris MC, SCO
    UnixWare, MOSIX (an academic project))
  • OS gluing layers (Berkeley Glunix)

50
Cluster Components 3: High Performance Networks
  • Ethernet (10Mbps),
  • Fast Ethernet (100Mbps),
  • Gigabit Ethernet (1Gbps)
  • SCI (Dolphin - MPI - 12 µs latency)
  • ATM
  • Myrinet (1.2Gbps)
  • Digital Memory Channel
  • FDDI

51
Cluster Components 4: Network Interfaces
  • Network Interface Card
  • Myrinet has a NIC
  • User-level access support
  • The Alpha 21364 processor integrates processing,
    memory controller, and network interface into a
    single chip.

52
Cluster Components 5: Communication Software
  • Traditional OS-supported facilities (heavyweight
    due to protocol processing)
  • Sockets (TCP/IP), Pipes, etc.
  • Lightweight protocols (user level)
  • Active Messages (Berkeley)
  • Fast Messages (Illinois)
  • U-Net (Cornell)
  • XTP (Virginia)
  • Higher-level systems can be built on top of the above
    protocols (a socket sketch follows this list)
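For contrast with the lightweight user-level protocols above, here is a minimal sketch of the traditional, heavyweight path the slide refers to: a blocking TCP socket send in C, where the message passes through kernel protocol processing. The peer address and port are placeholders, not values from the slides.

    /* Traditional OS-supported communication: send one message over TCP.
       Peer address/port below are illustrative placeholders. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);          /* TCP socket */
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_in peer;
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port   = htons(5000);                     /* placeholder port */
        inet_pton(AF_INET, "192.168.1.2", &peer.sin_addr); /* placeholder node */

        if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0) {
            perror("connect");
            return 1;
        }

        const char msg[] = "hello from node A";
        write(fd, msg, sizeof msg);    /* payload goes through kernel TCP/IP */
        close(fd);
        return 0;
    }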

53
Cluster Components 6a: Cluster Middleware
  • Resides between the OS and applications and offers an
    infrastructure for supporting:
  • Single System Image (SSI)
  • System Availability (SA)
  • SSI makes the collection appear as a single machine
    (globalised view of system resources), e.g. telnet
    cluster.myinstitute.edu
  • SA - checkpointing and process migration (see the toy
    checkpoint sketch below)
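As a toy illustration of the checkpointing idea mentioned above (an application-level sketch, not how a real middleware checkpointer works), the C loop below periodically writes its progress to a file so a restarted process, possibly on another node that shares the file system, can resume where it left off; the checkpoint file name is an assumption of the sketch.

    /* Toy application-level checkpoint/restart. Illustrative only. */
    #include <stdio.h>

    #define CKPT_FILE "progress.ckpt"       /* assumed checkpoint file name */

    int main(void)
    {
        long i = 0, n = 1000000;
        FILE *f = fopen(CKPT_FILE, "r");
        if (f) {                            /* resume from the last checkpoint */
            if (fscanf(f, "%ld", &i) != 1)
                i = 0;
            fclose(f);
        }
        for (; i < n; i++) {
            /* ... one unit of work ... */
            if (i % 100000 == 0) {          /* periodically save progress */
                f = fopen(CKPT_FILE, "w");
                if (f) {
                    fprintf(f, "%ld\n", i);
                    fclose(f);
                }
            }
        }
        printf("done\n");
        return 0;
    }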

54
Cluster Components 6b: Middleware Components
  • Hardware
  • DEC Memory Channel, DSM (Alewife, DASH), SMP
    techniques
  • OS / Gluing Layers
  • Solaris MC, Unixware, Glunix
  • Applications and Subsystems
  • System management and electronic forms
  • Runtime systems (software DSM, PFS etc.)
  • Resource management and scheduling (RMS)
  • CODINE, LSF, PBS, NQS, etc.

55
Cluster Components 7a: Programming Environments
  • Threads (PCs, SMPs, NOW..)
  • POSIX Threads
  • Java Threads
  • MPI (see the sketch after this list)
  • Linux, NT, on many Supercomputers
  • PVM
  • Software DSMs (Shmem)
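To make the message-passing environments listed above concrete, here is a minimal MPI example in C (a sketch; it assumes an MPI implementation such as MPICH is installed and is built and launched with its usual mpicc/mpirun tooling): rank 0 sends an integer to rank 1.

    /* Minimal MPI sketch: rank 0 sends a value to rank 1.
       Build with mpicc; run with something like: mpirun -np 2 ./a.out */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }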

56
Cluster Components 7b: Development Tools
  • Compilers
  • C/C++/Java
  • Parallel programming with C++ (MIT Press book)
  • RAD (rapid application development tools).. GUI
    based tools for PP modeling
  • Debuggers
  • Performance Analysis Tools
  • Visualization Tools

57
Cluster Components 8: Applications
  • Sequential
  • Parallel / Distributed (Cluster-aware app.)
  • Grand Challenge applications
  • Weather Forecasting
  • Quantum Chemistry
  • Molecular Biology Modeling
  • Engineering Analysis (CAD/CAM)
  • ...
  • PDBs, web servers, data-mining

58
Key Operational Benefits of Clustering
  • System availability (HA): clusters offer inherent high
    system availability due to the redundancy of
    hardware, operating systems, and applications.
  • Hardware fault tolerance: redundancy for most
    system components (e.g. disk RAID), covering both
    hardware and software.
  • OS and application reliability: run multiple
    copies of the OS and applications, and tolerate
    failures through this redundancy.
  • Scalability: add servers to the cluster, add more
    clusters to the network as the need arises, or add
    CPUs to an SMP node.
  • High performance: running cluster-enabled
    programs.