Parallel computer architecture overview - PowerPoint PPT Presentation

Provided by: Surf6

1
Parallel computer architecture overview
2
  • Parallel computer definition: a collection of
    processing elements that cooperate to solve
    large problems fast.
  • Some broad issues that distinguish parallel
    computers:
  • Resources
    • How large a collection?
    • How powerful are the elements?
    • How much memory?
  • Data access, communication and synchronization
    • How do the elements cooperate and communicate?
    • How are data transmitted between processors?
    • What are the abstractions and primitives for
      cooperation?
  • Performance and scalability
    • How does it all translate into performance?
    • How does it scale?

3
Trend in parallel computer architecture
development
  • History: diverse and innovative organizational
    structures, often tied to novel programming
    models
  • The architecture is often built around one or two
    good ideas in software or hardware.
  • Rapidly matured under strong technological
    constraints
  • The microprocessor is ubiquitous
  • Laptops and supercomputers are fundamentally
    similar!
  • Technological trends cause diverse approaches to
    converge
  • Technological trends make parallel computing
    inevitable
  • Mainstream computing
  • Need to understand fundamental principles and
    design tradeoffs, not just taxonomies

4
Technology trend
  • Figure from Patterson's parallel architectures
    book (1999)
  • The performance of micro-processors is catching
    up with that of supercomputers.

5
  • In terms of performance improvement, nothing
    beats micro-processors.
  • To maintain the improvement, more and more
    supercomputer features are built into
    micro-processors.
  • Use commodity micro-processors to build
    everything (if you can't beat them, join them).
  • Mainframes and minicomputers have largely
    disappeared in today's world, replaced by server
    farms (clusters of servers).
  • Virtualization on clusters.
  • Many supercomputers are clusters of
    servers/workstations (see www.top500.org).

6
Parallel architectures
  • Shared memory architectures
  • Distributed memory architectures
  • Hybrid

7
Shared memory architectures
  • All processors access all memory as a global
    address space.
  • Changes made by one processor are visible to
    other processors.
  • Two types, based on the differences in memory
    access speed:
    • Uniform memory access (UMA)
    • Non-uniform memory access (NUMA)
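
The visibility property described on this slide can be illustrated with a minimal Python sketch. It assumes threads inside one process as a stand-in for processors on a shared-memory machine; the names `shared` and `worker` are hypothetical and exist only for this illustration.

```python
import threading

# One list stands in for the global address space that every processor can see.
shared = [0] * 4

def worker(idx: int) -> None:
    # Each thread writes a different slot of the same shared list.
    shared[idx] = idx * idx

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# After join(), the main thread observes every worker's write.
print(shared)  # prints [0, 1, 4, 9]
```

No worker sends its result anywhere: writing to the shared structure is enough for the others to see it, which is the programming convenience that shared memory offers.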

8
UMA Shared memory architecture (mostly bus-based
MPs)
  • A microprocessor on a chip makes it natural to
    connect many processors to shared memory.
  • Dominates the server and enterprise market, moving
    down to the desktop.
  • Faster processors began to saturate the bus, then
    bus technology advanced.
  • Today there is a range of sizes for bus-based
    systems, from desktops to large servers (symmetric
    multiprocessor (SMP) machines).

9
Bus bandwidth in Intel systems
10
NUMA Shared memory architecture
  • Identical processors, but each processor takes a
    different time to access different parts of the
    memory.
  • Often made by physically linking SMP machines
    (Origin 2000, up to 512 processors).
  • The next-generation SMP interconnects (Intel
    Common System Interface (CSI) and AMD
    HyperTransport) have this flavor, but the
    processors are close to each other.
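
A rough way to see why non-uniform access matters is to compute an average access time from the share of accesses that stay local. The sketch below is only an illustrative model: the function name `average_access_ns` and the default latencies of 100 ns (local) and 300 ns (remote) are assumptions made for this example, not measurements of any machine named on the slide.

```python
def average_access_ns(local_fraction: float,
                      local_ns: float = 100.0,
                      remote_ns: float = 300.0) -> float:
    """Average memory access time when a fraction of accesses hit local memory.

    The default latencies are illustrative assumptions, not measurements.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    # Weighted average of the local and remote access times.
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns

for fraction in (1.0, 0.9, 0.5):
    print(f"{fraction:.0%} local -> {average_access_ns(fraction):.0f} ns")
```

Under these assumed numbers, the average cost rises as more accesses go to remote memory, which is why data placement matters on NUMA machines.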

11
Cache coherence issue in shared memory
architecture
  • Cache coherence
  • There are multiple versions of data (memory copy,
    and cache copies).
  • How to maintain a consistent system view?
  • Need some mechanism to make the memory system
    appear coherent.
  • Cache coherence protocols.
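
The problem of multiple copies can be sketched with a toy model. The code below is a deliberately simplified write-through, write-invalidate scheme written for illustration; the `Memory` and `Cache` classes and the `demo` function are hypothetical and do not describe any real protocol implementation.

```python
class Memory:
    """Main memory plus a list of the caches attached to it."""
    def __init__(self) -> None:
        self.data = {}
        self.caches = []

class Cache:
    """A private cache; `coherent` switches invalidation on or off."""
    def __init__(self, memory: Memory, coherent: bool) -> None:
        self.memory = memory
        self.coherent = coherent
        self.lines = {}
        memory.caches.append(self)

    def read(self, addr: int) -> int:
        if addr not in self.lines:
            # Miss: fetch the current value from main memory.
            self.lines[addr] = self.memory.data.get(addr, 0)
        return self.lines[addr]

    def write(self, addr: int, value: int) -> None:
        self.lines[addr] = value
        self.memory.data[addr] = value  # write-through to memory
        if self.coherent:
            # Invalidate every other cached copy of this address.
            for other in self.memory.caches:
                if other is not self:
                    other.lines.pop(addr, None)

def demo(coherent: bool) -> int:
    mem = Memory()
    a, b = Cache(mem, coherent), Cache(mem, coherent)
    b.read(0x10)        # b caches the initial value 0
    a.write(0x10, 42)   # a updates the same address
    return b.read(0x10)

print(demo(coherent=False))  # prints 0 (b still sees its stale copy)
print(demo(coherent=True))   # prints 42 (b's copy was invalidated)
```

Without invalidation the second processor keeps reading its stale copy; with it, the next read misses and fetches the new value, so the memory system appears coherent.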

12
Shared memory architecture advantages and
disadvantages
  • Advantages
    • Globally shared memory provides a user-friendly
      programming perspective to programmers.
  • Disadvantages
    • Lack of scalability
    • No hope for UMA
    • What about NUMA?
    • A lot of small traffic through the interconnect
    • Adding processors changes the traffic requirement
      of the interconnect.
    • Writing correct shared memory parallel programs
      is not straightforward.
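
One reason correctness is hard is that concurrent updates to shared data must be synchronized explicitly. The following is a minimal sketch, assuming Python threads as the parallel workers; the `Counter` class and `add_many` function are hypothetical names used only here.

```python
import threading

class Counter:
    """A shared counter whose updates are protected by a lock."""
    def __init__(self) -> None:
        self.value = 0
        self.lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write must not interleave with another thread's.
        with self.lock:
            self.value += 1

def add_many(counter: Counter, n: int) -> None:
    for _ in range(n):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=add_many, args=(counter, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # prints 40000
```

Removing the lock makes the increment a potential data race: the program may still appear to work in many runs, which is exactly what makes such bugs hard to find.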

13
Distributed memory architectures
  • Processors have their own local memory. Memory
    addresses in one processor do not map to another
    processor.
  • No concept of a global address space.
  • No concept of cache coherency.
  • To access data in another processor, use explicit
    communication.
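
Explicit communication can be sketched as a send paired with a receive. The example below only emulates two nodes: it uses Python threads and a `queue.Queue` as a stand-in for the network, so the "private" memories are private by convention rather than by hardware. The names `sender`, `receiver`, and `channel` are hypothetical; on a real distributed memory machine a message-passing library such as MPI would typically play this role.

```python
import queue
import threading

def sender(local_data: list, channel: queue.Queue) -> None:
    # This node can only touch its own local_data; it ships a partial sum.
    channel.put(sum(local_data))            # explicit send

def receiver(local_data: list, channel: queue.Queue, result: list) -> None:
    # The remote partial sum arrives only through an explicit receive.
    remote_partial = channel.get()          # explicit receive (blocks)
    result.append(sum(local_data) + remote_partial)

channel = queue.Queue()
result = []
node0 = threading.Thread(target=receiver, args=([1, 2, 3], channel, result))
node1 = threading.Thread(target=sender, args=([4, 5, 6], channel))
node0.start()
node1.start()
node0.join()
node1.join()

print(result[0])  # prints 21
```

Neither node reads the other's list directly; the only way data crosses from one to the other is the message placed on the channel.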

14
Distributed memory architectures
  • The networks can be very different for
    distributed memory architectures.
  • Massively parallel processors (MPP) usually use
    a specially designed network (and node).
    • IBM Blue Gene, IBM SP series
  • Clusters usually use commodity system/local area
    networks: InfiniBand, Quadrics, Myrinet, 10 Gbps
    Ethernet.
    • Lemieux at PSC uses Quadrics.
    • Ranger (No. 2 top supercomputer) at TACC uses
      InfiniBand.
    • UC-TG at Argonne uses Myrinet.
    • The raw speed of these networks matches that of
      the specially designed networks.
    • They may not provide some customized support, such
      as a reduction network.
  • Grid computers use the Internet as the network.

15
Distributed memory architectures
  • MPPs, clusters and grid computers target
    different types of applications.
  • MPPs and clusters support tightly coupled
    applications (a large amount of interaction among
    processes).
    • Communicate every 1 microsecond.
  • Grid computers can only support coarse-grain
    parallel applications or embarrassingly parallel
    applications.
    • Communicate every second.
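
The link between communication interval and network type can be made concrete with a simple model: the fraction of time spent computing when each compute interval is followed by one communication step. This is a sketch under stated assumptions; the function `efficiency` and the 50 ms Internet-like cost are hypothetical values chosen for illustration, not figures from the slides.

```python
def efficiency(compute_interval_s: float, comm_cost_s: float) -> float:
    """Fraction of wall time spent computing, for one communication per interval."""
    if compute_interval_s <= 0 or comm_cost_s < 0:
        raise ValueError("interval must be positive and cost non-negative")
    return compute_interval_s / (compute_interval_s + comm_cost_s)

# Assumed cost of one message over a wide-area network (illustrative only).
INTERNET_COST_S = 0.05

coarse = efficiency(1.0, INTERNET_COST_S)    # communicate every second
fine = efficiency(1e-6, INTERNET_COST_S)     # communicate every microsecond

print(f"coarse-grain over the Internet: {coarse:.3f}")
print(f"fine-grain over the Internet:   {fine:.6f}")
```

Under this assumed cost, communicating once per second keeps almost all of the time for computation, while communicating every microsecond leaves almost none, which is consistent with grids being limited to coarse-grain or embarrassingly parallel work.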

16
Advantages and disadvantages
  • Advantages
    • Memory is scalable with the number of processors:
      increase the number of processors and the size of
      memory increases proportionately.
    • Each processor can rapidly access its own memory
      without interference and without the overhead
      incurred in trying to maintain cache coherency.
    • Cost effectiveness: can use commodity,
      off-the-shelf processors and networking.
  • Disadvantages
    • The programmer is responsible for the details
      associated with data communication.
    • It may be difficult to map existing data
      structures, based on global memory, to this
      memory organization.
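
The mapping difficulty can be illustrated with the bookkeeping needed for even a simple array. The sketch below assumes a block distribution, one common choice among several, in which a global array of `n` elements is split into contiguous chunks over `p` processors; the function name `owner_and_offset` is hypothetical.

```python
def owner_and_offset(global_index: int, n: int, p: int) -> tuple:
    """Map a global index in [0, n) to (owning processor, local index).

    Block distribution: the first n % p processors hold one extra element.
    """
    if p <= 0 or not 0 <= global_index < n:
        raise ValueError("index out of range or invalid processor count")
    base, extra = divmod(n, p)
    # Elements held in total by the `extra` processors with base + 1 items.
    boundary = extra * (base + 1)
    if global_index < boundary:
        return divmod(global_index, base + 1)
    rest = global_index - boundary
    return extra + rest // base, rest % base

# 10 elements over 3 processors: sizes 4, 3, 3.
print(owner_and_offset(3, 10, 3))  # prints (0, 3)
print(owner_and_offset(4, 10, 3))  # prints (1, 0)
print(owner_and_offset(9, 10, 3))  # prints (2, 2)
```

With a global address space the program would simply index the array; here every access first has to determine which processor owns the element and where it sits in that processor's local memory.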

17
Convergence of the distributed and shared memory
architectures
  • The contemporary distributed and shared memory
    architectures are converging.
  • Nodal architectures have always been similar.
  • Both require high-bandwidth, low-latency
    interconnects.
  • The hardware for these two types of machines
    is becoming very similar.

18
Hybrid distributed memory systems
  • SMP-CMP clusters are the current
    price/performance sweet spot.
  • The architecture will dominate for the
    foreseeable future.
  • Two-level hierarchy
  • How to best use this type of architecture is
    still under heavy investigation.
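
The two-level hierarchy can be sketched as a sum computed in two stages: cores inside a node combine their values through shared memory, then one value per node crosses the network. This is a sequential simulation written for illustration only; the function `two_level_sum` and the nested-list layout are assumptions, not a description of any particular system.

```python
def two_level_sum(nodes: list) -> tuple:
    """Sum values laid out as nodes -> cores -> values.

    Returns (total, number of inter-node messages).
    """
    messages = []
    for node in nodes:
        # Level 1: cores in one node combine through shared memory.
        node_partial = sum(sum(core_values) for core_values in node)
        # Level 2: a single message per node crosses the network.
        messages.append(node_partial)
    return sum(messages), len(messages)

# Two nodes, each with two cores holding two values.
layout = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
print(two_level_sum(layout))  # prints (36, 2)
```

Four cores contribute, but only two messages cross the network, which hints at why using both levels well is an attractive and still open design question.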