1
Parallel computer architecture overview
2
  • Definition of parallel computers: a collection
    of processing elements that cooperate to solve
    large problems fast.
  • Some broad issues that distinguish parallel
    computers
  • Resource Allocation
  • how large a collection?
  • how powerful are the elements?
  • how much memory?
  • Data access, Communication and Synchronization
  • how do the elements cooperate and communicate?
  • how are data transmitted between processors?
  • what are the abstractions and primitives for
    cooperation?
  • Performance and Scalability
  • how does it all translate into performance?
  • how does it scale?

3
Studying the fundamental principles and design
trade-offs
  • History: diverse and innovative organizational
    structures, often tied to novel programming
    models
  • Rapidly matured under strong technological
    constraints
  • The microprocessor is ubiquitous
  • Laptops and supercomputers are fundamentally
    similar!
  • Technological trends cause diverse approaches to
    converge
  • Technological trends make parallel computing
    inevitable
  • In the mainstream
  • Need to understand fundamental principles and
    design tradeoffs, not just taxonomies

4
Technology trend
  • Figure from Patterson's parallel architectures
    book (1999)
  • The performance of micro-processors is catching
    up with that of supercomputers.

5
  • In terms of performance improvement, nothing
    beats micro-processors.
  • To maintain the improvement, more and more
    supercomputer features are built into
    micro-processors.
  • Use commodity micro-processors to build
    everything (if you can't beat them, join them).
  • Mainframes and minicomputers have pretty much
    disappeared in today's world, replaced by server
    farms (clusters of servers).
  • Virtualization on clusters.
  • Many supercomputers are clusters of
    servers/workstations (see www.top500.org).

6
Micro-processor architecture trend in parallelism
  • Up to 1985: bit-level parallelism: 4-bit -> 8-bit
    -> 16-bit -> 32-bit
  • slows after 32 bits
  • adoption of 64-bit is well under way; 128-bit is
    far off (not a performance issue)
  • great inflection point when a 32-bit micro and a
    cache fit on a chip
  • Basic pipelining, hardware support for complex
    operations like FP multiply, etc.
  • Intel 4004 to 386

7
Micro-processor architecture trend in parallelism
  • Mid-80s to mid-90s: instruction-level parallelism
  • Pipelining and simple instruction sets, plus
    compiler advances (RISC)
  • Larger on-chip caches
  • But the miss rate only halves when the cache size
    quadruples
  • More functional units -> superscalar execution
  • But limited performance scaling
  • Intel 486 to Pentium III/IV

8
Micro-processor architecture trend in parallelism
  • After the mid-90s
  • Greater sophistication: out-of-order execution,
    speculation, prediction
  • to deal with control transfer and latency
    problems
  • Very wide-issue processors
  • Don't help many applications very much
  • Need multiple threads (SMT) to exploit them
  • Increased complexity and size lead to slowdowns
  • Long global wires
  • Increased access times to data
  • Time to market

9
Potential of ILP
  • Depending on the application, even with infinite
    resources (memory bandwidth, perfect branch
    prediction, register renaming, etc.), the speedup
    is limited to between 1.3 and 17 (results of
    different studies).
  • Next step (happening now): thread-level
    parallelism in micro-processors.
  • Multithreading, multicore

10
Parallel architectures
  • Thread-level parallelism has traditionally been
    supported by parallel architectures.
  • Shared memory
  • Distributed memory
  • Hybrid

11
Shared memory architectures
  • All processors access all memory as a global
    address space
  • Changes made by one processor are visible to
    other processors (sketched below)
  • Two types, based on differences in memory access
    speed
  • Uniform memory access (UMA)
  • Non-uniform memory access (NUMA)
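A minimal sketch of what the single address space
means for a program, assuming OpenMP in C (the
slides do not name a specific threading API): every
thread sees the same variable data, so a write by
one thread is visible to the others after
synchronization. Compile with an OpenMP-capable
compiler, e.g. gcc -fopenmp.

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      double data = 0.0;  /* one variable in the single, global address space */

      #pragma omp parallel num_threads(2)
      {
          if (omp_get_thread_num() == 0)
              data = 3.14;        /* thread 0 writes shared memory */

          #pragma omp barrier     /* synchronize; the write is now visible */

          if (omp_get_thread_num() == 1)
              printf("thread 1 sees data = %f\n", data);  /* thread 1 reads it */
      }
      return 0;
  }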

12
UMA Shared memory architecture (mostly bus-based
MPs)
  • A micro-processor on a chip makes it natural to
    connect many of them to shared memory
  • dominates the server and enterprise market,
    moving down to the desktop
  • Faster processors began to saturate the bus; then
    bus technology advanced
  • today there is a range of sizes for bus-based
    systems, from desktops to large servers
    (Symmetric Multiprocessor (SMP) machines).

13
Bus bandwidth in Intel systems
14
NUMA Shared memory architecture
  • Identical processors, but a processor's access
    time differs for different parts of the memory.
  • Often made by physically linking SMP machines
    (Origin 2000, up to 512 processors).
  • The next-generation SMP interconnects (Intel
    Common System Interface (CSI) and AMD
    HyperTransport) have this flavor, but the
    processors are close to each other.

15
Shared memory architecture advantages and
disadvantages
  • Advantages
  • Globally shared memory provides a user-friendly
    programming perspective.
  • Disadvantages
  • Lack of scalability (adding processors increases
    the traffic requirements on the interconnect).
  • Not easy to build big ones.
  • Writing correct shared memory parallel programs
    is not straightforward (see the race sketch
    below).
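To illustrate that last point, a sketch (again
assuming OpenMP in C, not an API from the slides):
the two loops below look alike, but the first has a
data race on sum and can print a different, wrong
total on every run, while the reduction clause
makes the second one correct.

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      const long long N = 1000000;
      long long sum = 0;

      /* Racy version: all threads update the shared sum without
       * synchronization, so updates can be lost.                 */
      #pragma omp parallel for
      for (long long i = 1; i <= N; i++)
          sum += i;                             /* data race */
      printf("racy sum    = %lld\n", sum);

      /* Correct version: each thread accumulates a private partial
       * sum, and the partial sums are combined safely at the end.  */
      sum = 0;
      #pragma omp parallel for reduction(+:sum)
      for (long long i = 1; i <= N; i++)
          sum += i;
      printf("correct sum = %lld (expected %lld)\n", sum, N * (N + 1) / 2);
      return 0;
  }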

16
Distributed memory architectures
  • Processors have their own local memory. Memory
    addresses in one processor do not map to another
    processor.
  • No concept of a global address space.
  • No concept of cache coherency.
  • To access data in another processor, use explicit
    communication (sketched below).
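A minimal sketch of such explicit communication,
assuming MPI in C (MPI is not named on this slide):
the variable value exists separately in each
process's memory, so rank 1 can obtain rank 0's
copy only through an explicit send/receive pair.
Run with two processes, e.g. mpirun -np 2 ./a.out.

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank;
      double value = 0.0;   /* each process has its own private copy */

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 0) {
          value = 3.14;     /* exists only in rank 0's local memory */
          MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
      } else if (rank == 1) {
          MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                   MPI_STATUS_IGNORE);
          printf("rank 1 received value = %f\n", value);
      }

      MPI_Finalize();
      return 0;
  }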

17
Distributed memory architectures
  • The networks can be very different for
    distributed memory architectures:
  • Massively parallel processors (MPPs) usually use
    a specially designed network.
  • IBM Bluegene, IBM SP series
  • Clusters of workstations usually use system/local
    area networks
  • Lemieux at PSC uses Quadrics
  • Lonestar at TACC uses InfiniBand
  • UC-TG at Argonne uses Myrinet
  • Sax at CSIT and my Cetus use Gigabit Ethernet
  • Grid computers use the Internet as the network.

18
Advantages and disadvantages
  • Advantages
  • Memory is scalable with the number of processors:
    increase the number of processors and the size of
    memory increases proportionately.
  • Each processor can rapidly access its own memory
    without interference and without the overhead
    incurred in trying to maintain cache coherency.
  • Cost effectiveness: can use commodity,
    off-the-shelf processors and networking
  • Disadvantages
  • The programmer is responsible for the details
    associated with data communication.
  • It may be difficult to map existing data
    structures, based on global memory, to this
    memory organization (see the sketch below).
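As one illustration of that mapping burden, the
helper below (a hypothetical global_to_local, not
from the slides) shows the index arithmetic a
programmer must write to find which process owns
element gidx of a block-distributed "global" array
of n elements spread over p processes.

  #include <stdio.h>

  /* Block distribution of n elements over p processes: the first
   * (n % p) ranks hold one extra element each.                    */
  static void global_to_local(long gidx, long n, int p,
                              int *owner, long *lidx)
  {
      long base = n / p, rem = n % p;
      long cut  = rem * (base + 1);   /* elements held by the larger ranks */
      if (gidx < cut) {
          *owner = (int)(gidx / (base + 1));
          *lidx  = gidx % (base + 1);
      } else {
          *owner = (int)(rem + (gidx - cut) / base);
          *lidx  = (gidx - cut) % base;
      }
  }

  int main(void)
  {
      int owner;
      long lidx;
      global_to_local(1234567, 10000000, 16, &owner, &lidx);
      printf("global index 1234567 -> rank %d, local index %ld\n",
             owner, lidx);
      return 0;
  }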

19
Hybrid distributed memory systems
  • Current trends indicate that this type of
    architecture will prevail and grow at the high
    end of computing for the foreseeable future.
  • Advantages/disadvantages: those common to both
    shared and distributed memory (a hybrid
    programming sketch follows)
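A minimal sketch of the hybrid model, assuming MPI
between nodes and OpenMP within each node (a common
pairing, though the slide does not prescribe
specific APIs): each MPI process runs a team of
threads that share that node's memory. Typically
launched with one MPI process per node and
OMP_NUM_THREADS set to the number of cores per node.

  #include <stdio.h>
  #include <mpi.h>
  #include <omp.h>

  int main(int argc, char **argv)
  {
      int provided, rank;

      /* Request thread support so OpenMP threads can coexist with MPI. */
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* Distributed memory across ranks, shared memory within each rank. */
      #pragma omp parallel
      printf("MPI rank %d, OpenMP thread %d of %d\n",
             rank, omp_get_thread_num(), omp_get_num_threads());

      MPI_Finalize();
      return 0;
  }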