1
Parallel computer architecture overview
2
  • Definition of parallel computers: a collection
    of processing elements that cooperate to solve
    large problems fast.
  • Some broad issues that distinguish parallel
    computers
  • Resource Allocation
  • how large a collection?
  • how powerful are the elements?
  • how much memory?
  • Data access, Communication and Synchronization
  • how do the elements cooperate and communicate?
  • how are data transmitted between processors?
  • what are the abstractions and primitives for
    cooperation?
  • Performance and Scalability
  • how does it all translate into performance?
  • how does it scale?

3
Studying the fundamental principles and design
trade-offs
  • History: diverse and innovative organizational
    structures, often tied to novel programming
    models
  • Rapidly matured under strong technological
    constraints
  • The microprocessor is ubiquitous
  • Laptops and supercomputers are fundamentally
    similar!
  • Technological trends cause diverse approaches to
    converge
  • Technological trends make parallel computing
    inevitable
  • In the mainstream
  • Need to understand fundamental principles and
    design tradeoffs, not just taxonomies

4
Technology trend
  • Figure from Patterson's parallel architectures
    book (1999)
  • The performance of micro-processors is catching
    up with that of supercomputers.

5
  • In terms of performance improvement, nothing
    beats micro-processors.
  • To maintain the improvement, more and more
    supercomputer features are built into
    micro-processors.
  • Use commodity micro-processors to build
    everything (if you can't beat them, join them).
  • Mainframes and minicomputers have pretty much
    disappeared in today's world, replaced by server
    farms (clusters of servers).
  • Virtualization on clusters.
  • Many supercomputers are clusters of
    servers/workstations (see www.top500.org).

6
Micro-processor architecture trend in parallelism
  • Up to 1985: bit-level parallelism: 4-bit -> 8-bit
    -> 16-bit -> 32-bit
  • slows after 32 bits
  • adoption of 64-bit is well under way; 128-bit is
    far off (not a performance issue)
  • great inflection point when a 32-bit micro and a
    cache fit on a chip
  • Basic pipelining, hardware support for complex
    operations like FP multiply, etc.
  • Intel 4004 to 386

7
Micro-processor architecture trend in parallelism
  • Mid-80s to mid-90s: instruction-level parallelism
  • Pipelining and simple instruction sets, plus
    compiler advances (RISC)
  • Larger on-chip caches
  • But the miss rate only halves when the cache size
    quadruples
  • More functional units -> superscalar execution
  • But limited performance scaling
  • Intel 486 to Pentium III/IV

8
Micro-processor architecture trend in parallelism
  • After the mid-90s
  • Greater sophistication: out-of-order execution,
    speculation, prediction
  • to deal with control transfer and latency
    problems
  • Very wide-issue processors
  • Don't help many applications very much
  • Need multiple threads (SMT) to exploit them
  • Increased complexity and size lead to slowdowns
  • Long global wires
  • Increased access times to data
  • Time to market

9
Potential of ILP
  • Depending on the application, even with infinite
    resources (memory bandwidth, perfect branch
    prediction, register renaming, etc.), the speedup
    is limited to between 1.3 and 17 (results of
    different studies).
  • Next step (happening now): thread-level
    parallelism in micro-processors.
  • Multithreading, multicore

10
Parallel architectures
  • Thread-level parallelism has traditionally been
    supported by parallel architectures.
  • Shared memory
  • Distributed memory
  • Hybrid

11
Shared memory architectures
  • All processors access all memory as a global
    address space
  • Changes made by one processor are visible to
    other processors (sketched below)
  • Two types, based on differences in memory access
    speed
  • Uniform memory access (UMA)
  • Non-uniform memory access (NUMA)
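A minimal sketch of what the single address space
means for a program, assuming OpenMP in C (the
slides do not name a specific threading API): every
thread sees the same variable data, so a write by
one thread is visible to the others after
synchronization. Compile with an OpenMP-capable
compiler, e.g. gcc -fopenmp.

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      double data = 0.0;  /* one variable in the single, global address space */

      #pragma omp parallel num_threads(2)
      {
          if (omp_get_thread_num() == 0)
              data = 3.14;        /* thread 0 writes shared memory */

          #pragma omp barrier     /* synchronize; the write is now visible */

          if (omp_get_thread_num() == 1)
              printf("thread 1 sees data = %f\n", data);  /* thread 1 reads it */
      }
      return 0;
  }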

12
UMA Shared memory architecture (mostly bus-based
MPs)
  • A micro-processor on a chip makes it natural to
    connect many of them to shared memory
  • dominates the server and enterprise market,
    moving down to the desktop
  • Faster processors began to saturate the bus; then
    bus technology advanced
  • today there is a range of sizes for bus-based
    systems, from desktops to large servers
    (Symmetric Multiprocessor (SMP) machines).

13
Bus bandwidth in Intel systems
14
NUMA Shared memory architecture
  • Identical processors, but a processor's access
    time differs for different parts of the memory.
  • Often made by physically linking SMP machines
    (Origin 2000, up to 512 processors).
  • The next-generation SMP interconnects (Intel
    Common System Interface (CSI) and AMD
    HyperTransport) have this flavor, but the
    processors are close to each other.

15
Shared memory architecture advantages and
disadvantages
  • Advantages
  • Globally shared memory provides a user-friendly
    programming perspective.
  • Disadvantages
  • Lack of scalability (adding processors increases
    the traffic requirements on the interconnect).
  • Not easy to build big ones.
  • Writing correct shared memory parallel programs
    is not straightforward (see the race sketch
    below).
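To illustrate that last point, a sketch (again
assuming OpenMP in C, not an API from the slides):
the two loops below look alike, but the first has a
data race on sum and can print a different, wrong
total on every run, while the reduction clause
makes the second one correct.

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      const long long N = 1000000;
      long long sum = 0;

      /* Racy version: all threads update the shared sum without
       * synchronization, so updates can be lost.                 */
      #pragma omp parallel for
      for (long long i = 1; i <= N; i++)
          sum += i;                             /* data race */
      printf("racy sum    = %lld\n", sum);

      /* Correct version: each thread accumulates a private partial
       * sum, and the partial sums are combined safely at the end.  */
      sum = 0;
      #pragma omp parallel for reduction(+:sum)
      for (long long i = 1; i <= N; i++)
          sum += i;
      printf("correct sum = %lld (expected %lld)\n", sum, N * (N + 1) / 2);
      return 0;
  }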

16
Distributed memory architectures
  • Processors have their own local memory. Memory
    addresses in one processor do not map to another
    processor.
  • No concept of a global address space.
  • No concept of cache coherency.
  • To access data in another processor, use explicit
    communication (sketched below).
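A minimal sketch of such explicit communication,
assuming MPI in C (MPI is not named on this slide):
the variable value exists separately in each
process's memory, so rank 1 can obtain rank 0's
copy only through an explicit send/receive pair.
Run with two processes, e.g. mpirun -np 2 ./a.out.

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank;
      double value = 0.0;   /* each process has its own private copy */

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 0) {
          value = 3.14;     /* exists only in rank 0's local memory */
          MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
      } else if (rank == 1) {
          MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                   MPI_STATUS_IGNORE);
          printf("rank 1 received value = %f\n", value);
      }

      MPI_Finalize();
      return 0;
  }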

17
Distributed memory architectures
  • The networks can be very different for
    distributed memory architectures:
  • Massively parallel processors (MPPs) usually use
    a specially designed network.
  • IBM Bluegene, IBM SP series
  • Clusters of workstations usually use system/local
    area networks
  • Lemieux at PSC uses Quadrics
  • Lonestar at TACC uses InfiniBand
  • UC-TG at Argonne uses Myrinet
  • Sax at CSIT and my Cetus use Gigabit Ethernet
  • Grid computers use the Internet as the network.

18
Advantages and disadvantages
  • Advantages
  • Memory is scalable with the number of processors:
    increase the number of processors and the size of
    memory increases proportionately.
  • Each processor can rapidly access its own memory
    without interference and without the overhead
    incurred in trying to maintain cache coherency.
  • Cost effectiveness: can use commodity,
    off-the-shelf processors and networking
  • Disadvantages
  • The programmer is responsible for the details
    associated with data communication.
  • It may be difficult to map existing data
    structures, based on global memory, to this
    memory organization (see the sketch below).
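As one illustration of that mapping burden, the
helper below (a hypothetical global_to_local, not
from the slides) shows the index arithmetic a
programmer must write to find which process owns
element gidx of a block-distributed "global" array
of n elements spread over p processes.

  #include <stdio.h>

  /* Block distribution of n elements over p processes: the first
   * (n % p) ranks hold one extra element each.                    */
  static void global_to_local(long gidx, long n, int p,
                              int *owner, long *lidx)
  {
      long base = n / p, rem = n % p;
      long cut  = rem * (base + 1);   /* elements held by the larger ranks */
      if (gidx < cut) {
          *owner = (int)(gidx / (base + 1));
          *lidx  = gidx % (base + 1);
      } else {
          *owner = (int)(rem + (gidx - cut) / base);
          *lidx  = (gidx - cut) % base;
      }
  }

  int main(void)
  {
      int owner;
      long lidx;
      global_to_local(1234567, 10000000, 16, &owner, &lidx);
      printf("global index 1234567 -> rank %d, local index %ld\n",
             owner, lidx);
      return 0;
  }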

19
Hybrid distributed memory systems
  • Current trends indicate that this type of
    architecture will prevail and grow at the high
    end of computing for the foreseeable future.
  • Advantages/disadvantages: those common to both
    shared and distributed memory (a hybrid
    programming sketch follows)
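A minimal sketch of the hybrid model, assuming MPI
between nodes and OpenMP within each node (a common
pairing, though the slide does not prescribe
specific APIs): each MPI process runs a team of
threads that share that node's memory. Typically
launched with one MPI process per node and
OMP_NUM_THREADS set to the number of cores per node.

  #include <stdio.h>
  #include <mpi.h>
  #include <omp.h>

  int main(int argc, char **argv)
  {
      int provided, rank;

      /* Request thread support so OpenMP threads can coexist with MPI. */
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* Distributed memory across ranks, shared memory within each rank. */
      #pragma omp parallel
      printf("MPI rank %d, OpenMP thread %d of %d\n",
             rank, omp_get_thread_num(), omp_get_num_threads());

      MPI_Finalize();
      return 0;
  }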