Title: Convergence of Parallel Architectures
1. Convergence of Parallel Architectures
- CS 258, Spring 99
- David E. Culler
- Computer Science Division
- U.C. Berkeley
2. Recap of Lecture 1
- Parallel Comp. Architecture driven by familiar technological and economic forces
  - application/platform cycle, but focused on the most demanding applications
  - hardware/software learning curve
- More attractive than ever because the best building block - the microprocessor - is also the fastest BB.
- History of microprocessor architecture is parallelism
  - translates area and density into performance
- The Future is higher levels of parallelism
  - Parallel Architecture concepts apply at many levels
  - Communication also on exponential curve
- ⇒ Quantitative Engineering approach
(Figure: speedup)
3. History
- Parallel architectures tied closely to programming models
- Divergent architectures, with no predictable pattern of growth
- Mid-80s renaissance
(Figure: divergent architectures - application software and system software layered on architecture, branching into systolic arrays, SIMD, message passing, dataflow, and shared memory)
4. Plan for Today
- Look at major programming models
  - where did they come from?
  - the 80s architectural renaissance!
  - what do they provide?
  - how have they converged?
- Extract general structure and fundamental issues
- Reexamine traditional camps from new perspective (next week)
(Figure: a generic architecture as the convergence point of systolic arrays, SIMD, message passing, dataflow, and shared memory)
5. Administrivia
- Mix of HW, Exam, Project load
- HW 1 due date moved out to Fri 1/29
- added 1.18
- Hands-on session with parallel machines in week 3
6. Programming Model
- Conceptualization of the machine that the programmer uses in coding applications
  - How parts cooperate and coordinate their activities
  - Specifies communication and synchronization operations
- Multiprogramming
  - no communication or synch. at program level
- Shared address space
  - like bulletin board
- Message passing
  - like letters or phone calls, explicit point to point
- Data parallel
  - more regimented, global actions on data
  - Implemented with shared address space or message passing (sketch below)
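To make "global actions on data" concrete, here is a minimal data-parallel sketch. OpenMP is used only as familiar notation for the model, not as the interface of any system in this lecture; underneath, such an operation may be implemented with shared memory (as here) or with message passing.

/* Illustrative data-parallel sketch: one global action applied to every
 * element of an array.  Compile with -fopenmp (the pragma is simply
 * ignored by compilers without OpenMP support). */
#include <stdio.h>

#define N 8

int main(void)
{
    double a[N], b[N], c[N];

    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    /* the "global action on data": conceptually, all elements at once */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    for (int i = 0; i < N; i++)
        printf("%g ", c[i]);
    printf("\n");
    return 0;
}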
7. Shared Memory ⇒ Shared Addr. Space
- Bottom-up engineering factors
- Programming concepts
- Why it's attractive
8. Adding Processing Capacity
- Memory capacity increased by adding modules
- I/O by controllers and devices
- Add processors for processing!
  - For higher-throughput multiprogramming, or parallel programs
9. Historical Development
- Mainframe approach
  - Motivated by multiprogramming
  - Extends crossbar used for Mem and I/O
  - Processor cost-limited ⇒ crossbar
  - Bandwidth scales with p
  - High incremental cost
    - use multistage instead
- Minicomputer approach
  - Almost all microprocessor systems have bus
  - Motivated by multiprogramming, TP
  - Used heavily for parallel computing
  - Called symmetric multiprocessor (SMP)
  - Latency larger than for uniprocessor
  - Bus is bandwidth bottleneck
    - caching is key: coherence problem
  - Low incremental cost
10. Shared Physical Memory
- Any processor can directly reference any memory location
- Any I/O controller - any memory
- Operating system can run on any processor, or all
  - OS uses shared memory to coordinate
- Communication occurs implicitly as a result of loads and stores
- What about application processes?
11. Shared Virtual Address Space
- Process: address space plus thread of control
- Virtual-to-physical mapping can be established so that processes share portions of their address space
  - user-kernel or multiple processes
- Multiple threads of control on one address space
  - Popular approach to structuring OSs
  - Now standard application capability (e.g., POSIX threads)
- Writes to shared addresses visible to other threads (see the sketch below)
  - Natural extension of the uniprocessor model
  - conventional memory operations for communication
  - special atomic operations for synchronization
    - also load/stores
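A minimal POSIX-threads sketch of this model, using the standard pthreads interface the slide cites; the shared counter and the worker structure are my own illustration, not from the lecture. Ordinary stores to the shared variable communicate between threads, and the mutex is the special synchronization operation.

/* Minimal shared-address-space sketch with POSIX threads.
 * Build with: cc -pthread example.c */
#include <pthread.h>
#include <stdio.h>

static long shared_sum = 0;                       /* visible to every thread          */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    long my_value = (long)arg;
    pthread_mutex_lock(&lock);                    /* special synchronization op       */
    shared_sum += my_value;                       /* ordinary load/store communicates */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, (void *)(i + 1));
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    printf("shared_sum = %ld\n", shared_sum);     /* 1 + 2 + 3 + 4 = 10 */
    return 0;
}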
12. Structured Shared Address Space
- Ad hoc parallelism used in system code
- Most parallel applications have structured SAS
  - Same program on each processor
  - shared variable X means the same thing to each thread
13. Engineering: Intel Pentium Pro Quad
- All coherence and multiprocessing glue in processor module
- Highly integrated, targeted at high volume
- Low latency and bandwidth
14. Engineering: SUN Enterprise
- Proc + mem card - I/O card
  - 16 cards of either type
- All memory accessed over bus, so symmetric
- Higher bandwidth, higher latency bus
15. Scaling Up
(Figure: two scalable organizations - "dance hall", with all memory modules on the far side of the network from the processors, and distributed memory, with a memory module local to each processor)
- Problem is interconnect: cost (crossbar) or bandwidth (bus)
- Dance hall: bandwidth still scalable, but lower cost than crossbar
  - latencies to memory uniform, but uniformly large
- Distributed memory or non-uniform memory access (NUMA)
  - Construct shared address space out of simple message transactions across a general-purpose network (e.g. read-request, read-response); see the sketch after this list
- Caching shared (particularly nonlocal) data?
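A rough illustration of the read-request/read-response idea. This is a hypothetical sketch, not the protocol of any particular machine; node_mem, shared_load, and read_response are invented names, and the network round trip is modeled by a plain function call.

/* A "global address" is interpreted as (home node, local offset); remote
 * accesses become a request/response pair to the home node. */
#include <stdint.h>
#include <stdio.h>

#define NODES           4
#define WORDS_PER_NODE  1024

static uint64_t node_mem[NODES][WORDS_PER_NODE];  /* each node's local memory */

/* Models the read-request message and the home node's read-response. */
static uint64_t read_response(int home_node, uint64_t offset)
{
    return node_mem[home_node][offset];
}

static uint64_t shared_load(int my_node, uint64_t global_addr)
{
    int      home   = (int)(global_addr / WORDS_PER_NODE);
    uint64_t offset = global_addr % WORDS_PER_NODE;

    if (home == my_node)
        return node_mem[my_node][offset];   /* local: ordinary memory access         */
    return read_response(home, offset);     /* remote: request/response message pair */
}

int main(void)
{
    node_mem[2][7] = 42;                    /* the word lives in node 2's memory */
    printf("%llu\n", (unsigned long long)shared_load(0, 2 * WORDS_PER_NODE + 7));
    return 0;
}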
16. Engineering: Cray T3E
- Scale up to 1024 processors, 480 MB/s links
- Memory controller generates request message for non-local references
- No hardware mechanism for coherence
  - SGI Origin etc. provide this
17. (Figure: the generic architecture again, as the convergence point of systolic arrays, SIMD, message passing, dataflow, and shared memory)
18. Message Passing Architectures
- Complete computer as building block, including I/O
  - Communication via explicit I/O operations
- Programming model
  - direct access only to private address space (local memory)
  - communication via explicit messages (send/receive)
- High-level block diagram
  - Communication integration?
    - Mem, I/O, LAN, Cluster
- Easier to build and scale than SAS
- Programming model more removed from basic hardware operations
  - Library or OS intervention
19. Message-Passing Abstraction
- Send specifies buffer to be transmitted and receiving process
- Recv specifies sending process and application storage to receive into
- Memory-to-memory copy, but need to name processes (see the sketch below)
- Optional tag on send and matching rule on receive
- User process names local data and entities in process/tag space too
- In simplest form, the send/recv match achieves pairwise synch event
  - other variants too
- Many overheads: copying, buffer management, protection
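A concrete rendering of this abstraction in MPI, used here only as familiar notation for the send/recv model described above, not as the interface of the machines in this lecture: the sender names the destination process and a tag, the receiver names the source, a matching tag, and the local buffer that receives the copy.

/* Build with mpicc; run with at least two processes, e.g. mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 123;
        /* send(buffer, count, datatype, destination process, tag, communicator) */
        MPI_Send(&value, 1, MPI_INT, 1, 7, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* recv names the sending process and the matching tag */
        MPI_Recv(&value, 1, MPI_INT, 0, 7, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("process 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}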
20. Evolution of Message-Passing Machines
- Early machines: FIFO on each link
  - HW close to prog. model
  - synchronous ops
  - topology central (hypercube algorithms)
(Figure: CalTech Cosmic Cube; Seitz, CACM, Jan. 1985)
21. Diminishing Role of Topology
- Shift to general links
  - DMA, enabling non-blocking ops
    - Buffered by system at destination until recv
  - Store-and-forward routing
- Diminishing role of topology
  - Any-to-any pipelined routing
  - node-network interface dominates communication time
  - Simplifies programming
  - Allows richer design space
    - grids vs hypercubes
- Intel iPSC/1 -> iPSC/2 -> iPSC/860
- Store-and-forward vs. pipelined: H × (T0 + n/B) vs. T0 + H·Δ + n/B (H hops, overhead T0, message size n, bandwidth B, per-hop delay Δ); a worked comparison follows below
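A small numerical comparison of the two expressions. The parameter values below are made up purely for illustration; they are not measurements of any machine in this lecture.

/* Store-and-forward vs. pipelined (cut-through) routing latency. */
#include <stdio.h>

int main(void)
{
    double T0    = 10e-6;     /* startup overhead, seconds (assumed) */
    double B     = 100e6;     /* link bandwidth, bytes/s (assumed)   */
    double delta = 0.5e-6;    /* per-hop delay, seconds (assumed)    */
    double n     = 1000.0;    /* message size, bytes                 */
    int    H     = 10;        /* number of hops                      */

    double store_forward = H * (T0 + n / B);       /* H x (T0 + n/B)     */
    double pipelined     = T0 + H * delta + n / B; /* T0 + H*delta + n/B */

    printf("store-and-forward: %.1f us\n", store_forward * 1e6);  /* 200.0 us */
    printf("pipelined:         %.1f us\n", pipelined * 1e6);      /*  25.0 us */
    return 0;
}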
22. Example: Intel Paragon
23. Building on the mainstream: IBM SP-2
- Made out of essentially complete RS6000 workstations
- Network interface integrated in I/O bus (bw limited by I/O bus)
24. Berkeley NOW
- 100 Sun Ultra2 workstations
- Intelligent network interface
  - proc + mem
- Myrinet network
  - 160 MB/s per link
  - 300 ns per hop
25. Toward Architectural Convergence
- Evolution and role of software have blurred boundary
  - Send/recv supported on SAS machines via buffers (sketch below)
  - Can construct global address space on MP (GA -> P, LA)
  - Page-based (or finer-grained) shared virtual memory
- Hardware organization converging too
  - Tighter NI integration even for MP (low-latency, high-bandwidth)
  - Hardware SAS passes messages
- Even clusters of workstations/SMPs are parallel systems
  - Emergence of fast system area networks (SAN)
- Programming models distinct, but organizations converging
  - Nodes connected by general network and communication assists
  - Implementations also converging, at least in high-end machines
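To ground the first bullet, a minimal sketch of send/recv layered on a shared address space: a single-slot shared buffer plus an atomic flag provides the memory-to-memory copy and the pairwise synchronization. The channel layout, names, and busy-waiting are illustrative assumptions, not the buffering scheme of any machine described above.

/* Message passing built on shared memory.  Build with: cc -pthread chan.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SLOT_BYTES 64

typedef struct {
    atomic_int full;                  /* 0 = empty, 1 = message present */
    char       data[SLOT_BYTES];
} channel_t;

static channel_t chan;                /* lives in the shared address space */

/* "send": copy into the shared buffer, then publish it by setting the flag. */
static void chan_send(channel_t *c, const void *buf, size_t len)
{
    while (atomic_load(&c->full)) ;   /* wait for the slot to drain       */
    memcpy(c->data, buf, len);        /* memory-to-memory copy            */
    atomic_store(&c->full, 1);        /* make the message visible         */
}

/* "recv": wait for the flag, copy out, then clear the flag. */
static void chan_recv(channel_t *c, void *buf, size_t len)
{
    while (!atomic_load(&c->full)) ;  /* wait for a message               */
    memcpy(buf, c->data, len);
    atomic_store(&c->full, 0);        /* free the slot for the next send  */
}

static void *receiver(void *arg)
{
    char msg[SLOT_BYTES];
    (void)arg;
    chan_recv(&chan, msg, sizeof msg);
    printf("received: %s\n", msg);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, receiver, NULL);
    chan_send(&chan, "hello via shared memory", sizeof "hello via shared memory");
    pthread_join(t, NULL);
    return 0;
}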