Title: Parallel Programming Models
1. Parallel Programming Models
2. Parallel Programming Models
- Historically, programming models were designed for a given class of architectures
  - vector computers and vector code
  - SIMD computers and array operations
  - distributed memory computers and message passing
  - shared memory and threads
3. Parallel Programming Models
- The idea is to make it easy for the programmer to get performance on that architecture
  - Include primitives that are natural for the architecture
  - The programmer should consider how best to use the primitives in code
4. Parallel Programming Models
- But in order to execute code on a new architecture, it must be rewritten
- Some programming models were written for just one vendor's machines!
- Fortunately, industry standards are now widely available
  - Pthreads, OpenMP (shared memory)
  - MPI (message passing)
5. Parallel Programming Models
- Standards were developed for programming a class of machines
- Some have been adapted for more than one class of machines
  - MPI has been the most successful in this respect
  - OpenMP is now also available on different kinds of platforms
6. Threads
- A process has its own address space
- A process may be executed by a team of threads
- A thread shares its address space with the other threads in the same team
  - But the thread's stack provides space for data local (private) to the thread
- Threads are used for shared memory parallel programming and for multitasking
7. Threads
- One thread per processor for shared memory parallel programming (this is our focus)
- One thread per task for time slicing (possibly on a single processor)
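To make the shared/private distinction concrete, here is a minimal pthreads sketch (not from the slides; all names are illustrative): the global array shared_data is visible to every thread in the team, while id and local live on each thread's private stack.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

int shared_data[NTHREADS];      /* global: one copy, visible to all threads */

void *worker(void *arg)
{
    int id = *(int *)arg;       /* on this thread's stack: private */
    int local = id * id;        /* private scratch value */
    shared_data[id] = local;    /* write into the shared address space */
    return NULL;
}

int main(void)
{
    pthread_t threads[NTHREADS];
    int ids[NTHREADS];

    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);

    for (int i = 0; i < NTHREADS; i++)   /* main sees the threads' writes */
        printf("shared_data[%d] = %d\n", i, shared_data[i]);
    return 0;
}
```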
8. How Can We Exploit Threads?
- A thread programming model must provide (at least) the means to
  - create and destroy threads
  - distribute the computation among threads
  - coordinate the actions of threads on shared data
  - name threads
  - (usually) specify which data is shared and which is private to a thread
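A pthreads sketch of these primitives (a sketch only; the names and the cyclic work distribution are illustrative): pthread_create and pthread_join create and reclaim the threads, the integer passed to each thread names it and determines its share of the iterations, and a mutex coordinates the updates to the shared sum.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 1000

double sum = 0.0;                               /* shared */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *partial_sum(void *arg)
{
    int id = *(int *)arg;                       /* "names" the thread */
    double my_sum = 0.0;                        /* private partial result */
    for (int i = id; i < N; i += NTHREADS)      /* distribute the computation */
        my_sum += (double)i;
    pthread_mutex_lock(&lock);                  /* coordinate actions on shared data */
    sum += my_sum;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    int ids[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        pthread_create(&t[i], NULL, partial_sum, &ids[i]);  /* create */
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);               /* wait for / reclaim threads */
    printf("sum = %.0f\n", sum);                /* 0+1+...+999 = 499500 */
    return 0;
}
```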
9. Parallel Programming Models
- Low-level parallel programming
  - The programmer must describe the parallelism explicitly
  - The data and computation to be performed on each processor are specified exactly by the coder
  - For shared memory: create the threads and specify all details of their work and their interactions
10. Parallel Programming Models
- High-level parallel programming
  - The programmer describes the parallelism implicitly
  - The details of the data and computation to be performed on each processor are determined by the compiler
  - The compiler creates the threads and determines the details of their work and interactions
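For contrast, a high-level sketch of a loop computation (assuming an OpenMP compiler, e.g. gcc -fopenmp): a single directive marks the loop as parallel, and the compiler and runtime create the threads and divide the iterations among them.

```c
#include <stdio.h>

#define N 1000

int main(void)
{
    double a[N];

    /* one directive expresses the parallelism; thread creation and
     * work assignment are left to the compiler and runtime */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * i;

    printf("a[%d] = %.1f\n", N - 1, a[N - 1]);
    return 0;
}
```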
11. Parallel Programming Models
- Low-level parallel programming models
  - Pthreads for shared memory, MPI for distributed memory
- High-level parallel programming models
  - OpenMP for shared memory, HPF for distributed memory
12. How Does OpenMP Enable Us to Exploit Threads?
- OpenMP provides a thread programming model at a high level
- The user does not need to specify all the details
  - especially with respect to the assignment of work to threads
  - or the creation of threads
- The user makes the strategic decisions; the compiler figures out the details
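A small sketch of this division of labor (illustrative): the strategic decisions are which loop to parallelize and that sum requires a reduction; everything else, from creating the threads to combining the partial sums, is figured out by the compiler and runtime.

```c
#include <stdio.h>

#define N 1000

int main(void)
{
    double sum = 0.0;

    /* strategic decision: parallelize this loop and reduce into sum;
     * the compiler handles threads, work assignment, and the final combine */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += (double)i;

    printf("sum = %.0f\n", sum);   /* 499500 */
    return 0;
}
```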
13. Automatic Parallelization
- Many compilers now have an automatic parallelization option for shared memory platforms
- The idea is that the compiler detects dependences and constructs parallel threads that respect them
- This works well on very simple programs
- But it is very hard to do on real programs
- Dynamic optimization (run-time compilation) may help
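Two illustrative loops show what such a compiler looks for: the first has no cross-iteration dependence and can safely be split among threads, while the second carries a dependence from one iteration to the next and cannot be parallelized as written.

```c
/* No loop-carried dependence: iterations are independent,
 * so an auto-parallelizer can run them on separate threads. */
void independent(double *a, const double *b, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = 2.0 * b[i];
}

/* Loop-carried dependence: iteration i reads the value written
 * in iteration i-1, so the loop must stay sequential as written. */
void dependent(double *a, int n)
{
    for (int i = 1; i < n; i++)
        a[i] = a[i - 1] + 1.0;
}
```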
14. Memory Models
- The main difference in modern SMP architectures:
  - Uniform memory access (UMA): same cost of access from any processor
    - realized in physically (true) shared memory
  - Non-uniform memory access (NUMA): different cost of access from different processors
    - true for physically distributed memory, including distributed shared memory, and also for some SMPs
15. Models of Memory Management
- Symmetric shared memory: memory is shared, same cost of access from any processor/core
- Non-symmetric shared memory: memory is shared, different cost of access from different processors/cores
- Distributed shared memory: memory is shared, different cost of access from different processors
- Distributed memory: memory is distributed, different cost of access from different processors
16. Shared Memory
- This means that a variable x, a pointer p, or an array a refers to the same object, no matter what processor the reference originates from
- Each processor can access a variable in the same amount of time as any other
  - Actually, this second statement is not true on some major platforms today, even for some with just a few processors
- We will later discuss programming techniques that take access time differences into account
17. Shared Memory
- All threads access the same data space
[Diagram: processors proc1 through procN all reference the same variable a in a single shared memory space]
18. More Realistic View of Shared Memory Architecture
[Diagram: shared memory holding a variable a; each processor proc1 through procN has its own cache (cache1 through cacheN), and a copy of a may also sit in a local cache]
19. Cache in Shared Memory
- Copies of shared data are held in the local cache
  - or even in registers
- Without extra effort, there may be inconsistencies
  - Thread 1 updates variable a
  - Thread 2 needs to use it
  - If thread 1 has not written a back to main memory, thread 2 will use a stale value
- This is the memory consistency problem
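A sketch of the hazard and one cure (illustrative; it uses C11 atomics, though mutexes or OpenMP's flush achieve the same effect): if ready and a were plain variables this would be a data race, and thread 2 could read a stale value of a; the release/acquire pair forces thread 1's update to become visible before thread 2 uses it.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

int a = 0;                 /* shared data */
atomic_int ready = 0;      /* atomic flag that publishes the update */

void *thread1(void *unused)
{
    a = 42;                /* update the shared variable ... */
    atomic_store_explicit(&ready, 1, memory_order_release);  /* ... then publish it */
    return NULL;
}

void *thread2(void *unused)
{
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                  /* wait until the update is visible */
    printf("a = %d\n", a); /* guaranteed to print 42, not a stale 0 */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_create(&t2, NULL, thread2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```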
20. Distributed Memory
- It is no longer the case that a variable a, a pointer p, or an array a refers to the same location, independent of the processor a process executes on
- It can be slow for a process on one processor to access data stored in memory associated with a different processor
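An MPI sketch of this situation (assumes the program is launched with at least two processes, e.g. mpirun -np 2): each rank has its own copy of a, so rank 0's update is invisible to rank 1 until a message explicitly carries the value across the network.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, a = 0;                     /* every process has its own a */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        a = 42;                          /* only rank 0's copy changes */
        MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* rank 1's a is still 0 until the message arrives */
        MPI_Recv(&a, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 now has a = %d\n", a);
    }

    MPI_Finalize();
    return 0;
}
```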
21. Distributed Memory
[Diagram: processors proc1 through procN, each with its own local copy of a in its own memory, connected by a network]
22. Distributed Shared Memory
- A variable a, a pointer p, or an array a refers to the same location, independent of the processor a process executes on
- It can be slow for a process on one processor to access data stored in memory associated with a different processor
23. Distributed Shared Memory
[Diagram: processors proc1 through procN, each with a local cache and a local memory (mem1 through memN); together the local memories form a single shared address space]
24. Important Note
- Software Distributed Shared Memory can provide the illusion of shared memory on a distributed memory machine
- Whatever the implementation, it conceptually looks like shared memory
- There may be some very large performance differences
25. Programming vs. Hardware
- One can implement a shared memory programming model
  - on shared or distributed memory hardware
  - (in software or in hardware)
- One can implement a message passing programming model
  - on shared or distributed memory hardware
- There may be large performance differences
26. Portability of programming models
[Diagram: both shared memory programming and distributed memory programming can be mapped onto either a shared memory machine or a distributed memory machine]
27. Programming Models
- We look at several programming models
  - we stick to standards
  - high-level, implicit parallel programming
  - low-level, explicit parallel programming
- Goal: understand each model and get experience in its usage
28. Summary
- Different kinds of architectural parallelism and memory organization have led to different programming models
- We stick to standards for modern architectures
- There are several ways to find parallelism in a code
  - The programmer has to decide which way is best for a given program
- We discuss this soon, but first we get started with the API