Title: Introduction to OpenMP
1. Introduction to OpenMP
- Eric Aubanel
- Advanced Computational Research Laboratory
- Faculty of Computer Science, UNB
- Fredericton, New Brunswick
2. Shared Memory
3. Shared Memory Multiprocessor
4. Distributed vs. DSM
[Diagram: distributed memory, with each process attached to its own memory and connected to the others by a network, vs. distributed shared memory (DSM), where the processes share one global address space]
5. Parallel Programming Alternatives
- Use a new programming language
- Use an existing sequential language modified to handle parallelism
- Use a parallelizing compiler
- Use library routines/compiler directives with an existing sequential language
- Shared memory (OpenMP) vs. distributed memory (MPI)
6. What is Shared Memory Parallelization?
- All processors can access all the memory in the parallel system (one address space).
- The time to access the memory may not be equal for all processors: memory is not necessarily flat.
- Parallelizing on an SMP does not reduce CPU time; it reduces wallclock time.
- Parallel execution is achieved by generating multiple threads which execute in parallel.
- The number of threads is (in principle) independent of the number of processors (see the sketch below).
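A minimal C sketch (not from the original slides) of this independence: it requests 8 threads with omp_set_num_threads(), whatever the processor count; the number 8 is an arbitrary choice.

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        /* Ask for 8 threads even on a machine with fewer processors;
           the runtime creates the threads and the OS schedules them. */
        omp_set_num_threads(8);
    #pragma omp parallel
        {
    #pragma omp single
            printf("team size: %d\n", omp_get_num_threads());
        }
        return 0;
    }

The OMP_NUM_THREADS environment variable achieves the same thing without recompiling.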
7. Threads: The Basis of SMP Parallelization
- Threads are not full UNIX processes. They are lightweight, independent "collections of instructions" that execute within a UNIX process.
- All threads created by the same process share the same address space.
- This is a blessing and a curse: "inter-thread" communication is efficient, but it is easy to stomp on memory and create race conditions (sketched below).
- Because they are lightweight, threads are (relatively) inexpensive to create and destroy.
- Creating a thread can take three orders of magnitude less time than creating a process!
- Threads can be created and assigned to multiple processors: this is the basis of SMP parallelism!
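A minimal C sketch (my illustration, not from the slides) of the race condition hazard: four threads increment two shared counters, one unprotected and one guarded by an atomic directive; the thread and iteration counts are arbitrary.

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int unsafe = 0, safe = 0;
    #pragma omp parallel num_threads(4)
        {
            for (int i = 0; i < 100000; i++) {
                unsafe++;            /* race: concurrent read-modify-write */
    #pragma omp atomic
                safe++;              /* atomic update avoids the race */
            }
        }
        /* "safe" always reaches 400000; "unsafe" is usually less */
        printf("unsafe = %d, safe = %d (expected %d)\n",
               unsafe, safe, 4 * 100000);
        return 0;
    }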
8. Processes vs. Threads
[Diagram: each process has its own code, stack, heap, and instruction pointer (IP); threads within one process share the code and heap, but each thread has its own stack and IP]
9. Methods of SMP Parallelism
- 1. Explicit use of threads
- Pthreads: see "Pthreads Programming" from O'Reilly & Associates, Inc.
- 2. Using a parallelizing compiler and its directives, which generate pthreads "under the covers"
- can use vendor-specific directives (e.g. !SMP$)
- can use industry-standard directives (e.g. !$OMP, i.e. OpenMP)
10. OpenMP
- In 1997, a group of hardware and software vendors announced their support for OpenMP, a new API for multi-platform shared-memory programming (SMP) on UNIX and Microsoft Windows NT platforms.
- www.openmp.org
- OpenMP provides compiler directives (structured comments in Fortran, pragmas in C/C++), embedded in the source code, for
- scoping data
- specifying work sharing
- synchronization of threads
- OpenMP provides function calls for obtaining information about threads (see the sketch below).
- e.g., omp_get_num_threads(), omp_get_thread_num()
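For instance, this minimal C sketch (not from the slides) prints each thread's id and the team size using the two routines named above:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
    #pragma omp parallel
        {
            int id = omp_get_thread_num();   /* this thread's id, 0..N-1 */
            int n  = omp_get_num_threads();  /* size of the current team */
            printf("thread %d of %d\n", id, n);
        }
        return 0;
    }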
11. OpenMP example

    subroutine saxpy(z, a, x, y, n)
    integer i, n
    real z(n), a, x(n), y(n)
!$omp parallel do
    do i = 1, n
       z(i) = a * x(i) + y(i)
    end do
    return
    end
12. OpenMP Threads
- 1. All OpenMP programs begin as a single process: the master thread.
- 2. FORK: the master thread creates a team of parallel threads (sketched below).
- 3. Parallel region statements are executed in parallel among the various team threads.
- 4. JOIN: the threads synchronize and terminate, leaving only the master thread.
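The fork-join life cycle in a minimal C sketch (my illustration; the messages are arbitrary):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        printf("master thread alone\n");      /* 1. single master thread */

    #pragma omp parallel                      /* 2. FORK: create a team */
        {
            /* 3. executed by every thread in the team */
            printf("hello from thread %d\n", omp_get_thread_num());
        }                                     /* 4. JOIN: implicit barrier */

        printf("master thread alone again\n");
        return 0;
    }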
13. Private vs. Shared Variables
- Serial execution: all data references (z, a, x, y, n, i) are to global shared memory.
- Parallel execution: references to z, a, x, y, n are still to global shared memory, but each thread has a private copy of the loop index i, and references to i are to that private copy.
14. Division of Work
- n = 40, 4 threads: the iterations of the parallel do loop are divided among the threads.

    subroutine saxpy(z, a, x, y, n)
    integer i, n
    real z(n), a, x(n), y(n)
!$omp parallel do
    do i = 1, n
       z(i) = a * x(i) + y(i)
    end do
    return
    end

- z, a, x, y, n live in global shared memory; each thread keeps its private copy of i in local memory and executes one chunk of iterations: i = 1-10, i = 11-20, i = 21-30, and i = 31-40 (sketched in C below).
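The same division in a C sketch (not from the slides), assuming a static schedule so that each of the 4 threads gets one contiguous chunk of 10 iterations (the default schedule is implementation-defined, but static is typical):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        float z[40], x[40], y[40], a = 2.0f;
        for (int i = 0; i < 40; i++) { x[i] = i; y[i] = 1.0f; }

        /* n = 40, 4 threads: i is private to each thread, and each
           thread computes one chunk of 10 iterations */
    #pragma omp parallel for num_threads(4)
        for (int i = 0; i < 40; i++) {
            z[i] = a * x[i] + y[i];
            printf("thread %d computed z[%d]\n", omp_get_thread_num(), i);
        }
        return 0;
    }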
15. Variable Scoping
- The most difficult part of shared memory parallelization:
- What memory is shared
- What memory is private (i.e. each processor has its own copy)
- How private memory is treated vis-à-vis the global address space
- Variables are shared by default, except for the loop index in a parallel do.
- This must mesh with the Fortran view of memory:
- Global: shared by all routines
- Local: local to a given routine
- saved vs. non-saved variables (through the SAVE statement or a -save compiler option)
16. Static vs. Automatic Variables
- The Fortran 77 standard allows subprogram local variables to become undefined between calls, unless they are saved with a SAVE statement.

              STATIC       AUTOMATIC
    AIX       (default)    -qnosave
    IRIX      -static      -automatic (default)
    SunOS     (default)    -stackvar
17. OpenMP Directives in Fortran
- Line continuation
- Fixed form (continuation character in column 6, after the sentinel):

    !$OMP PARALLEL DO
    !$OMP&PRIVATE(JMAX)
    !$OMP&SHARED(A, B)

- Free form (trailing ampersand):

    !$OMP PARALLEL DO &
    !$OMP PRIVATE(JMAX) &
    !$OMP SHARED(A, B)
18. OpenMP in C
- Same functionality as OpenMP for Fortran
- Differences in syntax
- #pragma omp for
- Differences in variable scoping (see the sketch below)
- variables "visible" when #pragma omp parallel is encountered are shared by default
- static variables declared within a parallel region are also shared
- heap-allocated memory (malloc) is shared (but the pointer can be private)
- automatic storage declared within a parallel region is private (i.e. on the stack)
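These rules in a minimal C sketch (my illustration; the variable names are arbitrary):

    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    int main(void)
    {
        int visible = 42;                       /* visible at the pragma: shared */
        double *buf = malloc(8 * sizeof *buf);  /* heap block: shared */

    #pragma omp parallel num_threads(2)
        {
            int mine = omp_get_thread_num();    /* automatic, declared inside
                                                   the region: private */
            buf[mine] = visible + mine;         /* threads write disjoint slots */
            printf("thread %d wrote %g\n", mine, buf[mine]);
        }
        free(buf);
        return 0;
    }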
19. OpenMP Overhead
- The overhead of parallelization is large (e.g. about 8000 cycles for a parallel do over 16 processors of an SGI Origin 2000).
- The parallel work construct must be significant enough to overcome the overhead.
- Rule of thumb: it takes about 10 kFLOPs of work to amortize the overhead (see the timing sketch below).
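One hedged way to see the effect is to time a tiny loop against a large one with the standard omp_get_wtime() routine (the loop sizes here are arbitrary):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        double s = 0.0, t0, t1;

        t0 = omp_get_wtime();
    #pragma omp parallel for reduction(+:s)
        for (int i = 0; i < 10; i++)           /* tiny loop: overhead dominates */
            s += i;
        t1 = omp_get_wtime();
        printf("10 iterations: %g s\n", t1 - t0);

        s = 0.0;
        t0 = omp_get_wtime();
    #pragma omp parallel for reduction(+:s)
        for (int i = 0; i < 10000000; i++)     /* big loop: work dominates */
            s += i;
        t1 = omp_get_wtime();
        printf("1e7 iterations: %g s (sum %g)\n", t1 - t0, s);
        return 0;
    }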
20. OpenMP Use
- How is OpenMP typically used?
- OpenMP is usually used to parallelize loops:
- Find your most time-consuming loops.
- Split them up between threads.
- Better scaling can be obtained using OpenMP parallel regions, but this can be tricky (see the sketch below)!
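A C sketch contrasting the two styles (not from the slides; N and the loop bodies are arbitrary): loop-level parallelism puts one directive on each hot loop, while a single parallel region spans several loops so the team is forked only once:

    #include <stdio.h>
    #include <omp.h>

    #define N 1000000
    static double x[N], y[N];

    int main(void)
    {
        /* Loop-level style: fork/join around the one hot loop */
    #pragma omp parallel for
        for (int i = 0; i < N; i++)
            y[i] = 2.0 * x[i];

        /* Region style: one fork, several work-shared loops inside */
    #pragma omp parallel
        {
    #pragma omp for
            for (int i = 0; i < N; i++)
                y[i] = 2.0 * x[i];
    #pragma omp for
            for (int i = 0; i < N; i++)
                x[i] = y[i] + 1.0;
        }
        printf("x[0] = %g\n", x[0]);
        return 0;
    }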
21. OpenMP vs. MPI

OpenMP:
- Only for shared memory computers
- Easy to incrementally parallelize
- More difficult to write highly scalable programs
- Small API based on compiler directives and limited library routines
- Same program can be used for sequential and parallel execution
- Shared vs. private variables can cause confusion

MPI:
- Portable to all platforms
- Parallelize all or nothing
- Vast collection of library routines
- Possible but difficult to use the same program for serial and parallel execution
- Variables are local to each processor
22. References
- Parallel Programming in OpenMP, by Chandra et al. (Morgan Kaufmann)
- www.openmp.org
- Multimedia tutorial at Boston University: scv.bu.edu/SCV/Tutorials/OpenMP/
- Lawrence Livermore online tutorial: www.llnl.gov/computing/training/
- European Workshop on OpenMP (EWOMP): www.epcc.ed.ac.uk/ewomp2000/