SHARED-MEMORY PROGRAMMING - 6th week
Parallel Processing Course, Author: Vu Le Hung

Transcript and Presenter's Notes

1
SHARED-MEMORY PROGRAMMING - 6th week
2
SHARED-MEMORY PROGRAMMING - 6th week
  • References
  • Introduction
  • The ANSI X3H5 Shared-Memory Model
  • The POSIX Threads Model
  • The OpenMP Standard

3
REFERENCES
  • Scalable Parallel Computing: Technology, Architecture, Programming, Kai Hwang and Zhiwei Xu, ch. 12
  • Parallel Processing Course
  • Yang-Suk Kee (yskee@iris.snu.ac.kr)
  • School of EECS, Seoul National University

4
Introduction to Shared-Memory Programming Model
(Figure: a shared-memory system of processors and memory in which two threads (processes) access a shared variable X in the common memory through read(X) and write(X) operations)
Shared-Memory Model / Shared Address Space (SAS) Model
5
Introduction (contd)
  • Naming
  • Any process can name any variable in the shared space
  • Operations
  • Loads and stores, plus those needed for ordering
  • Simplest ordering model
  • Within a process/thread: sequential program order
  • Across threads: some interleaving (as in time-sharing)
  • Additional orders imposed through synchronization
  • Again, compilers/hardware can violate orders as long as the violation cannot be observed

6
SYNCHRONIZATION
  • Mutual exclusion (locks); see the sketch after this list
  • Ensures that certain operations on certain data are performed by only one process at a time
  • Like a room that only one person can enter at a time
  • No ordering guarantees
  • Event synchronization
  • Ordering of events to preserve dependences
  • e.g. producer -> consumer of data
  • 3 main types
  • point-to-point
  • global
  • group
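
The bullets above are abstract, so here is a minimal mutual-exclusion sketch, not taken from the slides, written with Pthreads (introduced later in this deck). The lock plays the role of the "room": only one thread at a time may update the shared counter; the names (worker, counter) are illustrative.

  #include <pthread.h>
  #include <stdio.h>

  long counter = 0;                                 /* shared variable */
  pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

  void *worker(void *arg)
  {
      for (int i = 0; i < 100000; i++) {
          pthread_mutex_lock(&lock);                /* enter the "room" */
          counter++;                                /* protected update */
          pthread_mutex_unlock(&lock);              /* leave the "room" */
      }
      return NULL;
  }

  int main(void)
  {
      pthread_t t1, t2;
      pthread_create(&t1, NULL, worker, NULL);
      pthread_create(&t2, NULL, worker, NULL);
      pthread_join(t1, NULL);
      pthread_join(t2, NULL);
      printf("counter = %ld\n", counter);           /* always 200000 */
      return 0;
  }

Event synchronization (producer -> consumer ordering) would instead use a condition variable; a sketch of that appears after the list of Pthreads primitives later in the deck.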

7
NAMING AND OPERATIONS
  • Naming and operations in the programming model can be directly supported by lower levels, or translated by the compiler, libraries or OS
  • Example
  • Shared virtual address space in the programming model
  • Hardware interface supports a shared physical address space
  • Direct support by hardware through virtual-to-physical (v-to-p) mappings, no software layers

8
NAMING AND OPERATIONS (contd)
  • Hardware supports independent physical address spaces
  • Can provide SAS through the OS, i.e. at the system/user interface
  • v-to-p mappings only for data that are local
  • remote data accesses incur page faults and are brought in via page-fault handlers
  • same programming model, different hardware requirements and cost model
  • Or through compilers or the runtime, i.e. above the system/user interface
  • shared objects, instrumentation of shared accesses, compiler support

9
SHARED-MEMORY STANDARDS
  • No single widely accepted standard
  • Three popular platform-independent shared-memory standards are
  • X3H5
  • OpenMP
  • POSIX Pthreads

10
THE ANSI X3H5 MODEL
  • Established in 1993
  • Has greatly influenced many commercial shared-memory systems
  • Defines one conceptual standard programming model and 3 bindings, for C, Fortran 77 and Fortran 90

11
THE ANSI X3H5 MODEL (contd)
  • Main features
  • Parallelism constructs
  • Parallel blocks
  • Parallel loops
  • Implicit barriers
  • Support for thread interaction and synchronization

12
PARALLELISM CONSTRUCTS
  • A parallelism construct is a pair of parallel and end parallel directives with the enclosed code
  • A program starts in sequential mode with one initial thread (the base thread / master thread)
  • When the program encounters a parallel, it switches to parallel mode by creating a number of child threads
  • The team of master thread and child threads executes in parallel until an end parallel
  • After the end parallel, the program switches back to sequential mode (only the base thread continues execution)

13
PARALLEL CONSTRUCTS IN AN X3H5 PROGRAM
  program main
      A                      ! executed by only the base thread
      parallel
          B                  ! executed by every thread in the team (parallel mode)
          psections
              section
                  C          ! executed by one team member
              section
                  D          ! executed by another team member
          end psections
          psingle
              E              ! executed by only one thread (sequential mode)
          end psingle
          pdo i = 1, 6
              F(i)           ! all threads share the 6 iterations of the loop
          end pdo no wait
          G                  ! executed by every thread in the team
      end parallel
      H                      ! executed by only the base thread
14
PARALLEL CONSTRUCTS IN AN X3H5 PROGRAM
ILLUSTRATION
(Figure: execution of the program by a team of three threads P, Q and R over time. A is executed by the base thread alone; an implicit barrier follows the parallel directive. B is executed by all three threads; C and D are executed by two different team members, followed by an implicit barrier; E is executed by one thread only, followed by an implicit barrier; the six iterations of F are shared as F(1-2), F(3-4) and F(5-6), with no implicit barrier because of the no wait qualifier; G is executed by all three threads, followed by an implicit barrier; H is executed by the base thread alone.)
15
OTHER CONSTRUCTS
  • Inside a parallel construct, there are
  • Work-sharing constructs
  • Parallel block (psections ... end psections)
  • Parallel loop (pdo ... end pdo)
  • A single process (psingle ... end psingle)
  • Other code, to be executed redundantly by every thread in the team
  • Parallel Block
  • Consists of many sections (psections ... end psections)
  • Used to specify MPMD parallelism
  • Each section is to be executed by a team member

16
OTHER CONSTRUCTS (contd)
  • Parallel Loop (pdo ... end pdo)
  • Used to specify SPMD parallelism
  • The same code is to be executed by all team members

17
OTHER FEATURES OF X3H5
  • Implicit Barrier
  • At parallel, end parallel, end psections, end pdo and end psingle (use no wait to avoid it)
  • A fence operation forces all memory accesses up to the barrier point to be consistent
  • Parallel and work-sharing constructs can be nested
  • Support for thread interaction
  • shared/private variables in a parallel construct
  • implicit and explicit barriers
  • 4 types of synchronization objects: latch, lock, event and ordinal
  • Support for thread synchronization
  • Lock/event synchronization
  • Critical regions and ordinal objects

18
THE POSIX THREADS (Pthreads) MODEL
  • Established by IEEE in 1995
  • Functionality and interface are similar to those of Solaris Threads
  • Defines a set of primitive routines to manage and synchronize threads
  • Uses mutex objects and condition variables for thread synchronization

19
THE Pthreads MODEL (contd)
  • Thread management
  • pthread_create
  • pthread_exit
  • pthread_join
  • pthread_self
  • Thread synchronization primitives (a condition-variable sketch follows this list)
  • pthread_mutex_init
  • pthread_mutex_destroy
  • pthread_mutex_lock
  • pthread_mutex_trylock
  • pthread_mutex_unlock
  • pthread_cond_init
  • pthread_cond_destroy
  • pthread_cond_wait
  • pthread_cond_timedwait
  • pthread_cond_signal
  • pthread_cond_broadcast
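
As a sketch of how these primitives fit together (not part of the original slides), the following producer/consumer example uses a mutex plus a condition variable for point-to-point event synchronization; the names (producer, consumer, data, ready) are illustrative.

  #include <pthread.h>
  #include <stdio.h>

  int data;                       /* shared value produced by one thread */
  int ready = 0;                  /* predicate guarded by the mutex      */
  pthread_mutex_t m;
  pthread_cond_t  c;

  void *producer(void *arg)
  {
      pthread_mutex_lock(&m);
      data = 42;
      ready = 1;
      pthread_cond_signal(&c);    /* wake the waiting consumer */
      pthread_mutex_unlock(&m);
      return NULL;
  }

  void *consumer(void *arg)
  {
      pthread_mutex_lock(&m);
      while (!ready)              /* re-check the predicate after waking */
          pthread_cond_wait(&c, &m);
      printf("consumed %d\n", data);
      pthread_mutex_unlock(&m);
      return NULL;
  }

  int main(void)
  {
      pthread_t p, q;
      pthread_mutex_init(&m, NULL);
      pthread_cond_init(&c, NULL);
      pthread_create(&q, NULL, consumer, NULL);
      pthread_create(&p, NULL, producer, NULL);
      pthread_join(p, NULL);
      pthread_join(q, NULL);
      pthread_cond_destroy(&c);
      pthread_mutex_destroy(&m);
      return 0;
  }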

20
HELLO WORLD PROGRAM - PTHREAD VERSION
  #include <pthread.h>
  #include <stdio.h>

  void *thrfunc(void *arg)
  {
      printf("hello from thread %d\n", *(int *)arg);
      return NULL;
  } // end thrfunc

  int main(void)
  {
      pthread_t thread[4];
      pthread_attr_t attr;
      int arg[4] = {0, 1, 2, 3};
      int i;

      // set up joinable threads with system scope
      pthread_attr_init(&attr);
      pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
      pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

      // create N threads
      for (i = 0; i < 4; i++)
          pthread_create(&thread[i], &attr, thrfunc, (void *)&arg[i]);

      // wait for the N threads to finish
      for (i = 0; i < 4; i++)
          pthread_join(thread[i], NULL);
  } // end main
21
THE OpenMP STANDARD
  • An Application Program Interface (API) used to explicitly direct multi-threaded, shared-memory parallelism
  • Inherits many concepts from the ANSI X3H5 model
  • Three API components (a small example combining them follows this list)
  • Compiler Directives
  • Runtime Library Routines
  • Environment Variables
  • Portable
  • APIs for C/C++ and Fortran
  • Multiple platforms: most Unix platforms and Windows NT
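
As a small illustration (not from the slides) of how the three components combine, the sketch below uses a compiler directive, two runtime library routines, and relies on the OMP_NUM_THREADS environment variable for the team size; compile with an OpenMP-enabled compiler (e.g. with -fopenmp).

  #include <omp.h>
  #include <stdio.h>

  int main(void)
  {
      /* compiler directive: fork a team of threads */
      #pragma omp parallel
      {
          /* runtime library routines: query the team */
          printf("thread %d of %d\n",
                 omp_get_thread_num(), omp_get_num_threads());
      }
      /* the team size above comes from the OMP_NUM_THREADS environment
         variable (or from omp_set_num_threads / a num_threads clause) */
      return 0;
  }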

22
THE OpenMP STANDARD (contd)
  • Standardized
  • Jointly proposed by a group of major computer
    hardware and software vendors
  • Expected to become an ANSI standard
  • What does OpenMP stand for?
  • Open specifications for multi-processing
  • Collaborative work with interested parties from
    the hardware and software industry, government
    and academia

23
OpenMP IS NOT
  • Meant for distributed-memory parallel systems by itself
  • Implemented identically by all vendors
  • Guaranteed to make the most efficient use of shared memory
  • There are no data-locality constructs

24
GOALS OF OpenMP
  • Standardization
  • Provide a standard among a variety of shared-memory architectures (platforms)
  • High-level interfaces to thread programming
  • Lean and Mean
  • A simple and limited set of directives for shared-address-space programming
  • Just 3 or 4 directives are enough to express significant parallelism

25
HELLO WORLD - OpenMP VERSION
  #include <omp.h>
  #include <stdio.h>

  int main(void)
  {
      #pragma omp parallel
      printf("hello from thread %d\n", omp_get_thread_num());
  }
26
GOALS OF OpenMP (contd)
  • Ease of use
  • Incrementally parallelize a serial program
  • Unlike the all-or-nothing approach of message passing
  • Implement both coarse-grain and fine-grain parallelism
  • Portability
  • Fortran (77, 90, and 95), C, and C++
  • Public forum for API and membership

27
MATRIX MULTIPLICATION - SEQUENTIAL VERSION
  for (i = 0; i < N; i++)
      for (j = 0; j < N; j++) {
          temp = 0;
          for (k = 0; k < N; k++)
              temp += a[i][k] * b[k][j];
          c[i][j] = temp;
      }
28
MATRIX MULTIPLICATION - OPENMP VERSION
  // Only a directive is added to the sequential code
  // (j, k and temp must be private to each thread)
  #pragma omp parallel for private(j, k, temp) schedule(static)
  for (i = 0; i < N; i++)
      for (j = 0; j < N; j++) {
          temp = 0;
          for (k = 0; k < N; k++)
              temp += a[i][k] * b[k][j];
          c[i][j] = temp;
      }
29
PROGRAMMING MODEL
  • Thread-Based Parallelism
  • A shared-memory process with multiple threads
  • Based upon multiple threads in the shared-memory programming paradigm
  • Explicit Parallelism
  • Explicit (not automatic) programming model
  • Offers the programmer full control over parallelization (see the sketch after this list)
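
A minimal sketch (not from the slides) of this explicit control: the thread count and whether a region runs in parallel at all are chosen by the programmer; the threshold used in the if clause is illustrative.

  #include <omp.h>
  #include <stdio.h>

  int main(void)
  {
      int n = 100000;

      omp_set_num_threads(4);        /* explicitly request 4 threads */

      /* parallelize only when the problem is large enough; otherwise
         the region runs sequentially on the master thread */
      #pragma omp parallel if (n > 1000) num_threads(4)
      printf("thread %d is working\n", omp_get_thread_num());

      return 0;
  }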

30
PROGRAMMING MODEL (contd)
  • Fork-Join Model (see the sketch after this list)
  • All OpenMP programs begin as a single sequential process: the master thread
  • Fork at the beginning of a parallel construct
  • The master thread creates a team of parallel threads
  • The statements enclosed by the parallel region construct are executed in parallel
  • Join at the end of a parallel construct
  • The threads synchronize and terminate after completing the statements in the parallel construct
  • Only the master thread remains
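
A minimal fork-join sketch (not from the slides): one thread exists before the parallel region, a team runs inside it, and only the master thread remains after the implicit join.

  #include <omp.h>
  #include <stdio.h>

  int main(void)
  {
      printf("before: %d thread(s), in parallel? %d\n",
             omp_get_num_threads(), omp_in_parallel());   /* 1, 0 */

      #pragma omp parallel                                /* fork */
      printf("inside: thread %d of %d\n",
             omp_get_thread_num(), omp_get_num_threads());
      /* implicit join here */

      printf("after: %d thread(s), in parallel? %d\n",
             omp_get_num_threads(), omp_in_parallel());   /* 1, 0 */
      return 0;
  }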

31
FORK-JOIN MODEL
32
PROGRAMMING MODEL (contd)
  • Compiler-Directive Based
  • Parallelism is specified through compiler directives embedded in C/C++ or Fortran source code
  • Nested Parallelism Support
  • Parallel constructs may include other parallel constructs inside
  • Implementation-dependent
  • Dynamic Threads
  • The number of threads used to execute parallel regions can be altered at run time
  • Implementation-dependent (see the sketch after this list for both switches)
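
A minimal sketch (not from the slides) of the run-time switches for these two implementation-dependent features; whether they take effect depends on the OpenMP implementation.

  #include <omp.h>
  #include <stdio.h>

  int main(void)
  {
      omp_set_nested(1);     /* ask for nested parallelism                    */
      omp_set_dynamic(1);    /* let the runtime adjust team sizes dynamically */

      printf("nested: %d, dynamic: %d\n", omp_get_nested(), omp_get_dynamic());

      #pragma omp parallel num_threads(2)        /* outer team */
      {
          #pragma omp parallel num_threads(2)    /* inner team (if nesting is enabled) */
          printf("nesting level %d, thread %d\n",
                 omp_get_level(), omp_get_thread_num());
      }
      return 0;
  }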

33
GENERAL CODE STRUCTURE
  #include <omp.h>

  int main()
  {
      int var1, var2, var3;
      /* Serial code ... */

      /* Beginning of parallel section. Fork a team of threads.
         Specify variable scoping. */
      #pragma omp parallel private(var1, var2) shared(var3)
      {
          /* Parallel section executed by all threads ... */
          /* All threads join the master thread and disband */
      }

      /* Resume serial code ... */
  }
34
OPENMP COMPONENTS
  • Directives (a sketch combining these components follows this list)
  • Work-sharing constructs
  • Data environment clauses
  • Synchronization constructs
  • Runtime libraries
  • Environment variables
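
A minimal sketch (not from the slides) touching each component group: a work-sharing construct (for), data environment clauses (shared, private, reduction), a synchronization construct (critical), a runtime routine, and the OMP_NUM_THREADS environment variable for the team size.

  #include <omp.h>
  #include <stdio.h>

  int main(void)
  {
      int i, n = 1000;
      double sum = 0.0;

      #pragma omp parallel shared(n) private(i)
      {
          #pragma omp for reduction(+:sum)       /* work-sharing + data clause */
          for (i = 0; i < n; i++)
              sum += i;

          #pragma omp critical                   /* synchronization construct */
          printf("thread %d done\n", omp_get_thread_num());
      }

      printf("sum = %.0f (team size set by OMP_NUM_THREADS)\n", sum);
      return 0;
  }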

35
COMPARISON OF 5 SHARED-MEMORY PROGRAMMING STANDARDS
Attribute                     X3H5    MPI       Pthreads    HPF       OpenMP
Scalable                      No      Yes       Sometimes   Yes       Yes
Fortran binding               Yes     Yes       No          Yes       Yes
C binding                     Yes     Yes       Yes         No        Planned
High level                    Yes     No        No          Yes       Yes
Performance oriented          No      Yes       No          Yes       Yes
Supports data parallelism     Yes     No        No          Yes       Yes
Portable                      Yes     Yes       Yes         Yes       Yes
Vendor support                No      Widely    Unix SMP    Widely    Starting
Incremental parallelization   Yes     No        No          No        Yes
Courtesy: OpenMP Standards Board, 1997