Title: Parallel Algorithms
1Parallel Algorithms
2Parallel Models
- Hypercube
- Butterfly
- Fully Connected
- Other Networks
- Shared Memory vs. Distributed Memory
- SIMD vs. MIMD
3The PRAM Model
- Parallel Random Access Machine
- All processors act in lock-step
- Number of processors is not limited
- All processors have local memory
- One global memory accessible to all processors
- Processors must explicitly read from and write to global memory
4A PRAM Algorithm
- Every processor knows its own index (usually indicated by the variable i)
- Vector Sum
  Read M[i] into x
  Read M[i+n] into y
  x ← x + y
  Write x into M[i]
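The Vector Sum steps can be simulated sequentially to see what one lock-step pass does. This is only a sketch: the function name `pram_vector_sum` and the memory layout (two length-n vectors stored back to back in M) are assumptions, and indices are 0-based here while the slide is 1-based.

```python
def pram_vector_sum(memory, n):
    """Simulate one lock-step pass of Vector Sum with n processors.

    Assumed layout: memory[0:n] and memory[n:2n] hold the two vectors.
    """
    # Read phase: processor i reads M[i] and M[i+n] into its local x and y.
    x = [memory[i] for i in range(n)]
    y = [memory[i + n] for i in range(n)]
    # Compute phase: x <- x + y, done in each processor's local memory.
    x = [x[i] + y[i] for i in range(n)]
    # Write phase: processor i writes its x back to M[i].
    for i in range(n):
        memory[i] = x[i]
    return memory


print(pram_vector_sum([1, 2, 3, 10, 20, 30], 3)[:3])  # prints [11, 22, 33]
```

Each list comprehension stands in for one instruction executed by all processors at the same time, which is why the reads are finished before any write happens.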
5Binary Fan-In
Read M[i] into Largest
Write M[i] into M[i+n]
Delta ← 1
For k ← 1 to ⌈lg n⌉
    Read M[i+Delta] into x
    Largest ← Maximum(x, Largest)
    Write Largest into M[i]
    Delta ← Delta * 2
End For
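A sequential simulation of the fan-in loop may help show why the overall maximum ends up in the first cell. This is a sketch under assumptions: the name `fan_in_max` is hypothetical, indices are 0-based, and each round is split into a read phase and a write phase to mimic lock-step execution.

```python
import math


def fan_in_max(values):
    """Simulate binary fan-in for the maximum with one processor per element."""
    n = len(values)
    # Write M[i] into M[i+n]: the upper half pads reads that run past n.
    m = list(values) + list(values)
    largest = list(values)          # each processor's local Largest
    delta = 1
    rounds = math.ceil(math.log2(n)) if n > 1 else 0
    for _ in range(rounds):
        # Read phase: all processors read M[i+Delta] before anyone writes.
        xs = [m[i + delta] for i in range(n)]
        largest = [max(xs[i], largest[i]) for i in range(n)]
        # Write phase: all processors write Largest into M[i].
        for i in range(n):
            m[i] = largest[i]
        delta *= 2
    return m[0]


print(fan_in_max([3, 7, 1, 9, 4]))  # prints 9
```

Padding with a copy of the processor's own value is safe here because taking a maximum twice over the same element does not change the result.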
6Parallel Addition
Read M[i] into Total
Write 0 into M[i+n]
Delta ← 1
For k ← 1 to ⌈lg n⌉
    Read M[i+Delta] into x
    Total ← x + Total
    Write Total into M[i]
    Delta ← Delta * 2
End For
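Simulating the addition loop shows a side effect worth noticing: every cell M[i] ends up holding the sum of the elements from position i to the end, and the first cell holds the grand total. The sketch below assumes 0-based indices and uses a hypothetical function name, `fan_in_sum`.

```python
import math


def fan_in_sum(values):
    """Simulate Parallel Addition; returns the final contents of M[0:n]."""
    n = len(values)
    m = list(values) + [0] * n      # Write 0 into M[i+n]: zero padding
    total = list(values)            # each processor's local Total
    delta = 1
    rounds = math.ceil(math.log2(n)) if n > 1 else 0
    for _ in range(rounds):
        # Read phase: every processor reads M[i+Delta] from the old memory.
        xs = [m[i + delta] for i in range(n)]
        total = [xs[i] + total[i] for i in range(n)]
        # Write phase: every processor writes Total into M[i].
        m[:n] = total
        delta *= 2
    return m[:n]


print(fan_in_sum([1, 2, 3, 4, 5]))  # prints [15, 14, 12, 9, 5]
```

After round k, cell i covers a window of 2^k consecutive elements starting at i, so ⌈lg n⌉ rounds are enough for the first cell to cover the whole input.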
7Pointer Jumping
Read M[i] into Total
For k ← 1 to ⌈lg n⌉
    Read Next[i] into Ptr
    If Ptr ≠ 0 Then
        Read M[Ptr] into x
        Total ← Total + x
        Write Total into M[i]
        Read Next[Ptr] into NewPtr
        Write NewPtr into Next[i]
    End If
End For
8Initialization of Nexti
If i = n Then
    Write 0 into Next[i]
Else
    Write i+1 into Next[i]
End If
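Pointer jumping and the Next initialization can be simulated together. The sketch below keeps the slides' 1-based indexing and the convention that 0 marks the end of the list; the function name `pointer_jumping_sums` and the use of snapshots to model lock-step reads are assumptions.

```python
import math


def pointer_jumping_sums(values):
    """Simulate pointer jumping over the list 1 -> 2 -> ... -> n."""
    n = len(values)
    m = [0] + list(values)          # 1-based memory; slot 0 is unused
    # Initialization of Next[i]: the last element points to 0 (end of list).
    nxt = [0] * (n + 1)
    for i in range(1, n + 1):
        nxt[i] = 0 if i == n else i + 1
    total = m[:]                    # each processor's local Total
    rounds = math.ceil(math.log2(n)) if n > 1 else 0
    for _ in range(rounds):
        # Snapshots model lock-step: reads see the state before this round's writes.
        old_m, old_next = m[:], nxt[:]
        for i in range(1, n + 1):
            ptr = old_next[i]
            if ptr != 0:
                total[i] = total[i] + old_m[ptr]
                m[i] = total[i]
                nxt[i] = old_next[ptr]   # jump over the successor
    return m[1:]


print(pointer_jumping_sums([1, 2, 3, 4, 5]))  # prints [15, 14, 12, 9, 5]
```

Each round doubles the distance a pointer spans, so every element reaches the end of the list within ⌈lg n⌉ rounds, and M[i] finishes as the sum of the elements from i to the end.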
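9Calculate Node Depth 1
If there is a Left Child
[Figure: a tree node drawn as three elements labeled 1, -1, and 0, with one arrow labeled "To 1 of Left Child" and one labeled "From -1 of Left Child".]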
10Calculate Node Depth 2
If there is no left child
[Figure: a tree node drawn as three elements labeled 1, -1, and 0.]
11Calculate Node Depth 3
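If there is a Right Child
[Figure: a tree node drawn as three elements labeled 1, -1, and 0, with one arrow labeled "From -1 of Right Child" and one labeled "To 1 of Right Child".]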
12Calculate Node Depth 4
If there is no right child
[Figure: a tree node drawn as three elements labeled 1, -1, and 0.]
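The four Node Depth figures appear to describe the linked list used by the Euler-tour technique, in which each tree node contributes three list elements with values 1, 0, and -1. The sketch below is an assumption about the wiring, following the standard construction rather than anything recoverable from the slides: the 1 element links to the left child's 1 element (or to the node's own 0 element), the 0 element links to the right child's 1 element (or to the node's own -1 element), and the -1 element links back to the parent's 0 element from a left child or to the parent's -1 element from a right child. The names `node_depths`, `"A"`, `"B"`, and `"C"` are hypothetical, and the running sums are computed by a plain sequential walk; on a PRAM they would be computed with pointer jumping.

```python
def node_depths(left, right, root):
    """left/right map each node to its child (or None). Returns node -> depth."""
    parent = {}
    for v in left:
        for child, side in ((left[v], "L"), (right[v], "R")):
            if child is not None:
                parent[child] = (v, side)

    value, succ = {}, {}
    for v in left:
        # Three elements per node: A has value 1, B has value 0, C has value -1.
        value[(v, "A")], value[(v, "B")], value[(v, "C")] = 1, 0, -1
        succ[(v, "A")] = (left[v], "A") if left[v] is not None else (v, "B")
        succ[(v, "B")] = (right[v], "A") if right[v] is not None else (v, "C")
        if v == root:
            succ[(v, "C")] = None            # end of the list
        else:
            p, side = parent[v]
            succ[(v, "C")] = (p, "B") if side == "L" else (p, "C")

    # Walk the list from the root's A element, keeping a running sum.
    depth, running, elem = {}, 0, (root, "A")
    while elem is not None:
        running += value[elem]
        if elem[1] == "C":                   # the sum at a node's C element is its depth
            depth[elem[0]] = running
        elem = succ[elem]
    return depth


left = {0: 1, 1: 3, 2: None, 3: None}
right = {0: 2, 1: None, 2: None, 3: None}
print(node_depths(left, right, 0))
```

For this small tree (root 0 with children 1 and 2, and node 3 as the left child of 1) the walk assigns depth 0 to the root, 1 to nodes 1 and 2, and 2 to node 3.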
13Concurrent Reads and Writes
- EREW - Exclusive Read, Exclusive Write
- CREW - Concurrent Read, Exclusive Write
- CRCW - Concurrent Read, Concurrent Write
- Common CRCW: all concurrent writes to a cell must write the same value
- Priority CRCW: the highest-priority processor wins the contest
- CREW is more powerful than EREW
- CRCW is more powerful than CREW
14Finding Max
- Square Array of Processors Indexed by i,j
Write True into R[i]
Read M[i] into x
Read M[j] into y
If x < y Then
    Write False into R[i]
Else If y < x Then
    Write False into R[j]
End If
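The Finding Max steps can be checked with a sequential simulation of the n × n processor array. This is a sketch with assumed names (`crcw_max`, the result array `r`) and 0-based indices. Every concurrent write to the same cell writes the same value (True in the first step, False in the last), so the order in which the simulation applies them does not matter.

```python
def crcw_max(m):
    """Simulate Finding Max with one processor per pair (i, j)."""
    n = len(m)
    r = [None] * n
    # Step 1: every processor (i, j) writes True into R[i].
    for i in range(n):
        for j in range(n):
            r[i] = True
    # Steps 2-4: processor (i, j) compares M[i] with M[j] and marks the loser.
    for i in range(n):
        for j in range(n):
            x, y = m[i], m[j]
            if x < y:
                r[i] = False
            elif y < x:
                r[j] = False
    # R[i] is still True exactly where M[i] is a maximum (ties all stay True).
    return [m[i] for i in range(n) if r[i]]


print(crcw_max([3, 9, 2, 9]))  # prints [9, 9]
```

The two nested loops stand in for a constant number of parallel steps; the cost of the speed is the n² processors.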
15CRCW vs. CREW
- CRCW Max runs in constant time
- CREW Max runs in lg n time
- A p-processor CRCW algorithm can be at most a factor of lg p faster than the best p-processor EREW algorithm for the same problem
16EREW vs. CREW
- Finding Roots by Shortcutting Pointers
- CREW Runs in lg lg n Time
- EREW Runs in lg n Time
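Root finding by pointer shortcutting can be sketched as repeated replacement of each pointer by its pointer's pointer. The representation is an assumption: a forest stored as a `parent` list in which a root points to itself, with the hypothetical function name `find_roots`. On a CREW PRAM many processors may read the same cell in one step, which the list comprehension below models by reading only from the previous round's pointers.

```python
import math


def find_roots(parent):
    """Return, for every node, the root of its tree (roots satisfy parent[i] == i)."""
    n = len(parent)
    ptr = list(parent)
    # Each round doubles the distance covered, so ceil(lg n) rounds always suffice.
    rounds = math.ceil(math.log2(n)) if n > 1 else 0
    for _ in range(rounds):
        # All processors read ptr[ptr[i]] from the old array, then write together.
        ptr = [ptr[ptr[i]] for i in range(n)]
    return ptr


print(find_roots([0, 0, 1, 2, 4, 4]))  # prints [0, 0, 0, 0, 4, 4]
```

The number of rounds actually needed depends on the depth of the deepest tree rather than on n, since after k rounds each pointer has skipped up to 2^k ancestors.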
17Optimal Parallel Algorithms
- NC: the class of algorithms that run in Θ(log^m n) time using Θ(n^k) processors, for constants m and k
- General Boolean functions cannot be computed any faster than Θ(lg n)
- Θ(lg n) is optimal for computing the sum of n integers