Computer Science 320 - PowerPoint PPT Presentation

Title: Computer Science 111
Author: Ken Lambert
Last modified by: Kenneth Lambert
Created Date: 12/19/1997 3:34:08 PM
Document presentation format: PowerPoint PPT presentation

Slides: 15
Provided by: KenL84
Learn more at: http://home.wlu.edu
Transcript and Presenter's Notes

1
Computer Science 320
  • Reduction

2
Estimating π
Throw N darts, and let C be the number of darts
that land within the circle quadrant of a unit
circle. Then C / N should be about the same
ratio as circle area / square area. The circle's
area is π R², and the circle quadrant's area is
π / 4, where R = 1. Then C / N ≈ π / 4, and
π ≈ 4 C / N.
3
Sequential Program PiSeq
    // Generate N random points in the unit square, count how many are in
    // the unit circle.
    count = 0;
    for (long i = 0; i < N; ++i) {
        double x = prng.nextDouble();
        double y = prng.nextDouble();
        if (x * x + y * y <= 1.0) ++count;
    }

    // Stop timing.
    time += System.currentTimeMillis();

    // Print results.
    System.out.println("pi = 4 * " + count + " / " + N + " = " +
        (4.0 * count / N));
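The loop on this slide depends on a PRNG object and timing code set up elsewhere in the program. As a minimal, self-contained sketch of the same estimate, assuming `java.util.Random` in place of the course's PRNG class (the class name `PiSeqSketch` and its method names are hypothetical, chosen only for illustration):

```java
import java.util.Random;

class PiSeqSketch {
    // Count how many of n random points in the unit square
    // land inside the quadrant of the unit circle.
    static long countHits(long seed, long n) {
        Random prng = new Random(seed); // assumption: standard-library PRNG
        long count = 0;
        for (long i = 0; i < n; ++i) {
            double x = prng.nextDouble();
            double y = prng.nextDouble();
            if (x * x + y * y <= 1.0) ++count;
        }
        return count;
    }

    // pi is approximately 4 * C / N.
    static double estimatePi(long seed, long n) {
        return 4.0 * countHits(seed, n) / n;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("pi ~ " + estimatePi(142857L, n));
    }
}
```

With a million points the estimate is typically good to two or three decimal places; the error shrinks only as 1 / sqrt(N), which is why the loop is worth parallelizing.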
4
Parallel Program PiSmp3
    new ParallelTeam().execute(new ParallelRegion() {
        public void run() throws Exception {
            execute(0, N-1, new LongForLoop() {
                // Set up per-thread PRNG and counter.
                Random prng_thread = Random.getInstance(seed);
                long count_thread = 0;

                // Extra padding to avert cache interference.
                long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;
                long pad8, pad9, pada, padb, padc, padd, pade, padf;

                // Parallel loop body.
                public void run(long first, long last) {
                    // Skip PRNG ahead to index <first>.
                    prng_thread.setSeed(seed);
                    prng_thread.skip(2 * first);

                    // Generate random points.
                    for (long i = first; i <= last; ++i) {
                        double x = prng_thread.nextDouble();
                        double y = prng_thread.nextDouble();
                        if (x * x + y * y <= 1.0) ++count_thread;
                    }
                }
5
Reduction Step, SMP-Style
    static SharedLong count;
    . . .
    . . .
    public void finish() {
        // Reduce per-thread counts into shared count.
        count.addAndGet(count_thread);
    }
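The same pattern, a private counter per thread followed by a single atomic add into a shared total, can be sketched with only the standard library. This is an illustration under stated assumptions, not the slide's Parallel Java code: `AtomicLong` stands in for `SharedLong`, plain `Thread` objects stand in for the parallel team, and `SplittableRandom.split()` stands in for the skip-ahead PRNG. The class name `PiSmpSketch` is hypothetical.

```java
import java.util.SplittableRandom;
import java.util.concurrent.atomic.AtomicLong;

class PiSmpSketch {
    static long countHits(long seed, long n, int numThreads) {
        AtomicLong count = new AtomicLong(0);          // shared total
        SplittableRandom root = new SplittableRandom(seed);
        Thread[] team = new Thread[numThreads];
        for (int t = 0; t < numThreads; ++t) {
            // Contiguous chunk of the index range 0..n-1 (last is exclusive).
            final long first = n * t / numThreads;
            final long last = n * (t + 1) / numThreads;
            // Assumption: one split generator per thread, created up front.
            final SplittableRandom prngThread = root.split();
            team[t] = new Thread(() -> {
                long countThread = 0;                  // private to this thread
                for (long i = first; i < last; ++i) {
                    double x = prngThread.nextDouble();
                    double y = prngThread.nextDouble();
                    if (x * x + y * y <= 1.0) ++countThread;
                }
                // Reduction step: one atomic update per thread.
                count.addAndGet(countThread);
            });
            team[t].start();
        }
        for (Thread th : team) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        long n = 4_000_000L;
        long hits = countHits(142857L, n, 4);
        System.out.println("pi ~ " + (4.0 * hits / n));
    }
}
```

Because each thread touches the shared counter only once, contention on it is negligible compared with incrementing a shared counter inside the loop.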
6
Monte Carlo Design for a Cluster
  • Could keep global counter in process 0, but that
    would involve too many messages
  • Use reduction instead, so message passing is
    minimal
  • Each process has its own PRNG, with its own split
    sequence

7
Reduction vs Gather
  • Could allocate an array of K cells for results,
    where the ith process's result is in the ith
    cell; then gather these into process 0 and let
    process 0 reduce the end result from these
  • Instead, the reduce method employs all processes
    in computing the reduction

8
Reduction in Cluster
  • Concentrate data into fewer and fewer processes
  • When K = 8:
  • processes 4-7 send their data to processes 0-3
  • processes 2-3 send their results to processes 0-1
  • process 1 sends its results to process 0
  • At most log2(K) rounds of messages (K - 1 messages in
    total, sent in parallel within each round)
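The halving pattern on this slide can be simulated in a single process to check the message counts. The sketch below is only a model: an array cell stands in for each process's data and an addition stands in for a message, so no real message passing takes place. The class name `TreeReduceSketch` and the layout of the returned array are hypothetical.

```java
import java.util.Arrays;

class TreeReduceSketch {
    // Simulate a sum reduction into process 0.
    // Returns { result at process 0, number of rounds, total messages }.
    // Assumes data.length >= 1.
    static long[] reduceSum(long[] data) {
        long[] buf = Arrays.copyOf(data, data.length);
        int active = buf.length;       // processes still holding partial results
        long rounds = 0;
        long messages = 0;
        while (active > 1) {
            int half = (active + 1) / 2;       // receivers are ranks 0..half-1
            for (int src = half; src < active; ++src) {
                buf[src - half] += buf[src];   // "send" from src to src - half
                ++messages;
            }
            active = half;
            ++rounds;
        }
        return new long[] { buf[0], rounds, messages };
    }

    public static void main(String[] args) {
        long[] r = reduceSum(new long[] { 1, 2, 3, 4, 5, 6, 7, 8 });
        // prints "sum=36 rounds=3 messages=7"
        System.out.println("sum=" + r[0] + " rounds=" + r[1] + " messages=" + r[2]);
    }
}
```

For K = 8 the simulation takes 3 rounds and 7 messages: the rounds, not the messages, grow as log2(K), because the messages inside one round would travel in parallel on a real cluster.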

9
Reduction Tree for K = 8
Messages are sent in parallel at each level,
starting at the bottom. When results have been
computed, messages are sent from the next level.
10
Example: Add the Results
Initial state
After first set of messages
11
Example: Add the Results
After second set of messages
After third set of messages
12
It's Automatic: reduce
world.reduce(0, buf, LongOp.SUM)
    // Compute the count in each processor
    ...
    // Perform the reduction step
    LongItemBuf buf = new LongItemBuf();
    buf.item = count;
    // LongOp matches the long item buffer
    world.reduce(0, buf, LongOp.SUM);
    count = buf.item;
    ...
    ...
    if (rank == 0) {
        // Output the count and the estimate of PI
    }
13
Reduction in Mandelbrot Histogram
    int[] histogram = new int[maxiter + 1];
    ...
    world.reduce(0, IntegerBuf.buffer(histogram), IntegerOp.SUM);
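When the buffer wraps an array, a SUM reduction combines the processes' arrays element by element, so cell i at the root ends up holding the total of every process's cell i. A minimal single-process sketch of that end result, assuming each row of a 2D array stands in for one process's histogram (the class name `HistogramReduceSketch` is hypothetical, and no message passing is modeled):

```java
class HistogramReduceSketch {
    // Element-wise sum of per-process histograms: the values the root
    // would hold after a SUM reduction. Assumes at least one row and
    // that all rows have the same length.
    static int[] reduceSum(int[][] perProcess) {
        int[] total = new int[perProcess[0].length];
        for (int[] histogram : perProcess) {
            for (int i = 0; i < total.length; ++i) {
                total[i] += histogram[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] perProcess = { { 1, 0, 2 }, { 0, 3, 1 } };
        int[] total = reduceSum(perProcess);
        // prints "1 3 3"
        System.out.println(total[0] + " " + total[1] + " " + total[2]);
    }
}
```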