Pthreads examples - PowerPoint PPT Presentation

About This Presentation
Title:

Pthreads examples

Description:

Given a list of cities, a matrix of distances between them, and a starting city ... Second (I, j) loop nest can be parallelized ... – PowerPoint PPT presentation

Slides: 28
Provided by: Surf6

Transcript and Presenter's Notes


1
Pthreads examples
  • Traveling Salesman (TSP) (task parallelism)
  • SOR (data parallelism)

2
TSP (Traveling Salesman)
  • Goal
  • Given a list of cities, a matrix of distances
    between them, and a starting city
  • Find the shortest tour in which all cities are
    visited exactly once
  • Example of an NP-hard search problem
  • Algorithm: branch-and-bound

3
Branching
  • Initialization
  • Go from starting city to each of remaining cities
  • Put each resulting partial path into the priority
    queue, ordered by the current length of the tour
  • Further (repeatedly)
  • Take the head element out of the priority queue
  • Expand it by each of the remaining cities
  • Put each resulting partial path into the priority
    queue

4
Finding the solution
  • Eventually, a complete path will be found.
  • Remember its length as the current shortest path.
  • Every time a complete path is found, check if we
    need to update current best path.
  • When priority queue is empty, best path is found.

5
Using a simple bound
  • Once a complete path is found, we have an upper
    bound on the length of the shortest path.
  • No use in exploring a partial path that is already
    longer than the current bound.
  • Such partial paths can be removed from the queue.

6
Sequential TSP data structure
  • Priority queue of partial paths.
  • Current best solution and its length.
  • For simplicity, we will ignore bounding.

7
Sequential TSP
  • init_q(); init_best();
  • while ((p = dequeue()) != NULL)
    • for each expansion by one city
      • q = addcity(p)
      • if (complete(q)) update_best(q)
      • else enqueue(q)
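
The loop above can be sketched in C as follows. This is a minimal illustration and not the presenter's code: the `Path` and `Queue` types, the `MAXC` limit, the binary-heap priority queue, and the function name `tsp` are all assumptions made for the example. It also assumes that a complete tour returns to the starting city. As on the slide, bounding is ignored.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MAXC 8                      /* hypothetical limit on the number of cities */

typedef struct {                    /* a partial path */
    int len;                        /* length so far */
    int count;                      /* number of cities visited so far */
    int city[MAXC];                 /* visited cities in order; city[0] is the start */
} Path;

typedef struct { Path **a; int n, cap; } Queue;   /* binary min-heap keyed on len */

static void enqueue(Queue *q, Path *p) {
    if (q->n == q->cap) {           /* grow the heap array when full */
        q->cap = q->cap ? 2 * q->cap : 16;
        q->a = realloc(q->a, q->cap * sizeof *q->a);
        assert(q->a != NULL);
    }
    int i = q->n++;
    while (i > 0 && q->a[(i - 1) / 2]->len > p->len) {   /* sift up */
        q->a[i] = q->a[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    q->a[i] = p;
}

static Path *dequeue(Queue *q) {
    if (q->n == 0) return NULL;     /* empty queue ends the search */
    Path *top = q->a[0], *last = q->a[--q->n];
    int i = 0;
    for (;;) {                      /* sift the last element down from the root */
        int c = 2 * i + 1;
        if (c >= q->n) break;
        if (c + 1 < q->n && q->a[c + 1]->len < q->a[c]->len) c++;
        if (q->a[c]->len >= last->len) break;
        q->a[i] = q->a[c];
        i = c;
    }
    q->a[i] = last;
    return top;
}

/* Length of the shortest tour that starts at `start`, visits every city
   exactly once, and (by assumption) returns to `start`. */
int tsp(int n, int dist[MAXC][MAXC], int start) {
    assert(n >= 2 && n <= MAXC && start >= 0 && start < n);
    Queue q = {0};                                /* init_q() */
    int best = INT_MAX;                           /* init_best() */
    Path *p = calloc(1, sizeof *p);
    assert(p != NULL);
    p->count = 1;
    p->city[0] = start;
    enqueue(&q, p);
    while ((p = dequeue(&q)) != NULL) {
        for (int c = 0; c < n; c++) {             /* each expansion by one city */
            int seen = 0;
            for (int k = 0; k < p->count; k++)
                if (p->city[k] == c) seen = 1;
            if (seen) continue;
            Path *np = malloc(sizeof *np);        /* q = addcity(p) */
            assert(np != NULL);
            *np = *p;
            np->len += dist[p->city[p->count - 1]][c];
            np->city[np->count++] = c;
            if (np->count == n) {                 /* complete: update_best */
                np->len += dist[c][start];        /* closing leg (assumption) */
                if (np->len < best) best = np->len;
                free(np);
            } else {
                enqueue(&q, np);
            }
        }
        free(p);
    }
    free(q.a);
    return best;
}
```

Without bounding, every partial path is expanded, so the sketch is only practical for a handful of cities.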

8
Parallel TSP Possibilities
  • Have each process do one expansion.
  • Have each process do expansion of one partial
    path
  • Have each process do expansion of multiple
    partial paths.
  • Issue of granularity/performance, not an issue of
    correctness.
  • Assumption: a thread expands one partial path at
    a time.

9
Parallel TSP: synchronization
  • True dependence between the process that puts a
    partial path in the queue and the one that takes
    it out.
  • Dependences arise dynamically.
  • Required synchronization: a process must wait
    if q is empty.

10
Parallel TSP
Dequeue: wait if q is empty. Enqueue: signal that
q is no longer empty.
  • Thread i
  • while ((p = dequeue()) != NULL)
    • for each expansion by one city
      • q = addcity(p)
      • if (complete(q)) update_best(q)
      • else enqueue(q)

11
Parallel TSP: more synchronization
  • All threads operate, potentially at the same
    time, on q and best.
  • This must not be allowed to happen.
  • Critical section: only one process can execute in
    the critical section at a time.
  • Enqueue/dequeue must be protected.
  • Update best must be protected.

12
Parallel TSP: synchronization summary
  • Need critical sections
    • In update_best
    • In enqueue/dequeue
  • In dequeue
    • Wait if q is empty
    • Terminate if all processes are waiting
  • In enqueue
    • Signal that q is no longer empty

13
Parallel TSP: mutual exclusion
  • enqueue()/dequeue()
    • pthread_mutex_lock(&queue)
    • pthread_mutex_unlock(&queue)
  • update_best()
    • pthread_mutex_lock(&best)
    • pthread_mutex_unlock(&best)
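
As a small sketch of the lock/unlock pairing for update_best, the C fragment below keeps the best tour length in a global protected by a mutex. The names `best_lock`, `best_len`, `worker`, and `run_demo` are made up for this example, and the integer "tour lengths" fed in by the demo threads are arbitrary test values, not real TSP results.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>

static pthread_mutex_t best_lock = PTHREAD_MUTEX_INITIALIZER;  /* the "best" mutex */
static int best_len = INT_MAX;                                  /* shared best length */

/* Record `len` if it beats the current best. Returns 1 if best was updated. */
int update_best(int len) {
    int updated = 0;
    pthread_mutex_lock(&best_lock);      /* enter critical section */
    if (len < best_len) {
        best_len = len;
        updated = 1;
    }
    pthread_mutex_unlock(&best_lock);    /* leave critical section */
    return updated;
}

/* Demo thread: reports 1000 candidate lengths starting at *arg. */
static void *worker(void *arg) {
    int base = *(int *)arg;
    for (int i = 0; i < 1000; i++)
        update_best(base + i);
    return NULL;
}

/* Runs four threads concurrently and returns the best length they found. */
int run_demo(void) {
    pthread_t t[4];
    int base[4] = {400, 300, 200, 100};
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&t[i], NULL, worker, &base[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return best_len;                     /* safe to read: all threads have joined */
}
```

The comparison and the assignment sit inside the same critical section. If only the assignment were protected, two threads could both pass the test and the worse result could overwrite the better one.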

14
Parallel TSP: signal/wait
  • dequeue()
    • pthread_mutex_lock(&queue)
    • while ((q is empty) and (not done))
      • waiting++
      • if (waiting == p)
        • done = true
        • pthread_cond_broadcast(&empty)
      • else
        • pthread_cond_wait(&empty, &queue)
        • waiting--
    • if (done) result = NULL
    • else result = remove head of the queue
    • pthread_mutex_unlock(&queue)
    • return result
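
The wait/signal protocol, including termination when all p threads are waiting, might look like this in C. It is a sketch under several assumptions: the queue is a fixed-size array used as a stack rather than a priority queue (ordering does not affect the synchronization), `QCAP`, `level`, `processed`, `worker`, and `run_demo` are hypothetical names, and the "work" is a toy task that splits into two smaller tasks until it reaches level 0.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QCAP 1024                        /* hypothetical fixed capacity */

static pthread_mutex_t queue = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  empty = PTHREAD_COND_INITIALIZER;
static void *items[QCAP];
static int nitems  = 0;                  /* number of queued items */
static int waiting = 0;                  /* threads currently blocked in dequeue */
static int done    = 0;                  /* set once all p threads are idle */
static int p       = 1;                  /* number of worker threads */

void enqueue(void *item) {
    pthread_mutex_lock(&queue);
    assert(nitems < QCAP);
    items[nitems++] = item;
    pthread_cond_signal(&empty);         /* q is no longer empty */
    pthread_mutex_unlock(&queue);
}

void *dequeue(void) {
    void *item = NULL;
    pthread_mutex_lock(&queue);
    while (nitems == 0 && !done) {
        waiting++;
        if (waiting == p) {              /* everyone idle: no work can appear */
            done = 1;
            pthread_cond_broadcast(&empty);
        } else {
            pthread_cond_wait(&empty, &queue);
            waiting--;
        }
    }
    if (!done)
        item = items[--nitems];
    pthread_mutex_unlock(&queue);        /* unlock on every path before returning */
    return item;
}

static int level[4] = {0, 1, 2, 3};      /* toy tasks: a level-k task spawns two level-(k-1) tasks */
static int processed = 0;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    int *v;
    (void)arg;
    while ((v = dequeue()) != NULL) {
        if (*v > 0) {                    /* "expand" the task */
            enqueue(v - 1);
            enqueue(v - 1);
        }
        pthread_mutex_lock(&count_lock);
        processed++;
        pthread_mutex_unlock(&count_lock);
    }
    return NULL;
}

/* Runs the toy workload once with `nthreads` workers; returns tasks processed. */
int run_demo(int nthreads) {
    pthread_t t[16];
    assert(nthreads >= 1 && nthreads <= 16);
    p = nthreads;
    enqueue(&level[3]);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    return processed;
}
```

A level-3 task expands into 1 + 2 + 4 + 8 = 15 tasks in total, so `run_demo` should report 15 regardless of the thread count. Because `done` is never reset, the demo is meant to be run once per process.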

15
Second application: SOR, sequential version

16
Parallel SOR
  • First (i, j) loop nest can be parallelized
  • Second (i, j) loop nest can be parallelized
  • Must wait until all processors have finished
    first loop nest before starting second
  • Must wait until all processors have finished
    second loop nest of the previous iteration before
    starting first loop nest in the next iteration.
  • Give n/p rows to each processor.
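
One way to compute which rows a given thread owns is sketched below. The `RowRange` type and the function `rows_for` are hypothetical names for this example. Handing the leftover rows to the first n % p threads is an added assumption, since the slide only covers the case where p divides n.

```c
#include <assert.h>

/* Half-open range [begin, end) of rows owned by one thread. */
typedef struct { int begin, end; } RowRange;

/* Split n rows among p threads. Thread `id` (0-based) gets n/p rows;
   the first n % p threads get one extra row when p does not divide n. */
RowRange rows_for(int id, int n, int p) {
    assert(p > 0 && id >= 0 && id < p && n >= 0);
    int base = n / p, extra = n % p;
    RowRange r;
    r.begin = id * base + (id < extra ? id : extra);
    r.end   = r.begin + base + (id < extra ? 1 : 0);
    return r;
}
```

For example, with n = 10 and p = 4 the threads own rows [0, 3), [3, 6), [6, 8), and [8, 10).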

17
Pthreads SOR: first loop

18
Pthreads SOR: second loop

19
Pthreads SOR: main program

20
Barrier synchronization
  • A barrier operation causes a thread to wait until
    all threads have reached the barrier operation.
  • At that point, all proceed.
  • Can replace repeatedly creating and destroying
    threads, at lower overhead.

21
Barrier implementation in pthreads
  • Count the number of arrivals at the barrier
  • Wait if this is not the last arrival
  • Make everyone unblock if this is the last
    arrival.
  • Use mutex to protect the count.
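
The four points above could be realized as in the following C sketch. The `Barrier` type, its `round` counter, and the demo code (`NT`, `worker`, `run_demo`) are assumptions made for this example. The round counter is an addition not mentioned on the slide: it lets the barrier be reused and protects against spurious wakeups from `pthread_cond_wait`.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;        /* protects the fields below */
    pthread_cond_t  all_here;    /* signalled when the last thread arrives */
    int nthreads;                /* number of threads that must arrive */
    int arrived;                 /* arrivals in the current round */
    int round;                   /* incremented each time the barrier opens */
} Barrier;

void barrier_init(Barrier *b, int nthreads) {
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->all_here, NULL);
    b->nthreads = nthreads;
    b->arrived = 0;
    b->round = 0;
}

void barrier_wait(Barrier *b) {
    pthread_mutex_lock(&b->lock);
    int my_round = b->round;
    if (++b->arrived == b->nthreads) {       /* last arrival: unblock everyone */
        b->arrived = 0;
        b->round++;
        pthread_cond_broadcast(&b->all_here);
    } else {
        while (b->round == my_round)         /* not last: wait for the round to end */
            pthread_cond_wait(&b->all_here, &b->lock);
    }
    pthread_mutex_unlock(&b->lock);
}

#define NT 4                                 /* threads in the demo */
static Barrier bar;
static int counter = 0, errors = 0;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int it = 1; it <= 100; it++) {
        pthread_mutex_lock(&m);
        counter++;                           /* phase 1: every thread counts once */
        pthread_mutex_unlock(&m);
        barrier_wait(&bar);
        pthread_mutex_lock(&m);
        if (counter != it * NT) errors++;    /* phase 2: all increments must be visible */
        pthread_mutex_unlock(&m);
        barrier_wait(&bar);                  /* nobody starts the next phase 1 early */
    }
    return NULL;
}

/* Returns the number of times a thread saw an incomplete phase (expected: 0). */
int run_demo(void) {
    pthread_t t[NT];
    barrier_init(&bar, NT);
    for (int i = 0; i < NT; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NT; i++)
        pthread_join(t[i], NULL);
    return errors;
}
```

POSIX also defines `pthread_barrier_t` with `pthread_barrier_wait`, which provides the same behavior on systems that implement the barrier option.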

22
Barrier implementation in pthreads

23
Parallel SOR with barriers
  • void *sor(void *arg)
    • initialization
    • for some number of iterations
      • for (i) for (j) temp
      • barrier()
      • for (i) for (j) grid
      • barrier()
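
A runnable sketch of this structure is given below. Several details are assumptions, because the slides with the actual loop bodies did not survive in the transcript: the update is taken to be the average of the four neighbors, the grid size `N`, thread count `P`, iteration count `ITERS`, and boundary values are arbitrary, and the POSIX `pthread_barrier_t` stands in for the hand-written barrier. The helper names `compute`, `copy_back`, `run_parallel`, and `run_sequential` are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose pthread_barrier_t under strict -std modes */
#endif
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define N 16        /* interior rows/columns are 1..N; 0 and N+1 are fixed boundary */
#define P 4         /* number of threads; assumed to divide N */
#define ITERS 50    /* number of iterations */

static double grid[N + 2][N + 2], temp[N + 2][N + 2];
static pthread_barrier_t barrier;

static void init_grid(void) {
    memset(grid, 0, sizeof grid);
    for (int j = 0; j < N + 2; j++)
        grid[0][j] = 100.0;                       /* arbitrary boundary: hot top edge */
}

/* First loop nest for rows [lo, hi): new values go into temp. */
static void compute(int lo, int hi) {
    for (int i = lo; i < hi; i++)
        for (int j = 1; j <= N; j++)
            temp[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                                 grid[i][j - 1] + grid[i][j + 1]);
}

/* Second loop nest for rows [lo, hi): copy temp back into grid. */
static void copy_back(int lo, int hi) {
    for (int i = lo; i < hi; i++)
        for (int j = 1; j <= N; j++)
            grid[i][j] = temp[i][j];
}

static void *sor(void *arg) {
    int id = *(int *)arg;
    int lo = 1 + id * (N / P), hi = lo + N / P;   /* this thread's n/p rows */
    for (int it = 0; it < ITERS; it++) {
        compute(lo, hi);
        pthread_barrier_wait(&barrier);           /* everyone has finished reading grid */
        copy_back(lo, hi);
        pthread_barrier_wait(&barrier);           /* everyone has finished writing grid */
    }
    return NULL;
}

/* Runs the threaded version and returns the value at the grid center. */
double run_parallel(void) {
    pthread_t t[P];
    int id[P];
    init_grid();
    pthread_barrier_init(&barrier, NULL, P);
    for (int i = 0; i < P; i++) {
        id[i] = i;
        int rc = pthread_create(&t[i], NULL, sor, &id[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < P; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    return grid[N / 2][N / 2];
}

/* Same computation on one thread, for comparison. */
double run_sequential(void) {
    init_grid();
    for (int it = 0; it < ITERS; it++) {
        compute(1, N + 1);
        copy_back(1, N + 1);
    }
    return grid[N / 2][N / 2];
}
```

Threads are created once and reused across all iterations, which is the saving over creating and joining threads around each loop nest. Each grid element is computed by the same expression in both versions, so the parallel and sequential results should match exactly.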

24
Parallel SOR with barriers: main

25
Pthreads discussion
  • Expressiveness
  • Task parallelism
  • Data parallelism
  • Ease of use
  • Explicit thread maintenance
  • Explicit synchronization among threads
  • Locks, condition variables, semaphores: where do
    we normally implement these?
  • These are the mechanisms to get the job done.

26
Pthreads discussion
  • Exposing architecture features
  • Same approach as running sequential programs:
    architecture-oblivious
  • What does it take to run threads efficiently?
  • Is it the same when multiple threads run on SMP,
    CMP, and SMT processors?
  • Independent and co-operative threads
  • Resource (cache, functional unit, load/store
    unit) contention issues
  • Are resources exposed?
  • Solution: more OS/architecture support.

27
Pthreads conclusion
  • On the software layers, pthreads provides the
    functionality of what layer?
  • Good expressiveness: supports all types of
    parallelism.
  • Bad usability: similar to directly using the UNIX
    system call interface.
  • How many people program at this level at this
    time?
  • What are people using now?
  • Need to do the same thing for pthreads
  • Assembly → C, pthreads → ???