Array Maximum Problem - PowerPoint PPT Presentation

About This Presentation
Title:

Array Maximum Problem

Slides: 63
Provided by: meirb

Transcript and Presenter's Notes

Title: Array Maximum Problem


1
Concurrent Programming
2
Model
  • The model we are discussing is a computer with
    multiple processors but only one memory unit.
  • All the processors are synchronized by the same
    clock.
  • The processors are all connected to each other
    and to the memory.
  • If more than one processor writes the same value
    to the same address in memory at the same time,
    the value is written correctly. If the values
    are not the same, any one of them may be
    written.

3
Model
  • More than one processor can read the same memory
    address at the same time.
  • Other models:
  • The processors are on different computers.
  • There is no shared memory for all the
    processors.
  • The processors are not using the same clock.

4
Array Maximum Problem
  • On a computer with one processor
  • Time: O(N).
  • Algorithm: Go over the array while keeping the
    maximum seen so far.
  • On a computer with K processors
  • Time: O(N/K), plus the cost of combining the K
    partial results.
  • Algorithm: Each processor handles N/K elements
    of the array and finds their maximum. The K
    partial maxima are then combined into the
    overall maximum.

5
Array Maximum Problem
  • On a computer with O(N) processors.
  • Time: O(log(N)).
  • Algorithm: In the first round N/2 processors
    each combine 2 items (taking their maximum; the
    same scheme with addition gives the sum), so
    after the first round we have N/2 values. In
    the next round N/4 processors each combine 2 of
    those values, so only N/4 results remain after
    the second round. After log(N) rounds a single
    value remains: the result for the whole array.

6
Array Maximum Problem

Example: 8 elements (1 2 3 4 5 6 7 8), time = 3 = log(8) rounds.
7
Array Maximum Problem
  • The number of computations performed is 7 (4 in
    the first round, 2 in the second and 1 in the
    last). This is the same number of computations
    as in the serial algorithm, but they are done
    in less time.
  • This algorithm works for many other functions,
    not just Max: for example Min and Sum (and Avg,
    computed as Sum/N).
  • It works for every associative function.
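The round structure described above can be simulated serially. The following is a minimal sketch, not the presenters' code: the function name tree_reduce and its return convention are assumptions made for illustration, and each loop iteration stands in for one processor working during a round.

```python
from typing import Callable, List, Tuple


def tree_reduce(values: List[int], op: Callable[[int, int], int]) -> Tuple[int, int, int]:
    """Simulate the pairwise rounds; return (result, rounds, operations)."""
    if not values:
        raise ValueError("need at least one value")
    level = list(values)
    rounds = 0
    operations = 0
    while len(level) > 1:
        nxt = []
        # Each pair below would be handled by its own processor in one round.
        for i in range(0, len(level) - 1, 2):
            nxt.append(op(level[i], level[i + 1]))
            operations += 1
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # an unpaired element waits for the next round
        level = nxt
        rounds += 1
    return level[0], rounds, operations


print(tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], max))
print(tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], lambda a, b: a + b))
```

For 8 elements both calls report 3 rounds and 7 operations, matching the count given for the example; only the associative operation changes.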

8
Finding The Two Greatest Numbers
  • Simple solution for O(N) processors.
  • Algorithm: Find the maximum, remove it from the
    array, and find the maximum again.
  • Time: 2 Log(N).
  • Smart algorithm for O(N) processors.
  • Algorithm:
  • First round: each processor handles 2 items,
    finds the max, and puts the other item in a
    candidate group for that max.
  • Rounds 2..log(n): each processor handles 2 of
    the results of the previous round. It compares
    the 2 Max values and takes the larger as the new
    Max. It then takes the candidate group of the
    new Max and adds the Max of the other group to
    it, forming the new candidate group.

9
Finding The Two Greatest Numbers
  • After the last round, the Max that remains is
    the maximum of the array, and the second maximum
    is the maximum of its candidate group.
  • Sample
  • Array: 7, 10, 1, 3, 100, 8, 55, 6.

10
Finding The Two Greatest Numbers
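Input:   7   10   1   3   100   8   55   6
Round 1: Max 10 (candidates 7) | Max 3 (candidates 1) | Max 100 (candidates 8) | Max 55 (candidates 6)
Round 2: Max 10 (candidates 7, 3) | Max 100 (candidates 8, 55)
Round 3: Max 100 (candidates 8, 55, 10)
Result: The maximum is the maximum of the array
(100) and the second maximum is the maximum of
the candidate group (55).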
11
Finding The Two Greatest Numbers
  • Time
  • Log(N) + LogLog(N).
  • Log(N) to find the first maximum and the
    candidate group.
  • LogLog(N) to find the maximum in the candidate
    group.
  • The candidate group grows by 1 in each round (the
    maximum of the other group), so at the end its
    size is Log(N).
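A serial simulation can make the candidate-group bookkeeping concrete. This is a sketch under assumptions: the name two_greatest and the (max, candidates) tuple layout are illustrative choices, and the final max over the candidate group is done with Python's built-in max rather than a second parallel reduction.

```python
from typing import List, Tuple


def two_greatest(values: List[int]) -> Tuple[int, int]:
    """Return (maximum, second maximum) using the candidate-group scheme."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Each entry is (current max, values that lost a direct comparison to it).
    level = [(v, []) for v in values]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):  # one processor per pair
            (m1, c1), (m2, c2) = level[i], level[i + 1]
            if m1 >= m2:
                nxt.append((m1, c1 + [m2]))  # winner keeps its group, gains the loser
            else:
                nxt.append((m2, c2 + [m1]))
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # an unpaired entry waits for the next round
        level = nxt
    best, candidates = level[0]
    return best, max(candidates)


print(two_greatest([7, 10, 1, 3, 100, 8, 55, 6]))
```

On the sample array the final candidate group is 8, 55, 10, so the call yields 100 and 55.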

12
Merge problem
  • Description: We have 2 sorted arrays B and C,
    each of size N, and we need to divide their
    elements into 2 new arrays A1 and A2 of size N,
    so that the N largest items from B and C are in
    A1 and the N smallest are in A2.
  • Simple solution: We can merge B and C into one
    sorted array A, then copy the N largest elements
    to A1 and the N smallest elements to A2. But
    this algorithm cannot use multiple processors;
    the cost is still O(N).

13
Merge problem
  • Smart algorithm for O(N) processors.
  • Processor i compares B[i] with C[n+1-i]; the
    larger of the two goes to A1 and the other to
    A2.
  • Correctness proof (B and C are sorted in
    ascending order).
  • If B[i] > C[n+1-i], then B[i] > B[1]..B[i-1]
    and B[i] > C[n+1-i] > C[1]..C[n-i], so B[i] is
    larger than N elements (i - 1 from B and
    n - i + 1 from C), so B[i] needs to be in A1.
  • If C[n+1-i] > B[i], then C[n+1-i] is larger than
    N elements (n - i from C and i from B), so
    C[n+1-i] needs to be in A1.

14
Merge problem
  • Example:
    B = 1, 8, 10, 17
    C = 9, 12, 67, 100
    Pairs compared: (B[1], C[n]), (B[2], C[n-1]),
    (B[3], C[n-2]), (B[4], C[n-3]).
    A1 = 100, 67, 12, 17.
    A2 = 1, 8, 10, 9.
  • Time: We can do all the comparisons at the same
    time, so the cost is O(1).
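The pairing rule is easy to check with a short sketch. The function name split_largest is an assumption made for illustration; indexes are 0-based here, so the pair handled by processor i is b[i] and c[n-1-i], and each loop iteration is independent of the others (one processor each).

```python
from typing import List, Tuple


def split_largest(b: List[int], c: List[int]) -> Tuple[List[int], List[int]]:
    """Split two ascending arrays of size n into (n largest, n smallest)."""
    n = len(b)
    if len(c) != n:
        raise ValueError("both arrays must have the same size")
    a1 = [0] * n
    a2 = [0] * n
    for i in range(n):  # processor i: a single comparison, O(1) time
        x, y = b[i], c[n - 1 - i]
        a1[i] = max(x, y)  # the larger of the pair goes to A1
        a2[i] = min(x, y)  # the other goes to A2
    return a1, a2


print(split_largest([1, 8, 10, 17], [9, 12, 67, 100]))
```

Note that A1 and A2 come out unsorted; the algorithm only guarantees which elements land in which array.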

15
Prefix Problem
  • Description: Find the sums of all the prefixes of
    the elements:
    S11 = X1
    S12 = X1 + X2
    ...
    S1n = X1 + X2 + ... + Xn-1 + Xn
  • Simple solution: Compute each sum separately
    with N processors. Time O(N Log N): N sums, where
    each one takes O(Log N).

16
Prefix Problem
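  • Algorithm
  • for i = 0 to n-1 doip
  •     S[i] = X[i]
  • for j = 0 to log(n) do
  •     for i = 2^j to n-1 doip
  •         S[i] = S[i] + S[i - 2^j]
  • "doip" means "do in parallel" on the different
    processors: in each round all processors read
    the old values of S before any of them writes.
  • At the end the results are in the array S.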

17
Prefix Problem
  • Example: With 8 numbers X1..X8, where Sij is
    Xi + Xi+1 + ... + Xj.

Input:        X1  X2  X3  X4  X5  X6  X7  X8
After j = 0:  S11 S12 S23 S34 S45 S56 S67 S78
After j = 1:  S11 S12 S13 S14 S25 S36 S47 S58
After j = 2:  S11 S12 S13 S14 S15 S16 S17 S18
18
Prefix Problem
  • Time: each round doubles the number of final
    results S1i, so after log(n) rounds we have all
    the results.
  • In order to use this algorithm each processor
    needs to be connected to log(n) other processors.
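The doubling scheme can be simulated round by round. This is a minimal sketch, assuming the name parallel_prefix_sums and a returned round counter (both illustrative). The snapshot copy stands in for the synchronous behaviour of the model, where every processor reads the old values before any write of the same round lands.

```python
from typing import List, Tuple


def parallel_prefix_sums(x: List[int]) -> Tuple[List[int], int]:
    """Return (prefix sums, number of rounds) using the doubling scheme."""
    n = len(x)
    s = list(x)  # round 0: S[i] = X[i]
    step = 1     # step plays the role of 2^j
    rounds = 0
    while step < n:
        old = list(s)  # snapshot: reads happen before writes in a round
        for i in range(step, n):  # one processor per index i
            s[i] = old[i] + old[i - step]
        step *= 2
        rounds += 1
    return s, rounds


print(parallel_prefix_sums([1, 2, 3, 4, 5, 6, 7, 8]))
```

Without the snapshot, a serial loop would read values already updated in the same round and produce wrong sums, which is why the copy is kept explicit.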

19
Prefix Problem
  • Usage example
  • Problem: we have an arithmetic expression and we
    need to test whether its bracket arrangement is
    legal.
  • Algorithm: create an array X holding 1 for each
    ( and -1 for each ), and run the prefix
    algorithm on it. The arrangement is legal when
    S11 = 1, all of S11..S1n-1 are >= 0 (no bracket
    is closed before it is opened), and S1n = 0.
  • Time with N processors: O(log N): log(N) for the
    prefix algorithm and O(1) for the test.
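The bracket test can be sketched in a few lines. The function name brackets_balanced is an assumption, and itertools.accumulate is used here as a serial stand-in for the parallel prefix algorithm. The check applied is that no prefix sum is negative and the last one is zero.

```python
from itertools import accumulate


def brackets_balanced(expr: str) -> bool:
    """Check the bracket arrangement of expr using prefix sums."""
    # +1 for each "(" and -1 for each ")"; other characters are ignored.
    x = [1 if ch == "(" else -1 for ch in expr if ch in "()"]
    prefix = list(accumulate(x))  # serial stand-in for the prefix algorithm
    if not prefix:
        return True  # no brackets at all
    # Each test below is independent per index, so it is O(1) in parallel.
    return all(p >= 0 for p in prefix) and prefix[-1] == 0


print(brackets_balanced("(a+b)*(c-d)"))
print(brackets_balanced("a)+(b"))
```

The first call reports a legal arrangement and the second an illegal one, because its first prefix sum is already negative.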

20
Partition Problem
  • Description: We have an array X in which some of
    the elements are marked ("signed"). We need to
    move all the marked elements to one array and
    the unmarked elements to another array.
  • Simple solution: We take 2 stacks and push the
    marked elements onto one stack and the unmarked
    ones onto the other. It takes O(N) time.
  • Simple solution 2: We take two indexes, one at
    the start of the array and one at the end. The
    first searches for a marked element and the
    second for an unmarked one; when both have found
    one they exchange the items they point to and
    move on, until they meet. This takes O(N) time
    too, but it is more parallel.

21
Partition Problem
  • Smart algorithm for O(N) processors
  • Create a new array B where B[i] = 1 if element i
    is marked (signed), else B[i] = 0.
  • Create an array C with the prefix sums of B,
    that is C[i] = B[1] + B[2] + ... + B[i].
  • If X[i] is marked then Y1[C[i]] = X[i].
  • If X[i] is not marked then Y2[i - C[i]] = X[i].

22
Partition Problem
  • Example (the marked elements are 4, 3, 10, 15):

X  = 2, 4, 7, 8, 1, 3, 10, 12, 15
B  = 0, 1, 0, 0, 0, 1, 1,  0,  1
C  = 0, 1, 1, 1, 1, 2, 3,  3,  4
Y1 = 4, 3, 10, 15
Y2 = 2, 7, 8, 1, 12
23
Partition Problem
  • Time with O(N) processors:
    Computing B: O(1).
    Computing C: O(log(n)) using the prefix algorithm.
    Computing Y1 and Y2: O(1).
    Total: O(log(n)).
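The index arithmetic of the partition is easiest to see in code. This is a sketch under assumptions: the name partition_marked and the boolean-flag input are illustrative, and itertools.accumulate again stands in for the parallel prefix algorithm. Python lists are 0-based, so the target positions are shifted by one relative to the 1-based description.

```python
from itertools import accumulate
from typing import List, Tuple


def partition_marked(x: List[int], marked: List[int]) -> Tuple[List[int], List[int]]:
    """Split x into (marked elements, unmarked elements), keeping order."""
    b = [1 if m else 0 for m in marked]  # O(1) with one processor per element
    c = list(accumulate(b))              # prefix sums of B
    total = c[-1] if c else 0
    y1 = [0] * total
    y2 = [0] * (len(x) - total)
    for i in range(len(x)):  # iterations are independent: one processor each
        if b[i]:
            y1[c[i] - 1] = x[i]  # 1-based slot C[i] is 0-based slot C[i] - 1
        else:
            y2[i - c[i]] = x[i]  # 1-based slot (i+1) - C[i] is 0-based i - C[i]
    return y1, y2


x = [2, 4, 7, 8, 1, 3, 10, 12, 15]
flags = [0, 1, 0, 0, 0, 1, 1, 0, 1]
print(partition_marked(x, flags))
```

Because every element computes its own destination from C alone, no two processors ever write to the same slot.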

24
Sorting Algorithm
  • Description: Sort array A using O(N^2)
    processors and put the result into array C.
  • Simple algorithm: A serial comparison-based
    sorting algorithm takes at least O(N log(N))
    time.
  • Smart algorithm
  • Create a matrix B of size N x N and initialize
    all its cells with zeroes.
  • We look at the N^2 processors as a matrix of
    processors. Processor P[i,j] tests whether
    A[i] >= A[j]; if true then B[i,j] = 1. (The
    elements of A are assumed to be distinct.)
25
Sorting Algorithm
  • For each i from 1 to N: C[Sum(i)] = A[i], where
    Sum(i) is the sum of B[i,1] to B[i,N].
  • Example: A = 3, 5, 2, 9, 1
    Matrix B (row i, column j):
          1  2  3  4  5
      1   1  0  1  0  1
      2   1  1  1  0  1
      3   0  0  1  0  1
      4   1  1  1  1  1
      5   0  0  0  0  1

26
Sorting Algorithm
  • C = 1, 2, 3, 5, 9.
  • Time
  • Using O(N^2) processors, finding the matrix B
    takes O(1) and finding C costs O(log(N)).
  • So the total cost of the algorithm is
    O(log(N)).
  • Using O(N) processors, finding B takes O(N)
    time and finding C takes O(N) time, so the
    total is O(N).
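The comparison-matrix sort can be sketched as follows. The name rank_sort is an assumption, the elements are assumed to be distinct, and the comparison used is "greater than or equal" so that each row sum is the 1-based position of its element in the sorted output.

```python
from typing import List


def rank_sort(a: List[int]) -> List[int]:
    """Sort distinct values by counting, for each one, how many it is >= to."""
    n = len(a)
    # b[i][j] = 1 if a[i] >= a[j]; each cell is one processor's comparison.
    b = [[1 if a[i] >= a[j] else 0 for j in range(n)] for i in range(n)]
    c = [0] * n
    for i in range(n):
        rank = sum(b[i])    # a log(N)-time parallel sum in the model
        c[rank - 1] = a[i]  # 1-based rank to 0-based position
    return c


print(rank_sort([3, 5, 2, 9, 1]))
```

For the example array the row sums are 3, 4, 2, 5, 1, which places the elements as 1, 2, 3, 5, 9. With duplicate values two elements would share a rank, so a tie-breaking rule (for instance on the index) would be needed.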

27
Sorting Algorithm
  • Description: Sort array A using O(N^2)
    processors and put the result into array C.
  • Algorithm: Merge sort. The largest cost in the
    merge sort algorithm is the cost of the merge.
    Using a serial algorithm the cost of merging 2
    sorted arrays is O(N) and the cost of the merge
    sort algorithm is O(N log(N)). We will use the
    regular algorithm but with a smarter merge
    algorithm.

28
Sorting Algorithm
  • Smart merge algorithm
  • Description: We need to merge two sorted arrays
    A, B into a sorted array R.
  • Algorithm: We describe a recursive algorithm
    Merge.
    C = Merge(odd(A), even(B)).
    D = Merge(even(A), odd(B)).
    Here odd(A) is all the items in A with an odd
    index, and even(A) is all the items in A with an
    even index (indexes start at 1).

29
Sorting Algorithm
  • Let C = C0, C1, C2, ..., Cn and
    D = D0, D1, D2, ..., Dn. Build
    E = C0, D0, C1, D1, ..., Cn, Dn.
    Compare each pair Ci, Di, and if Ci > Di then
    swap Ci and Di in array E.
    Array E is then the merge of C and D.

30
Sorting Algorithm
  • Example:
    A = 3, 5, 8, 10    B = 4, 7, 9, 12
    Even(A) = 5, 10    Odd(A) = 3, 8
    Even(B) = 7, 12    Odd(B) = 4, 9
    C = 3, 7, 8, 12
    D = 4, 5, 9, 10
    E = 3, 4, 7, 5, 8, 9, 12, 10
    After swapping in E:
    E = 3, 4, 5, 7, 8, 9, 10, 12
  • Time: Using O(N) processors the merge takes
    O(log(N)) time. The merge sort runs the merge
    algorithm log(N) times, so the total cost of the
    merge sort is O(log^2(N)).
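The recursive merge can be sketched serially. This is an illustration under assumptions, not the presenters' implementation: the name oe_merge is hypothetical, both inputs are assumed to be ascending and of the same power-of-two length, and the odd/even split follows the worked example (1-based odd positions are Python indexes 0, 2, ...).

```python
from typing import List


def oe_merge(a: List[int], b: List[int]) -> List[int]:
    """Merge two ascending lists of equal power-of-two length."""
    n = len(a)
    if n != len(b) or n == 0 or n & (n - 1):
        raise ValueError("inputs must have the same power-of-two length")
    if n == 1:
        return [min(a[0], b[0]), max(a[0], b[0])]
    odd_a, even_a = a[0::2], a[1::2]  # 1-based odd / even positions of A
    odd_b, even_b = b[0::2], b[1::2]  # 1-based odd / even positions of B
    c = oe_merge(odd_a, even_b)  # the two recursive merges are independent,
    d = oe_merge(even_a, odd_b)  # so they can run in parallel
    e = []
    for ci, di in zip(c, d):  # one processor per pair (Ci, Di)
        e.extend((ci, di) if ci <= di else (di, ci))  # swap if out of order
    return e


print(oe_merge([3, 5, 8, 10], [4, 7, 9, 12]))
```

On the example arrays the intermediate results are C = 3, 7, 8, 12 and D = 4, 5, 9, 10, and the final compare-and-swap pass yields the fully merged array.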

31
Find Algorithm
  • Description: If array X contains the value Val
    then Res needs to be True, else Res needs to be
    False.
  • Simple algorithm: Using a serial algorithm it
    takes O(N) time.
  • Smart algorithm: Using O(N) processors.
    Res = False. Each processor i tests if
    X[i] = Val; if true, Res = True. (Several
    processors may write True at the same time,
    which the model allows.)
  • Time: O(1).

32
Model Description
  • Many processors.
  • Processors can send messages to each other
    through communication channels.
  • We want each processor to have a unique
    identifier (Id).
  • Since we have O(n) processors we need O(log n)
    bits to represent the Id.

33
Model Description
  • Clean Net: a network in which a processor does
    not know anything about its neighbors, not even
    their Ids. It only knows how many neighbors it
    has.
  • We will explicitly mention when we are dealing
    with a Clean Net; otherwise every processor has
    a unique Id.

34
Model Description
  • A message should include the sender and receiver
    Ids and some information: O(log n) bits in total.
  • If X wants to send a message to Y through Z, it
    costs 2 steps to send the message.

X -> Z -> Y
35
Model Description
  • Local computation does not take time.
  • We will analyze:
    time complexity - the number of steps the
    algorithm takes in the worst case.
    communication complexity - the total number of
    messages sent during the execution of the
    algorithm in the worst case.

36
Distributed vs. Sequential
  • Communication - needed in the distributed model
    but not in the sequential one.
  • Partial knowledge - together the processors know
    everything, but no single processor necessarily
    knows everything.
  • Processors or communication channels can be
    down.

37
Distributed vs. Sequential
  • Synchronization - we need to synchronize the
    processors.

38
Synchronous Model
  • There is a global clock.
  • In any clock cycle each of the processors can:
    - send messages to its neighbors.
    - receive messages from its neighbors.
    - make local computations in 0 time.
    - change state.

39
Asynchronous Model
  • There is no global clock.
  • If a message was sent it will eventually arrive
    at its destination (there are no failures), but
    we cannot assume anything about the arrival time.
  • We measure time from the beginning of the
    execution until the last processor has stopped.

40
Asynchronous Model
  • For time complexity calculations we assume that
    every message arrives within one time unit in
    the worst case.

41
Model Representation
  • We can represent the network of processors as a graph.
  • Each node in the graph is a processor.
  • There is an edge between two nodes if there is a
    direct communication channel between the two
    processors they represent.

42
Complexity
  • C(P, G, I) - communication complexity: the total
    number of messages sent during the execution in
    the worst case.
  • T(P, G, I) - time complexity: the number of clock
    cycles that the execution takes in the worst
    case.
  • Where P is the protocol, G is the graph and I is
    the input.

43
Complexity - examples
  • The following examples are on a complete graph
    with nodes 1, 2, ..., n.
44
Complexity - example 1
  • Protocol A: node 1 sends the message m to node 2.
  • C(A, G, I) = 1.
  • T(A, G, I) = 1.
45
Complexity - example 2
  • Protocol B: node 1 sends the message mi to
    node i, for every node i in G.
  • C(B, G, I) = n.
  • T(B, G, I) = 1.
46
Complexity - example 3
  • Protocol C: node i sends the message mi to node
    i+1, for every node i in G.
  • C(C, G, I) = n.
  • T(C, G, I) = 1.
47
Complexity - example 4
  • Protocol D: node i sends the message m to node
    i+1 in cycle i (node 1 to node 2, then node 2 to
    node 3, and so on).
  • C(D, G, I) = n.
  • T(D, G, I) = n.
48
Transmission Problem
  • Input: there is a message m in the node V0.
  • Output: the message m is written in all the
    nodes of the graph.
  • dG(x,y) - the length of the shortest path from x
    to y in graph G.
  • D = Diameter(G) = max over x,y in V of dG(x,y).
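The diameter definition can be made concrete with breadth-first search. This is a small sketch under assumptions: the graph is given as an adjacency dictionary of a connected, undirected network, and the names bfs_distances and diameter are illustrative.

```python
from collections import deque
from typing import Dict, Hashable, List

Graph = Dict[Hashable, List[Hashable]]


def bfs_distances(graph: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(graph: Graph) -> int:
    """Largest shortest-path length over all pairs of nodes."""
    return max(max(bfs_distances(graph, v).values()) for v in graph)


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(diameter(path))
```

A path of 4 nodes has diameter 3, while a complete graph on any number of nodes has diameter 1.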

49
Algorithms for the Transmission Problem
  • Direct Delivery.
  • Spanning Tree.
  • DFS.
  • Flooding.

50
Direct Delivery
  • Based on the assumptions:
    - there is a routing system such that messages
      are sent along the shortest path.
    - V0 knows the addresses of all the other nodes
      in the graph.
  • V0 sends the message m n-1 times, each time to a
    different node.
51
DD Communication Complexity
  • V0 sends n-1 messages.
  • Each message takes O(D) steps.
  • C(DD, G, I) = O(nD).

52
DD Time Complexity
  • Under the assumptions:
    1. synchronous model.
    2. V0 sends one new message in every clock
       cycle.
  • There will be no collisions between messages,
    because messages travel along shortest paths,
    and therefore we cannot have more than one
    message at a given distance from V0.

53
DD Time Complexity
  • The last message is sent in cycle n-1.
  • It takes O(D) steps for the last message to
    arrive.
  • T(DD, G, I) = O(n + D).

54
DD Time Complexity
  • We can show the same time complexity even without
    assumption 2.
  • If two messages in a node compete for the same
    edge, we send first the message destined for the
    node with the smaller Id.
  • The message for node i, at time t, must be at
    distance t - i + 1 from V0 (or already in Vi).

55
Spanning Tree
  • Assumptions: We have a spanning tree in the
    graph that all the nodes are aware of (each node
    knows which of its edges are part of the
    spanning tree).
  • Each node that receives the message sends it on
    along its spanning tree edges.

56
Spanning Tree Complexity
  • We send the message once over each spanning tree
    edge.
  • C(ST) = n-1.
  • We need tree-depth rounds until the last node
    receives the message.
  • T(ST) = O( Depth( tree, V0 ) ).
  • If we choose a BFS tree, T(ST) = O(D).

57
Building a Spanning Tree
  • If we do not have a spanning tree, we can build
    one using any algorithm A for Transmission.
  • Execute algorithm A.
  • Each node V chooses as its parent the node W
    from which it received the message for the first
    time.

58
Building a Spanning Tree
  • V informs W that W is its parent.
  • The edge E(W,V) is marked as a spanning tree
    edge.
  • Since the transmission algorithm delivers the
    message to all nodes, we know that all the nodes
    are in the spanning tree.
  • We have no cycles since each node V chooses only
    one parent.
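The parent-selection rule can be sketched as a synchronous simulation in which the transmission algorithm is a round-by-round broadcast. The function name build_spanning_tree, the adjacency-dictionary input, and the choice of broadcast order are assumptions for illustration; any transmission algorithm would serve.

```python
from typing import Dict, Hashable, List, Optional

Graph = Dict[Hashable, List[Hashable]]


def build_spanning_tree(graph: Graph, root: Hashable) -> Dict[Hashable, Optional[Hashable]]:
    """Return a parent map; each node's parent is its first sender."""
    parent: Dict[Hashable, Optional[Hashable]] = {root: None}
    frontier = [root]
    while frontier:  # one loop iteration per clock cycle
        nxt = []
        for w in frontier:
            for v in graph[w]:
                if v not in parent:  # v receives the message for the first time
                    parent[v] = w    # v picks w as parent; edge (w, v) is a tree edge
                    nxt.append(v)
        frontier = nxt
    return parent


square = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(build_spanning_tree(square, 0))
```

Every node other than the root gets exactly one parent, so a connected graph with n nodes yields n-1 tree edges and no cycles.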

59
DFS
  • We traverse the graph in DFS order.
  • If we reach a new node we leave a copy of the
    message, mark the node and continue the
    traversal.
  • If we reach a marked node we go back.

60
DFS Complexity
  • In the DFS algorithm we move on each edge exactly
    twice.
  • C(DFS) = T(DFS) = O(|E|).
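A token-passing traversal can be simulated to count edge crossings. This is a rough sketch under assumptions: the name dfs_broadcast is hypothetical, the token never probes the edge it arrived on, and a non-tree edge may be probed from both of its endpoints. In this simple variant some edges are therefore crossed four times rather than twice, but the total stays proportional to the number of edges.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

Graph = Dict[Hashable, List[Hashable]]


def dfs_broadcast(graph: Graph, root: Hashable) -> Tuple[Set[Hashable], int]:
    """Return (nodes holding the message, number of edge crossings)."""
    marked = {root}
    moves = 0

    def visit(u: Hashable, parent: Optional[Hashable]) -> None:
        nonlocal moves
        for v in graph[u]:
            if v == parent:
                continue       # do not probe the edge we arrived on
            moves += 1         # token moves u -> v
            if v not in marked:
                marked.add(v)  # leave a copy of the message at v
                visit(v, u)
            moves += 1         # token goes back v -> u

    visit(root, None)
    return marked, moves


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(dfs_broadcast(path, 0))
```

On a tree, such as the 4-node path above, every edge is crossed exactly twice, so the count is 2(n-1) = 6. The recursion depth equals the length of the longest traversal path, which is fine for small examples.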

61
Flooding
  • Each node that receives the message for the first
    time sends it to all of its neighbors.
  • When a node receives the message again later, it
    just discards it.
  • Flooding is also effective in a Clean Net.

62
Flooding Complexity
  • The message passes over each edge twice, once in
    each direction.
  • C(Flood) = O(|E|).
  • After t time units the message has reached all
    the nodes whose distance from V0 is smaller than
    or equal to t.
  • T(Flood) = O(D).
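Both flooding bounds can be observed in a synchronous simulation. This is a minimal sketch, assuming an adjacency-dictionary graph and the hypothetical name flood; a node that learns the message sends it to all of its neighbors in the next clock cycle, including the one it heard it from.

```python
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Hashable]]


def flood(graph: Graph, source: Hashable) -> Tuple[int, int]:
    """Return (cycles until every node is informed, total messages sent)."""
    informed = {source}
    frontier = [source]  # nodes that received the message in the last cycle
    messages = 0
    rounds = 0
    while frontier:
        received = set()
        for u in frontier:
            for v in graph[u]:
                messages += 1  # u sends to every neighbor
                received.add(v)
        # Nodes that already had the message simply discard the copy.
        frontier = [v for v in received if v not in informed]
        informed.update(frontier)
        if frontier:
            rounds += 1
    return rounds, messages


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(flood(path, 0))
```

On the 4-node path started from one end, the message needs 3 cycles (the distance to the farthest node) and 6 messages (twice the 3 edges).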