An Associative Program for the MST Problem

Provided by: ObertaASl6
Learn more at: https://www.cs.kent.edu

1
An Associative Program for the MST Problem
  • Part 2 of Associative Computing

2
Overview
  • In this set of slides, we will explore an
    alternate associative algorithm for the minimal
    spanning tree (MST) problem.
  • Only slides with light blue titles will be
    covered in class.
  • The other slides are reference slides so that
    students can obtain an overview of the ASC
    language.
  • As mentioned earlier, Professor Potter developed
    an associative programming language called ASC
    and a simulator for this language.
  • ASC has also been implemented on 3-4 SIMD
    computers.
  • We will treat the ASC code for the MST included
    here as a detailed pseudocode description of this
    algorithm.
  • The goal of this set of slides is to prepare
    students to write a Cn (ClearSpeed) program for
    this algorithm.

3
Content Covered in Light Blue
  • References
  • The MST example and background
  • Variables and Data Types
  • Operator Notation
  • Input and Output
  • Mask Control Statements
  • Loop Control Statements
  • Accessing Values in Parallel Variables
  • Performance Monitor
  • Subroutines and other topics
  • Basic Program Structure
  • Software Location Execution Procedures
  • An Online ASC Program Data File to Execute
  • ASC Code for the MST algorithm for a directed
    graph
  • The Shortest Path homework problem

4
References
  • ASC Primer by Professor Jerry Potter is the
    primary reference for basic ASC
  • A copy is posted on lab website under software
  • Lab website is at www.cs.kent.edu/parallel/
  • Associative Computing book by Jerry Potter has
    a lot of additional information about the ASC
    language.
  • Both references use a directed-graph version of
    the Minimal Spanning Tree as an important
    example.

5
Features of Potter's MST Algorithm
  • Both versions of MST are based on Prim's
    sequential MST algorithm
  • Found in most algorithm books (e.g., see Baase, et al.
    in references)
  • A drawback of Potter's version is that it
    requires 1 PE for each graph edge, which in the worst
    case is n^2 - n = Θ(n^2)
  • Unlike the earlier MST algorithm, this is not an
    optimal cost parallel algorithm
  • An advantage is that it works for undirected
    graphs
  • The earlier MST algorithm covered might
    possibly be extended to work for directed graphs.
  • Uses less memory for most graphs than the earlier
    algorithm
  • True especially for sparse graphs
  • Often will require a total of only O(n) memory
    locations, since the memory required for each PE
    is a small constant.
  • In the worst case, at most O(n^2) memory locations
    are needed.
  • The earlier algorithm always requires Θ(n^2) memory
    locations, as each PE stores a row of the
    adjacency matrix.

6
Representing the Graph in a Procedural Language
  • We need to find edges that are incident to a node
    of the graph. What kind of data structure could
    be used to make this easy? Typically there are
    two choices.
  • An adjacency matrix
  • Label the rows and columns with the node names.
  • Put the weight w in row i and column j if node i
    is joined to node j by an edge of weight w.
  • Doing this, we would represent the graph in the
    problem as follows ...

7
Graph Example for MST
8
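Adjacency Matrix For Preceding Graph
     A  B  C  D  E  F  G  H  I
A    -  2  -  -  -  7  3  -  -
B    2  -  4  -  -  -  6  -  -
C    -  4  -  2  -  -  -  2  -
D    -  -  2  -  1  -  -  8  -
E    -  -  -  1  -  6  -  -  2
F    7  -  -  -  6  -  -  -  5
G    3  6  -  -  -  -  -  3  1
H    -  -  2  8  -  -  3  -  4
I    -  -  -  -  2  5  1  4  -
(- means no edge)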
9
An Alternative Useful Representation for
Sequential Algorithms
  • Another possibility is to use adjacency lists,
    which can allow some additional flexibility for
    this problem in representing the rest of the data,
    namely the sets V1, V2, and V3.
  • It is in this type of representation that we
    see pointers or references play a role.
  • We link off of each node all of the nodes which
    are adjacent to it, keeping them in increasing
    order by label.

10
Adjacency Lists for the Graph in the Problem
A → (B, 2) → (F, 7) → (G, 3)
B → (A, 2) → (C, 4) → (G, 6)
C, D, E, F, G, H, I: etc.
G, H, and I will have 4 entries; all
others have 3. In each list, the nodes are in
increasing order by node label. Note: if the node
label ordering is clear, the A, B, ... need not
be stored.
11
Adding the Other Information Needed While Finding
the Solution
  • Consider one of the states during the run, right
    after the segment AG is selected

[Figure: the graph after AG is selected, showing
nodes A, B, C, F, G, H, and I with the sets V1 and
V2 marked.]
How will this data be maintained?
12
  • Cont.
  • We need to know
  • Which set each node is in.
  • What each node's parent is in the tree formed
    below by both collections.
  • A list of the candidate nodes.
13
A Typical Data Structure Used For This Problem Is
Shown
V2lnk = I

node     A  B  C  D  E  F    G  H  I
link     -  A  F  -  -  nil  F  C  H
weight   -  2  4  -  -  7    3  3  1
parent   -  A  B  -  -  A    A  G  G
set      1  1  2  3  3  2    1  2  2

V2 elements are linked via the yellow (link)
entries, with V2lnk the head and nil the tail:
I → H → C → F. Light blue boxes appeared in earlier
states, but are no longer in use. Red entries (set)
say what set the node is in. Green entries give the
parent of the node and orange entries give edge
weights. The adjacency lists are not shown, but
are linked off to the right.
14
I Is Now Selected and We Update
V2lnk = I

node     A  B  C  D  E  F    G  H  I
link     -  A  F  -  -  nil  F  C  H
weight   -  2  4  -  -  7    3  3  1
parent   -  A  B  -  -  A    A  G  G
set      1  1  2  3  3  2    1  2  1

I is now in V1, so change its set value to 1. Look
at the nodes adjacent to I (E, F, G, H) and add
them to V2 if they are in V3. E is added ...
15
E was Just Added to V2
V2lnk = E

node     A  B  C  D  E  F    G  H  I
link     -  A  F  -  H  nil  F  C  H
weight   -  2  4  -  -  7    3  3  1
parent   -  A  B  -  -  A    A  G  G
set      1  1  2  3  2  2    1  2  1

Store I's link H in E's position and E in V2lnk.
This makes I's entry unreachable. So V2 is now
E → H → C → F. Now we have to add relevant edges
for I to any node in V2.
16
Add Relevant Edges For I to V2 Nodes
V2lnk = E

node     A  B  C  D  E  F    G  H  I
link     -  A  F  -  H  nil  F  C  H
weight   -  2  4  -  2  5    3  3  1
parent   -  A  B  -  I  I    A  G  G
set      1  1  2  3  2  2    1  2  1

Walk I's adjacency list: E → F → G → H. E was
just added, so select EI with weight 2. wgt(FI) =
5 < wgt(FA) = 7, so drop FA and add FI (see
black blocks). G is in V1, so don't add GI. wgt(HI)
= 4 > wgt(HG) = 3, so no change. This is now
ready for the next round.
17
Complexity Analysis (Time and Space) for Prim's
Sequential Algorithm
  • Assume
  • the preceding data structure is used.
  • The number of nodes is n
  • The number of edges is m
  • Space used is 4n plus the space for adjacency
    lists.
  • The adjacency lists are Θ(m), which in the worst case
    is Θ(n^2)
  • This data structure sacrifices space for time.
  • Time is Θ(n^2) in the worst case.
  • The adjacency list of each node is traversed only
    once, when it is added to the tree. The total work of
    comparing weights and updating the chart during
    all of these traversals is Θ(m).
  • There are n-1 rounds, as one tree node is
    selected each round.
  • Walking the V2 list to find the minimum could
    require n-1 steps the first round, n-2 the
    second, etc., for a max of Θ(n^2) steps
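
The rounds analyzed above can be sketched in Python. This is only an illustrative sketch, not the course's code: the edge list is an assumption read off the adjacency matrix and adjacency-list slides, the names `EDGES`, `build_adjacency`, and `prim` are hypothetical, and a plain linear scan stands in for walking the V2 list.

```python
# Undirected weighted edges of the example graph (assumed from the slides).
EDGES = [("A", "B", 2), ("A", "F", 7), ("A", "G", 3), ("B", "C", 4),
         ("B", "G", 6), ("C", "D", 2), ("C", "H", 2), ("D", "E", 1),
         ("D", "H", 8), ("E", "F", 6), ("E", "I", 2), ("F", "I", 5),
         ("G", "H", 3), ("G", "I", 1), ("H", "I", 4)]


def build_adjacency(edges):
    """Build adjacency lists, each kept in increasing order by node label."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    for node in adj:
        adj[node].sort()
    return adj


def prim(adj, root):
    """Prim's algorithm for a connected graph; returns (parent, node, weight) edges."""
    # Set membership per node: 1 = in tree (V1), 2 = candidate (V2), 3 = unseen (V3).
    state = {node: 3 for node in adj}
    parent, weight = {}, {}
    state[root] = 1
    tree = []
    current = root
    for _ in range(len(adj) - 1):            # n-1 rounds
        for nbr, w in adj[current]:          # walk the new tree node's list once
            if state[nbr] == 3 or (state[nbr] == 2 and w < weight[nbr]):
                state[nbr], parent[nbr], weight[nbr] = 2, current, w
        candidates = [n for n in adj if state[n] == 2]
        current = min(candidates, key=lambda n: weight[n])   # scan V2 for the minimum
        state[current] = 1
        tree.append((parent[current], current, weight[current]))
    return tree


mst = prim(build_adjacency(EDGES), "A")
total_weight = sum(w for _, _, w in mst)
```

Starting from A, the first two edges selected are AB and AG, which agrees with the state shown on the earlier slides.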

18
Alternate ASC Implementation of Prim's Algorithm
using this Approach
  • After setting up a data structure for the
    problem, we now need to code it by manipulating
    each state as we did on the preceding slides.
  • ASC model provides an easier approach.
  • Recall that ASC does NOT support pointers or
    references.
  • The associative searching replaces the need for
    these.
  • Recall, we collectively think of the PE processor
    memories as a rectangular structure consisting of
    multiple records.
  • We will next introduce basic features of the ASC
    language in order to implement this algorithm.

19
Structuring the MST Data for ASC
  • There are 15 bidirectional edges in the graph, or
    30 directed edges in total.
  • Each directed edge will have a head and a tail.
  • So, the bidirectional edge AB will be represented
    twice: once as having head A and tail B, and
    once as having head B and tail A.
  • We will use 30 processors, and in each PE's
    memory we will store an edge representation as
    head, tail, weight, state
  • State is 0, 1, 2, or 3 and will be explained
    shortly.
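
As a rough illustration of this layout, the sketch below turns each undirected edge into two directed records, one per PE. The names `EdgeRecord` and `to_records` are hypothetical, and the initial state value 0 is an assumption, since the meaning of state is explained later.

```python
from dataclasses import dataclass


@dataclass
class EdgeRecord:
    """One PE's record: a directed edge plus its state field."""
    head: str
    tail: str
    weight: int
    state: int = 0   # assumed initial value; the slides explain state later


def to_records(undirected_edges):
    """Represent each undirected edge twice, once in each direction."""
    records = []
    for u, v, w in undirected_edges:
        records.append(EdgeRecord(head=u, tail=v, weight=w))
        records.append(EdgeRecord(head=v, tail=u, weight=w))
    return records


sample = [("A", "B", 2), ("A", "G", 3)]
records = to_records(sample)
```

With the 15 edges of the example graph this yields 30 records, matching the 30 processors mentioned above.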
20
ASC Data Types and Variables
  • ASC has nine data types
  • int (i.e., integer), real, hex (i.e., base 16),
    oct (i.e., base 8), bin (i.e., binary), card
    (i.e., cardinal), char (i.e., character),
    logical, index.
  • Card is used for unsigned integer data.
  • Variables can be either scalar or parallel.

21
ASC Parallel Variables
  • Parallel variables reside in the memory of
    individual processors.
  • Consequently, tail, head, weight, and state will
    be parallel variables.
  • In ASC, parallel variables are declared using an
    array-like notation, with $ in the index
  • char parallel tail[$], head[$];
  • int parallel weight[$], state[$];

22
ASC Scalar and Index Variables
  • Scalar variables in ASC reside in the IS (i.e.,
    the front end computer), not in the PEs'
    memories.
  • They are declared as
  • char scalar node;
  • Index variables in ASC are used to manipulate the
    index (i.e., choice of an individual processor) of
    a field. For example,
  • graph[xx]
  • They are declared as
  • index parallel xx[$];
  • They occupy 1 bit of space per processor

23
Logical Variables and Constants
  • Logical variables in ASC are boolean variables.
    They can be scalar or parallel.
  • ASC does not formally distinguish between the
    index parallel and logical parallel variables
  • The correct type should be selected, based on
    usage.
  • If you prefer to work with the words TRUE and
    FALSE, you can define logical constants by
  • deflog (TRUE, 1)
  • deflog (FALSE, 0)
  • Constant scalars can be defined by
  • define (identifier, value)

24
Logical Parallel Variables needed for MST
  • These are defined as follows
  • logical parallel nxtnod[$], graph[$],
    result[$];
  • The use of these will become clear in later
    slides.
  • For the moment, recognize they are just bit
    variables, one for each PE.

25
Array Dimensions
  • A parallel variable can have up to 3 dimensions
  • First dimension is $, the parallel dimension
  • The array numbering is zero-based, so the
    declaration
  • int parallel A[$,2];
  • creates the following 1-dimensional variables
  • A[$,0], A[$,1], A[$,2]

26
Mixed Mode Operations
  • Mixed mode operations are supported and their
    result has the natural mode. For example, given
    declarations
  • int scalar a, b, c;
  • int parallel p[$], q[$], r[$], t[$,4];
  • index parallel x[$], y[$];
  • then
  • c = a + b is a scalar integer
  • q[$] = a + p[$] is a parallel integer variable
  • a + p[x] is an integer value
  • r[$] = t[x,2] + 3*p[$] is a parallel integer
    variable
  • x[$] = p[$] .eq. r[$] is an index parallel
    variable
  • More examples are given on page 9-10 of ASC Primer

27
The Memory Layout for MST
  • As with most programming languages, the order of
    the declarations determines the order in which
    the variables are identified in memory.
  • To illustrate, suppose we declare for MST
  • char parallel tail[$], head[$];
  • int parallel weight[$], state[$];
  • int scalar node;
  • index parallel xx[$];
  • logical parallel nxtnod[$], graph[$],
    result[$];
  • The layout in the memories is given on next slide
  • Integers default to the word size of the machine
    so ours would be 32 bits.

28
The Memory Layout for MST
[Figure: one row per PE (labeled 0, 1, 2, 3, 4, ...,
p-1, p), each with the fields tail, head, weight,
state, xx, nxt, gr, res.]
The last 4 are bit fields. The last 3 are
named nxtnod, graph, and result.


29
Operator Notation
  • Relational and Logical Operators
  • Original syntax came from FORTRAN and the
    examples in the ASC Primer use that syntax.
  • However, the more modern syntax is supported
  • .lt.  <       .not.  !
  • .gt.  >       .or.   ||
  • .le.  <=      .and.  &&
  • .ge.  >=      .xor.  --
  • .eq.  ==
  • .ne.  !=
  • Arithmetic Operators
  • addition +
  • multiplication *
  • division /

30
Parallel Input in ASC
  • Input for parallel variables can be interactive
    or from a data file in ASC.
  • We will run in a command window so file input
    will be handled by redirection
  • If you are not familiar with command window
    handling or Linux (Unix), this will be shown.
  • In either case, the data is entered in columns
    just like it will appear in the read command.
  • Do not use tabs.
  • THE LAST LINE MUST BE A BLANK LINE!

31
Parallel read and Associate Command
  • The format of the Parallel read statement is
  • read parvar1, parvar2,... in <logical parallel
    var>
  • The command only works with parallel variables,
    not scalars.
  • Input variables must be associated with a logical
    parallel variable before the read statement.
  • The logical variable is used to indicate which
    PEs were used on input.
  • After the read statement, the logical parallel
    variable will be true (i.e., 1) for all
    processors holding input values.

32
Parallel Input in ASC
  • The associate command and the read command for
    MST would be
  • associate head[$], tail[$], weight[$], state[$]
    with graph[$];
  • read tail[$], head[$], weight[$] in graph[$];
  • Blanks can be used rather than commas, as
    indicated by the MST example on pg 35 of the Primer.
  • Commenting Code
  • /* This is the way to comment code in ASC */

33
Input of Graph
  • Suppose we were just entering the data for AB,
    AG, AF, BA, BC, and BG.
  • Order is not important, but the data file would
    look like

A B 2
A G 5
A F 9
B A 2
B C 4
B G 6
(blank line)

and memory would look like

tail  head  weight  graph
A     B     2       1
A     G     5       1
A     F     9       1
B     A     2       1
B     C     4       1
B     G     6       1
0     0     0       0
...
34
Scalar variable input
  • Static input can be handled in the code.
  • Also, define or deflog statements can be used to
    handle static input.
  • Dynamic input is currently not supported
    directly, but can be accomplished as follows
  • Reserve a parallel variable dummy (of desired
    type) for input.
  • Reserve a parallel index variable used.
  • Values to be stored in scalar variables are first
    read into dummy using a parallel-read and then
    transferred using get or next to the appropriate
    scalar variable.
  • Example
  • read dummy[$] in used[$];
  • get x in used[$]
  • scalar-variable = dummy[x];
  • endget x;

35
Input Summary
  • Direct scalar input is not directly supported.
  • Scalars can be set as constants or can be set
    during execution using various commands.
  • We will see this shortly
  • We will be able to output scalar variables
  • This will also be handy for debugging purposes.
  • The main problem on input is to remember to
    include the blank line at the end.
  • I suggest always printing your input data
    initially so you see it is going in properly.

36
Parallel Variable Output
  • Format for parallel print statement is
  • print parvar1, parvar2,... in <logical parallel
    var>
  • Again, variables to be displayed must be
    associated with a logical parallel variable
    first.
  • You can use the same association as for the read
    command
  • associate tail[$], head[$], weight[$] with
    graph[$];
  • read tail[$], head[$], weight[$] in graph[$];
  • print tail[$], head[$], weight[$] in graph[$];
  • You can use a logical parallel variable that has
    been set with another statement, like an IF
    statement, to control which PEs will output data.

37
MST Example
  • Suppose state holds information about whether
    a node is in V1, V2, etc.
  • Then, you could set up an association by
  • if (state[$] == 1) then result[$] = TRUE; endif;
  • You can print with this association as follows
  • print tail[$], head[$], weight[$] in result[$];
  • Only those records where state == 1 would be
    printed.

38
Output Using msg
  • The msg command
  • Used to display user text messages.
  • Used to display values of scalar variables.
  • Used to display a dump of the parallel variables.
  • The entire parallel variable contents printed
  • Status of active responders or association
    variables ignored
  • Format msg string list
  • msg The answers are max BBX B
  • See Page 13-14 of ASC Primer

39
Assignment Statements
  • Assignment can be made with compatible
    expressions using the equal sign with
  • scalar variables
  • parallel variables
  • logical parallel variables
  • The data types normally have to be the same on
    both sides of the assignment symbol, i.e., don't
    mix scalar and parallel variables.
  • A few special cases are covered on the next slide

40
Some Assignment Statement Special Cases
  • Declarations for Examples
  • int scalar k;
  • int parallel aa[$], b[$];
  • index parallel xx[$];
  • If xx is an index variable with a 1 in at least
    one of its components, then the following is valid
  • k = aa[xx] + 5;
  • Here, the component of aa used is one where xx is
    1.
  • While selection is arbitrary (e.g., pick-one),
    this implementation selects the smallest index
    where xx is 1.
  • The assignment of integer arithmetic expressions
    to integer parallel variables is supported.
  • b[xx] = 3 + 5;
  • This statement assigns an 8 to the xx component
    of b.
  • The component selected is identified by the first 1
    in xx.
  • See pg 9-10 of Primer for more examples.

41
Example: aa[$] = b[$] + c[$] (1)

Before
mask  aa  b   c
1     2   3   4
1     3   5   3
0     2   4   -3
0     6   4   1
1     2   -3  -6

After
mask  aa  b   c
1     7   3   4
1     8   5   3
0     2   4   -3
0     6   4   1
1     -9  -3  -6

(1) Note: As an article, "a" is a reserved word in
ASC and so it can't be used as a variable name.
(see ASC Primer, pgs 29-30 and 39)
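
A small Python simulation may help in reading the table: only PEs whose mask bit is 1 take the new value, and the rest keep their old aa. The helper name `masked_assign` is hypothetical; real ASC hardware does this in one parallel step rather than a loop.

```python
def masked_assign(mask, target, values):
    """Return a copy of target where active PEs (mask == 1) take the new value."""
    return [v if m else t for m, t, v in zip(mask, target, values)]


mask = [1, 1, 0, 0, 1]
aa = [2, 3, 2, 6, 2]
b = [3, 5, 4, 4, -3]
c = [4, 3, -3, 1, -6]

# aa[$] = b[$] + c[$], restricted to the active PEs
aa = masked_assign(mask, aa, [x + y for x, y in zip(b, c)])
```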
42
Setscope Mask Control Statement
  • Format
  • setscope <logical parallel variable>
  • body
  • endsetscope
  • Resets the parallel mask register
  • setscope jumps out of current mask setting to the
    new mask given by its logical parallel variable.
  • One use is to reactivate currently inactive
    processors.
  • Also allows an immediate return to a previously
    calculated mask, such as an association.
  • Is an unstructured command such as go-to and
    jumps from current environment to a new
    environment.
  • Use sparingly
  • endsetscope resets mask to preceding setting.

43
Example
logical parallel used[$];
...
used[$] = aa[$] == 5;
setscope used[$]
   tail[$] = 100;
endsetscope;

Before setscope
mask  aa  used  tail
1     5   1     7
1     22  0     6
1     5   1     9
0     41  0     7

After setscope (the mask is now used)
aa  mask  tail
5   1     100
22  0     6
5   1     100
41  0     7

After endsetscope
aa  mask  tail
5   1     100
22  1     6
5   1     100
41  0     7
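
The save-and-restore behavior of setscope can be imitated with a Python context manager. This is a sketch under assumptions: the `Machine` class and its `assign` method are hypothetical stand-ins for the mask register and a broadcast assignment.

```python
from contextlib import contextmanager


class Machine:
    """Holds the current mask register (one bit per PE)."""

    def __init__(self, mask):
        self.mask = list(mask)

    @contextmanager
    def setscope(self, new_mask):
        saved = self.mask              # remember the preceding mask
        self.mask = list(new_mask)     # jump to the new mask
        try:
            yield
        finally:
            self.mask = saved          # endsetscope restores the preceding mask

    def assign(self, field, value):
        """Broadcast an assignment; only active PEs store the value."""
        for pe, active in enumerate(self.mask):
            if active:
                field[pe] = value


m = Machine([1, 1, 1, 0])
aa = [5, 22, 5, 41]
tail = [7, 6, 9, 7]
used = [int(x == 5) for x in aa]
with m.setscope(used):
    m.assign(tail, 100)
```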
44
The Scalar IF Statement
  • Scalar IF similar to what you have used before
    i.e. a branching statement with the else part
    optional.
  • Example
  • int scalar k;
  • ...
  • if k == 5 then sum = 0;
  • else b = sum;
  • endif;

45
The Parallel IF Mask Control Statement
  • Looks like scalar IF except instead of a scalar
    logical expression, a parallel logical expression
    is encountered.
  • Format
  • if <logical parallel expression> then
  • <body of then>
  • else
  • <body of else>
  • endif
  • Although it looks similar, the execution is
    considerably different.
  • The parallel version normally executes both
    bodies, each for the appropriate processors
  • Useful as a parallel search control statement

46
Operation Steps of Parallel IF
  • Save the mask bit of processors that are
    currently active.
  • Broadcast code to the active processors to
    calculate the IF boolean expression.
  • If the boolean expression is true for an active
    processor, set its individual cell mask bit to
    TRUE otherwise set its mask bit to FALSE.
  • Broadcast code for the then portion of the IF
    statement and execute it on the (TRUE)
    responders.
  • Complement the mask bits for the processors that
    were active at step 1.
  • Ones originally FALSE remain FALSE
  • Broadcast code for the else portion of the IF
    statement and execute it on the active
    processors.
  • Reset the mask to original mask at Step 1.

47
Example
if (b[$] == 1) then b[$] = 2; else b[$] = -1;
endif;

Before
b   mask
1   1
7   1
2   1
1   1
1   0

After
b   then mask  else mask
2   1          0
-1  0          1
-1  0          1
2   1          0
1   0          0
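
The operation steps of the parallel IF can be traced in Python on this example. The function name `parallel_if` is hypothetical, and it only handles the case where both bodies assign a constant, which is all this example needs.

```python
def parallel_if(mask, cond, then_value, else_value, field):
    """Simulate a parallel IF whose two bodies each assign a constant to field."""
    # The condition is evaluated once, before either body runs.
    then_mask = [bool(m and c) for m, c in zip(mask, cond)]
    else_mask = [bool(m and not c) for m, c in zip(mask, cond)]
    for pe in range(len(field)):          # then body on the TRUE responders
        if then_mask[pe]:
            field[pe] = then_value
    for pe in range(len(field)):          # else body on the complemented mask
        if else_mask[pe]:
            field[pe] = else_value
    return then_mask, else_mask


mask = [1, 1, 1, 1, 0]
b = [1, 7, 2, 1, 1]
cond = [value == 1 for value in b]
then_mask, else_mask = parallel_if(mask, cond, 2, -1, b)
```

The last PE is inactive from the start, so it appears in neither mask and keeps its value.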

48
IF (ELSE-NOT-ANY) Format
  • if <logical parallel expression> then
  • body of if
  • elsenany
  • body of elsenany
  • endif
  • Note this is an if statement with an embedded
    ELSENANY clause.
  • Either responders to if execute if-body or
    else all active responders execute
    elsenany-body.
  • While this extension is occasionally useful,
    could get by with just any command
  • any command is covered in next construct.

49
The IF-ELSENANY Mask Control Statement
  • Only one part of this IF statement is executed.
  • Useful as a parallel search control statement
  • Steps
  • Evaluate the conditional statement.
  • If there are one or more active responders,
    execute the then block.
  • If there are no active responders, the
    ELSE-NOT-ANY (ELSENANY) block is executed.
  • When executing the ELSENANY part, the original
    mask is used, i.e., the one prior to the
    IF-ELSENANY statement.

50
Example
if (aa[$] > 1 && aa[$] < 4) then      /* sets mask */
   if (b[$] == 12) then c[$] = 1;     /* search for b == 12 */
   elsenany c[$] = 9;                 /* action if no b is 12 */
   endif;
endif;

Before
aa  b   c
1   17  0
2   13  0
2   8   0
3   12  0
2   9   0
4   67  0
0   0   0
0   12  0

After
mask1  mask2  aa  b   c
0      0      1   17  0
1      0      2   13  0
1      0      2   8   0
1      1      3   12  1
1      0      2   9   0
0      0      4   67  0
0      0      0   0   0
0      0      0   12  0

Recall: the then body uses the set mask (mask2).

51
Example
if (aa[$] > 1 && aa[$] < 4) then      /* sets mask */
   if (b[$] == 12) then c[$] = 1;     /* search for b == 12 */
   elsenany c[$] = 9;                 /* action if no b is 12 */
   endif;
endif;

Before
aa  b   c
1   17  0
2   13  0
2   8   0
3   4   0
2   9   0
4   67  0
0   0   0
0   12  0

After
mask1  mask2  aa  b   c
0      0      1   17  0
1      0      2   13  9
1      0      2   8   9
1      0      3   4   9
1      0      2   9   9
0      0      4   67  0
0      0      0   0   0
0      0      0   12  0

Recall: the elsenany body uses the original mask (mask1).
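
The two IF-ELSENANY examples can be checked with a short Python sketch. The function name `if_elsenany` is hypothetical and it handles only constant assignments; the data is copied from the two Before tables.

```python
def if_elsenany(mask, cond, then_value, else_value, field):
    """Run the then body on responders, or the elsenany body on the original mask."""
    responders = [pe for pe, (m, c) in enumerate(zip(mask, cond)) if m and c]
    if responders:
        for pe in responders:
            field[pe] = then_value
    else:
        for pe, m in enumerate(mask):
            if m:
                field[pe] = else_value
    return field


aa = [1, 2, 2, 3, 2, 4, 0, 0]
mask1 = [int(1 < v < 4) for v in aa]          # outer if sets the mask

b_first = [17, 13, 8, 12, 9, 67, 0, 12]        # one active PE has b == 12
c_first = if_elsenany(mask1, [v == 12 for v in b_first], 1, 9, [0] * 8)

b_second = [17, 13, 8, 4, 9, 67, 0, 12]        # no active PE has b == 12
c_second = if_elsenany(mask1, [v == 12 for v in b_second], 1, 9, [0] * 8)
```

Note that the last PE holds b == 12 in both runs but is outside mask1, so it never counts as a responder.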

52
The ANY Mask Control Statement
  • Format
  • any <logical parallel expression>
  • body
  • elsenany
  • body
  • endany
  • ANY is the primary construct used in ASC to
    support the AnyResponders associative property
  • The body of ANY is executed by all active
    processors if any data item satisfies the
    conditional statement.
  • The ELSENANY provides a sometimes useful but
    non-essential extension of the ANY command.

53
The ANY Statement
  • Used to search for data items that satisfy the
    conditional expression.
  • There must be at least one responder for the body
    statement to be performed.
  • If there are no responders, the ANY statement
    does nothing unless an ELSENANY is used.
  • The mask used to execute the ANY body is the
    original mask prior to the ANY statement.
  • Consequently, all active responders are affected
    if the conditional expression of the ANY
    evaluates to TRUE.
  • If there are no responders, then the body of
    ELSENANY is executed by all active processors.

54
Example
if (aa[$] > 7) then       /* set mask */
   any aa[$] == 10
      b[$] = 11;
   endany;
endif;

Before
mask  aa  b
1     3   0
0     9   0
1     16  0
1     10  0
1     8   0
0     0   0
1     0   0

After
mask  aa  b
0     3   0
0     9   0
1     16  11
1     10  11
1     8   11
0     0   0
0     0   0
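
In Python terms, ANY tests whether at least one active PE responds and, if so, runs the body on every active PE, not only on the responders. The names `any_statement` and `set_b` below are hypothetical; the data comes from the Before table.

```python
def any_statement(mask, cond, body):
    """If any active PE satisfies cond, run body on ALL active PEs."""
    if any(m and c for m, c in zip(mask, cond)):
        for pe, m in enumerate(mask):
            if m:
                body(pe)


mask_before = [1, 0, 1, 1, 1, 0, 1]
aa = [3, 9, 16, 10, 8, 0, 0]
b = [0] * 7

# Outer parallel if: narrow the mask to active PEs with aa > 7.
mask = [int(m and v > 7) for m, v in zip(mask_before, aa)]


def set_b(pe):
    b[pe] = 11


any_statement(mask, [v == 10 for v in aa], set_b)
```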

55
The Loop Control Statements
  • Loop controlled by either a scalar test or a
    parallel test
  • LOOP-UNTIL statement
  • Conditional is evaluated every iteration
  • Loop controlled by a parallel test
  • Parallel FOR-Loop
  • Conditional is evaluated only once
  • Parallel While-Loop
  • Conditional is evaluated every iteration
  • The FOR and WHILE loop statements are the ones
    normally used.
  • LOOP-UNTIL is included mostly for completeness.

56
The LOOP-UNTIL Statement
  • Similar to REPEAT UNTIL loops in other languages.
  • However, it is more flexible since the UNTIL
    conditional test can appear anywhere in the body
    of the loop.
  • Format
  • first
  • initialization
  • loop
  • body1
  • until (logical scalar expression) or
  • (logical parallel expression) or
  • (NANY logical parallel expression)
  • body 2
  • endloop
  • Parallel exit conditions
  • The UNTIL exits when responder(s) are detected
  • With NANY, the UNTIL exits when a no-responder
    condition occurs
  • body 2 represents statements executed if UNTIL
    not satisfied.

57
Example
first
   i = 0;
loop
   if (aa[$] == i) then b[$] = b[$] + 2; endif;
   i = i + 1;
   until i > 4
endloop;
  • Before
  • mask aa b
  • 1 0 3
  • 1 3 4
  • 1 0 1
  • 0 1 3
  • 1 1 5
  • 1 4 6
  • 1 5 2
  • After i = 0
  • mask mask1 aa b
  • 1 1 0 5
  • 1 0 3 4
  • 1 1 0 3
  • 0 0 1 3
  • 1 0 1 5
  • 1 0 4 6
  • 1 0 5 2

58
Example
first
   i = 0;
loop
   if (aa[$] == i) then b[$] = b[$] + 2; endif;
   i = i + 1;
   until i > 4
endloop;
  • Before
  • mask aa b
  • 1 0 3
  • 1 3 4
  • 1 0 1
  • 0 1 3
  • 1 1 5
  • 1 4 6
  • 1 5 2
  • After i = 0, i = 1
  • mask aa b b
  • 1 0 5 5
  • 1 3 4 4
  • 1 0 3 3
  • 0 1 3 3
  • 1 1 5 7
  • 1 4 6 6
  • 1 5 2 2

59
Example
first
   i = 0;
loop
   if (aa[$] == i) then b[$] = b[$] + 2; endif;
   i = i + 1;
   until i > 4
endloop;
  • Before
  • mask aa b
  • 1 0 3
  • 1 3 4
  • 1 0 1
  • 0 1 3
  • 1 1 5
  • 1 4 6
  • 1 5 2
  • After i = 0, i = 1, i = 3
  • mask aa b b b
  • 1 0 5 5 5
  • 1 3 4 4 6
  • 1 0 3 3 3
  • 0 1 3 3 3
  • 1 1 5 7 7
  • 1 4 6 6 6
  • 1 5 2 2 2

60
Example
first
   i = 0;
loop
   if (aa[$] == i) then b[$] = b[$] + 2; endif;
   i = i + 1;
   until i > 4
endloop;
  • Before
  • mask aa b
  • 1 0 3
  • 1 3 4
  • 1 0 1
  • 0 1 3
  • 1 1 5
  • 1 4 6
  • 1 5 2
  • After i = 0, i = 1, i = 3, i = 4
  • mask aa b b b b
  • 1 0 5 5 5 5
  • 1 3 4 4 6 6
  • 1 0 3 3 3 3
  • 0 1 3 3 3 3
  • 1 1 5 7 7 7
  • 1 4 6 6 6 8
  • 1 5 2 2 2 2

Note: The example is to illustrate only; it could
be done more easily.
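
The whole LOOP-UNTIL example can be replayed in a few lines of Python, which makes it easy to check the final column of b. This is a plain sequential simulation, with the mask, aa, and b values copied from the Before table.

```python
mask = [1, 1, 1, 0, 1, 1, 1]
aa = [0, 3, 0, 1, 1, 4, 5]
b = [3, 4, 1, 3, 5, 6, 2]

i = 0                                  # first: initialization
while True:                            # loop
    for pe in range(len(b)):           # parallel if, one PE at a time here
        if mask[pe] and aa[pe] == i:
            b[pe] = b[pe] + 2
    i = i + 1
    if i > 4:                          # until i > 4
        break
```

No PE has aa == 2, so the pass with i = 2 changes nothing, and the PE with aa == 5 is never matched because the loop exits once i exceeds 4.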
61
The Parallel FOR-LOOP
  • FOR is used for looping and retrieving
  • Used when a process must be repeated for each
    cell that satisfies a certain condition.
  • It is similar to the sequential FOR, but the
    conditional logical expression must be a parallel
    one.
  • Initially, the conditional expression is
    evaluated and the active responders are stored in
    an index variable.

62
The Parallel FOR-LOOP (cont)
  • The top responder is processed during each pass
    through the FOR-loop until no responders remain.
  • The contents of the index variable is updated at
    the bottom of the loop (i.e., the top 1 is
    changed to 0)
  • The index variable is used to walk through the
    responders and to retrieve each responder's
    records.
  • The conditional expression is never re-evaluated.

63
Example
  • sum = 0;
  • for xx in tail[$] != 999    /* evaluates and
    stores in xx */
  • sum = sum + value[xx];
  • endfor xx;

tail  xx  value
3     1   10     1st time: sum = sum + 10 = 10
5     1   20     2nd time: sum = sum + 20 = 30
999   0   30
6     1   40     3rd time: sum = sum + 40 = 70
64
The Parallel WHILE Loop
  • Similar to LOOP-UNTIL loop except it re-evaluates
    the conditional expression before each iteration.
  • Format
  • while <para index var> in <para logical
    expression>
  • body
  • endwhile <para index var>
  • The iteration terminates when there are no
    responders to the parallel logical expression.
  • Note the number of responders can increase,
    decrease, or remain the same during a run.
  • Unlike the FOR loop, this loop can be infinite.

65
The Parallel WHILE Loop
  • Unlike the FOR statement, this construct
    re-evaluates the logical conditional statement
    prior to each execution of the body of the while.
  • The bit array resulting from the evaluation of
    the conditional statement is assigned to the
    index parallel variable on each pass.
  • The index parallel array is available for use
    within the body for each loop and can be changed
    within the body.
  • The iteration is terminated when the conditional
    statement is tested and there are no responders.
  • That is, all zeros in the index parallel
    variable.
  • See ASC Primer pg 21-22 for more information

66
sumit = 0;
while xx in (aa[$] == 2)
   sumit = sumit + b[xx];
   if (c[xx] == 1) then
      if (aa[$] == 2) then aa[$] = 5; endif;
   else
      aa[xx] = 7;
   endif;
   msg "In loop, sumit is " sumit;
   print aa[$], c[$] in active[$];
endwhile xx;

Before
aa  b   c
1   17  0
2   13  0
2   8   1
3   11  1
2   9   0
4   67  0

After 1st loop
In loop, sumit is 13
DUMP OF ASSOCIATION ACTIVE FOLLOWS
AA,C,
1  0
7  0
2  1
3  1
2  0
4  0

After 2nd loop
In loop, sumit is 21
DUMP OF ASSOCIATION ACTIVE FOLLOWS
AA,C,
1  0
7  0
5  1
3  1
5  0
4  0
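
The WHILE example can be followed step by step in Python. The key point of this sketch is that the responder list is recomputed at the top of every pass; the name `passes` is a hypothetical helper that records the state after each pass.

```python
aa = [1, 2, 2, 3, 2, 4]
b = [17, 13, 8, 11, 9, 67]
c = [0, 0, 1, 1, 0, 0]

sumit = 0
passes = []                                   # (sumit, aa) after each pass
while True:
    responders = [pe for pe, v in enumerate(aa) if v == 2]   # re-evaluated
    if not responders:                        # no responders: loop ends
        break
    xx = responders[0]                        # first responder
    sumit += b[xx]
    if c[xx] == 1:
        aa = [5 if v == 2 else v for v in aa]   # parallel if over all PEs
    else:
        aa[xx] = 7
    passes.append((sumit, list(aa)))
```

After the second pass no PE holds a 2 in aa, so the third evaluation of the condition finds no responders.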

67
When is Conditional Tested in Loops?
  • UNTIL loops evaluate the test condition each time
    the UNTIL statement is encountered.
  • WHILE loops have the test condition reevaluated
    before each iteration.
  • The FOR loop evaluates the conditional expression
    initially and stores the resulting active
    responders in an index variable. This index
    variable is then used to retrieve items
    successively.

68
Special Commands to Obtain Parallel Variable
Values
  • Special Commands
  • Get Statement
  • Next Statement
  • Minimum and maximum values
  • These commands are needed to implement some of
    the associative functions.
  • In particular, get and next allow the
    programmer to select an active responder for
    further processing.
  • get next implement the PickOne property.

69
GET Statement
  • Used to access a specific field in the memory of
    an active processor.
  • Format
  • get <parallel index var> in <parallel logical
    expression>
  • body
  • elsenany
  • body
  • The parallel logical expression is evaluated and
    its value assigned to the parallel index
    variable.
  • The parallel index variable will identify the
    first active responder (if one exists) that
    satisfies the conditional test
  • first active responder executes the commands in
    the GET body.
  • If there are no responders, the GET body is not
    executed.
  • If GET contains an ELSENANY statement, its body
    is executed by all active processors when GET has
    no responders.

70
Example
get xx in tail[$] == 1
   val[xx] = 0;
endget xx;

Before
tail  val
10    100
1     90
2     77
1     83

After
tail  val
10    100
1     0
2     77
1     83

71
The NEXT Statement
  • Similar to GET statement, except NEXT deactivates
    the responder accessed each time it is called.
  • Format
  • next <parallel index var> in <parallel logical
    expression>
  • body
  • elsenany
  • body
  • Unlike GET, two successive calls to NEXT are
    expected to select two distinct PEs and
    association records.
  • NEXT is almost always used within a looping
    statement to walk through the selected PEs to do
    something in each.

72
Example
int parallel aa[$], b[$];
logical parallel used[$];
index parallel xx[$];

used[$] = aa[$] == 4;
next xx in used[$]
   b[xx] = -1;
endnext xx;

Before
aa  used  b
1   0     2
4   1     2
4   1     2
19  0     2
4   1     2

After
aa  used  b
1   0     2
4   0     -1
4   1     2
19  0     2
4   1     2

Caution: next xx in aa[$] == 4 is not allowed. A
logical variable such as used must be involved, and its
top 1 is changed.
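
The difference between GET and NEXT can be sketched in Python as follows. The function names `get` and `next_responder` are hypothetical; the point is only that NEXT clears the bit of the PE it returns, so a later call picks a different PE.

```python
def get(responders):
    """Index of the first responder, or None; responders is left unchanged."""
    for pe, bit in enumerate(responders):
        if bit:
            return pe
    return None


def next_responder(responders):
    """Like get, but also clears the selected responder's bit (its top 1)."""
    pe = get(responders)
    if pe is not None:
        responders[pe] = 0
    return pe


aa = [1, 4, 4, 19, 4]
b = [2, 2, 2, 2, 2]
used = [int(v == 4) for v in aa]

xx = next_responder(used)
b[xx] = -1
```

Calling `get(used)` repeatedly would keep returning the same PE, which is why NEXT is the one normally placed inside a loop.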
73
Example see next slide for results
  • main tryout
  • int scalar k;
  • int parallel aa[$], b[$], c[$];
  • logical parallel used[$], active[$];
  • index parallel xx[$];
  • associate aa[$], b[$], c[$] with active[$];
  • read aa[$], b[$], c[$] in active[$];
  • print aa[$], b[$], c[$] in active[$];   /* to
    see input */
  • /* Tryout of assignment statements */
  • b[$] = aa[$] + 5;
  • c[$] = 3 + 5;
  • used[$] = aa[$] == 5;     /* selects all
    processors with 5 in aa field */
  • next xx in used[$]        /* selects the top
    processor in used */
  • k = b[xx] + 2;            /* could do this and the next
    line in one line */
  • c[xx] = k;                /* done this way to
    show a scalar can be set */
  • endnext xx;
  • print aa[$], b[$], c[$] in active[$];

74
b[$] = aa[$] + 5;
c[$] = 3 + 5;
used[$] = aa[$] == 5;
next xx in used[$]
   k = b[xx] + 2;
   c[xx] = k;
endnext xx;

Before
DUMP OF ASSOCIATION ACTIVE FOLLOWS
AA,B,C,
1   2   3
2   3   4
5   6   7
8   9   10
11  12  13
5   1   2
5   2   1

After
DUMP OF ASSOCIATION ACTIVE FOLLOWS
AA,B,C,
1   6   8
2   7   8
5   10  12    <- in used; xx is the first PE
8   13  8
11  16  8
5   10  8     <- in used
5   10  8     <- in used

75
Printing Scalars, Text Messages, and Dumping the
Entire Parallel Array for a Field
Format: msg string list
Example: msg "The values are " aa[$], k;

The values are
PE   0:  0 1 2 5 8 11 5 5 0 0 0 0 0 0 0 0
PE  16:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PE  32:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PE  48:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
...
PE 288:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PE 304:  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
12
76
MAXVAL and MINVAL Functions and Other Functions
  • MAXVAL (MINVAL) returns the maximum (minimum)
    value among active responders.
  • if (tail != 1) then k = maxval(weight);
  • endif;
  • MAXDEX (MINDEX) returns the index of an entry
    where the maximum (minimum) value of the
    specified item occurs among the active
    responders.
  • Recall: With an associative SIMD, the above are
    constant-time functions because they are supported
    in hardware.
  • With a SIMD that is not associative, they can
    still be performed, but they are not constant-time
    functions and their timings depend upon the
    interconnection network.
  • There are several variations of the above functions
    in the ASC Primer, e.g. finding the nth smallest
    value.
  • The function COUNT() returns the number of active
    responders. It can be useful in debugging.
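To make the meaning of these reductions concrete, here is a sequential Python sketch. The function names mirror the ASC names but are hypothetical stand-ins, the `mask` argument plays the role of the active responders, and breaking ties toward the first matching entry is an assumption (it matches the MST walkthrough later in these slides). Unlike the associative hardware, each helper here takes time proportional to the number of entries.

```python
def maxval(values, mask):
    """Largest value among the active entries."""
    return max(v for v, m in zip(values, mask) if m)

def minval(values, mask):
    """Smallest value among the active entries."""
    return min(v for v, m in zip(values, mask) if m)

def maxdex(values, mask):
    """Index of the first active entry holding the maximum."""
    active = [i for i, m in enumerate(mask) if m]
    return max(active, key=lambda i: values[i])

def mindex(values, mask):
    """Index of the first active entry holding the minimum."""
    active = [i for i, m in enumerate(mask) if m]
    return min(active, key=lambda i: values[i])

def count(mask):
    """Number of active responders."""
    return sum(1 for m in mask if m)

tail = [1, 1, 2, 4, 4]
weight = [2, 7, 4, 1, 8]
active = [t != 1 for t in tail]   # if (tail != 1) then ...

k = maxval(weight, active)        # k = maxval(weight)
print(k, mindex(weight, active), count(active))
```

All four value functions raise `ValueError` when no entry is active, so a caller would check `count(mask)` first.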

77
Dynamic Storage Allocation
  • allocate is used to identify a processor whose
    association record is currently unused.
  • Will be used to store a new association record
  • Creates a parallel index that points to the
    processor selected
  • release is used to de-allocate storage of
    specified records in an association
  • Can release a single record or multiple records
    simultaneously.
  • Example
  • char parallel node, parent;
  • logical parallel tree;
  • index parallel x;
  • associate node, level, parent with
    tree;
  • ......
  • allocate x in tree
  •   node[x] = 'B';
  • endallocate x;
  • release parent .eq. 'A' from tree;
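The effect of allocate and release can be pictured with a small Python sketch. It assumes that the association's logical field (`tree` here) doubles as a "record in use" flag and that allocate picks the first idle PE; both points are assumptions for illustration, and the helper names are hypothetical.

```python
def allocate(assoc):
    """Claim the first PE whose record is unused and return its index."""
    x = assoc.index(0)          # raises ValueError if no PE is free
    assoc[x] = 1
    return x

def release(assoc, condition):
    """Free every in-use record for which the condition holds."""
    for i, hit in enumerate(condition):
        if assoc[i] and hit:
            assoc[i] = 0

# Four PEs; two already hold records of the association.
node = ["A", "C", "", ""]
parent = ["", "A", "", ""]
tree = [1, 1, 0, 0]

x = allocate(tree)                            # allocate x in tree
node[x] = "B"                                 # node[x] = 'B'
release(tree, [p == "A" for p in parent])     # release parent .eq. 'A' from tree

print(x, node, tree)
```

Note that release can free several records in one step, because the condition is evaluated for every PE at once.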

78
Performance Monitor
  • Keeps track of number of scalar and parallel
    operations.
  • It is turned on and off using the PERFORM
    statement
  • perform = 1;
  • perform = 0;
  • The number of scalar and parallel operations can
    be printed using the MSG command
  • MSG "Number of parallel and scalar operations
    are" PA_PERFORM SC_PERFORM;
  • The ASC Monitor is important for evaluation and
    comparison of various ASC algorithms and
    software.
  • It can also be used to determine or estimate
    running time.
  • See Pg 30-31 of ASC Primer for more information

79
Additional Features
  • Restricted subroutine capability is currently
    available
  • See call and include on pg 25-7 of ASC Primer.
  • ASC has a rather simplistic subroutine
    capability.
  • While not difficult, the subroutine details will
    not be covered in slides.
  • Assignment will not require use of subroutines.
  • Use of personal pronouns and articles in ASC makes
    code easier to read and shorter.
  • See page 29 of ASC Primer.
  • Again, the details are not covered in slides.

80
Basic Program Structure
  • Main program_name
  • Constants
  • Variables
  • Associations
  • Body
  • End

81
Software
  • Compiler and Emulator
  • DOS/Windows, UNIX (Linux)
  • WaveTracer
  • Connection Machine
  • http://www.cs.kent.edu/parallel/ (look under
    software)
  • Use any text editor.
  • Careful on moving files between DOS and UNIX!

[Diagram: Anyprog.asc → ASC Compiler (options -e, -wt, -cm) → Anyprog.iob → ASC Emulator (options -e, -wt, -cm), which uses Standard I/O and File I/O.]
82
Simple ASC Program
  • Example
  • Consider an ASC Program that computes the area of
    various simple shapes (circle, rectangle,
    triangle).
  • Here is an example shapes.asc
  • Here is the data shapes.dat
  • Here is the shapes.out
  • NOTE: The above links are only active during the
    slide show.

83
Software
  • To compile the previous program
  • asc1.exe -e shapes.asc
  • To execute your program
  • asc2.exe -e shapes.iob
  • asc2.exe -e shapes.iob < shapes.dat
  • asc2.exe -e shapes.iob < shapes.dat >
    shapes.out
  • Commands are executed in Windows from a command
    window.
  • See the CMD command-line environment document at
    http://www.cs.kent.edu/jbaker/PDC-F07/references/CMD_Commands.doc
  • UNIX (Linux) commands can be executed from the
    command-line prompt.
  • Don't forget to make the compiler and emulator
    executable using the chmod command.

84
MST Program Example in ASC Primer
  • View ASC code as pseudocode and consider how to
    create equivalent Cn code for the ClearSpeed Board

85
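The Graph and Its Data File
  • Each record of the data file is one directed edge
    written as tail head weight. Every undirected edge
    of the graph appears twice, once in each direction.
1 2 2    1 6 7    1 7 3
2 1 2    2 3 4    2 7 6
3 2 4    3 8 2    3 4 2
4 5 1    4 8 8    4 3 2
5 4 1    5 9 2    5 6 6
6 1 7    6 9 5    6 5 6
7 2 6    7 1 3    7 9 1    7 8 3
8 7 3    8 9 4    8 4 8    8 3 2
9 7 1    9 6 5    9 8 4    9 5 2
[Figure: the weighted graph on vertices 1 to 9 described by this data file.]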
86
Header and declarations
/* The ASC Minimum Spanning Tree - with slight
   modifications from ASC PRIMER */
main mst
/* Note: Vertices were encoded as integers */
deflog (TRUE, 1);
deflog (FALSE, 0);
char parallel tail, head;
int parallel weight, state;
char scalar node;
index parallel xx;
logical parallel nxtnod, graph, result;
87
Obtain input
associate head, tail, weight, state with graph;
read tail, head, weight in graph;
Mark the active PEs for the next command (otherwise
the zeros in the fields where data wasn't read in
would be used). Find a tail whose weight is minimal.
setscope graph
  node = tail[mindex(weight)];
endsetscope;
Because of the layout of the data file, the first PE
containing the minimal weight (which is 1) is the PE
holding 4 5 1. So node would be set to 4.
89
Continued
Mark as being in set V2 all edges that have tails
equal to node, i.e. 4.
if (node == tail) then state = 2;
else state = 3;
endif;
This would mark the following edges as having a
state of 2, i.e. they are in V2: 4 5 1, 4 8 8, and
4 3 2.
91
Continued
while xx in (state == 2)
  if (state == 2) then nxtnod = mindex(weight); endif;
  node = head[nxtnod];
In loop 0: The only edges with a state of 2 are
4 5 1, 4 8 8, and 4 3 2, so the first one, which has
the minimal weight, is selected and node is set to 5.
  state[nxtnod] = 1;
The edge 4 5 receives a state of 1.
93
Continued
if (head == node && state != 1) then state = 0; endif;
We no longer want edges with a head of 5, so we throw
those out of consideration by setting their states to
0. This would be edges 6 5 and 9 5 in the data file.
94
The Graph and Its Data File
  • Same graph and data file as on slide 85.
  • Green entries (the edges 6 5 and 9 5) are edges
    thrown out.
95
Continued
if (state == 3 && node == tail) then state = 2; endif;
The edges turned to a state of 2 are then 5 4, 5 9,
and 5 6. Recall these are possible candidates for the
next round.
Do we want 5 4? Isn't 4 5 already in? Solving the
problem by using a picture didn't run into this
problem, because once 4 5 was in, 5 4 was eliminated
from consideration automatically. So we need to
correct this. When an edge such as X Y is included,
we need to set the state of Y X to 0 to keep it out
of further consideration. Is anything else needed?
96
Correct and Implement the MST Algorithm in Cn
  • The algorithm as coded selects D first, while we
    selected A.
  • The step at the beginning that selects a minimal
    weight edge and uses one of its nodes as the
    starting point was there to avoid the need to
    assign a character to a variable.
  • Since we are using integer nodes, we could
    eliminate
  • setscope graph
  •   node = tail[mindex(weight)];
  • endsetscope;
  • and just set node to 1, i.e.
  • node = 1;
  • This might help you see what is going on.
  • Try to trace the MST algorithm with this change
    and correct it. (Homework)
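As an aid for tracing, the state-based algorithm can be simulated sequentially. The sketch below is in Python rather than ASC or Cn, and it shows one possible correction, not necessarily the intended homework answer: edges whose head is the start node are discarded at the outset, so that the reverse of an accepted edge can never become a candidate. The function name `mst_edges` and the `EDGES` list are hypothetical; the data is the graph file from these slides.

```python
EDGES = [  # (tail, head, weight), in data-file order
    (1, 2, 2), (1, 6, 7), (1, 7, 3), (2, 1, 2), (2, 3, 4), (2, 7, 6),
    (3, 2, 4), (3, 8, 2), (3, 4, 2), (4, 5, 1), (4, 8, 8), (4, 3, 2),
    (5, 4, 1), (5, 9, 2), (5, 6, 6), (6, 1, 7), (6, 9, 5), (6, 5, 6),
    (7, 2, 6), (7, 1, 3), (7, 9, 1), (7, 8, 3), (8, 7, 3), (8, 9, 4),
    (8, 4, 8), (8, 3, 2), (9, 7, 1), (9, 6, 5), (9, 8, 4), (9, 5, 2),
]

def mst_edges(edges, start):
    """States: 1 = in tree, 2 = candidate, 3 = waiting, 0 = discarded."""
    n = len(edges)
    tail = [e[0] for e in edges]
    head = [e[1] for e in edges]
    weight = [e[2] for e in edges]

    # if (node == tail) then state = 2 else state = 3
    state = [2 if t == start else 3 for t in tail]
    # Assumed correction: never consider edges leading back to the start.
    for i in range(n):
        if head[i] == start:
            state[i] = 0

    tree = []
    while 2 in state:                           # while xx in (state == 2)
        candidates = [i for i in range(n) if state[i] == 2]
        nxt = min(candidates, key=lambda i: weight[i])   # mindex(weight)
        node = head[nxt]
        state[nxt] = 1
        tree.append(edges[nxt])
        for i in range(n):                      # discard edges into node
            if head[i] == node and state[i] != 1:
                state[i] = 0
        for i in range(n):                      # edges out of node become candidates
            if state[i] == 3 and tail[i] == node:
                state[i] = 2
    return tree

tree = mst_edges(EDGES, start=1)
print(tree)
print(sum(w for _, _, w in tree))
```

With this change every candidate edge has its tail inside the tree and its head outside it, which is the invariant Prim's algorithm relies on. The two inner loops are written sequentially here; in ASC or Cn each would be a single parallel masked assignment.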

97
Shortest Path Problem for Graphs
  • The minimal spanning tree algorithm by Prim is
    called a greedy algorithm.
  • Greedy algorithms are usually applied to
    optimization problems, i.e. a set of
    configurations is searched to find one that
    minimizes or maximizes some objective function
    defined on these configurations.
  • The approach is to proceed with a sequence of
    choices.
  • The sequence starts from some well-understood
    starting configuration.
  • Then we iteratively make choices that are locally
    best from among those currently possible.
  • This approach does not always lead to an optimal
    solution. When it always does, the problem is said
    to possess the greedy-choice property.

98
The Greedy-choice Property.
  • This property says a globally optimal
    configuration can be reached by a series of
    locally optimal choices, i.e. choices that are
    best from among the possibilities available at
    the time.
  • This allows us to avoid the exponential timing
    that would result if, for example, we had to
    generate all trees in a graph and then find the
    minimal one.
  • Many other problems are known to have the
    greedy-choice property.
  • However, you need to be careful. Sometimes just a
    slight change in the wording of the problem turns
    it into a problem that doesn't have the
    greedy-choice property. In fact, a slight change
    can produce an NP-complete problem.

99
Some Problems Known to Have the Greedy-choice
Property
  • (Minimal Spanning Tree) Just discussed.
  • (Shortest Path) Find the shortest path between
    two nodes on a connected, weighted graph where
    the weights are positive and represent distances.
  • (Fractional Knapsack) Given a set of n items,
    such that each item i has a positive value b_i and
    a positive weight w_i, find a maximum value subset
    that does not exceed a given weight W, provided
    we can take fractional amounts of the items.
  • Think of this as a knapsack being filled to not
    exceed the weight you can carry. Each item has a
    benefit to you, but it can be split up into
    fractional parts, as is possible with granola
    bars, popcorn, water, etc.
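For the fractional knapsack, the standard greedy choice is to take items in decreasing order of value per unit weight and split the last item if needed. A minimal Python sketch follows; the function name and the sample items are made up for illustration.

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight) pairs with positive entries.

    Returns the maximum total value that fits in `capacity` when
    any fraction of an item may be taken.
    """
    total = 0.0
    remaining = capacity
    # Greedy choice: best value-to-weight ratio first.
    by_ratio = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    for value, weight in by_ratio:
        if remaining <= 0:
            break
        take = min(weight, remaining)      # whole item, or the part that fits
        total += value * take / weight
        remaining -= take
    return total

items = [(60, 10), (100, 20), (120, 30)]   # (value b_i, weight w_i)
print(fractional_knapsack(items, 50))      # 240.0
```

Here the first two items are taken whole and two thirds of the third item fills the remaining weight of 20.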

100
However, The Wording is Delicate
  • The Fractional Knapsack Problem is one that must
    be carefully stated. If, for the n items, you
    only allow an item to be taken or rejected, you
    have the 0-1 Knapsack Problem, which is known to
    be NP-complete and does not have the
    greedy-choice property.
  • The 0-1 problem has a pseudo-polynomial
    algorithm, i.e. one that runs in O(nW) time,
    where W is the weight limit. So the timing depends
    not just on the input size of the problem, n, but
    also on the magnitude of a number that appears in
    the problem statement.
  • In fact, if W = 2^n, then the pseudo-polynomial
    algorithm for this problem is as bad as the brute
    force method of trying all 2^n combinations.
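The O(nW) behaviour comes from a dynamic-programming table with one entry per weight from 0 to W. The Python sketch below assumes integer weights; the function names and sample items are illustrative only. It also runs the ratio-based greedy rule on the same items to show that the greedy choice can miss the optimum once items cannot be split.

```python
def knapsack_01(items, capacity):
    """Best total value using each (value, weight) item at most once."""
    best = [0] * (capacity + 1)            # best[w]: best value within weight w
    for value, weight in items:
        # Go downward so each item is counted at most once.
        for w in range(capacity, weight - 1, -1):
            best[w] = max(best[w], best[w - weight] + value)
    return best[capacity]

def greedy_01(items, capacity):
    """Greedy by value/weight ratio, whole items only (not optimal)."""
    total, remaining = 0, capacity
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= remaining:
            total += value
            remaining -= weight
    return total

items = [(60, 10), (100, 20), (120, 30)]
optimal = knapsack_01(items, 50)
greedy = greedy_01(items, 50)
print(optimal, greedy)   # 220 160
```

The two nested loops perform about n times W steps, which is why the running time grows with the numeric value of W rather than with the number of digits needed to write it.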

101
Some Problems with the Greedy-choice Property
  • (Task Scheduling Problem) We are given a set T of
    n tasks such that each task i has a start time s_i
    and a finish time f_i, where s_i < f_i.
  • Task i must start at time s_i and it is guaranteed
    to be finished by time f_i.
  • Each task has to be performed on a machine and
    each machine can execute only one task at a time.
  • Two tasks i and j are non-conflicting if
    f_i ≤ s_j or f_j ≤ s_i.
  • Two tasks can be scheduled to be executed on the
    same machine only if they are non-conflicting.
  • What is the minimum number of machines needed to
    schedule all the tasks?
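A common greedy strategy for this problem processes the tasks in order of start time and reuses a machine whenever one has already finished its last task, opening a new machine only when none is free. The following Python sketch uses a heap of machine finish times; the function name and the sample tasks are hypothetical.

```python
import heapq

def min_machines(tasks):
    """tasks: list of (start, finish) pairs with start < finish.

    Returns the number of machines used by the greedy schedule.
    """
    finish_times = []                  # min-heap: last finish time per machine
    for start, finish in sorted(tasks):
        if finish_times and finish_times[0] <= start:
            # The earliest-free machine is non-conflicting: reuse it.
            heapq.heapreplace(finish_times, finish)
        else:
            # Every machine is still busy: open a new one.
            heapq.heappush(finish_times, finish)
    return len(finish_times)

tasks = [(1, 3), (1, 4), (2, 5), (3, 7), (4, 7), (6, 9), (7, 8)]
print(min_machines(tasks))   # 3
```

Three machines are also necessary for this input, since the tasks (1, 3), (1, 4), and (2, 5) all overlap between times 2 and 3.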

102
A Greedy-choice Algorithm for the Shortest Path
Problem
  • Given a connected graph with positive weights and
    two nodes, s (the start node) and d (the
    destination node), find a shortest path from s to
    d.
  • A greedy-choice algorithm is due to Dijkstra.
  • Unlike the MST algorithm, more must be considered
    than just the minimum weight on an edge leading
    out of a node.
  • It is easy to find examples where that approach
    won't work for this problem.
  • Try to find one. (Exercise)

103
Dijkstra's Sequential Algorithm for the Shortest
Path Problem
  • Let S be the set of nodes already explored and V
    all the nodes in the graph.
  • For each u in S, we store a distance value d(u)
    which will be defined below.
  • Initially, only s, the starting point, is in S
    and d(s) = 0.
  • While S doesn't include dp, the destination
    point,
  • Select a node v not in S with at least one edge
    from S for which the following is minimal
  • d'(v) = min { d(u) + wgt(u,v) }
  • Here, the min is taken over all edges e = (u,v)
    with u ∈ S and v ∉ S, and wgt(u,v) is the weight
    of edge e.
  • Add v to S and define d(v) = d'(v).
  • Stop when dp, the destination point, is placed in
    S.
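The steps above translate almost line for line into a simple sequential sketch. The Python below deliberately mirrors the slide (scan all edges leaving S on every round) instead of using a priority queue, so it is easy to compare with the description but not the fastest formulation. The function name is hypothetical, and the sample edge list is an assumed partial graph chosen for illustration.

```python
def shortest_path_length(edges, s, dp):
    """edges: dict mapping a directed pair (u, v) to a positive weight.

    Returns d(dp), or None if dp cannot be reached from s.
    """
    d = {s: 0}                      # the keys of d are exactly the nodes in S
    while dp not in d:
        best_v, best_dist = None, None
        for (u, v), wgt in edges.items():
            if u in d and v not in d:           # edge leaving S
                candidate = d[u] + wgt          # d(u) + wgt(u,v)
                if best_dist is None or candidate < best_dist:
                    best_v, best_dist = v, candidate
        if best_v is None:
            return None                         # no edge leaves S
        d[best_v] = best_dist                   # add v to S with d(v) = d'(v)
    return d[dp]

# Assumed undirected sample graph, stored in both directions.
undirected = [("s", "a", 1), ("s", "b", 2), ("a", "c", 3), ("a", "x", 2),
              ("s", "x", 4), ("b", "x", 2), ("b", "e", 3)]
edges = {}
for u, v, w in undirected:
    edges[(u, v)] = w
    edges[(v, u)] = w

print(shortest_path_length(edges, "s", "x"))   # 3
```

Keeping d only for nodes in S makes the membership tests `u in d` and `v not in d` play the role of u ∈ S and v ∉ S.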

104
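Example of the Greedy-choice (only part of the
graph is shown)
  • d(s) = 0, d(a) = 1, d(b) = 2
  • Choose the minimal from
  • d'(c) = d(a) + 3 = 4
  • d'(x) = min { d(a) + 2, d(s) + 4, d(b) + 2 } = 3
  • d'(e) = d(b) + 3 = 5
[Figure: part of a weighted graph on nodes s, a, b, c, x, and e, with the explored set S (containing s, a, and b) marked.]
Therefore, let d(x) = 3 and put x in S.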
105
Shortest Path Homework
  • More information about this assignment will be
    posted on the homework section of course webpage.
  • You should first complete your homework for the
    MST.