Title: Dependence and Precedence
1Dependence and Precedence
Lecture 26
2Precedence and Dependence
- Can we execute a 1000-line program with 1000 processors in one step?
- What are the issues to deal with in various parallelizing situations?
- Parallel programming?
- Instruction-level parallelism?
- What type of analysis is used to study concurrent database operations?
3Dependence
4Making Use of Processors
- In parallelizing algorithms, we want to use as many processors as possible in an effort to finish in as little time as possible.
- Often, it is not possible to make complete use of all processors in all time units.
- Some instructions (or sections of instructions) depend upon others.
- Others have a different, related problem called precedence (next section).
5Input and Output
- Input and output cannot be parallelized in the strict sense because we're dealing with a user.
- We assume multiple, parallel streams of input and output (modems, etc.).
6Read and Print statements
- Read(x): x <- keyboard
- Print(x): screen <- x
7Dependency Relationships
- Dependencies are relationships between the steps of an algorithm such that one step depends upon another.
- (S1) read (a)
- (S2) b <- a 3
- (S3) c <- b a
8Dependency Relationships
- Dependencies are relationships between the steps of an algorithm such that one step depends upon another.
- (S1) a <- keyboard
- (S2) b <- a 3
- (S3) c <- b a
- Here, S2 is dependent on S1 to provide the appropriate value of a.
- Similarly, S3 is dependent on both S1 (for a's value) and S2 (for b's value).
- Since S2 needs a also, we can simply say that S3 is dependent on S2.
- The direct link from S1 to S3 is not needed.
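The reasoning on this slide can be sketched in Python. This is only an illustration, not part of the lecture: each statement is reduced to the variable it writes and the variables it reads (the arithmetic itself does not matter for the analysis), and the helper name find_dependencies is made up for this example.

```python
# Each statement: (label, variable written, set of variables read).
# Reading from the keyboard is modeled as reading no variables.
statements = [
    ("S1", "a", set()),        # a <- keyboard
    ("S2", "b", {"a"}),        # b <- (expression using a)
    ("S3", "c", {"b", "a"}),   # c <- (expression using b and a)
]

def find_dependencies(stmts):
    """Return (earlier, later, variable) for each read-after-write pair."""
    deps = []
    last_writer = {}  # variable -> label of the most recent statement writing it
    for label, written, reads in stmts:
        for var in sorted(reads):
            if var in last_writer:
                deps.append((last_writer[var], label, var))
        last_writer[written] = label
    return deps

dependencies = find_dependencies(statements)
for earlier, later, var in dependencies:
    print(f"{later} depends on {earlier} (for {var})")
```

The S1-to-S3 link shows up in the result, but because S3 already waits for S2 and S2 waits for S1, it adds no extra ordering constraint.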
9Dependence
- Defined by a read-after-write relationship.
- This means moving from the left to the right side of the assignment operator.
- a <- 5
- b <- a 2
Note: Read and Write in this case refer to reading the value from a memory location and writing a value to a memory location, not Input/Output.
10Graphing Dependence Relations
[Figure: graph with Processors and Time axes]
11Dependency Graphs
- (S1) read (a)
- (S2) b <- a 3
- (S3) c <- b a
12Dependency Graphs
- (S1) a <- keyboard
- (S2) b <- a 3
- (S3) c <- b a
- In this case, it does not matter how many processors we have; we can use only one processor to finish in 3 time units.
[Figure: S1, S2, and S3 plotted against Processors and Time]
13What If There Are No Dependencies?
- (S1) read (a)
- (S2) b <- b 3
- (S3) c <- c 4
- We can use three processors to get it done in a single time chunk.
[Figure: S1, S2, and S3 plotted against Processors and Time]
14A Dependency Example
- (S1) read (a)
- (S2) read (b)
- (S3) c <- a 4
- (S4) d <- b / 3
- (S5) e <- c d
- (S6) f <- d 8
21A Dependency Example
- (S1) a <- keyboard
- (S2) b <- keyboard
- (S3) c <- a 4
- (S4) d <- b / 3
- (S5) e <- c d
- (S6) f <- d 8
[Figure: dependency graph of S1 through S6 plotted against Processors and Time]
Using 2 processors, we finish 6 instructions in 3 units of time.
WOW!
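A small Python sketch can reproduce this result. It assumes the dependency links are read off the six statements by hand (S3 reads a, S4 reads b, S5 reads c and d, S6 reads d); the function name earliest_time_units is invented for this illustration.

```python
# Statement -> the statements whose results it reads.
depends_on = {
    "S1": [], "S2": [],
    "S3": ["S1"], "S4": ["S2"],
    "S5": ["S3", "S4"], "S6": ["S4"],
}

def earliest_time_units(deps):
    """Place each statement one time unit after the latest statement it depends on."""
    unit = {}

    def visit(stmt):
        if stmt not in unit:
            unit[stmt] = 1 + max((visit(p) for p in deps[stmt]), default=0)
        return unit[stmt]

    for stmt in deps:
        visit(stmt)
    return unit

units = earliest_time_units(depends_on)

# Group statements by time unit; the widest group is the number of processors used.
schedule = {}
for stmt, t in units.items():
    schedule.setdefault(t, []).append(stmt)

processors_used = max(len(group) for group in schedule.values())
for t in sorted(schedule):
    print(f"time {t}: {schedule[t]}")
```

Under these assumptions the schedule has three time units with two statements in each, which matches the 2 processors and 3 units of time on the slide.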
22Dependence and Iteration
- Ignore steps that are not part of the loop (overhead costs similar to making parallelism work).
- Don't worry about loop, exitif, counter variables, endloop, etc.
- Use notation to indicate passes.
- Unroll the loop, replacing the counter variable with a literal value.
23An Iterative Example
- I <- 1
- loop
- exitif (I > MAX_ARRAY)
- (S1) read (A[I])
- (S2) B[I] <- A[I] 4
- (S3) C[I] <- A[I] / 3
- (S4) D[I] <- B[I] / C[I]
- I <- I + 1
- endloop
24
- (S1) read (A[1])
- (S2) B[1] <- A[1] 4
- (S3) C[1] <- A[1] / 3
- (S4) D[1] <- B[1] / C[1]
- (S1) read (A[2])
- (S2) B[2] <- A[2] 4
- (S3) C[2] <- A[2] / 3
- (S4) D[2] <- B[2] / C[2]
- (S1) read (A[3])
- (S2) B[3] <- A[3] 4
- (S3) C[3] <- A[3] / 3
- (S4) D[3] <- B[3] / C[3]
Any interference between iterations?
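One way to check the question is to unroll the loop in code and look for links that cross from one pass to another. The Python below is a sketch under the assumption that A, B, C, and D are arrays indexed by the pass number; the helper names unroll and time_units and the S1.1-style labels are made up for this example.

```python
from collections import Counter

def unroll(passes):
    """Return {label: (written variable, read variables)} in program order."""
    stmts = {}
    for i in range(1, passes + 1):
        stmts[f"S1.{i}"] = (f"A[{i}]", [])                      # read(A[i])
        stmts[f"S2.{i}"] = (f"B[{i}]", [f"A[{i}]"])
        stmts[f"S3.{i}"] = (f"C[{i}]", [f"A[{i}]"])
        stmts[f"S4.{i}"] = (f"D[{i}]", [f"B[{i}]", f"C[{i}]"])
    return stmts

stmts = unroll(3)
# Every variable is written exactly once here, so one writer per variable suffices.
writer = {written: label for label, (written, _) in stmts.items()}

# A link interferes if the reader and the writer belong to different passes.
interfering = [
    (writer[v], label)
    for label, (_, reads) in stmts.items()
    for v in reads
    if writer[v].split(".")[1] != label.split(".")[1]
]

def time_units(stmts):
    """Earliest time unit per statement, following read-after-write links only."""
    unit = {}
    for label, (_, reads) in stmts.items():   # writers appear before their readers
        unit[label] = 1 + max((unit[writer[v]] for v in reads), default=0)
    return unit

units = time_units(stmts)
total_time = max(units.values())
processors_needed = max(Counter(units.values()).values())
print(interfering, total_time, processors_needed)
```

No link crosses passes, so all three passes can run side by side; the widest time unit holds the S2 and S3 statements of every pass.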
25An Iterative Example
Using 6 Processors
26Limited Number of Processors
- What if the number of processors is fixed?
- Some processors may be in use by another program/user.
- If the number of processors available is less than the number of processors that can be utilized, shift instructions down into later time units.
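Shifting instructions into later time units can be sketched as a simple greedy scheduler in Python. This is an illustration only: it reuses the six-statement dependency example rather than the loop, the function name schedule is invented here, and a greedy choice in program order is an assumption that is not guaranteed to give the shortest schedule for every program.

```python
# Statement -> the statements whose results it reads (six-statement example).
depends_on = {
    "S1": [], "S2": [],
    "S3": ["S1"], "S4": ["S2"],
    "S5": ["S3", "S4"], "S6": ["S4"],
}

def schedule(deps, processors):
    """In each time unit, run up to `processors` statements whose inputs are done."""
    done = set()
    timeline = []
    remaining = list(deps)                      # program order
    while remaining:
        ready = [s for s in remaining if all(p in done for p in deps[s])]
        batch = ready[:processors]              # the rest shift to a later unit
        timeline.append(batch)
        done.update(batch)
        remaining = [s for s in remaining if s not in batch]
    return timeline

two_processors = schedule(depends_on, 2)
one_processor = schedule(depends_on, 1)
print(len(two_processors), len(one_processor))
```

With two processors the statements fit in three time units; with only one, every statement is pushed into its own time unit and the program takes six.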
27A Limited Processor Example
Two Processors
28Questions?
29Precedence
30Precedence Relationships
- Exists if a statement would contaminate the data needed by another, preceding instruction.
- (S1) read (a)
- (S2) print (a)
- (S3) a <- a 7
- (S4) print (a)
35Precedence Relationships
- Exists if a statement would contaminate the data needed by another, preceding instruction.
- (S1) a <- keyboard
- (S2) screen <- a
- (S3) a <- a 7
- (S4) screen <- a
- S2 and S3 are dependent on S1 (for the initial value of a).
- S4 is dependent on S3 (for updated a).
- There is also a precedence relationship between S2 and S3.
- S3 must follow S2, else S3 will corrupt what S2 does.
36Precedence
- Defined by a write-after-write or write-after-read relationship.
- This means using the variable on the left side of the assignment operator after it has appeared previously on the right or left.
- Write after read: b <- a 2, then a <- 5
- Write after write: a <- 7, then a <- 5
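The definitions of dependence and precedence can be put side by side in a short Python sketch. It is an illustration, not the lecture's notation: a statement is reduced to the variable it writes plus the set of variables it reads, the function name relations is made up, and treating a print as a write to a variable called screen is a modeling assumption.

```python
def relations(first, second):
    """Classify the links from `first` to `second` (given in program order).

    Each statement is (written variable, set of read variables).
    """
    w1, r1 = first
    w2, r2 = second
    found = []
    if w1 in r2:
        found.append("dependence (read after write)")
    if w2 in r1:
        found.append("precedence (write after read)")
    if w1 == w2:
        found.append("precedence (write after write)")
    return found

# b <- (expression using a), then a <- 5
war = relations(("b", {"a"}), ("a", set()))
# a <- 7, then a <- 5
waw = relations(("a", set()), ("a", set()))
# screen <- a, then a <- (expression using a): S2 and S3 of the print example
print_then_update = relations(("screen", {"a"}), ("a", {"a"}))

print(war, waw, print_then_update)
```

In the third case the second statement only overwrites what the first one reads, so the link between S2 and S3 is a precedence relation and not a dependence.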
37Showing Precedence Relations
[Figure: graph with Processors and Time axes]
38Precedence Graphs
- (S1) read (a)
- (S2) print (a)
- (S3) a <- a 7
- (S4) print (a)
40Precedence Graphs
- (S1) a <- keyboard
- (S2) screen <- a
- (S3) a <- a 7
- (S4) screen <- a
- Precedence arrow blocks S3 from executing until S2 is finished.
- Dependency arrow between S1 and S3 is superfluous.
41What if there is No Precedence?
- (S1) read (a)
- (S2) b <- b 3
- (S3) c <- c 4
- We can use three processors to get it done in a single time chunk.
42Precedence and Iteration
- Ignore steps that are not part of the loop (overhead costs similar to making parallelism work).
- Don't worry about loop, exitif, counter variables, endloop, etc.
- Use notation to indicate passes.
- Unroll the loop, replacing the counter variable with a literal value.
Same as dependence...
43An Iterative Example
- i <- 1
- loop
- exitif (i > 3)
- (S1) read (a)
- (S2) print (a)
- (S3) a <- a 7
- (S4) print (a)
- i <- i + 1
- endloop
44An Iterative Example
- i <- 1
- loop
- exitif (i > 3)
- (S1) a <- keyboard
- (S2) screen <- a
- (S3) a <- a 7
- (S4) screen <- a
- i <- i + 1
- endloop
45
- (S1) a <- keyboard
- (S2) screen <- a
- (S3) a <- a 7
- (S4) screen <- a
- (S1) a <- keyboard
- (S2) screen <- a
- (S3) a <- a 7
- (S4) screen <- a
- (S1) a <- keyboard
- (S2) screen <- a
- (S3) a <- a 7
- (S4) screen <- a
46Iteration and Precedence Graphs
47Space vs. Time
- We can optimize time performance by changing the shared variable to an array of independent variables.
- i <- 1
- loop
- exitif (i > 3)
- (S1) read (a[i])
- (S2) print (a[i])
- (S3) a[i] <- a[i] 7
- (S4) print (a[i])
- i <- i + 1
- endloop
48Precedence Graphs
- We can use 3 processors to finish in 4 time units.
- Note that product complexity is unchanged (processors times time units: 3 x 4 = 12, the same as 1 x 12 with the shared variable).
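The space-for-time trade can be checked with a short Python sketch. It rests on stated assumptions: statements are reduced to the variable they write and the variables they read, prints are taken to go to separate parallel output streams (so the screen is not treated as a shared variable), and a statement must run in a later time unit than any earlier statement it has a dependence or precedence link to. The names time_units and loop_body are invented for this example.

```python
def time_units(stmts):
    """stmts: list of (written variable or None, set of read variables)."""
    units = []
    for j, (w2, r2) in enumerate(stmts):
        latest = 0
        for i in range(j):
            w1, r1 = stmts[i]
            conflict = (
                (w1 is not None and w1 in r2)       # dependence: read after write
                or (w2 is not None and w2 in r1)    # precedence: write after read
                or (w1 is not None and w1 == w2)    # precedence: write after write
            )
            if conflict:
                latest = max(latest, units[i])
        units.append(latest + 1)
    return units

def loop_body(var):
    return [
        (var, set()),     # S1: read(var)
        (None, {var}),    # S2: print(var)
        (var, {var}),     # S3: var <- (update of var)
        (None, {var}),    # S4: print(var)
    ]

shared = [s for i in range(1, 4) for s in loop_body("a")]
private = [s for i in range(1, 4) for s in loop_body(f"a[{i}]")]

shared_time = max(time_units(shared))     # one variable shared by all passes
private_time = max(time_units(private))   # one array element per pass
print(shared_time, private_time)
```

With the shared variable every statement is forced into its own time unit; with one array element per pass the three passes overlap completely.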
49What if Both Precedence and Dependence?
- If two instructions have both a precedence and a dependence relation,
- (S1) a <- 5
- (S2) a <- a 2
- showing only dependence is sufficient.
50Another Iterative Example
- i <- 1
- loop
- exitif (i > N)
- (S1) read (a[i])
- (S2) a[i] <- a[i] 7
- (S3) c <- a[i] / 3
- (S4) print (c)
- i <- i + 1
- endloop
51Another Iterative Example
- i <- 1
- loop
- exitif (i > N)
- (S1) a[i] <- keyboard
- (S2) a[i] <- a[i] 7
- (S3) c <- a[i] / 3
- (S4) screen <- c
- i <- i + 1
- endloop
52
- (S1) a[1] <- keyboard
- (S2) a[1] <- a[1] 7
- (S3) c <- a[1] / 3
- (S4) screen <- c
- (S1) a[2] <- keyboard
- (S2) a[2] <- a[2] 7
- (S3) c <- a[2] / 3
- (S4) screen <- c
- (S1) a[3] <- keyboard
- (S2) a[3] <- a[3] 7
- (S3) c <- a[3] / 3
- (S4) screen <- c
53We have precedence relationships between
iterations because of the shared c variable.
54Crossing Index Bounds Example
- I <- 1
- loop
- exitif( I > MAX ) // MAX is 3
- (S1) A[I] <- A[I] B[I]
- (S2) read( B[I] )
- (S3) C[I] <- A[I] 3
- (S4) D[I] <- B[I] A[I+1]
- I <- I + 1
- endloop
55Crossing Index Bounds Example
- I <- 1
- loop
- exitif( I > MAX ) // MAX is 3
- (S1) A[I] <- A[I] B[I]
- (S2) B[I] <- keyboard
- (S3) C[I] <- A[I] 3
- (S4) D[I] <- B[I] A[I+1]
- I <- I + 1
- endloop
56
- (S1) A[1] <- A[1] B[1]
- (S2) B[1] <- keyboard
- (S3) C[1] <- A[1] 3
- (S4) D[1] <- B[1] A[2]
- (S1) A[2] <- A[2] B[2]
- (S2) B[2] <- keyboard
- (S3) C[2] <- A[2] 3
- (S4) D[2] <- B[2] A[3]
- (S1) A[3] <- A[3] B[3]
- (S2) B[3] <- keyboard
- (S3) C[3] <- A[3] 3
- (S4) D[3] <- B[3] A[4]
57Precedence between iterations
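The precedence between iterations can be located with a short Python sketch. It assumes that A, B, C, and D are arrays and that S4 reads element I+1 of A, as the unrolled statements suggest; the helper names unroll and cross_pass_precedence and the S4.1-style labels are made up for this illustration.

```python
def unroll(passes):
    """Return [(label, pass number, written variable, read variables)]."""
    out = []
    for i in range(1, passes + 1):
        out += [
            (f"S1.{i}", i, f"A[{i}]", {f"A[{i}]", f"B[{i}]"}),
            (f"S2.{i}", i, f"B[{i}]", set()),                  # B[i] <- keyboard
            (f"S3.{i}", i, f"C[{i}]", {f"A[{i}]"}),
            (f"S4.{i}", i, f"D[{i}]", {f"B[{i}]", f"A[{i + 1}]"}),
        ]
    return out

def cross_pass_precedence(stmts):
    """Find write-after-read pairs whose two statements are in different passes."""
    found = []
    for j, (later, later_pass, written, _) in enumerate(stmts):
        for earlier, earlier_pass, _, reads in stmts[:j]:
            if earlier_pass != later_pass and written in reads:
                found.append((earlier, later, written))
    return found

pairs = cross_pass_precedence(unroll(3))
for earlier, later, var in pairs:
    print(f"{later} must follow {earlier} (it overwrites {var})")
```

S4 in one pass reads the element of A that S1 in the next pass overwrites, so S1 of the next pass has to wait for S4 of the current one.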
58Questions?
59Practical Applications
- We used the single assignments as easy illustrations of the principles.
- There are additional real applications of this capability:
- Much bigger than one assignment
- Smaller than one assignment
60http://setiathome.ssl.berkeley.edu/
61Large Data Sets
- Consider the SETI project
- What do you now know about the data that makes it
practical to distribute across millions of
processors?
No precedence or dependence between data sets!
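Because the data sets are independent, they can be handed to workers in any order. The Python sketch below only illustrates that idea and has nothing to do with the actual SETI software: the analyze function and the sample numbers are hypothetical stand-ins, and a thread pool stands in for the many separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(chunk):
    """Hypothetical stand-in for the per-data-set analysis: return the peak value."""
    return max(chunk)

# Made-up data sets; no chunk reads or writes anything belonging to another chunk.
chunks = [[3, 9, 2], [7, 1, 8], [4, 4, 6], [5, 0, 5]]

sequential_results = [analyze(c) for c in chunks]

# Each worker processes whole chunks on its own; map returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_results = list(pool.map(analyze, chunks))

print(parallel_results == sequential_results)
```

The parallel run gives the same answers as the sequential one because no chunk waits on, or can contaminate, another.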
62Instruction Processing
- Break the computer's processing into steps
- A - fetch instruction
- B - fetch data
- C - logical processing (math, test and branch)
- D - store result
- Independent for all sequential processing
- Dependency occurs when a branch ruins three instruction fetches
I <- 0
loop
exitif( I > MAX)
blah...
blah...
blah...
I <- I + 1
endloop
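The cost of a ruined fetch can be estimated with a very simplified Python model. Everything here is an assumption made for illustration: each of the four steps takes one time unit, a new instruction can start every time unit when nothing interferes, and each taken branch throws away three fetches, as stated on the slide. The function names are invented.

```python
STAGES = 4          # A fetch instruction, B fetch data, C processing, D store result
BRANCH_PENALTY = 3  # instruction fetches discarded per taken branch (from the slide)

def sequential_steps(instructions):
    """One instruction at a time: every instruction pays for all four steps."""
    return instructions * STAGES

def overlapped_steps(instructions, taken_branches=0):
    """Steps overlap: after the first instruction, one more finishes per time unit,
    except that each taken branch wastes BRANCH_PENALTY fetches."""
    return STAGES + (instructions - 1) + BRANCH_PENALTY * taken_branches

print(sequential_steps(10))
print(overlapped_steps(10))
print(overlapped_steps(10, taken_branches=2))
```

In this model a loop that branches back on every pass pays the three-fetch penalty each time around, which is why branches matter at the instruction level.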
63Questions?