Title: Laboratory for Theoretical Computer Science
1 Parallelization of the Petri Net Unfolding Algorithm
K. Heljanko, V. Khomenko, and M. Koutny
- Laboratory for Theoretical Computer Science, Helsinki University of Technology
- Department of Computing Science, University of Newcastle upon Tyne
2 Motivation
- Partial order semantics of Petri nets
- Alleviate the state space explosion problem
- Efficient model checking algorithms
3 The ERV unfolding algorithm
Unf ← places from M0
pe ← transitions enabled by M0
cut-off ← ∅
while pe ≠ ∅
    extract e ∈ min pe
    if e is a cut-off event then
        cut-off ← cut-off ∪ {e}
    else
        add e and its postset into Unf
        UpdatePotExt(pe, Unf, e)
end while
add cut-off events and their postsets to Unf
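The control flow of the loop above can be sketched in Python. This is a minimal skeleton under stated assumptions, not the authors' implementation: the names `unfold`, `is_cutoff` and `update_pot_ext` are hypothetical, the net-specific work is delegated to callbacks, and events are ordered only by the size of their local configuration, which simplifies the order used to pick a minimal event.

```python
import heapq
from itertools import count


def unfold(initial_events, is_cutoff, update_pot_ext):
    """Main loop only; all net-specific work is done by the callbacks.

    initial_events : iterable of (size, event) pairs enabled by M0
    is_cutoff      : hypothetical predicate, event -> bool
    update_pot_ext : hypothetical callback, (prefix, event) -> iterable
                     of (size, event) pairs (the new possible extensions)
    """
    tie = count()  # tie-breaker, so the events themselves are never compared
    pe = [(size, next(tie), e) for size, e in initial_events]
    heapq.heapify(pe)
    prefix, cut_off = [], []
    while pe:
        _, _, e = heapq.heappop(pe)  # extract a minimal possible extension
        if is_cutoff(e):
            cut_off.append(e)
        else:
            prefix.append(e)
            for size, new in update_pot_ext(prefix, e):
                heapq.heappush(pe, (size, next(tie), new))
    prefix.extend(cut_off)  # cut-off events are added at the very end
    return prefix, cut_off


# Toy run with made-up events: "a" enables "b" and "c", "b" enables "d".
toy_successors = {"a": [(2, "b"), (2, "c")], "b": [(3, "d")], "c": [], "d": []}
toy_prefix, toy_cut = unfold(
    [(1, "a")],
    lambda e: e == "c",
    lambda prefix, e: toy_successors[e],
)
```

Every `heappop` and `heappush` compares queue entries, which is the cost that the slice-based queue later in the talk avoids.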
4-16 [Figure sequence: the example net system with places P1-P14 and transitions T1-T10, followed by the step-by-step construction of its unfolding prefix]
17 Step 1: Unfolding algorithm with slices
while pe ≠ ∅
    extract appropriate non-empty Sl ⊆ pe
    for all e ∈ Sl in any order refining the adequate order ◁ do
        if e is a cut-off event then
            cut-off ← cut-off ∪ {e}
        else
            add e and its postset into Unf
            UpdatePotExt(pe, Unf, e)
    end for
end while
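The slicing loop can be sketched as follows. This is only an illustration under assumptions: a slice is taken to be all pending events whose local configuration has the smallest size (the choice described on the queue slide later in the talk), and `unfold_with_slices`, `is_cutoff` and `update_pot_ext` are hypothetical names standing in for the net-specific parts.

```python
def unfold_with_slices(initial, is_cutoff, update_pot_ext):
    """Slice-based main loop.

    initial        : dict mapping event -> size of its local configuration
    is_cutoff      : hypothetical predicate, event -> bool
    update_pot_ext : hypothetical callback, (prefix, event) -> dict of
                     new possible extensions (event -> size)
    """
    pe = dict(initial)
    prefix, cut_off, slices = [], [], []
    while pe:
        smallest = min(pe.values())
        # Assumed slice: every pending event of minimal size.
        sl = sorted(e for e, size in pe.items() if size == smallest)
        for e in sl:
            del pe[e]
        slices.append(sl)
        for e in sl:  # processed in name order, an arbitrary choice here
            if is_cutoff(e):
                cut_off.append(e)
            else:
                prefix.append(e)
                pe.update(update_pot_ext(prefix, e))
    return prefix, cut_off, slices


# Toy run with made-up events: "a" enables "b" and "c", "b" enables "d".
toy_succ = {"a": {"b": 2, "c": 2}, "b": {"d": 3}, "c": {}, "d": {}}
toy_prefix, toy_cut, toy_slices = unfold_with_slices(
    {"a": 1}, lambda e: e == "c", lambda prefix, e: toy_succ[e]
)
```

In the toy run the second slice contains both "b" and "c", so the two events are handled in one iteration of the outer loop instead of two.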
18 Problem 1
19 Correctness
Theorem. Let Pref' and Pref'' be the prefixes of
the unfolding of a bounded net system, produced
by arbitrary runs of the basic and slicing
algorithms respectively. Then Pref' and Pref''
are isomorphic.
20 Problem 2
How to choose slices to satisfy the imposed
condition?
21 Example
22 Step 2: Parallelisation
The events in a slice can be inserted into the
prefix all together, and their possible
extensions can be computed in parallel!
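A minimal sketch of this step, assuming a hypothetical function `possible_extensions(prefix, e)` that returns the new possible extensions caused by event `e`. It uses a thread pool from the Python standard library; in CPython, threads mainly illustrate the structure rather than deliver a speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def extend_slice_in_parallel(prefix, sl, possible_extensions, workers=4):
    """Insert a whole slice, then compute its possible extensions concurrently.

    possible_extensions is a hypothetical callback (prefix, event) -> iterable
    of new events; it must only read the prefix it is given.
    """
    prefix.extend(sl)         # all events of the slice are inserted together
    snapshot = tuple(prefix)  # immutable view shared by the workers
    new_events = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for found in pool.map(lambda e: possible_extensions(snapshot, e), sl):
            new_events.update(found)  # merging into a set drops duplicates
    return new_events


# Toy run: each event yields one private extension and one shared extension.
toy_prefix = []
toy_new = extend_slice_in_parallel(
    toy_prefix, ["a", "b"], lambda prefix, e: {e + "x", "shared"}
)
```

Merging the results into a set removes duplicate extensions after the fact, but each worker still spends time computing them.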
23 Problem 3
The same possible extension can be computed
several times!
[Figure: fragment of the unfolding prefix of the running example, in which the labels T4 and T9 each appear twice]
24-27 Restricting the scope
[Figure sequence; the only surviving label is the slice Sl]
28 Problem 4
How to get rid of the ordering in the "for all"
loop?
- If there are no cut-offs in the slice Sl then
the order in which the events are processed is
irrelevant.
29 Cut-offs in advance
One can check the cut-off criterion as soon as a
new possible extension is computed.
- Advantages:
- No cut-offs in a slice (fixes Problem 4)
- The cut-off criterion is checked in
UpdatePotExt(pe, Unf, e), the part of the
algorithm that is computed in parallel
30 The queue of possible extensions
- Can be represented as a sequence Sl_1, Sl_2, Sl_3, ...,
where Sl_i contains the events whose local
configurations have size i
- Inserting an event e into the queue is reduced
to adding it to the set Sl_|[e]|
- Choosing a slice is reduced to detaching the
first non-empty set Sl_i from the queue
No comparisons of configurations are involved!
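The queue described on this slide can be sketched as a bucket structure keyed by configuration size. The class name `SliceQueue` and its methods are assumptions made for illustration; the size of the local configuration is passed in by the caller rather than computed here.

```python
class SliceQueue:
    """Possible extensions grouped by the size of their local configuration."""

    def __init__(self):
        self._buckets = {}  # size i -> set Sl_i of pending events

    def insert(self, event, size):
        # Inserting an event only adds it to the set for its size.
        self._buckets.setdefault(size, set()).add(event)

    def pop_slice(self):
        # Buckets are removed whole, so every stored bucket is non-empty;
        # only integer sizes are compared, never configurations.
        size = min(self._buckets)
        return size, self._buckets.pop(size)

    def __bool__(self):
        return bool(self._buckets)


# Toy usage with made-up events.
queue = SliceQueue()
for name, size in [("a", 1), ("b", 2), ("c", 2), ("d", 1)]:
    queue.insert(name, size)
first = queue.pop_slice()
second = queue.pop_slice()
```

A dictionary keeps the sketch short. A list indexed by size with a moving cursor would follow the "sequence" wording on the slide more literally.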
31 Comparisons of configurations
The total number of comparisons of configurations
performed by the parallel algorithm is equal to
|E_cut|, i.e. there are no redundant
comparisons! In contrast, the ERV unfolding
algorithm performs O(|E| log |E|) comparisons.
32 Experimental results
4 × Pentium(TM) III 500 MHz processors with 512K cache,
512M of 133 MHz RAM

Processors   2      3      4
Speedup      1.68   2.14   2.27

The speedup is real, but not linear due to
limited memory bandwidth (bus contention).
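One way to read these figures is parallel efficiency, i.e. speedup divided by the number of processors. The short calculation below uses only the numbers from the slide; the variable names are illustrative.

```python
# Measured speedups from the slide, keyed by number of processors.
speedup = {2: 1.68, 3: 2.14, 4: 2.27}

# Efficiency = speedup / processors (1.0 would be perfectly linear scaling).
efficiency = {p: s / p for p, s in speedup.items()}

for processors, value in efficiency.items():
    print(f"{processors} processors: efficiency {value:.4f}")
```

Efficiency falls from roughly 84% on two processors to roughly 57% on four, which matches the remark that the speedup is not linear.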
33 Conclusions
- The algorithm is faster even on a uniprocessor
- The size of slices is usually large, which
allows for good parallelization
- More than 95% of time is spent in the parallel
sections of the algorithm
- Can be efficiently implemented even on
distributed memory architectures
- Linear speedup for most of the examples (in
theory)
- Limited memory bandwidth (bus contention)
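The 95% figure can be related to the measured speedups through Amdahl's law, which the slides do not mention and which is brought in here only as a rough yardstick. The sketch assumes a parallel fraction of exactly 0.95, the lower bound stated above, and ignores memory bandwidth entirely.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Ideal speedup when only part of the running time can be parallelised."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Ideal ceilings for the machine sizes used in the experiments.
ideal = {n: amdahl_speedup(0.95, n) for n in (2, 3, 4)}
```

Under this assumption the ideal speedup on four processors is about 3.5, well above the measured 2.27, which is consistent with bus contention rather than the serial part of the algorithm being the limiting factor.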