Title: Deadlock Detection for Distributed Process Networks
1 Deadlock Detection for Distributed Process Networks
Alex G. Olson, Brian L. Evans
The University of Texas at Austin
2 Motivation for Formal Models
- Applications may require higher input/output and computational rates than one CPU can handle
- Exploit parallelism for high performance
  - Parallel (one machine) or distributed (many machines)
- Pitfalls of parallel/distributed programming
  - Synchronization, shared memory, and deadlock
  - Debugging concurrent code on many processors
- Formal models have provable properties
  - Determinacy: programs are correct by construction
  - Validation: only debug each component separately
  - Scalability: faster execution with more CPUs
3 Applications
Application | Input Data Rate | Computation Rate | Output Data Rate
Sonar Beamforming [Allen & Evans, 2000] | 160 MB/s | 4-20 GFLOPS | 72 MB/s
Bzip2 (block-zip) Compression | 1-4 MB/s | 1-4 GIPS (approx.) | 1-4 MB/s
MPEG4 Encoding (4CIF) | 18 MB/s | 2 GIPS | 1 MB/s
H.264 Video Server (QCIF) [Banerjee, 2002] | 1 MB/s | 1 GIPS | 40 KB/s
Design Space Exploration [Vissers & Wolf, 1999]
Image Processing [Webb et al., 1999]
4 Process Networks [Kahn, 1974]
- Concurrently executing processes
  - Communicate only over one-way unbounded channels (FIFO queues)
  - Read one input port at a time
- Node execution suspended until enough data available
  - Data that has been read is dequeued from channel
- Samples (tokens) flow along arcs
  - Samples have value but not time information
  - Flow of (untimed) data drives computation
- Determinate execution
  - Any scheduling algorithm that obeys the above rules will produce the same history of tokens on arcs
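The rules on this slide can be illustrated with a minimal sketch, assuming Python threads as processes and `queue.Queue` objects as unbounded FIFO channels. The process names (`source`, `scale`, `add`, `sink`), the `END` marker, and the network topology are hypothetical and chosen only for illustration; they are not taken from the authors' framework. Each process blocks on one input port at a time, so the token history on every arc should not depend on how the threads are scheduled.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker, not part of the PN model


def source(out, values):
    # Writes to an unbounded channel never block.
    for v in values:
        out.put(v)
    out.put(END)


def scale(inp, out, k):
    while True:
        tok = inp.get()          # blocking read on a single input port
        if tok is END:
            out.put(END)
            return
        out.put(k * tok)


def add(in_a, in_b, out):
    while True:
        a = in_a.get()           # read port A first ...
        b = in_b.get()           # ... then port B, one port at a time
        if a is END or b is END:
            out.put(END)
            return
        out.put(a + b)


def sink(inp, result):
    while (tok := inp.get()) is not END:
        result.append(tok)


def run_network():
    a, b, c, d = (queue.Queue() for _ in range(4))  # unbounded FIFO channels
    result = []
    threads = [
        threading.Thread(target=source, args=(a, [1, 2, 3])),
        threading.Thread(target=source, args=(b, [10, 20, 30])),
        threading.Thread(target=scale, args=(a, c, 2)),
        threading.Thread(target=add, args=(c, b, d)),
        threading.Thread(target=sink, args=(d, result)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


history = run_network()
```

Running `run_network()` repeatedly gives the same token sequence at the sink each time, which is the determinacy property in miniature.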
5 Bounding Size of PN Queues
- Bounded Scheduling [Parks & Lee, 1995]
  - Write to a full queue suspends node execution
  - On global deadlock, resize smallest queue
  - Favors incomplete bounded execution (non-determinate)
- Computational PN [Allen & Evans, 2000]
  - Processes may consume fewer tokens than read
  - All memory allocation can be handled by queues
- Bounded Scheduling [Geilen & Basten, 2003]
  - Show local deadlock may not lead to global deadlock
  - Artificial deadlock
Deadlock detection is required for bounded communication, but no framework detects local deadlock
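The bounded scheduling rule ("write to a full queue suspends the node; on global deadlock, resize the smallest queue") can be sketched with a small cooperative simulation. This is an illustration under stated assumptions, not the cited schedulers: processes are Python generators that yield `('read', ch)` or `('write', ch, value)` requests, the `Channel` class and `run` function are hypothetical, and the resize step grows the smallest channel that has a blocked writer by one slot.

```python
from collections import deque


class Channel:
    """Hypothetical bounded FIFO channel."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.items = deque()

    def full(self):
        return len(self.items) >= self.capacity


def run(processes):
    """Run generator processes until all finish; return names of resized channels."""
    pending = {name: next(gen) for name, gen in processes.items()}
    resizes = []
    while pending:
        progressed = False
        for name in list(pending):
            req = pending[name]
            ch = req[1]
            if req[0] == 'read':
                if not ch.items:
                    continue              # blocking read: no token yet
                reply = ch.items.popleft()
            else:
                if ch.full():
                    continue              # blocking write: queue is full
                ch.items.append(req[2])
                reply = None
            progressed = True
            try:
                pending[name] = processes[name].send(reply)
            except StopIteration:
                del pending[name]
        if not progressed:
            # Global deadlock: every remaining process is blocked.
            blocked_on = [r[1] for r in pending.values() if r[0] == 'write']
            if not blocked_on:
                raise RuntimeError("true deadlock: all processes blocked on reads")
            smallest = min(blocked_on, key=lambda c: c.capacity)
            smallest.capacity += 1        # artificial deadlock: grow smallest full queue
            resizes.append(smallest.name)
    return resizes


def producer(x, y):
    for i in range(3):
        yield ('write', x, i)
    yield ('write', y, 'done')


def consumer(x, y, out):
    out.append((yield ('read', y)))       # waits for Y before draining X
    for _ in range(3):
        out.append((yield ('read', x)))


x, y, out = Channel('X', 1), Channel('Y', 1), []
resizes = run({'producer': producer(x, y), 'consumer': consumer(x, y, out)})
```

In this example the producer fills `X` while the consumer waits on `Y`, so the network stalls twice and `X` grows from 1 to 3 slots before the run completes. Note that this sketch only reacts to global deadlock; it does not detect a local deadlock while other processes are still running.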
6 Deadlock Detection Algorithm
- Mitchell & Merritt's algorithm [1984]
  - Detects local and global deadlocks
- Exactly one process detects deadlock
  - Simplifies deadlock resolution
- Pair of labels (numbers) used for deadlock detection
  - Deadlock detected when a label makes a round-trip among set of blocked processes
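The label mechanism can be sketched as a sequential simulation. This is a simplified reading of the block, transmit, and detect steps of the Mitchell-Merritt scheme, not the authors' distributed implementation: the `Proc` class, the function names, and the use of `(counter, pid)` tuples to keep labels unique are assumptions made for illustration. Each process waits on at most one other process, and a deadlock is reported by the process whose own label returns to it around the cycle.

```python
class Proc:
    """Hypothetical process record with a public and a private label."""

    def __init__(self, pid, n):
        self.pid = pid
        self.public = (n, pid)    # label visible to processes waiting on us
        self.private = (n, pid)   # label only this process knows
        self.waits_on = None      # at most one process: single-resource waiting


def block(u, v):
    # Block step: u starts waiting on v and takes a fresh label
    # larger than both current public labels.
    n = max(u.public[0], v.public[0]) + 1
    u.public = u.private = (n, u.pid)
    u.waits_on = v


def transmit(u):
    # Transmit step: a larger public label propagates backwards along a wait edge.
    v = u.waits_on
    if v is not None and v.public > u.public:
        u.public = v.public
        return True
    return False


def detects(u):
    # Detect step: u sees its own label on the process it waits for,
    # so the label has made a round trip.
    v = u.waits_on
    return v is not None and u.public == u.private == v.public


def find_detectors(procs):
    changed = True
    while changed:                # propagate labels until nothing changes
        changed = False
        for p in procs:
            if transmit(p):
                changed = True
    return [p.pid for p in procs if detects(p)]


def demo(edges):
    procs = {pid: Proc(pid, n) for n, pid in enumerate("ABCD")}
    for u, v in edges:
        block(procs[u], procs[v])
    return find_detectors(list(procs.values()))


# A -> B -> C -> A form a cycle; D waits on A but is not part of the cycle.
cycle = demo([("A", "B"), ("B", "C"), ("C", "A"), ("D", "A")])
# A -> B -> C is a chain with no cycle.
chain = demo([("A", "B"), ("B", "C")])
```

In the cyclic case only one process in the cycle ends up with matching labels, and the chain reports no detector, which mirrors the "exactly one process detects deadlock" property on the slide.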
7 Mitchell-Merritt Example
[Figure: processes A, B, C, D shown in two panels, "Initial State" and "Blocking Step". Node annotations: BUSY, Write to B, Read from C, Read from A.]
Arrows indicate waiting. Artificial deadlock without feedback.
8 Mitchell-Merritt Example
[Figure: processes A, B, C, D, continued from the previous slide.]
9 Implementation
- Distributed framework for Computational Process Networks
  - TCP sockets for communication
  - Transmit and receive queues (zero-copy)
- C, POSIX threads
- http://www.ece.utexas.edu/~bevans/projects/pn
10 Execution Performance
Overhead < 1 µs per read/write
11 Execution Performance
Overhead < 1 µs per read/write
12 Conclusion
- Formal models simplify parallel design, implementation, and debugging
- Communication in PN model follows Single-Resource semantics
- Mitchell-Merritt algorithm applicable to non-distributed, parallel, and distributed PNs
  - Can be used to implement bounded-memory scheduling algorithms