Title: Symbolic Synthesis of Masking Fault-Tolerant Distributed Programs
1. Symbolic Synthesis of Masking Fault-Tolerant Distributed Programs
- Borzoo Bonakdarpour
- Workshop APRETAF
- January 23, 2009
Joint work with Sandeep Kulkarni
2. Motivation
- The most important goal of formal methods is achieving correctness in computing systems (programs).
- Correct-by-verification
  - A program is built manually.
  - The correctness of the program is verified by a model checker or a theorem prover.
- Correct-by-construction
  - A program is constructed so that its correctness is guaranteed.
[Diagram labels: Manual Design, Verification]
3. Motivation
- Automated synthesis from temporal logic specifications
  - Pros
    - Ability to start from a null program
    - Capability to handle highly expressive specifications
  - Cons
    - Highly complex decision procedures
    - Limited to no reusability
- Automated program revision
  - An existing program is revised with respect to a property
4. Motivation
[Diagram labels: Program, Property, Model Checker, Counterexample]
- Question
  - Is it possible to revise the program automatically such that it satisfies the failed property while ensuring satisfaction of existing properties?
    - bugs
    - incomplete specification
    - change of environment
5. Motivation
[Diagram labels: Program, Property, Revision Algorithm, Revised program]
6. Motivation
- A one-lane bridge is controlled by two traffic signals at the two ends of the bridge.
[Figure: the bridge with signals (1) and (2)]
Controller Program
SPEC_bt = {(σ0, σ1) | (sig1(σ1) ≠ R) ∧ (sig2(σ1) ≠ R)}
(sig1 = G) ∧ (1 ≤ x1 ≤ 10) → sig1 := Y
(sig1 = Y) ∧ (1 ≤ y1 ≤ 2) → sig1 := R
(sig2 = R) ∧ (z1 ≤ 1) → sig2 := G
((sig1 = G) ∧ (x1 ≤ 10)) ∨ ((sig1 = Y) ∧ (y1 ≤ 2)) ∨ ((sig1 = R) ∧ (z2 > 1)) → wait
7. Motivation
Traffic Controller Fault Action
[Figure: the bridge with signals (1) and (2)]
- (sig1 = sig2 = R) ∧ (z1 ≤ 1) ∧ (z2 > 1)
- (sig1 = sig2 = R) ∧ (z1 ≤ 1) ∧ (z2 = 0)
- (sig1 = G) ∧ (sig2 = R) ∧ (z1 ≤ 1) ∧ (z2 = 0)
- (sig1 = G) ∧ (sig2 = G) ∧ (z1 ≤ 1) ∧ (z2 = 0)
8. Motivation
- Kulkarni and Arora introduce automated addition of fault-tolerance to fault-intolerant programs
- Intel reports a bug in floating-point operations in Pentium processors
- Clarke and Grumberg introduce counterexample-guided abstraction refinement (CEGAR), 10^1000 reachable states
- Wonham and Ramadge introduce controller synthesis
- Clarke, Emerson, Sifakis, and Queille invent model checking
- Bonakdarpour and Kulkarni synthesize distributed programs of size 10^50
- Emerson and Clarke propose synthesis from CTL properties
- Biere and Clarke invent SAT-based model checking (10^500 reachable states)
- Bonakdarpour, Kulkarni, and Ebnenasir, and independently Jobstmann and Bloem, introduce program revision (repair) techniques
- Vardi and Wolper introduce automata-theoretic verification and synthesis
- McMillan et al. introduce BDD-based model checking (10^20 reachable states) and find bugs in IEEE Futurebus
- Alur and Henzinger propose verification and synthesis of real-time systems
[Timeline figure; surviving year label: 2008]
9. The Synthesis Problem
[Figure: the state space, containing the Invariant and the surrounding Fault-Span; transitions are labeled p (program) and f (fault)]
10. The Issue of Distribution
- Modeling distributed programs
  - A program consists of a set of processes. Each process p is specified by:
    - A set V_p of variables,
    - A set T_p of transitions,
    - A set R_p ⊆ V_p of variables that p is allowed to read,
    - A set W_p ⊆ R_p of variables that p is allowed to write.
- Write restrictions
  - Transitions that change the value of a variable outside W_p cannot be executed by process p.
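The write restriction can be checked transition by transition. The following is a minimal sketch under stated assumptions, not code from the tool: states are plain vectors of integer values, and the `Process` struct and the function name `respectsWriteRestriction` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// A global state assigns a value to every program variable (indexed 0..n-1).
using State = std::vector<int>;
using Transition = std::pair<State, State>;

// Hypothetical process description: indices of readable and writable variables.
struct Process {
    std::set<std::size_t> readable;  // R_p
    std::set<std::size_t> writable;  // W_p, assumed to be a subset of R_p
};

// True if the transition changes only variables that the process may write.
bool respectsWriteRestriction(const Process& p, const Transition& t) {
    const State& from = t.first;
    const State& to = t.second;
    assert(from.size() == to.size());
    for (std::size_t v = 0; v < from.size(); ++v) {
        if (from[v] != to[v] && p.writable.count(v) == 0) {
            return false;  // p would have to write a variable outside W_p
        }
    }
    return true;
}
```

A synthesis algorithm would apply such a check (symbolically, in a BDD-based tool) to rule out transitions that a process cannot execute.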
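11. The Issue of Distribution
- Read restrictions
  - A process cannot distinguish states that differ only in variables it cannot read, so the corresponding transitions form a group.
  - Addition and removal of any transition must occur along with its entire group.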
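To make the notion of a group concrete, the sketch below enumerates the group of a single transition in an explicit-state setting. It is an illustration under assumptions, not the tool's implementation: variable values are assumed to range over 0..domainSize[v]-1, the transition is assumed to leave unreadable variables unchanged (since W_p ⊆ R_p), and `groupOf` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using State = std::vector<int>;
using Transition = std::pair<State, State>;

// Enumerate the group of transition t for a process that can read only the
// variables in `readable`. domainSize[v] is the number of values of variable v.
// Assumption: t leaves every unreadable variable unchanged.
std::set<Transition> groupOf(const Transition& t,
                             const std::set<std::size_t>& readable,
                             const std::vector<int>& domainSize) {
    std::vector<std::size_t> hidden;  // variables the process cannot read
    for (std::size_t v = 0; v < domainSize.size(); ++v) {
        if (readable.count(v) == 0) hidden.push_back(v);
    }
    std::set<Transition> group;
    Transition cur = t;
    // Start from the all-zero valuation of the hidden variables.
    for (std::size_t v : hidden) cur.first[v] = cur.second[v] = 0;
    while (true) {
        group.insert(cur);
        // Advance the hidden variables like a mixed-radix counter.
        std::size_t i = 0;
        while (i < hidden.size()) {
            std::size_t v = hidden[i];
            if (cur.first[v] + 1 < domainSize[v]) {
                cur.first[v] = cur.second[v] = cur.first[v] + 1;
                break;
            }
            cur.first[v] = cur.second[v] = 0;
            ++i;
        }
        if (i == hidden.size()) break;  // counter wrapped: all valuations seen
    }
    return group;
}
```

The size of a group is the product of the domain sizes of the unreadable variables, which is one reason why adding or removing a single transition can have far-reaching effects in a distributed program.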
12. What Is Difficult About Program Revision?
- Space complexity
  - The state explosion problem
- Time complexity
  - NP-completeness
  - Identifying the complexity hierarchy of the problem
  - The need for designing efficient heuristics
  - Proofs are often helpful in identifying bottlenecks of the problem
- The combination of the above complexities
  - is the worst nightmare!
13. What Is Difficult About Program Revision?
Daniel Mosé: "As that wise man said, bridging the gap between theory and practice is easier in theory than in practice!"
14. The Byzantine Agreement Problem
GENERAL
Decision: d.g ∈ {0, 1}
NON-GENERALS
Program
(d.j = ⊥) ∧ (f.j = false) → d.j := d.g
(d.j ≠ ⊥) ∧ (f.j = false) → f.j := true
15. The Byzantine Agreement Problem
Byzantine? b.g ∈ {false, true}
Faults
(b.j, b.k, b.l, b.g = false) → b.j := true
(b.j = true) → d.j := 0|1
16. What Is Difficult About Program Revision?
- Experimental results with enumerative (explicit) state space (the tool FTSyn)
  - Byzantine agreement - 3 processes
    - 6912 states
    - Time: 10s
  - Byzantine agreement - 4 processes
    - 82944 states
    - Time: 15m
  - Byzantine agreement - 5 processes
    - 995328 states
    - Out of memory!
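The state counts above are consistent with a simple product of variable domains. The sketch below reproduces them under the assumption (inferred from the numbers, not stated on the slides) that "n processes" means n non-generals, that d.g and b.g are Boolean, and that each non-general j contributes d.j with three values plus Boolean f.j and b.j. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of global states of the Byzantine agreement model with n non-generals,
// assuming d.g and b.g are Boolean (2 * 2 = 4 combinations) and each non-general
// j has d.j in {undecided, 0, 1}, f.j Boolean, and b.j Boolean (3 * 2 * 2 = 12).
std::uint64_t byzantineStateCount(unsigned nonGenerals) {
    std::uint64_t count = 4;
    for (unsigned i = 0; i < nonGenerals; ++i) {
        count *= 12;
    }
    return count;
}
```

Each additional non-general multiplies the state space by 12, which is why an explicit-state representation runs out of memory so quickly.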
17. Polynomial-Time Heuristics
- Identify the state predicate ms from where faults alone violate the safety: S := S ∧ ¬ms
[Figure: fault transitions (f) leading from the states in ms to a violation of SPEC]
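The predicate ms can be computed as a backward fixpoint over fault transitions. The following is a minimal explicit-state sketch under assumptions: states are integers, the safety specification is given as a set of forbidden transitions, and the names `computeMs` and `removeMs` are hypothetical. A symbolic tool would perform the same iteration on BDDs.

```cpp
#include <cassert>
#include <set>
#include <utility>

using State = int;
using Transition = std::pair<State, State>;
using TransitionSet = std::set<Transition>;

// States from which fault transitions alone can violate safety.
// `faults` is the set of fault transitions; `bad` is the set of transitions
// forbidden by the safety specification (both are illustrative inputs).
std::set<State> computeMs(const TransitionSet& faults, const TransitionSet& bad) {
    std::set<State> ms;
    // Base case: a single fault transition is itself a safety violation.
    for (const Transition& t : faults) {
        if (bad.count(t) > 0) ms.insert(t.first);
    }
    // Backward fixpoint: a fault leads into a state that is already in ms.
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Transition& t : faults) {
            if (ms.count(t.second) > 0 && ms.insert(t.first).second) {
                changed = true;
            }
        }
    }
    return ms;
}

// S := S and not ms, on explicit sets.
std::set<State> removeMs(const std::set<State>& invariant,
                         const std::set<State>& ms) {
    std::set<State> result;
    for (State s : invariant) {
        if (ms.count(s) == 0) result.insert(s);
    }
    return result;
}
```

Removing ms from S matters because the program cannot prevent faults from occurring, so it must never reach a state from which faults alone lead to a violation.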
18. Polynomial-Time Heuristics
- Identify the state predicate ms from where faults alone violate the safety: S := S ∧ ¬ms
- Re-compute the fault-span:
    BDD frontier = Invariant;
    BDD current = mgr->bddZero();
    BDD FaultSpan = Invariant;
    while (FaultSpan != current) {
        current = FaultSpan;
        BDD image = frontier * (P + F);   // - FaultSpan
        frontier = Unprime(image);
        FaultSpan = current + frontier;
    }
[Figure: program (p) and fault (f) transitions leaving the invariant (Inv.), the states in ms, and the resulting Fault-Span]
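The fault-span loop on this slide is a forward reachability fixpoint. As a self-contained illustration that does not depend on a BDD package, the sketch below performs the same iteration on explicit sets of states; this is a swapped-in explicit-state representation, not the symbolic code used by the tool, and the name `computeFaultSpan` is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <utility>

using State = int;
using Transition = std::pair<State, State>;

// Forward fixpoint: all states reachable from the invariant by program (P)
// or fault (F) transitions. Explicit sets stand in for the BDDs on the slide.
std::set<State> computeFaultSpan(const std::set<State>& invariant,
                                 const std::set<Transition>& program,
                                 const std::set<Transition>& faults) {
    std::set<State> faultSpan = invariant;
    std::set<State> frontier = invariant;
    while (!frontier.empty()) {
        std::set<State> next;
        // Collect successors of frontier states that have not been seen yet.
        auto expand = [&](const std::set<Transition>& relation) {
            for (const Transition& t : relation) {
                if (frontier.count(t.first) > 0 &&
                    faultSpan.count(t.second) == 0) {
                    next.insert(t.second);
                }
            }
        };
        expand(program);
        expand(faults);
        faultSpan.insert(next.begin(), next.end());
        frontier = next;  // only newly discovered states need to be expanded
    }
    return faultSpan;
}
```

Expanding only the newly discovered frontier, rather than the whole fault-span, mirrors the `frontier` variable in the slide's loop and avoids recomputing images of states that were already explored.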
19. Polynomial-Time Heuristics
- Identify the state predicate ms from where faults alone violate the safety: S := S ∧ ¬ms
- Re-compute the fault-span
- Identify transitions in the fault-intolerant program that may be included in the fault-tolerant program
- Fixpoint? (No: repeat the steps above; Yes: continue)
- Resolve deadlock states
Re-computing state predicates or transition predicates does not occur often in model checking, but it happens quite often during synthesis.
[Figure: states s0 and s1 with program (p) and fault (f) transitions]
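Before deadlock states can be resolved, they have to be identified. The sketch below shows one simple explicit-state way to do so, assuming that a deadlock state is a fault-span state with no outgoing program transition that stays inside the fault-span. This definition and the name `deadlockStates` are assumptions made for illustration; the tool's actual criterion may differ.

```cpp
#include <cassert>
#include <set>
#include <utility>

using State = int;
using Transition = std::pair<State, State>;

// States of the fault-span with no outgoing program transition whose target
// is also in the fault-span (illustrative definition of a deadlock state).
std::set<State> deadlockStates(const std::set<State>& faultSpan,
                               const std::set<Transition>& program) {
    std::set<State> hasSuccessor;
    for (const Transition& t : program) {
        if (faultSpan.count(t.first) > 0 && faultSpan.count(t.second) > 0) {
            hasSuccessor.insert(t.first);
        }
    }
    std::set<State> result;
    for (State s : faultSpan) {
        if (hasSuccessor.count(s) == 0) result.insert(s);
    }
    return result;
}
```

Resolution then either adds recovery transitions from such states or eliminates the states, which are the two steps named among the bottlenecks in the experimental results.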
20. Experimental Results
- Polynomial-time sound BDD-based heuristics
- The tool SYCRAFT (http://www.cse.msu.edu/borzoo/sycraft)
  - C++
  - CuDD (Colorado University Decision Diagram Package)
- Platform
  - Dedicated PC
  - 2.2GHz AMD Opteron processor
  - 1.2GB RAM
21. Experimental Results
- Goal
  - Identifying various bottlenecks of our synthesis heuristics
    - Fault-span generation
    - Deadlock resolution
    - Adding recovery
    - State elimination
    - Cycle detection and resolution
    - Memory usage
    - Total synthesis time
22. Experimental Results
23. Experimental Results
24. Experimental Results
Performance of synthesizing the Byzantine agreement program
25. Experimental Results
- Observations
  - 10^50 reachable states
  - State elimination (deadlock resolution) is the most serious bottleneck
  - We run out of time before we run out of space
  - Size of state space by itself is not a bottleneck
26. Experimental Results
UNCHANGED ACTIONS
1- (d.j = 2) & !(f.j = 1) & !(b.j = 1) → (d.j := dg)
REVISED ACTIONS
2- (b.j = 0) & (d.j = 1) & (d.k = 1) & (f.j = 0) → (f.j := 1)
3- (b.j = 0) & (d.j = 0) & (d.l = 0) & (f.j = 0) → (f.j := 1)
4- (b.j = 0) & (d.j = 0) & (d.k = 0) & (f.j = 0) → (f.j := 1)
5- (b.j = 0) & (d.j = 1) & (d.l = 1) & (f.j = 0) → (f.j := 1)
NEW RECOVERY ACTIONS
6- (b.j = 0) & (d.j = 0) & (d.l = 1) & (d.k = 1) & (f.j = 0) → (d.j := 1)
7- (b.j = 0) & (d.j = 1) & (d.l = 0) & (d.k = 0) & (f.j = 0) → (d.j := 0)
8- (b.j = 0) & (d.j = 0) & (d.l = 1) & (d.k = 1) & (f.j = 0) → (d.j := 1), (f.j := 1)
9- (b.j = 0) & (d.j = 1) & (d.l = 0) & (d.k = 0) & (f.j = 0) → (d.j := 0), (f.j := 1)
27. Experimental Results
The effect of exploiting human knowledge (each non-general process is allowed to finalize its decision if no two non-generals are undecided)
28. Experimental Results
Performance of synthesizing token ring mutual exclusion with multi-step recovery
29. Experimental Results
Multi-step vs. single-step recovery for synthesizing token ring mutual exclusion
30. Open Problems
- Exploiting techniques from model checking
  - State space generation (e.g., clustering and partitioning)
  - Symmetry reduction
  - Counterexample-guided abstraction refinement (CEGAR)
  - SMT/QBF-based methods
  - Distributed/parallel techniques
31. Open Problems
- Multidisciplinary research problems
  - Revising hybrid systems
  - Synthesizing programs with multiple concerns (e.g., security, communication, real-time, fault-tolerance, distribution) in epistemic logic
  - Program synthesis using graph mining and machine learning techniques
  - Biologically-inspired revision/synthesis techniques
32. THANK YOU!