Title: The Formal Verification of SPIDER
1 The Formal Verification of SPIDER
Lee Pike
Department of Computer Science, Indiana University, Bloomington
lepike_at_indiana.edu
2 Thanks to
- Steven Johnson, Indiana University, Bloomington
- The National Institute of Aerospace
- The NASA LaRC Formal Methods Team, especially Paul Miner
3 Overview
- SPIDER Overview
- Reasoning about Faults
- The Old vs. New Interactive Consistency (IC) Protocol
- SPIDER Formal Verification Goals and Future Work
- References
4 SPIDER Overview: Why?
- Develop a fault-tolerant architecture based on an ultra-reliable bus
- Scalable
- Handle a large number of possibly simultaneous faults, specifically transient faults from electromagnetic effects.
- Provide reintegration services
- Case study for the FAA
- Developed in accordance with RTCA/DO-254, Design Assurance Guidance for Airborne Electronic Hardware.
- Provide a test-bed for techniques in the specification and verification of safety-critical electronic systems.
These sorts of architectures are the foundation of tomorrow's X-by-wire safety-critical systems.
7 SPIDER Overview: What?
- Scalable Processor-Independent Design for Electromagnetic Resilience
- Processor Elements (PEs)
- Reliable Optical BUS (ROBUS)
- Time Division Multiple Access (TDMA) bus
- Maintains synchrony between PEs.
- Prevents "babbling idiots" (PE-to-PE interference)
- The services of the ROBUS are the focus of the verification effort.
[Figure: three PEs attached to the ROBUS.]
8 ROBUS Overview: Topology
- n Bus Interface Units (BIUs)
- m Redundancy Management Units (RMUs)
- The BIUs and RMUs are called nodes.
- Every BIU is directly connected to every RMU.
- No two BIUs are directly connected. Similarly for the RMUs.
[Figure: the ROBUS, containing BIU1, BIU2, and BIU3 (each connected to a PE) and RMU1, RMU2, and RMU3.]
9 ROBUS Overview: Services (Protocols)
- Interactive Consistency
  Purpose: Reliably broadcast messages between PEs.
- Clock Synchronization
  Purpose: Maintain synchrony between all nodes and PEs.
- Distributed Diagnosis
  Purpose: Convict faulty nodes in the ROBUS.
The focus of this talk is Interactive Consistency.
10 Global Fault Classifications
- Good: Not faulty
- Benign: Broadcasts only detectably faulty messages
- Symmetric: Broadcasts the same arbitrary message to all
- Asymmetric (Byzantine): Arbitrarily sends arbitrary messages
[Figures: one node with three outgoing links per fault class. The links carry d, d, d for the first case; garbage on every link for a benign node; the same d' on every link for a symmetric node; and d, d', d'' for an asymmetric node.]
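The four classes can be sketched as a function from a node's fault class to the message each receiver observes. This is an illustrative model only: the names `FaultClass`, `GARBAGE`, and `broadcast` are hypothetical and do not come from the SPIDER specifications.

```python
from enum import Enum

GARBAGE = "garbage"  # stands in for any detectably faulty message


class FaultClass(Enum):
    GOOD = "good"
    BENIGN = "benign"
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"


def broadcast(fault, intended, receivers, arbitrary=None):
    """Return {receiver: message} for one broadcast by a node of class `fault`.

    `arbitrary` supplies the faulty behaviour: a single value for a symmetric
    node, or a {receiver: value} dict for an asymmetric node.
    """
    if fault is FaultClass.GOOD:
        return {r: intended for r in receivers}
    if fault is FaultClass.BENIGN:
        return {r: GARBAGE for r in receivers}
    if fault is FaultClass.SYMMETRIC:
        return {r: arbitrary for r in receivers}
    # Asymmetric: any receiver may see any value.
    return {r: arbitrary[r] for r in receivers}


receivers = ["RMU1", "RMU2", "RMU3"]
good = broadcast(FaultClass.GOOD, "d", receivers)
benign = broadcast(FaultClass.BENIGN, "d", receivers)
symmetric = broadcast(FaultClass.SYMMETRIC, "d", receivers, arbitrary="d'")
asymmetric = broadcast(
    FaultClass.ASYMMETRIC, "d", receivers,
    arbitrary={"RMU1": "d", "RMU2": "d'", "RMU3": "d''"},
)
```

Only the asymmetric case lets two receivers see different non-garbage values, which is why it is the hard case for agreement.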
16 Local Fault Information: Each Node Maintains
- Accusations: A node accuses other nodes based on the messages it receives as well as indirect information.
- Convictions: Periodically, the distributed diagnosis protocol is executed: nodes exchange accusations to produce convictions.
- NOTE: While a good node knows that all good nodes have the same convictions, it does not know that all good nodes have the same accusations.
- Eligible Voters (EV): For each BIU, the set of RMUs that it neither accuses nor convicts. Similarly for each RMU.
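The eligible-voter definition amounts to a set difference. A minimal sketch, assuming accusations and convictions are held as plain sets of node names (the function name and data layout are hypothetical):

```python
def eligible_voters(candidates, accused, convicted):
    """Nodes of the opposite kind that this node neither accuses nor convicts."""
    return set(candidates) - set(accused) - set(convicted)


rmus = {"RMU1", "RMU2", "RMU3"}
# Hypothetical local state of one BIU: it accuses RMU2 and has convicted nobody.
ev = eligible_voters(rmus, accused={"RMU2"}, convicted=set())
```

Because accusations are local, two good BIUs can compute different sets here even though their convictions agree.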
17 Interactive Consistency Protocol: External View
- Purpose: Reliably communicate data between processing elements (PEs) over the ROBUS.
[Figure: three PEs attached to the ROBUS.]
18 Interactive Consistency Protocol: External View
- A PE sends its data to the ROBUS.
[Figure: the sending PE passes "data in" to the ROBUS.]
19 Interactive Consistency Protocol: External View
- The IC Protocol is executed in the ROBUS.
[Figure: the IC Protocol runs inside the ROBUS, between the PEs.]
20 Interactive Consistency Protocol: External View
- The ROBUS broadcasts data back out to the PEs.
[Figure: after the IC Protocol, the ROBUS passes "data out" to every PE, including the sender.]
21 Old Interactive Consistency Protocol: Internal View
[Figure: inside the ROBUS, BIU1, BIU2, and BIU3 each connect to a PE and to RMU1, RMU2, and RMU3; the sending BIU receives "data in" from its PE.]
22 Step 1. A BIU broadcasts data to the RMUs. If the BIU is good, the same value is broadcast to all RMUs.
[Figure: the sending BIU, given "data in" by its PE, sends data to RMU1, RMU2, and RMU3.]
23 Step 2. Each good RMU that receives data that isn't detectably faulty passes the data received back to each BIU. Otherwise, it sends source_error.
[Figure: RMU1 (good) sends data or source_error to BIU1, BIU2, and BIU3; similarly for RMUs 2 and 3.]
24 Step 3. Each BIU eliminates from its EV those RMUs that sent detectably faulty messages.
[Figure: BIU1 receives d, garbage, and d; RMU1 is labeled good and RMU2 is labeled benign faulty. BIUs 2 and 3 do likewise.]
25 Step 4. Each BIU takes a majority vote over the data sent by the RMUs in its EV.
[Figure: BIU1 votes d from the two values d that remain; BIUs 2 and 3 do likewise.]
26 Step 5. IF the majority of RMUs sent the same data, then it is sent to the BIU's PE. ELSE source_error is sent to the BIU's PE.
[Figure: BIU1, having voted d, sends d to its PE; BIUs 2 and 3 similarly send data.]
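The five steps can be sketched for a single receiving BIU as follows. This is a simplified model, not the PVS specification: the names are hypothetical, a detectably faulty message is modeled as the string "garbage", and "majority" is assumed to mean a strict majority of the RMUs remaining in the BIU's EV.

```python
from collections import Counter

GARBAGE = "garbage"            # any detectably faulty message
SOURCE_ERROR = "source_error"


def rmu_relay(received):
    """Step 2 for a good RMU: pass on the data unless it is detectably faulty."""
    return SOURCE_ERROR if received == GARBAGE else received


def biu_vote(relayed, ev):
    """Steps 3-5 for one BIU.

    `relayed` maps each RMU to the message this BIU received from it;
    `ev` is the BIU's set of eligible voters before the round.
    """
    # Step 3: drop RMUs whose message is detectably faulty.
    ev = {r for r in ev if relayed[r] != GARBAGE}
    # Step 4: tally the values sent by the remaining eligible voters.
    tally = Counter(relayed[r] for r in ev)
    if not tally:
        return SOURCE_ERROR
    value, count = tally.most_common(1)[0]
    # Step 5: deliver the value only if a strict majority of the EV sent it.
    return value if 2 * count > len(ev) else SOURCE_ERROR


rmus = ["RMU1", "RMU2", "RMU3"]
sent = {r: "d" for r in rmus}                     # step 1, good sender
relayed = {r: rmu_relay(sent[r]) for r in rmus}   # step 2, if every RMU were good
relayed["RMU2"] = GARBAGE                         # RMU2 is benign faulty instead
result = biu_vote(relayed, ev=set(rmus))
```

In this run the BIU discards RMU2 and votes over the two remaining copies of d.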
27 IC Protocol Guarantees
- Validity: If the broadcasting BIU is good, not convicted, and sends data d, then the result of the vote for a good BIU is d.
- Agreement: Any two good BIUs vote the same result for the broadcast value (even if the sender is asymmetric!).
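Both guarantees can be written as predicates over the outcome of one protocol round. A minimal sketch, assuming the round's results are given as a dict from BIU name to voted value; the function names and argument layout are hypothetical, and validity is read as applying to every good BIU.

```python
def validity_holds(sender_good, sender_convicted, sent, votes, good_bius):
    """Validity: a good, unconvicted sender's data is the vote of every good BIU."""
    if not sender_good or sender_convicted:
        return True  # the guarantee only constrains good, unconvicted senders
    return all(votes[b] == sent for b in good_bius)


def agreement_holds(votes, good_bius):
    """Agreement: all good BIUs vote the same result."""
    return len({votes[b] for b in good_bius}) <= 1


# Hypothetical outcome: BIU3 is faulty, so its vote is not constrained.
votes = {"BIU1": "d", "BIU2": "d", "BIU3": "x"}
good_bius = {"BIU1", "BIU2"}
ok_validity = validity_holds(True, False, "d", votes, good_bius)
ok_agreement = agreement_holds(votes, good_bius)
```

Note that agreement says nothing about which value is chosen; with a faulty sender, any common result (including source_error) satisfies it.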
29 Old Assumptions (to ensure guarantees hold)
- Environment Assumptions
- The Maximum Fault Assumption (MFA)
  - There are more good BIUs than symmetric and asymmetric BIUs combined.
  - Similarly for the RMUs.
  - There are either no asymmetric BIUs or no asymmetric RMUs.
- System Assumptions
  - Symmetric Agreement: If a node is not asymmetric, then all good nodes assign it the same accusation.
  - Good Trusting: Good nodes aren't accused by good nodes.
  - Conviction Agreement: All good nodes have the same convictions.
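The MFA is a condition on the global fault classification alone, so it can be checked directly from an assignment of classes to nodes. A minimal sketch, assuming classes are given as strings in per-kind dicts (the names are hypothetical):

```python
from collections import Counter


def majority_good(classes):
    """More good nodes than symmetric and asymmetric nodes combined."""
    n = Counter(classes.values())
    return n["good"] > n["symmetric"] + n["asymmetric"]


def mfa_holds(biu_classes, rmu_classes):
    """Check the three MFA clauses for a given assignment of fault classes."""
    no_asym_bius = "asymmetric" not in biu_classes.values()
    no_asym_rmus = "asymmetric" not in rmu_classes.values()
    return (majority_good(biu_classes)
            and majority_good(rmu_classes)
            and (no_asym_bius or no_asym_rmus))


bius = {"BIU1": "good", "BIU2": "asymmetric", "BIU3": "good"}
rmus_ok = {"RMU1": "good", "RMU2": "benign", "RMU3": "good"}
rmus_bad = {"RMU1": "good", "RMU2": "good", "RMU3": "asymmetric"}
accepted = mfa_holds(bius, rmus_ok)
rejected = mfa_holds(bius, rmus_bad)
```

Benign nodes do not count against the majority, and any configuration with both an asymmetric BIU and an asymmetric RMU is rejected regardless of accusations.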
30 Validity: Proof Sketch
Assume the broadcasting BIU is good and sends data d.
[Figure: the sender (good) sends d to RMU1, RMU2, and RMU3.]
31 Validity: Proof Sketch
Thus, all good RMUs send d back to the BIUs.
[Figure: RMU1 (good) sends d to BIU1, BIU2, and BIU3; similarly for RMUs 2 and 3.]
32 Validity: Proof Sketch
Each good BIU filters out the bad messages received. By the MFA, a majority of the RMUs remaining in its EV are good.
[Figure: BIU1 receives d, garbage, and d; similarly for BIUs 2 and 3.]
33 Validity: Proof Sketch
Since all good RMUs sent d, the result of the vote yields d. q.e.d.
[Figure: BIU1 votes d from the two values d that remain.]
34 Agreement: Proof Sketch
Either the broadcasting BIU is asymmetric or not. Suppose it is.
[Figure: the sender (asymmetric) sends d, d', and d'' to different RMUs.]
35 Agreement: Proof Sketch
Then no RMU is asymmetric, by the MFA. So every RMU sends the same data to every BIU.
[Figure: BIU1 receives the values x, y, and z from the RMUs; BIUs 2 and 3 receive the same values.]
36 Agreement: Proof Sketch
Since no RMU is asymmetric, by symmetric agreement, the EV of each good BIU is the same. Thus, the result of the vote for each good BIU is the same.
[Figure: BIU1 receives the values x, y, and z from the RMUs; BIUs 2 and 3 receive the same values.]
37 Agreement: Proof Sketch
For the other case, suppose the sending BIU is not asymmetric.
[Figure: the sender (not asymmetric) sends the same value d to RMU1, RMU2, and RMU3.]
38 Agreement: Proof Sketch
Most of the RMUs are good, by the MFA. Since all good RMUs received the same values, they send the same values.
[Figure: RMU1 and RMU3 are labeled good and each send x; BIU1 and BIU3 are labeled good.]
39 Agreement: Proof Sketch
By good trusting, no good BIU accuses a good RMU. Since most RMUs are good, a majority of the RMUs in the EV of each good BIU are good, after filtering benign RMUs.
[Figure: BIU1 and BIU3 (both good) each receive x from RMU1 and RMU3 (both good).]
40 Agreement: Proof Sketch
Thus, the result of the votes will be the same for all good BIUs. q.e.d.
[Figure: BIU1 and BIU3 (both good) each receive x from RMU1 and RMU3 (both good).]
42 New Assumptions (to reason about reintegration)
- Environment Assumptions
- The Dynamic Maximum Fault Assumption (DMFA)
  - For each good BIU, its EV consists of more good RMUs than symmetric and asymmetric RMUs combined.
  - Similarly for good RMUs.
  - Either no asymmetric RMU is in the EV of a good BIU or no asymmetric BIU is in the EV of a good RMU.
- System Assumptions
  - Symmetric Agreement: If a node is not asymmetric, then all good nodes assign it the same accusation.
  - Good Trusting: Good nodes aren't accused by good nodes.
  - Conviction Agreement: All good nodes have the same convictions.
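Unlike the MFA, the DMFA depends on each good node's eligible voters, not just on the global fault classes. A minimal sketch of the three clauses, assuming fault classes are strings and each node's EV is a set of names (all identifiers and the example configuration are hypothetical):

```python
def ev_majority_good(ev, classes):
    """More good nodes in `ev` than symmetric and asymmetric nodes combined."""
    good = sum(classes[n] == "good" for n in ev)
    bad = sum(classes[n] in ("symmetric", "asymmetric") for n in ev)
    return good > bad


def dmfa_holds(biu_classes, rmu_classes, ev):
    """Check the three DMFA clauses.

    `ev[node]` is the eligible-voter set of `node`: RMUs for a BIU, BIUs for an RMU.
    """
    good_bius = [b for b, c in biu_classes.items() if c == "good"]
    good_rmus = [r for r, c in rmu_classes.items() if c == "good"]
    clause1 = all(ev_majority_good(ev[b], rmu_classes) for b in good_bius)
    clause2 = all(ev_majority_good(ev[r], biu_classes) for r in good_rmus)
    no_asym_rmu_trusted = all(
        rmu_classes[r] != "asymmetric" for b in good_bius for r in ev[b])
    no_asym_biu_trusted = all(
        biu_classes[b] != "asymmetric" for r in good_rmus for b in ev[r])
    return clause1 and clause2 and (no_asym_rmu_trusted or no_asym_biu_trusted)


biu_classes = {"BIU1": "good", "BIU2": "asymmetric", "BIU3": "good"}
rmu_classes = {"RMU1": "good", "RMU2": "good", "RMU3": "asymmetric"}
all_rmus = set(rmu_classes)
ev = {
    "BIU1": all_rmus, "BIU3": all_rmus,      # good BIUs trust every RMU
    "RMU1": {"BIU1", "BIU3"},                # good RMUs accuse BIU2
    "RMU2": {"BIU1", "BIU3"},
}
holds = dmfa_holds(biu_classes, rmu_classes, ev)
```

This configuration contains both an asymmetric BIU and an asymmetric RMU, yet satisfies the DMFA because no good RMU has the asymmetric BIU in its EV.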
43 Agreement Breaks! (Under the New Assumptions)
(courtesy of Wilfredo)
Suppose the sender is asymmetric, but is in the EV of no good RMU. Suppose there is an asymmetric RMU in the EV of both good BIUs. This satisfies the DMFA.
[Figure: BIU2 is the asymmetric sender and sends d, d', and d'' to the RMUs. RMU1 and RMU2 are good and accuse BIU2; RMU3 is asymmetric. BIU1 and BIU3 are good and trust all RMUs.]
44 Agreement Breaks! (Under the New Assumptions)
The two good RMUs relay the values received, and since RMU3 can relay arbitrary data, it sends d to BIU1 and d' to BIU3.
[Figure: BIU1 receives d, d', and d; BIU3 receives d, d', and d'.]
45 Agreement Breaks! (Under the New Assumptions)
The results of the votes of the two good BIUs, BIU1 and BIU3, differ. Agreement is violated!
[Figure: BIU1 receives d, d', and d and votes d; BIU3 receives d, d', and d' and votes d'.]
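The counterexample can be replayed as a short simulation. This sketch assumes, as the figure labels suggest, that BIU2 is the asymmetric sender, BIU1 and BIU3 are the good BIUs, RMU1 and RMU2 are good, and RMU3 is asymmetric; the `vote` helper and the strict-majority rule are assumptions of the sketch.

```python
from collections import Counter

SOURCE_ERROR = "source_error"


def vote(values):
    """Strict-majority vote over the values received from eligible voters."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else SOURCE_ERROR


# BIU2, the asymmetric sender, sends a different value to each RMU.
sent = {"RMU1": "d", "RMU2": "d'", "RMU3": "d''"}

# Old protocol: good RMUs relay what they received, even though they accuse BIU2.
# RMU3 is asymmetric, so it may send a different value to each BIU.
relayed = {
    "BIU1": {"RMU1": sent["RMU1"], "RMU2": sent["RMU2"], "RMU3": "d"},
    "BIU3": {"RMU1": sent["RMU1"], "RMU2": sent["RMU2"], "RMU3": "d'"},
}

# Both good BIUs trust all three RMUs, so every relayed value is counted.
votes = {biu: vote(list(msgs.values())) for biu, msgs in relayed.items()}
```

The asymmetric RMU acts as the tie-breaker between the two honestly relayed values, and it breaks the tie differently for each good BIU.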
47 Revised IC Protocol
- In the new IC Protocol, the RMUs relay source_error when
  - They receive bad messages, or
  - They accuse the sender.
The revised IC protocol satisfies both validity and agreement (verified in PVS).
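Replaying the same scenario with the revised relay rule illustrates why the change helps. This is a check of one scenario under the assumptions of the sketch (hypothetical names, a strict-majority vote, and either condition alone triggering source_error); it is not a substitute for the PVS proof.

```python
from collections import Counter

GARBAGE = "garbage"
SOURCE_ERROR = "source_error"


def revised_relay(received, accuses_sender):
    """Good RMU, revised rule: source_error on a bad message or an accused sender."""
    if received == GARBAGE or accuses_sender:
        return SOURCE_ERROR
    return received


def vote(values):
    """Strict-majority vote over the values received from eligible voters."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else SOURCE_ERROR


# Values the asymmetric sender gave to the two good RMUs, which both accuse it.
sent = {"RMU1": "d", "RMU2": "d'"}
good_relay = {r: revised_relay(v, accuses_sender=True) for r, v in sent.items()}

# The asymmetric RMU3 still sends different values to the two good BIUs.
votes = {
    "BIU1": vote([good_relay["RMU1"], good_relay["RMU2"], "d"]),
    "BIU3": vote([good_relay["RMU1"], good_relay["RMU2"], "d'"]),
}
```

Both good RMUs now send the same source_error, so the asymmetric RMU is outvoted at every good BIU.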
48 Formal Verification: Why Level 3 Verification?
- A math proof is proof enough, right?
- Level 3 verification can require significant time to complete.
In other words...
49 Using PVS
50 Formal Verification: Why Level 3 Verification?
- A math proof is proof enough, right?
- Level 3 verification can require orders of magnitude more time to complete than level 1 or level 2 verification.
- But...
  - Proofs for fault-tolerant protocols for distributed architectures are tedious and large (there are nearly 400 lemmas and theorems in our current unfinished set of proofs).
  - Proofs are not checked by a community of mathematicians like other mathematical results are.
In other words...
51 You don't have to be a Laurel or Hardy to make an oversight in an informal proof.
Small changes in assumptions can invalidate guarantees.
54 Some Goals and Current Work (in verifying SPIDER)
- Robust Specifications/Proofs
  - Hold for arbitrary configurations of SPIDER
  - Hold for all accusation and conviction policies satisfying the system requirements
- Specification/Proof Reuse (Economic specs/proofs)
- Specification/Proof Hierarchy
  - Property specifications
  - Relational specifications
  - Functional composition specifications
  - State machine specifications
55 References
- SPIDER Homepage: http://shemesh.larc.nasa.gov/fm/fm-now-spider.html
- PVS Homepage: http://pvs.csl.sri.com/
- Butler, Ricky et al. NASA Langley's Research and Technology-Transfer Program in Formal Methods. 2000. Available at http://shemesh.larc.nasa.gov/fm/fm-welcome.html
- Rushby, John. Formal Methods and Digital Systems Validation for Airborne Systems. NASA Contractor Report 4551. 1993. Available at http://www.csl.sri.com/papers/csl-93-7/