Title: Experiences with Formal Specifications of Fault-Tolerant File Systems
1Experiences with Formal Specifications of
Fault-Tolerant File Systems
- Roxana Geambasu (University of Washington)
- Andrew Birrell (Microsoft Research)
- John MacCormick (Dickinson College)
2Fault-Tolerant File Systems (FTFSs)
- FTFSs are crucial components in today's datacenters
  - They underlie most of what we do on the Web
- Dependability and correctness of FTFSs are paramount
[Figure: Web services (Google Earth, Google Analytics, Amazon services) shown with the storage systems Google File System (GFS), Niobe, and Dynamo]
3FTFSs Are Extremely Complex
- Contain sophisticated protocols for
  - replica consistency,
  - recovery (replica addition to compensate for failures),
  - reconfiguration (replica removal due to failure),
  - load balancing, etc.
- Hence, FTFS protocols and implementations are hard to get right
4Formal Methods (FM)
- Formal methods have been used extensively to increase trust in complex systems
  - Formal specification languages are unambiguous
  - Model checking and formal proofs are reliable
- However, FTFS designers still rely solely on prose and intuitive reasoning
  - Prose may be ambiguous or inaccurate
  - Intuitive reasoning may be faulty
5FTFS Design and Analysis Challenges
- Without formal methods, it is hard to
  - Understand FTFS behavior and semantics
    - Intuitive reasoning is hard and error-prone
  - Explore alternative designs
    - Alternative designs may affect semantics in complex ways
  - Compare various FTFSs
    - Prose is ambiguous and code bases are huge (tens of thousands of lines of code)
6Goal: Convince FTFS Builders to Use FM
- Previous studies showed how and for what purposes to use FM for many classes of systems, e.g.
  - Local/distributed FSs, processor caches, TCP congestion
- Our work
  - Shows how and for what purposes to use FM for another specific class of important systems: fault-tolerant file systems
  - Identifies convenient ways in which FM help in understanding, designing, and comparing FTFSs
7Our Experience
- We wrote TLA specifications for three protocols
  - Chain replication (Cornell University)
  - Niobe (Microsoft)
  - GFS (Google)
- Our experience shows that FM help solve FTFS challenges
  - Comparing system mechanisms and tradeoffs
  - Understanding and proving semantics
  - Exploring alternative designs
8Outline
- Specification effort
- Experiences with formal specifications for FTFS
  - Comparing system mechanisms
  - Understanding and proving semantics
  - Exploring alternative designs
- Conclusions
9Specification Effort
- Question: How hard is it to build specifications?
- Answer: Moderately precise specifications are reasonably easy to produce
111. Comparing System Mechanisms
- Case study: GFS vs. Niobe
- From prose, they seemed to be very different systems
  - GFS trades some consistency for throughput
  - Niobe is designed for strong consistency
- Our TLA specifications highlight significant mechanism overlap as well as key differences
12Capturing Similarities and Differences
- More than half of the TLA code base is common
- Specifications are small due to TLA expressiveness
  - Compare their total sizes to the tens of thousands of LOC of the systems' implementations
[Figure: a Common part (single-master, primary-secondary replication; 291 lines) shared by the Niobe and GFS specifications, whose system-specific parts are 287 lines and 189 lines]
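The "more than half" claim can be checked against the line counts in the figure. The following is a small sketch of that arithmetic, assuming each system's full specification is the common part plus its own system-specific part. The names are illustrative, and the check does not depend on which of the two system-specific counts belongs to which system.

```python
COMMON_LINES = 291            # shared part of the two specifications
SPECIFIC_LINES = (287, 189)   # system-specific parts, as listed in the figure


def common_fraction(common: int, specific: int) -> float:
    """Fraction of one system's full specification that is the shared part."""
    return common / (common + specific)


fractions = [common_fraction(COMMON_LINES, s) for s in SPECIFIC_LINES]

for specific, frac in zip(SPECIFIC_LINES, fractions):
    print(f"{COMMON_LINES} common + {specific} specific lines: {frac:.3f} common")
```

Under that assumption the shared part is slightly more than half of the larger specification and roughly 61% of the smaller one.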
13Differences Stand Out Clearly in TLA
- Example: Write completion in GFS and Niobe
[Figure: a write w sent to replicas 1-4, with ACKs returned]
14Differences Stand Out Clearly in TLA
- Example: Write completion in GFS and Niobe
[Figure: two side-by-side diagrams of a write w sent to replicas 1-4, with ACKs returned; one diagram includes a "Group reconfiguration" step]
15Understanding Tradeoffs
- Example: Write completion in GFS and Niobe
- Tradeoff
  - GFS: smaller latency, but writes may leave the group inconsistent
  - Niobe: a write never leaves the replica group in an inconsistent state
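The tradeoff can be made concrete with a toy model. The sketch below is not taken from either system: it is a hypothetical replica group in which one write policy simply reports an error when a replica fails to acknowledge, while the other removes the unresponsive replica from the group (a group reconfiguration) before completing the write. All class and function names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaGroup:
    # Maps a (hypothetical) replica id to the value it currently stores.
    values: dict = field(
        default_factory=lambda: {1: "old", 2: "old", 3: "old", 4: "old"}
    )


def write_without_reconfiguration(group, value, unreachable):
    """Apply the write where possible; report an error if any ACK is missing."""
    missing = unreachable & set(group.values)
    for rid in group.values:
        if rid not in unreachable:
            group.values[rid] = value
    return "error" if missing else "ok"


def write_with_reconfiguration(group, value, unreachable):
    """Drop replicas that did not ACK, then complete the write on the rest."""
    for rid in list(group.values):
        if rid in unreachable:
            del group.values[rid]      # group reconfiguration
        else:
            group.values[rid] = value
    return "ok"


def is_consistent(group):
    """True when every replica still in the group stores the same value."""
    return len(set(group.values.values())) <= 1


g1, g2 = ReplicaGroup(), ReplicaGroup()
r1 = write_without_reconfiguration(g1, "new", {3})
r2 = write_with_reconfiguration(g2, "new", {3})
```

In this toy model the first policy returns quickly but leaves replica 3 holding the old value, so a later read from that replica can observe stale data. The second policy pays for an extra reconfiguration step and ends with a smaller but consistent group.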
16Lesson: Formalism Helps in Comparison
- Formal specifications distill key differences and similarities between systems
- Understanding the key differences enables us to understand tradeoffs
182. Understanding FTFS Consistency
- Hard to prove consistency models for FTFSs
  - For weakly consistent systems, it can be even harder
- Solution: use refinement mapping
  1. Reduce the system to a really simple model (the SimpleStore)
  2. Prove the correctness of the reduction
  3. Reason about the SimpleStore
- For convenience, we use model checking instead of full manual proofs at Step 2
[Figure: the System is linked by a reduction to its SimpleStore; each is labeled with a consistency model]
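The idea of checking a reduction by exhaustive exploration can be illustrated with a toy example. The sketch below is not the authors' TLA specification or model checker. It is a hypothetical Python model in which a versioned single-value store plays the role of the simple model, a primary/secondary pair plays the role of the system, and `refinement_map` (an assumed mapping) says which part of the system state the client sees. The check requires that every system step maps either to no visible change or to one legal step of the simple model.

```python
from collections import deque

VALUES = ("a", "b")
MAX_VERSION = 3  # bound that keeps the toy state space finite


def simple_store_step(before, after):
    """Simple model: a write replaces the value and bumps the version by one."""
    (v0, _), (v1, x1) = before, after
    return v1 == v0 + 1 and x1 in VALUES


# System state: (primary, secondary, pending). primary and secondary are
# (version, value) pairs; pending holds writes not yet applied at the secondary.
INIT = ((0, "a"), (0, "a"), ())


def system_steps(state):
    primary, secondary, pending = state
    if primary[0] < MAX_VERSION:          # client write reaches the primary
        for x in VALUES:
            new = (primary[0] + 1, x)
            yield (new, secondary, pending + (new,))
    if pending:                           # oldest pending write reaches the secondary
        yield (primary, pending[0], pending[1:])


def lossy_steps(state):
    """Buggy variant: propagation jumps to the newest write and skips the rest."""
    primary, secondary, pending = state
    if primary[0] < MAX_VERSION:
        for x in VALUES:
            new = (primary[0] + 1, x)
            yield (new, secondary, pending + (new,))
    if pending:
        yield (primary, pending[-1], ())


def refinement_map(state):
    """Assumption: clients observe what the secondary has applied."""
    return state[1]


def check_refinement(init, steps, mapping, abstract_step):
    """Explore all reachable states; return a violating step or None."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        for t in steps(s):
            a, b = mapping(s), mapping(t)
            if a != b and not abstract_step(a, b):
                return (s, t)
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return None


ok = check_refinement(INIT, system_steps, refinement_map, simple_store_step)
bad = check_refinement(INIT, lossy_steps, refinement_map, simple_store_step)
```

Here `ok` is `None` because every step of the well-behaved system maps to a stutter or a single write of the simple store, whereas `bad` holds a concrete pair of states in which the buggy system skips a version. Once such a check passes, consistency arguments can be made about the small model instead of the full protocol.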
19SimpleStores for the Three FTFSs
- SimpleStores capture only client-visible behaviors and abstract out all protocol mechanisms
- SimpleStores are easy to reason about
[Figure: Chain, Niobe, and GFS each linked by a reduction to a SimpleStore: Chain_SS, Niobe_SS, and GFS_SS]
20Chain's Consistency Semantics
[Figure: Chain is linked by a reduction to Chain_SS; both are labeled linearizable. The proof is straightforward (half a page)]
- Using convenient methods, we gained reliable insight into Chain's consistency model
21Niobe's Consistency Semantics
[Figure: Chain and Niobe are linked by reductions to Chain_SS and Niobe_SS, with the systems and their SimpleStores labeled linearizable; GFS and GFS_SS are marked "??"]
- Similar experience as with Chain
- Thus, formal methods help in verifying standard consistency models for strongly consistent FTFSs
22GFS Consistency Semantics
- Formal methods proved helpful in several ways
- An interesting conclusion (details in the paper)
  - Using refinement mappings, we were able to show that, under a small set of assumptions, GFS has regular-register semantics
[Figure: GFS + assumptions is linked by a reduction to GFS_SS; both are labeled regular register, a well-defined intermediate-level consistency model]
23Lesson: Formalism Helps Understand Semantics
- Refinement mappings help in understanding and reliably verifying consistency models of FTFSs
- They are useful for both strongly consistent and weakly consistent FTFSs
253. Exploring Alternative Designs
- Exploring alternative designs is much easier using our framework (TLA specs, SimpleStores, reductions)
[Figure: the System model linked by a reduction to the System SimpleStore]
26Case Study: Changing Niobe's Design
- Currently, Niobe's clients read from the primary only
  - Reading from any replica may improve throughput
- Design question
  - What happens to Niobe if it adopts a read-any policy?
[Figure: Niobe_SS (linearizable) and GFS_SS (regular register) shown above the systems GFS + assumptions and Niobe + read-any; the reduction for Niobe + read-any is marked "?"; the systems are labeled regular register]
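One way to see why a read-any policy can weaken consistency is a toy snapshot of a replica group while a write is in flight. The sketch below is not Niobe's protocol: it is a hypothetical model, with invented names, that assumes the write has reached only the primary (index 0) and enumerates what two back-to-back reads could observe under each policy.

```python
from itertools import product


def reads_during_inflight_write(policy, n_replicas=3):
    """Return every (first, second) pair two consecutive reads could observe."""
    replicas = ["new"] + ["old"] * (n_replicas - 1)   # only the primary has the write
    targets = [0] if policy == "read-primary" else range(n_replicas)
    return {(replicas[i], replicas[j]) for i, j in product(targets, repeat=2)}


def has_new_old_inversion(pairs):
    """True if a read of the new value can be followed by a read of the old one."""
    return ("new", "old") in pairs


primary_only = reads_during_inflight_write("read-primary")
read_any = reads_during_inflight_write("read-any")
```

In this model, reading only from the primary never shows the old value after the new one, while reading from any replica can. A new-then-old inversion during a concurrent write is permitted by regular register semantics but not by linearizability.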
27Conclusions
- FTFSs are extremely important in today's Web
- We showed how formal methods can help improve our understanding of and trust in FTFSs
- Lessons from our experience with three FTFSs
  - Writing formal specifications is relatively easy
  - Formal methods enable
    - Insightful comparison of mechanisms and tradeoffs
    - Reliable verification of consistency properties
    - Convenient investigation of alternative designs
28Appendix
29Related Work
- FM are extensively used to reason about software [Bickford et al., 96] and hardware [Shimizu et al., 02]
  - However, FTFS builders have not adopted them yet
  - By sharing our experience, we hope to convince FTFS builders of the utility of specifying their systems formally
- Using FM to improve understanding of and trust in systems
  - Previous works apply FM to various classes of systems [Chkliaev et al., 00], [Crow et al., 98], [Joshi et al., 03], [Houston et al., 91]
  - The closest works are those looking at distributed FSs (AFS, Coda) [Sivathanu et al., 05], [Wing et al., 97], [Yang et al., 04]
  - We show how to apply them in the specific context of FTFSs
- Reducing complex systems to simple ones in order to reason about semantics has been used before [Joshi et al., 03]
  - We apply this method to FTFSs
30GFS Assumptions
- If
  - A write never crosses chunk boundaries
    - GFS client library offers chunk-level operations
  - A write never goes to a stale replica
    - Implement this assumption using a lease mechanism
- Then
[Figure: GFS + assumptions is linked by a reduction to GFS_SS; both are labeled regular register]
31Standard Consistency Models
- Linearizability (atomic register semantics)
  - Any client-visible history H generated by the system is equivalent to a legal sequential interleaving S
  - The sequential interleaving S preserves the real-time ordering of operations from H
- Serializability
  - Any client-visible history H generated by the system is equivalent to a legal sequential interleaving S
- Regular register semantics
  - A read not concurrent with any write returns the most recently written value
  - A read concurrent with some writes returns either the value of one of the in-progress writes or the most recently written value
- Safe register semantics
  - A read not concurrent with any write returns the most recently written value
  - A read concurrent with some writes can return anything
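The regular register definition above can be turned into a small history checker. The following is a minimal sketch, assuming a single writer (so writes never overlap each other and "most recently written" is well defined) and operations recorded with start and end times. The `Op` record and the function names are hypothetical.

```python
from typing import NamedTuple


class Op(NamedTuple):
    kind: str     # "write" or "read"
    value: str    # value written, or value returned by the read
    start: float
    end: float


def allowed_values(read, writes, initial):
    """Values a regular register may return for this read."""
    completed = [w for w in writes if w.end < read.start]
    concurrent = [w for w in writes
                  if w.start <= read.end and w.end >= read.start]
    latest = max(completed, key=lambda w: w.end).value if completed else initial
    return {latest} | {w.value for w in concurrent}


def is_regular(history, initial):
    """Check every read in the history against the regular register rule."""
    writes = [op for op in history if op.kind == "write"]
    return all(op.value in allowed_values(op, writes, initial)
               for op in history if op.kind == "read")


# A write of "new" spans time 0..10; two reads overlap it.
inversion = [
    Op("write", "new", 0, 10),
    Op("read", "new", 1, 2),
    Op("read", "old", 3, 4),    # old value seen after the new one
]
stale = [
    Op("write", "new", 0, 10),
    Op("read", "old", 11, 12),  # read after the write completed
]
regular_ok = is_regular(inversion, initial="old")
regular_bad = is_regular(stale, initial="old")
```

The first history is accepted even though a read of the new value is followed by a read of the old one, because both reads overlap the write. Linearizability would reject that ordering, which is the gap between the two models. The second history is rejected because the read starts after the write has completed.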
32Summary of Contributions
- Identified a new important class of extremely complex systems: FTFSs
- Showed three aspects of FTFS design and analysis for which FM prove especially valuable
  - Mechanism comparison, semantics understanding, and design space exploration
- Showed how to apply specific FMs to FTFSs
  - Showed how to construct SimpleStores and what can be learned from them
  - SimpleStores are reusable between systems
- We believe that our study, tailored toward FTFSs, can be more relevant to FTFS designers than more general studies
33Lessons from Our Experience
- Building high-level specifications for FTFSs is relatively easy
  - It is also remarkably useful for understanding the system
- The exercise of writing specifications exposes similarities in seemingly dissimilar systems (GFS, Niobe)
  - Formal specifications also distill the key design differences
- Specifications enable convenient verification of consistency for both strongly and weakly consistent systems
  - Niobe and Chain are both linearizable
  - GFS can be upgraded to regular register via a clear set of assumptions
  - GFS's design choice to read from any replica heavily influences its consistency
- Intuition can often fail
  - Niobe seemed to be reducible to Chain_SS, but actually was not
34Chain SimpleStore
[Figure: Chain_SS. Requests (writes, reads) and responses (writes, reads) pass through a read channel (r1, r2, r3; read()) and a write channel (w5, w6, w7; commit(w5), drop(w7)) around a SerialDB]
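The labels in the Chain_SS figure suggest a very small state machine. The sketch below is one possible reading of those labels, not the authors' specification: it assumes that pending writes wait in a write channel, that `commit` applies the oldest pending write to a single serial store, that `drop` discards a pending write without applying it, and that reads return the committed value. The class and method names are hypothetical.

```python
from collections import deque


class ChainSimpleStoreSketch:
    """Toy stand-in for a SimpleStore: one committed value plus pending writes."""

    def __init__(self, initial=None):
        self.serial_db = initial        # the single committed value
        self.write_channel = deque()    # pending writes, oldest first

    def request_write(self, value):
        """A client write enters the write channel."""
        self.write_channel.append(value)

    def commit(self):
        """Apply the oldest pending write to the serial store."""
        value = self.write_channel.popleft()
        self.serial_db = value
        return value

    def drop(self, value):
        """Discard a pending write without applying it."""
        self.write_channel.remove(value)

    def read(self):
        """Reads observe only committed state."""
        return self.serial_db


store = ChainSimpleStoreSketch(initial="w4")
for w in ("w5", "w6", "w7"):
    store.request_write(w)
committed = store.commit()   # applies w5
store.drop("w7")             # w7 is lost before being committed
```

Because every read goes through one serial store, reasoning about what clients can observe in a model of this shape is much simpler than reasoning about the replicated protocol it abstracts.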
35The Temporal Logic of Actions (TLA)
- Formalism that combines a temporal logic with a logic of actions
  - Especially designed for specification of distributed asynchronous systems
- TLA specifications model the system as a state machine
  - Define system variables (state)
  - Model actions that the system can take as state transitions
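The state-machine view can be mimicked in ordinary code. The sketch below is Python rather than TLA, and the two-variable system in it is invented purely for illustration: the state is a tuple of variables, each action yields the successor states it allows when enabled, and an invariant is checked over every reachable state.

```python
from typing import NamedTuple


class State(NamedTuple):
    primary: int     # version applied at a hypothetical primary
    secondary: int   # version applied at a hypothetical secondary


MAX_VERSION = 3      # bound that keeps the state space finite
INIT = State(0, 0)


def write(s):
    """Enabled only when the replicas agree; the primary moves one version ahead."""
    if s.primary == s.secondary and s.primary < MAX_VERSION:
        yield s._replace(primary=s.primary + 1)


def propagate(s):
    """Enabled when the secondary lags; it catches up with the primary."""
    if s.secondary < s.primary:
        yield s._replace(secondary=s.primary)


ACTIONS = (write, propagate)


def reachable(init):
    """Collect every state reachable from init through the actions."""
    seen, todo = {init}, [init]
    while todo:
        s = todo.pop()
        for action in ACTIONS:
            for t in action(s):
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
    return seen


def invariant(s):
    """The secondary never lags the primary by more than one version."""
    return 0 <= s.primary - s.secondary <= 1


states = reachable(INIT)
holds = all(invariant(s) for s in states)
```

A specification language expresses the same ingredients declaratively, and a model checker performs the exploration that `reachable` does by hand here.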
36Understanding Tradeoffs
- GFS: smaller write latency, but writes may leave the group inconsistent
- Niobe: a write never leaves the replica group in an inconsistent state
[Figure: two replica groups (replicas 1-4), each serving a read; labels: Error, Old value]