Title: SPECULATIVE EXECUTION IN A DISTRIBUTED FILE SYSTEM
1 SPECULATIVE EXECUTION IN A DISTRIBUTED FILE SYSTEM
- E. B. Nightingale, P. M. Chen, J. Flinn
- University of Michigan
2 Motivation
- Distributed file systems are often much slower than local file systems
  - Due to synchronous operations required for cache coherence and data safety
- Even true for file systems that weaken consistency and safety guarantees
  - Close-to-open consistency for AFS and most versions of NFS
3 A better solution
- Most of these synchronous operations have predictable outcomes
- We can bet on the outcome and let the client process go forward (speculation)
  - Makes the operation asynchronous
- Must first take a checkpoint of the process
  - Can roll back and restart the operation if the speculation fails
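The bet-then-verify idea can be sketched in a few lines. This is only a toy, user-level illustration: `speculate`, `record_reply` and the dictionary used as process state are hypothetical names, and `copy.deepcopy` stands in for a real process checkpoint. In a real system the server reply would arrive asynchronously while the client keeps running.

```python
import copy

def speculate(state, predicted, remote_call, continuation):
    """Run `continuation` on a predicted result; roll back if the bet was wrong."""
    checkpoint = copy.deepcopy(state)   # stand-in for a process checkpoint
    continuation(state, predicted)      # client goes forward on the bet
    actual = remote_call()              # the real reply (asynchronous in practice)
    if actual == predicted:
        return state, "committed"       # bet won: keep the speculative work
    state = checkpoint                  # bet lost: discard speculative work
    continuation(state, actual)         # redo the step with the real result
    return state, "rolled back"

def record_reply(state, reply):
    """Hypothetical continuation: the work the client does with the reply."""
    state["log"].append(reply)

ok_state, ok_outcome = speculate({"log": []}, "OK", lambda: "OK", record_reply)
bad_state, bad_outcome = speculate({"log": []}, "OK", lambda: "STALE", record_reply)
```

In the second call the prediction is wrong, so the state is restored from the checkpoint before the continuation runs again with the actual reply.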
4 Why it works
- Clients can correctly predict the outcome of many operations
  - Few concurrent accesses to files
- Time to take a lightweight checkpoint is often less than network round-trip time
  - 52 µs for a small process thanks to copy-on-write
- Most clients have free cycles
5 Speculator
- File system controls when speculations start, succeed and fail
- Speculator provides a mechanism to ensure correct execution of speculative code
- No application changes are required
- Speculative state is never visible from the outside
6 Correctness rules (I)
- A process that executes in speculative mode cannot externalize output
  - Speculator blocks the process
- Speculator tracks causal dependencies between kernel objects
  - Kernel objects modified by a speculative process will be put in a speculative state
7 Correctness rules (II)
- Speculator tracks causal dependencies between processes
  - Processes receiving a message or a signal from a speculative process will be checkpointed and become speculative
- In case of doubt, Speculator will block the execution of the speculative process
8-9 An example: conventional NFS
- Linux 2.4.21 NFSv3 implements close-to-open consistency
- At close time, the client sends to the server
  - Asynchronous write calls with the modified data
  - A synchronous commit call once it has received replies for all write calls
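The ordering at close time can be sketched with `asyncio`. This is a toy model, not NFS client code: `fake_rpc` and `nfs_close` are hypothetical names, and the sleep merely stands in for a network round trip.

```python
import asyncio

async def fake_rpc(log, name, delay_s=0.001):
    """Hypothetical stand-in for an RPC: wait one 'round trip', then record the reply."""
    await asyncio.sleep(delay_s)
    log.append(name)

async def nfs_close(dirty_blocks, log):
    # Write calls are issued together and proceed concurrently.
    await asyncio.gather(*(fake_rpc(log, f"write:{b}") for b in dirty_blocks))
    # The commit is sent only after every write reply has arrived,
    # and close does not return until the commit reply comes back.
    await fake_rpc(log, "commit")

rpc_log = []
asyncio.run(nfs_close([0, 1, 2], rpc_log))
```

Even with overlapped writes, the closing process waits for at least two round trips: one for the batch of writes and one for the commit.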
10-11 An example: SpecNFS
- All calls are non-blocking but force the calling process to become speculative
- If a call returns an unexpected result, the calling process is rolled back to its checkpoint and the call is executed again
  - A new speculation starts
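A toy model of this behavior, under stated assumptions: `SpecClient` and its methods are hypothetical, each call hands back a predicted result immediately, and a wrong prediction discards that call and everything issued after it (the caller is then expected to execute the call again under a new speculation).

```python
class SpecClient:
    """Toy client whose calls return at once with a predicted result."""

    def __init__(self):
        self.pending = []     # (call name, predicted result) awaiting server replies
        self.rollbacks = 0

    @property
    def speculative(self):
        return bool(self.pending)

    def call(self, name, predicted):
        # Non-blocking: remember the bet and hand the prediction to the caller.
        self.pending.append((name, predicted))
        return predicted

    def reply(self, name, actual):
        # The server reply arrives later; compare it with the bet.
        idx = next(i for i, (n, _) in enumerate(self.pending) if n == name)
        _, predicted = self.pending[idx]
        if actual == predicted:
            del self.pending[idx]      # this speculation succeeded
            return "commit"
        self.rollbacks += 1
        del self.pending[idx:]         # roll back to the checkpoint taken at this call
        return "rollback"

good = SpecClient()
good.call("getattr", "unchanged")
good.call("write", "ok")
was_speculative = good.speculative
good_results = [good.reply("getattr", "unchanged"), good.reply("write", "ok")]

bad = SpecClient()
bad.call("lookup", "found")
bad.call("write", "ok")
bad_result = bad.reply("lookup", "stale")
```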
12 Speculation interface
- Three new system calls
  - create_speculation()
    - Returns a unique spec_id and a list of previous speculations on which the speculation depends
  - commit_speculation(spec_id)
  - fail_speculation(spec_id)
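The shape of the interface can be mocked in user space. `SpeculatorMock` is a hypothetical class, not the kernel implementation. It assumes, for illustration only, that a new speculation depends on every earlier speculation that is still unresolved, and it ignores what happens to dependents when a speculation fails.

```python
import itertools

class SpeculatorMock:
    """User-level mock of the three calls; real ones are system calls."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.active = []      # unresolved speculations, oldest first
        self.outcome = {}     # spec_id -> "committed" or "failed"

    def create_speculation(self):
        spec_id = next(self._ids)
        depends_on = list(self.active)   # assumption: depend on all unresolved ones
        self.active.append(spec_id)
        return spec_id, depends_on

    def commit_speculation(self, spec_id):
        self.active.remove(spec_id)
        self.outcome[spec_id] = "committed"

    def fail_speculation(self, spec_id):
        self.active.remove(spec_id)
        self.outcome[spec_id] = "failed"

spec = SpeculatorMock()
first_id, first_deps = spec.create_speculation()
second_id, second_deps = spec.create_speculation()
spec.commit_speculation(first_id)
spec.fail_speculation(second_id)
```

A file system would call `create_speculation()` when it sends a request and one of the other two calls when the reply arrives.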
13 Implementing checkpoints
- Checkpoints are implemented through copy-on-write fork
  - Speculator also saves the state of any open file descriptor and copies all pending signals
- Forked child is not placed on the ready queue
  - It just waits
- If speculation fails, forked child assumes the identity of the failed parent
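A rough user-level analogue on POSIX systems uses `os.fork()`: the child is the checkpoint and blocks on a pipe until it learns the verdict. This is only a sketch. `run_with_checkpoint` and the exit code 42 are hypothetical, and a user-level child cannot take over the parent's process identity the way the kernel mechanism does.

```python
import os

def run_with_checkpoint(speculation_succeeds):
    """Fork a waiting child as a checkpoint; return the child's exit code."""
    r, w = os.pipe()
    pid = os.fork()                   # copy-on-write snapshot of the process
    if pid == 0:                      # child: the checkpoint
        os.close(w)
        verdict = os.read(r, 1)       # blocks here: the checkpoint just waits
        # 42 marks "speculation failed, the checkpoint would resume from here"
        os._exit(42 if verdict == b"F" else 0)
    os.close(r)
    # Parent: speculative execution would happen here, then the verdict arrives.
    os.write(w, b"C" if speculation_succeeds else b"F")
    os.close(w)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

committed_code = run_with_checkpoint(True)
failed_code = run_with_checkpoint(False)
```

On commit the checkpoint is simply discarded; on failure it is the copy that carries on.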
14 New kernel structures
- Speculation structure
  - Created during create_speculation()
  - Tracks the set of kernel objects that depend on the speculation
- Undo log
  - Associated with each kernel object that has a speculative state
  - Ordered list of speculative modifications
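The two structures might look like this in miniature. All names here (`KernelObject`, `Speculation`, `speculative_write`, `undo_speculation`) are hypothetical, and the undo policy is an assumption: a failure restores the value the object had before the failing speculation first touched it.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)                  # eq=False keeps identity-based hashing
class KernelObject:
    value: object
    undo_log: list = field(default_factory=list)    # ordered (spec_id, old value)

@dataclass(eq=False)
class Speculation:
    spec_id: int
    dependents: set = field(default_factory=set)    # objects that depend on it

def speculative_write(spec, obj, new_value):
    obj.undo_log.append((spec.spec_id, obj.value))  # remember the old value
    spec.dependents.add(obj)
    obj.value = new_value

def undo_speculation(spec):
    for obj in spec.dependents:
        ids = [sid for sid, _ in obj.undo_log]
        first = ids.index(spec.spec_id)             # first change under this speculation
        obj.value = obj.undo_log[first][1]          # restore the value from before it
        del obj.undo_log[first:]
    spec.dependents.clear()

inode = KernelObject(value="v0")
s1 = Speculation(spec_id=1)
speculative_write(s1, inode, "v1")
speculative_write(s1, inode, "v2")
log_before_failure = list(inode.undo_log)
undo_speculation(s1)
```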
15 Sharing checkpoints
- Letting successive speculations share the same checkpoint reduces the speculation overhead
- Two limitations
  - Speculator limits the amount of rollback work by not letting a speculation share a checkpoint that is more than 500 ms old
  - A speculation cannot share a checkpoint with a previous speculation that changes the state of the file system
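The two limitations amount to a simple predicate. The function and parameter names below are hypothetical; only the 500 ms bound comes from the slide.

```python
MAX_CHECKPOINT_AGE_MS = 500   # bound stated on the slide

def can_share_checkpoint(checkpoint_age_ms, previous_changed_fs_state):
    """Decide whether a new speculation may reuse an existing checkpoint."""
    if checkpoint_age_ms > MAX_CHECKPOINT_AGE_MS:
        return False          # too much work would be redone on rollback
    if previous_changed_fs_state:
        return False          # the previous speculation changed file system state
    return True

fresh_ok = can_share_checkpoint(120, previous_changed_fs_state=False)
too_old = can_share_checkpoint(750, previous_changed_fs_state=False)
after_mutation = can_share_checkpoint(120, previous_changed_fs_state=True)
```

When the predicate is false, a fresh checkpoint is taken instead.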
16 Correctness invariants
- Speculative state should never be visible to the user or to any external device
  - Speculator prevents all speculative processes from externalizing output to any interface
- A process should never view speculative state unless it is already speculatively dependent upon that state
17 Invariant implementations (I)
- First implementation: block speculative processes whenever they try to perform a system call
  - Always correct
  - Limits the amount of work that can be done by a process in a speculative state
18 Invariant implementations (II)
- Second implementation: allow speculative processes to perform system calls that
  - Do not modify state
    - Read-only calls such as getpid()
  - Only modify state that is private to the calling process
    - It will be rolled back if speculation fails
19 Invariant implementations (III)
- Third implementation: allow speculative processes to perform operations on files in speculative file systems
  - With VFS, can have multiple file systems on the same machine
    - Typically NFS plus FFS or ext3
  - Must check type of file system
    - Have a special bit in superblock
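The three implementations can be read as successive refinements of one decision rule. The sketch below is a simplification with hypothetical names. The call sets are illustrative choices, not the actual classification, and `fs_is_speculative` stands in for the superblock bit.

```python
READ_ONLY_CALLS = {"getpid", "getuid"}      # illustrative: modify no state
PRIVATE_STATE_CALLS = {"dup2"}              # illustrative: only private process state
FILE_CALLS = {"read", "write", "mkdir"}     # illustrative file operations

def syscall_policy(name, is_speculative, fs_is_speculative=False):
    """Return "allow" or "block" for a system call made by a process."""
    if not is_speculative:
        return "allow"                      # ordinary processes are unaffected
    if name in READ_ONLY_CALLS or name in PRIVATE_STATE_CALLS:
        return "allow"                      # second implementation
    if name in FILE_CALLS and fs_is_speculative:
        return "allow"                      # third implementation: check the FS bit
    return "block"                          # first implementation: in doubt, block

decisions = {
    "getpid": syscall_policy("getpid", True),
    "write_specfs": syscall_policy("write", True, fs_is_speculative=True),
    "write_localfs": syscall_policy("write", True, fs_is_speculative=False),
    "sendto": syscall_policy("sendto", True),
}
```

Blocking stays the default, so any call not explicitly covered remains safe.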
20 Multiprocess speculation (I)
- Whenever a speculative process P participates in interprocess communication with a process Q
  - Process Q must become speculatively dependent on the speculative state of process P and get checkpointed
21 Multiprocess speculation (II)
- Whenever a speculative process P modifies an object X
  - Object X must become speculatively dependent on the speculative state of process P and get an undo log
You are not responsible for the implementation details
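For intuition only, the two propagation rules can be sketched as follows. `Proc`, `Obj`, `send_message` and `modify` are hypothetical names, and dependencies are modeled as plain sets of speculation ids.

```python
class Proc:
    def __init__(self, name):
        self.name = name
        self.deps = set()         # speculation ids this process depends on
        self.checkpoints = 0      # how many times it has been checkpointed

class Obj:
    def __init__(self, value):
        self.value = value
        self.deps = set()
        self.undo_log = []        # old values, oldest first

def send_message(p, q):
    """P communicates with Q: Q inherits P's speculative dependencies."""
    new_deps = p.deps - q.deps
    if new_deps:
        q.checkpoints += 1        # checkpoint Q before it sees speculative state
        q.deps |= new_deps

def modify(p, x, new_value):
    """P modifies object X: X becomes dependent and records an undo entry."""
    if p.deps:
        x.undo_log.append(x.value)
        x.deps |= p.deps
    x.value = new_value

p, q, x = Proc("P"), Proc("Q"), Obj("old")
p.deps.add(1)                     # P is speculative on speculation 1
send_message(p, q)
send_message(p, q)                # no new dependency, so no second checkpoint
modify(p, x, "new")
```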
22-23 Performance: PostMark benchmark
- SpecNFS is
  - 2.5 times faster than NFS with no latency between client and server
  - 41 times faster than NFS with a 30 ms round-trip time delay between client and server
- A version of BlueFS providing single-copy semantics is 49 times faster than NFS with the same 30 ms round-trip time delay
24-25 Performance: Apache benchmark
- Building Apache server from a tarred file
- SpecNFS is
  - 2 times faster than NFS with no latency between client and server
  - 14 times faster than NFS with a 30 ms round-trip time delay between client and server
  - Always better than BlueFS and Coda
26-27 Performance: impact of rollbacks
- Repeated the Apache benchmark, marking a varying fraction of the files out-of-date
  - Will result in speculation failures
- Percentage of out-of-date files has little impact on SpecNFS performance
28-29 Performance: other
- Impact of group commits and sharing state
  - Mostly affects BlueFS
  - When speculative processes cannot propagate their state, BlueFS performs worse than NFS with no latency between client and server
  - Impact magnified at 30 ms latency
30 Conclusion
- Speculation enables the development of distributed file systems that are
  - Safe
  - Consistent
  - Fast
- Generic kernel support for speculative execution and causal dependency tracking could have many other applications