Title: Speculative Execution in a Distributed File System
1Speculative Execution in a Distributed File System
- Ed Nightingale
- Peter Chen
- Jason Flinn
- University of Michigan
2Motivation
- Why are distributed file systems slow(er)?
- Synchronous network messages provide consistency
- Synchronous disk writes provide safety
- Sacrifice guarantees for speed
- Can a DFS be safe, consistent and fast?
- Yes! With OS support for speculative execution
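3Big Idea
Slow way (Client and Server):
- Client sends RPC Req
- Block!
- Server returns RPC Resp
- Client sends the next RPC Req and waits for its RPC Resp
Speculator (Client and Server):
- 1) Checkpoint, then send RPC Req
- 2) Speculate! Keep executing while the RPC Resp is outstanding
- 3) Correct?
- Yes: discard checkpoint
- No: restore process and re-execute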
- Guarantees without blocking I/O!
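The checkpoint, speculate, verify cycle on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the kernel implementation: `speculative_call`, `fetch_actual` and `continuation` are hypothetical names, the process state is modelled as a dictionary, and the RPC response is fetched sequentially rather than arriving asynchronously.

```python
import copy

def speculative_call(state, predicted, fetch_actual, continuation):
    """Run `continuation` on a predicted RPC reply; roll back if the guess was wrong.

    All names here are hypothetical and only illustrate the slide's three steps.
    """
    checkpoint = copy.deepcopy(state)   # 1) checkpoint the process state
    continuation(state, predicted)      # 2) speculate on the predicted reply
    actual = fetch_actual()             # the real RPC response arrives
    if actual == predicted:             # 3) correct?
        return state                    # yes: discard the checkpoint
    state = checkpoint                  # no: restore the checkpoint...
    continuation(state, actual)         # ...and re-execute with the real reply
    return state
```

In the failing case the effects of the wrong guess vanish with the discarded state, which is why the checkpoint has to be cheaper than the network round trip it hides.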
4Conditions for Success
- Operations are highly predictable
- Conflicts are rare
- Checkpoints are cheaper than network I/O
- 52 µs for small process
- Computers have resources to spare
- Need memory and CPU cycles for speculation
5Outline
- Motivation
- Implementing speculation
- Multi-process speculation
- Using Speculator
- Evaluation
6Implementing Speculation
1) System call
2) Create speculation
Time
Process
Checkpoint
7Speculation Success
1) System call
2) Create speculation
3) Commit speculation
Time
Process
Checkpoint
8Speculation Failure
2) Create speculation
1) System call
3) Fail speculation
Time
Process
Process
Checkpoint
9Ensuring Correctness
- Spec processes often affect external state
- Three ways to ensure correct execution
- Block
- Buffer
- Propagate speculations (dependencies)
10System Calls
- Block calls that externalize state
- Allow read-only calls (e.g. getpid)
- Allow calls that modify only task state (e.g. dup2)
- File system calls -- need to dig deeper
- Mark file systems that support Speculator
- getpid: call sys_getpid()
- reboot: block until specs resolved
- mkdir: allow only if fs supports Speculator
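The per-call policy on this slide can be written as a small decision function. This is a hedged sketch: the call sets, the `Action` enum and `syscall_policy` are hypothetical, and only the calls named on the slides are listed.

```python
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block until speculations resolve"

# Hypothetical classification, limited to calls named on the slides.
READ_ONLY = {"getpid"}
TASK_STATE_ONLY = {"dup2"}
FILE_SYSTEM = {"mkdir", "stat"}

def syscall_policy(name, is_speculative, fs_supports_speculator=False):
    """Decide what to do when a process issues system call `name`."""
    if not is_speculative:
        return Action.ALLOW                  # nothing to protect
    if name in READ_ONLY or name in TASK_STATE_ONLY:
        return Action.ALLOW                  # no external state is touched
    if name in FILE_SYSTEM and fs_supports_speculator:
        return Action.ALLOW                  # the file system tracks the dependency
    return Action.BLOCK                      # e.g. reboot externalizes state
```

Blocking is the default here, so any call that has not been explicitly classified is treated as externalizing state.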
11Output Commits
1) sys_stat
2) sys_mkdir
3) Commit speculation
Time
Process
stat worked
Checkpoint
Checkpoint
mkdir worked
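Buffering output until the speculations it depends on commit can be sketched as follows. The `OutputBuffer` class and its method names are hypothetical. The sketch assumes output must become visible in program order, and that a failed speculation also discards every speculation started after it, since the process rolls back to the earlier checkpoint.

```python
class OutputBuffer:
    """Hold output written by a speculative process until it is safe to show."""

    def __init__(self):
        self.unresolved = []   # speculation ids, oldest first
        self.pending = []      # (set of speculation ids, text), in program order
        self.visible = []      # output that has been released

    def begin(self, spec_id):
        self.unresolved.append(spec_id)

    def write(self, text):
        # Output depends on every speculation that is unresolved right now.
        self.pending.append((set(self.unresolved), text))
        self._release()

    def commit(self, spec_id):
        self.unresolved.remove(spec_id)
        for deps, _ in self.pending:
            deps.discard(spec_id)
        self._release()

    def fail(self, spec_id):
        # Rolling back to this checkpoint also drops later speculations.
        del self.unresolved[self.unresolved.index(spec_id):]
        self.pending = [(d, t) for d, t in self.pending if spec_id not in d]
        self._release()

    def _release(self):
        # Release in order; stop at the first line that is still speculative.
        while self.pending and not self.pending[0][0]:
            self.visible.append(self.pending.pop(0)[1])
```

With the slide's sequence, "stat worked" becomes visible when the stat speculation commits, and "mkdir worked" stays buffered until the mkdir speculation commits as well.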
12Multi-Process Speculation
- Processes often cooperate
- Example: make forks children to compile, link, etc.
- Would block if speculation limited to one task
- Allow kernel objects to have speculative state
- Examples: inodes, signals, pipes, Unix sockets, etc.
- Propagate dependencies among objects
- Objects rolled back to prior states when specs fail
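Dependency propagation with rollback can be sketched with per-speculation undo logs. This is an illustrative model only: `Obj`, `Speculation`, `start`, `propagate`, `commit` and `fail` are hypothetical names, object state is a single value, and interactions between a failed speculation and others that have already committed are ignored.

```python
class Obj:
    """A process or kernel object (inode, pipe, ...) that may hold speculative state."""
    def __init__(self, name, value=None):
        self.name = name
        self.value = value
        self.deps = set()      # speculations this object's state depends on

class Speculation:
    def __init__(self, name):
        self.name = name
        self.undo = []         # (object, old value, old deps), oldest first

def start(proc, name):
    """Begin a speculation in `proc`, saving its state as the checkpoint."""
    spec = Speculation(name)
    spec.undo.append((proc, proc.value, set(proc.deps)))
    proc.deps.add(spec)
    return spec

def propagate(src, dst, new_value):
    """`src` changes `dst` (or `dst` reads from `src`): `dst` inherits src's deps."""
    for spec in src.deps - dst.deps:
        # Save dst's prior state the first time it depends on this speculation.
        spec.undo.append((dst, dst.value, set(dst.deps)))
    dst.deps |= src.deps
    dst.value = new_value

def commit(spec, objects):
    """The speculation succeeded: objects simply stop depending on it."""
    for obj in objects:
        obj.deps.discard(spec)

def fail(spec):
    """The speculation failed: roll every dependent object back, newest first."""
    for obj, old_value, old_deps in reversed(spec.undo):
        obj.value, obj.deps = old_value, old_deps
```

For example, if a speculative pid 8000 changes inode 3456 and pid 8001 then reads that inode, both the inode and pid 8001 pick up the dependency, and a failure rolls all three back.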
13Multi-Process Speculation
Checkpoint
Checkpoint
Checkpoint
Checkpoint
Checkpoint
pid 8001
pid 8000
Chown-1
Chown-1
Write-1
Write-1
inode 3456
14Multi-Process Speculation
- What we handle
- DFS objects, RAMFS, Ext3, Pipes & FIFOs
- Unix Sockets, Signals, Fork & Exit
- What we don't (i.e. we block)
- System V IPC
- Multi-process write-shared memory
15Outline
- Motivation
- Implementing speculation
- Multi-process speculation
- Using Speculator
- Evaluation
16Example: NFSv3 Linux
Client 1
Client 2
Server
Modify B
Write
Commit
Open B
Getattr
17Example: SpecNFS
Client 1
Client 2
Server
WriteCommit
Modify B
speculate
Getattr
Open B
speculate
Getattr
Open B
speculate
18Problem: Mutating Operations
Client 1: 1. cat foo > bar
Client 2: 2. cat bar
- bar depends on cat foo
- What does client 2 view in bar?
19Solution: Mutating Operations
- Server determines speculation success/failure
- State at server never speculative
- Send server the hypothesis each speculation is based on
- List of speculations an operation depends on
- Requires server to track failed speculations
- Requires in-order processing of messages
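A server-side check along these lines might look like the sketch below. It is an assumption-laden illustration, not the SpecNFS protocol: the `Server` class, the message fields (`seq`, `spec_id`, `depends_on`, `assumed`, `writes`) and the per-client sequence numbers are all hypothetical.

```python
class Server:
    """Server state is never speculative; the server decides success or failure."""

    def __init__(self, files=None):
        self.files = dict(files or {})
        self.failed = set()    # (client, spec_id) pairs known to have failed
        self.next_seq = {}     # next expected sequence number per client

    def handle(self, client, seq, spec_id, depends_on, assumed, writes):
        """Apply a mutating operation only if its hypothesis still holds.

        assumed: {path: content the client speculated on}
        writes:  {path: new content to store}
        """
        if seq != self.next_seq.get(client, 0):
            raise ValueError("messages must be processed in order")
        self.next_seq[client] = seq + 1

        depends_on_failed = any((client, d) in self.failed for d in depends_on)
        hypothesis_holds = all(self.files.get(p) == v for p, v in assumed.items())
        if depends_on_failed or not hypothesis_holds:
            self.failed.add((client, spec_id))   # remember the failure
            return "failed"

        self.files.update(writes)                # safe: nothing speculative is stored
        return "ok"
```

Because messages are handled in order, a later operation that lists a failed speculation in `depends_on` is rejected before it can touch server state.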
20Group Commit
- Previously sequential ops now concurrent
- Sync ops usually committed to disk
- Speculator makes group commit possible
Client
Client
Server
Server
write
commit
write
commit
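The benefit of group commit can be shown with a toy disk that counts synchronous writes. `Disk`, `commit_each` and `group_commit` are hypothetical names, and the model ignores everything except the number of sync operations.

```python
class Disk:
    """Toy disk that counts how many synchronous writes it performs."""
    def __init__(self):
        self.syncs = 0
        self.durable = []

    def sync(self, records):
        self.syncs += 1                  # one expensive synchronous disk write
        self.durable.extend(records)

def commit_each(disk, ops):
    """Sequential clients: each commit waits for its own disk write."""
    for op in ops:
        disk.sync([op])

def group_commit(disk, ops):
    """Concurrent speculative ops: the server batches them into one write."""
    disk.sync(list(ops))
```

Both paths leave the same records on disk; the grouped path pays for one synchronous write instead of one per operation.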
21Putting it all Together: SpecNFS
- Apply Speculator to an existing file system
- Modified NFSv3 in Linux 2.4 kernel
- Same RPCs issued (but many now asynchronous)
- SpecNFS has same consistency, safety as NFS
- Getattr, lookup, access speculate if data in cache
- Create, mkdir, commit, etc. always speculate
22Putting it all Together: BlueFS
- Design a new file system for Speculator
- Single copy semantics
- Synchronous I/O
- Each file, directory, etc. has version number
- Incremented on each mutating op (e.g. on write)
- Checked prior to all operations.
- Many ops speculate and check version async
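The version-number check can be sketched as below. This is not the BlueFS implementation: `VersionedServer`, `speculative_read` and the cache layout are hypothetical, and the version check runs inline where a real client would continue executing while the check is in flight.

```python
class VersionedServer:
    """Each file carries a version number that is bumped on every mutating op."""
    def __init__(self):
        self.files = {}                  # path -> (content, version)

    def write(self, path, content):
        _, version = self.files.get(path, (None, 0))
        self.files[path] = (content, version + 1)

    def read(self, path):
        return self.files[path]

    def check_version(self, path, version):
        return path in self.files and self.files[path][1] == version

def speculative_read(server, cache, path):
    """Use the cached copy speculatively; fall back to the server if it is stale."""
    status = "fetched"
    if path in cache:
        content, version = cache[path]
        # A real client would keep running with `content` during this check.
        if server.check_version(path, version):
            return content, "speculation committed"
        status = "speculation rolled back"
    content, version = server.read(path)
    cache[path] = (content, version)
    return content, status
```

A cache hit with a matching version costs no blocking round trip; a stale version forces a rollback and a fresh read.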
23Outline
- Motivation
- Implementing speculation
- Multi-process speculation
- Using Speculator
- Evaluation
24Apache Benchmark
- SpecNFS up to 14 times faster
25The Cost of Rollback
- All files out of date: SpecNFS up to 11x faster
26Conclusion
- Speculator greatly improves performance of existing distributed file systems
- Speculator enables new file systems to be safe, consistent and fast
27Group Commit Sharing State
28Apache Benchmark
29Related Work
- Chang & Gibson, Fraser & Chang
- Speculative pre-fetching
- Time Warp
- Virtual Time distributed simulations
- Hardware branch prediction
- Transactional file systems