Title: Hypervisor-based Fault Tolerance
Slide 1: Hypervisor-based Fault Tolerance
- Thomas C. Bressoud
- Isis Distributed Systems
- Fred B. Schneider
- Cornell University
Slide 2: Outline
- Intro
- Related Works
- Replica Coordination Protocols
- A Prototype System
- Performance of the Prototype
- Q&A
Slide 3: 1. Intro
- Intro
- Popular scheme: replicate the computation and coordinate the replicas
- Three problems, depending on where replica coordination is implemented
- Hardware: hardware design cost
- OS: difficult, because OSes are complicated
- Software: each application must be modified
- Hypervisor: a software layer that implements virtual
machines having the same instruction set
architecture as the hardware
Slide 4: 1. Intro (cont.)
- Hypervisor advantages
- Multiple OSes coexisting on a single processor
- Isolation mechanism
- Addresses the three problems above
- Hardware independence
- OS independence
- Application independence
- Two issues addressed in this paper
- Practicality: design a protocol
- Performance: prototype experiments
Xiang Xiaojia
Department of Computer Science Slide 4
Slide 5: 2. Related Works
- Processors implementing replica coordination in hardware
- Tandem [CMJ88]
- DEC's VAXft 3000
- Application-level replica coordination
- Tandem [SS92]
- On top of the OS
- Fault tolerance under UNIX
Slide 6: 2. Related Works (cont.)
- NetWare [MPN92]
- Very similar to our approach
- Drawbacks
- Requires modifying OS internals
- Preemption is proscribed
- Not transparent
- Failovers are not masked
Slide 7: 3. Replica Coordination Protocols
- t-fault-tolerant model
- 1 primary, t backups
- While the primary works, no backup interacts with the environment
- If the primary fails, exactly one backup takes its place
- Assumptions
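
The t-fault-tolerant model above can be sketched in a few lines of Python. This is only an illustration under assumed names (`Replica`, `ReplicaGroup`, `emit`, `fail_primary` and `make_group` are hypothetical and do not come from the paper): one primary and t backups, output suppressed for every replica except the primary, and exactly one backup promoted when the primary fails.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Replica:
    name: str
    alive: bool = True


@dataclass
class ReplicaGroup:
    """One primary plus t backups (illustrative model only)."""
    primary: Replica
    backups: List[Replica] = field(default_factory=list)
    outputs: List[Tuple[str, str]] = field(default_factory=list)

    def emit(self, replica: Replica, data: str) -> bool:
        # Only the current, live primary interacts with the environment;
        # output attempted by a backup is suppressed.
        if replica is self.primary and replica.alive:
            self.outputs.append((replica.name, data))
            return True
        return False

    def fail_primary(self) -> Replica:
        # On primary failure, exactly one live backup takes its place.
        self.primary.alive = False
        for i, candidate in enumerate(self.backups):
            if candidate.alive:
                self.primary = self.backups.pop(i)
                return self.primary
        raise RuntimeError("more than t failures: no backup left to promote")


def make_group(t: int) -> ReplicaGroup:
    """Build a group with primary r0 and backups r1..rt."""
    return ReplicaGroup(Replica("r0"), [Replica(f"r{i}") for i in range(1, t + 1)])
```

With `make_group(2)`, the group survives two successive primary failures; a third call to `fail_primary` raises, which matches the meaning of t = 2.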
Slide 8: 3. Replica Coordination Protocols
- Identical instruction streams
- VM state: memory and registers that change only
with the execution of instructions by that VM
- Two kinds of instructions
- Ordinary instructions: behavior is determined by the VM state
- Environment instructions: on the contrary, behavior is not determined by the VM state alone
- Assumptions
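
To make the distinction between ordinary and environment instructions concrete, here is a minimal sketch of a toy virtual machine. The instruction names (`set`, `add`, `read_env`) and the `execute` helper are hypothetical and chosen only for illustration. Ordinary instructions depend only on the VM state, so replicas starting from the same state agree on them automatically. An environment instruction reads a value from outside the VM, so the backup stays identical only if it is given the value the primary observed.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def execute(program: Sequence[tuple],
            read_env: Callable[[], int]) -> Tuple[Dict[str, int], List[int]]:
    """Run a toy program; return the final registers and the environment values read."""
    regs = {"r0": 0, "r1": 0}
    env_log: List[int] = []
    for op, reg, *args in program:
        if op == "set":          # ordinary: result depends only on VM state
            regs[reg] = args[0]
        elif op == "add":        # ordinary
            regs[reg] += args[0]
        elif op == "read_env":   # environment: result comes from outside the VM
            value = read_env()
            env_log.append(value)
            regs[reg] = value
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs, env_log


PROGRAM = [("set", "r0", 5), ("add", "r0", 2), ("read_env", "r1"), ("add", "r1", 1)]

# The primary reads the real environment (here a stand-in constant).
primary_regs, primary_log = execute(PROGRAM, lambda: 100)

# An uncoordinated backup would read its own environment and diverge.
diverged_regs, _ = execute(PROGRAM, lambda: 999)

# A coordinated backup replays the values the primary recorded.
replay = iter(primary_log)
backup_regs, _ = execute(PROGRAM, lambda: next(replay))
```

In this sketch the replayed log plays the role of the hypervisor forwarding environment results from the primary to the backup; the real protocol is more involved.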
Slide 9: 3. Replica Coordination Protocols
Slide 10: 3. Replica Coordination Protocols
Slide 11: 3. Replica Coordination Protocols
- Protocol: when the primary fails, a backup is promoted
Slide 12: 3. Replica Coordination Protocols
- Interaction with an Environment
- State of the environment: changed by the execution of I/O instructions
- Assumption
- All I/O devices are assumed to comply with
Slide 13: 4. Prototype
- Hypervisor
- Memory architecture
- Does not support multiple VMs
- Solution: a single VM
- Privilege levels
- How to arrange privilege levels for VM instructions
- Mapping: VM level 0 -> 1, VM level 3 -> 3
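
The privilege-level mapping can be sketched as a small lookup. Only the two mappings listed on the slide (0 -> 1 and 3 -> 3) are taken from the source. That real level 0 is the most privileged level and is reserved for the hypervisor is an assumption here, and the names `VIRTUAL_TO_REAL`, `real_level` and `traps_to_hypervisor` are hypothetical.

```python
HYPERVISOR_LEVEL = 0  # assumption: the most privileged real level, used by the hypervisor

# Virtual privilege level -> real privilege level, as listed on the slide.
VIRTUAL_TO_REAL = {0: 1, 3: 3}


def real_level(virtual_level: int) -> int:
    """Return the real privilege level at which a virtual level executes."""
    try:
        return VIRTUAL_TO_REAL[virtual_level]
    except KeyError:
        raise ValueError(f"no mapping given for virtual level {virtual_level}") from None


def traps_to_hypervisor(virtual_level: int, privileged: bool) -> bool:
    """A privileged instruction traps whenever it runs below the hypervisor's level."""
    return privileged and real_level(virtual_level) != HYPERVISOR_LEVEL
```

Because the guest kernel (virtual level 0) runs at real level 1 in this sketch, its privileged instructions trap, which gives the hypervisor the chance to simulate them.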
Slide 14: 4. Prototype (cont.)
- Replica Coordination in Hypervisor
- TLB problem
- TLB -> Memory -> Disk
- Can differences in TLB contents become visible?
- Solution
- Hand part of the TLB-miss trap handling to the hypervisor
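
The TLB problem and its solution can be illustrated with a toy model. Everything here is a hypothetical sketch: the `Replica` class, its two eviction choices (`victim_index` 0 or -1) and the event names are stand-ins, with the differing eviction choice standing in for TLB contents that differ between replicas. If the guest sees TLB-miss traps, the two replicas observe different event sequences. If the hypervisor handles the misses, the guest-visible sequences are identical.

```python
from typing import List, Sequence, Tuple

Event = Tuple[str, int]


class Replica:
    def __init__(self, victim_index: int, hypervisor_handles_miss: bool,
                 tlb_size: int = 2) -> None:
        self.victim_index = victim_index              # which TLB entry gets evicted
        self.hypervisor_handles_miss = hypervisor_handles_miss
        self.tlb_size = tlb_size
        self.tlb: List[int] = []
        self.guest_trace: List[Event] = []            # events the guest OS can observe

    def access(self, page: int) -> None:
        if page not in self.tlb:
            # A miss is visible to the guest only if the hypervisor does not absorb it.
            if not self.hypervisor_handles_miss:
                self.guest_trace.append(("tlb_miss", page))
            if len(self.tlb) >= self.tlb_size:
                self.tlb.pop(self.victim_index)
            self.tlb.append(page)
        self.guest_trace.append(("access", page))


def guest_trace(victim_index: int, masked: bool, pages: Sequence[int]) -> List[Event]:
    """Run one replica over a page reference string and return what the guest saw."""
    replica = Replica(victim_index, masked)
    for page in pages:
        replica.access(page)
    return replica.guest_trace


PAGES = [1, 2, 3, 1]
```

For `PAGES`, the replica that evicts the oldest entry takes four misses and the one that evicts the newest takes three, so the unmasked traces differ while the masked traces agree.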
Slide 15: 5. Evaluation
Slide 16: 5. Evaluation (cont.)
Slide 17: 5. Evaluation (cont.)
Slide 18: Q&A