Fail-stutter Behavior Characterization of NFS

1
Fail-stutter Behavior Characterization of NFS
  • Jichuan Chang
  • CS736 Final Project, UW-Madison
  • December 13, 2002

2
Motivation
  • We want systems to be very fast and available!
  • Hard to achieve for modern computer systems
      • complex interactions among components
      • can't assume everything is always working
        perfectly!
  • We need a better fault model
      • Simpler than the Byzantine model
      • Richer than the fail-stop model
      • Fail-stutter fault tolerance [Remzi 01]

[Figure: system states labeled Stable Performance, Low Performance, and Down]
3
Fail-stutter Issues
  • Separate performance faults from correctness
    faults
  • What are performance faults?
  • Need a performance specification, but how do we
    get the spec?
  • How to distinguish interference from a
    performance fault?
  • What are correctness faults?
  • Correctness should be defined in an end-to-end
    manner.
  • How to diagnose both types of faults?
  • Must observe how systems behave!
  • Exploit fail-stutter behavior
  • Who should be notified about failures, when and
    how?
  • System support: programming tools / runtime
    support
  • Integration with existing systems: less intrusion
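
The split between performance faults and correctness faults above can be sketched in a few lines of Python. This is only an illustration, not the project's tooling: the `PerfSpec` class, its `expected_ms` and `tolerance` fields, and the `classify` function are hypothetical names, and the rule "slower than expected by more than a tolerance factor" is an assumed stand-in for a real performance specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PerfSpec:
    """Hypothetical performance specification for one operation."""
    expected_ms: float  # response time under normal operation (assumed known)
    tolerance: float    # allowed slowdown factor before we call it a fault


def classify(spec: PerfSpec, response_ms: Optional[float]) -> str:
    """Classify one observed request.

    response_ms is None when the request returned an error or never
    completed, which we treat as a correctness fault.
    """
    if response_ms is None:
        return "correctness fault"
    if response_ms > spec.expected_ms * spec.tolerance:
        return "performance fault"
    return "ok"


spec = PerfSpec(expected_ms=10.0, tolerance=2.0)
results = [classify(spec, t) for t in (8.0, 35.0, None)]
```

Note that this sketch cannot tell interference from a genuine performance fault; both show up as a slow response, which is exactly the open question raised on this slide.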

4
Our Approach
  • Case study: NFS fail-stutter characterization
  • Fault-injection (vs. system monitoring)
  • Performance measurement
  • Simple, software-based test-bed
  • Interesting observations
  • Different failed parts have different performance
    impact
  • Different types of clients have different
    behaviors
  • Patient (keep retrying) vs. Impatient (try other
    servers)
  • Transition between performance and correctness
    faults
  • Can be determined proactively by fault-injection
  • Performance spec. could be application-specific.
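
The patient and impatient client behaviors mentioned above can be illustrated with a toy model. This is a sketch under stated assumptions, not the clients used in the study: `send`, `patient_client`, and `impatient_client` are hypothetical helpers, and each request is modeled as a single random trial that is lost with a given probability.

```python
import random


def send(drop_prob: float, rng: random.Random) -> bool:
    """Toy request: returns True if a reply arrives, False if it was dropped."""
    return rng.random() >= drop_prob


def patient_client(drop_prob: float, rng: random.Random, max_tries: int = 1000) -> int:
    """Keep retrying the same server; return the number of attempts used."""
    for attempt in range(1, max_tries + 1):
        if send(drop_prob, rng):
            return attempt
    raise TimeoutError("no reply after max_tries attempts")


def impatient_client(drop_probs, rng: random.Random):
    """Try each server once, in order; return the index of the first
    server that replies, or None if every attempt was dropped."""
    for index, drop_prob in enumerate(drop_probs):
        if send(drop_prob, rng):
            return index
    return None


rng = random.Random(0)
attempts = patient_client(0.0, rng)          # healthy server: first try succeeds
chosen = impatient_client([1.0, 0.0], rng)   # first server always drops
```

In this model a patient client turns packet loss into extra latency (more attempts), while an impatient client turns it into either a switch to another server or a visible error.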

5
Experimental Settings

[Figure: test-bed diagram showing NFS Client App, Click S/W Router, NFS Server, and Storage System, with three X marks]
  • Workloads - SpecSFS97, file (micro-benchmark).
  • Data to collect - throughput, response time,
    errors.
  • Faulty components - network, server, disk, bus,
    etc.
  • Fault injection - network packet dropping
  • drop k Ethernet packets,
  • drop k IP packets coming from the server.
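
The packet-dropping fault injection listed above can be sketched as a simple filter. This is not the Click router configuration used in the test-bed; it is a plain Python illustration in which `drop_packets`, the `match` predicate, and the dictionary packet format are all hypothetical, and k is interpreted as a drop probability between 0 and 1 (an assumption based on the "drop probability" wording in the results).

```python
import random


def drop_packets(packets, drop_prob, match=lambda pkt: True, rng=None):
    """Yield packets, discarding each matching packet with probability drop_prob.

    Packets that do not satisfy `match` are always forwarded.
    """
    if rng is None:
        rng = random.Random()
    for pkt in packets:
        if match(pkt) and rng.random() < drop_prob:
            continue  # injected fault: this packet is silently dropped
        yield pkt


# Toy traffic; the "src" and "seq" fields are made up for this example.
traffic = [
    {"src": "server", "seq": 0},
    {"src": "client", "seq": 1},
    {"src": "server", "seq": 2},
]


def from_server(pkt):
    return pkt["src"] == "server"


# Drop every packet coming from the server, forward everything else.
survivors = list(drop_packets(traffic, 1.0, match=from_server))
```

Passing a predicate keeps the two injection modes from the slide apart: matching every frame corresponds to dropping at the Ethernet level, while a predicate such as `from_server` corresponds to dropping only IP packets coming from the server.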

6
Results (1) - Patient Client
1. Performance degradation scales with drop
probability.
2. Ethernet dropping is less harmful than
IP dropping.
3. Performance data is less meaningful when errors
occur.
4. Different operations switch to correctness
faults at different points (e.g. 5, 15, 20).
Total execution time can hide such differences.
[Plots: X marks indicate "Error occurred"]
7
Results (2) - Impatient Client
1. Throughput decreases linearly as the dropping
probability increases.
2. Throughput drops manifest under heavy loads.
(SpecSFS97: retry once!)
3. Response time doesn't change as much!
4. Ethernet dropping is less harmful.
8
Summary
  • Modern computer system design needs a better
    fault-tolerance model.
  • Using fault-injection to characterize NFS
    fail-stutter behavior.
  • Preliminary observations address some of the
    fail-stutter issues
  • How to separate different types of faults?
  • Suggest that we can extract performance
    specification by fault-injection and probing.
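
The idea of extracting a performance specification by fault injection and probing can be sketched as a sweep over drop probabilities that records where errors first appear. This is a minimal illustration, not the procedure used in the project: `find_transition` is a hypothetical helper, and `toy_probe` is a made-up stand-in for a real benchmark run, so its numbers carry no experimental meaning.

```python
def find_transition(probe, drop_probs):
    """Return the smallest tested drop probability at which `probe`
    reports at least one error, or None if no tested value does.

    `probe(p)` must return a (throughput, error_count) pair.
    """
    for p in sorted(drop_probs):
        _throughput, errors = probe(p)
        if errors > 0:
            return p
    return None


def toy_probe(p):
    """Hypothetical stand-in for running a workload at drop probability p.

    The formula and the 0.15 threshold are invented for illustration only.
    """
    throughput = 100.0 * (1.0 - p)
    errors = 1 if p >= 0.15 else 0
    return throughput, errors


transition = find_transition(toy_probe, [0.0, 0.05, 0.10, 0.15, 0.20])
```

Below the returned point the system only slows down (a performance fault); at or above it, the workload starts to see errors (a correctness fault). Running the sweep per operation rather than per benchmark would expose the different transition points that total execution time can hide.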

9
Future Work
  • Very short term
      • More classes of faults
      • More realistic fault injection
  • Short term
      • Separate interference from performance faults
      • Extract/refine performance specifications
      • Performance-fault diagnosis
  • Long term
      • Detailed model for a specific workload / system
      • System support for fail-stutter failures