Bug Isolation in the Presence of Multiple Errors

Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan. UC Berkeley and Stanford University.

1
Bug Isolation in the Presence of Multiple Errors
  • Ben Liblit, Mayur Naik, Alice X. Zheng, Alex
    Aiken, and Michael I. Jordan
  • UC Berkeley and Stanford University

2
In Our Last Episode
  • Generic instrumentation schemes
  • Wild guesses about interesting behavior
  • Bernoulli sampling transformation
  • Amortization across acyclic regions
  • Data mining techniques
  • Deterministic process of elimination
  • Non-deterministic logistic regression

3
Schemes → Sites → Predicates
  • Several instrumentation schemes available
  • Function returns, pairwise comparisons, branches, ...
  • Scheme induces finite set of instrumentation sites
  • Site determines finite set of observable predicates
  • Predicates completely partition each site
  • Bump exactly one counter per observation
  • Infer additional predicates (e.g. ≤, ≠, ≥) offline
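
The partitioning idea can be sketched in a few lines. This is only an illustration, assuming a pairwise-comparison site that records three mutually exclusive outcomes (`<`, `==`, `>`); the function names and the `site0` label are hypothetical, not taken from the actual instrumentor.

```python
from collections import Counter


def observe_scalar_pair(counters, site, a, b):
    """Bump exactly one of the three mutually exclusive predicates at a site."""
    if a < b:
        counters[(site, "<")] += 1
    elif a == b:
        counters[(site, "==")] += 1
    else:
        counters[(site, ">")] += 1


def infer_predicates(counters, site):
    """Derive complementary predicates offline by summing raw counters."""
    lt = counters[(site, "<")]
    eq = counters[(site, "==")]
    gt = counters[(site, ">")]
    return {"<=": lt + eq, "!=": lt + gt, ">=": eq + gt}


# Four observations at one hypothetical site.
counters = Counter()
for a, b in [(1, 2), (3, 3), (5, 4), (0, 9)]:
    observe_scalar_pair(counters, "site0", a, b)

derived = infer_predicates(counters, "site0")
print(derived)
```

Because the three raw predicates partition the site, no extra run-time counters are needed for the derived ones; they fall out of simple sums after the fact.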

4
What Does This Give Us?
  • Absolutely certain of what we do see
  • Uncertain of what we don't see
  • Given enough runs, samples reality
  • Common events seen most often
  • Rare events seen at proportionate rate
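
A small simulation can illustrate why fair sampling preserves proportions. This is a sketch under stated assumptions: a sampling rate of 1/100, a countdown drawn from a geometric distribution, and hypothetical event names; it is not the deck's actual instrumentation code.

```python
import math
import random


def next_countdown(rng, rate):
    """Draw how many observations pass until the next sample (geometric)."""
    u = 1.0 - rng.random()  # uniform in (0, 1]
    return int(math.log(u) / math.log(1.0 - rate)) + 1


def sample_events(events, rate, seed=0):
    """Count only the events that land on a sampled observation."""
    rng = random.Random(seed)
    counts = {}
    countdown = next_countdown(rng, rate)
    for event in events:
        countdown -= 1
        if countdown == 0:
            counts[event] = counts.get(event, 0) + 1
            countdown = next_countdown(rng, rate)
    return counts


# Hypothetical workload: 90% common events, 10% rare events.
events = ["common"] * 90_000 + ["rare"] * 10_000
random.Random(1).shuffle(events)

counts = sample_events(events, rate=0.01)
print(counts)
```

Everything counted really happened, so the counts are certain; what is missing is simply unsampled. Over many observations the sampled counts should stay roughly in the same 9:1 proportion as the real events.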

5
Regularized Logistic Regression
  • S-shaped cousin to linear regression
  • Predict success/failure as function of counters
  • Penalty factor forces most coefficients to zero
  • Large coefficient ⇒ highly predictive of failure

6
Buffer Overrun in bc
    void more_arrays ()
    {
      /* ... */

      /* Copy the old arrays. */
      for (indx = 1; indx < old_count; indx++)
        arrays[indx] = old_ary[indx];

      /* Initialize the new elements. */
      for (; indx < v_count; indx++)
        arrays[indx] = NULL;

      /* ... */
    }

  1. indx > scale
  2. indx > use_math
  3. indx > opterr
  4. indx > next_func
  5. indx > i_base

7
Limitations of Logistic Regression
  • Linearly-weighted combination of features
  • What does this mean?
  • Many correlated features
  • Weight may be spread in unpredictable ways
  • Suited to explaining a single mode of failure
  • Do you really believe you have just one bug?

8
Multiple-Bug Isolation
  • Consider predicates one at a time
  • Include inferred predicates (e.g. ≤, ≠, ≥)
  • How likely is failure when predicate P is true?
  • (technically, when P is observed to be true)
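
The transcript does not preserve the formula, so the following is a sketch under an assumption: that the score is Bad(P) = F(P) / (S(P) + F(P)), where F(P) and S(P) count the failing and successful runs in which P was observed to be true. The run records and helper names are hypothetical.

```python
def tally(runs, predicate):
    """Count failing and successful runs in which the predicate was observed true."""
    f = sum(1 for true_preds, failed in runs if predicate in true_preds and failed)
    s = sum(1 for true_preds, failed in runs if predicate in true_preds and not failed)
    return f, s


def bad(f_true, s_true):
    """Fraction of runs with P observed true that went on to fail."""
    total = f_true + s_true
    if total == 0:
        return None  # P was never observed true, so the score is undefined
    return f_true / total


# Each run: (set of predicates observed true, did the run fail?)
runs = [
    ({"P"}, True),
    ({"P"}, True),
    ({"P"}, False),
    (set(), False),
]

f, s = tally(runs, "P")
score = bad(f, s)
print(f, s, score)
```

Runs in which P was never sampled contribute nothing to either count, which is why the slide qualifies "true" as "observed to be true".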

10
Are We Done? Not Exactly!
    f = ...;
    if (f == NULL) {
      x = 0;
      *f;
    }

  • Predicate (x == 0) is an innocent bystander
  • Program is already doomed

Bad(f == NULL) = 1.0
Bad(x == 0) = 1.0

11
Three-Valued Logic
  • Identify unlucky sites on the doomed path
  • Captures risk of failure from reaching site at
    all, regardless of predicate truth/falsehood
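
One way to make the three-valued view concrete is to record, per run, whether a predicate was observed true, observed false, or not observed at all. The sketch below assumes that encoding (`True`, `False`, `None`) and calls the site-level risk `context`; both the encoding and the name are assumptions, and the run counts are made up.

```python
def failure_ratio(runs, states):
    """Fraction of failing runs among runs whose predicate state is in `states`."""
    outcomes = [failed for state, failed in runs if state in states]
    return sum(outcomes) / len(outcomes) if outcomes else None


def bad(runs):
    """Risk of failure when the predicate was observed true."""
    return failure_ratio(runs, (True,))


def context(runs):
    """Risk of failure from reaching the site at all, true or false."""
    return failure_ratio(runs, (True, False))


# Hypothetical data: 3 failing runs took the NULL branch, 7 passing runs did not.
# A branch predicate is observed in every run; a predicate inside the doomed
# branch is only ever observed in the 3 failing runs.
branch_runs = [(True, True)] * 3 + [(False, False)] * 7
bystander_runs = [(True, True)] * 3 + [(None, False)] * 7

print(bad(branch_runs), context(branch_runs))
print(bad(bystander_runs), context(bystander_runs))
```

For the bystander, the risk from merely reaching the site is already as high as the risk when the predicate is true, which is what marks it as sitting on a doomed path.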

12
Getting to the Heart of the Matter
  • Looking for increase in failure odds
  • Correspondence to likelihood ratio testing
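
The formulas on this slide did not survive in the transcript. Assuming the usual definitions, with F and S counting failing and successful runs and Context as an assumed name for the site-level risk, the increase in failure odds can be written as:

```latex
\begin{align*}
\mathrm{Bad}(P) &= \frac{F(P)}{S(P) + F(P)} \\
\mathrm{Context}(P) &= \frac{F(P\ \text{observed})}{S(P\ \text{observed}) + F(P\ \text{observed})} \\
\mathrm{Increase}(P) &= \mathrm{Bad}(P) - \mathrm{Context}(P)
\end{align*}
```

Under these definitions a bystander whose site is reached only on a doomed path has Bad(P) = Context(P) = 1.0, so Increase(P) = 0.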

13
Multiple-Bug Filtering and Ranking
  • Discard predicates having Increase(P) ≤ 0
  • Dead predicates
  • Invariant predicates
  • Bystander predicates
  • Others
  • Sort remaining predicates by Bad(P)
  • Likely causes with determinacy metrics
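
The filter-then-rank step can be sketched as follows. This assumes Increase(P) is Bad(P) minus the failure rate among runs that reached P's site, and it uses point estimates only, with no confidence test. The predicate names and counts are hypothetical.

```python
def score(f_true, s_true, f_obs, s_obs):
    """Return (Bad, Increase) for a predicate, or None if never observed true."""
    if f_true + s_true == 0:
        return None  # dead predicate: nothing to score
    bad = f_true / (f_true + s_true)
    context = f_obs / (f_obs + s_obs)  # f_obs + s_obs >= f_true + s_true > 0
    return bad, bad - context


def filter_and_rank(tallies):
    """Drop predicates with Increase <= 0, then sort the rest by Bad, highest first."""
    kept = []
    for name, counts in tallies.items():
        result = score(*counts)
        if result is None:
            continue
        bad, increase = result
        if increase > 0:
            kept.append((name, bad, increase))
    return sorted(kept, key=lambda row: row[1], reverse=True)


# (F true, S true, F observed, S observed) per predicate, all made up.
tallies = {
    "n > 10": (2, 2, 3, 7),     # somewhat predictive
    "f == NULL": (3, 0, 3, 7),  # strongly predictive
    "x == 0": (3, 0, 3, 0),     # bystander: site itself is doomed
    "argc > 0": (3, 7, 3, 7),   # invariant: always true when observed
    "never": (0, 0, 0, 5),      # dead: never observed true
}

ranked = filter_and_rank(tallies)
for name, bad, increase in ranked:
    print(f"{name}: Bad={bad:.2f} Increase={increase:.2f}")
```

Dead, invariant, and bystander predicates all drop out at the filter, leaving a short list ordered by how often failure follows when the predicate holds.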

14
Case Study: Moss
  • Reintroduce nine historic Moss bugs
  • Including wrong-output bugs
  • Instrument with everything we've got
  • Branches, returns, scalar pairs, the works
  • Generate 32,000 randomized runs

15
Effectiveness of Filtering
  • Eliminates 99% of branch predicates
  • 4170 → 51
  • Eliminates 99.5% of return predicates
  • 2964 → 16
  • Eliminates 96% of scalar pair predicates
  • 195,864 → 8242
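
The percentages follow directly from the before and after counts. A quick check, using only the numbers on this slide (the function name is hypothetical):

```python
def eliminated_percent(before, after):
    """Share of predicates removed by filtering, as a percentage."""
    return 100.0 * (before - after) / before


before_after = {
    "branch": (4170, 51),
    "return": (2964, 16),
    "scalar pair": (195_864, 8242),
}

percentages = {
    scheme: eliminated_percent(before, after)
    for scheme, (before, after) in before_after.items()
}
for scheme, pct in percentages.items():
    print(f"{scheme}: {pct:.1f}% eliminated")
```

Even at 96%, the scalar pair scheme still leaves thousands of candidates, far more than the other two schemes.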

16
Effectiveness of Ranking
  • Five bugs captured by branches, returns
  • Lists are short, easy to examine by hand
  • Smoking guns rise to the top
  • Stop early if Bad() dips down
  • Two bugs buried in scalar pairs results
  • List is still too large to be useful
  • Two bugs never cause a failure
  • No failure, no problem!

17
Summary: Putting It All Together
  • Wild guesses plus fair random sampling
  • Seek behaviors that co-vary with outcome
  • Statistical modeling, confidence tests, more
  • Future work
  • More selective instrumentation schemes
  • Non-uniform sampling
  • Improved statistical models
  • Use of program structure in analysis

18
Join the Cause!
The Cooperative Bug Isolation Project
http://www.cs.berkeley.edu/~liblit/sampler/

20
Linear Regression
  • Match a line to the data points
  • Outcome can be anywhere along y axis
  • But our outcomes are always 0/1

21
Logistic Regression
  • Prediction asymptotically approaches 0 and 1
  • 0 predict success
  • 1 predict failure
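
The S-shaped curve is the logistic function. A minimal sketch, assuming the model is a bias plus a weighted sum of counters passed through that function (the helper names are hypothetical):

```python
import math


def logistic(z):
    """Numerically stable logistic function 1 / (1 + e^-z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def failure_probability(weights, bias, counters):
    """Squash a weighted sum of counters into a value between 0 and 1."""
    z = bias + sum(w * c for w, c in zip(weights, counters))
    return logistic(z)


for z in (-10, -1, 0, 1, 10):
    print(z, round(logistic(z), 4))
```

Large negative inputs approach 0 (predict success) and large positive inputs approach 1 (predict failure), unlike a fitted line, which can land anywhere on the y axis.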

22
Training the Model
  • Maximize LL using stochastic gradient ascent
  • Problem: model is wildly under-constrained
  • Far more counters than runs
  • Will get perfectly predictive model just using
    noise

23
Regularized Logistic Regression
  • Add penalty factor for nonzero terms
  • Force most coefficients to zero
  • Retain only features that pay their way by
    significantly improving prediction accuracy
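
A toy version of penalized training might look like this. It is a sketch, not the authors' implementation: it assumes an L1 penalty applied by soft-thresholding after each stochastic gradient step, and the data, step size, and penalty strength are all made up.

```python
import math
import random


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, amount):
    """Shrink a weight toward zero, clamping small weights to exactly zero."""
    if w > amount:
        return w - amount
    if w < -amount:
        return w + amount
    return 0.0


def train(runs, n_features, penalty=0.05, step=0.1, epochs=200, seed=0):
    """Stochastic gradient ascent on log-likelihood with an L1 penalty."""
    rng = random.Random(seed)
    weights = [0.0] * n_features
    bias = 0.0
    order = list(range(len(runs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = runs[i]
            p = logistic(bias + sum(w * v for w, v in zip(weights, x)))
            err = y - p  # gradient of the log-likelihood w.r.t. the linear term
            bias += step * err  # the bias is left unpenalized
            for j in range(n_features):
                weights[j] = soft_threshold(
                    weights[j] + step * err * x[j], step * penalty
                )
    return weights, bias


# Toy data: feature 0 tracks failure exactly, features 1-4 are random noise.
data_rng = random.Random(1)
runs = []
for i in range(40):
    failed = i % 2
    noise = [data_rng.randint(0, 1) for _ in range(4)]
    runs.append(([failed] + noise, failed))

weights, bias = train(runs, n_features=5)
print([round(w, 3) for w in weights])
```

The penalty charges every nonzero weight on every step, so a feature keeps a large coefficient only if its gradient consistently outweighs that charge. The noise features should hover near zero while the genuinely predictive one stands out.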
