Title: Abstract Interpretation and Future Program Analysis Problems


1
Abstract Interpretation and Future Program
Analysis Problems
  • Martin Rinard
  • Alexandru Salcianu
  • Laboratory for Computer Science
  • Massachusetts Institute of Technology

2
Abstract Interpretation: The Early Years
  • Formal Connection Between
  • Sound analysis of program
  • Execution of program
  • Broader Impact
  • Insight that analysis is execution
  • Reduced need to think of analysis as reasoning
    about all possible executions!
  • Good fit with analysis problems of that era
  • Properties of local variables
  • Within single procedure

3
How Is Abstract Interpretation Holding Up?
  • Technical result as relevant as ever
  • Moore's Law effects
  • Much more computing power for analysis
  • More complex programs
  • Ambitious analyses
  • Heap properties
  • Multiple threads
  • Interprocedural partial program analyses
  • Stretch intuitive vision of analysis as execution

4
Outline
  • Combined pointer and escape analysis
  • Rationale behind design decisions
  • Alternative choices in design space
  • Challenges and Predictions
  • Bigger Picture

5
Goal of Pointer Analysis
  • Characterize objects to which pointers point
  • Synthesize finite set of object representatives
  • Derive representative(s) each pointer points to

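  • Example: r = p.f

[Figure: heap diagram with variables p and r and an edge labeled f]

p.f points to one particular object representative, so after the
execution of r = p.f, r may point to that representative, but not
to any of the other representatives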
6
Our Pointer Analysis Goals
  • Accurate for multithreaded programs
  • Compositional, partial program analysis
  • Analyze each procedure once
  • Independently of callers
  • May skip analysis of invoked procedures
  • Why?
  • Parts of program unavailable (different
    language, not written yet)
  • Parts may be irrelevant for desired result

7
Analysis Abstraction
  • Basic abstraction is the points-to graph
  • Nodes represent objects in heap
  • Edges represent references in heap

[Figure: points-to graph with variables p, q, u; edges labeled f]
8
Two Kinds of Edges
  • Inside edges (solid) represent references
    created inside analyzed part of program
  • Outside edges (dashed) represent references
    created outside analyzed part of program

[Figure: points-to graph with variables p, q, u; edges labeled f; inside edges solid, outside edges dashed]
9
Two Kinds of Nodes
  • Inside nodes (solid) represent objects created
    inside analyzed part of program
  • Outside nodes (dashed) represent objects
  • Created outside analyzed part of program, or
  • Accessed via edges created outside analyzed part
    of program

[Figure: points-to graph with variables p, q, u; edges labeled f; inside nodes solid, outside nodes dashed]
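To make the abstraction on the last three slides concrete, here is a minimal sketch of a points-to graph that keeps inside and outside nodes and edges apart. The names `Node`, `PointsToGraph`, `add_edge`, and `targets`, and the small example graph, are invented for illustration; they are not taken from the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    inside: bool  # True: created in analyzed code; False: outside node


@dataclass
class PointsToGraph:
    # Each edge is a triple (source node, field name, target node).
    inside_edges: set = field(default_factory=set)   # created by analyzed code
    outside_edges: set = field(default_factory=set)  # read here, created elsewhere

    def add_edge(self, src, fld, dst, inside):
        (self.inside_edges if inside else self.outside_edges).add((src, fld, dst))

    def targets(self, src, fld):
        """All nodes that src.fld may point to, over both kinds of edges."""
        edges = self.inside_edges | self.outside_edges
        return {d for (s, f, d) in edges if s == src and f == fld}


# Hypothetical graph: the analyzed code stores a new object r into p.f,
# and reads q.f without ever having written it.
p = Node("p", inside=False)
q = Node("q", inside=False)
r = Node("r", inside=True)
load = Node("load@q.f", inside=False)

g = PointsToGraph()
g.add_edge(p, "f", r, inside=True)      # solid (inside) edge
g.add_edge(q, "f", load, inside=False)  # dashed (outside) edge
```

Keeping the two edge sets separate is one possible design; it makes it cheap to ask which references the analyzed code itself created.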
10
Key Question
  • What does the heap look like when the procedure
    begins its execution?
  • Previous algorithms analyzed callers before
    callees, so model of heap always available
  • Unfortunately, this approach requires analysis of
    entire program in top-down fashion
  • Our solution: use the code to reconstruct what the
    (accessed part of the) heap must look like

11
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, q]
12
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q]
13
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q; one edge labeled f]
14
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s; one edge labeled f]
15
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s; two edges labeled f]
16
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s; three edges labeled f]

One option: continue to expand the graph. But the
analysis may never terminate.
17
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s; three edges labeled f]

Instead: have one outside node per load statement.
It represents all objects loaded at that statement.
This bounds the graph and guarantees termination.
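The termination argument can be sketched for the loop in the example, `s = q; do { s = s.f; } until (s == null)`. This is a simplified illustration, not the authors' algorithm: it tracks only outside edges, ignores inside edges, and the function name `analyze_loop` and the label `L1` are made up. Because the load statement contributes exactly one outside node, the set of possible edges is finite and the iteration reaches a fixed point.

```python
def analyze_loop(param_node, load_label):
    """Fixed point for:  s = q; do { s = s.f; } until (s == null)."""
    load_node = "load@" + load_label  # the single outside node for this load
    edges = set()                     # outside edges: (source, field, target)
    at_head = {param_node}            # nodes s may point to at the loop head
    rounds = 0
    while True:
        rounds += 1
        # Transfer function for `s = s.f`: every node s may point to gets an
        # outside edge to the load node, and s then points to the load node.
        new_edges = edges | {(n, "f", load_node) for n in at_head}
        after_body = {load_node}
        new_head = at_head | after_body  # join of loop entry and back edge
        if new_edges == edges and new_head == at_head:
            return edges, at_head, rounds
        edges, at_head = new_edges, new_head


edges, head, rounds = analyze_loop("q", "L1")
```

The resulting graph has a self-loop on the load node, which is how a single node stands for the arbitrarily long chain of objects the loop may traverse.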
18
Consequences of This Decision
  • Multiple objects represented by single node (load
    node in loop)
  • But can also have single object represented by
    multiple nodes in graph (!!)
    (object loaded at multiple statements)

do { a = q.f; } until (a == null);
do { b = q.f; } until (b == null);

[Figure: points-to graph; node q; four edges labeled f]
19
Consequences of This Decision
  • Form of points-to graph depends on program
  • Programs with identical behavior but different
    graphs

[Figure: two points-to graphs, one per loop form below; each has nodes p, r, q, s and edges labeled f]

do { s = s.f; } until (s == null);
s = s.f; while (s != null) { s = s.f; }
20
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s; three edges labeled f]
21
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s, t; three edges labeled f]
22
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s, t; four edges labeled f]
23
Analysis In Example
m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s, t, u; four edges labeled f]
24
What Does Result Tell Us?
  • Nodes (outside)
  • Created outside analyzed part of program
  • Incomplete information
  • Nodes (inside, escaped)
  • Created inside analyzed part of program
  • But reachable from unanalyzed part of program
  • Incomplete information

[Figure: points-to graph for m(p, q); nodes p, r, q, s, t, u; four edges labeled f]

  • Nodes (inside, captured)
  • Created inside analyzed part of program
  • Unreachable from unanalyzed part of program
  • Complete information about referencing
    relationships!
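The three node categories can be computed with a reachability pass, sketched below. Two assumptions are baked in: the only escape roots are outside nodes (a fuller analysis would likely need further roots, such as returned objects or started threads), and the node and edge sets are one plausible encoding of the example graph for m(p, q), not a transcription of the figure. The function name `classify` is hypothetical.

```python
def classify(nodes, edges):
    """nodes maps a node name to "inside" or "outside"; edges holds (src, field, dst)."""
    # Everything reachable from an outside node may be seen by unanalyzed code.
    reachable = {n for n, kind in nodes.items() if kind == "outside"}
    worklist = list(reachable)
    while worklist:
        n = worklist.pop()
        for src, _field, dst in edges:
            if src == n and dst not in reachable:
                reachable.add(dst)
                worklist.append(dst)
    escaped = {n for n in reachable if nodes[n] == "inside"}
    captured = {n for n in nodes if nodes[n] == "inside" and n not in reachable}
    return escaped, captured


# Assumed encoding of the example: "load" is the outside node for s = s.f.
nodes = {"p": "outside", "q": "outside", "load": "outside",
         "r": "inside", "t": "inside", "u": "inside"}
edges = {("p", "f", "r"), ("q", "f", "load"),
         ("load", "f", "load"), ("load", "f", "t")}
escaped, captured = classify(nodes, edges)
```

Under this encoding r and t escape (they are stored into objects the caller can reach), while u is captured, so the analysis knows every reference to and from u.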

25
Crucial Distinction
  • Escaped vs. Captured
  • Enables analysis to identify regions of heap
    where it has complete information
  • Crucial for both
  • Accuracy of analysis
  • Effective use of analysis results

[Figure: points-to graph for m(p, q); nodes p, r, q, s, t, u; four edges labeled f]
26
Multiple Calling Contexts
  • Two Key Assumptions
  • p and q refer to different objects
  • Parallel threads may access objects

m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes p, r, q, s, t; four edges labeled f]
27
Multiple Calling Contexts
What if p and q refer to the same object (i.e., p
and q are aliased)?

m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes r, p, q, s, t; four edges labeled f]
28
Multiple Calling Contexts
What if p and q refer to the same object and
there are no parallel threads?

m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: two points-to graphs, each with nodes p, r, q, s, t and four edges labeled f]
29
Multiple Calling Contexts
What if p and q refer to the same object and
there are no parallel threads?

m(p, q) {
  r = new C();
  p.f = r;
  s = q;
  do { s = s.f; } until (s == null);
  t = new C();
  s.f = t;
  u = new C();
}

[Figure: points-to graph; nodes r, p, q, s, t; two edges labeled f]
30
Issues
  • Substantially different results for different
    calling contexts
  • But caller is unavailable at analysis time
  • New analysis for each possible context?
  • Lots of contexts
  • Most of which probably won't be needed

31
Our Solution
  • Analyze assuming
  • Distinct parameters
  • Parallel threads
  • Aliased parameters at caller? Merge nodes
  • No parallel threads? Remove outside edges and
    nodes

[Figure: two points-to graphs, each with nodes p, r, q, s, t and four edges labeled f]
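The first specialization step, merging the nodes of parameters that turn out to be aliased at the caller, amounts to a renaming over the edge set. The sketch below covers only that step. The edge set is an assumed encoding of the result for m(p, q), and `merge_nodes` is a hypothetical helper name.

```python
def merge_nodes(edges, keep, drop):
    """Return a copy of `edges` in which node `drop` is folded into `keep`."""
    def rename(n):
        return keep if n == drop else n
    return {(rename(s), f, rename(d), kind) for (s, f, d, kind) in edges}


# Assumed result of analyzing m(p, q) with distinct parameters.
edges = {
    ("p", "f", "r", "inside"),
    ("q", "f", "load", "outside"),
    ("load", "f", "load", "outside"),
    ("load", "f", "t", "inside"),
}

# Caller passes the same object for p and q: fold q into p.
aliased = merge_nodes(edges, keep="p", drop="q")
```

After the merge, p.f carries both an inside edge (to r) and an outside edge (to the load node), which reflects that a load of q.f may now observe the store to p.f.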
32
Solution Is Not Perfect
  • Specialization can lose precision: can have two
    procedures such that, when analyzed with
  • Distinct parameters: same analysis result
  • Aliased parameters: different analysis result
  • Conceptually complex analysis
  • Think about all contexts during analysis
  • Start to lose intuition of analysis as execution
  • Difficult time applying abstract interpretation
    framework

33
Abstract Interpretation and Analysis
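Abstract interpretation is a parameterized framework
  • V: concrete values
  • A: abstract values
  • α: abstraction function
  • γ: concretization function

[Figure: diagram relating an abstract step t_a from a1 to a2 to a concrete step t_v from v1 to v2, with arrows between the abstract and concrete levels]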
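For readers who have not seen the framework instantiated, here is a small sketch using the textbook sign domain. The sign domain and the "add one" step are not from the slides; they are chosen only because they are easy to check. `alpha` plays the role of the abstraction function, `gamma` the concretization function (restricted to a finite universe so it can be computed), and `t_v` and `t_a` are the concrete and abstract steps.

```python
def sign(v):
    return "neg" if v < 0 else "zero" if v == 0 else "pos"


def alpha(values):
    """Abstraction: a set of integers becomes the set of signs it contains."""
    return frozenset(sign(v) for v in values)


def gamma(abstract, universe):
    """Concretization, restricted to a finite universe of integers."""
    return {v for v in universe if sign(v) in abstract}


def t_v(values):
    """Concrete step: add one to every value."""
    return {v + 1 for v in values}


# Abstract counterpart of "add one", defined sign by sign.
T_A = {"neg": {"neg", "zero"}, "zero": {"pos"}, "pos": {"pos"}}


def t_a(abstract):
    out = set()
    for s in abstract:
        out |= T_A[s]
    return frozenset(out)


# Soundness on a small universe: abstracting after the concrete step never
# yields a sign that the abstract step did not predict.
universe = range(-3, 4)
sound = all(alpha(t_v({v})) <= t_a(alpha({v})) for v in universe)
```

The abstract step for "neg" has to return both "neg" and "zero", which is the usual price of abstraction: the result is sound but less precise than the concrete execution.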
34
Applying Framework
  • A: points-to graphs
  • V: concrete heaps
  • α: points-to graph for a given heap
  • Points-to graph depends on program
  • Need to augment heap with access history
  • γ: all heaps that correspond to points-to graph
  • OK, I give up

35
Correctness Proof
  • Inductively construct a representation relation between
  • Objects in heap
  • Nodes that represent objects
  • Invariants that characterize the relation
  • Transfer function
  • Takes points-to graph and the relation
  • Gives new points-to graph and relation
  • Prove that transfer functions preserve invariants

36
Threads and Abstract Interpretation
  • Philosophy of Abstract Interpretation
  • Come up with a decent abstraction
  • Execute program on that abstraction
  • Problem with threads
  • Execution usually modeled as interleaving
  • Too many interleavings!

37
Our Solution
  • Points-to graphs explicitly represent all
    possible interactions between parallel threads
  • Basic Analysis Approach
  • Analyze each thread in isolation
  • To compute combined effect of multiple threads
  • Retrieve result for each thread
  • Compute interactions that may occur

Outside edges: interactions in which one thread
reads a reference created by a parallel thread
Inside edges: interactions in which one thread
creates a reference read by a parallel thread
38
Interthread Analysis
n(p,q) m(p,q)
39
Interthread Analysis
n(p,q) m(p,q)
[Figure: one points-to graph per thread; visible node labels p, q, q]
Retrieve points-to graph from analysis of each
thread
40
Interthread Analysis
n(p,q) m(p,q)
[Figure: one points-to graph per thread; visible node labels p, q, q]
Establish correspondence between nodes
Start with parameter nodes
41
Interthread Analysis
n(p,q) m(p,q)
[Figure: one points-to graph per thread; visible node labels p, q, q]
  • Compute Interactions Between Threads
  • Match inside and outside edges
  • For each outside node, compute nodes in other
    graph that it represents

42
Interthread Analysis
n(p,q) m(p,q)
[Figure: one points-to graph per thread; visible node labels p, q, q]
  • Compute Interactions Between Threads
  • Match inside and outside edges
  • For each outside node, compute nodes in other
    graph that it represents

43
Interthread Analysis
n(p,q) m(p,q)
[Figure: one points-to graph per thread and the combined graph; visible node labels p, q, q, q]
  • Use computed representation relationship to
  • Combine graphs and
  • Obtain single graph for the execution of both
    threads
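The matching step described on these slides can be sketched as a small fixed-point computation. This is a one-directional simplification under stated assumptions, not the authors' algorithm: it matches the outside edges of one thread's graph against the inside edges of the other, starting from a seed correspondence between parameter nodes. The function name `represented_nodes` and all node names in the example are hypothetical.

```python
def represented_nodes(outside_edges, inside_edges, seed):
    """mu[n] = nodes of the other graph that node n may stand for."""
    mu = {n: set(ms) for n, ms in seed.items()}
    changed = True
    while changed:
        changed = False
        for (src, fld, out_node) in outside_edges:
            for (src2, fld2, dst2) in inside_edges:
                # An outside edge src.fld matches an inside edge src2.fld
                # whenever src may stand for src2.
                if fld == fld2 and src2 in mu.get(src, set()):
                    if dst2 not in mu.setdefault(out_node, set()):
                        mu[out_node].add(dst2)
                        changed = True
    return mu


# Thread A reads q.f repeatedly (one load node L with a self-loop).
outside_a = {("q", "f", "L"), ("L", "f", "L")}
# Thread B builds a two-element chain hanging off its own q node.
inside_b = {("q2", "f", "x"), ("x", "f", "y")}
# Start with parameter nodes: A's q corresponds to B's q2.
mu = represented_nodes(outside_a, inside_b, seed={"q": {"q2"}})
```

In this example the single load node L ends up representing both x and y, because the self-loop on L lets the matching follow the chain built by the other thread.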
44
Property of Analysis
  • Flow-sensitive within each thread (if reorder
    statements, get different result)
  • Flow-insensitive between threads
  • Assumes interactions can happen
  • Any number of times
  • In any order
  • Analysis models interactions that can't actually
    happen in any interleaved execution

45
Imprecision Due To Flow Insensitivity
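n(a, b, c) { 1: p = b.f; p.f = a; 2: a.f = b; }
m(a, c)    { 3: q = a.f; 4: q.f = c; }

Interthread Analysis Result
Execution Order Required to Produce Blue Edge
[Figure: interthread analysis result over nodes a, b, c, with one edge drawn in blue, next to an ordering diagram over statements 1, 2, 3, 4]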
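The imprecision can be reproduced with a short experiment. The sketch assumes that all `f` fields are initially null, that a store through a null variable is simply skipped, and that the edge in question is c.f pointing to a; none of these details is stated on the slide. `flow_insensitive` is a generic stand-in that applies the statements in any order until nothing changes; it is not the authors' interthread algorithm.

```python
from itertools import combinations

# ("load", x, y) means x = y.f; ("store", x, y) means x.f = y.
N = [("load", "p", "b"), ("store", "p", "a"), ("store", "a", "b")]
M = [("load", "q", "a"), ("store", "q", "c")]


def run(schedule):
    """Execute one interleaving on a concrete heap; fields start as null."""
    env = {"a": "a", "b": "b", "c": "c", "p": None, "q": None}
    heap = {}  # object -> object stored in its field f
    for kind, x, y in schedule:
        if kind == "load":
            env[x] = heap.get(env[y])
        elif env[x] is not None:  # store through null is skipped (assumption)
            heap[env[x]] = env[y]
    return heap


def interleavings(n, m):
    """All merges of n and m that keep each thread's program order."""
    total = len(n) + len(m)
    for slots in combinations(range(total), len(n)):
        it_n, it_m = iter(n), iter(m)
        yield [next(it_n) if i in slots else next(it_m) for i in range(total)]


def flow_insensitive(stmts):
    """May-point-to edges when statements run any number of times, in any order."""
    pts = {"a": {"a"}, "b": {"b"}, "c": {"c"}, "p": set(), "q": set()}
    heap = set()  # (object, target) pairs for field f
    changed = True
    while changed:
        changed = False
        for kind, x, y in stmts:
            if kind == "load":
                new = {t for (o, t) in heap if o in pts[y]}
                if not new <= pts[x]:
                    pts[x] |= new
                    changed = True
            else:
                new = {(o, t) for o in pts[x] for t in pts[y]}
                if not new <= heap:
                    heap |= new
                    changed = True
    return heap


in_some_interleaving = any(run(s).get("c") == "a" for s in interleavings(N, M))
in_analysis_result = ("c", "a") in flow_insensitive(N + M)
```

Producing c.f = a would need statement 4 to run before statement 1, yet statement 1 precedes 2 in program order and 3 must see the store made by 2, so no interleaving qualifies even though the flow-insensitive result contains the edge.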
46
Weak Memory Consistency Models
47
Initially: y = 1, x = 0
Thread 1 and Thread 2 run in parallel: one executes y = 0; x = 1,
the other executes z = x + y
What is value of z?
48
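Initially: y = 1, x = 0
Thread 1 and Thread 2 run in parallel: one executes y = 0; x = 1,
the other executes z = x + y
What is value of z?
Three Interleavings
  z = x + y; y = 0; x = 1    gives z = 1
  y = 0; z = x + y; x = 1    gives z = 0
  y = 0; x = 1; z = x + y    gives z = 1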
49
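Initially: y = 1, x = 0
Thread 1 and Thread 2 run in parallel: one executes y = 0; x = 1,
the other executes z = x + y
What is value of z?
Three Interleavings
  z = x + y; y = 0; x = 1    gives z = 1
  y = 0; z = x + y; x = 1    gives z = 0
  y = 0; x = 1; z = x + y    gives z = 1
z can be 0 or 1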
50
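Initially: y = 1, x = 0
Thread 1 and Thread 2 run in parallel: one executes y = 0; x = 1,
the other executes z = x + y
What is value of z?
Three Interleavings
  z = x + y; y = 0; x = 1    gives z = 1
  y = 0; z = x + y; x = 1    gives z = 0
  y = 0; x = 1; z = x + y    gives z = 1
z can be 0 or 1
INCORRECT REASONING!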
51
Initially: y = 1, x = 0
Thread 1 and Thread 2 run in parallel: one executes y = 0; x = 1,
the other executes z = x + y
Memory system can reorder writes as long as it
preserves illusion of sequential execution within
each thread!
What is value of z?
Different threads can observe different orders!
z can be 0 or 1 OR 2!
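The three possible values can be checked mechanically. The sketch below assumes that the reading thread evaluates z = x + y atomically at some point between the other thread's writes, and it models "reordered writes" simply as the two writes becoming visible in the opposite order. The function name `outcomes` is hypothetical, and this is a toy model rather than a formal memory-model semantics.

```python
def outcomes(write_orders):
    """Values z = x + y can take for each allowed visibility order of the writes."""
    results = set()
    for writes in write_orders:
        for cut in range(len(writes) + 1):  # reader runs after `cut` writes
            mem = {"x": 0, "y": 1}          # initially y = 1, x = 0
            for var, val in writes[:cut]:
                mem[var] = val
            results.add(mem["x"] + mem["y"])
    return results


program_order = [("y", 0), ("x", 1)]

# Interleaving semantics: writes become visible in program order only.
sequential = outcomes([program_order])

# Weak model: the reader may also observe x = 1 before y = 0.
relaxed = outcomes([program_order, program_order[::-1]])
```

Under interleaving semantics the reader never sees x = 1 together with y = 1, so z = 2 appears only once the writes may be observed out of order.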
52
Implications for Example
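n(a, b, c) { 1: p = b.f; p.f = a; 2: a.f = b; }
m(a, c)    { 3: q = a.f; 4: q.f = c; }

Interthread Analysis Result
Blue Edge Can Actually Occur in Some Execution!
Can't reason about program by interleaving
statements
[Figure: interthread analysis result over nodes a, b, c, with one edge drawn in blue, next to an ordering diagram over statements 1, 2, 3, 4]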
53
Implications for Analysis of Multithreaded
Programs
  • Analyzing all statement interleavings is unsound
  • We believe that our flow-insensitive analysis is
    sound even for weak consistency models
  • But formal semantics of weak memory consistency
    models still under development
  • Maessen, Arvind, Shen (OOPSLA 2000)
  • Manson, Pugh (Java Grande/ISCOPE 2001)
  • Unclear how to prove ANY analysis sound

54
Challenges and Predictions
55
Need To Analyze Partial Programs
  • Fact of life - whole program may be either
  • Unavailable,
  • Infeasible to analyze, or
  • Unnecessary to analyze
  • Challenges
  • What is starting context(s) for analysis?
  • What is effect of invoked but unanalyzed parts of
    program?
  • Especially difficult for linked data structures

56
Need To Analyze Partial Programs
  • Predictions
  • Future analyses will not use presented technique
  • Care about more sophisticated properties
  • Need more information about calling context
  • Many potential calling contexts never used
  • Analysis will instead start with specification
  • Provided by programmer
  • Automatically guessed by unsound static analysis
    heuristic or dynamic analysis
  • Then automatically verify specification

57
Multithreaded Programs
  • Challenge: too many potential executions
  • Prediction: more two-phase analyses
  • Phase One
  • Analyze each thread in isolation
  • Represent potential interactions between analyzed
    thread and other threads
  • Phase Two
  • Collect results from parallel threads
  • Compute interactions between threads

58
Multithreaded Programs
  • Prediction
  • Language will enforce more structured model
  • Enhanced type system
  • Force threads to interact only at explicit
    synchronization points
  • Development of structured analyses
  • Analyze single thread in isolation between
    synchronization points
  • Apply potential interaction effects only at
    synchronization points

59
Weak Memory Consistency Models
  • Challenges
  • Lack of good formal semantics
  • Explosion in possible program behaviors
  • Short Term Prediction
  • Development of formal semantics
  • Flow-insensitive analyses proved sound
  • Long Term Prediction
  • Structured model will force threads to interact
    only at synchronization points
  • Eliminate visibility of weak models

60
Trends
  • More sophisticated properties
  • Harsher analysis environments
  • Partial programs
  • Threads with weak consistency models
  • Role of abstract interpretation
  • Intuition of analysis as execution breaking down
    as analyses become more ambitious
  • Analyses starting to look like verifications
  • Synthesis of loop invariants
  • Synthesizing global view of computation

61
Bigger Picture
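[Figure: two-axis chart. One axis runs from "No idea what program should do" to "Can write full formal specification for program"; the other runs from "Don't care if program works reliably or not" to "Correctness crucial". Techniques placed on the chart: program verification, abstract interpretation, dynamic analyses, unsound static analyses. Several regions are marked "?".]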