Title: Partial Program Admission by Path Enumeration
1Partial Program Admission by Path Enumeration
Michael Wilson
Advisor: Jonathan Turner; also with Ron Cytron
2Problem Context
- Virtualizing the network core
- Router hosting platforms will run third-party networking protocols
- Router platforms run at Internet core speeds (5-10 Gbps)
- High-speed networking platforms are usually non-preemptive and lack OS protections
- Third-party code cannot be trusted!
- Memory protection is largely a solved problem
- How to ensure that untrusted network protocols don't hog the CPU?
- Each protocol must adhere to a very strict cycle budget
3Existing Solutions
- Worst-Case Execution Time (WCET)
- Longest path analysis
- Just find the longest path from entry to exit
- Arose from circuit analysis
- Won't work with loops
- Instrumentation
- Add run-time checks to ensure code stays under budget
- Our budgets are very tight; adding a run-time check might push safe paths over budget!
- Integer Linear Programming
- Requires developers to provide branch constraints
- Deals well with loops
- We can't trust the developers!
- A new solution is needed.
4Partial Program Admission
- Better method: partial program admission
- We can determine which execution paths are within budget and admit just those paths
- Use an over-budget exception handler
- Modify the program at load time to enforce good behavior
- Modifications result in zero runtime overhead
- Requires explicit path enumeration, which is computationally expensive
- Dynamic Programming to make the problem tractable
5Partial Program Admission
- Why partial program admission?
- What good is it to just run part of your program?
- Not really about running part of a program, but about transforming a program to make proofs easier
- Developer may know program branch constraints that render some paths impossible
- if (A) do foo()
- if (not A) do bar()
- Substitute code duplication for developer knowledge
6Solution Concept
[Figure: pipeline Program -> (Compile) -> CFG -> (Enumerate) -> CFT -> (Bound) -> BXT -> (Coalesce) -> BXG -> (Emit) -> Safe Program]
- Process is a series of graph transformations
- Start with a Control Flow Graph (CFG)
- Enumerate all paths within the CFG to produce a Control Flow Tree (CFT)
- Bound all paths in the CFT by our budget to produce a Bounded Execution Tree (BXT)
- Coalesce identical subtrees to reduce code duplication, producing the Bounded Execution Graph (BXG)
- Our actual algorithm shortcuts directly from CFG to BXG
7Control Flow Graph (CFG)
- Digraph representation of a program
- Vertices represent pieces of code
- Edges represent flow of control
- Weights represent cycle counts to traverse parent vertex
- Artificial source and sink: S, T
[Figure: example CFG with source S, code vertices A through G, and sink T; edges carry cycle-count weights]
Example program:
    A
    if (cond1)
        B
    else
        C
    D
    if (cond2)
        E
    else
        F
    G
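As an illustration, the example CFG can be written down as a weighted adjacency list. The edge weights below are read off the figure labels on this slide and are an assumption about which label belongs to which edge; `CFG` and `longest_path` are names chosen here, not part of the original talk. The sketch also runs the naive longest-path analysis mentioned under Existing Solutions, which is valid only because this graph is acyclic.

```python
# Weighted adjacency list: vertex -> [(successor, cycles to traverse the parent vertex)].
# The weights are an assumed reading of the slide figure.
CFG = {
    "S": [("A", 0)],
    "A": [("B", 2), ("C", 2)],
    "B": [("D", 1)],
    "C": [("D", 3)],
    "D": [("E", 1), ("F", 1)],
    "E": [("G", 1)],
    "F": [("G", 4)],
    "G": [("T", 1)],
    "T": [],
}


def longest_path(graph, u="S"):
    """Cycle count of the most expensive path from u to the sink (acyclic graphs only)."""
    return max((w + longest_path(graph, v) for v, w in graph[u]), default=0)


print(longest_path(CFG))  # 11
```

Under these assumed weights the worst case is 11 cycles, so a budget of 10 cannot admit the whole program.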
8CFG to CFT
[Figure: the example CFG (S, A through G, T, with edge weights), which is expanded into a Control Flow Tree by enumerating every path from S to T]
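A minimal sketch of the enumeration step, assuming an adjacency-list encoding of the example CFG (the weights are an assumed reading of the figure, and `enumerate_paths` is a hypothetical helper name). Each yielded path corresponds to one root-to-leaf branch of the CFT.

```python
# Example CFG: vertex -> [(successor, cycles to traverse the parent vertex)].
# Weights are an assumed reading of the slide figure.
CFG = {
    "S": [("A", 0)],
    "A": [("B", 2), ("C", 2)],
    "B": [("D", 1)],
    "C": [("D", 3)],
    "D": [("E", 1), ("F", 1)],
    "E": [("G", 1)],
    "F": [("G", 4)],
    "G": [("T", 1)],
    "T": [],
}


def enumerate_paths(graph, u="S", cost=0, prefix=()):
    """Yield (path, total cycles) for every path from u to a vertex with no successors."""
    prefix = prefix + (u,)
    if not graph[u]:
        yield prefix, cost
        return
    for v, w in graph[u]:
        yield from enumerate_paths(graph, v, cost + w, prefix)


for path, cost in enumerate_paths(CFG):
    print(" -> ".join(path), cost)
```

With two independent branches the tree has four leaves; in general the number of leaves grows exponentially with the number of sequential branches, which is why direct enumeration is expensive.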
9CFT to BXT
[Figure: the CFT for the example, with duplicated vertices D1-D2, E1-E2, F1-F2, G1-G4, sinks T1-T4 and the exception handler X; leaves are annotated with the cycle totals 6, 9, 11, 8 and 6]
At a budget of 10, this path ran out of time.
Let's prune this path by aborting to the exception handler.
Now all paths are under budget. All previously over-budget paths generate exceptions. This program is safe to run!
However, it's very bloated. 7 vertices have been replaced with 11.
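The bounding step can be sketched as follows. This is an illustration under assumptions, not the talk's implementation: the edge weights are an assumed reading of the figure, the names `dist_to_sink`, `bxt` and `code_vertices` are made up here, and the sketch prunes as soon as even the cheapest completion no longer fits, which is one way to reproduce the 11 vertices quoted on the slide.

```python
from functools import lru_cache

# Example CFG: vertex -> [(successor, cycles to traverse the parent vertex)] (assumed weights).
CFG = {
    "S": [("A", 0)],
    "A": [("B", 2), ("C", 2)],
    "B": [("D", 1)],
    "C": [("D", 3)],
    "D": [("E", 1), ("F", 1)],
    "E": [("G", 1)],
    "F": [("G", 4)],
    "G": [("T", 1)],
    "T": [],
}


@lru_cache(maxsize=None)
def dist_to_sink(u):
    """Cheapest remaining cost from u to T (the graph is acyclic)."""
    return min((w + dist_to_sink(v) for v, w in CFG[u]), default=0)


def bxt(u, remaining):
    """Bounded execution tree as nested tuples (vertex, children)."""
    if dist_to_sink(u) > remaining:
        return ("X", ())  # cannot finish in time: abort to the exception handler
    return (u, tuple(bxt(v, remaining - w) for v, w in CFG[u]))


def code_vertices(tree):
    """Count tree nodes that hold program code (everything except S, T and X)."""
    name, kids = tree
    return (name not in ("S", "T", "X")) + sum(code_vertices(k) for k in kids)


print(code_vertices(bxt("S", 10)))  # 11
```

Raising the budget to 11 admits the fourth path as well and yields the full, unpruned tree.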
10BXT to BXG
[Figure: the BXT from the previous slide being coalesced step by step; E1 and E2 merge into E1/E2, G1, G2 and G3 merge into G1/G2/G3, and T1, T2 and T3 merge into T1/T2/T3]
We can reduce this bloat by coalescing identical subtrees.
These two subtrees are identical. There's no reason to keep them both.
Let's merge.
These subtrees are not identical, so we can't coalesce them.
We have two remaining identical subtrees, so let's merge them, too.
Our bloat is greatly reduced! From our original 7 vertices, we now have 8.
Unfortunately, direct path enumeration is too slow to be useful. Fortunately, we can use Dynamic Programming.
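Coalescing can be illustrated by treating structurally equal subtrees as the same object. The sketch below is an assumption-laden example rather than the talk's code: the weights are an assumed reading of the figure, the tree is built with a hypothetical `bxt` helper, and Python's tuple equality stands in for the subtree comparison.

```python
from functools import lru_cache

# Example CFG: vertex -> [(successor, cycles to traverse the parent vertex)] (assumed weights).
CFG = {
    "S": [("A", 0)],
    "A": [("B", 2), ("C", 2)],
    "B": [("D", 1)],
    "C": [("D", 3)],
    "D": [("E", 1), ("F", 1)],
    "E": [("G", 1)],
    "F": [("G", 4)],
    "G": [("T", 1)],
    "T": [],
}


@lru_cache(maxsize=None)
def dist_to_sink(u):
    """Cheapest remaining cost from u to T (the graph is acyclic)."""
    return min((w + dist_to_sink(v) for v, w in CFG[u]), default=0)


def bxt(u, remaining):
    """Bounded execution tree as nested tuples (vertex, children)."""
    if dist_to_sink(u) > remaining:
        return ("X", ())
    return (u, tuple(bxt(v, remaining - w) for v, w in CFG[u]))


def distinct_subtrees(tree, seen=None):
    """Collect structurally distinct subtrees; equal tuples collapse inside the set."""
    seen = set() if seen is None else seen
    seen.add(tree)
    for kid in tree[1]:
        distinct_subtrees(kid, seen)
    return seen


merged = distinct_subtrees(bxt("S", 10))
print(sum(1 for name, _ in merged if name not in ("S", "T", "X")))  # 8
```

The two copies of D survive because one of them still reaches F and the other aborts to X, so their subtrees differ.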
11Dynamic Programming
- Dynamic programming
- Remember answers to previously solved problems, look them up later instead of recomputing them
- This works when we have repeated subproblems or overlapping subproblems.
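To make the idea concrete, here is a small generic example of memoization, unrelated to the talk's own code. The `diamond_chain` test graph and the `count_paths` helper are hypothetical: a chain of 20 if/else diamonds has 2^20 paths, yet remembering one answer per vertex means each vertex is solved only once.

```python
def diamond_chain(n):
    """Hypothetical test graph: n if/else diamonds in sequence, giving 2**n paths."""
    graph = {}
    for i in range(n):
        graph[f"d{i}"] = [f"l{i}", f"r{i}"]  # branch point
        graph[f"l{i}"] = [f"d{i + 1}"]       # then-arm rejoins
        graph[f"r{i}"] = [f"d{i + 1}"]       # else-arm rejoins
    graph[f"d{n}"] = []                      # sink
    return graph


def count_paths(graph, u, memo):
    """Number of paths from u to the sink, remembering each vertex's answer."""
    if u not in memo:
        memo[u] = sum(count_paths(graph, v, memo) for v in graph[u]) if graph[u] else 1
    return memo[u]


graph = diamond_chain(20)
memo = {}
total = count_paths(graph, "d0", memo)
print(total, len(memo))  # 1048576 61
```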
12Applying Dynamic Programming
- Need to select repeated subproblems or overlapping subproblems
- Our overlapping subproblem
- Given an execution subtree rooted at u and a cycle budget B, what is the bounded execution subtree $\mathrm{bxt}_B(u)$?
- $\mathrm{bxt}_i(u)$ and $\mathrm{bxt}_j(u)$ are often identical when i, j are close
- We rely on the concept of intervals of budgets
- Interval $[i,j]$ for the execution tree rooted at vertex u such that $\mathrm{bxt}_i(u) = \mathrm{bxt}_j(u)$.
- For all k such that $i \le k \le j$, $\mathrm{bxt}_i(u) = \mathrm{bxt}_j(u)$ implies $\mathrm{bxt}_i(u) = \mathrm{bxt}_k(u) = \mathrm{bxt}_j(u)$
- From here on, we will only deal with maximal intervals
13Intervals
- Using intervals for dynamic programming
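- We want the interval for u, B = 10
- Recurse for child intervals
- Adjust for edge weights
- Intersect child intervals
- Now we want B = 8
- 8 is in our interval [8,12]! No need to recompute!
- Now we want B = 6
- Still need to compute new intervals.
[Figure: vertex u below source S, with total budget B = 20 and budget values 6, 8, 10; children v1 and v2 with intervals [7,13] and [3,8], edge weights 4 and 1, shifted intervals [7,12] and [8,14], and their intersection [8,12] at u]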
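The shift-and-intersect step can be worked through with the numbers on this slide. The `Interval` class below is a hypothetical helper, and the pairing of edge weights with child intervals (weight 4 with [3,8], weight 1 with [7,13]) is an assumption inferred from the arithmetic, since it is the pairing that reproduces [7,12], [8,14] and [8,12].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed range of budgets [lo, hi] that all yield the same bounded subtree."""
    lo: int
    hi: int

    def shift(self, weight):
        """Adjust a child's interval for the weight of the edge leading to it."""
        return Interval(self.lo + weight, self.hi + weight)

    def intersect(self, other):
        """Budgets for which both children keep their current subtrees."""
        return Interval(max(self.lo, other.lo), min(self.hi, other.hi))

    def covers(self, budget):
        return self.lo <= budget <= self.hi


left = Interval(3, 8).shift(4)    # Interval(lo=7, hi=12)
right = Interval(7, 13).shift(1)  # Interval(lo=8, hi=14)
at_u = left.intersect(right)      # Interval(lo=8, hi=12)

print(at_u, at_u.covers(10), at_u.covers(8), at_u.covers(6))
```

Budgets 10 and 8 both fall inside [8,12], so the second query is a lookup; budget 6 falls outside it and needs a fresh computation.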
14The Algorithm
- Preliminaries
- Interval data type as described
- Interval search object
- Can associate a vertex and an interval
- inserts, lookups
- Assume we have done a reverse Dijkstra over the
entire graph from T, storing the result in d(v,T).
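A minimal sketch of the reverse Dijkstra preliminary, assuming the example CFG with edge weights read off the earlier figure (an assumption) and a hypothetical function name `reverse_dijkstra`. It reverses every edge and runs the standard algorithm from T, so that `dist[v]` plays the role of d(v,T).

```python
import heapq

# Example CFG: vertex -> [(successor, cycles to traverse the parent vertex)] (assumed weights).
CFG = {
    "S": [("A", 0)],
    "A": [("B", 2), ("C", 2)],
    "B": [("D", 1)],
    "C": [("D", 3)],
    "D": [("E", 1), ("F", 1)],
    "E": [("G", 1)],
    "F": [("G", 4)],
    "G": [("T", 1)],
    "T": [],
}


def reverse_dijkstra(graph, sink="T"):
    """Return d(v, sink) for every vertex that can reach the sink."""
    reverse = {u: [] for u in graph}
    for u, edges in graph.items():
        for v, w in edges:
            reverse[v].append((u, w))  # flip each edge

    dist = {sink: 0}
    heap = [(0, sink)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale queue entry
        for u, w in reverse[v]:
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist


dist = reverse_dijkstra(CFG)
print(dist["S"], dist["F"])  # 6 5
```

A vertex u reached with remaining budget R can be rejected immediately when `dist[u] > R`, because even its cheapest completion would overrun.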
15The Algorithm
interval bxg(integer R, vertex u)
    if we've already computed this subtree
        return the interval
    else if we're over budget
        record an exception interval and return it
    else if we're at the sink
        record a basis interval and return it
    else
        recursively analyze children
        combine results
        record and return it
- To create the BXG, we call bxg(B,S), where B is the budget and S is the root of the CFG.
- The bounded execution graph is embedded in the intervals we have computed
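One possible reading of the pseudocode above, written as a runnable sketch. Several details are assumptions made here rather than taken from the talk: the example weights, the use of a plain list per vertex as the interval search object, the exception interval being everything below d(u,T), the basis interval being [0, infinity), and all helper names.

```python
import math

# Example CFG: vertex -> [(successor, cycles to traverse the parent vertex)] (assumed weights).
CFG = {
    "S": [("A", 0)],
    "A": [("B", 2), ("C", 2)],
    "B": [("D", 1)],
    "C": [("D", 3)],
    "D": [("E", 1), ("F", 1)],
    "E": [("G", 1)],
    "F": [("G", 4)],
    "G": [("T", 1)],
    "T": [],
}


def distances_to_sink(graph):
    """d(v,T) by memoized recursion; sufficient for acyclic graphs."""
    dist = {}

    def d(u):
        if u not in dist:
            dist[u] = min((w + d(v) for v, w in graph[u]), default=0)
        return dist[u]

    for u in graph:
        d(u)
    return dist


def build_bxg(graph, budget, source="S", sink="T"):
    dist = distances_to_sink(graph)
    table = {u: [] for u in graph}  # interval search object: vertex -> [(lo, hi, node)]
    nodes = {}                      # BXG node id -> (CFG vertex, child node ids)

    def record(u, lo, hi, node):
        table[u].append((lo, hi, node))
        return lo, hi, node

    def new_node(u, children):
        node = f"{u}#{len(nodes)}"
        nodes[node] = (u, children)
        return node

    def bxg(remaining, u):
        for lo, hi, node in table[u]:          # already computed this subtree?
            if lo <= remaining <= hi:
                return lo, hi, node
        if dist[u] > remaining:                # over budget: exception interval
            return record(u, -math.inf, dist[u] - 1, "X")
        if u == sink:                          # at the sink: basis interval
            return record(u, 0, math.inf, new_node(u, []))
        lo, hi, children = -math.inf, math.inf, []
        for v, w in graph[u]:                  # recursively analyze children
            c_lo, c_hi, c_node = bxg(remaining - w, v)
            lo, hi = max(lo, c_lo + w), min(hi, c_hi + w)  # shift, then intersect
            children.append(c_node)
        return record(u, lo, hi, new_node(u, children))

    lo, hi, root = bxg(budget, source)
    return (lo, hi), root, nodes


interval, root, nodes = build_bxg(CFG, 10)
code = [n for n, (u, _) in nodes.items() if u not in ("S", "T")]
print(interval, len(code))  # (9, 10) 8
```

On the example this goes straight from the CFG to 8 code vertices, matching the coalesced BXG, and the root interval says that budgets 9 and 10 produce the same graph.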
16Evaluation Methodology
- The duplication factor is crucial
- High code bloat makes the algorithm useless
- To test the algorithm, we created synthetic CFGs by applying graph transformations analogous to grammar production rules in a C-like language
- Roughly 1000 vertices each, corresponding to code size of around 3600 instructions.
- Our real programs of interest are much smaller (50 vertices, <1000 instructions)
- Acyclic graphs only for this examination
- We also tested the algorithm on one real CFG from our problem area.
17Performance Real CFG
- Original code size: 180
- Worst-case: 296 at 85 cycles (about 1.6x)
- Longest path: 108 cycles
[Figure: IPv4 Header Format; instructions versus budget (cycles)]
18Performance Synthetic CFGs
- Mean worst-case: 1.6
- Median worst-case: 1.08
- Worst worst-case: 23.4
[Figure: Code Duplication Distribution; percentage of CFGs versus maximum duplication required (normalized)]
19Future Work
- Examine more real CFGs
- Strategies to compact the BXG by adding run-time checks when they won't impact safe paths
- Incorporate program analysis for branch constraints
- Detect loop iteration bounds
- Detect mutually exclusive paths
- Program analysis for the real WCET
- Constant propagation over all branch constraints
- Solves a decidable variation of the halting problem