Stacks, Heaps and Regions: One Logic to Bind Them

1
Stacks, Heaps and Regions: One Logic to Bind Them

David Walker
Princeton University
SPACE 2004

2
Stacks, Heaps and Regions: One Logic to Bind Them

David Walker
Princeton University
With Amal Ahmed and Limin Jia

3
Certifying Compilers
[Diagram: Source Program -> Certifying Compiler -> Machine Code + Safety Proof]

Certifying compilers produce:
machine code
a safety proof (type safety, thread safety, memory safety)
Uses:
trustworthy mobile code
safety-critical systems
compiler debugging
4
Certifying Compilers

Low-level typing abstractions:
support diverse source languages
support diverse implementation and optimization strategies
provide a clean interface between compiler and mechanical safety checkers

[Diagram: Java, C, ML -> Transform, Optimize -> Low-level Typing Abstractions -> Machine Code + Safety Proof (typing invariants encoded)]
5
TALx86 Lessons (Morrisett et al.)

Checking control-flow safety is fairly easy
State and memory management is the hard part
new typing algorithms for each new compiler trick:
machine register state
heap memory (pointers, structs, ...)
stack memory (stack pointers, stack structs, ...)
user-managed memory (more pointers, aliasing info, ...)
Results:
complex, ad hoc axioms (type checker less trustworthy)
repeated work
abstractions not generally composable or reusable

6
A Goal for SPACE 20...

What we are looking for: a new proof-carrying code system/typed assembly language for safe memory management
More uniform and more general
Easier to understand (simpler semantics)
Allows reuse and composition of abstractions
A promising approach: search for new logics that can capture common storage invariants
Following the insights of Ishtiaq, O'Hearn, Pym, Reynolds, and others on storage semantics and separation logic
And the logical design techniques and work on logical frameworks of Pfenning, the CMU crew and others

7
This Talk

What recurring properties of memory do we need to
reason about in a proof-carrying code system?
Internalizing storage properties in a modal
substructural logic
Semantics of formulae
Using the logic to describe state in a low-level
type system (briefly)
Related and future work
This talk is based on work at TLDI '03 and LICS '03

8
Property 1 Separation

The memory for the heap is separate from the
memory for the stack
The register EAX is separate from register EBX
(and ECX, etc...)
In general, memory A is separate from memory B if
the domain of A does not overlap with the domain
of B
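
The definition of separation can be made concrete with a small sketch. It assumes, purely for illustration, that a memory is a Python dict from integer locations to values; `separate`, `disjoint_union`, `update` and `restrict` are hypothetical helper names, not part of the talk.

```python
# Illustrative model (an assumption of this sketch): a memory is a dict
# mapping integer locations to values.

def separate(a, b):
    """A is separate from B when their domains do not overlap."""
    return a.keys().isdisjoint(b.keys())

def disjoint_union(a, b):
    """Combine two separate memories into one."""
    if not separate(a, b):
        raise ValueError("memories overlap")
    return {**a, **b}

def update(mem, loc, value):
    """Return a copy of mem with an existing location overwritten."""
    if loc not in mem:
        raise KeyError(loc)
    return {**mem, loc: value}

def restrict(mem, domain):
    """Keep only the part of mem whose locations lie in domain."""
    return {loc: v for loc, v in mem.items() if loc in domain}

stack = {74: 1, 75: 2}
heap = {7: 10, 8: 20, 9: 30}
whole = disjoint_union(stack, heap)

# Updating a stack location leaves the heap part of the memory unchanged.
after = update(whole, 74, 99)
assert restrict(after, heap.keys()) == heap
assert restrict(after, stack.keys()) == {74: 99, 75: 2}
```

Because the two domains are disjoint, the update can only touch one side of the split, which is the property the next slide relies on.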

[Figure: memory locations grouped into a stack and a heap, shown alongside registers EAX and EBX]
9
Property 1 Separation

The importance of separation:
If memory A is separate from memory B, then updates to A have no impact on B
E.g., updating the stack does not change values in the heap
E.g., updating EAX does not change the contents of EBX
E.g., deallocating region r1 has no impact on region r2 (if they are separate)
Present in:
Linear type systems
TALx86
Ishtiaq, O'Hearn and Reynolds' separation logic

10
Property 2 Adjacency

A struct is a sequence of adjacent locations
An activation record is a sequence of adjacent
locations
A stack is a sequence of adjacent activation
records
In general, A is adjacent to B if the greatest
location in A is next to the least location in B,
and A is separate from B
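
A minimal sketch of the adjacency test, under stated assumptions: memories are dicts keyed by integer locations, the successor of location n is n + 1, and the empty memory counts as adjacent to anything. The last choice is an assumption of this sketch, not something the slide states, and `adjacent` is a hypothetical name.

```python
def adjacent(a, b):
    """A is adjacent to B: separate, and max(dom A) is next to min(dom B)."""
    if not a.keys().isdisjoint(b.keys()):
        return False
    if not a or not b:
        return True  # assumption: the empty memory is adjacent to anything
    return max(a) + 1 == min(b)

record1 = {7: "x", 8: "y"}   # e.g. one activation record
record2 = {9: "z"}           # the record that follows it

assert adjacent(record1, record2)
assert not adjacent(record2, record1)   # adjacency is ordered
assert not adjacent({7: 0}, {9: 0})     # location 8 lies in between
```

Unlike separation, adjacency is not symmetric, which is why the logic later needs a connective without the exchange property.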

[Figure: a stack drawn as adjacent blocks a1, a2, rest..., with a pointer labelled top]
11
Property 2 Adjacency

The importance of adjacency:
If memory A is adjacent to memory B and we can access A, then we can access B
E.g., using a pointer to the beginning of a struct, we can access all of its elements
E.g., using a pointer to the top of the stack, we can access the items in the current activation record
Present in:
TALx86
Foundational PCC (Appel et al.)
Ordered type systems (Petersen et al.)

12
Property 3 Containment

Register EAX can contain an integer value (or a
pointer value or other kinds of values)
A memory location (say, $\ell_7$) can contain a sequence of 32 bits
A user-managed memory region may contain a
collection of memory locations.

[Figure: register EAX containing the value 3; a memory location drawn as 32 bits numbered 31 down to 0; region R7 containing several locations]
13
Property 3 Containment

The importance of containment:
If A is contained in memory region r and region r has property P, then A has property P
E.g., EAX may contain an integer; if so, we can add 3 to the contents of EAX
E.g., memory region R1 may contain live data; if so, we can dereference pointers into that region
Present in:
Tofte and Talpin's region calculus
Cardelli, Gardner, Ghelli and Gordon's ambient, tree and graph logics
TALx86 (registers, static data segment, stack and heap)

14
Property 4 Aliasing

Two pointers are aliases of one another if they refer to the same location.
Aliasing information is important: if x and y are aliases, changing memory at x changes memory at y
Present in:
every system!
TALx86 reasoned about heap aliases and stack aliases

[Figure: pointers x and y (x = y) referring to the same location, which holds 3]
15
This Talk

What recurring properties of memory are
convenient for reasoning in a proof-carrying code
system?
Internalizing storage properties in a modal
substructural logic
Semantics of formulae
Using the logic to describe state in a low-level
type system
Related and future work
This talk is based on work at TLDI '03 and LICS '03

16
Preliminaries - Memories

A memory is a mapping from locations to values.
Each location may have a single successor.
Successor relation gives rise to an ordering.
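Locations may be composite: extending a location $\ell$ with a name $n$ gives $\ell.n$, e.g. .R1.a7 and .R2.a14.b0

[Figure: a memory m containing regions r1 and r2, each holding several locations]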
17
Formulae

Predicates $q ::= t$
Formulae $F ::= q$
Semantics of formulae given by $m \models F \,@\, \ell$
F describes memory m, whose contents are located in place $\ell$ ($\ell$ acts like a constraint on the memory)
Simplest case:
$m \models t \,@\, \ell$ iff $\mathrm{dom}(m) = \{\ell\}$ and $\vdash m(\ell) : t$

18
Formulae

Example
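$m \models \mathit{int} \,@\, \ell_3$ if m maps $\ell_3$ to 5
(notice $\vdash m(\ell_3) : \mathit{int}$)

[Figure: memory m with the single location $\ell_3$ holding 5]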
19
Formulae Separation

Predicates $q ::= t$
Formulae $F ::= q \mid F_1 \otimes F_2$
$m \models F_1 \otimes F_2 \,@\, \ell$ iff there exist disjoint $m_1$ and $m_2$ such that
$m_1 \models F_1 \,@\, \ell$ and $m_2 \models F_2 \,@\, \ell$
and $m = m_1 \uplus m_2$
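
The separating clause can be read as a search over splits of the memory. The sketch below is a brute-force checker under simplifying assumptions that are not in the talk: atomic formulae carry their location explicitly as `("at", loc, ty)`, only the types `int` and `char` exist, and `has_type`, `splits` and `satisfies` are hypothetical names.

```python
from itertools import combinations

def has_type(value, ty):
    """Tiny stand-in for the value typing judgment."""
    if ty == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if ty == "char":
        return isinstance(value, str) and len(value) == 1
    raise ValueError(f"unknown type {ty!r}")

def splits(m):
    """Yield every way of dividing m into two disjoint memories m1, m2."""
    locs = list(m)
    for k in range(len(locs) + 1):
        for left in combinations(locs, k):
            m1 = {loc: m[loc] for loc in left}
            m2 = {loc: m[loc] for loc in locs if loc not in left}
            yield m1, m2

def satisfies(m, f):
    """Does memory m satisfy formula f?"""
    if f[0] == "at":                      # ("at", loc, ty)
        _, loc, ty = f
        # The memory must consist of exactly this one location.
        return set(m) == {loc} and has_type(m[loc], ty)
    if f[0] == "tensor":                  # ("tensor", F1, F2)
        _, f1, f2 = f
        return any(satisfies(m1, f1) and satisfies(m2, f2)
                   for m1, m2 in splits(m))
    raise ValueError(f"unknown formula {f!r}")

m = {3: 5, 16: "a"}
assert satisfies(m, ("tensor", ("at", 3, "int"), ("at", 16, "char")))
# An atom describes the whole memory, so it fails when extra locations remain.
assert not satisfies(m, ("at", 3, "int"))
# The same location cannot be claimed by both sides of the split.
assert not satisfies({3: 5}, ("tensor", ("at", 3, "int"), ("at", 3, "int")))
```

Enumerating all splits is exponential in the size of the memory; it is used here only to mirror the "there exist disjoint m1 and m2" wording directly.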

20
Formulae Separation

Example
$m_1 \models F_1$ and $m_2 \models F_2$, both at the root place

[Figure: two disjoint memories $m_1$ and $m_2$, drawn separately with their locations and contents]
21
Formulae Separation

Example
$m_1 \uplus m_2 \models F_1 \otimes F_2$ at the root place

[Figure: the combined memory $m_1 \uplus m_2$, containing the locations of both $m_1$ and $m_2$]
22
Formulae Adjacency

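Predicates $q ::= t$
Formulae $F ::= q \mid F_1 \otimes F_2 \mid F_1 \circ F_2$
$m \models F_1 \circ F_2 \,@\, \ell$ iff there exist adjacent (and disjoint) $m_1$, $m_2$ such that
$m_1 \models F_1 \,@\, \ell$ and $m_2 \models F_2 \,@\, \ell$
and $m = m_1 \uplus m_2$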
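
For the adjacency connective the search space shrinks: if the first part must end exactly where the second begins, the only candidate splits are cuts of the sorted locations. The sketch below assumes integer locations with successor n + 1, atoms written `("at", loc, ty)`, a single type `int`, and an empty memory that is adjacent to anything; all names are hypothetical.

```python
def has_type(value, ty):
    """Tiny stand-in for the value typing judgment (only int here)."""
    if ty == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    raise ValueError(f"unknown type {ty!r}")

def adjacent_splits(m):
    """Yield the splits (m1, m2) of m in which m1 is adjacent to m2.

    With m1 below m2 and no gap between them, m1 must be a prefix of the
    sorted locations, so it is enough to try each cut point.
    """
    locs = sorted(m)
    for i in range(len(locs) + 1):
        left, right = locs[:i], locs[i:]
        if left and right and left[-1] + 1 != right[0]:
            continue  # a gap separates the two parts
        yield ({loc: m[loc] for loc in left},
               {loc: m[loc] for loc in right})

def satisfies(m, f):
    """Does memory m satisfy formula f?"""
    if f[0] == "at":                      # ("at", loc, ty)
        _, loc, ty = f
        return set(m) == {loc} and has_type(m[loc], ty)
    if f[0] == "fuse":                    # ("fuse", F1, F2)
        _, f1, f2 = f
        return any(satisfies(m1, f1) and satisfies(m2, f2)
                   for m1, m2 in adjacent_splits(m))
    raise ValueError(f"unknown formula {f!r}")

pair = {7: 1, 8: 2}
assert satisfies(pair, ("fuse", ("at", 7, "int"), ("at", 8, "int")))
assert not satisfies(pair, ("fuse", ("at", 8, "int"), ("at", 7, "int")))
assert not satisfies({7: 1, 9: 2}, ("fuse", ("at", 7, "int"), ("at", 9, "int")))
```

The second assertion shows the lack of exchange: swapping the two operands changes the meaning of the formula.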

23
Formulae Adjacency

Example
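$m_1 \models F_1$ and $m_2 \models F_2$, both at the root place

[Figure: two memories $m_1$ and $m_2$ drawn next to each other, the last location of $m_1$ directly before the first location of $m_2$]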
24
Formulae Adjacency

Example
$m_1 \uplus m_2 \models F_1 \circ F_2$ at the root place

[Figure: the combined memory $m_1 \uplus m_2$ as one run of adjacent locations]
25
Formulae Containment

Predicates $q ::= t$
Formulae $F ::= q \mid F_1 \otimes F_2 \mid F_1 \circ F_2 \mid n[F]$
$m \models n[F] \,@\, \ell$ iff $m \models F \,@\, \ell.n$
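
The containment clause just pushes a name onto the place at which the inner formula is checked. A sketch, assuming that places are tuples of names with the empty tuple as the root, that a hierarchical memory is a dict from such tuples to values, and that formulae are the tagged tuples shown in the comments; none of these encodings comes from the talk.

```python
from itertools import combinations

def has_type(value, ty):
    """Tiny stand-in for the value typing judgment."""
    if ty == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if ty == "char":
        return isinstance(value, str) and len(value) == 1
    raise ValueError(f"unknown type {ty!r}")

def splits(m):
    """Yield every division of m into two disjoint memories."""
    paths = list(m)
    for k in range(len(paths) + 1):
        for left in combinations(paths, k):
            yield ({p: m[p] for p in left},
                   {p: m[p] for p in paths if p not in left})

def satisfies(m, f, place=()):
    """Does memory m satisfy formula f at the given place?"""
    tag = f[0]
    if tag == "type":                     # ("type", ty): a bare predicate
        return set(m) == {place} and has_type(m[place], f[1])
    if tag == "in":                       # ("in", n, F): the formula n[F]
        return satisfies(m, f[2], place + (f[1],))
    if tag == "tensor":                   # ("tensor", F1, F2)
        return any(satisfies(m1, f[1], place) and satisfies(m2, f[2], place)
                   for m1, m2 in splits(m))
    raise ValueError(f"unknown formula {f!r}")

registers = {("eax",): 5, ("ebx",): "a"}
formula = ("tensor",
           ("in", "eax", ("type", "int")),
           ("in", "ebx", ("type", "char")))
assert satisfies(registers, formula)

# Nested containment: a location a7 inside region R1.
assert satisfies({("R1", "a7"): 3}, ("in", "R1", ("in", "a7", ("type", "int"))))
```

Checking `eax[int]` at the root therefore reduces to checking `int` at the place `.eax`, which is the reasoning spelled out in the examples that follow.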

26
Formulae - Containment

Example
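$m \models \mathit{eax}[\mathit{int}]$ at the root place, since $m \models \mathit{int} \,@\, .\mathit{eax}$
since $\vdash m(.\mathit{eax}) : \mathit{int}$

[Figure: memory m with register eax holding 5]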
29
Formulae - Containment

Example
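$m \models \mathit{eax}[\mathit{int}] \otimes \mathit{ebx}[\mathit{char}]$ at the root place
since $m_1 \models \mathit{eax}[\mathit{int}]$ and $m_2 \models \mathit{ebx}[\mathit{char}]$ at the root place
since $m_1 \models \mathit{int} \,@\, .\mathit{eax}$ and $m_2 \models \mathit{char} \,@\, .\mathit{ebx}$

[Figure: memory m with eax holding 5 and ebx holding the character a]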
30
Aliasing

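Types $t ::= \mathit{int} \mid \mathit{bool} \mid S(\ell) \mid \ldots$
Predicates $q ::= t$
Formulae $F ::= q \mid F_1 \otimes F_2 \mid F_1 \circ F_2 \mid n[F]$
$\vdash v : S(\ell)$ iff $v = \ell$ (all values with type $S(\ell)$ are aliases of one another)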
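
Singleton types make aliasing visible in the types: a value has type S of a location only if it is that location. A small sketch, assuming locations are represented by strings and types by `"int"` or the tuple `("S", loc)`; the representation and the name `has_type` are illustrative only.

```python
def has_type(value, ty):
    """ty is "int" or ("S", loc), the singleton type of location loc."""
    if ty == "int":
        return isinstance(value, int) and not isinstance(value, bool)
    if isinstance(ty, tuple) and ty[0] == "S":
        return value == ty[1]
    raise ValueError(f"unknown type {ty!r}")

# Registers eax and ebx both hold the location "a2", which holds 7.
m = {"eax": "a2", "ebx": "a2", "a2": 7}

assert has_type(m["eax"], ("S", "a2"))
assert has_type(m["ebx"], ("S", "a2"))

# Two values of the same singleton type must be equal, i.e. aliases.
assert m["eax"] == m["ebx"]

# A write through one alias is visible through the other.
m[m["eax"]] = 8
assert m[m["ebx"]] == 8
```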

31
Aliasing

Example
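$m \models \mathit{eax}[S(.a2)] \otimes \mathit{ebx}[S(.a2)] \otimes a2[\mathit{int}]$ at the root place

[Figure: memory m in which eax and ebx are aliases, both referring to location a2, which holds 7]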
32
One More Useful Predicate

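Types $t ::= \mathit{int} \mid \mathit{bool} \mid S(\ell) \mid \ldots$
Predicates $q ::= t \mid \mathsf{more}^{\leftarrow} \mid \mathsf{more}^{\rightarrow}$
Formulae $F ::= q \mid F_1 \otimes F_2 \mid F_1 \circ F_2 \mid n[F] \mid \ldots$

[Figure: two memories, one satisfying $\mathsf{more}^{\leftarrow}$ and one satisfying $\mathsf{more}^{\rightarrow}$, each drawn as a run of adjacent locations that trails off (. . .) in one direction]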
33
Simple Machine Memory Layout

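$(\mathsf{more}^{\leftarrow} \circ \ell_{hd}[t] \circ F_{tail} \otimes F_{heap} \circ \ell_{ap}[t] \circ \mathsf{more}^{\rightarrow})$
$\otimes\ r1[t_1] \otimes r2[t_2] \otimes \ldots \otimes sp[S(\ell_{hd})] \otimes ap[S(\ell_{ap})]$

[Figure: a memory laid out as $\mathsf{more}^{\leftarrow}$, $F_{tail}$, $F_{heap}$, $\mathsf{more}^{\rightarrow}$, with sp pointing to $\ell_{hd}$ and ap pointing to $\ell_{ap}$, next to registers r1, r2, sp and ap]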
34
More logic

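Predicates $q ::= t \mid \mathsf{more}^{\leftarrow} \mid \mathsf{more}^{\rightarrow}$
Formulae $F ::= q \mid F_1 \otimes F_2 \mid F_1 \circ F_2 \mid n[F] \mid 1 \mid F_1 \multimap F_2 \mid F_1 \mathbin{\&} F_2 \mid \top \mid F_1 \oplus F_2 \mid 0 \mid f \mid \forall b.F \mid \exists b.F$
Bindings $b ::= \ell{:}L \mid n{:}N \mid a{:}T \mid f{:}F$
$m \models 1$ iff dom(m) is empty
$m \models F_1 \mathbin{\&} F_2$ iff $m \models F_1$ and $m \models F_2$
$m \models \top$ (holds for any memory m)
....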

35
Logical Deduction

Judgments have the form $q \parallel D \vdash F \,@\, \ell$
q is a variable context: a list of free variables and their kinds
D is a bunched context: trees rather than lists (O'Hearn and Pym, 1999)
Bunched contexts are built from:
an object at a place, $(F \,@\, \ell)$
adjacent storage (no exchange property)
separate storage (exchange property)
36
Logical Deduction

The natural deduction rules are sound with
respect to the storage semantics
Semantics of contexts: $m \models D$
Theorem (Soundness)
If $m \models D$ and $\cdot \parallel D \vdash F \,@\, \ell$ then $m \models F \,@\, \ell$.

37
This Talk

What recurring properties of memory are
convenient for reasoning in a proof-carrying code
system?
Internalizing storage properties in a modal
substructural logic
Semantics of formulae
Using the logic to describe state in a low-level
type system
Related and future work
This talk is based on work at TLDI '03 and LICS '03

38
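Mini-KAM: Simplified ML Kit Abstract Machine

Registers r ::= acc1 | acc2 | sp
Values v ::= ....
Instructions i ::= immed1(v) | immed2(v) | add | sub | push | pop | selectStack(i) | storeStack(i) | select(i) | store(i) | letRgnInf | endRgnInf | alloc(i)

Types $t ::= \mathit{int} \mid S(\ell) \mid \mathit{live} \mid \mathit{dead} \mid (F \,@\, \ell) \to 0$
Integers: $\vdash 5 : \mathit{int}$
Places: $\vdash \ell : S(\ell)$
Region status: $\vdash \mathit{live} : \mathit{live}$ and $\vdash \mathit{dead} : \mathit{dead}$
Code locations: $\vdash c : (F \,@\, \ell) \to 0$
Means it is safe to jump to c with a memory m such that $m \models F \,@\, \ell$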

40
Mini-KAM: Simplified ML Kit Abstract Machine

Mini-KAM Store Hierarchy

acc1 acc2 sp stack R1 . . . Rn
R1live ? F ? (a- ? more?)
stmore? ? ak- ? . . ? a1- ? ?
current activation record
description of data in region
region allocation boundary
live region
stack tail
stack area
41
Using Formulae in Typing Rules

Judgments of the form $F \,@\, \ell$ can be used to describe the pre- and postconditions of instructions
Instruction typing judgment q ? F _at_ ? ? i
F _at_ ?

42
Using Formulae in Typing Rules

Judgment q ? F _at_ ? ? i F _at_ ?
In J, look up the type of place ?.n
J(?.n) F if ?? J ? (? ? nF ) _at_ ?
Rule for add instruction
(F _at_ ?)(.acc1) int (F _at_ ?)(.acc2)
int
q ? F _at_ ? ? add F _at_ ?
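
Read informally, the add rule asks that the places for acc1 and acc2 both have type int and leaves the description of the store unchanged. The sketch below is a heavy simplification made for illustration: the store description is a plain dict from places to type names rather than a formula, lookup is a dictionary access rather than a logical entailment, and `IllTyped` and `check_add` are hypothetical names.

```python
class IllTyped(Exception):
    """Raised when an instruction's precondition does not hold."""

def check_add(description):
    """Check the precondition of add and return the postcondition.

    Both accumulator registers must be described as int; the rule on the
    slide keeps the same description after the instruction.
    """
    for place in (".acc1", ".acc2"):
        if description.get(place) != "int":
            raise IllTyped(f"{place} must have type int")
    return dict(description)

pre = {".acc1": "int", ".acc2": "int", ".sp": "S(.stack.n0)"}
post = check_add(pre)
assert post == pre
```

In the system described on the slide the lookup is justified by a derivation in the logic, so a real checker would need the proof rules rather than a table.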

43
Using Formulae in Typing Rules

Judgment q ? J ? i J (where J is of the
form F _at_ p)
J(.sp)S(.stack.n0) J(.acc1)t
q ? J ? storeStack(i) J.stack.no i
t

( storeStack)
In J, update the type of place ?.no
i J?.noi t (F1 ? n0- ? ??? ? nit ?
F2) ? F3 _at_ ? if ?? uJ ? ((F1 ?
n0- ? ??? ? ni- ? F2) ? F3) _at_ ?
44
This Talk

What recurring properties of memory are
convenient for reasoning in a proof-carrying code
system?
Internalizing storage properties in a modal
substructural logic
Semantics of formulae
Using the logic to describe state in a low-level
type system
Related and future work
This talk is based on work at TLDI '03 and LICS '03

45
Related Work

Reasoning about adjacency:
Stack-based TAL (Morrisett et al., 1998)
Foundational PCC reasoning about memory allocation (Appel et al.)
$\lambda^{ord}$: a calculus for reasoning about data layout at the frontier (Petersen et al., 2003)
Reasoning about aliasing:
Long history . . . singleton types for aliasing (Smith, Walker and Morrisett) continue to be useful
Spatial logics: separation and/or containment
BI, separation logic (Ishtiaq, O'Hearn, Reynolds and others, 2000, 2001)
Ambient logic (Cardelli and Gordon, 2000)
Tree and graph logics (Cardelli, Gardner, Ghelli, 2002)

46
Lots More Work to Do

Add inductive definitions and syntactic rules for reasoning about arrays and recursive data structures
Investigate encodings for common invariants:
stack-allocation algorithms
region-allocation algorithms
aliasing patterns
Better understand the connection between modal (hybrid) logic and regions

47
Conclusion

Described a unified framework for reasoning about
Separation
Adjacency
Containment
Aliasing
Semantics are sound, simple and uniform
Logic forms the basis for a sound and flexible
low-level type system
See TLDI '03 and LICS '03 for details

50
May Alias Formula

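When two bits of storage (at a1 and a2) may alias:
$\exists a_1.\ \exists a_2.\ (a_1[\mathit{int}] \otimes \top) \mathbin{\&} (a_2[\mathit{int}] \otimes \top)$
Both memories satisfy the formula

[Figure: two memories; in one, distinct locations a1 and a2 hold 5 and 7; in the other, a single location a holds 5]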
51
Example Saving Temporaries on the Stack

Code and describing formula
(b-stackgrow)(x 2)
(b-unpack)(x 2)
sub sp,sp,2
st sp[0],r1
st sp[1],r2
<Code for A>
ld r1,sp[0]
ld r2,sp[1]
add sp,sp,2

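Fpre:
$(\mathsf{more}^{\leftarrow} \circ \ell_1[a_1] \circ \ell_2[a_2] \circ \ell[t] \circ F_1) \otimes sp[S(\ell)] \otimes r1[t_1] \otimes r2[t_2]$
Fpost:
$(\mathsf{more}^{\leftarrow} \circ \ell_1[a_1] \circ \ell_2[a_2] \circ \ell[t] \circ F_1) \otimes sp[S(\ell_1)] \otimes r1[t_1] \otimes r2[t_2]$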
52
Formulae Wrapped in Types

Types $t ::= \mathit{int} \mid S(p) \mid (F \,@\, p) \to 0$
Informally, $c : (F \,@\, p) \to 0$ means it is safe to jump to c with a memory m such that $m \models F \,@\, p$

53
Motivation: Certifying Compilers
[Diagram: Source Program -> Certifying Compiler -> Safety Proof + Machine Code]
54
Motivation: Certifying Compilers
[Diagram: a type-preserving compiler drawn as a pipeline of boxes: Source Program; Parse, Typecheck; High-level Typed IL; Analysis, Optimization; Medium-level Typed IL; Code Generation; Typed Assembly Language; Assembler; Hints; Prover; Safety Proof; Machine Code]
55
Motivation: Certifying Compilers
[Diagram: the same pipeline with several source-language front ends (Java, ML), each with its own High TIL and Optimize stages, feeding a shared Medium-level Typed IL; Code Generation; Typed Assembly Language; Assembler; Hints; Prover; Safety Proof; Machine Code]
56
Motivation: Proof-Carrying Code

The Princeton foundational PCC system (Appel et al.)
Scaling PCC to production compilers and realistic languages
Some requirements:
Multiple source languages, single target language
Core proof system must be general and flexible:
support for general language features
handle different implementation and optimization strategies
Trusted computing base should be small, to limit security bugs

57
PCC System Layers of Abstraction
Compiler
High-level typing abstractions
Low-level typing abstractions
Semantics of types
Machine spec
Higher-order logic
58
A Hard Problem (Semantics)

Semantics of memory updates and memory reuse
Semantic model of ML-style mutable references
(Ahmed, Appel, Virga, 2002)
To handle ML function closures
extended model with mutable references to
(impredicative) polymorphic types (Ahmed, Appel,
Virga, 2003)
To allow memory reuse
extended model to support region-based memory
management

59
Motivation: Certifying Compilers
[Diagram: compiler pipeline drawn as boxes: Java, C, ML; High-level Typed IL; Analysis, Optimization; Medium-level Typed IL; Typing abstractions (TAL); Prover; Machine Code + Safety Proof]

The typing abstractions should be general and flexible: they should support many language features and many implementation and optimization strategies
60
Typing Abstractions for Memory

Reasoning about memory is complicated
many different memory management strategies,
aliasing patterns, data layout possibilities,
etc.
Systems for safe mobile code would benefit from
a unified framework for reasoning about a variety
of invariants
convenient abstractions that help structure
proofs of memory safety

61
Abstractions for Memory?
62
Abstractions for Memory?
[Diagram: three certifying-compiler pipelines side by side (Cornell Popcorn/Cyclone, Cedilla Systems Special J, Princeton Foundational PCC). Each runs Source -> High TIL -> Medium TIL -> Machine Code + Safety Proof; the low-level stages shown are TALx86, LTAL, VCGen with Prover, and Prover]
63
Abstractions for Memory?

Reasoning about memory is complicated
many different memory management strategies, aliasing patterns, data layout possibilities, etc.

66
Lessons from Typed Assembly Language

Lesson 1
Much of the type theory designed for higher-level
languages can be reused to help verify machine
code.
TAL is just the closed, continuation-passing
style polymorphic lambda calculus ()
Lesson 2
The hard part is memory management and memory safety.

67
One Logic to Bind Them

New goals for general-purpose safe memory
management
composable abstractions
reusable abstractions
orthogonal abstractions
comprehensible abstractions
A unified composable framework for reasoning
about
separation of objects (memory blocks)
adjacency of objects
aliasing of pointers
containment of one place in another
Proof that deduction in our logic is sound with
respect to the memory model
Use logic in a type system for an IL for
region-based memory management (Mini-KAM) and
prove that the language is sound

68
This Talk

Logical formulae and the memory model
Flat memory
Hierarchical memory
Type system for Mini-KAM (informally)

69
A Logical Approach to Memory Management

One logic for reasoning about key storage
properties
separation of objects (memory blocks)
adjacency of objects
containment of one place in another
aliasing of pointers
Logic comes with
orthogonal connectives to internalize key
properties
syntactic proof rules
sound store semantics
Logic is incorporated into a typed abstract
machine
safe stack, heap and region-based memory
management

70
Formulae Multiplicative Unit

Predicates $q ::= t$
Formulae $F ::= q \mid F_1 \otimes F_2 \mid F_1 \circ F_2 \mid 1$
$m \models 1$ iff dom(m) is empty

[Figure: the empty memory m]
71
Hierarchical Memories
m
73
Hierarchical Memories, Paths

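[Figure: a hierarchical memory m with regions R1 and R2, each containing several locations]

Path/place: extending a place p with a name n gives the place p.n, e.g. $.R1.\ell_7$ and $.R2.\ell_{14}$
A hierarchical memory is a mapping from paths to values.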

74
Formulae Containment

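Predicates $q ::= t \mid \mathsf{more}^{\leftarrow} \mid \mathsf{more}^{\rightarrow}$
Formulae $F ::= q \mid F_1 \otimes F_2 \mid F_1 \circ F_2 \mid 1 \mid F_1 \mathbin{\&} F_2 \mid \top \mid F_1 \oplus F_2 \mid 0 \mid f \mid \forall b.F \mid \exists b.F \mid n[F]$
Bindings $b ::= p{:}P \mid n{:}N \mid a{:}T \mid f{:}F$

Semantics given by $m \models F \,@\, p$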
75
Formula Semantics Separation

Formulae $F ::= F_1 \otimes F_2 \mid n[F]$
$m \models (F_1 \otimes F_2) \,@\, p$ iff there exist disjoint $m_1$ and $m_2$ such that
$m_1 \models F_1 \,@\, p$ and $m_2 \models F_2 \,@\, p$
and $m = m_1 \uplus m_2$

76
Formula Semantics Separation

Example
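$m_1 \models F_1$ and $m_2 \models F_2$, both at the root place
$\mathrm{dom}(m_1) = \{.R5.\ell_3\}$ and $\mathrm{dom}(m_2) = \{.R5.\ell_4\}$

[Figure: $m_1$ holds 3 at location $\ell_3$ of region R5; $m_2$ holds 3 at location $\ell_4$ of region R5]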
77
Formula Semantics Separation

Example
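$m_1 \models F_1$ and $m_2 \models F_2$, both at the root place
$\mathrm{dom}(m_1) = \{.R5.\ell_3\}$ and $\mathrm{dom}(m_2) = \{.R5.\ell_4\}$

[Figure: the combined memory $m_1 \uplus m_2$, with region R5 holding 3 at both $\ell_3$ and $\ell_4$]
$m_1 \uplus m_2 \models (F_1 \otimes F_2)$ at the root place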
78
Sample Deductive Rules
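(hypothesis): $q \parallel F \,@\, p \vdash F \,@\, p$
(n I): from $q \parallel D \vdash F \,@\, p.n$ infer $q \parallel D \vdash n[F] \,@\, p$
(n E): from $q \parallel D \vdash n[F] \,@\, p$ infer $q \parallel D \vdash F \,@\, p.n$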