Title: An Introduction to Proof-Carrying Code
1. An Introduction to Proof-Carrying Code
- David Walker
- Princeton University
- (slides kindly donated by George Necula, modified by David Walker)
3. Motivation
- Extensible systems can be more flexible and more efficient than client-server interaction
[Figure: client-server interaction (server, client) versus an extensible system (extension, host)]
4. Example: Deep-Space Onboard Analysis
[Figure: Data > 10 MB/sec; Bandwidth < 1 KB/sec; Latency > hours]
- Note
  - efficiency (cycles, bandwidth)
  - safety-critical operation
Source: NASA Jet Propulsion Lab
5. More Examples of Extensible Systems
Code (the extension) and the host it extends:
- Device driver: operating system
- Applet: web browser
- Loaded procedure: database server
- DCOM component: DCOM client
6. Concerns Regarding Extensibility
- Safety and reliability concerns
  - How to protect the host from the extensions?
  - Extensions of unknown origin => potentially malicious
  - Extensions of known origin => potentially erroneous
- Complexity concerns
  - How can we do this without having to trust a complex infrastructure?
- Performance concerns
  - How can we do this without compromising performance?
- Other concerns (not addressed here)
  - How to ensure privacy and authenticity?
  - How to protect the component from the host?
7. Approaches to Component Safety
- Digital signatures
- Run-time monitoring and checking
- Bytecode verification
- Proof-carrying code
8. Assurance Support: Digital Signatures
[Figure: Code]
- Example properties
  - Microsoft produced this software
  - Verisoft tested the software with test suite 17
- No direct connection with program semantics
  - Microsoft recently recommended that Microsoft be removed from one's list of trusted code signers
9. Run-Time Monitoring and Checking
[Figure: Code, Monitor]
- A monitor detects attempts to violate the safety policy and stops the execution
  - Hardware-enforced memory protection
  - Software fault isolation (sandboxing)
  - Java stack inspection
- Relatively simple; effective for many properties
- Either inflexible or expensive on its own
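The monitoring idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `MonitoredMemory`, `SafetyViolation`, `untrusted_extension`, and `run_monitored` are hypothetical names, and the safety policy (the extension may touch only addresses in a fixed range) is assumed for the example. It shows both sides of the trade-off on the slide: the monitor is simple, but every single access pays for a check.

```python
class SafetyViolation(Exception):
    """Raised by the monitor when an access breaks the safety policy."""


class MonitoredMemory:
    """Hypothetical host memory that checks every access at run time."""

    def __init__(self, size, lo, hi):
        self.cells = [0] * size
        self.lo, self.hi = lo, hi  # policy: only cells[lo:hi] may be touched
        self.checks = 0            # counts the run-time cost of monitoring

    def _check(self, addr):
        self.checks += 1
        if not (self.lo <= addr < self.hi):
            raise SafetyViolation(
                f"address {addr} outside [{self.lo}, {self.hi})")

    def load(self, addr):
        self._check(addr)
        return self.cells[addr]

    def store(self, addr, value):
        self._check(addr)
        self.cells[addr] = value


def untrusted_extension(mem, n):
    """Writes n consecutive cells starting at address 8."""
    for i in range(n):
        mem.store(8 + i, i)


def run_monitored(extension, mem, *args):
    """Run the extension; stop it at the first policy violation."""
    try:
        extension(mem, *args)
        return "ok"
    except SafetyViolation as err:
        return f"stopped: {err}"
```

With `mem = MonitoredMemory(32, 8, 16)`, calling `run_monitored(untrusted_extension, mem, 8)` completes, while asking for 9 cells is stopped at the out-of-range store. In both cases the check runs once per access, which is the cost that static approaches try to avoid.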
10. Java Bytecode
[Figure: Code, Compiler, JVM bytecode, Code]
- Relatively simple; overall an excellent idea
- Large trusted computing base
  - commercial, optimizing JIT: 200,000-500,000 LOC
  - when was the last time you wrote a bug-free 200,000-line program?
- Java-specific; somewhat limited policies
11. Proof-carrying code
[Figure: Code, Compiler, JVM bytecode, Proof]
- Flexible interfaces like the JVM model
- Small trusted computing base (minimum of 3000 LOC)
- Can be somewhat more language/policy independent
- Building an optimizing, type-preserving compiler is much harder than building an ordinary compiler
13. Proof-carrying code
[Figure: Code, Compiler, JVM bytecode, Proof]
Question: Isn't it hard, perhaps impossible, to check properties of assembly language?
Actually, no, not really, provided we have a proof to guide the checker.
14. Proof-Carrying Code: An Analogy
16. Proof-carrying code
[Figure: Code, Compiler, JVM bytecode, Proof]
Question: Well, aren't you just avoiding the real problem then? Isn't it extremely hard to generate the proof?
Yes. But there is a trick.
17. PCC: Type-Preserving Compilation
[Figure: Code, Compiler, JVM bytecode, Proof; Types, Types Compiler]
- The trick: we fool the programmer into doing our proof for us!
  - We convince them to program in a type-safe language.
  - We design our compiler to translate the typing derivation into a proof of safety
- We can always make this work for type safety properties
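To make the trick concrete, here is a toy Python sketch of type-preserving compilation. Everything in it is an assumption made for illustration: the tiny expression language, the stack-machine instruction names, and the functions `compile_expr`, `compile_program`, and `check` are hypothetical and not taken from any real PCC system. The compiler type-checks the source and emits, next to each instruction, the stack typing that holds after it; this list plays the role of the proof. The checker trusts neither the compiler nor the code and only verifies each step against a fixed table of typing rules.

```python
# Typing rule per instruction: (types popped, type pushed).
RULES = {
    "push_int": ((), "int"),
    "push_bool": ((), "bool"),
    "add": (("int", "int"), "int"),
    "lt": (("int", "int"), "bool"),
}


def compile_expr(expr, stack, code, cert):
    """Untrusted compiler: type-check expr, emit code and certificate.

    stack is the tuple of types on the stack before expr runs.
    Returns the stack typing after expr has been evaluated.
    """
    tag = expr[0]
    if tag == "int":
        instr, out = ("push_int", expr[1]), stack + ("int",)
    elif tag == "bool":
        instr, out = ("push_bool", expr[1]), stack + ("bool",)
    elif tag in ("add", "lt"):
        s1 = compile_expr(expr[1], stack, code, cert)
        s2 = compile_expr(expr[2], s1, code, cert)
        if s2[-2:] != ("int", "int"):
            raise TypeError(f"{tag} expects two ints")
        instr, out = (tag,), stack + (RULES[tag][1],)
    else:
        raise ValueError(f"unknown expression {tag!r}")
    code.append(instr)
    cert.append(out)  # one certificate entry per emitted instruction
    return out


def compile_program(expr):
    code, cert = [], []
    compile_expr(expr, (), code, cert)
    return code, cert


def check(code, cert):
    """Trusted checker: accept iff every certificate step follows a rule."""
    if len(code) != len(cert):
        return False
    stack = ()
    for instr, claimed in zip(code, cert):
        rule = RULES.get(instr[0])
        if rule is None:
            return False
        pops, push = rule
        n = len(pops)
        if n and stack[-n:] != pops:
            return False  # operands missing or of the wrong type
        expected = stack[:len(stack) - n] + (push,)
        if tuple(claimed) != expected:
            return False  # certificate does not match the code
        stack = expected
    return True
```

For example, `compile_program(("lt", ("add", ("int", 1), ("int", 2)), ("int", 5)))` yields code and a certificate that `check` accepts, while swapping the first instruction for `("push_bool", True)` makes `check` reject the pair. In a language this small the checker could recompute the annotations itself; they stand in for the typing derivation that guides the checker in a realistic system. Note also that `check` accepts any correctly annotated code, including hand-written code that never went through `compile_program`.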
18. Good Things About PCC
- Someone else does the really hard work (the compiler writer)
  - Hard to prove safety but easy to check a proof
  - Research over the last 5-10 years indicates we can produce proofs of type safety properties for assembly language
- Requires minimal trusted infrastructure
  - Trust proof checker but not the compiler
  - Again, recent research shows PCC TCB can be as small as 3000 LOC
- Agnostic to how the code and proof are produced
  - Not compiler specific; hand-optimized code is OK
- Can be much more general than the JVM type system
  - Only limited by the logic that is used (and we can use very general logics)
- Coexists peacefully with cryptography
  - Signatures are a syntactic checksum
  - Proofs are a semantic checksum
  - (see Appel and Felten's proof-carrying authorization)
19. The Different Flavors of PCC
- Type Theoretic PCC (Morrisett, Walker, et al. 1998)
  - source-level types are translated into low-level types for machine language or assembly language programs
  - the proof of safety is a typing derivation that is verified by a type checker
- Logical PCC (Necula, Lee, 1996, 1997)
  - low-level types are encoded as logical predicates
  - a verification-condition generator runs over the program and emits a theorem, which, if true, implies the safety of the program
  - the proof of safety is a proof of this theorem
- Foundational PCC (Appel et al. 2000)
  - the semantics of the machine is encoded directly in logic
  - a type system for the machine is built up directly from the machine semantics and proven correct using a general-purpose logic (e.g. higher-order logic)
  - the total TCB is approximately 3000 LOC
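The verification-condition generator of the Logical PCC flavor above can be illustrated with a deliberately small Python sketch. All of it is assumed for the example: the three-instruction machine (`set`, `addi`, `load`), the policy that every load address must lie in a fixed range, the precondition that the input register `r0` lies in a known range, and the function names `vcgen`, `holds`, and `is_safe`. The generator walks straight-line code symbolically and emits one condition per load. In a real system the code producer would ship a proof of those conditions and the host would only check it; here the conditions are simple enough that `holds` decides them directly.

```python
def vcgen(program):
    """Walk straight-line code and emit one safety condition per load.

    Symbolic values: ("in", c) means r0's initial value plus c,
    ("const", c) is a known constant, ("unknown",) is anything else.
    """
    regs = {"r0": ("in", 0)}  # r0 holds the untrusted input on entry
    vcs = []
    for op, *args in program:
        if op == "set":        # set rd, c      : rd := c
            rd, c = args
            regs[rd] = ("const", c)
        elif op == "addi":     # addi rd, rs, c : rd := rs + c
            rd, rs, c = args
            val = regs.get(rs, ("unknown",))
            regs[rd] = val if val[0] == "unknown" else (val[0], val[1] + c)
        elif op == "load":     # load rd, ra    : rd := mem[ra]
            rd, ra = args
            vcs.append(regs.get(ra, ("unknown",)))
            regs[rd] = ("unknown",)  # loaded data is not tracked
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return vcs


def holds(vc, pre, policy):
    """Decide: for all r0 in pre, the address described by vc is in policy.

    pre and policy are half-open ranges (lo, hi); pre is assumed non-empty.
    """
    pre_lo, pre_hi = pre
    lo, hi = policy
    if vc[0] == "const":
        return lo <= vc[1] < hi
    if vc[0] == "in":
        return lo <= pre_lo + vc[1] and pre_hi - 1 + vc[1] < hi
    return False  # nothing is known about the address


def is_safe(program, pre, policy):
    return all(holds(vc, pre, policy) for vc in vcgen(program))
```

For the program `[("addi", "r1", "r0", 4), ("load", "r2", "r1"), ("set", "r3", 15), ("load", "r4", "r3")]`, `vcgen` emits conditions on the addresses `r0 + 4` and `15`. With the precondition `0 <= r0 < 8` and readable addresses `0..15` both hold, so no run-time check is needed; widening the precondition to `0 <= r0 < 13` makes the first condition fail.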
20. The Common Theme
- Every general-purpose system for proof-carrying code relies upon a type system for checking low-level program safety
  - why?
  - building a proof of safety for low-level programs is hard
  - success depends upon being able to structure these proofs in a uniform, modular fashion
  - types provide the framework for developing well-structured safety proofs
- In the following lectures, we will study the low-level typing mechanisms that are the basis for powerful systems of proof-carrying code