Improving Extension Reliability Using Language-Based Techniques

1
Improving Extension Reliability Using
Language-Based Techniques

Ph.D. Qualifying Examination
Feng Zhou
CS, UC Berkeley
11/21/2006

2
Motivation

OSes and applications often run loadable extensions
e.g. Linux kernel, Apache, Firefox
Extensions run in the same protection domain as the host
Extensions are often buggier than hosts
Device drivers cause a large percentage of Windows crashes
Xbox hacked due to memory bugs in games

3
Problem Statement

The Extension Isolation Problem
Detect extension failures and protect other parts
of the system from these failures

4
Why Do Extensions Fail?

Memory-safety problems
Null pointer dereference
Buffer overrun
Dangling pointers
Concurrency problems
Race conditions
Deadlocks
Domain-specific problems (e.g. interrupt-related)
Improper API usage
Using a file descriptor after closing it
See [Sullivan91], [Christmansson96], [Chou01]

5
Previous Approaches

User-level drivers (e.g. in Windows Driver Foundation)
Currently for non-interrupt drivers and non-performance-critical ones only
Separate hardware protection domain & virtual machines: Nooks [Swift et al], L4 [LeVasseur et al], Xen [Fraser et al]
Coarse-grained, high overhead, system specific
Binary instrumentation: SFI [Wahbe et al, Small/Seltzer]
Coarse-grained, system specific
Static analysis + software guards: XFI [Erlingsson et al]
Verification works at binary level

6
Conjecture

We can get more reliable extensions with:
a bit more information in the language (C with annotations)
more advanced, cooperating compiler and runtime support

7
A Language-Based Approach to Extension Isolation

Light-weight annotations in extension code and host API
A src-to-src compiler tries to verify safety
Emits runtime checks when necessary
Hybrid checking
Runtime tracks resource usage and restores system invariants when extensions fail

[Diagram: Annotated source -> src-to-src compiler -> C w/ checks -> GCC -> Extension. The extension, the runtime/recovery component and the host program share one address space.]
8
Goals and Non-goals

Fine-grained safety checks and few false positives
Low performance overhead
Require few changes and no hardware support
Detect common bugs, e.g. memory and concurrency bugs
Do not aim to block malicious code
Gain knowledge about how to improve the API for better safety and recoverability

9
Outline

Introduction
SafeDrive
Memory Safety Checking
Extension Recovery
Future work
SafeDrive Improvements
Locking safety
Better device driver API for Linux
Timeline

10
SafeDrive Overview

Detects and recovers from memory safety problems in Linux device drivers [OSDI'06]
Adds fine-grained type safety, to extensions only
Maintains a compatible kernel-driver binary interface
Provides a way to recover from detected failures by restarting drivers

11
Deputy Compiler

Deputy compiler by Jeremy Condit et al.

Compiler emits runtime checks
No memory layout change -> can be applied to one extension at a time
Bounds annotations: safe, count(n)
Also handles null-terminated strings, tagged unions, open arrays, printf

struct {
  unsigned int len;
  int * count(len) data;
} x;

for (i = 0; i < x.len; i++) {
  if (i < 0 || i >= x.len) abort();   /* runtime check emitted by the compiler */
  ... x.data[i] ...
}

void clear(char * count(size) buf, int size);
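The effect of the count(len) annotation can be approximated in plain C. The sketch below is not Deputy output: the annotation is erased with an empty macro so that an ordinary compiler accepts the declaration, and checked_get and sum_buf are hypothetical helpers that perform by hand the bounds check that would otherwise be generated. The helper returns an error code instead of calling abort() so that the behaviour is easy to observe.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: an ordinary C compiler does not know the annotation, so it is
   erased here. Deputy itself would read it and generate the check. */
#define count(n)

struct int_buf {
    unsigned int len;
    int *count(len) data;       /* data points to len ints */
};

/* Hypothetical helper: a hand-written stand-in for the check that would
   guard x->data[i]. Returns 0 on success, -1 if i is out of bounds. */
int checked_get(const struct int_buf *x, long i, int *out)
{
    assert(x != NULL && out != NULL);
    if (i < 0 || (unsigned long)i >= x->len)
        return -1;
    *out = x->data[i];
    return 0;
}

/* Sums the buffer, reading every element through the checked accessor. */
long sum_buf(const struct int_buf *x)
{
    long total = 0;
    for (unsigned int i = 0; i < x->len; i++) {
        int v;
        if (checked_get(x, (long)i, &v) == 0)
            total += v;
    }
    return total;
}
```

Because the bound lives in the same struct as the pointer, no fat pointer is needed and the memory layout seen by unannotated code stays the same.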

12
Deputy Guarantees

Deputy guarantees type safety if:
Programmer correctly annotates globals and
function parameters used by the extension
Deallocation does not create dangling pointers
Trusted casts are correct
External modules / trusted code establish and
preserve invariants specified by existing
annotations
Concurrent accesses are properly synchronized

13
SafeDrive's Use of Deputy

Kernel API functions and data structures used by drivers are annotated (in header files)
One-time cost
Function parameters and global data structures in drivers are annotated
1-4% of lines
Kernel code needs no annotations and is trusted.

[Diagram: Annotated driver + annotated kernel headers -> Deputy -> C w/ checks -> GCC -> instrumented driver module]
14
Failure Handling

Everything runs inside the same protection domain
After a Deputy check failure, we could just halt
More useful: clean up the extension and let the host continue
Assumption: restarts should fix most transient failures

[Diagram: Annotated driver -> Deputy -> C w/ checks -> GCC -> driver module. The driver module and the SafeDrive runtime (recovery) sit alongside the Linux kernel in the kernel address space.]
15
Update Tracking and Restarts

Free resources and undo state changes done by the driver
Kernel API functions are wrapped to do update tracking
Each update has a compensation, e.g. spin_lock(l) vs. spin_unlock(l)
After a failure, undo updates in LIFO order
Then restart the driver
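The update-tracking idea can be sketched as a small undo log in user-space C. Everything here is illustrative rather than SafeDrive's actual code: update_log, track_update, undo_all, wrapped_acquire and the toy resource are hypothetical names, and the fixed-size array stands in for whatever storage the real runtime uses. Each wrapper performs the update and records its compensation; after a failure the log is replayed in LIFO order.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UPDATES 64

typedef void (*undo_fn)(void *arg);

struct update { undo_fn undo; void *arg; };

struct update_log {
    struct update entries[MAX_UPDATES];
    size_t n;
};

/* Records the compensation for an update. Returns 0, or -1 if the log is full. */
int track_update(struct update_log *log, undo_fn undo, void *arg)
{
    assert(log != NULL && undo != NULL);
    if (log->n == MAX_UPDATES)
        return -1;
    log->entries[log->n].undo = undo;
    log->entries[log->n].arg = arg;
    log->n++;
    return 0;
}

/* Runs every recorded compensation, most recent first, and empties the log. */
void undo_all(struct update_log *log)
{
    while (log->n > 0) {
        struct update u = log->entries[--log->n];
        u.undo(u.arg);
    }
}

/* Toy resource standing in for a kernel object. Releases are appended to a
   trace so that the undo order is observable. */
int release_trace[MAX_UPDATES];
size_t trace_len;

void toy_release(void *arg)
{
    if (trace_len < MAX_UPDATES)
        release_trace[trace_len++] = *(int *)arg;
}

/* Wrapper in the style of a tracked kernel API call: the acquisition itself
   is elided, only the compensation is recorded. */
int wrapped_acquire(struct update_log *log, int *resource_id)
{
    return track_update(log, toy_release, resource_id);
}
```

Since the compensations are kernel-side functions recorded by the wrappers, undoing them does not require running any driver code.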

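[Diagram: Annotated driver -> Deputy -> C w/ checks -> GCC -> driver module. Wrappers with update tracking sit between the driver module and the Linux kernel, next to the recovery component, all in the kernel address space.]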
16
Return Gracefully from Failure

Invariants
No driver code is executed after failure

[Diagram: call chain Kernel foo() -> Driver bar1() -> Driver bar2(); an error code is returned to the kernel]
17
Return Gracefully from Failure

Invariants
No driver code is executed after failure
No kernel function is forced to return early

[Diagram: call chain Kernel foo1() -> Driver bar1() -> Kernel foo2() -> Driver bar2(), with lock() and unlock() calls marked]
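One way to satisfy both invariants can be sketched in user-space C with setjmp/longjmp. This is an assumption about how such control flow could be arranged, not a description of the SafeDrive runtime: call_driver, call_kernel, driver_fail and the scenario functions are hypothetical, and placing lock()/unlock() inside foo2 is a guess at what the diagram shows. A failure jumps to the innermost driver entry wrapper, which returns an error code to the kernel caller; the kernel function then runs to completion, and the driver-to-kernel wrapper refuses to resume the driver code that called it.

```c
#include <assert.h>
#include <setjmp.h>

#define MAX_DEPTH 8
#define ERR_FAILED (-1)

typedef int (*fn_t)(void);

jmp_buf entry_points[MAX_DEPTH];  /* one slot per active driver entry */
int depth;                        /* number of active driver entries */
int driver_failed;                /* set on failure, cleared only by a restart */

/* Jumps to the innermost active driver entry point. Never returns. */
void unwind_to_entry(void)
{
    assert(depth > 0);
    depth--;
    longjmp(entry_points[depth], 1);
}

/* Called where a runtime check fails inside driver code. */
void driver_fail(void)
{
    driver_failed = 1;
    unwind_to_entry();
}

/* Kernel -> driver wrapper: the only way the kernel enters driver code. */
int call_driver(fn_t driver_fn)
{
    if (driver_failed || depth == MAX_DEPTH)
        return ERR_FAILED;              /* no driver code after a failure */
    if (setjmp(entry_points[depth]) != 0)
        return ERR_FAILED;              /* reached via longjmp */
    depth++;
    int rc = driver_fn();
    depth--;
    return rc;
}

/* Driver -> kernel wrapper: lets the kernel function finish, then skips the
   rest of the calling driver code if a nested driver call failed meanwhile. */
int call_kernel(fn_t kernel_fn)
{
    int rc = kernel_fn();
    if (driver_failed)
        unwind_to_entry();
    return rc;
}

/* Scenario: foo1 -> bar1 -> foo2 -> bar2, where bar2 fails. */
int lock_held, foo2_completed, bar1_resumed;

int driver_bar2(void) { driver_fail(); return 0; /* not reached */ }

int kernel_foo2(void)
{
    lock_held = 1;                      /* lock() */
    int rc = call_driver(driver_bar2);  /* fails inside bar2 */
    lock_held = 0;                      /* unlock() still runs */
    foo2_completed = 1;
    return rc;
}

int driver_bar1(void)
{
    call_kernel(kernel_foo2);
    bar1_resumed = 1;                   /* must never execute after the failure */
    return 0;
}

int kernel_foo1(void) { return call_driver(driver_bar1); }
```

Calling kernel_foo1() yields ERR_FAILED with the lock released: foo2 was not cut short, and bar1 never continued past its call into the kernel.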
18
Discussion

Compared to Nooks
Significantly less interception -> much simpler overall
Deputy does fine-grained per-allocation checks -> no separate heap/stack
No help from virtual memory hardware
Works for user-level applications and safe languages
Compared to C++/Java exceptions
Compensation does not contain any code from the driver
Only restores host state, not extension state

19
Implementation

Deputy compiler: 20K lines of OCaml
Kernel patch to 2.6.15.5: 1K lines
Kernel headers patch: 1.9K lines
Patches for 6 drivers in 4 categories
Network: e1000, tg3
USB: usb-storage
Sound: intel8x0, emu10k1
Video: nvidia

20
Evaluation: Recovery Rate

Inject random errors with compile-time injection
5 errors from one of 7 categories each time
Faults chosen following empirical studies [Sullivan & Chillarege 91], [Christmansson & Chillarege 96]
Scan overrun, loop fault, corrupt parameter, off-by-one, flipped condition, missing call, missing assignment
Load buggy e1000 driver with and without SafeDrive
Exercise it by downloading an 89MB file, verifying it and unloading the driver
Then rerun with the original driver

21
Recovery Rate Results

140 runs, 20 per fault category

SafeDrive is effective at detecting and
recovering from crashing problems, and can detect
some statically.

22
More Results

Annotation burden
1-4% of driver code changed for annotations
A smaller amount of wrapper code, which can be automatically generated in the future
Performance
<25% overhead for driver micro-benchmarks
E.g. TCP send w/ Netperf
About 1/10 the overhead of Nooks in two comparable experiments

23
Outline

Introduction
SafeDrive
Memory Safety Checking
Extension Recovery
Future work
SafeDrive Improvements
Locking safety
Better device driver API for Linux
Timeline

24
SafeDrive Improvements

Separate wrappers from driver/kernel headers
To evolve with new versions of drivers and kernel
Needs kernel loader support
Tools for usability
Identify driver entry functions and generate
wrappers
List all unwrapped kernel functions called, to
help identify API functions to wrap
Annotate more drivers and run on live servers
Release the code

25
Common locking problems in kernel

Race conditions
Deadlocks
Acquiring the same spinlock multiple times
Acquiring multiple locks in different orders
Using locks in wrong contexts
Acquiring a mutex in interrupt context
Acquiring a spinlock in interrupt context and also in process context with interrupts enabled

26
Hybrid lock checking

Assign a static name to each lock
Combine dynamically allocated locks into static ones
Some functions are annotated with process/interrupt contexts
Inference propagates these annotations
An analysis checks locking safety
Context constraints
Consistent global ordering
27
Hybrid lock checking (2)

When not sure, emit runtime checks
Runtime checks are done with lockdep in the Linux kernel
Lockdep [Molnar06] builds lock ordering and context constraints at runtime
Stores lock orderings in a big hash table
Consumes memory and causes significant slowdown
With hybrid checking
Locks verified to be safe do not need to be tracked by lockdep
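The runtime half of the hybrid scheme can be illustrated with a much simplified ordering checker. This is not lockdep: it uses a small fixed matrix instead of a hash table, ignores interrupt context, and detects only direct inversions between pairs of lock classes rather than longer cycles. The names checked_acquire, checked_release and statically_verified are hypothetical. Lock classes marked as statically verified bypass the tracking entirely, which is where the hybrid approach would save time and memory.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CLASSES 16
#define MAX_HELD 8

bool ordered_before[MAX_CLASSES][MAX_CLASSES]; /* [a][b]: a was held when b was taken */
int held[MAX_HELD];                            /* classes currently held */
int n_held;
bool statically_verified[MAX_CLASSES];         /* proven safe by the static analysis */

/* Returns 0 if the acquisition is consistent with the orderings seen so far,
   -1 on a re-acquisition, an order inversion, or too many held locks. */
int checked_acquire(int cls)
{
    assert(cls >= 0 && cls < MAX_CLASSES);
    if (statically_verified[cls])
        return 0;                              /* no runtime tracking needed */
    if (n_held == MAX_HELD)
        return -1;
    for (int i = 0; i < n_held; i++) {
        if (held[i] == cls || ordered_before[cls][held[i]])
            return -1;
    }
    for (int i = 0; i < n_held; i++)
        ordered_before[held[i]][cls] = true;   /* remember the observed order */
    held[n_held++] = cls;
    return 0;
}

/* Removes cls from the set of held classes, if it is tracked. */
void checked_release(int cls)
{
    assert(cls >= 0 && cls < MAX_CLASSES);
    if (statically_verified[cls])
        return;
    for (int i = n_held - 1; i >= 0; i--) {
        if (held[i] == cls) {
            held[i] = held[n_held - 1];
            n_held--;
            return;
        }
    }
}
```

Taking class 1 then class 2 records the order 1 before 2; a later attempt to take class 1 while holding class 2 is reported even if no deadlock actually occurs in that run.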

28
Better driver API for Linux

In most OSes, drivers communicate with the kernel through a wide and trusted API
2500 symbols exported to drivers in Linux
Some Linux driver API functions are not checkable for memory safety
Driver API improvements
Introduce shims in front of common functions to do parameter/invariant checking
Fix legacy functions where memory safety is not checkable
Change kernel data structures for memory safety
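A shim of the kind proposed here might look like the following sketch. Both functions are invented for illustration and are not Linux kernel APIs: legacy_fill stands for an existing function that trusts its caller about the buffer size, and shim_fill is a hypothetical wrapper that takes the size explicitly and validates the request before delegating.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical legacy API: writes n bytes and trusts that buf is big enough. */
void legacy_fill(char *buf, size_t n, char value)
{
    memset(buf, value, n);
}

/* Hypothetical shim: the caller passes the real buffer size, so the request
   can be checked before the legacy function touches memory.
   Returns 0 on success or -EINVAL if the parameters are inconsistent. */
int shim_fill(char *buf, size_t buf_size, size_t n, char value)
{
    if (buf == NULL || n > buf_size)
        return -EINVAL;
    legacy_fill(buf, n, value);
    return 0;
}
```

Making the size an explicit parameter is also what makes the call checkable by a tool that relies on bounds annotations.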

29
Better driver API for Linux (2)

Gauge of success
Whether more checking finds problems that went undetected in previous experiments
How many lines of trusted code are eliminated
Find real bugs?
Related work: Windows Driver Foundation (WDF)
A wrapper API on top of the existing Windows driver API
Better default values, parameter checking and backward compatibility
Backward compatibility is not 100% necessary for Linux

30
Outline

Introduction
SafeDrive
Memory Safety Checking
Extension Recovery
Future work
SafeDrive Improvements
Locking safety
Better device driver API for Linux
Timeline

31
Timeline

Phase 1: through Oct. 2006
Basic SafeDrive design and implementation
Phase 2: Oct. 2006 - Dec. 2006
Tools for SafeDrive
Locking safety
Phase 3: Jan. 2007 - May 2007
Modular locking safety
Better driver API
Phase 4: June 2007 - Dec. 2007
Wrap up and dissertation writing

33
Classification of OS problems

Due to [Sullivan91], based on IBM field data
Problems corrupting program data
75% memory-safety related
8% concurrency related
17% others
Regular problems
30% memory-safety related
14% concurrency related
56% others

34
How do you change bounds/tags?

struct {
  unsigned int len;
  int * count(len) data;
} x;

(1) x.data = NULL;
(2) if (x.data != NULL && (A < 0 || A > len)) abort();
    x.len = A;
(3) if (B < sizeof(int) * x.len) abort();
    x.data = malloc(B);
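The three-step update can be written out as a runnable function. This is a sketch under assumptions: the annotation is erased with an empty macro so that an ordinary compiler accepts it, resize_buf is a hypothetical helper, and failures are reported with a return code instead of abort(). The ordering matters because the pointer is NULL while the length changes, so the pair (len, data) never describes more memory than is actually allocated.

```c
#include <assert.h>
#include <stdlib.h>

#define count(n)   /* assumption: annotation erased for an ordinary compiler */

struct int_buf {
    unsigned int len;
    int *count(len) data;
};

/* Replaces the buffer with a zeroed one of new_len ints backed by alloc_bytes
   bytes. Returns 0 on success. On failure returns -1 and leaves an empty,
   consistent buffer (len == 0, data == NULL). */
int resize_buf(struct int_buf *x, unsigned int new_len, size_t alloc_bytes)
{
    assert(x != NULL);
    free(x->data);
    x->data = NULL;                        /* step 1: NULL satisfies any count */
    x->len = new_len;                      /* step 2: safe, data is NULL */
    if (new_len == 0)
        return 0;
    if (alloc_bytes / sizeof(int) < new_len) {
        x->len = 0;                        /* allocation would be too small */
        return -1;
    }
    x->data = calloc(1, alloc_bytes);      /* step 3: install the new memory */
    if (x->data == NULL) {
        x->len = 0;
        return -1;
    }
    return 0;
}
```

The size comparison uses a division rather than sizeof(int) * new_len so that it cannot overflow.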
35
Related Work

Improving memory safety of C
Safe C-like language: Cyclone [Morrisett et al]
Hybrid checking (non-modular): CCured [Necula et al]
Type qualifiers for static checking: CQual [Foster et al, Johnson/Wagner], Sparse [Torvalds]
Improving OS/extension reliability
Hardware protection: Nooks [Swift et al], L4 [LeVasseur et al], Xen [Fraser et al]
Binary instrumentation: SFI [Wahbe et al, Small/Seltzer], XFI [Erlingsson]
Using Cyclone: OKE [Bos/Samwel]
Static validation of API usage: SLAM [Ball et al]
Writing OS with safe language: Singularity [Patel et al]

36
Performance
[Chart: performance results for e1000 TCP recv, e1000 UDP recv, e1000 TCP send, e1000 UDP send, tg3 TCP recv, tg3 TCP send, usb-storage untar, emu10k1 aplay, intel8x0 aplay, nvidia xinit]

Nooks (Linux 2.4): e1000 TCP recv 46% (vs. 4%), e1000 TCP send 111% (vs. 12%)

37
Annotation Burden

1-4% of lines with Deputy annotations
Recovery wrappers can be automatically generated

38
Annotations Break-down

Common reasons for trusted casts and trusted code
Polymorphic private data, e.g. netdev->priv
Small number of cases where buffer bounds are not available
Code manipulating pointer values directly, e.g. PTR_ERR(x)
