Improving Extension Reliability Using Language-Based Techniques

1
Improving Extension Reliability Using
Language-Based Techniques
  • Ph.D. Qualifying Examination
  • Feng Zhou
  • CS, UC Berkeley
  • 11/21/2006

2
Motivation
  • OSes and applications often run loadable
    extensions
  • e.g. Linux kernel, Apache, Firefox
  • Run in the same protection domain
  • Extensions are often buggier than hosts
  • Device drivers cause a large percentage of
    Windows crashes
  • Xbox hacked due to memory bugs in games

3
Problem Statement
  • The Extension Isolation Problem
  • Detect extension failures and protect other parts
    of the system from these failures

4
Why Do Extensions Fail?
  • Memory-safety problems
  • Null pointer dereference
  • Buffer overrun
  • Dangling pointers
  • Concurrency problems
  • Race conditions
  • Deadlocks
  • Domain specific problems (e.g. interrupt-related)
  • Improper API usage
  • Using a file descriptor after closing it
  • See Sullivan91, Christmansson96, Chou01

5
Previous Approaches
  • User-level drivers (e.g. in Windows Driver
    Foundation)
  • Currently for non-interrupt drivers and
    non-performance-critical ones only
  • Separate hardware protection domain & virtual
    machines: Nooks [Swift et al], L4 [LeVasseur et
    al], Xen [Fraser et al]
  • Coarse-grained, high overhead, system specific
  • Binary instrumentation: SFI [Wahbe et al],
    [Small/Seltzer]
  • Coarse-grained, system specific
  • Static analysis and software guards: XFI
    [Erlingsson et al]
  • Verification works at binary level

6
Conjecture
  • We can get more reliable extensions with:
  • a bit more information in the language (C w/
    annotations)
  • more advanced, cooperating compiler/runtime
    support

7
A Language-Based Approach to Extension Isolation
  • Light-weight annotations in extension code and
    host API
  • A src-to-src compiler tries to verify safety
  • Emits runtime checks when necessary
  • Hybrid checking
  • Runtime tracks resource usage and restores system
    invariants when an extension fails

[Diagram: Annotated source -> src-to-src compiler -> C w/ checks -> GCC -> Extension; the Extension, Runtime Recovery, and Host Program share one address space]
8
Goals and Non-goals
  • Fine-grained safety checks and few false
    positives
  • Low performance overhead
  • Require few changes and no hardware support
  • Detect common bugs, e.g. memory and concurrency bugs
  • Do not aim to block malicious code
  • Gain knowledge about how to improve API for
    better safety and recoverability

9
Outline
  • Introduction
  • SafeDrive
  • Memory Safety Checking
  • Extension Recovery
  • Future work
  • SafeDrive Improvements
  • Locking safety
  • Better device driver API for Linux
  • Timeline

10
SafeDrive Overview
  • Detects and recovers from memory safety problems
    in Linux device drivers
  • OSDI '06
  • Adds fine-grained type safety to extensions
    only
  • Maintains compatible kernel-driver binary
    interface
  • A way to recover from detected failures by
    restarting drivers

11
Deputy Compiler
  • Deputy compiler by Jeremy Condit et al.
  • Compiler emits runtime checks
  • No memory layout change -> can be applied to one
    extension at a time
  • Bounds: safe, count(n)
  • Null-terminated strings, tagged unions, open arrays,
    printf
  • Example: annotated struct and the inserted check
      struct {
        unsigned int len;
        int * count(len) data;
      } x;
      for (i = 0; i < x.len; i++) {
        if (i < 0 || i >= x.len) abort();
        ... x.data[i] ...
      }
  • Annotated function parameter
      void clear(char * count(size) buf, int size);
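The bounds annotation on this slide can be imitated in plain C. The sketch below is only an illustration, not Deputy itself: `COUNT` is a stand-in macro that expands to nothing, and `checked_get`, `sum`, and `demo_sum` are hypothetical names. The check is written by hand where a checking compiler would insert it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a count(n) annotation (assumption: expands to nothing,
 * so an ordinary C compiler accepts the declaration). */
#define COUNT(n)

struct int_buf {
    unsigned int len;
    int *COUNT(len) data;   /* data points to len ints */
};

/* Hand-written version of the check a checking compiler might emit
 * before each access to x->data[i]. */
static int checked_get(const struct int_buf *x, unsigned int i)
{
    assert(x != NULL);
    if (i >= x->len) {
        fprintf(stderr, "bounds check failed: %u >= %u\n", i, x->len);
        abort();
    }
    return x->data[i];
}

/* Sums all elements; every access goes through the bounds check. */
static int sum(const struct int_buf *x)
{
    int total = 0;
    for (unsigned int i = 0; i < x->len; i++)
        total += checked_get(x, i);
    return total;
}

/* Small driver for the sketch: a 4-element buffer holding 1..4. */
static int demo_sum(void)
{
    int storage[4] = {1, 2, 3, 4};
    struct int_buf x = { 4, storage };
    return sum(&x);
}
```

In the loop inside `sum`, the loop condition already implies the check, so a static analysis could prove it redundant and drop it; this is the kind of case where hybrid checking avoids runtime cost.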

12
Deputy Guarantees
  • Deputy guarantees type-safety if,
  • Programmer correctly annotates globals and
    function parameters used by the extension
  • Deallocation does not create dangling pointers
  • Trusted casts are correct
  • External modules / trusted code establish and
    preserve invariants specified by existing
    annotations
  • Concurrent accesses are properly synchronized

13
SafeDrive's Use of Deputy
[Diagram inputs: annotated driver, annotated kernel headers]
  • Kernel API functions and data structures used are
    annotated (header files)
  • One time cost
  • Function parameters and global data structures in
    drivers annotated
  • 1-4% of lines
  • Kernel code itself needs no annotations and is trusted.

[Diagram pipeline: Deputy -> C w/ checks -> GCC -> instrumented driver module]
14
Failure Handling
  • Everything runs inside the same protection domain
  • After a Deputy check failure, the system could just halt
  • More useful: clean up the extension and let the host
    continue
  • Assumption: restarts should fix most transient
    failures

[Diagram: Annotated driver -> Deputy -> C w/ checks -> GCC -> driver module; the driver module, SafeDrive Runtime Recovery, and the Linux kernel share the kernel address space]
15
Update Tracking and Restarts
  • Free resources and undo state changes done by
    driver
  • Kernel API functions wrapped to do update
    tracking
  • Compensations: e.g. spin_lock(l) is compensated by spin_unlock(l)
  • After failure, undo updates in LIFO order
  • Then restart driver

[Diagram: Annotated driver -> Deputy -> C w/ checks -> GCC -> driver module; Recovery, Update Tracking, and Wrappers sit between the driver module and the Linux kernel in the kernel address space]
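Update tracking with LIFO undo can be sketched as a small undo log. This is a minimal illustration under stated assumptions, not SafeDrive's actual runtime: `track_update`, `wrapped_acquire`, `recover_from_failure`, and the `toy_resource` type are hypothetical names, and a toy resource stands in for locks and allocations.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UPDATES 32

typedef void (*compensation_fn)(void *arg);

struct update {
    compensation_fn undo;   /* how to reverse the update */
    void *arg;              /* what it was applied to */
};

static struct update undo_log[MAX_UPDATES];
static size_t undo_top = 0;

/* Records the order in which compensations ran, for inspection. */
static int undo_order[MAX_UPDATES];
static size_t undo_count = 0;

/* Toy stand-in for a lock, buffer, or other tracked resource. */
struct toy_resource {
    int id;
    int in_use;
};

static void track_update(compensation_fn undo, void *arg)
{
    assert(undo_top < MAX_UPDATES);
    undo_log[undo_top].undo = undo;
    undo_log[undo_top].arg = arg;
    undo_top++;
}

/* Compensation: release the resource and note that we did so. */
static void release_resource(void *arg)
{
    struct toy_resource *r = arg;
    r->in_use = 0;
    undo_order[undo_count++] = r->id;
}

/* Wrapper around an "API call": do the update, then log its compensation. */
static void wrapped_acquire(struct toy_resource *r)
{
    r->in_use = 1;
    track_update(release_resource, r);
}

/* After a failure, run the compensations newest-first (LIFO). */
static void recover_from_failure(void)
{
    while (undo_top > 0) {
        undo_top--;
        undo_log[undo_top].undo(undo_log[undo_top].arg);
    }
}

/* The "driver" takes three resources and then fails. */
static int demo_recovery(void)
{
    struct toy_resource lock_a = {1, 0}, buffer = {2, 0}, lock_b = {3, 0};
    wrapped_acquire(&lock_a);
    wrapped_acquire(&buffer);
    wrapped_acquire(&lock_b);
    recover_from_failure();   /* simulated failure point */
    return lock_a.in_use + buffer.in_use + lock_b.in_use;
}
```

Note that the compensations contain no driver code: they are supplied by the wrapper layer, so recovery does not depend on the failed extension behaving correctly.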
16
Return Gracefully from Failure
  • Invariants
  • No driver code is executed after failure

[Diagram: call stack Kernel foo() -> Driver bar1() -> Driver bar2(); an error code is returned to the kernel]
17
Return Gracefully from Failure (2)
  • Invariants
  • No driver code is executed after failure
  • No kernel function is forced to return early

[Diagram: nested call stack Kernel foo1() -> Driver bar1() -> Kernel foo2() -> Driver bar2(), with a lock()/unlock() pair]
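One way to satisfy both invariants on a nested stack is to unwind only driver frames and let every kernel frame run to completion. The sketch below uses `setjmp`/`longjmp` in user-space C to illustrate the idea; it is an assumption about how such unwinding could work, not a description of SafeDrive's code, and `call_driver`, `call_kernel`, and `driver_fail` are hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>

#define MAX_NESTING 8

static jmp_buf entry_points[MAX_NESTING]; /* one per active driver entry */
static int nesting = 0;
static int driver_failed = 0;

static int lock_held = 0;     /* toy kernel lock state */
static int bar1_resumed = 0;  /* set if driver code runs after the failure */

/* Called when a runtime check fails inside driver code:
 * jump back to the innermost kernel -> driver entry point. */
static void driver_fail(void)
{
    driver_failed = 1;
    assert(nesting > 0);
    longjmp(entry_points[nesting - 1], 1);
}

/* Kernel -> driver wrapper: returns -1 to the kernel on failure. */
static int call_driver(void (*driver_fn)(void))
{
    if (driver_failed)
        return -1;                      /* never re-enter a failed driver */
    assert(nesting < MAX_NESTING);
    if (setjmp(entry_points[nesting]) != 0) {
        nesting--;                      /* driver frames were skipped */
        return -1;
    }
    nesting++;
    driver_fn();
    nesting--;
    return 0;
}

/* Driver -> kernel wrapper: the kernel function always finishes;
 * only afterwards do we skip the rest of the calling driver frame. */
static void call_kernel(void (*kernel_fn)(void))
{
    kernel_fn();
    if (driver_failed)
        longjmp(entry_points[nesting - 1], 1);
}

static void driver_bar2(void) { driver_fail(); }

static void kernel_foo2(void)
{
    lock_held = 1;                      /* lock() */
    call_driver(driver_bar2);           /* fails, returns -1 */
    lock_held = 0;                      /* unlock() still runs */
}

static void driver_bar1(void)
{
    call_kernel(kernel_foo2);
    bar1_resumed = 1;                   /* must not run after the failure */
}

static int kernel_foo1(void)
{
    return call_driver(driver_bar1);
}
```

Here `kernel_foo2` releases its lock because it is never forced to return early, while `driver_bar1` is never resumed.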
18
Discussion
  • Compared to Nooks
  • Significantly less interception -> much simpler
    overall
  • Deputy does fine-grained per-allocation checks ->
    no separate heap/stack
  • No help from virtual memory hardware
  • Works for user-level applications and safe
    languages
  • Compared to C++/Java exceptions
  • Compensation does not contain any code from
    driver
  • Only restores host state, not extension state

19
Implementation
  • Deputy compiler: 20K lines of OCaml
  • Kernel patch to 2.6.15.5: 1K lines
  • Kernel headers patch: 1.9K lines
  • Patch for 6 drivers in 4 categories
  • Network: e1000, tg3
  • USB: usb-storage
  • Sound: intel8x0, emu10k1
  • Video: nvidia

20
Evaluation: Recovery Rate
  • Inject random errors with compile-time injection:
    5 errors from one of 7 categories each time
  • Faults chosen following empirical studies
    [Sullivan & Chillarege 91, Christmansson &
    Chillarege 96]
  • Scan overrun, loop fault, corrupt parameter,
    off-by-one, flipped condition, missing call,
    missing assignment
  • Load buggy e1000 driver w/ and w/o SafeDrive
  • Exercise by downloading an 89MB file, verifying it
    and unloading the driver
  • Then rerun with original driver

21
Recovery Rate Results
  • 140 runs, 20 per fault category
  • SafeDrive is effective at detecting and
    recovering from crashing problems, and can detect
    some statically.

22
More Results
  • Annotation burden
  • 1-4% of driver code changed for annotations
  • A smaller amount of wrapper code, which can be
    generated automatically in the future
  • Performance
  • <25% overhead for driver micro-benchmarks
  • E.g. TCP send w/ Netperf,
  • About 1/10 the overhead of Nooks in two
    comparable experiments

23
Outline
  • Introduction
  • SafeDrive
  • Memory Safety Checking
  • Extension Recovery
  • Future work
  • SafeDrive Improvements
  • Locking safety
  • Better device driver API for Linux
  • Timeline

24
SafeDrive Improvements
  • Separate wrappers from driver/kernel headers
  • To evolve with new versions of drivers and kernel
  • Needs kernel loader support
  • Tools for usability
  • Identify driver entry functions and generate
    wrappers
  • List all unwrapped kernel functions called, to
    help identify API functions to wrap
  • Annotate more drivers and run on live servers
  • Release the code

25
Common locking problems in kernel
  • Race conditions
  • Deadlocks
  • Acquiring the same spinlock multiple times
  • Acquiring multiple locks in different orders
  • Using locks in wrong contexts
  • Acquiring a mutex in interrupt context
  • Acquiring a spinlock in interrupt context and
    also in process context with interrupts enabled

26
Hybrid lock checking
  • Assign a static name to each lock
  • Map dynamically allocated locks onto static
    ones
  • Some functions are annotated with
    process/interrupt contexts
  • Inference propagates these annotations
  • An analysis checks locking safety
  • Context constraints
  • Consistent global ordering

27
Hybrid lock checking (2)
  • When not sure, emit runtime checks
  • Runtime checks done with lockdep in Linux kernel
  • Lockdep [Molnar06] builds lock ordering and
    context constraints at runtime
  • Store lock orderings in a big hash table
  • Consumes memory and causes significant slowdown
  • With the hybrid checking
  • Locks verified to be safe do not need to be
    tracked by lockdep
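The runtime side of hybrid lock checking can be sketched as a table of observed lock orderings, with statically verified locks skipping the bookkeeping. This is a much-simplified illustration in the spirit of the slide, not lockdep's implementation: it uses a small matrix instead of a hash table, ignores interrupt context, and all names (`checked_acquire`, `verified_safe`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_LOCKS 4   /* one static name per lock */

/* order_seen[a][b]: lock b was acquired while lock a was held. */
static bool order_seen[NUM_LOCKS][NUM_LOCKS];
static bool held[NUM_LOCKS];
static bool verified_safe[NUM_LOCKS]; /* proven safe by static analysis */
static int violations = 0;
static int tracked_acquires = 0;

static void checked_acquire(int lock)
{
    assert(lock >= 0 && lock < NUM_LOCKS);
    if (!verified_safe[lock]) {
        tracked_acquires++;
        if (held[lock])
            violations++;               /* same lock acquired twice */
        for (int h = 0; h < NUM_LOCKS; h++) {
            if (!held[h] || h == lock || verified_safe[h])
                continue;
            if (order_seen[lock][h])
                violations++;           /* opposite order seen before */
            order_seen[h][lock] = true;
        }
    }
    held[lock] = true;
}

static void checked_release(int lock)
{
    assert(lock >= 0 && lock < NUM_LOCKS);
    held[lock] = false;
}

/* Two code paths take locks 0 and 1 in opposite orders;
 * lock 2 is marked verified and is not tracked. */
static int demo_lock_orders(void)
{
    checked_acquire(0); checked_acquire(1);
    checked_release(1); checked_release(0);

    checked_acquire(1); checked_acquire(0);   /* inversion */
    checked_release(0); checked_release(1);

    verified_safe[2] = true;
    checked_acquire(2);
    checked_release(2);
    return violations;
}
```

The inversion is reported even though no deadlock occurred in this run, which is the point of tracking orderings rather than waiting for an actual hang.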

28
Better driver API for Linux
  • In most OSes, drivers communicate with the kernel
    with a wide and trusted API
  • 2500 symbols exported to drivers in Linux
  • Some Linux driver API functions are not checkable
    for memory safety
  • Driver API improvements
  • Introduce shim between common functions to do
    parameter/invariant checking
  • Fix legacy functions where memory safety is not
    checkable
  • Change kernel data structures for memory safety

29
Better driver API for Linux (2)
  • Gauge of success
  • Whether more checking finds undetected problems
    in previous experiments
  • How many lines of trusted code are eliminated
  • Find real bugs?
  • Related work: Windows Driver Foundation (WDF)
  • A wrapper API on top of existing Windows driver
    API
  • Better default values, parameter checking and
    backward compatibility
  • Backward compatibility not 100% necessary for
    Linux

30
Outline
  • Introduction
  • SafeDrive
  • Memory Safety Checking
  • Extension Recovery
  • Future work
  • SafeDrive Improvements
  • Locking safety
  • Better device driver API for Linux
  • Timeline

31
Timeline
  • Phase 1: through Oct. 2006
  • Basic SafeDrive design and implementation
  • Phase 2: Oct. 2006 - Dec. 2006
  • Tools for SafeDrive
  • Locking safety
  • Phase 3: Jan. 2007 - May 2007
  • Modular locking safety
  • Better driver API
  • Phase 4: June 2007 - Dec. 2007
  • Wrap up and dissertation writing

33
Classification of OS problems
  • From [Sullivan91], based on IBM field data
  • Problems corrupting program data
  • 75% memory-safety related
  • 8% concurrency related
  • 17% others
  • Regular problems
  • 30% memory-safety related
  • 14% concurrency related
  • 56% others

34
How do you change bounds/tags?
    struct {
      unsigned int len;
      int * count(len) data;
    } x;
  1. x.data = NULL;
  2. if (x.data != NULL && (A < 0 || A > len)) abort();
     x.len = A;
  3. if (B < sizeof(int) * x.len) abort();
     x.data = malloc(B);
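The three steps on this slide can be read as an update order that never leaves the pointer and its length inconsistent. The sketch below follows that order in plain C under stated assumptions: `resize` is a hypothetical helper, it returns -1 instead of calling abort so the checks can be exercised, and the `free` of the old buffer is an addition that is not on the slide.

```c
#include <assert.h>
#include <stdlib.h>

struct int_buf {
    unsigned int len;
    int *data;          /* intended invariant: NULL, or points to len ints */
};

/* Changes the bound and the pointer in an order that keeps the
 * invariant true after every assignment. Returns 0 on success. */
static int resize(struct int_buf *x, unsigned int new_len, size_t bytes)
{
    assert(x != NULL);

    /* Step 1: drop the pointer, so any len is consistent with it. */
    free(x->data);              /* addition for this sketch */
    x->data = NULL;

    /* Step 2: with data == NULL the check on the new length passes
     * trivially, so len can change freely. */
    x->len = new_len;

    /* Step 3: only install a buffer large enough for len ints.
     * Dividing instead of multiplying avoids overflow in the check. */
    if (bytes / sizeof(int) < x->len)
        return -1;
    x->data = malloc(bytes);
    if (x->data == NULL && bytes > 0) {
        x->len = 0;             /* allocation failed; stay consistent */
        return -1;
    }
    return 0;
}
```

If step 3 rejects the size, the struct is left with a NULL pointer, which still satisfies the invariant for any length.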
35
Related Work
  • Improving memory safety of C
  • Safe C-like language: Cyclone [Morrisett et al]
  • Hybrid checking (non-modular): CCured [Necula et
    al]
  • Type qualifiers for static checking: CQual
    [Foster et al], [Johnson/Wagner], Sparse [Torvalds]
  • Improving OS/extension reliability
  • Hardware protection: Nooks [Swift et al], L4
    [LeVasseur et al], Xen [Fraser et al]
  • Binary instrumentation: SFI [Wahbe et al],
    [Small/Seltzer], XFI [Erlingsson]
  • Using Cyclone: OKE [Bos/Samwel]
  • Static validation of API usage: SLAM [Ball et al]
  • Writing OS with safe language: Singularity [Patel
    et al]

36
Performance
[Chart: benchmarks e1000 TCP recv, e1000 UDP recv, e1000 TCP send, e1000 UDP send, tg3 TCP recv, tg3 TCP send, usb-storage untar, emu10k1 aplay, intel8x0 aplay, nvidia xinit]
  • Nooks (Linux 2.4): e1000 TCP recv 46% (vs. 4%),
    e1000 TCP send 111% (vs. 12%)

37
Annotation Burden
  • 1-4% of lines with Deputy annotations
  • Recovery wrappers can be automatically generated

38
Annotations Break-down
  • Common reasons for trusted casts and trusted code
  • Polymorphic private data, e.g. netdev->priv
  • Small number of cases where buffer bounds are not
    available
  • Code manipulating pointer values directly, e.g.
    PTR_ERR(x)