Documenting and Automating Collateral Evolutions in Linux Device Drivers

Transcript and Presenter's Notes

Title: Documenting and Automating Collateral Evolutions in Linux Device Drivers


1
Documenting and Automating Collateral Evolutions
in Linux Device Drivers
  • Yoann Padioleau
  • Ecole des Mines de Nantes (now at UIUC)
  • with
  • Julia Lawall and René Rydhof Hansen (DIKU)
  • Gilles Muller (Ecole des Mines de Nantes)

the Coccinelle project
2
The problem: Collateral Evolutions
  • Evolution in a library (lib.c):
    before: int foo(int x)
    after:  int bar(int x)
  • Can entail lots of Collateral Evolutions (CE) in clients
    (client1.c, client2.c, ..., clientn.c):
    before: foo(foo(2))    after: bar(bar(2))
    before: if(foo(3))     after: if(bar(3))
3
The problem: Collateral Evolutions
  • Evolution in a library (lib.c):
    before: int foo(int x)
    after:  int bar(int x, int y)
  • Can entail lots of Collateral Evolutions (CE) in clients
    (client1.c, client2.c, ..., clientn.c):
    before: foo(foo(2))    after: bar(bar(2,?),?)
    before: if(foo(3))     after: if(bar(3,?))
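
A minimal C sketch of the situation on this slide. The body of bar() and the ctx parameter that fills in the "?" arguments are assumptions made for illustration only; the slides do not say what bar() computes or where the new argument comes from.

```c
/* lib.c after the evolution: foo(int x) has become bar(int x, int y).
 * The body is a placeholder; the slides do not define it. */
int bar(int x, int y)
{
    return x + y;
}

/* Client site that used to read foo(foo(2)). Each call now needs a
 * value for the new parameter; here it comes from a hypothetical ctx. */
int client_nested(int ctx)
{
    return bar(bar(2, ctx), ctx);
}

/* Client site that used to read if(foo(3)). */
int client_cond(int ctx)
{
    if (bar(3, ctx))
        return 1;
    return 0;
}
```

A plain rename would not be enough here: every call site has to supply the extra argument, and the right value depends on the code around each call.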
4
Our target: Linux device drivers
  • Many libraries and many clients
  • Lots of driver support libraries: one per device
    type, one per bus (pci library, sound library, ...)
  • Lots of device-specific code: drivers make up
    more than 50% of Linux
  • Many evolutions and collateral evolutions
    [Eurosys06]
  • 1200 evolutions in Linux 2.6
  • For each evolution, lots of collateral evolutions
  • Some collateral evolutions affect over 400 files
    at over 1000 code sites

5
Our goal
  • Currently, Collateral Evolutions in Linux are
    done nearly manually:
  • Difficult
  • Time-consuming
  • Error-prone
  • The highly concurrent and distributed nature of
    the Linux development process makes it even
    worse:
  • Patches that miss code sites (because of newly
    introduced sites and newly introduced drivers)
  • Out-of-date patches, conflicting patches
  • Drivers outside the Linux source tree are not
    updated
  • Misunderstandings

We need a tool to document and automate Collateral
Evolutions
6
Taxonomy of transformations
  • Taxonomy of evolutions (library code):
  • add parameter, split data structure, change
    protocol sequencing, change return type, add
    error code, etc.
  • Taxonomy of collateral evolutions (client code)?
  • Very wide variety of program transformations,
    affecting a wide variety of C and CPP constructs
  • Often depends on context, e.g. for "add argument"
    the new value must be constructed from the enclosing
    code
  • Note: not necessarily semantics-preserving

Cannot be done by current refactoring tools
(more than just renaming entities). We need a
flexible tool.
7
Complex Collateral Evolutions (2.5.71)
  • Evolution: scsi_get()/scsi_put() dropped from
    SCSI library
  • Collateral evolutions: SCSI resource now passed
    directly to proc_info callback functions via a
    new parameter

    int a_proc_info(int x
                    ,scsi *y          (added: from local var to parameter)
                    ) {
      scsi *y;                        (removed: from local var to parameter)
      ...
      y = scsi_get();                 (removed: delete calls to library)
      if(!y) { ... return -1; }       (removed: delete error checking code)
      ...
      scsi_put(y);                    (removed: delete calls to library)
      ...
    }
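
The before and after shapes of such a callback can be sketched in compilable C. Everything apart from the names scsi_get, scsi_put, and the base field (which appear in the slides) is an assumption for illustration: the scsi struct layout, the single static host, the reference counter, and the function bodies are hypothetical stand-ins, not kernel code.

```c
/* Hypothetical stand-in for the SCSI resource type. */
typedef struct scsi {
    int base;
} scsi;

scsi the_host = { 42 };   /* hypothetical single resource */
int live_refs = 0;        /* counts outstanding scsi_get() calls */

/* Old library API that the evolution drops. */
scsi *scsi_get(void)
{
    live_refs++;
    return &the_host;
}

void scsi_put(scsi *y)
{
    (void)y;
    live_refs--;
}

/* Before: the callback acquires, checks, and releases the resource. */
int a_proc_info_before(int x)
{
    scsi *y;
    int result;

    y = scsi_get();
    if (!y) {
        return -1;
    }
    result = y->base + x;
    scsi_put(y);
    return result;
}

/* After: the resource arrives as a parameter, so the local variable,
 * the library calls, and the error check are all gone. */
int a_proc_info_after(int x, scsi *y)
{
    return y->base + x;
}
```

The code between the removed lines is untouched, which is why the change cannot be expressed as a simple search and replace.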
8
Our idea
  • How to specify the required program
    transformation?
  • In what programming language?

The example
    int a_proc_info(int x
                    ,scsi *y          (added)
                    ) {
      scsi *y;                        (removed)
      ...
      y = scsi_get();                 (removed)
      if(!y) { ... return -1; }       (removed)
      ...
      scsi_put(y);                    (removed)
      ...
    }

9
Our idea: Semantic Patches

    @@
    function a_proc_info;
    identifier x,y;
    @@
      int a_proc_info(int x
    +                 ,scsi *y
                      ) {
    -   scsi *y;
        ...
    -   y = scsi_get();
    -   if(!y) { ... return -1; }
        ...
    -   scsi_put(y);
        ...
      }

  • Metavariable declarations (between the @@ lines)
  • Metavariable references (in the body)
  • Patch-like syntax, with + and - modifiers
  • The ... operator
  • Declarative language
  • Transform if everything matches
10
Affected Linux driver code

drivers/scsi/53c700.c
    int s53c700_info(int limit) {
      char *buf;
      scsi *sc;
      sc = scsi_get();
      if(!sc) {
        printk("error");
        return -1;
      }
      wd7000_setup(sc);
      PRINTP("val=%d",
             sc->field+limit);
      scsi_put(sc);
      return 0;
    }

drivers/scsi/pcmcia/nsp_cs.c
    int nsp_proc_info(int lim) {
      scsi *host;
      host = scsi_get();
      if(!host) {
        printk("nsp_error");
        return -1;
      }
      SPRINTF("NINJASCSI=%d", host->base);
      scsi_put(host);
      return 0;
    }

Similar, but not identical
11
Applying the semantic patch (before)

Input: s53c700_info() and nsp_proc_info(), unchanged
from the previous slide

proc_info.sp
    @@
    function a_proc_info;
    identifier x,y;
    @@
      int a_proc_info(int x
    +                 ,scsi *y
                      ) {
    -   scsi *y;
        ...
    -   y = scsi_get();
    -   if(!y) { ... return -1; }
        ...
    -   scsi_put(y);
        ...
      }

    spatch *.c < proc_info.sp
12
Applying the semantic patch (after)

    int s53c700_info(int limit, scsi *sc) {
      char *buf;
      wd7000_setup(sc);
      PRINTP("val=%d",
             sc->field+limit);
      return 0;
    }

    int nsp_proc_info(int lim, scsi *host) {
      SPRINTF("NINJASCSI=%d", host->base);
      return 0;
    }

proc_info.sp: the same semantic patch as on the previous slide

    spatch *.c < proc_info.sp
13
SmPL: Semantic Patch Language
  • A single small semantic patch can modify hundreds
    of files, at thousands of code sites
  • The features of SmPL make a semantic patch
    generic: they abstract away irrelevant details
  • Differences in spacing, indentation, and comments
  • Choice of the names given to variables
    (metavariables)
  • Irrelevant code (..., control-flow oriented)
  • Other variations in coding style (isomorphisms),
    e.g. if(!y) ≡ if(y==NULL) ≡ if(NULL==y)

14
Sequences and the ... operator

C file
    1 y = scsi_get();
    2 if(exp) {
    3   scsi_put(y);
    4   return -1;
    5 }
    6 printf("%d",y->f);
    7 scsi_put(y);
    8 return 0;

Semantic patch
    - y = scsi_get();
      ...
    - scsi_put(y);

Control-flow graph (CFG) of C file
  • Two paths from node 1 to exit (path 1 and path 2):
    1 → 2 → 3 → 4 → exit and 1 → 2 → 6 → 7 → 8 → exit
  • "..." means for all subsequent paths
  • One "-" line can erase multiple lines
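
The "for all subsequent paths" reading can be sketched as a small check over the CFG of the C file above. This is only an illustration under stated assumptions: the node table is transcribed by hand from the numbered lines, the function name all_paths_reach is hypothetical, and the recursion assumes an acyclic graph. It is not how spatch is implemented; the slides describe a CTL model checker for that.

```c
#include <stdbool.h>

enum { N_NODES = 10, EXIT_NODE = 9 };

/* Successors of each numbered line in the C file; -1 means none.
 * Line 5 is only a closing brace, so it has no node of its own. */
const int succ[N_NODES][2] = {
    {-1, -1},          /* 0: unused */
    { 2, -1},          /* 1: y = scsi_get() */
    { 3,  6},          /* 2: if(exp) */
    { 4, -1},          /* 3: scsi_put(y) */
    { EXIT_NODE, -1},  /* 4: return -1 */
    {-1, -1},          /* 5: unused */
    { 7, -1},          /* 6: printf(...) */
    { 8, -1},          /* 7: scsi_put(y) */
    { EXIT_NODE, -1},  /* 8: return 0 */
    {-1, -1},          /* 9: exit */
};

/* True when every path leaving node n reaches a node marked in
 * target[] before running out of successors. */
bool all_paths_reach(int n, const bool target[N_NODES])
{
    bool has_successor = false;

    for (int i = 0; i < 2; i++) {
        int s = succ[n][i];
        if (s < 0)
            continue;
        has_successor = true;
        if (!target[s] && !all_paths_reach(s, target))
            return false;
    }
    return has_successor;
}
```

Marking nodes 3 and 7 (the two scsi_put(y) calls) as targets and starting from node 1 succeeds, because both paths contain a put. If node 7 is left unmarked, the check fails, which corresponds to the pattern not matching on every path.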
15
Isomorphisms, C equivalences
  • Examples
  • Boolean: X == NULL ⇔ !X ⇔ NULL == X
  • Control: if(E) S1 else S2 ⇔ if(!E) S2 else S1
  • Pointer: E->field ⇔ (*E).field
  • etc.
  • How to specify isomorphisms?

    @@ expression X; @@
    X == NULL <=> !X <=> NULL == X

Reuses SmPL syntax
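
Each of the three equivalences listed above can be checked with plain C. The struct dev type and all function names below are hypothetical, introduced only to give each variant something to compile against.

```c
#include <stddef.h>

/* Hypothetical structure, used only to have a field to access. */
struct dev {
    int field;
};

/* Boolean isomorphism: three spellings of the same NULL test. */
int is_null_eq(const struct dev *x)   { return x == NULL; }
int is_null_not(const struct dev *x)  { return !x; }
int is_null_flip(const struct dev *x) { return NULL == x; }

/* Control isomorphism: negate the condition and swap the branches. */
int branch(int e)         { if (e) return 1; else return 2; }
int branch_swapped(int e) { if (!e) return 2; else return 1; }

/* Pointer isomorphism: arrow access versus explicit dereference. */
int field_arrow(const struct dev *e) { return e->field; }
int field_deref(const struct dev *e) { return (*e).field; }
```

A semantic patch written against one spelling is meant to match code written in any of the others, so the patch author does not have to enumerate them.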
16
How does it work ?
17
The transformation engine architecture
  • Semantic patch side: Parse Semantic Patch, Expand
    isomorphisms, Translate to extended CTL
  • C side: Parse C file, Translate to CFG
  • Match CTL against CFG using a model checking
    algorithm
  • Modify matched code
  • Unparse

Extended CTL: Computation Tree Logic [Clarke86] with
extra features
18
CTL and Model checking
  • Model checking a CTL formula against a model
    answers just yes/no (with a counterexample).
  • We do program transformations, not just pattern
    checking. We need to:
  • Bind metavariables and remember their values
  • Remember where we have matched sub-formulas
  • We have extended CTL with existential variables and
    program transformation annotations

Semantic patch
    @@ expression X,Y; @@
      f(X)
      ...
    - g(Y)
    + g(X,Y)

Extended CTL formula
    ∃X. f(X) ∧ AX A[true U ∃v. ∃Y. g(Y)_v]
    (the match of g(Y) carries the transformation
    annotation: remove g(Y), add g(X,Y))
19
Other issues
  • Need to produce readable code
  • Keep space, indentation, comments
  • Keep CPP instructions as-is. Also, the programmer may
    want to transform some #define and iterator macros
    (e.g. list_for_each)
  • Interactive engine, partial match
  • Isomorphisms
  • Rewriting the semantic patch (not the C code)
  • Generate disjunctions

Very different from most other C tools
60 000 lines of OCaml code
20
Evaluation
21
Experiments
  • Methodology
  • Detect past collateral evolutions in Linux 2.5
    and 2.6 using the patchparse tool [Eurosys06]
  • Select representative ones
  • Test suite of over 60 CEs
  • Study them and write corresponding semantic
    patches
  • Note: we are not kernel developers
  • Going "back to the future". Compare:
  • what Linux programmers did manually
  • what spatch, given our SPs, does automatically

22
Test suite
  • 20 Complex CEs: mistakes made by the programmers
  • In each case, 1-16 errors or misses
  • 23 Mega CEs: affect over 100 sites
  • Up to 40 people for up to two years
  • 26 typical CEs
  • The whole set of CEs affecting a typical
    (median) directory from 2.6.12 to 2.6.20

More than 5800 driver files
23
Results
  • Our SPs are on average 106 lines long
  • SPs are often 100 times smaller than human-made
    patches: a measure of time saved
  • Not doing the CE manually on all the drivers
  • Not reading and reviewing big patches, for people
    with drivers outside the source tree
  • Overall: correct and complete automated
    transformation for 93% of files
  • Problems on the remaining 7%: we miss code sites
  • CPP issues, lack of isomorphisms (data-flow and
    inter-procedural)
  • We are not kernel developers: we don't know how to
    specify
  • No false positives, just false negatives
  • Average processing time of 0.7s per relevant file

Sometimes the tool was right and the human wrong
24
Impact on the Linux kernel
  • We also wrote some SPs for current collateral
    evolutions (looking at Linux kernel mailing
    lists)
  • use DIV_ROUND_UP, BUG_ON, FIELD_SIZE
  • convert kmalloc-memset to kzalloc
  • Total diffstat: 154 files changed, 203
    insertions(+), 375 deletions(-)
  • We wrote other SPs, for bug-fixing (good side
    effects of our tool)
  • Add missing put functions (reference counting)
  • Drop unnecessary put functions (reference
    counting)
  • Remove unused variables
  • Total diffstat: 111 files changed, 340
    insertions(+), 355 deletions(-)

Accepted in latest Linux kernel
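
Two of the cleanups named on this slide can be illustrated in user-space C. This is a sketch under assumptions: the DIV_ROUND_UP definition mirrors the kernel macro of that name, while malloc/memset and calloc stand in for kmalloc/memset and kzalloc, which exist only inside the kernel. All function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Mirrors the kernel macro of the same name. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Before: open-coded rounding division. */
unsigned pages_before(unsigned bytes, unsigned page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* After: the same computation through the macro. */
unsigned pages_after(unsigned bytes, unsigned page_size)
{
    return DIV_ROUND_UP(bytes, page_size);
}

/* Before: allocate, then zero by hand (user-space analogue of
 * kmalloc followed by memset). */
int *alloc_before(size_t n)
{
    int *p = malloc(n * sizeof *p);
    if (p)
        memset(p, 0, n * sizeof *p);
    return p;
}

/* After: one zeroing allocation (user-space analogue of kzalloc). */
int *alloc_after(size_t n)
{
    return calloc(n, sizeof(int));
}
```

Both rewrites leave behavior unchanged while shrinking the code, which fits the diffstats above showing more deletions than insertions.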
25
Future work
  • Are semantic patches and spatch useful:
  • Only for Linux device drivers?
  • Only for Linux programmers?
  • Only for collateral evolution program
    transformations?
  • Only for program transformations?
  • Our thesis: we don't think so.
  • But first: device driver CEs are an important
    problem!
  • All software evolves. Software libraries are more
    and more important, and have more and more clients

We may also help that software, those libraries
26
Related work
  • Refactoring
  • CatchUp [ICSE05]: tool-based, replays refactorings
    from Eclipse
  • JunGL [ICSE06]: language-based, but based on ML,
    less Linux-programmer friendly
  • Program transformation engines
  • Stratego [04]
  • C front-ends
  • CIL [CC02]

27
Conclusion
  • Collateral Evolution is an important problem,
    especially in Linux device drivers
  • SmPL: a declarative language to specify
    collateral evolutions
  • Looks like a patch: fits with Linux programmers'
    habits
  • But takes into account the semantics of C, hence
    the name Semantic Patches
  • A transformation engine to automate collateral
    evolutions, based on model checking technology.

28
Thank you
  • You can download our tool, spatch, at
    http://www.emn.fr/x-info/coccinelle
  • Questions?
