Towards Shape Analysis of Real C Programs

1
Towards Shape Analysis of Real C Programs
  • Greta Yorsh
  • Tel-Aviv University

Joint work with Manuvir Das, Nurit Dor, Thomas
Reps, Mooly Sagiv, Reinhard Wilhelm
2
Motivation
  • Shape analysis was introduced in the early 1980s by Jones and
    Muchnick
  • Analyzes the contents of the heap
  • Useful for
      • Cleanness checking
      • Optimization
      • Verification
  • But how does one apply shape analysis to a real C
    program?
      • Doubly exponential algorithms
      • May be imprecise for complicated data structures
        (trees, ...)
      • Handling complicated aspects of C (casts, ...)

3
Glimpse of Hope
  • Flow-insensitive techniques have been shown to
    scale to large programs
  • Rather precise shape analysis algorithms
    for singly and doubly-linked lists have been
    developed
  • The TVLA system allows fast prototype
    implementations of sophisticated shape analysis
    algorithms
      • Different costs and precision
      • Constantly being improved (J. Field, D. Goyal,
        R. Manevich and G. Ramalingam)

4
Outline of this talk
  • Shape Analysis of Trees
  • C-2-TVLA Translator
      • Allows shape analysis of real C programs
      • Automatically extracts the relevant parts of the
        program
      • Can also be used to analyze string cleanness

5
C-2-TVLA Translator
6
General
  • Automatic generation of TVLA input from C
    programs
  • Input
      • C Source Code
      • Translation Specification (translation rules)
  • Output
      • TVP Control Flow Graph (input for TVLA)
      • TVP Sets for TVLA

7
Main Phases of Translator
8
Simplifier
  • Converts the input C code into SimpleC code
  • Simplifies nested expressions by introducing
    temporary variables
  • A SimpleC program is a canonical representation of
    a regular C program
  • The SimpleC language is a subset of ANSI C
  • The simplified program is semantically equivalent to
    the original program
  • Why use SimpleC?
      • Small and flat language constructs
      • A translation can be specified for each statement
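
As a rough illustration of the simplification step, the sketch below shows one nested expression next to a hand-flattened version that uses temporaries. The struct List type, the function names, and the temporaries t1, t2, t3 are hypothetical and chosen only for this example; the actual SimpleC output of the Simplifier may look different.

```c
#include <stddef.h>

/* Hypothetical list type, used only for illustration. */
struct List {
    int data;
    struct List *next;
};

/* Original form: a single statement with nested dereferences. */
int sum_nested(const struct List *x)
{
    return x->next->data + x->data;
}

/* Flattened form: each statement performs one operation, and
   temporaries (t1, t2, t3) hold the intermediate values. */
int sum_flat(const struct List *x)
{
    const struct List *t1;
    int t2;
    int t3;
    int result;

    t1 = x->next;      /* one dereference per statement */
    t2 = t1->data;
    t3 = x->data;
    result = t2 + t3;  /* one arithmetic operation */
    return result;
}
```

Both functions compute the same value for any list with at least two cells, which mirrors the requirement that the simplified program be semantically equivalent to the original. Each flat statement has a single, small shape, so a translation can be written per statement kind.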

9
C-Machine
  • SimpleC contains the control flow structures while,
    do-while, and for
  • The C-Machine is a (low-level) control flow graph of
    the program
  • Nodes are Basic Blocks
  • Kinds of nodes: Join, Branch, Block, Entry, Exit
  • Basic Blocks contain straight-line code, but may
    contain nested scopes
  • Uses only SimpleC statements
  • Uses only branches (if/switch) and goto
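
To make the lowering from structured loops to branches and gotos concrete, here is a minimal sketch in plain C. The function names and label names are hypothetical, and the comments only suggest how the labels could correspond to the node kinds listed above; the actual C-Machine representation is not shown in the slides.

```c
/* Structured form: a while loop, as SimpleC allows. */
int count_down_loop(int n)
{
    int steps = 0;
    while (n > 0) {
        n = n - 1;
        steps = steps + 1;
    }
    return steps;
}

/* Lowered form: the same computation using only if and goto. */
int count_down_goto(int n)
{
    int steps = 0;           /* Entry */
join:                        /* Join: the loop head, reached from two places */
    if (n > 0) goto body;    /* Branch: two outgoing edges */
    goto exit_node;
body:                        /* Block: straight-line code */
    n = n - 1;
    steps = steps + 1;
    goto join;
exit_node:                   /* Exit */
    return steps;
}
```

The two functions return the same result for every input, including non-positive n, for which the loop body never runs.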

10
What to Analyze section
  • Contains information specific to the analyzed
    program
  • Restricts the translation to the relevant part of the
    program, which makes it possible to analyze large programs
  • Restriction parameters
      • Functions
      • Types
      • Variables

11
Statement Map
  • Mapping from SimpleC statements to TVLA-Actions
  • Specific to a given TVLA analysis
  • Same Statement Map may be used with different
    input C programs
  • Defined in the input specifications file
  • Useful beyond TVLA

12
Translation Rules
  • The Statement Map contains translation rules for
    statements with semantic constraints
  • The best match is the match to the most specific
    condition
  • Structure of an entry in the statement map
      • <modeled statement> <where clause>
        => <translation>

13
Translation Rules (continued)
  • <modeled statement>: a parametric SimpleC
    statement
      • L1 <b>-><c> = <a> L2
  • <where clause>: set of semantic conditions on the
    operands of the modeled statement
      • typeof(<b>, ptr(List))
      • lexme(<c>, next)
      • valueof(<c>, signed int, 0)
      • points-to conditions, discussed in the User's
        Manual

14
Translation Rules (continued)
  • <translation>: list of edges of the TVP graph
      • nodes: program locations (labels)
      • edges: actions performed on the transition
        between two program locations
  • Actions on the edges may use statement operands
    and labels
  • One statement may be translated to more than one
    edge
      • L1 Kill_Next_L(<b>) L3
      • L3 Set_Next_L(<b>,<a>) L2
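
The rule structure described on the last three slides can be sketched as a small matcher. This is only an illustration under stated assumptions: struct FieldAssign, where_clause_holds, and translate_field_assign are hypothetical names, the statement's type information is passed in as plain text, and the real Translator reads its rules from the specification file instead of hard-coding them. The action names Kill_Next_L and Set_Next_L and the labels are taken from the slide.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical description of a simplified statement "b->c = a". */
struct FieldAssign {
    const char *b;       /* pointer variable being dereferenced */
    const char *b_type;  /* type of b as text, e.g. "ptr(List)" */
    const char *c;       /* field name */
    const char *a;       /* right-hand side variable */
};

/* Where clause of the sample rule: b is a pointer to List and the
   field is named "next". */
static int where_clause_holds(const struct FieldAssign *s)
{
    return strcmp(s->b_type, "ptr(List)") == 0 &&
           strcmp(s->c, "next") == 0;
}

/* If the rule matches, writes two TVP-style edges into out and returns 1.
   Otherwise writes an empty string and returns 0.
   from/to are the statement's labels; mid is a fresh intermediate label. */
int translate_field_assign(const struct FieldAssign *s,
                           const char *from, const char *mid, const char *to,
                           char *out, size_t out_size)
{
    if (!where_clause_holds(s)) {
        if (out_size > 0) {
            out[0] = '\0';
        }
        return 0;
    }
    snprintf(out, out_size,
             "%s Kill_Next_L(%s) %s\n%s Set_Next_L(%s,%s) %s\n",
             from, s->b, mid, mid, s->b, s->a, to);
    return 1;
}
```

For a statement x->next = y between labels L1 and L2, with L3 as the intermediate label, the function produces one edge that kills the old next value of x and one that sets it to y. A statement on a different field does not match this rule and would need a rule of its own.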

15
Done and To Do
  • Doubly Linked Lists
  • Trees
  • Checking Strings Cleanness
  • Daedalus Sample Programs
  • Singly Linked Lists
  • Small examples
  • Tiger compiler

16
Optimizations
  • Motivation
      • The Simplifier introduces new variables and
        assignments
      • TVLA space limitations (even for reasonable
        program sizes)
      • Focusing on the relevant part of the program allows
        reducing the CFG by eliminating empty edges
  • Solutions
      • Static analysis after the Simplifier (Live Variables)
          • to decrease the number of variables analyzed by TVLA
      • Construct a variant of the Sparse Evaluation Graph
        (Ramalingam, 98)
          • compact equivalent flow graph
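
A minimal sketch of the live-variables idea, restricted to straight-line code, is given below. It is not the Translator's implementation: struct Stmt, the bit-set encoding, and both function names are assumptions made for this example, and a full analysis over a control flow graph would need a fixpoint iteration over all nodes.

```c
/* One simplified assignment "def = use1 op use2".
   Variables are small indices (0..31); -1 means "no variable". */
struct Stmt {
    int def;
    int use1;
    int use2;
};

/* Bit mask for variable v, or 0 if v is -1. */
static unsigned var_bit(int v)
{
    return v < 0 ? 0u : 1u << v;
}

/* Backward pass over straight-line code.
   live_out[i] receives the set of variables live just after statement i.
   Transfer function: live_in = (live_out minus def) union uses. */
void compute_liveness(const struct Stmt *code, int n,
                      unsigned live_at_exit, unsigned *live_out)
{
    unsigned live = live_at_exit;
    for (int i = n - 1; i >= 0; i--) {
        live_out[i] = live;
        live = (live & ~var_bit(code[i].def))
             | var_bit(code[i].use1)
             | var_bit(code[i].use2);
    }
}

/* Counts assignments whose target is not live afterwards. Such
   assignments, and temporaries used only by them, need not be modeled. */
int count_dead_assignments(const struct Stmt *code, int n,
                           const unsigned *live_out)
{
    int dead = 0;
    for (int i = 0; i < n; i++) {
        if ((live_out[i] & var_bit(code[i].def)) == 0) {
            dead++;
        }
    }
    return dead;
}
```

For example, with variables x = 0, t1 = 1, t2 = 2, r = 3 and the code t1 = x; t2 = x; r = t1, where only r is live at exit, the assignment to t2 is reported as dead, so t2 would not have to be tracked.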

17
Next Steps
  • Generate translation log file
  • Extend the Translator to analyze more than one C
    source file
  • Extend the use of points-to analysis (PTA) in the
    matching process
  • Add optimizations: liveness, EFG
  • Use the Translator for analyzing real
    applications
  • Distribute the Translator to TVLA users
  • Improve scalability and performance of the
    Translator

18
THE END