Program Comprehension through Dynamic Analysis: Visualization, evaluation, and a survey


1
Program Comprehension through Dynamic Analysis:
Visualization, evaluation, and a survey
  • Bas Cornelissen (et al.)
  • Delft University of Technology
  • IPA Herfstdagen, Nunspeet, The Netherlands
  • November 26, 2008

2
Context
  • Software maintenance
  • e.g., feature requests, debugging
  • requires understanding of the program at hand
  • up to 70% of effort spent on the comprehension
    process
  • ⇒ Support program comprehension

3
Definitions
  • Program Comprehension
  • A person understands a program when he or she is
    able to
  • explain the program, its structure, its behavior,
    its effects on its operation context, and its
    relationships to its application domain
  • in terms that are qualitatively different from
    the tokens used to construct the source code of
    the program.

4
Definitions (contd)
  • Dynamic analysis
  • The analysis of the properties of a running
    software system
  • Advantages
  • preciseness
  • goal-oriented
  • Limitations
  • incompleteness
  • scenario-dependence
  • scalability issues

[Figure: an unknown system (e.g., open source) is instrumented (e.g., using AspectJ); executing a scenario then yields (too) much data]
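The slide names AspectJ as one way to instrument a Java system. As a language-neutral illustration of the same idea, here is a minimal Python sketch that records call and return events through the standard `sys.setprofile` hook. The helper name `record_trace` and the event layout `(kind, depth, function_name)` are assumptions made for this example and are not part of the presented tooling.

```python
import sys


def record_trace(func, *args, **kwargs):
    """Run func and return (result, events).

    Each event is a tuple (kind, depth, function_name), where kind is
    "call" or "return" and depth is the stack depth relative to func.
    Only Python-level functions are recorded; C calls are ignored.
    """
    events = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":
            events.append(("call", depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":
            depth -= 1
            events.append(("return", depth, frame.f_code.co_name))

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always switch the hook off again, even if func raises.
        sys.setprofile(None)
    return result, events


# Hypothetical scenario: a tiny program with two nested calls.
def leaf():
    return 1


def root():
    return leaf() + leaf()


result, events = record_trace(root)
for kind, depth, name in events:
    print("  " * depth + f"{kind} {name}")
```

Even this tiny scenario produces one event per call and per return, which hints at how quickly real traces grow.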
5
Outline
  • Literature survey
  • Visualization I: UML sequence diagrams
  • Comparing reduction techniques
  • Visualization II: Extravis
  • Current work: Human factor
  • Concluding remarks

6
  • Literature survey

7
Why a literature survey?
  • Numerous papers and subfields
  • last decade: many papers annually
  • Need for a broad overview
  • keep track of current and past developments
  • identify future directions
  • Existing surveys (4) do not suffice
  • scopes restricted
  • approaches not systematic
  • collective outcomes difficult to structure

8
Characterizing the literature
  • Four facets
  • Activity: what is being performed/contributed?
  • e.g., architecture reconstruction
  • Target: to which languages/platforms is the
    approach applicable?
  • e.g., web applications
  • Method: which methods are used in conducting the
    activity?
  • e.g., formal concept analysis
  • Evaluation: how is the approach validated?
  • e.g., industrial study

9
Attribute framework
10
Characterization
Etc.
11
Attribute frequencies
12
Survey results
  • Least common activities
  • surveys, architecture reconstruction
  • Least common target systems
  • multithreaded, distributed, legacy, web
  • Least common evaluations
  • industrial studies, controlled experiments,
    comparisons

13
  • Visualization I: Sequence Diagrams

14
UML sequence diagrams
  • Goal
  • visualize test case executions as sequence
    diagrams
  • provides insight into functionality
  • accurate, up-to-date documentation
  • Method
  • instrument system and test suite
  • execute test suite
  • abstract from irrelevant details
  • visualize as sequence diagrams
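The last two steps of the method can be sketched as follows. This is a minimal illustration, assuming the trace is already available as `(caller, callee, method)` tuples. The abstraction step here simply hides calls that involve selected classes, and the output uses PlantUML's textual sequence diagram syntax. The slides do not say which rendering tool was used, and the class names below are hypothetical.

```python
def to_sequence_diagram(calls, hidden_classes=frozenset()):
    """Turn (caller, callee, method) tuples into PlantUML text.

    Calls whose caller or callee is in hidden_classes are dropped,
    a deliberately simple stand-in for "abstract from irrelevant details".
    """
    lines = ["@startuml"]
    for caller, callee, method in calls:
        if caller in hidden_classes or callee in hidden_classes:
            continue
        lines.append(f"{caller} -> {callee}: {method}()")
    lines.append("@enduml")
    return "\n".join(lines)


# Hypothetical trace of one test case execution.
calls = [
    ("GameTest", "Engine", "start"),
    ("Engine", "Logger", "log"),
    ("Engine", "Board", "move"),
]

print(to_sequence_diagram(calls, hidden_classes={"Logger"}))
```

Because the events are emitted in trace order, the diagram keeps the chronological ordering that makes sequence diagrams intuitive to read.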

15
Evaluation
  • JPacman
  • Small program for educational purposes
  • 3 KLOC
  • 25 classes
  • Task
  • Change requests
  • addition of undo functionality
  • addition of multi-level functionality

16
Evaluation (contd)
  • Checkstyle
  • code validation tool
  • 57 KLOC
  • 275 classes
  • Task
  • Addition of a new check
  • which types of checks exist?
  • what is the difference in terms of implementation?

17
Results
  • Sequence diagrams are easily readable
  • intuitive due to chronological ordering
  • Sequence diagrams aid in program comprehension
  • support maintenance tasks
  • Proper reductions/abstractions are difficult
  • reduce 10,000 events to 100 events, but at what
    cost?

18
Results (contd)
  • Reduction techniques: issues
  • which one is best?
  • which are most likely to lead to significant
    reductions?
  • which are the fastest?
  • which actually abstract from irrelevant details?

19
  • Comparing reduction techniques

20
Trace reduction techniques
  • Input 1: large execution trace
  • up to millions of events
  • Input 2: maximum output size
  • e.g., 100 events for visualization through UML
    sequence diagrams
  • Output: reduced trace
  • was reduction successful?
  • how fast was the reduction performed?
  • has relevant data been preserved?

21
Example technique
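  • Stack depth limitation: metrics-based filtering
  • requires two passes
  • first pass: determine depth frequencies, then
    determine maximum depth
  • second pass: discard events above maximum depth
  • Example: trace of 200,000 events, maximum output
    size (threshold) of 50,000 events
  • Events per stack depth
  • depth 0: 28,450
  • depth 1: 13,902
  • depth 2: 58,444
  • depth 3: 29,933
  • depth 4: 10,004
  • ...
  • Depths 0 and 1 together hold 42,352 events, so
    events at depth > 1 are discarded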
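The two passes of stack depth limitation can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation that was assessed: it assumes a trace is a list of `(depth, event)` pairs, and the function names are chosen for this example.

```python
from collections import Counter


def choose_max_depth(depth_frequencies, threshold):
    """Return the largest depth d such that the number of events at
    depths <= d does not exceed threshold, or -1 if none fits."""
    kept = 0
    max_depth = -1
    for depth in sorted(depth_frequencies):
        if kept + depth_frequencies[depth] > threshold:
            break
        kept += depth_frequencies[depth]
        max_depth = depth
    return max_depth


def limit_stack_depth(trace, threshold):
    """Reduce a trace of (depth, event) pairs to at most threshold events."""
    # Pass 1: count how many events occur at each stack depth.
    frequencies = Counter(depth for depth, _ in trace)
    max_depth = choose_max_depth(frequencies, threshold)
    # Pass 2: discard every event above the chosen maximum depth.
    return [(depth, event) for depth, event in trace if depth <= max_depth]


# Depth frequencies as listed on the slide (deeper levels elided there).
slide_frequencies = {0: 28_450, 1: 13_902, 2: 58_444, 3: 29_933, 4: 10_004}
print(choose_max_depth(slide_frequencies, threshold=50_000))
```

With the slide's frequencies and a threshold of 50,000 events, the chosen maximum depth is 1, which keeps 28,450 + 13,902 = 42,352 events. Including depth 2 would overshoot the threshold.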
22
How can we compare the techniques?
  • Use
  • common context
  • common evaluation criteria
  • common test set
  • ⇒ Ensures fair comparison

23
Approach
  • Assessment methodology
  • Context: need for high level knowledge
  • Criteria: reduction success rate, performance,
    info preservation
  • Metrics: output size, time spent, preservation
    per type
  • Test set: five open source systems, one
    industrial
  • Application: apply reductions using thresholds
    1,000 thru 1,000,000
  • Interpretation: compare side-by-side
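To make the metrics concrete, here is a small Python sketch of how one reduction run might be measured. It rests on several assumptions: a reduction technique is any function `reduce(trace, threshold)`, "preservation per type" is read as the fraction of events of each type that survive the reduction, and `truncate` is only a placeholder technique for demonstration. None of these names come from the original assessment.

```python
import time
from collections import Counter


def assess(reduce, trace, threshold, type_of):
    """Apply one reduction and report output size, success, time spent,
    and the fraction of events preserved per event type."""
    start = time.perf_counter()
    reduced = reduce(trace, threshold)
    seconds = time.perf_counter() - start

    before = Counter(type_of(event) for event in trace)
    after = Counter(type_of(event) for event in reduced)
    preservation = {t: after[t] / count for t, count in before.items()}

    return {
        "output_size": len(reduced),
        "success": len(reduced) <= threshold,
        "seconds": seconds,
        "preservation": preservation,
    }


def truncate(trace, threshold):
    """Placeholder technique: keep only the first threshold events."""
    return trace[:threshold]


# Hypothetical trace of (event_type, name) pairs.
trace = [
    ("constructor", "A"),
    ("method", "b"),
    ("method", "c"),
    ("method", "d"),
]

report = assess(truncate, trace, threshold=2, type_of=lambda e: e[0])
print(report["output_size"], report["success"], report["preservation"])
```

Running the same `assess` call for every technique and for every threshold from 1,000 through 1,000,000 would give the side-by-side comparison the methodology calls for.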
24
Techniques under assessment
  • Subsequence summarization (category: summarization)
  • Stack depth limitation (category: metrics-based)
  • Language-based filtering (category: filtering)
  • Sampling (category: ad hoc)
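Of the four techniques, sampling is the simplest to illustrate. The sketch below assumes sampling means keeping every n-th event, with n chosen as the smallest step that brings the output within the maximum output size. The slides do not give the details of the assessed variant, so treat this as one plausible reading.

```python
import math


def sample_trace(trace, threshold):
    """Keep every n-th event so that at most threshold events remain."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if len(trace) <= threshold:
        return list(trace)
    # Smallest step for which the sampled trace fits the threshold.
    step = math.ceil(len(trace) / threshold)
    return list(trace[::step])


# Hypothetical trace of ten numbered events.
print(sample_trace(list(range(10)), threshold=3))
```

Sampling always meets the threshold and needs a single pass, but it ignores the meaning of the events, so it gives no guarantee that relevant data is preserved.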

25
Assessment summary
26
  • Visualization II: Extravis

27
Extravis
  • Execution Trace Visualizer
  • joint collaboration with TU/e
  • Goal
  • program comprehension through trace visualization
  • trace exploration, feature location, ...
  • address scalability issues
  • millions of events ⇒ sequence diagrams not
    adequate

29
Evaluation: Cromod
  • Industrial system
  • Regulates greenhouse conditions
  • 51 KLOC
  • 145 classes
  • Trace
  • 270,000 events
  • Task
  • Analysis of fan-in/fan-out characteristics
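A fan-in/fan-out analysis can be computed directly from the caller/callee pairs in a trace. The Python sketch below assumes fan-in is the number of distinct callers of a class and fan-out the number of distinct classes it calls. The slides do not spell out the definition used for Cromod, and the class names are hypothetical.

```python
from collections import defaultdict


def fan_in_out(calls):
    """Map each class to (fan_in, fan_out) from (caller, callee) pairs.

    Repeated calls between the same two classes are counted once.
    """
    callers = defaultdict(set)  # callee -> distinct callers
    callees = defaultdict(set)  # caller -> distinct callees
    for caller, callee in calls:
        callees[caller].add(callee)
        callers[callee].add(caller)

    names = set(callers) | set(callees)
    return {
        name: (len(callers.get(name, ())), len(callees.get(name, ())))
        for name in names
    }


# Hypothetical calls extracted from an execution trace.
calls = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]

for name, (fan_in, fan_out) in sorted(fan_in_out(calls).items()):
    print(name, fan_in, fan_out)
```

Classes with high fan-in and low fan-out tend to be shared utilities, while the opposite pattern suggests controlling classes, which is the kind of characteristic such an analysis is meant to reveal.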

30
Evaluation: Cromod (contd)
31
Evaluation: JHotDraw
  • Medium-size open source application
  • Java framework for graphics editing
  • 73 KLOC
  • 344 classes
  • Trace
  • 180,000 events
  • Task
  • feature location
  • i.e., relate functionality to source code or
    trace fragment

32
Evaluation: JHotDraw (contd)
33
Evaluation: Checkstyle
  • Medium-size open source system
  • code validation tool
  • 57 KLOC
  • 275 classes
  • Trace: 200,000 events
  • Task
  • formulate hypothesis
  • typical scenario comprises four main phases
  • initialization → AST construction → AST traversal
    → termination
  • validate hypothesis through trace analysis

34
Evaluation: Checkstyle (contd)
35
  • Current work: Human factor

36
Motivation
  • Need for controlled experiments in general
  • measure impact of (novel) visualizations
  • Need for empirical validation of Extravis in
    particular
  • only anecdotal evidence thus far
  • Measure usefulness of Extravis in
  • software maintenance
  • does runtime information from Extravis help?

37
Experimental design
  • Series of maintenance tasks
  • from high level to low level
  • e.g., overview, refactoring, detailed
    understanding
  • Experimental group
  • 10 subjects
  • Eclipse IDE + Extravis
  • Control group
  • 10 subjects
  • Eclipse IDE

38
  • Concluding remarks

39
Concluding remarks
  • Program comprehension: important subject
  • make software maintenance more efficient
  • Difficult to evaluate and compare
  • due to human factor
  • Many future directions
  • several of which have been addressed by this
    research

40
Want to participate in the controlled experiment?
  • Prerequisites
  • at least two persons
  • knowledge of Java
  • (some) experience with Eclipse
  • no implementation knowledge of Checkstyle
  • two hours to spare between December 1 and 19
  • Contact me
  • during lunch, or
  • through email: s.g.m.cornelissen_at_tudelft.nl