1. Program Comprehension through Dynamic Analysis: Visualization, Evaluation, and a Survey
- Bas Cornelissen (et al.)
- Delft University of Technology
- IPA Herfstdagen, Nunspeet, The Netherlands
- November 26, 2008
2. Context
- Software maintenance
  - e.g., feature requests, debugging
  - requires understanding of the program at hand
  - up to 70% of effort is spent on the comprehension process
→ Support program comprehension
3. Definitions
- Program comprehension
  - A person understands a program when he or she is able to explain the program, its structure, its behavior, its effects on its operational context, and its relationships to its application domain in terms that are qualitatively different from the tokens used to construct the source code of the program.
4. Definitions (cont'd)
- Dynamic analysis
  - The analysis of the properties of a running software system
- Advantages
  - preciseness
  - goal-oriented
- Limitations
  - incompleteness
  - scenario-dependence
  - scalability issues
[Diagram: unknown system (e.g., open source) → instrumentation (e.g., using AspectJ) → scenario → execution → (too) much data]
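For illustration, the instrumentation step can be mimicked in a few lines of Python (the slides use AspectJ for Java systems; the `record_trace` helper and its (depth, name) event format are assumptions for this sketch, built on the standard `sys.settrace` hook):

```python
import sys

def record_trace(func, *args):
    """Record (depth, function-name) events for each call made
    while func runs -- a minimal stand-in for the AspectJ-style
    instrumentation described in the slides."""
    events, depth = [], [0]

    def tracer(frame, event, arg):
        if event == "call":
            events.append((depth[0], frame.f_code.co_name))
            depth[0] += 1
        elif event == "return":
            depth[0] -= 1
        return tracer  # keep receiving local events for this frame

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)  # always disable tracing again
    return events

def fib(n):  # a tiny execution scenario to exercise the tracer
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Even this toy scenario shows how quickly traces grow: every nested call adds an event, which is exactly the "(too) much data" problem the slides point at.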
5. Outline
- Literature survey
- Visualization I: UML sequence diagrams
- Comparing reduction techniques
- Visualization II: Extravis
- Current work: the human factor
- Concluding remarks
7. Why a literature survey?
- Numerous papers and subfields
  - in the last decade: many papers annually
- Need for a broad overview
  - keep track of current and past developments
  - identify future directions
- Existing surveys (4) do not suffice
  - restricted scopes
  - approaches not systematic
  - collective outcomes difficult to structure
8. Characterizing the literature
- Four facets
  - Activity: what is being performed/contributed?
    - e.g., architecture reconstruction
  - Target: to which languages/platforms is the approach applicable?
    - e.g., web applications
  - Method: which methods are used in conducting the activity?
    - e.g., formal concept analysis
  - Evaluation: how is the approach validated?
    - e.g., industrial study
9. Attribute framework
10. Characterization
11. Attribute frequencies
12. Survey results
- Least common activities
  - surveys, architecture reconstruction
- Least common target systems
  - multithreaded, distributed, legacy, web
- Least common evaluations
  - industrial studies, controlled experiments, comparisons
13. Visualization I: Sequence Diagrams
14. UML sequence diagrams
- Goal
  - visualize test case executions as sequence diagrams
  - provides insight into functionalities
  - accurate, up-to-date documentation
- Method
  - instrument the system and its test suite
  - execute the test suite
  - abstract from irrelevant details
  - visualize as sequence diagrams
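As a minimal illustration of the final step, recorded call events can be rendered as textual sequence-diagram markup (PlantUML here; the actual tool's output format is not shown in the slides, and the (caller, callee, message) event format is an assumption):

```python
def to_plantuml(calls):
    """Render (caller, callee, message) call events as a PlantUML
    sequence diagram -- a sketch of the 'visualize' step, not the
    authors' actual generator."""
    lines = ["@startuml"]
    for caller, callee, message in calls:
        lines.append(f"{caller} -> {callee}: {message}")
    lines.append("@enduml")
    return "\n".join(lines)
```

Feeding the resulting text to a PlantUML renderer yields one lifeline per class and one arrow per recorded call, in chronological order.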
15. Evaluation
- JPacman
  - small program for educational purposes
  - 3 KLOC
  - 25 classes
- Task
  - change requests
    - addition of undo functionality
    - addition of multi-level functionality
16. Evaluation (cont'd)
- Checkstyle
  - code validation tool
  - 57 KLOC
  - 275 classes
- Task
  - addition of a new check
    - which types of checks exist?
    - what is the difference in terms of implementation?
17. Results
- Sequence diagrams are easily readable
  - intuitive due to chronological ordering
- Sequence diagrams aid program comprehension
  - they support maintenance tasks
- Proper reductions/abstractions are difficult
  - reducing 10,000 events to 100 is possible, but at what cost?
18. Results (cont'd)
- Reduction technique issues
  - which one is best?
  - which are most likely to lead to significant reductions?
  - which are the fastest?
  - which actually abstract from irrelevant details?
19. Comparing Reduction Techniques
20. Trace reduction techniques
- Input 1: a large execution trace
  - up to millions of events
- Input 2: a maximum output size
  - e.g., 100 for visualization through UML sequence diagrams
- Output: a reduced trace
  - was the reduction successful?
  - how fast was the reduction performed?
  - has relevant data been preserved?
21. Example technique
- Stack depth limitation (metrics-based filtering)
  - requires two passes
    - pass 1: determine the maximum depth and the depth frequencies
    - pass 2: discard events above the maximum admissible depth
[Diagram: a trace of 200,000 events with depth frequencies 0: 28,450; 1: 13,902; 2: 58,444; 3: 29,933; 4: 10,004; ... Given a maximum output size (threshold) of 50,000 events, events at depth > 1 are discarded, yielding a reduced trace of 42,352 events.]
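The two-pass scheme can be sketched in a few lines of Python (a sketch only: the (depth, name) event tuples and the `limit_stack_depth` name are assumptions, not the assessed implementation):

```python
from collections import Counter

def limit_stack_depth(events, max_output_size):
    """Two-pass stack depth limitation.

    Pass 1 counts how many events occur at each nesting depth;
    pass 2 keeps events up to the deepest level whose cumulative
    count still fits within the requested output size.
    """
    freq = Counter(depth for depth, _ in events)   # pass 1
    kept, cutoff = 0, -1
    for depth in sorted(freq):
        if kept + freq[depth] > max_output_size:
            break
        kept += freq[depth]
        cutoff = depth
    return [e for e in events if e[0] <= cutoff]   # pass 2
```

With the frequencies from the diagram (28,450 at depth 0 plus 13,902 at depth 1 = 42,352, while adding depth 2 would exceed 50,000), this cumulative cutoff reproduces the reduced trace size shown.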
22. How can we compare the techniques?
- Use
  - a common context
  - common evaluation criteria
  - a common test set
→ Ensures a fair comparison
23. Approach
- Assessment methodology
  - Context: need for high-level knowledge
  - Criteria: reduction success rate, performance, information preservation
  - Metrics: output size, time spent, preservation per type
  - Test set: five open source systems, one industrial
  - Application: apply reductions using thresholds 1,000 through 1,000,000
  - Interpretation: compare side-by-side
24. Techniques under assessment
- Subsequence summarization (summarization)
- Stack depth limitation (metrics-based)
- Language-based filtering (filtering)
- Sampling (ad hoc)
25. Assessment summary
26. Visualization II: Extravis
27. Extravis
- Execution Trace Visualizer
  - a collaboration with TU/e
- Goal
  - program comprehension through trace visualization
    - trace exploration, feature location, ...
  - address scalability issues
    - millions of events → sequence diagrams are not adequate
29. Evaluation: Cromod
- Industrial system
  - regulates greenhouse conditions
  - 51 KLOC
  - 145 classes
- Trace
  - 270,000 events
- Task
  - analysis of fan-in/fan-out characteristics
30. Evaluation: Cromod (cont'd)
31. Evaluation: JHotDraw
- Medium-size open source application
  - Java framework for graphics editing
  - 73 KLOC
  - 344 classes
- Trace
  - 180,000 events
- Task
  - feature location
    - i.e., relate functionality to source code or trace fragments
32. Evaluation: JHotDraw (cont'd)
33. Evaluation: Checkstyle
- Medium-size open source system
  - code validation tool
  - 73 KLOC
  - 344 classes
- Trace: 200,000 events
- Task
  - formulate a hypothesis
    - a typical scenario comprises four main phases: initialization, AST construction, AST traversal, termination
  - validate the hypothesis through trace analysis

35. Current Work: the Human Factor
36. Motivation
- Need for controlled experiments in general
  - to measure the impact of (novel) visualizations
- Need for empirical validation of Extravis in particular
  - only anecdotal evidence thus far
- Measure the usefulness of Extravis in software maintenance
  - does runtime information from Extravis help?
37. Experimental design
- Series of maintenance tasks
  - from high level to low level
  - e.g., overview, refactoring, detailed understanding
- Experimental group
  - 10 subjects
  - Eclipse IDE + Extravis
- Control group
  - 10 subjects
  - Eclipse IDE
39. Concluding remarks
- Program comprehension is an important subject
  - it makes software maintenance more efficient
- Difficult to evaluate and compare
  - due to the human factor
- Many future directions
  - several of which have been addressed by this research
40. Want to participate in the controlled experiment?
- Prerequisites
  - at least two persons
  - knowledge of Java
  - (some) experience with Eclipse
  - no implementation knowledge of Checkstyle
  - two hours to spare between December 1 and 19
- Contact me
  - during lunch, or
  - through email: s.g.m.cornelissen@tudelft.nl