Laboratory of Software Analysis Lesson 1

1
Laboratory of Software Analysis: Lesson 1
  • Filippo Ricca
  • Unità CINI at DISI
  • (Laboratorio Iniziativa Software
    FINMECCANICA/ELSAG spa - CINI)
  • Genova, Italy
  • filippo.ricca_at_disi.unige.it

Mariano Ceccato, Alessandro Marchetto
ITC-Irst, Trento, Italy
ceccato_at_itc.it, marchetto_at_itc.it
2
Overview
  • Objectives
  • Course dependences
  • Content / Course material / tools used
  • Exam (discussion)
  • Legacy systems
  • Reverse engineering, re-structuring,
    re-engineering
  • Program transformations (TXL)
  • Past projects
  • This year: three small projects

3
Objectives, dependences, material, exam.
4
Objectives
  • This course has two objectives:
  • Providing the practical skills involved in
    software analysis and testing. Some
    techniques/approaches described during the
    theoretical lessons of the basic course
    (Software Analysis and Testing) will be applied
    to real software systems that have to be
    re-engineered and tested.
  • Introducing empirical studies in software
    engineering.

5
Dependences
Not mandatory, but:
---> Programming I and II, Software Engineering,
Software Analysis and Testing.
---> It is important to know (a little):
- OO programming, in particular Java (base level).
- UML (class diagrams, ...).
- Web technologies: HTML, JSP, ... (just a little).
- Theoretical aspects of testing.
- ...
6
Content
  • Code analysis and transformations
  • Theoretical aspects (already seen in Software
    analysis and testing).
  • The TXL programming language.
  • Practical application of some techniques to
    software systems.
  • Software testing
  • Theoretical aspects (already seen in Software
    analysis and testing).
  • Acceptance testing, GUI testing, test-first,
    design for testing, ...
  • Tools: FIT, FitNesse, JUnit, Abbot, Robot
  • Empirical studies in Software engineering
  • Theoretical aspects (what is an ES?, How to
    design/conduct an ES?)
  • Analysis and interpretation (how to draw
    conclusions)
  • Execution of two empirical studies.

7
Material / Tools
  • Slides
  • Papers
  • Manuals of tools

http://sra.itc.it/people/ceccato/courses/lsa/
Languages and Tools
  • TXL: code analysis and transformations
  • Graphviz: graph visualization software
  • VisualUML: UML modeling tool (diagram recovery)
  • JUnit, Fit, FitNesse, Abbot, Robot: testing
    tools

8
Examination
  • During the course we will work on many small
    projects.
  • The examination will consist of a discussion.
  • Admission to the examination requires (at least)
    the production of some documents that we will
    see during the year.
  • Examples of small projects:
  • Recovering the architecture (class diagram) of a
    system.
  • Maintenance intervention / re-implementation of
    a system.
  • Porting a C program to Java.
  • Testing.
  • Empirical study: C vs. Java.

9
A Little Terminology
10
Legacy systems
Negative aspects
Positive aspects
Characteristics
  • They were implemented years ago (around 1970)
  • Their technology became obsolete (obsolete
    languages, language styles, hardware, ...)
  • They have been maintained for a long time (around
    30 years)
  • Their structure has deteriorated and does not
    facilitate understanding
  • Their documentation (if it exists) became
    obsolete
  • Original authors are not available
  • They contain business rules not recorded
    elsewhere
  • They cannot be easily replaced (important!)
  • They represent a large investment

Each maintenance intervention is extremely
difficult!
11
Legacy dilemma
  • What should we do with legacy code?
  • Build the new system from scratch, or
  • try to understand the legacy code and
    reconstitute it in a new form.

First step: reverse engineering
12
Reverse Engineering
  • Reverse engineering is the process of taking
    something (a device, an electrical component, a
    car, a piece of software, ...) apart and analyzing
    its workings in detail, usually with the intention
    of constructing a new device or program that does
    the same thing.
  • Reverse engineering is often used by the military
    in order to copy other nations' technology.

13
Military Reverse-Engineered Projects
  • Examples of military reverse-engineered projects
    include:
  • The Soviet Union reverse-engineered the Tu-4 Bull
    bomber from the United States Boeing B-29.
  • The Soviet Union personal computer Agat was
    reverse-engineered from the Apple II.
  • North Korea reverse-engineered Russian Scud-B
    missiles to make its own Scud Mod A.

Boeing B-29
14
(Software) Reverse engineering
  • Reverse engineering is a process that helps in
    understanding a software system. It is a process
    of examination, of extracting information, not a
    process of change or replication.

Software ----------> Abstract representation
15
Forward and Reverse Engineering
  • Forward engineering is the traditional process of
    moving from high-level abstractions to the
    physical implementation of a system.

Requirements ---> Design ---> Implementation
  • Reverse engineering is the inverse of forward
    engineering.

Implementation ---> Design ---> Requirements
Code ---> Abstract code representation
16
Reverse Engineering Tools
  • Pretty printers and code viewers
  • Diagram generators (software views: flowcharts,
    data flow diagrams, call graph diagrams, ...)
  • Embedded comment extractors (e.g. Javadoc)
  • Software metrics tools (LOC, methods/functions,
    cohesion, coupling)
  • Design recovery tools (e.g. Rational Rose, Omondo,
    VisualUML: UML diagram extractors)
  • Others

17
Restructuring
  • Restructuring is the transformation from one
    representation to another at the same relative
    abstraction level, while preserving the system's
    external behavior (functionality and semantics).
  • Examples:
  • Code level:
    - from an unstructured (spaghetti) form to a
      structured (goto-less) form
    - conversion of a set of if-statements into a
      case structure.
  • Design level: to improve or change data
    structures (arrays to lists, file system to
    DBMS, ...) or to improve algorithms (for example,
    time complexity).
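The code-level example (a set of if-statements turned into a case structure) can be sketched in Java as follows. The class name, method names and command codes are hypothetical and chosen only for illustration; the point is that both versions return the same result for every input, so external behavior is preserved.

```java
// Hypothetical illustration of a code-level restructuring:
// an if-chain on one variable is rewritten as a switch (case structure).
final class RestructuringExample {

    // Before: a chain of if-statements that all test the same variable.
    static String describeBefore(int code) {
        String result;
        if (code == 1) {
            result = "open";
        } else if (code == 2) {
            result = "close";
        } else if (code == 3) {
            result = "save";
        } else {
            result = "unknown";
        }
        return result;
    }

    // After: the same decision expressed as a case structure.
    static String describeAfter(int code) {
        switch (code) {
            case 1:  return "open";
            case 2:  return "close";
            case 3:  return "save";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // Print both results side by side for a few inputs.
        for (int code = 0; code <= 4; code++) {
            System.out.println(code + ": " + describeBefore(code)
                    + " / " + describeAfter(code));
        }
    }
}
```

Comparing the two methods on a set of inputs, as main does, is a simple way to gain confidence that a restructuring did not change behavior.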

18
Re-engineering
  • Re-engineering is the examination (reverse
    engineering) of a system to reconstitute it
    (forward engineering) in a new form.
  • This process may include modifications to satisfy
    new requirements not met by the original system
    (in that case semantics cannot be preserved).
  • The re-engineering process takes many forms,
    depending on its objectives. Sample objectives
    are:
  • code migration/porting (e.g. C to C++)
  • re-engineering code for reuse
  • re-engineering code for security

19
Relationships
20
Program analysis
  • Program analysis is the (automated) inspection
    of a program to infer some properties. Usually,
    properties are inferred without running the
    program (static analysis).
  • Examples are:
  • Type analysis (type inference)
  • Dead code analysis
  • Clone analysis
  • Pointer Analysis

21
Program Transformations
  • Program transformation is the act of changing one
    program into another.

P (source language L) --- transformation ---> P' (target language L')
Two cases: L is different from L', or L is equal to L'.
  • Examples:
  • Pascal to C porting (L is different from L')
  • Goto elimination in the Pascal language (L is
    equal to L')

22
TXL
  • TXL is a programming language specifically
    designed to support software analysis and program
    transformation.

Code motion optimization
Example: moves all loop-independent
assignment statements outside of loops.

Before:
  Loop
    x a b
    y x
    a y 3
    x b c
    y x 2
    z x a y
  End loop

After:
  x4 b c
  y2 x4 2
  Loop
    x a b
    y x
    a y 3
    z x4 a y2
  End loop
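To make the idea of code motion concrete, here is a minimal hand-written sketch in Java. It is not the TXL transformation from the slide: the method names and the computation are hypothetical, and the only point is that an assignment whose operands do not change inside the loop can be moved in front of the loop without changing the result.

```java
// Hypothetical illustration of code motion (loop-invariant hoisting).
final class CodeMotionExample {

    // Before: 'limit' depends only on b and c, which never change in
    // the loop, yet it is recomputed in every iteration.
    static int sumBefore(int[] values, int b, int c) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            int limit = b + c;               // loop-independent assignment
            sum += Math.min(values[i], limit);
        }
        return sum;
    }

    // After: the loop-independent assignment is moved outside the loop.
    static int sumAfter(int[] values, int b, int c) {
        int limit = b + c;                   // computed once
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += Math.min(values[i], limit);
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] values = {1, 5, 9};
        System.out.println(sumBefore(values, 2, 3));
        System.out.println(sumAfter(values, 2, 3));
    }
}
```

Statements that read a variable modified inside the loop cannot be moved this way, which is why a transformation tool has to analyze dependences first.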
23
Past Projects and This Year's Projects
24
Project Year 2004: Porting C to Java
  • Porting of the Chull program (C code) to Java.
  • Chull determines the convex hull of a set of
    points in 3D.
  • Chull is not trivial C code (4161 LOC, 31
    functions, 3 structs, pointers, ...).

Convex hull in 2D
25
Project Steps 2004
Semi-automated procedure
  • 1) Instrumentation of Chull using TXL. Writing
    test cases such that branch coverage is reached.
  • 2) Reverse engineering of Chull using TXL: call
    graph, dependences between functions and data
    structures.
  • 3) Object identification (clustering and concept
    analysis).
  • 4) OO design in UML (only class diagram).
  • 5) Java code generation for Chull (partially
    with TXL).
  • 6) Testing of the Java version of Chull with the
    test cases written in step 1, to show that it
    behaves like the original C program.

26
Code instrumentation
  • To determine whether or not each branch is
    traversed, we can place a counter
    (instrumentation) on each branch. Then we have to
    run the program with some inputs.
  • To have branch coverage, we have to check that
    count is equal to (1, 1, ..., 1).

[Flowchart of the instrumented program: start; count = (0, 0, 0); read x, y; count(1) = 1; if (x > y) with a true and a false branch; z = 1; count(2) = 1; count(3) = 1; exit.]
N.B.: count is an array where each element
is initialized to 0.
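The counter-based instrumentation can be sketched in Java as below. This is an illustration only: the slide's flowchart does not make clear on which branch z is assigned, so placing it on the true branch is an assumption, and the class and method names are hypothetical. Java arrays are zero-based, so count(1), count(2), count(3) become count[0], count[1], count[2].

```java
import java.util.Arrays;

// Hypothetical sketch of branch instrumentation with one counter per branch.
final class BranchCoverageSketch {

    // Every element starts at 0 and is set to 1 when its branch is traversed.
    static final int[] count = new int[3];

    // The program under test, with instrumentation statements added.
    static int instrumentedProgram(int x, int y) {
        count[0] = 1;                 // entry of the program reached
        int z = 0;
        if (x > y) {
            count[1] = 1;             // true branch traversed
            z = 1;                    // assumed to sit on the true branch
        } else {
            count[2] = 1;             // false branch traversed
        }
        return z;
    }

    // Branch coverage is reached when count equals (1, 1, ..., 1).
    static boolean fullBranchCoverage() {
        for (int c : count) {
            if (c != 1) {
                return false;
            }
        }
        return true;
    }

    static void reset() {
        Arrays.fill(count, 0);
    }

    public static void main(String[] args) {
        reset();
        instrumentedProgram(2, 1);    // exercises only the true branch
        System.out.println(Arrays.toString(count) + " " + fullBranchCoverage());
        instrumentedProgram(1, 2);    // adds the false branch
        System.out.println(Arrays.toString(count) + " " + fullBranchCoverage());
    }
}
```

A single input can never cover both branches of the if, so at least two test cases are needed here.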
27
Project Year 2005: Maintenance intervention
A crosscutting functionality is implemented by code
fragments spread across several classes.
  • Adding a new crosscutting functionality
    (persistence history) to the Jconsole Java
    program.
  • Jconsole: 27 Java files, 1385 LOC.
  • Two ways of adding a crosscutting functionality
    to a system:
  • 1) Changing (almost) all the Java classes.
  • 2) Adding an aspect (AOP) written in the AspectJ
    language.

28
AspectJ example
Suppose we have to add logging to all methods
of a Java program (Logger.entry(String) and
Logger.exit(String)).

/* Java */
public class Main {
    public void foo() {
        Logger.entry("foo()");
        // ... something
        Logger.exit("foo()");
    }
    public void foo(int i) {
        Logger.entry("foo(int)");
        // ... something
        Logger.exit("foo(int)");
    }
    public static void main(String[] args) {
        Logger.entry("main()");
        // ... something
        Logger.exit("main()");
    }
}

/* AspectJ */
public class Main {
    public void foo() {
        // ... something
    }
    public void foo(int i) {
        // ... something
    }
    public static void main(String[] args) {
        // ... something
    }
}

public aspect AutoLog {
    // All public method executions; the pointcut body is elided on the slide.
    pointcut publicMethods() : execution(public * *(..));
    before() : publicMethods() {
        Logger.entry(thisJoinPoint.getSignature().toString());
    }
    after() : publicMethods() {
        Logger.exit(thisJoinPoint.getSignature().toString());
    }
}
29
Project Year 2006: a real SE experiment
  • We conducted a real software engineering
    experiment in the context of Web applications:
  • stereotyped UML class diagrams (Conallen
    proposal) vs. pure UML class diagrams
  • What are stereotypes?
  • What is a software engineering experiment (or
    software engineering empirical study)?
30
Stereotypes
  • The designers of UML recognized that the
    language is not always perfect for every
    situation/domain.
  • UML has defined a mechanism to allow certain
    domains to extend the semantics of specific model
    elements. The extension mechanism allows the
    inclusion of new attributes, different semantics
    and additional constraints.
  • Stereotypes form an extension to UML.
  • Stereotypes are adornments or icons having a
    well-defined semantics.

Used instead of classes in the class diagram
31
Empirical studies in SE
  • Software engineering is the result of opinions
    and anecdotal evidence and not the result of
    empirical evidence...
  • For example, no one has demonstrated that OO
    techniques are better than structured techniques,
    but everyone uses OO...
  • Empirical studies (experiments) are useful to try
    to answer research questions such as:
  • Is technique A better than B?

32
How to conduct an empirical study?
  • Suppose that we have to demonstrate this
    hypothesis:
  • Technique A is better than B.
  • Procedure:
  • Participants (students, professionals, etc.) are
    divided into two groups (Group 1 and Group 2).
  • Group 1 executes the task with technique A,
    while Group 2 uses technique B.
  • Data from the experiment are collected and
    metrics are measured.
  • The hypothesis of the experiment is evaluated
    statistically using the collected data and
    metrics.
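As a rough illustration of the last step, the sketch below compares two groups with Welch's t statistic. The slides do not say which statistical test is used, so the choice of test is an assumption, and the scores in main are invented data, not results from any experiment. Turning the statistic into a p-value needs the t distribution, which the Java standard library does not provide, so that step is left out.

```java
// Hypothetical sketch: comparing the scores of two groups with Welch's t statistic.
final class TwoGroupComparison {

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Unbiased sample variance (divides by n - 1).
    static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return squares / (xs.length - 1);
    }

    // Welch's t: difference of means divided by its estimated standard error.
    static double welchT(double[] a, double[] b) {
        double standardError = Math.sqrt(
                sampleVariance(a) / a.length + sampleVariance(b) / b.length);
        return (mean(a) - mean(b)) / standardError;
    }

    public static void main(String[] args) {
        // Invented scores for illustration only.
        double[] group1 = {7, 8, 9, 6, 10};   // task done with technique A
        double[] group2 = {5, 6, 4, 7, 3};    // task done with technique B
        System.out.println("mean A = " + mean(group1));
        System.out.println("mean B = " + mean(group2));
        System.out.println("t = " + welchT(group1, group2));
    }
}
```

A large absolute value of t suggests that the difference between the groups is unlikely to be due to chance alone; the threshold depends on the group sizes and the chosen significance level.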

33
Empirical study 1: Conallen vs. Pure UML
Conallen notation
Pure UML
Which is more useful during understanding and
maintenance?
34
This year
Three projects:
  • Porting a Borland Delphi (Object Pascal) program
    to Java using TXL.
  • Empirical study 1: Can test cases (Fit tables)
    be used to clarify requirements?
  • Empirical study 2: Conallen vs. WebML. When doing
    a comprehension task, which is more useful,
    Conallen or WebML?

35
Borland Delphi Object Pascal to Java
Object Pascal:

Type
  Person = object
    surname: string[30];
    name: string[20];
    age: Integer;
    Procedure init;
  End;
  Student = Object(Person)
    grade: Integer;
    teacher: String[30];
  End;

Procedure Person.Init;
Begin
  surname := '';
  name := '';
  age := 0;
End;

Java (produced by the TXL program):

class Person {
  String surname;
  String name;
  int age;
  Person() {
    surname = "";
    name = "";
    age = 0;
  }
}
class Student extends Person {
  int grade;
  String teacher;
}
36
Fit tables
  • A Fit table is a way of expressing the business
    logic using a simple (input-output) HTML table.
  • Fit tables are added to the requirements and
    are used as acceptance test cases.
  • Customers and Analysts create Fit tables using a
    tool like Word, Excel, or even a text editor.

input
output
37
Sports Magazine Website
An example to understand Fit tables a little
better:
  • A sports magazine decides to add a new feature to
    its Website that will allow users to view top
    football teams based on their ratings.
  • Rating = ((10000 * (won * 3 + drawn))
    / (3 * played)) / 100
  • The analyst can express the change requirement in
    the traditional way (natural language, use
    cases, ...) or using natural language + Fit
    tables.

new feature added
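The rating rule can be written as a small Java method, assuming the formula reads ((10000 * (won * 3 + drawn)) / (3 * played)) / 100 and that it is evaluated with integer arithmetic. Both points are assumptions, as are the class name and the sample figures. Each row of a Fit table (inputs played, won, drawn; expected output rating) then corresponds to one call of this method.

```java
// Hypothetical sketch of the rating rule behind the Fit table example.
final class TeamRating {

    // Points obtained (3 per win, 1 per draw) as a percentage of the
    // maximum possible points (3 per game played), using integer division.
    static int rating(int played, int won, int drawn) {
        if (played <= 0) {
            throw new IllegalArgumentException("played must be positive");
        }
        return ((10000 * (won * 3 + drawn)) / (3 * played)) / 100;
    }

    public static void main(String[] args) {
        // Invented rows in the spirit of a Fit table: inputs, then output.
        System.out.println("played=38 won=31 drawn=2 rating=" + rating(38, 31, 2));
        System.out.println("played=10 won=10 drawn=0 rating=" + rating(10, 10, 0));
    }
}
```

Writing the rule down as concrete input-output rows is what makes a Fit table usable as an acceptance test: an ambiguity such as rounding versus truncation shows up as soon as expected values are filled in.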
38
Natural language vs. Fit tables + natural
language
Can Fit tables be used to clarify requirements?
Only natural language
Fit table + natural language
  • A user can search for top N football teams based
    on rating.
  • The rating is defined
  • A user can search for top N football teams based
    on rating.

39
Empirical study 2
[Diagram: Group 1 (Conallen) and Group 2 (WebML) work on Web applications and each answers a questionnaire.]
  • When doing a comprehension task, which notation
    is more useful?

40
The end
Next lessons
41
Obfuscated C Contest Winner
The IOCCC (International Obfuscated C Code Contest)
is a competition to see who can write the most
unreadable, but legal, C program.
42
Code viewers
1) Textual representation (colors)
2) Graphical representation (colors)
43
A picture is worth a thousand words
Imagix tool
Main window (call graph)
C code
Calls
Functions
Variables
44
CVF 3.0
CVF 3.0 is an automated program flowchart
generator. It can automatically reverse-engineer
program code into flowcharts. It works with C, C++,
VC++, VB, VBA, VBScript, ASP, Visual C#, Visual
Basic .NET, Visual J# .NET, VC++ .NET, ASP.NET,
Java, JSP, JavaScript, Delphi, PowerBuilder and
Perl.
45
UML Class Diagram Recovery
46
Type inference
Program P (language without declarations):
a = 4; c = a op b; Push(x, T); Push(y, T); d = Pop(T)
Inferred types:
a: integer; c: real; b: real; T: queue; d: real
47
Dead code
20 FOR I = 1 TO 10
30 V[I] = V[I] + 1
40 PRINT V[I]
50 ENDFOR
60 PRINT X
70 GOTO 100
80 CALL F1
90 CALL F2
100 END
Suppose there are no jumps to lines 80 and 90!
Lines 80 and 90 are never executed.
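One simple way to find this kind of dead code is a reachability search on the control-flow graph: any line that cannot be reached from the entry point is never executed. The Java sketch below encodes the control flow of the example program by hand (the successor lists and all names are assumptions made for illustration) and reports the unreachable lines.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: dead code detection as graph reachability.
final class DeadCodeSketch {

    // Hand-written control-flow graph: line number -> possible next lines.
    static Map<Integer, List<Integer>> exampleProgram() {
        Map<Integer, List<Integer>> cfg = new TreeMap<>();
        cfg.put(20, List.of(30, 60));   // FOR: enter the body or leave the loop
        cfg.put(30, List.of(40));
        cfg.put(40, List.of(50));
        cfg.put(50, List.of(20));       // ENDFOR: back to the loop header
        cfg.put(60, List.of(70));
        cfg.put(70, List.of(100));      // GOTO 100
        cfg.put(80, List.of(90));       // CALL F1
        cfg.put(90, List.of(100));      // CALL F2
        cfg.put(100, List.of());        // END
        return cfg;
    }

    // Returns, in increasing order, the lines not reachable from 'entry'.
    static List<Integer> unreachableLines(Map<Integer, List<Integer>> successors,
                                          int entry) {
        Set<Integer> reached = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(entry);
        while (!work.isEmpty()) {
            int line = work.pop();
            if (reached.add(line)) {    // first visit of this line
                for (int next : successors.getOrDefault(line, List.of())) {
                    work.push(next);
                }
            }
        }
        List<Integer> dead = new ArrayList<>();
        for (int line : new TreeMap<>(successors).keySet()) {
            if (!reached.contains(line)) {
                dead.add(line);
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        System.out.println("Never executed: " + unreachableLines(exampleProgram(), 20));
    }
}
```

In a real tool the control-flow graph would be extracted from the source code automatically; here it is written by hand to keep the sketch short.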
48
Clones
  • Example: clone analysis

20 FOR I = 1 TO 10
30 V[I] = V[I] + 1
40 PRINT V[I]
50 ENDFOR
60 PRINT X
70 CALL F
100 FOR J = 1 TO 10
110 W[J] = W[J] + 1
120 PRINT W[J]
130 ENDFOR

Clones: lines 20-50 and 100-130
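A very simple form of clone analysis can be sketched in Java as follows: each line is normalized by dropping its line number and replacing every identifier with the placeholder ID, and then windows of consecutive lines are compared. The program text is a reconstruction of the example (the brackets and operators are assumed), and the keyword list, window size and names are hypothetical. The sketch does not check that identifiers are renamed consistently, which a real clone detector would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch: detecting clones that differ only in identifier names.
final class CloneSketch {

    static final List<String> EXAMPLE = List.of(
            "20 FOR I = 1 TO 10",
            "30 V[I] = V[I] + 1",
            "40 PRINT V[I]",
            "50 ENDFOR",
            "60 PRINT X",
            "70 CALL F",
            "100 FOR J = 1 TO 10",
            "110 W[J] = W[J] + 1",
            "120 PRINT W[J]",
            "130 ENDFOR");

    private static final Set<String> KEYWORDS =
            Set.of("FOR", "TO", "ENDFOR", "PRINT", "CALL", "GOTO", "END");
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Drops the leading line number and replaces non-keyword identifiers by "ID".
    static String normalize(String line) {
        String body = line.replaceFirst("^\\d+\\s+", "");
        return IDENTIFIER.matcher(body)
                .replaceAll(m -> KEYWORDS.contains(m.group()) ? m.group() : "ID");
    }

    static String lineNumber(String line) {
        return line.trim().split("\\s+")[0];
    }

    // Reports pairs of non-overlapping windows whose normalized lines are equal.
    static List<String> findClones(List<String> lines, int window) {
        List<String> clones = new ArrayList<>();
        for (int i = 0; i + window <= lines.size(); i++) {
            for (int j = i + window; j + window <= lines.size(); j++) {
                boolean same = true;
                for (int k = 0; k < window && same; k++) {
                    same = normalize(lines.get(i + k)).equals(normalize(lines.get(j + k)));
                }
                if (same) {
                    clones.add(lineNumber(lines.get(i)) + "-"
                            + lineNumber(lines.get(i + window - 1)) + " ~ "
                            + lineNumber(lines.get(j)) + "-"
                            + lineNumber(lines.get(j + window - 1)));
                }
            }
        }
        return clones;
    }

    public static void main(String[] args) {
        System.out.println(findClones(EXAMPLE, 4));
    }
}
```

With a window of four lines, the two FOR loops are reported as a clone pair because they are identical once V/W and I/J are abstracted away.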