Title: SENG 540 Software Evolution
1. SENG 540 Software Evolution
2. Software Maturity Index
- Discussed in last lecture
- Measure of the maturity of a software system
- How stable is the system?
- Quantify the readiness of a product
- Determined by comparing the size of the system to
the amount of change occurring in a given release
- Stability assumed when change is low
- Observed across several releases, cannot be determined
from one release
3. Software Maturity Example
- Mt = number of modules in current release
- Fc = number of modules in current release which
have been changed
- Fa = number of modules in current release which
have been added
- Fd = number of modules in previous release that
were deleted
- SMI = (Mt - (Fa + Fc + Fd)) / Mt
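As a minimal sketch of the SMI formula above, the Python function below computes the index for a single release. The function name software_maturity_index and the sample counts are illustrative assumptions, not values from the lecture.

```python
def software_maturity_index(total, added, changed, deleted):
    """SMI = (Mt - (Fa + Fc + Fd)) / Mt for a single release."""
    if total <= 0:
        raise ValueError("Mt must be positive")
    return (total - (added + changed + deleted)) / total


# Hypothetical release: 100 modules, 10 added, 5 changed, 5 deleted.
print(software_maturity_index(100, 10, 5, 5))
```

A value close to 1 means little changed relative to the size of the release. A single value says little on its own, so the index is normally tracked across releases.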
4. SMI Example (cont)
5. SMI Example (cont)
- Graph of SMI
- Assumes there was a 0 release
- What can you see from this chart?
6. SMI Example (cont)
- Other ways to view the same data
- More insight?
- Account for past history
- Examine cumulative changes and average number of
changes
7. SMI Example (cont)
- Consider a new metric
- The average number of changes
- Calculated as the cumulative number of changes
over the number of releases
- ASC = CSC / n
- ASC = average software change
- CSC = cumulative software change
- n = number of releases
8. SMI Example (cont)
- Software Change
- Sum of Fa, Fc and Fd: Fa + Fc + Fd
- Average Software Change
- (Fa + Fc + Fd) / Mt
9. SMI Example (cont)
- Software Stability Index
- (Mt - CSC) / Mt
- Historical SMI
- (Mt - ASC) / Mt
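The history-based metrics can be sketched in Python as follows. This assumes one reading of the slides: software change is Fa + Fc + Fd, CSC is its running total, ASC = CSC / n, the stability index is (Mt - CSC) / Mt, and the historical SMI is (Mt - ASC) / Mt. The Release class, the history_metrics function and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Release:
    total: int    # Mt
    added: int    # Fa
    changed: int  # Fc
    deleted: int  # Fd

    @property
    def software_change(self):
        return self.added + self.changed + self.deleted


def history_metrics(releases):
    """Return one dictionary of metrics per release, in release order."""
    rows = []
    cumulative = 0
    for n, release in enumerate(releases, start=1):
        cumulative += release.software_change  # CSC
        average = cumulative / n               # ASC = CSC / n
        rows.append({
            "smi": (release.total - release.software_change) / release.total,
            "csc": cumulative,
            "asc": average,
            "stability_index": (release.total - cumulative) / release.total,
            "historical_smi": (release.total - average) / release.total,
        })
    return rows


# Hypothetical two-release history.
for row in history_metrics([Release(100, 10, 5, 5), Release(110, 5, 4, 1)]):
    print(row)
```

Because CSC only grows, the stability index under this reading falls over time unless the system grows faster than the accumulated change, while the historical SMI smooths out release-to-release spikes.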
10. Program Comprehension
- Estimated that 50-90% of maintainers' time is
spent in program comprehension
- Improve maintenance: facilitate the process of
comprehending existing programs
- Critical in our ability to maintain and
reengineer legacy systems
11. Program Comprehension (2)
- Performed during
- Maintenance
- Reverse engineering
- Reuse
- Code reviews and walkthroughs
- Validation and verification
12. Program Comprehension (3)
- Artifacts
- Requirements specification
- Design documents
- User manual
- Previous maintenance records
- Test cases developed
- Source code
13. Program Comprehension (4)
- Attempt to reconstruct
- A model of the system as it currently exists
- Why it exists in current state
14. Comprehension Strategies
- Techniques a programmer uses to understand a
program
- Mapping between the problem domain and the
programming domain
- Actions used
- Read code
- Read documentation
- Run code
15. Comprehension Strategies (2)
- Systematic
- Examine the entire program and work out the
interactions between various modules
- As-needed
- Before modification -- locate the section which
needs to be modified and modify it
16. Program Understanding
- Muller, University of Victoria
- The task of building mental models of the
underlying software at various abstraction
levels, ranging from models of the code itself to
one of the underlying application domain, for
maintenance, evolution and reengineering purposes.
17. Levels of Understanding
- Highest level
- What it does
- Lowest level
- Recognize algorithms which implement what it does
- NOT looking for line-by-line level of
understanding
18. Factors Affecting Understandability
- Reader characteristics
- Knowledge of language
- Knowledge of domain
- Strategies used
- Intrinsic factors
- Complexity of code
- Concurrent or real-time
- Size of program
19. Factors Affecting Understandability (2)
- Representational factors
- Language itself
- Nature and inclusion of comments
- Choice of identifiers
- Typographical factors
- Upper-lower case
- Fonts and color
- Indentation/white space
20. Factors Affecting Understandability (3)
- Environmental factors
- Medium -- paper/CRT
- External documentation
- Tools -- editors and compilers
- Eventually use this information to provide style
guidelines to produce more readable programs
21. Theories of Program Understanding
- Bottom-up
- Pennington 87
- Top-Down
- Brooks 83
- Opportunistic, Synchronized Refinement,
Situational
- Letovsky 86
22. Bottom-Up or Chunking
- Build higher levels of abstraction
- Starts from the source code
- Uses chunking and concept assignment
- Shneiderman and Mayer 79
- Pennington 87
- Biggerstaff, et al 93
- Use control flow
- Examine data structures
- Trace variables
- Chunk this information for storage in memory
23. Top-Down
- Begin with a pre-existing notion of the
functionality of the system (hypothesis) and try
to reconstruct the mappings from the problem
domain into the programming domain
- Trying to go back in time and recreate a thought
process
- Brooks 83
- Soloway and Ehrlich 84
- Determine components responsible for specific
tasks
- Iteratively create, verify, and modify hypotheses
until the entire system is explained by a
consistent set of hypotheses
- Each hypothesis is investigated to determine if
it holds, should be rejected or refined in a
hierarchical way
24. Opportunistic
- Coordinate bottom-up methodology with source code
and top-down with the application's description
- Belief is that programmers frequently change
their viewpoint
- Letovsky 86
- Or combine them into a new approach
- von Mayrhauser and Vans 95, 96, 97
- Working to develop data and control flow
information to create mental representations of
the software system
25. Types of Knowledge Employed
- Programming Plans
- Mental representations of program fragments that
represent stereotypical action sequences within
source code
- Rules of Discourse
- Conventions for creating programs
- Domain Specific Knowledge
- Information about functional units within the
domain and how these pieces fit together
26. Novices versus Experts
- Novices -- use bottom-up
- Due to lack of domain knowledge
- Experts -- use mixture but prefer top-down
- Reading source code in order to identify plans
and domain models
- Indicator of types of tools to build
27. Program Slicing
- Developed by Weiser (1981)
- Throw out parts of the program which are
irrelevant to the particular function or to the
state of the program of interest in order to
study the program
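A minimal sketch of the idea in Python, assuming a straight-line program with no branches or loops (real slicers also follow control dependences). Each statement is modelled as a pair of the variable it assigns and the variables it reads. The backward_slice function and the sample program are hypothetical.

```python
def backward_slice(statements, criterion_var):
    """Return indices of statements that can affect criterion_var at program end.

    statements is a list of (assigned_variable, [variables_read]) pairs.
    """
    relevant = {criterion_var}
    kept = []
    for index in range(len(statements) - 1, -1, -1):
        target, uses = statements[index]
        if target in relevant:
            kept.append(index)
            relevant.discard(target)  # this definition satisfies the need
            relevant.update(uses)     # its inputs become relevant instead
    return sorted(kept)


PROGRAM = [
    ("n", []),                   # 0: n = read()
    ("total", []),               # 1: total = 0
    ("count", []),               # 2: count = 0
    ("total", ["total", "n"]),   # 3: total = total + n
    ("count", ["count"]),        # 4: count = count + 1
]

print(backward_slice(PROGRAM, "total"))
```

Slicing on total drops every statement that only touches count, which leaves a smaller program to read when only total is of interest.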
28. Tools and Techniques
- Be aware code and comments may not agree
- Use indentation to help structure
- Try to build a model of the style conventions
used in the program
- Watch out for code written to overcome compiler
or computer limitations or code containing
apparently magic numbers.
29. Tools and Techniques (2)
- Watch out for non-standard language features
- Consider roundoff implications
- Use stepwise abstraction.
- Look for evidence of changes.
- Be wary of objects of nearly the same name,
particularly those identifiers which differ by a
single character.
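The last check can be partly automated. The Python sketch below lists pairs of identifiers that differ by a single substituted, inserted or deleted character. It uses a simple regular expression, so it also picks up words inside strings and comments. The function names are hypothetical.

```python
import itertools
import keyword
import re


def differs_by_one(a, b):
    """True if a and b differ by one substituted, inserted, or deleted character."""
    if a == b:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def near_duplicate_identifiers(source):
    """List pairs of identifiers in source text that differ by one character."""
    names = {
        name
        for name in re.findall(r"[A-Za-z_]\w*", source)
        if not keyword.iskeyword(name)
    }
    return [
        pair
        for pair in itertools.combinations(sorted(names), 2)
        if differs_by_one(*pair)
    ]


SAMPLE = "count = 0\ncounl = count + 1\nindex = 2\n"
print(near_duplicate_identifiers(SAMPLE))
```

Pairs such as loop counters i and j will also be reported, so the output is a list of places to look rather than a list of defects.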
30. Tools and Techniques (3)
- Be alert for variables that serve more than one
purpose or that are used inconsistently, as they
can mislead the reader.
- The effect of apparent bugs can be undone by an
inverse bug somewhere else.
- Be alert for literals that are conceptually
distinct but that happen to have the same values.
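A small Python illustration of the point about literals, using hypothetical constants. Both values are 7, but for unrelated reasons, so a reader who replaces every 7 with one shared name would couple two concepts that can change independently.

```python
# Hypothetical constants: equal today, conceptually unrelated.
DAYS_PER_WEEK = 7
MAX_LOGIN_ATTEMPTS = 7


def weeks_to_days(weeks):
    """Convert a number of weeks to days."""
    return weeks * DAYS_PER_WEEK


def is_locked_out(failed_attempts):
    """An account locks once the attempt limit is reached."""
    return failed_attempts >= MAX_LOGIN_ATTEMPTS


print(weeks_to_days(2), is_locked_out(6))
```

When reading old code that uses the bare literal in both places, each occurrence has to be traced back to the concept it stands for before it can be safely changed.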
31. Tools and Techniques (4)
- In languages that permit overloading be sure the
operator you think you have is really the one you
do have.
- Use program slicing
- Use an editor or browser to traverse the code
- Use file searching tools to find identifiers that
might be in several files.
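For the overloading advice, a rough Python sketch: given the left operand of a + expression, report which class supplies the __add__ method. It is a simplification, since Python may also consult the right operand's __radd__. The helper add_implementation and the Money and Price classes are hypothetical.

```python
def add_implementation(operand):
    """Name the class that supplies __add__ for this left operand, or None."""
    for cls in type(operand).__mro__:
        if "__add__" in vars(cls):
            return f"{cls.__qualname__}.__add__"
    return None


class Money:
    """Hypothetical class that overloads + to add amounts in cents."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


class Price(Money):
    """Inherits the overloaded + from Money."""


print(add_implementation(1), add_implementation(Price(5)))
```

In a language with static overload resolution the same question is answered by the compiler or the IDE's go-to-definition feature. The point is to check rather than assume.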
32. Tools and Techniques (5)
- In the absence of tools like a cross-reference
generator, such unlikely tools as spell checkers
can be useful for listing the identifiers of the
program.
- Traditional debugging techniques can be used to
read the code.
- Read programs with a structure chart,
cross-reference listing or summaries at hand.
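Where no cross-reference generator is available, a small one is easy to sketch. The Python example below uses the standard tokenize module to map each identifier in a piece of Python source to the lines where it appears. The function name cross_reference and the sample source are assumptions for illustration.

```python
import io
import keyword
import tokenize
from collections import defaultdict


def cross_reference(source):
    """Map each identifier in Python source to the sorted line numbers where it appears."""
    lines_by_name = defaultdict(set)
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type == tokenize.NAME and not keyword.iskeyword(token.string):
            lines_by_name[token.string].add(token.start[0])
    return {name: sorted(lines) for name, lines in sorted(lines_by_name.items())}


SAMPLE = "x = 1\ny = x + 1\nprint(y)\n"
for name, lines in cross_reference(SAMPLE).items():
    print(name, lines)
```

Unlike a plain text search, the tokenizer skips identifiers that only occur inside strings and comments.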
33. Software Reengineering Processes
- Understanding software
- Improving software
- Capturing, preserving and extending knowledge
about software
- Reverse Engineering
34. Reverse Engineering
- Process of analyzing a subject system to identify
the system's components and their
interrelationships and create representations of
the system in another form or at a higher level
of abstraction
- Need to understand other people's systems
- New people on the team, code reviews, etc.
- Two step process
- Extraction
- Abstraction
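The two steps can be sketched on a toy scale in Python. Extraction uses the standard ast module to pull out which functions call which. Abstraction then reduces the graph to its entry points, meaning functions that no other function in the module calls. The function names and the sample module are hypothetical, and only direct calls by plain name are followed.

```python
import ast


def extract_call_graph(source):
    """Extraction: map each function to the module-level functions it calls."""
    tree = ast.parse(source)
    functions = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    defined = {function.name for function in functions}
    graph = {}
    for function in functions:
        callees = {
            node.func.id
            for node in ast.walk(function)
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        }
        graph[function.name] = sorted(callees & defined)
    return graph


def entry_points(graph):
    """Abstraction: functions that nothing else in the module calls."""
    called = {callee for callees in graph.values() for callee in callees}
    return sorted(name for name in graph if name not in called)


SAMPLE = '''
def load():
    return [1, 2]

def format_row(row):
    return str(row)

def report(rows):
    return [format_row(r) for r in rows]

def main():
    return report(load())
'''

graph = extract_call_graph(SAMPLE)
print(graph)
print(entry_points(graph))
```

The call graph is a representation of the same system in another form. The list of entry points is one step higher in abstraction, since it hides everything except where control enters.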
35. Reverse Engineering (2)
- Three Step Process
- Information gathering
- Information organization
- Information navigation, analysis and presentation
- Analyzing the system
- Determine current components and their
dependencies
- Extract and create system abstractions and design
information
36. Reverse Engineering (3)
- Purpose is to facilitate
- Enhancement
- Correction
- Documentation
- Re-design
- Re-programming
37. Old Code
- Existing code which cannot be easily understood,
redesigned, modified, debugged or rewritten
- Attributes
- Design method used does not clearly communicate
the program structure, data and function
abstractions
- Language and/or techniques used do not quickly
and clearly communicate the program's structure,
interfaces, etc.
38. Old Code -- Attributes (2)
- Design and code are not organized to be insulated
from changes in hardware and external software
- Design was targeted for constraints that no
longer exist
- Code contains parts which are non-standard, or
unorthodox coding techniques were used
- Documentation is non-existent, incomplete, or not
current
39. Knowledge Needed
- Syntactic
- Understanding of the syntax of the language
- Understanding of programming semantics
- Semantic
- Understanding of language-independent rules and
definitions for the application data types and
processing
- General application area
- Domain-specific algorithms
40. Reverse Engineering and Design Recovery
- Precursor to Forward Engineering for
re-implementation
- Produce a reconstructed design that captures the
functionality of the system
- Reconstructed design can be transformed to
modernize it, restructure it, re-modularize it,
incorporate new requirements, etc.
41. Design Recovery
- Recreates design abstractions from a combination
of code, existing design documentation, personal
experience and general knowledge about problem
and application domains
- Applies to new development as well as maintenance and
reverse engineering
42. Design Recovery (2)
- Artifacts recovered have an abstraction hierarchy
- Application
- Concepts
- Business rules
- Policies
- Function
- Logical and functional specifications
- Non-functional specifications
- Structure
- Data flow
- Control flow
- Structure charts
- Software architecture
- Implementation
- Symbol tables
- Source text
43. Why Reverse Engineer
- Consider re-implementing
- Manually rewrite
- Use an automatic translator
- Redesign and re-implement
- Reverse engineer and re-implement
44. Reverse Engineering Procedure
- Pulling things together
- Levels of abstraction result in multiple points
of view
- Hierarchical set of models for each abstraction
- Examine the subsystems based on software
engineering principles
- Classes, modules, directories, coupling,
cohesion, data flow, control flow, slices
- Look for other matches
- Design and change patterns
- Business and technology models
- Function, system and application architectures
- Common services and infrastructure
45. Reverse Engineering Procedure (2)
- Collect information
- Examine information
- Develop plan for recovering and recording
information
- Extract structure
- Create set of structure charts
- Create a set of data structure diagrams
- Record functionality
- For each module, record processing in PDL
46. Reverse Engineering Procedure (3)
- Record data-flow
- Identify data transformations, DFD and PSPEC
- Record control-flow
- Identify high-level control only
- CFD and state diagrams
- Review recovered design
- Consistency and validity
- Generate documentation
47. Issues in Reverse Engineering
- Separate design information from implementation
information
- Traceability
- Record links between recovered information and
sources
- Domain Information
- Reengineering
- Change recovered design information
- Existing Documentation
48. Design Decisions Detected
- Composition/Decomposition
- Encapsulation/Interleaving
- Generalization/Specialization
- Representation
- Data/Procedure
- Non-deterministic relations
49. Composition/Decomposition
- How to arrive at pieces/parts
- Modules
- Data structures
- Top-down -- decompose
- Bottom-up -- compose
- Maps the relationship between abstract elements
and components
50. Encapsulation/Interleaving
- Encapsulation
- Gather parts into a component
- Behavior is restricted through interface
- Limits side effects during modification if
information hiding is used
- Interleaving
- Two or more plans performed in same code section
or data structure
- Efficient, but harder to understand and maintain
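A small Python contrast between the two decisions, with hypothetical function names. The first function interleaves two plans, summing and finding the maximum, in a single loop. The second version keeps each plan behind its own function.

```python
def interleaved(values):
    """Two plans woven into one loop: a single pass, but harder to pull apart."""
    total = 0
    largest = values[0]
    for value in values:
        total += value       # plan 1: accumulate the sum
        if value > largest:  # plan 2: track the maximum
            largest = value
    return total, largest


def total_of(values):
    """Plan 1 on its own."""
    total = 0
    for value in values:
        total += value
    return total


def largest_of(values):
    """Plan 2 on its own."""
    largest = values[0]
    for value in values:
        if value > largest:
            largest = value
    return largest


def encapsulated(values):
    """Same result with each plan kept separate: two passes over the data."""
    return total_of(values), largest_of(values)


print(interleaved([3, 9, 2]), encapsulated([3, 9, 2]))
```

Recognising that the single loop contains two independent plans is the design decision a reverse engineer has to detect before either plan can be changed safely.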
51. Generalization/Specialization
- Creating components based on their similarities
- Generalization -- higher level has fewer special
characteristics
- Generic functions
- Specialization -- higher level realizes special
cases -- OO
52. Representation
- Program serves as a model for application domain
- When efficiency was a concern these models were
represented by constructs close to the machine
- Could direct or control implementation decisions
53. Data and Procedure
- Variables -- introduced to
- Avoid recalculation
- Simplify the expression of a computation
- Must determine
- How variables are used
- What they represent
- Often substitute with calculation
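As an illustration in Python, the two hypothetical functions below compute the roots of a quadratic. The first introduces a variable so that the square root is calculated once. The second substitutes the calculation back in wherever the variable was used.

```python
import math


def roots_with_variable(a, b, c):
    """Variable introduced to avoid recalculating the square root."""
    root_of_discriminant = math.sqrt(b * b - 4 * a * c)  # computed once, used twice
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))


def roots_substituted(a, b, c):
    """Same computation with the variable replaced by its calculation."""
    return ((-b + math.sqrt(b * b - 4 * a * c)) / (2 * a),
            (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a))


print(roots_with_variable(1, -3, 2), roots_substituted(1, -3, 2))
```

Substituting the calculation for the variable shows what the variable represents, which is often the quickest way to decide whether it is a domain concept or only an implementation convenience.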
54. Non-deterministic Relations
- Logic languages -- Prolog
- Input/output parameters
- Select direction of the function
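Prolog relations such as list append can run in more than one direction depending on which parameters are bound. The Python generator below is only a rough imitation of that behaviour, written for illustration. The name append_relation and the use of None for an unbound parameter are assumptions.

```python
def append_relation(xs=None, ys=None, zs=None):
    """Yield every (xs, ys, zs) with xs + ys == zs consistent with the known arguments.

    None marks an unknown (output) parameter.
    """
    if xs is not None and ys is not None:
        joined = xs + ys  # forward direction: compute or check zs
        if zs is None or zs == joined:
            yield xs, ys, joined
    elif zs is not None:
        for split in range(len(zs) + 1):  # backward direction: enumerate splits
            left, right = zs[:split], zs[split:]
            if (xs is None or xs == left) and (ys is None or ys == right):
                yield left, right, zs
    else:
        raise ValueError("too few arguments are known to enumerate solutions")


print(list(append_relation(xs=[1], ys=[2, 3])))
print(list(append_relation(zs=[1, 2])))
```

When recovering a design from such code, each call site has to be examined to see which parameters act as inputs and which as outputs, since the definition alone does not say.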