Title: Maintaining large software systems
1Maintaining large software systems
- Dr L Bottaci
- Department of Computer Science
- University of Hull, Hull, UK
2Preface Module
- 20 Credits
- Syllabus topics
- Software maintenance practice
- Debugging
- Software maintenance management
3Preface Resources
- Course Materials
- Check undergraduate web site for this module.
- Lecture slides
- Make your own notes
- Course notes
- ACW description
- Read notice board and email
4Preface ACW
- Assessed course work
- Worth 100% of module assessment
- Software maintenance task
- Work individually
- Assessed on what you learn, as well as product
- ACW specification, on web page, will have details
5Preface Assessment
- Assessing learning, in order of importance:
- Evidence of learning in the logbook
- Student contribution to seminars and lab discussions
- Assessment of modified software
- Course is a safe environment for sensible risk taking
6Preface Reading
- Books
- Few books specifically on Maintenance
- See references in course notes
- Consult any good software engineering book, e.g.
- Pressman, Software Engineering, McGraw Hill, 2000
7What is Software Maintenance?
- Software does not change
- But the operating environment and the world do.
- Fix bugs
- Adapt to new operating environment
- Adapt to new requirements
8Maintenance as software engineering
- 50% (by cost) of maintenance is done to adapt to changed requirements
- 80% (by cost) of software engineering is software maintenance
9Maintenance costs
- Cost of change increases with time after design
- Reduce cost by planning for long term maintenance
- Planning and designing for maintenance increases development cost
- Commercial requirements are important; balance them with engineering requirements
10Syllabus topic 1 Software maintenance practice
- Software maintenance is not a theoretical subject
- Learning is change
- Learn by doing
- Learn by thinking
11Practical exercise outcomes
- Learn what is required to maintain software.
- Learn how to improve one's knowledge and skill.
- Lazy practice makes permanent
- Goal directed practice makes better
- Motivation and self confidence.
- Requires a rational assessment of one's abilities
12Maintenance task Brief intro
- Modify the jscript compiler (part of the .NET
sscli) to implement new requirements.
13Maintenance task overview
- Students are given a short review of compiler operation: scanner, parser, code generator
- Role of abstract syntax tree
- No other information is given; it is important that you learn to find things out for yourself
14Why use Rotor?
- Code large enough that it cannot be understood in its entirety
- Code contains very few comments
- Sufficiently readable for students to make progress in the relatively short time allocated for a module
- It is real code
15Finding out about the system
- Look for information about systems of the type you are examining
- If it is an object-oriented compiler, look for information about object-oriented compilers.
- Do not overlook journal articles and books. There is a lag between ideas appearing in the literature and their take-up in commercial products, so it may be necessary to search the literature that was published several years before the system was built.
16Finding out about the system
- Look for documentation, in the source code files themselves or in associated documentation files.
- Check if the producer of the code has documentation in addition to whatever is in the distribution you are examining.
- Has anyone else worked on or looked at the code? Do they have documentation or information? Can they be contacted?
17Finding out about the system
- Can static analysis tools be useful?
- A very simple and useful facility is the ability to search a set of files for a given string, e.g. the grep tool in UNIX, or find and findstr in Windows. A similar tool in VS is Ctrl-Shift-F.
- More sophisticated tools later
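The string search described above is simple enough to sketch directly. The following is a minimal illustration in Python, not a replacement for grep or findstr; the function name `search_files` and the default file extensions are assumptions made for this example.

```python
import os

def search_files(root, needle, extensions=(".cs", ".js")):
    """Yield (path, line_number, line) for every line under root that contains needle.

    The extensions default is an assumption for this sketch; pass whatever
    suffixes the code base under study actually uses.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the search going over oddly encoded files
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        yield path, number, line.rstrip("\n")
```

A typical use is `for path, number, line in search_files(".", "break"): print(path, number, line)`, which lists every occurrence with enough context to decide where to start reading.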
18Finding out about the system
- Can automatic documentation tools be useful?
- E.g. it is often useful to know which functions call a given function. A complete description is known as the call graph.
- Other kinds of graph: class hierarchies.
- In general, automatic documentation tools produce cross-referenced lists
- An example documentation tool is doxygen, available on the web.
19Finding out about the system
- Ultimately, read the code.
20Code reading skills
- Code reading should be goal-directed
- Reading to see what is there
- Trying to understand each line
- What are you expecting to find?
- Formulate an hypothesis
- Read the code to confirm or disprove it.
21Code reading illustration 1
- What is the following code doing?
    while (...) {
        ...
    }
- Hypothesise the most popular uses of a loop in general and look at the code for evidence.
22Code reading illustration 2
    while (...) {
        sum = sum + a[i];
        ...
    }
- Array accumulation is a likely hypothesis
23Code reading illustration 3
    while (...) {
        sum = sum + a[i];
        if (...)
            ...
        else
            ...
        ...
    }
- Hypotheses to explain the conditional inside a loop
24Code reading illustration 4
    while (...) {
        sum = sum + a[i];
        if (...)
            done = 1;
        else
            ...
        ...
    }
- Flag, is it for early termination?
25Code reading illustration 5
    while (i < 9 and done == 0) {
        sum = sum + a[i];
        if (...)
            done = 1;
        else
            ...
        ...
    }
26Initial lab exercises 1
- Read the Rotor (sscli) documentation
- Download and build the system
- Try the jscript compiler.
- Find, compile and execute a sample jscript program
- Test the system; save the log file to compare with future tests.
27Initial lab exercises 2
- Modify the jscript compiler to print a message before it compiles a file
- Rebuild the compiler and recompile the jscript sample.
- Rerun the tests and compare the output with the previous run of the tests.
28Further lab exercises
- Print each character in the file compiled by the jscript compiler.
- Print each token recognised by the jscript compiler.
29Practical Exercise
- Control moves to the next statement in a program unless there is a conditional statement or a transfer of control statement.
- The conditional statements are the if-statement, switch-statement, while-statement and for-statement.
- The transfer of control statements are return, break and continue.
- The task is to modify the compiler so that it produces a warning when it detects that a statement is unreachable, i.e. cannot be executed.
30Practical Exercise E.g.
    for (i = 0; i < n; i++) {
        x = y;
        break;
        y = 0;
    }
31Practical Exercise E.g.
    switch (e) {
        case 1:
            x = 0;
            break;
            x = 1;
        case 2:
            x = 1;
            break;
        default:
            x = 2;
    }
32Practical Exercise E.g.
33Practical Exercise stages
- Continue and extend the examples given to produce a list of test cases.
- Implementation plan with algorithm
- When the above two have been checked, proceed with implementation
34Example implementation plan: textual substitution
- Read jscript program source code, as a file of text, looking for keywords such as return, break, etc.
- Identify statements in jscript program source code by looking for substrings terminating in a semicolon.
35Example implementation plan: textual substitution, evaluation
- Read jscript program source code, as a file of text, looking for keywords such as return, break, etc.
- jsscanner already does this, a plus point for the plan
- Identify statements in jscript program source code by looking for substrings terminating in a semicolon.
- jsparser already does this, a plus point for the plan
36Example implementation plan: examine the MSIL
- Examine the MSIL produced by the jscript compiler to identify unreachable code.
- Could start by examining the MSIL for the simple source code examples given above.
37Example implementation plan: transform the AST
- Examine the AST produced by the jscript compiler
to identify transfer of control statements, etc.
38Example implementation plan: transform the AST, evaluation
- Examine the AST produced by the jscript compiler to identify control transfer statements.
- Traverse AST looking for a type of AST node
- Need a foreach-stmt to iterate over AST
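The AST plan can be tried out on a toy tree before touching the compiler. The sketch below is written in Python and uses a hypothetical `Node` class invented for this illustration; it is not the jscript compiler's AST, and it only applies the simplest rule: inside a block, any statement that follows a return, break or continue is unreachable.

```python
from dataclasses import dataclass, field
from typing import List

TRANSFER_KINDS = {"return", "break", "continue"}

@dataclass
class Node:
    """Hypothetical AST node, invented for this sketch."""
    kind: str                      # e.g. "assign", "break", "for", "block"
    label: str = ""                # source text, used only in the warning
    children: List["Node"] = field(default_factory=list)

def find_unreachable(node, warnings=None):
    """Return labels of statements that follow a transfer of control in the same block."""
    if warnings is None:
        warnings = []
    if node.kind == "block":
        transferred = False
        for child in node.children:
            if transferred:
                warnings.append(child.label)   # statement after return/break/continue
            elif child.kind in TRANSFER_KINDS:
                transferred = True
    for child in node.children:                # visit nested statements as well
        find_unreachable(child, warnings)
    return warnings

# The for-loop example: x = y; break; y = 0;
for_loop = Node("for", "for (...)", [
    Node("block", "", [
        Node("assign", "x = y"),
        Node("break", "break"),
        Node("assign", "y = 0"),
    ]),
])
```

Calling `find_unreachable(for_loop)` reports `y = 0`. The rule is deliberately incomplete: for instance, it does not notice that code after an if-statement is unreachable when both branches return, which is the kind of case the test list should contain.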
39Cost estimation: individual
- Necessary and frequent activity, usually implicit
- In practical work, cost estimation should be explicit so that it can be scrutinised and improved.
- Calculate estimate, record in logbook
- When estimate expires, review estimate
- Note how it can be improved
40Tools for navigating code
- Tools are available for extracting information from code.
- The simplest tools search files for strings, e.g. in VS Ctrl-Shift-F
- The most sophisticated tools are called reverse engineering tools
41Tools for navigating code
- The method call relationships for all methods are known as the call graph
- Can be constructed by a tool.
- Other graphs include class hierarchies.
- The sort of documentation produced by automatic documentation tools consists largely of cross-referenced lists.
- An example of such a documentation tool is doxygen, available on the web
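To make the idea of a call graph concrete, here is a small sketch that builds one with Python's standard `ast` module. It analyses Python source rather than the compiler's own language, purely as a stand-in; the function names `call_graph` and `callers_of` and the sample program are assumptions for this illustration, and only direct calls by plain name are recorded.

```python
import ast

def call_graph(source):
    """Map each function defined in source to the set of names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for func in ast.walk(tree):
        if isinstance(func, ast.FunctionDef):
            callees = graph.setdefault(func.name, set())
            for node in ast.walk(func):
                # Only calls of the form name(...) are recorded; obj.method(...) is skipped
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callees.add(node.func.id)
    return graph

def callers_of(graph, target):
    """Answer the question 'which functions call target?'."""
    return sorted(name for name, callees in graph.items() if target in callees)

SAMPLE = """
def scan(text):
    return text.split()

def parse(text):
    return build(scan(text))

def build(tokens):
    return list(tokens)

def compile_file(text):
    return parse(text)
"""
```

With this sample, `callers_of(call_graph(SAMPLE), "scan")` returns `['parse']`. A real documentation tool produces the same kind of cross-referenced list for a whole code base.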
42Tools for navigating code
- To answer more sophisticated queries, analysis of the program dependency graph is required.
- To use these tools it is necessary to understand the program dependency graph.
- The program dependency graph is actually a collection of graphs dealing with control and data dependency
43Example program
    1. i = 0
    2. sum = 0
    3. done = 0
    4. while (i < 9 and done == 0)
    5.     sum = sum + a[i]
    6.     if (sum > 8)
    7.         done = 1
           else
    8.         i = i + 1
    9. print(sum)
44Control Flow
- The nodes of the graphs are the statements in a program or collections of statements known as regions.
- A region may correspond to a basic block.
- The conditional nodes of a control flow graph are distinguished (typically shown as squares) from the statement nodes (typically shown as ellipses).
- Directed edges are the possible transitions between statements or basic blocks during program execution.
- The conditional transitions are associated with a branch predicate (labelled T or F).
- There is a distinguished start node and a distinguished exit node.
45Control Dependency
- The control dependency graph is derived from the control flow graph.
- When node Y is control dependent on node X, taking one branch at X will ensure that Y is reached, whereas Y may or may not be reached if the other branch is taken.
- As an example, consider nodes 4 and 5 in the control flow graph of the previous program.
46Control Dependency example
- There is a path from node 4 to node 5.
- Taking the true branch at 4 ensures that 5 is reached. This is not true if the other branch is taken.
- Node 5 is said to be control dependent on node 4.
- In contrast, node 9 is not control dependent on node 4 since either branch at node 4 will always lead to node 9.
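The control dependencies of the example program can be computed mechanically. The sketch below, in Python, is one standard way of doing so: compute post-dominators over the control flow graph, then apply the rule that Y depends on X when Y post-dominates a successor of X but does not strictly post-dominate X itself. The `CFG` table is a hand transcription of the nine-statement example (an assumption of this sketch), with an added `"exit"` node.

```python
# Control flow graph of the example program: node -> successor nodes.
# Node 4 is the while test (true -> 5, false -> 9); node 6 is the if test.
CFG = {
    1: [2], 2: [3], 3: [4],
    4: [5, 9],
    5: [6],
    6: [7, 8],
    7: [4], 8: [4],
    9: ["exit"],
    "exit": [],
}

def post_dominators(cfg, exit_node="exit"):
    """pdom[n] is the set of nodes that lie on every path from n to the exit."""
    nodes = set(cfg)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = {n} | set.intersection(*(pdom[s] for s in cfg[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom

def control_dependence(cfg, exit_node="exit"):
    """Map each branching node X to the set of nodes that are control dependent on it."""
    pdom = post_dominators(cfg, exit_node)
    depends_on = {}
    for x, successors in cfg.items():
        if len(successors) < 2:
            continue                       # only branch nodes can control others
        for s in successors:
            for y in pdom[s]:
                # y is reached for certain via s, but not for certain from x
                if y == x or y not in pdom[x]:
                    depends_on.setdefault(x, set()).add(y)
    return depends_on
```

For the example, `control_dependence(CFG)` gives `{4: {4, 5, 6}, 6: {7, 8}}`: node 5 depends on node 4 and node 9 does not, as argued above. Under this formulation the loop test at node 4 also depends on itself, because taking the true branch guarantees that the test is evaluated again.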
47Data Dependency
- Data dependency exists between two nodes if the meaning of the program may change when the order of the two nodes is reversed.
- Different kinds of data dependency.
- Flow dependency
- Def-order dependency
48Flow Dependency
- Flow dependency exists from X to Y if
- a variable v is defined (the value of v is set) at X and used at Y, and
- there is a path in the control flow graph from X to Y without an intervening definition of v.
- In other words, the definition at X may directly determine the value of v at Y.
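The definition translates almost directly into a graph search. The Python sketch below follows every path from a defining node X and stops at any node that redefines the variable. The `CFG`, `DEFS` and `USES` tables are a hand reading of the nine-statement example program and are assumptions of this sketch, not output from any tool.

```python
# Control flow graph of the example program: node -> successor nodes.
CFG = {1: [2], 2: [3], 3: [4], 4: [5, 9], 5: [6], 6: [7, 8], 7: [4], 8: [4], 9: []}

# Variables defined (assigned) and used (read) at each node.
DEFS = {1: {"i"}, 2: {"sum"}, 3: {"done"}, 5: {"sum"}, 7: {"done"}, 8: {"i"}}
USES = {4: {"i", "done"}, 5: {"sum", "a", "i"}, 6: {"sum"}, 8: {"i"}, 9: {"sum"}}

def flow_dependencies(cfg, defs, uses):
    """Return the set of (X, v, Y): the definition of v at X may reach the use of v at Y."""
    deps = set()
    for x, variables in defs.items():
        for v in variables:
            seen = set()
            worklist = list(cfg[x])
            while worklist:
                y = worklist.pop()
                if y in seen:
                    continue
                seen.add(y)
                if v in uses.get(y, ()):
                    deps.add((x, v, y))
                if v in defs.get(y, ()):
                    continue               # v is redefined at y, so paths through y are blocked
                worklist.extend(cfg[y])
    return deps
```

For example, the result contains `(3, "done", 4)` and `(7, "done", 4)`, since both assignments to done can reach the loop test, but not `(2, "sum", 6)`, because every path from node 2 to node 6 passes through the redefinition of sum at node 5.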
49Def-order Dependency
- Def-order dependency exists from X to Y if
- both nodes define the same variable v,
- X and Y are in the same branch of any conditional that contains both X and Y,
- there is a node Z that is flow dependent on X and Y, and
- X is to the left of Y in the abstract syntax tree.
50Def-order Dependency example
- An example of def-order dependency is present
between node 3 and node 7 in the previous example
program
51Program slices
- A program slice is a subset of the statements in a program that are relevant to some criterion, usually the value of a variable at a given statement.
- This case is called a backward slice.
- The forward slice is also useful, i.e. all the statements possibly affected by the value assigned at a particular statement.
- CodeSurfer from GrammaTech is a code analysis tool (C code only) based on the program dependency graph.
- The web site provides technical papers as well as an overview of the capabilities of the tool
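Once the dependency edges are known, a slice is a reachability computation. The Python sketch below is a simplification: it uses only flow and control dependency edges (def-order edges are left out), and the two edge lists are derived by hand from the nine-statement example program, so they are assumptions of this sketch. It is not how CodeSurfer works internally.

```python
# (X, Y) means "Y depends on X" in the example program.
FLOW_EDGES = [
    (1, 4), (1, 5), (1, 8), (2, 5), (2, 9), (3, 4), (5, 5),
    (5, 6), (5, 9), (7, 4), (8, 4), (8, 5), (8, 8),
]
CONTROL_EDGES = [(4, 4), (4, 5), (4, 6), (6, 7), (6, 8)]

def slice_from(edges, start, backward=True):
    """Return the sorted nodes reachable from start, following edges backwards or forwards."""
    step = {}
    for src, dst in edges:
        key, value = (dst, src) if backward else (src, dst)
        step.setdefault(key, set()).add(value)
    result, worklist = {start}, [start]
    while worklist:
        node = worklist.pop()
        for nxt in step.get(node, ()):
            if nxt not in result:
                result.add(nxt)
                worklist.append(nxt)
    return sorted(result)

EDGES = FLOW_EDGES + CONTROL_EDGES
```

The backward slice for the value of i at node 8, `slice_from(EDGES, 8)`, is nodes 1 to 8: the final print is not relevant to i. The forward slice from node 2, `slice_from(EDGES, 2, backward=False)`, is `[2, 4, 5, 6, 7, 8, 9]`: the initialisations of i and done are unaffected by the initial value of sum.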
52Debugging
- It is much better to spend time when first writing code to ensure it is correct than to spend time debugging incorrect code.
- Many programmers think the opposite is true.
53Software inspection exercise
    //REMOVE ELEMENT FROM a AT i2 IF i2 VALID
    //INSERT elem at i1 IN a IF i1 VALID
    //count IS LIMIT OF OCCUPIED PART OF a
    int i = 0;
    if (i1 >= 0 && i1 < count) {
        for (i = count; i > i1; i--) {
            a[i] = a[i - 1];
        }
        a[i - 1] = elem;
    }
    if (i2 >= 0 && i2 < count) {
        for (i = i2; i < count; i++) {
            a[i] = a[i + 1];
        }
    }
54Software inspection exercise
    //ONLY THE FIRST count ELEMENTS OF ARRAY a ARE EVER OCCUPIED
    //WHEN count EQUALS THE LENGTH OF a NO MORE ELEMENTS MAY BE ADDED
    //INSERT elem INTO NONFULL ARRAY a AT indexIn PROVIDING indexIn < count
    //REMOVE EXISTING ELEMENT FROM ARRAY a AT INDEX indexOut
    //OTHERWISE, a REMAINS UNCHANGED
    if (a.Length > count && indexIn >= 0 && indexIn < count) { //INSERT
        int i = 0;
        for (i = count; i > indexIn; i--) {
            a[i] = a[i - 1];
        }
        a[indexIn] = elem;
        count = count + 1;
    }
    if (indexOut >= 0 && indexOut < count) { //REMOVE
        int i = 0;
        for (i = indexOut; i < count - 1; i++) {
            a[i] = a[i + 1];
        }
        count = count - 1;
    }
55Debugging
- Careful code design and debugging are not of equal productivity cost.
- Over the long term, an extra day designing a program is worth more than a day of debugging saved.
- Programmers expect to improve with experience.
- Experience in careful program design is more valuable than debugging experience.
- What is learnt during a day spent debugging is rarely applicable to another program.
56Debugging
- Defensive programming is an effective way of avoiding debugging.
- Handle exceptions as close as possible to where they may occur.
- It is important to distinguish between
- a run-time condition that can and should be handled, e.g. an invalid input which may be cleared and read again, and
- a run-time condition that represents a failed pre-condition that invalidates the entire program so that recovery is not possible.
- An assertion can be used to test for a pre-condition at run-time
- Debug.Assert(n > 0); //PROGRAM INVALID
- If the condition is true, no action occurs, but if it fails while executing under the debugger, the program enters break mode
57Debugging Assert()
- In C#, there is an Assert method in the Debug class and in the Trace class
- To use Assert, the file must include the directives
- #define TRACE or
- #define DEBUG
- For efficiency, Debug methods are not included in the release version
- Never put error handling code in a Debug assertion.
58Debugging Assert()
- It is also essential that the code that computes the required assert condition does not produce side effects.
- It is not good programming practice for any condition to produce a side effect.
- Trace assertions are retained in the release version.
- Assert takes up to three arguments.
- The first is mandatory and is the condition to check.
- The remaining two arguments are expressions that evaluate to strings that are printed when the condition fails
59Debugging Assert()
- As a rule, assert all the pre-conditions for the
arguments of each non-trivial function or method.
- For each method, assert separate conditions
separately so that when a condition fails it will
be clear which it is.
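The rule of asserting each pre-condition separately carries over to other languages. As a rough analogue of Debug.Assert, the sketch below uses Python's `assert` statement, which is likewise removed when the interpreter runs with the `-O` option; the function `mean_of_first` is a hypothetical example invented for this illustration.

```python
def mean_of_first(values, n):
    """Return the mean of the first n elements of values."""
    # One assertion per pre-condition, so a failure names the condition that broke.
    assert n > 0, "n must be positive"
    assert n <= len(values), "n must not exceed the number of values"
    total = 0
    for i in range(n):
        total = total + values[i]
    return total / n
```

Calling `mean_of_first([2, 4, 6], 0)` fails with the message "n must be positive", whereas a single combined assertion would leave the reader to work out which half had failed. As with Debug assertions, the conditions have no side effects and no error handling is placed inside them.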
60Simple test harness
- DOS batch file run.bat
    DEL run.out
    DATE /T >> run.out
    TIME /T >> run.out
    for %%f in (
        file0
        file1
    ) do call runaux %%f
61Simple test harness
- DOS batch file runaux.bat
    set CLIX=C:\rotor\sscli20\binaries.x86chk.rotor
    set JSC=C:\rotor\sscli20\binaries.x86chk.rotor
    set PROGNAME=%1
    TYPE %PROGNAME%.js >> run.out
    %CLIX% %JSC% %PROGNAME%.js > %PROGNAME%.out
    FC /L /N %PROGNAME%.out %PROGNAME%.rqd >> run.out
    REM DEL %PROGNAME%.out
62Software Maintenance management
- Any activity that consumes considerable resources requires good management.
- Maintenance planning should be done at the same time as the planning of the system development.
63Software Maintenance management
- Maintenance plan should include
- the maintenance goals,
- maintenance management
- maintenance processes,
- hardware and software platforms, tools
- personnel, training
- and budget.
64Maintenance process models
- Any activity that is to be managed must first be described and understood.
- A life cycle model describes activities as phases in a process.
65Maintenance process models
- Taute maintenance model (1983, see notes)
- Request phase,
- the requested change is identified and logged.
- Identification includes a check that the request
actually is a modification, that it has not
already been submitted by some other user under a
different id perhaps, etc.
66Maintenance process models
- Estimate phase,
- how much will the change cost to implement?
- What are the implications of the change?
- Why is the change required?
- Is this change in the best interest of the supplier-customer relationship?
- What other changes are likely to be required?
- It is necessary for both the supplier and customer to have a clear idea of the aim of the system.
67Maintenance process models
- Estimate phase requires detailed knowledge of the software system
- E.g. the following anecdote told by David Parnas.
- When code is to be rewritten there is the issue of whether to preserve long standing bugs or features.
- Consider the conversion of an old unstructured code fragment that displays the altimeter reading in an aircraft cockpit.
68Maintenance process models
    if not canread(alt1) goto l1
    display(alt1)
    goto l3
    l1: if not canread(alt2) goto l2
    display(alt2)
    goto l3
    l2: display(3000)
    l3:
69Maintenance process models
- Convert to a modern structured code fragment
    if canread(alt1)
        display(alt1)
    else if canread(alt2)
        display(alt2)
    else display(3000)
- Is conversion correct?
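One part of the question can be checked mechanically: whether the two fragments display the same value for every combination of readable and unreadable altimeters. The Python sketch below simulates the goto version with an explicit label variable and compares it with the structured version; the function names and the sample readings are assumptions made for this illustration.

```python
from itertools import product

def display_unstructured(can1, can2, alt1, alt2):
    """Simulate the goto fragment; label plays the role of the program counter."""
    shown = []
    label = "start"
    while label != "l3":
        if label == "start":
            if not can1:
                label = "l1"
                continue
            shown.append(alt1)
            label = "l3"
        elif label == "l1":
            if not can2:
                label = "l2"
                continue
            shown.append(alt2)
            label = "l3"
        elif label == "l2":
            shown.append(3000)
            label = "l3"
    return shown

def display_structured(can1, can2, alt1, alt2):
    """The if / else-if / else rewrite of the same fragment."""
    shown = []
    if can1:
        shown.append(alt1)
    elif can2:
        shown.append(alt2)
    else:
        shown.append(3000)
    return shown

def same_behaviour(alt1=1200, alt2=1250):
    """Compare the two versions over all four readable/unreadable combinations."""
    return all(
        display_unstructured(c1, c2, alt1, alt2) == display_structured(c1, c2, alt1, alt2)
        for c1, c2 in product([False, True], repeat=2)
    )
```

`same_behaviour()` returns `True`, so the rewrite preserves what is displayed. What such a check cannot say is whether displaying 3000 was ever the right thing to do, which is a question about the requirements rather than about the code.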
70Maintenance process models
- The 3000 value is displayed when neither altimeter can be read.
- Why was 3000 used?
- Should the 3000 be changed to 'error' or 'pull up'?
- Can the display show only digits?
- The importance, in software maintenance, of understanding the requirements should be obvious
71Maintenance process models
- Schedule phase, when is the change to be implemented and released?
- Programming phase, new release version is created and code modified.
- Test phase, the new release is tested. This may require modifying existing tests or writing new ones.
- Documentation phase, existing documentation is modified.
- Release phase, new release given to some users for acceptance testing.
- Operational phase, new release delivered to all users.
72Maintenance process models
- A more detailed model is the IEEE maintenance process model as described in the IEEE 1219-1998 standard.
- Phases are similar to those in the Taute maintenance model but each phase is described in terms of four aspects.
- process (what is done),
- input to process,
- output of process,
- control (how is the process controlled and output checked?).
73Configuration management
- Configuration management is the administration of changes to a product and versions of a product.
- A product may be a plan, a specification, a design, some code, test data, etc.
- The configuration control board considers the various modification requests, their utility and estimated cost.
- These requests are considered in the light of the overall strategy for the system under maintenance.
74Configuration management
- Changes are made with respect to a baseline product or version.
- After a system has undergone a number of changes that, ideally, form a logically coherent unit, the system is said to be in a different version.
- The collection of changes that defines a version may, for example, be all those that allow the system to operate on a new platform.
- Sometimes the changes that define a new version have little in common and happen to be those changes ready at the scheduled six month release date.
75Configuration management
- It is important to know the construction history of the various versions, i.e.
- which version was modified to produce which version.
- Each change definition, i.e. each code change, must be accompanied by associated information,
- the author,
- the reason for the change (which should be traceable back to a modification of the requirements),
- the date,
- authorisation, etc.
76Configuration management
- If only the latest version is ever changed then the derived-from relation produces a sequence of versions.
- This is the result, for example, of producing backups only.
- When some version earlier than the latest version is modified, a new branch of the tree is formed.
- In general, versions form a directed graph rather than a tree since different versions may merge.
77Configuration management
- A merged version contains the changes of both its parents.
- Clearly, if changes to the same line conflict then the user must resolve the conflict,
- the user chooses which change to accept.
- Clearly, merging must be done with care.
78Configuration management
- A product is decomposed into modules, files or assemblies.
- Configuration management systems will allow changes to a version to be made on a module by module basis.
- Configuration management systems are essential when there are a number of programmers working on the same system.
79Configuration management
- To make a change to a file which is part of some version, the file is first checked out.
- Checking out a file ensures that whoever is requesting a file has permission to change that file.
- In this case the file is said to be locked.
- Different locking schemes are possible.
- Pessimistic locking allows a single write permission but multiple read permissions.
- Optimistic locking allows multiple write permissions and provides some mechanism for resolving overwrite clashes. For example, the first write may cause all other writers to be notified at which point there is the option to check out the updated file.
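Optimistic locking can be illustrated with a version counter. The Python sketch below is one possible mechanism, not the one used by any particular configuration management system: instead of notifying other writers when the first write happens, it rejects a later check-in that was based on an out-of-date version. The class and method names are assumptions made for this illustration.

```python
class StaleCheckout(Exception):
    """Raised when a check-in is based on a version that has since been replaced."""

class VersionedFile:
    def __init__(self, content):
        self.content = content
        self.version = 1

    def check_out(self):
        """Anyone may check out; no lock is taken."""
        return self.content, self.version

    def check_in(self, new_content, based_on_version):
        """Accept the write only if nobody else has checked in since the check-out."""
        if based_on_version != self.version:
            raise StaleCheckout(
                "file is now at version %d, check-out was version %d"
                % (self.version, based_on_version)
            )
        self.content = new_content
        self.version += 1
        return self.version
```

If two programmers check out version 1, the first check-in succeeds and produces version 2; the second raises `StaleCheckout`, at which point that programmer checks out the updated file, reapplies the change and checks in again. Under pessimistic locking the second programmer would not have obtained write permission in the first place.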