Title: COMS W4156: Advanced Software Engineering
1COMS W4156 Advanced Software Engineering
- Prof. Gail Kaiser
- Kaiser4156_at_cs.columbia.edu
- http://bank.cs.columbia.edu/classes/cs4156/
2Topics covered in this lecture
- Software Quality
- Refactoring
- Verification and Validation
3Software Quality
4Quality is Hard to Pin Down
- Concise, clear definition is elusive
- Not easily quantifiable
- Many things to many people
- You'll know it when you see it
- Often defined as a set of "-ilities" (attributes)
5Good Quality Software Has (according to Robert Glass)
- Understandability
- The ability of a reader of the software to understand its function
- Critical for maintenance
- Modifiability
- The ability of the software to be changed by that reader
- Almost defines "maintainability"
6Good Quality Software Has (according to Robert Glass)
- Reliability
- The ability of the software to perform as intended without failure
- If it isn't reliable, the maintainer must fix it
- Efficiency
- The ability of the software to operate with minimal use of time and space resources
- If it isn't efficient, the maintainer must improve it
7Good Quality Software Has (according to Robert Glass)
- Portability
- The ease with which the software can be made useful in another environment
- Porting is usually done by the maintainer
- Testability
- The ability of the software to be tested easily
- Finding/fixing bugs is part of maintenance
- Enhancements/additions must also be tested
8Good Quality Software Has (according to Robert Glass)
- Usability
- The ability of the software to be easily used (human factors)
- Software that is not easy to use means more support calls, enhancements, and corrections
- Notice that all of these relate to maintenance, but the qualities need to be instilled during development
9Approaches to Achieving Quality
- Continuous refactoring
- Verification and validation
- Buy rather than build
- Open source software
- Software process and process improvement
10Refactoring
11What is Refactoring?
- The process of changing the source code of a software system such that
- The external (observable) behavior of the system does not change, e.g., functional and extra-functional requirements are maintained
- But the internal structure of the system is improved
12How improved?
- Maintainability!
- Easier to read and understand
- Easier to (further) modify
- Easier to integrate
- Easier to test
13Simple Example: Consolidate Duplicate Conditional Fragments
- This
    if (isSpecialDeal()) {
        total = price * 0.95;
        send();
    } else {
        total = price * 0.98;
        send();
    }
- Becomes this
    if (isSpecialDeal())
        total = price * 0.95;
    else
        total = price * 0.98;
    send();
14Why is it called Refactoring?
- By analogy to the factorization of polynomials
- For example, x^2 - x - 2 can be factored as (x + 1)(x - 2), revealing an internal structure that was previously not visible (two roots at -1 and 2)
- Similarly, in software refactoring, the change in visible structure can often reveal the "hidden" internal structure of the original code
15Refactoring Process
- A disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior
- Series of small behavior-preserving transformations
- Each transformation does little, but a sequence of transformations can produce a significant restructuring
- Since each refactoring is small, it's less likely to go wrong
- The system is also kept fully working after each small refactoring (via regression testing), reducing the chances that a system can get seriously broken during the restructuring
16Example: Extract Method
- You have a code fragment that can be grouped together
- Turn the fragment into a method whose name explains the purpose of the fragment
    void printOwing(double amount) {
        printBanner();
        // print details
        System.out.println("name: " + _name);
        System.out.println("amount: " + amount);
    }

    void printOwing(double amount) {
        printBanner();
        printDetails(amount);
    }

    void printDetails(double amount) {
        System.out.println("name: " + _name);
        System.out.println("amount: " + amount);
    }
17Example: Replace Temp with Query
- You are using a temporary variable to hold the result of an expression
- Extract the expression into a method
- Replace all references to the temp with the method call
- The new method can then be used in other methods
    double basePrice = _quantity * _itemPrice;
    if (basePrice > 1000)
        return basePrice * 0.95;
    else
        return basePrice * 0.98;

    if (basePrice() > 1000)
        return basePrice() * 0.95;
    else
        return basePrice() * 0.98;

    double basePrice() {
        return _quantity * _itemPrice;
    }
18Example: Introduce Null Object
- Repeated checks for a null value
- Replace the null value with a null object
    if (customer == null) name = "occupant";
    else name = customer.getName();
    if (customer == null) ...

    public class nullCustomer {
        public String getName() { return "occupant"; }
    }

    customer.getName();
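The Null Object idea above can be sketched as a small self-contained program. This is only an illustration under stated assumptions: the `Customer` interface, `RealCustomer`, `NullCustomer`, and `Site` are hypothetical names, and keeping one null check inside the `Site` constructor is one possible design, not something the slide prescribes.

```java
// Hypothetical sketch: all class names here are illustrative assumptions.
interface Customer {
    String getName();
}

class RealCustomer implements Customer {
    private final String name;

    RealCustomer(String name) { this.name = name; }

    public String getName() { return name; }
}

// The null object: stands in for "no customer" and supplies the default name.
class NullCustomer implements Customer {
    public String getName() { return "occupant"; }
}

class Site {
    private final Customer customer;

    // The only remaining null check sits at the boundary where a customer enters.
    Site(Customer customer) {
        this.customer = (customer == null) ? new NullCustomer() : customer;
    }

    Customer getCustomer() { return customer; }
}

class NullObjectDemo {
    public static void main(String[] args) {
        // Callers no longer test for null before asking for the name.
        System.out.println(new Site(null).getCustomer().getName());
        System.out.println(new Site(new RealCustomer("Ada")).getCustomer().getName());
    }
}
```

Client code calls `getCustomer().getName()` unconditionally, so the repeated `customer == null` tests disappear from the callers.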
19Example: Exploit Polymorphism
- Generally, polymorphism is the ability to appear in many forms
- In OO, polymorphism refers to a programming language's ability to process objects differently depending on their data type or class
- More specifically, it is the ability to redefine methods for derived classes (subclasses)
20Example
    double getSpeed() {
        switch (_type) {
            case EUROPEAN:
                return getBaseSpeed();
            case US:
                return getBaseSpeed() / 1.6;
            case BRITISH:
                if (getDate() < new Year(1990))
                    return getBaseSpeed() / 1.6;
                else return getBaseSpeed();
        }
        throw new RuntimeException("Should be unreachable");
    }
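One way the switch on `_type` might be replaced by polymorphism is sketched below. The slide does not say what kind of object owns `getSpeed()`, so `Vehicle` and its subclasses are hypothetical names, and the date comparison is simplified to a plain `int` year as an assumption.

```java
// Hypothetical sketch: Vehicle and its subclasses are illustrative names,
// and the year is modelled as an int rather than the slide's Year type.
abstract class Vehicle {
    private final double baseSpeed;

    Vehicle(double baseSpeed) { this.baseSpeed = baseSpeed; }

    double getBaseSpeed() { return baseSpeed; }

    // Each subclass supplies the rule that used to be one case of the switch.
    abstract double getSpeed();
}

class EuropeanVehicle extends Vehicle {
    EuropeanVehicle(double baseSpeed) { super(baseSpeed); }

    @Override
    double getSpeed() { return getBaseSpeed(); }
}

class UsVehicle extends Vehicle {
    UsVehicle(double baseSpeed) { super(baseSpeed); }

    @Override
    double getSpeed() { return getBaseSpeed() / 1.6; }
}

class BritishVehicle extends Vehicle {
    private final int year;

    BritishVehicle(double baseSpeed, int year) {
        super(baseSpeed);
        this.year = year;
    }

    @Override
    double getSpeed() {
        return (year < 1990) ? getBaseSpeed() / 1.6 : getBaseSpeed();
    }
}

class PolymorphismDemo {
    public static void main(String[] args) {
        Vehicle[] fleet = {
            new EuropeanVehicle(160), new UsVehicle(160), new BritishVehicle(160, 1985)
        };
        // The caller never inspects a type code; dispatch picks the right rule.
        for (Vehicle v : fleet) {
            System.out.println(v.getSpeed());
        }
    }
}
```

Adding a new kind of vehicle now means adding a subclass instead of editing every switch on the type code, and the "Should be unreachable" fallback is no longer needed.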
21Refactoring is Incremental Redesign
- The idea behind refactoring is to acknowledge that it will be difficult to get a design right the first time
- And as a program's requirements change, the design may need to change
- It is notoriously difficult (impossible?) to design for all possible changes a priori
- And as agile programming proponents say, "You ain't gonna need it", but what if later you do?
- Refactoring provides techniques for evolving the design in small incremental steps
22Refactoring Benefits
- Often code size is reduced after refactoring
- Confusing structures are transformed into simpler structures, which are easier to maintain (and often easier to unit test)
- Promotes a deeper understanding of the code, which aids in finding bugs and anticipating potential bugs
23Contrast with Performance Optimization
- Again functionality is not changed, only internal structure
- However, performance optimizations often involve making code harder to understand (but faster!)
- Use more efficient but more complicated algorithms and data structures
- Lose generality to address specific issues of the implemented solution
- Use profiling tools to determine the 10-20% of the code requiring 80-90% of the CPU cycles: optimize that code, refactor all the other code
24When to Refactor?
- When you add new functionality
- Do it before you add the new function, to make it easier to add the function
- Or do it after you add the function, to clean up the code including that function
- When you need to fix a bug
- As you do a code review
- Whenever
25Why to Refactor?
- General goal is maintainability
- Enhance clarity, understandability, modifiability, integratability, testability
- Very often refactoring is about
- Increasing cohesion
- Decreasing coupling
26Cohesion and Coupling
- Cohesion is a property or characteristic of an individual unit
- Coupling is a property of a collection of units
- High cohesion GOOD, high coupling BAD
- Design for change
- You don't want a change in one unit (component, class, method) to cause errors to ripple throughout your system
- Make units highly cohesive, seek low coupling among units
27What to Refactor?
- Duplicated Code
- Bad because if you modify one instance of duplicated code but not all the others, you (may) have introduced a bug!
- Switch Statements
- Often duplicated in code, can typically be replaced by use of polymorphism (in OO languages)
28What to Refactor?
- Long Method
- More difficult to understand
- Performance concerns with respect to lots of short methods are largely obsolete
- Long Parameter List
- Hard to understand, can become inconsistent
- Large Class
- Trying to do too much, which reduces cohesion
29What to Refactor?
- Divergent Change
- One type of change requires changing one subset of methods in the module, another type of change requires changing another subset
- Shotgun Surgery
- A change requires lots of little changes in a lot of different classes
- Parallel Inheritance Hierarchies
- Each time you add a subclass to one hierarchy, you need to do it for all related hierarchies
30What to Refactor?
- Lazy Class
- A class that no longer pays its way, e.g., a class that was downsized by previous refactoring, or represented planned functionality that did not pan out
- Middle Man
- If a class is delegating more than half of its responsibilities to another class, do you really need it?
31What to Refactor?
- Speculative Generality
- "Oh, I think we need the ability to do this kind of thing someday"
- Alternative Classes with Different Interfaces
- Two or more methods do the same thing but have different signatures for what they do
32What to Refactor?
- Primitive Obsession
- Characterized by a reluctance to use classes instead of primitive data types
- Temporary Field
- An attribute of an object is only set in certain circumstances, but an object should need all of its attributes
33What to Refactor?
- Feature Envy
- A method requires lots of information from some other class
- Data Clumps
- Attributes (e.g., method parameters) that clump together but are not part of the same class
34What to Refactor?
- Message Chains
- A client asks an object for another object and then asks that object for another object, etc.
- getA().getB().getC().getD().getE().doSomething()
- Bad because client depends on the structure of the navigation
- Inappropriate Intimacy
- Pairs of classes that know too much about each other's private details
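A message chain is often shortened by adding a delegating method (the catalog refactoring usually called Hide Delegate). The sketch below is a minimal illustration; `Person`, `Department`, and the method names are hypothetical and do not come from the slide.

```java
// Hypothetical sketch: Person and Department are illustrative names.
class Department {
    private final Person manager;

    Department(Person manager) { this.manager = manager; }

    Person getManager() { return manager; }
}

class Person {
    private final String name;
    private Department department;

    Person(String name) { this.name = name; }

    String getName() { return name; }

    void setDepartment(Department department) { this.department = department; }

    Department getDepartment() { return department; }

    // Delegating method: clients ask the person directly and no longer
    // need to know that the manager is reached through a Department.
    Person getManager() { return department.getManager(); }
}

class MessageChainDemo {
    public static void main(String[] args) {
        Person boss = new Person("Grace");
        Person employee = new Person("Alan");
        employee.setDepartment(new Department(boss));

        // Chain: the client depends on the navigation structure.
        System.out.println(employee.getDepartment().getManager().getName());
        // After hiding the delegate: one hop, structure stays private to Person.
        System.out.println(employee.getManager().getName());
    }
}
```

There is a trade-off: if a class ends up with many such forwarding methods it starts to look like the Middle Man smell listed earlier, so the delegate is worth hiding only where clients really should not depend on the navigation path.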
35What to Refactor?
- Data Class
- Classes that have fields, getting and setting methods for the fields, and nothing else
- They are data holders, but objects should be about data and behavior (with some exceptions, e.g., entity beans)
- Refused Bequest
- A subclass ignores most of the functionality provided by its superclass
36What to Refactor?
- Incomplete Library Class
- An infrastructure class doesn't do everything you need
- Comments (!)
- Comments are sometimes used to decorate bad code
- /* This is a gross hack */
37But Refactoring can be Dangerous
- If programmers spend time cleaning up the code, then that's less time spent implementing required functionality, and the schedule is slipping as it is!
- Refactoring can break code that previously worked
- Refactoring needs to be systematic, incremental, and safe
38How to Make Refactoring Safe?
- Use refactoring patterns
- Catalog at http://www.refactoring.com/catalog/index.html
- Mostly taken from Fowler's book http://martinfowler.com/books.html#refactoring
- Use refactoring tools
- Long list at http://www.refactoring.com/tools.html
- Test constantly!
- Regression testing
39Regression Testing After Changes
- Can be unit tests or a combination of unit and integration tests
- Four possible outcomes of a change
- Change is successful, and no new errors are introduced
- Change does not work as intended, and no new errors are introduced
- Change is successful, but at least one new error is introduced
- Change does not work, and at least one new error is introduced
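A regression check after a refactoring can be as simple as running the old and new versions on the same inputs and comparing results. The sketch below applies this to the Replace Temp with Query example from earlier in the deck; `OrderBefore`, `OrderAfter`, and `RegressionCheck` are hypothetical names, and in practice the "before" behavior would normally be captured as expected values in an existing test suite, not kept as a second class.

```java
// Hypothetical sketch: class names are illustrative assumptions.
class OrderBefore {
    private final int quantity;
    private final double itemPrice;

    OrderBefore(int quantity, double itemPrice) {
        this.quantity = quantity;
        this.itemPrice = itemPrice;
    }

    double getPrice() {
        double basePrice = quantity * itemPrice;   // temporary variable
        if (basePrice > 1000) return basePrice * 0.95;
        else return basePrice * 0.98;
    }
}

class OrderAfter {
    private final int quantity;
    private final double itemPrice;

    OrderAfter(int quantity, double itemPrice) {
        this.quantity = quantity;
        this.itemPrice = itemPrice;
    }

    double getPrice() {
        if (basePrice() > 1000) return basePrice() * 0.95;
        else return basePrice() * 0.98;
    }

    // The temp has been replaced by a query method.
    private double basePrice() { return quantity * itemPrice; }
}

class RegressionCheck {
    // Returns true when both versions agree on every input.
    static boolean sameBehavior(int[] quantities, double itemPrice) {
        for (int q : quantities) {
            double expected = new OrderBefore(q, itemPrice).getPrice();
            double actual = new OrderAfter(q, itemPrice).getPrice();
            if (Double.compare(expected, actual) != 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Inputs chosen to fall below, at, and above the 1000 threshold.
        int[] quantities = {0, 1, 10, 11, 100};
        System.out.println(sameBehavior(quantities, 100.0));
    }
}
```

A `false` result would correspond to the "at least one new error is introduced" outcomes above, and the small size of each refactoring step makes the offending change easy to locate.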
40Other Difficulties with Refactoring
- Some refactorings require that interfaces be changed
- If you own all the calling code, need to change everywhere the interface is used
- If not, the interface is published and can't change (or shouldn't)
- Business applications are often tightly coupled to underlying database schemas
- Virtually impossible to reorganize a database schema unless the underlying database automates the corresponding table/row/column transformations (or your database is empty)
41Other Difficulties with Refactoring
- Dealing with hardware devices is worse than databases and other external software interfaces
- Software can change, the hardware (usually) cannot
- Real-time or other timing-dependent applications
- Refactored code will not necessarily run within previous time bounds
42Summary
- Refactor often
- Refactor as you go
- Simplest version of refactoring: add comments, rename local variables and parameters more intuitively
- Regression test after every refactoring
43Verification and Validation
44Quality Assurance: Verification and Validation
- Validation: Are we building the right product?
- QA at requirements and design level concentrates on validation: ensures that the product will actually meet the user's need
- Verification: Are we building the product right?
- QA at code level concentrates on verification: ensures that the product has been built according to the requirements and design specifications (only useful if the specifications were correct in the first place)
45V&V Techniques
- Standards (ISO 9001, SEI CMMI)
- Metrics (Six Sigma)
- Reviews (inspections, static analysis)
- Testing
- Whole lifecycle process applied at each stage
46Inspection Overview
- Also known as walkthrough
- An approach to testing that does not actually execute the code
- Formal process for reading through the software product as a group and identifying defects
- Potentially applied to all project documents, including but not limited to source code
- Used to increase software quality and improve productivity and manageability of the development process
47Static Analysis Overview
- Software tools parse the program text and try to discover potentially erroneous conditions (e.g., lint)
- Control flow analysis: Checks for loops with multiple exit or entry points, finds unreachable code, etc.
- Data use analysis: Detects uninitialized variables, variables written twice without an intervening use, variables that are declared but never used, etc.
- Interface analysis: Checks the consistency of type, method, etc. declarations and their use
- Should occur prior to inspection or testing
48Why Test?
- No matter how well software has been designed and coded, it will inevitably still contain defects
- Testing is the process of executing a program with the intent of finding faults (bugs)
- A successful test is one that finds errors, not one that doesn't find errors
- Testing can prove the presence of faults, but cannot prove their absence (unless the program is so trivial that it can be exhaustively tested)
- But can increase confidence that a program works
49What to Test?
- Unit test: test of small code unit; start with individual methods, build up to class (and class hierarchy if applicable), then component
- Integration test: test of several units combined to form a (sub)system, preferably adding one unit at a time
- System (alpha) test: test of a system release by independent system testers
- Acceptance (beta) test: test of a release by end-users or their representatives
50When to Test?
- Early
- Agile programming: developers write unit test cases before coding each unit (test-driven development)
- Many software processes involve writing system/acceptance tests in parallel with development
- Often
- Regression testing: rerun unit, integration and system/acceptance tests
- After refactoring
- Throughout integration
- Before each release
51Who should Test?
- Argument: Software authors should not test their own code because
- Testers who don't believe they will find faults generally don't find many faults (cognitive dissonance)
- Testers who have to fix any faults they find don't tend to find very many (avoidance behavior)
- Coders want code to be fault free, but effective testers must want to find faults (conflict of interest)
- However, code authors usually do unit tests and often integration tests
- Separate independent team usually does system tests and/or acceptance tests
52Defining a Test
- Goal: the aspect of the system being tested
- Input: specify the actions and conditions that lead up to the test as well as the input (state of the world, not just parameters) that actually constitutes the test
- Outcome: specify how the system should respond or what it should compute, according to its requirements
53Test Harness (Scaffolding)
- test driver: supporting code and data used to provide an environment for invoking part of a system in isolation
- stub: dummy procedure, module or unit that stands in for another portion of a system, intended to be invoked by that isolated part of the system
- May consist of nothing more than a function header with no body
- If a stub needs to return values, it may read and return test data from a file, return hard-coded values, or obtain data from a user (the tester) and return it
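The driver and stub roles described above can be illustrated with a small hand-written harness. Everything here is a hypothetical sketch: `RateService` stands for a unit that is not available yet, `StubRateService` returns a hard-coded value, `Converter` is the unit under test, and `ConverterDriver` is the test driver.

```java
// Hypothetical sketch: all names are illustrative assumptions.

// Interface of a unit that may not exist yet (or is too costly to call in a test).
interface RateService {
    double rateFor(String currency);
}

// Stub: stands in for the real service and returns a hard-coded value.
class StubRateService implements RateService {
    public double rateFor(String currency) {
        return 2.0;
    }
}

// Unit under test: depends only on the RateService interface.
class Converter {
    private final RateService rates;

    Converter(RateService rates) { this.rates = rates; }

    double convert(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// Test driver: sets up the environment, invokes the unit in isolation,
// and compares the actual outcome with the expected one.
class ConverterDriver {
    static boolean run() {
        Converter converter = new Converter(new StubRateService());
        double actual = converter.convert(10.0, "XYZ");
        return actual == 20.0;
    }

    public static void main(String[] args) {
        System.out.println(run() ? "PASS" : "FAIL");
    }
}
```

Because `Converter` receives its `RateService` through the constructor, the same test case can later be rerun during integration with the real service substituted for the stub.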
54Unit Testing Overview
- Unit testing is testing some program unit in isolation from the rest of the system (which may not exist yet)
- Usually the programmer is responsible for testing a unit during its implementation (even though this violates the rule about a programmer not testing their own software)
- Easier to debug when a test finds a bug (compared to full-system testing)
55Integration Testing Overview
- Motivation: Units that worked in isolation may not work in combination
- Performed after all units to be integrated have passed all black box unit tests
- Reuse unit test cases that cross unit boundaries (that previously required stub(s) and/or driver standing in for another unit)
56System/Acceptance Testing Overview
- Full system, from end-user (or other external role) input/output perspective
- Lab testing vs. field testing
- Consider interoperability with customer software and hardware configurations
- Additional factors: security, performance, usability
57How do you know when you are done testing?
- Adequacy criteria (coverage metrics): all statements, all branches, all control flow paths, all data flow paths
- All programmed error messages and exceptions have been produced
- Have reached tail of defect density curve
- Confidence established that the software is fit for its purpose, good enough
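The difference between the "all statements" and "all branches" criteria can be seen on a tiny method. In the sketch below, `Discount` and its two flag fields are hypothetical, and the flags are hand-written instrumentation added only to make branch coverage observable; a real project would normally rely on a coverage tool.

```java
// Hypothetical sketch: Discount and the coverage flags are illustrative only.
class Discount {
    // Hand-written instrumentation recording which way the decision went.
    static boolean sawTrueBranch = false;
    static boolean sawFalseBranch = false;

    static double apply(double price, boolean member) {
        if (member) sawTrueBranch = true; else sawFalseBranch = true; // instrumentation only

        double result = price;
        if (member) {
            result = price * 0.5;   // the only statement guarded by the decision
        }
        return result;
    }
}

class CoverageDemo {
    public static void main(String[] args) {
        // One test with member == true executes every statement of the
        // original (uninstrumented) method, so statement coverage is complete...
        Discount.apply(100.0, true);
        System.out.println("true branch seen:  " + Discount.sawTrueBranch);
        System.out.println("false branch seen: " + Discount.sawFalseBranch);

        // ...but branch coverage also needs a test where the condition is false.
        Discount.apply(100.0, false);
        System.out.println("false branch seen: " + Discount.sawFalseBranch);
    }
}
```

A suite containing only the first call would satisfy "all statements" while never exercising the path that skips the discount, which is why branch coverage is the stronger criterion.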
58Defect Density Curve
59Final Notes
60Reminder: Second Progress Report due next week!
- Second Progress Report due October 21st
- Post in CourseWorks in your TEAM folder
61Upcoming Deadlines
- Second Progress Report due October 21st
- Demos October 27th - November 6th (schedule early with your TA)
- First Iteration Final Report due November 7th
- Midterm Individual Assessment will be posted by November 7th, due November 14th