Title: SE 468 Software Measurement/Project Estimation
1SE 468 Software Measurement/Project Estimation
- Dennis Mumaugh, Instructor
- dmumaugh@cdm.depaul.edu
- Office: Loop, Room CDM 430, x26770
- Office Hours: Monday, 4:00-5:30
2Administrivia
3Assignment 2
- Due October 12, 2009.
- Questions on Function Points and effort
estimation (aka COCOMO). - Using the Estimate tool from Construx, estimate
the two projects given
4SE 468 Class 3
- Topics Estimating Size and Effort
- Measuring Software Size
- Estimating Software Size
- Size Effort Estimation
- COCOMO
- Reading
- Kan pp. 56, 88-91, 93-96, 456
- Articles on the Class Page and Reading List
5Thought for the Day
- All measurements are made in order to inform a
decision. - A valid measurement is an indication that will
lead a knowledgeable observer to an appropriate
decision.
6Software Metrics
7Software Metrics
- To compare systems, measurements of those systems need to be taken.
- Metrics allow measurements to be made of system attributes such as reliability, robustness, size, complexity, or maintainability.
- Metrics measure some property of a system; to be valid, there must be a relationship between that property and the system behavior being measured.
8Software Metrics
- Metrics Strategy
- Gather Historical Data (from source code, project
schedules, RFC, reports etc) - Record metrics.
- Use current metrics within the context of
historical data. Compare effort required on
similar projects etc.
9Commonly Used Metrics
- Schedule Metrics (55%)
- Tasks completed / late / rescheduled.
- Lines of Code (46%)
- KLOC, Function Points for scheduling and costing.
- Schedule, Quality, Cost Tradeoffs (38%)
- % of tasks completed on schedule / late / rescheduled.
- Requirements Metrics (37%)
- Number of changed / new requirements (Formal RFC).
- Test Coverage (36%)
- Fraction of lines of code covered (50%? 60%? 90%?).
10Commonly Used Metrics
- Overall Project Risk (36%)
- Level of confidence in achieving a schedule date.
- Fault Density
- Unresolved faults (e.g., release at 0.25 faults/KNCSS).
- Fault arrival and close rates
- Determine readiness to deploy. Faults are easier to find than to solve.
11Measuring Software Size
12Software Size
- One of the basic measures of a system is its size.
- This is usually used to estimate the build effort.
- Several measures of size:
- Number of modules (compilable units)
- Number of functions
- Number of classes
- Number of methods per class
- Amount of memory used
- And the one used most often:
- Length in Lines of Code (LOC or KLOC)
- Problem: the variation in developers' code compactness, which can be around 5:1.
- Some standards alleviate this problem.
13Estimating Size
- Why are we interested in size?
- Cost is a function of effort, which is a function of size.
- Actually we calculate duration and derive cost by multiplying salary by duration.
- In more detail:
- Duration = LOC / (Productivity x Staff)
- where productivity is in LOC per day.
- Example: a programmer averages 20 lines of code per day.
- A project with 20,000 lines of code and 10 programmers would take
- 20,000 / (20 x 10) = 100 days (see the sketch after this list).
- Management understands size (so it thinks).
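A minimal sketch of the arithmetic above in Python, assuming productivity is expressed in LOC per programmer-day (the 20 LOC/day and 10-programmer figures are the slide's; the function name is just for illustration):

```python
def duration_days(total_loc, loc_per_day_per_programmer, staff):
    """Rough duration: size divided by the team's daily LOC throughput."""
    return total_loc / (loc_per_day_per_programmer * staff)

# Example from the slide: 20,000 LOC, 20 LOC/day per programmer, 10 programmers.
print(duration_days(20_000, 20, 10))  # -> 100.0 days
```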
14Lines of Code
- Lines of Code (LOC)
- The basis of the LOC measure is that program length can be used as a predictor of program characteristics such as effort and ease of maintenance.
- The LOC measure is used to measure the size of the software.
- One version:
- Only source lines that are DELIVERED as part of the product are included; test drivers and other support software are excluded.
- SOURCE lines are created by the project staff; code created by application generators is excluded.
- One INSTRUCTION is one line of code or card image.
- Declarations are counted as instructions.
- Comments are not counted as instructions.
15Lines of Code
- Problems
- Only source lines that are DELIVERED as part of the product are included; test drivers and other support software are excluded.
- Not useful in estimating effort: testing may take as much code as the delivered product.
- One INSTRUCTION is one line of code or card image.
- Does not consider:
- Multi-line statements
- Several statements on a line
- Comments
- Space and punctuation (braces)
- Cannot measure specifications.
- Does not consider functionality or complexity.
16Lines of Code
- Lines of Code (LOC)
- Advantages
- An artifact of ALL software development projects.
- Easily countable.
- Scope for automation of counting: since a line of code is a physical entity, manual counting effort can easily be eliminated by automating the counting process.
- Well used (many models).
- An intuitive metric: lines of code serve as an intuitive metric for measuring the size of software because the code can be seen and its effect visualized.
- A Function Point is a more abstract metric that cannot be imagined as a physical entity; it exists only in the logical space.
- In this way, LOC comes in handy for expressing the size of software, even among programmers with low levels of experience.
17Lines of Code
- Disadvantages
- "Measuring programming progress by lines of code is like measuring aircraft building progress by weight." (Bill Gates)
- Lack of accountability: the lines-of-code measure suffers from some fundamental problems. Some think it isn't useful to measure the productivity of a project using only results from the coding phase, which usually accounts for only 30% to 35% of the overall effort.
- Lack of cohesion with functionality: estimates based on lines of code can very possibly go badly wrong.
- Developer's experience: the implementation of a specific piece of logic differs with the level of experience of the developer.
- Penalizes well-designed, shorter programs.
- What about complexity?
- Is a simple numerical calculation equivalent to an SQL query?
- Yet they may be the same number of lines of code.
18Lines of Code
- Disadvantages
- Problems with non-procedural languages.
- Advent of GUI tools: huge variations in productivity and other metrics with respect to different languages.
- Problems with multiple languages.
- Programming-language dependent.
- Differences between languages: consider C vs. COBOL.
- Lack of counting standards:
- What is a line of code? Use "statement" instead.
- Count comments and blank lines? Declarations?
- The level of detail required is not known early in the project.
- Psychology: a programmer whose productivity is being measured in lines of code will be rewarded for generating more lines of code, even though he could write the same functionality with fewer lines. "What you measure is what you get."
19Lines of Code
- From this data we can develop:
- Errors per KLOC (thousand lines of code)
- Defects per KLOC
- $ per LOC
- Pages of documentation per KLOC
- Errors per person-month
- LOC per person-month
- $ per page of documentation
20Estimating Software Size
21Accuracy
- Accuracy of a software project estimate is predicated on:
- A correct estimate of the size and complexity of the product to be built.
- The ability to translate size and complexity into human effort.
- The degree to which the project plan reflects the abilities of the software team.
- The stability of product requirements.
- The maturity of the software engineering environment.
22Conventional Methods LOC/FP Approach
- Compute LOC/FP using estimates of information domain values.
- Lines of Code (LOC), aka non-commented source lines or non-commented source statements.
- Function Points (FP): a formula using inputs, outputs, and computation.
- Use historical data to build estimates for the project.
23Productivity
- Measured in terms of work effort per unit of time.
- Lines of Code per unit time or Function Points per unit time.
- Huge variations in productivity and quality among individuals and even teams, as much as 10:1.
- Sackman, Erikson, and Grant found:
- Coding time 20:1
- Debugging 25:1
- Program size 5:1
- Program execution speed 10:1
- Productivity increases due to new development methods.
- Some programmers can do things few others can. They may be 100 times as productive.
- Productivity vs. project size:
- See Brooks, The Mythical Man-Month.
24A Case Study
- Computer Aided Design (CAD) for mechanical components.
- The system is to execute on an engineering workstation.
- Interfaces with various computer graphics peripherals, including a mouse, digitizer, high-resolution color display, and laser printer.
- Accepts two- and three-dimensional geometric data from an engineer.
- The engineer interacts with and controls CAD through a user interface.
- All geometric data and supporting data will be maintained in a CAD database.
- Required output will display on a variety of graphics devices.
Assume the following major software functions are identified.
25Estimation of LOC
- CAD program to represent mechanical parts.
- Estimated LOC = (Optimistic + 4 x Likely + Pessimistic) / 6
- Three-point estimation formula (see lecture 4); a short sketch follows.
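A quick sketch of the three-point formula; the optimistic/likely/pessimistic values below are illustrative inputs for a single function, not figures from the slide:

```python
def three_point_loc(optimistic, likely, pessimistic):
    """Expected LOC = (optimistic + 4 * likely + pessimistic) / 6."""
    return (optimistic + 4 * likely + pessimistic) / 6

# Illustrative estimates for one CAD function.
print(three_point_loc(4_600, 6_900, 8_600))  # -> 6800.0 LOC
```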
26Example LOC Approach
- Average productivity for systems of this type: 620 LOC/pm.
- At a burdened labor rate of $8,000 per month, the cost per line of code is approximately $13.
- Burdened labor is usually 1 to 1.5 times average salary.
- Based on the LOC estimate and the historical productivity data, the total estimated project cost is $431,000 and the estimated effort is 54 person-months (see the sketch below).
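The slide's figures can be reproduced with a few lines of Python; the total-LOC input of roughly 33,200 is an assumption, since the summed LOC estimate is not shown here:

```python
def loc_cost_model(estimated_loc, loc_per_pm, burdened_rate_per_month):
    """Derive cost per LOC, total cost, and effort from size and productivity."""
    cost_per_loc = burdened_rate_per_month / loc_per_pm
    total_cost = estimated_loc * cost_per_loc
    effort_pm = estimated_loc / loc_per_pm
    return cost_per_loc, total_cost, effort_pm

cost_per_loc, total_cost, effort = loc_cost_model(33_200, 620, 8_000)
print(round(cost_per_loc, 2), round(total_cost), round(effort, 1))
# -> 12.9 428387 53.5  (close to the slide's $431,000 and 54 person-months)
```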
27Function Points
- Function points are a measure of the size of
computer applications and the projects that build
them. - The size is measured from a functional, or user,
point of view. - It is independent of the computer language,
development methodology, technology or capability
of the project team used to develop the
application. - Can be subjective
- Can be estimated EARLY in the software
development life cycle.
28Function Points
- They were devised by Albrecht (1979) and are language independent. A function point is
- an external input, an external output, a user interaction, an external interface, or a file used.
- Each FP is then weighted by a complexity factor, so that the
- Unadjusted FP Count (UFC) is Σ (count x weight) over the function types.
- The UFC is then adjusted by system attributes such as distributed processing, re-use, performance, etc.
- There are 14 factors, each with a weight of 0 to 5,
- to get the Adjusted Function Point Count (AFC).
29Function Types
- The approach is to identify and count a number of unique function types:
- External inputs (e.g., file names)
- External outputs (e.g., reports, messages)
- Queries (interactive inputs needing a response)
- Internal files (invisible outside the system)
- External files or interfaces (files shared with other software systems)
- Each of these is then individually assessed for complexity and given a weighting value, which varies from 3 (for simple external inputs) to 15 (for complex internal files).
- Function Type: Low / Average / High
- External Input: x3 / x4 / x6
- External Output: x4 / x5 / x7
- External Inquiry: x3 / x4 / x6
- Logical Internal File: x7 / x10 / x15
- External Interface File: x5 / x7 / x10
30Adjusted FP
- In order to find the adjusted FP, the UFP is multiplied by the technical complexity factor (TCF), which can be calculated by the formula
- TCF = 0.65 + 0.01 x (sum of factors)
- There are 14 technical complexity factors. Each complexity factor is rated on the basis of its degree of influence, from no influence to very influential (0-5):
- Data communications
- Performance
- Heavily used configuration
- Transaction rate
- Online data entry
- End-user efficiency
- Online update
- Complex processing
- Reusability
- Installation ease
- Operations ease
- Multiple sites
- Facilitate change
- Distributed functions
Then FP = UFP x TCF
31Function Points
- Advantages of FP
- It is not restricted to code
- Language independent
- The necessary data is available early in a
project. We need only a detailed specification. - More accurate than estimated LOC
- Disadvantages of FP
- Subjective counting
- Hard to automate and difficult to compute
- Ignores quality of output
- Oriented to traditional data processing
applications - Effort prediction using the unadjusted function
count is often no worse than when the TCF is added
32Computing Function Points
- External Inputs: 5 x 3 = 15
- External Outputs: 8 x 4 = 32
- External Inquiries: 10 x 4 = 40
- Internal Logical Files: 8 x 10 = 80
- External Interface Files: 2 x 5 = 10
- Count total (Unadjusted Function Points): 177 (reproduced in the sketch below)
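A small sketch that reproduces the unadjusted count; the pairing of counts with function types follows the reconstruction above:

```python
# (count, weight) per function type, as on the slide.
FUNCTION_COUNTS = {
    "external inputs": (5, 3),
    "external outputs": (8, 4),
    "external inquiries": (10, 4),
    "internal logical files": (8, 10),
    "external interface files": (2, 5),
}

ufp = sum(count * weight for count, weight in FUNCTION_COUNTS.values())
print(ufp)  # -> 177
```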
33Calculate Degree of Influence (DI)
- Does the system require reliable backup and recovery? 3
- Are data communications required? 4
- Are there distributed processing functions? 1
- Is performance critical? 3
- Will the system run in an existing, heavily utilized operational environment? 2
- Does the system require on-line data entry? 4
- Does the on-line data entry require the input transaction to be built over multiple screens or operations? 3
- Are the master files updated on-line? 3
- Are the inputs, outputs, files, or inquiries complex? 2
- Is the internal processing complex? 1
- Is the code designed to be reusable? 3
- Are conversion and installation included in the design? 5
- Is the system designed for multiple installations in different organizations? 1
- Is the application designed to facilitate change and ease of use by the user? 1
- DI (sum of the ratings) = 36
34The FP Calculation
- Inputs include:
- Count Total (Unadjusted Function Points) = 177
- DI = Σ Fi (i.e., the sum of the adjustment factors F1 .. F14)
- Calculate Function Points using the following formula: FP = UFP x (0.65 + 0.01 x Σ Fi)
- In this example:
- FP = 177 x (0.65 + 0.01 x (3+4+1+3+2+4+3+3+2+1+3+5+1+1))
- FP = 177 x (0.65 + 0.01 x 36)
- FP = 177 x (0.65 + 0.36)
- FP = 177 x 1.01
- FP = 178.77
- TCF = Technical Complexity Factor (see the sketch below)
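The same calculation as a short sketch, using the 14 ratings from the previous slide (which sum to 36):

```python
def adjusted_fp(ufp, ratings):
    """Adjusted FP = UFP * (0.65 + 0.01 * sum of the 14 ratings)."""
    tcf = 0.65 + 0.01 * sum(ratings)
    return ufp * tcf

ratings = [3, 4, 1, 3, 2, 4, 3, 3, 2, 1, 3, 5, 1, 1]  # DI = 36
print(round(adjusted_fp(177, ratings), 2))  # -> 178.77
```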
35Using FP to estimate effort
- If for a certain project:
- FP(estimated) = 372
- The organization's average productivity for systems of this type is 6.5 FP/person-month.
- Burdened labor rate of $8,000 per month.
- Cost per FP:
- $8,000 / 6.5 ≈ $1,230
- Total project cost:
- 372 x $1,230 ≈ $457,650
- 372 / 6.5 ≈ 57.2 person-months
- Based on the FP estimate and the historical productivity data, the total estimated project cost is $457,650 and the estimated effort is 58 person-months (see the sketch below).
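And the same arithmetic as a sketch:

```python
fp_estimated = 372
fp_per_pm = 6.5         # organizational average productivity
burdened_rate = 8_000   # dollars per person-month

cost_per_fp = burdened_rate / fp_per_pm   # ~$1,230 per FP
total_cost = fp_estimated * cost_per_fp   # close to the slide's $457,650
effort_pm = fp_estimated / fp_per_pm      # ~57.2 person-months (rounds to ~58)

print(round(cost_per_fp), round(total_cost), round(effort_pm, 1))
# -> 1231 457846 57.2
```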
363D Function point index
37AVC
- A full specification is needed to estimate function points, and critics note that FP counts can vary by up to 2,000% depending on how one attributes weights.
- Function points can be used to estimate the final code size by using historical data on the average lines of code per function point (AVC).
- Code size = AVC x Number of function points
- AVC ≈ 200-300 LOC per FP in assembler
- AVC ≈ 2-40 LOC per FP in a 4GL
38Reconciling FP and LOC
39Size Effort Estimation
40Estimation for OO Projects - I
- Develop estimates using effort decomposition, FP analysis, and any other method that is applicable for conventional applications.
- Using object-oriented analysis modeling, develop use-cases and determine a count.
- From the analysis model, determine the number of key classes (called analysis classes).
- Categorize the type of interface for the application and develop a multiplier for support classes:
- Interface type: Multiplier
- No GUI: 2.0
- Text-based user interface: 2.25
- GUI: 2.5
- Complex GUI: 3.0
41Estimation for OO Projects - II
- Multiply the number of key classes (step 3) by the multiplier to obtain an estimate for the number of support classes.
- Multiply the total number of classes (key + support) by the average number of work-units per class. Lorenz and Kidd suggest 15 to 20 person-days per class.
- Cross-check the class-based estimate by multiplying the average number of work-units per use-case. (A short sketch of this procedure follows.)
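A sketch of the class-based procedure just described; the example inputs (20 key classes, a standard GUI, 18 person-days per class) are illustrative assumptions:

```python
# Interface-type multipliers for estimating support classes (from the slide).
SUPPORT_CLASS_MULTIPLIER = {
    "no gui": 2.0,
    "text ui": 2.25,
    "gui": 2.5,
    "complex gui": 3.0,
}

def oo_effort_person_days(key_classes, interface_type, person_days_per_class=18):
    """Effort = (key classes + support classes) * work-units per class."""
    support_classes = key_classes * SUPPORT_CLASS_MULTIPLIER[interface_type]
    return (key_classes + support_classes) * person_days_per_class

# Illustrative example: 20 key classes behind a standard GUI.
print(oo_effort_person_days(20, "gui"))  # -> 1260.0 person-days
```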
42Estimation with Use-Cases
Using 620 LOC/pm as the average productivity for systems of this type and a burdened labor rate of $8,000 per month, the cost per line of code is approximately $13. Based on the use-case estimate and the historical productivity data, the total estimated project cost is $554,000 and the estimated effort is 68 person-months.
43Estimation for Agile Projects
- Each user scenario (a mini-use-case) is
considered separately for estimation purposes. - The scenario is decomposed into the set of
software engineering tasks that will be required
to develop it. - Each task is estimated separately. Note
estimation can be based on historical data, an
empirical model, or experience. - Alternatively, the volume of the scenario can
be estimated in LOC, FP or some other
volume-oriented measure (e.g., use-case count). - Estimates for each task are summed to create an
estimate for the scenario. - Alternatively, the volume estimate for the
scenario is translated into effort using
historical data. - The effort estimates for all scenarios that are
to be implemented for a given software increment
are summed to develop the effort estimate for the
increment. - Also consider Project Velocity
44Project Velocity (PV)
- Don't bother to consider the number of programmers or their skill level. This is a rough estimate.
- Project velocity tells you how many story points you can allocate to the next iteration.
- The customer gets to pick stories that add up to the project velocity. (A small sketch follows.)
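A minimal sketch of loading an iteration against a velocity budget; the story names and point values are invented for illustration:

```python
def pick_stories(candidate_stories, velocity):
    """Select stories in customer-priority order until the velocity is used up."""
    chosen, remaining = [], velocity
    for name, points in candidate_stories:
        if points <= remaining:
            chosen.append(name)
            remaining -= points
    return chosen

stories = [("login", 3), ("report export", 5), ("search", 8), ("audit log", 2)]
print(pick_stories(stories, velocity=10))  # -> ['login', 'report export', 'audit log']
```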
45Empirical Estimation Models
- Most of the work in the cost estimation field has focused on algorithmic cost modeling.
- In this process, costs are analyzed using mathematical formulas linking costs or inputs with metrics to produce an estimated output.
- The formulas used in a formal model arise from the analysis of historical data.
- The accuracy of the model can be improved by calibrating the model to your specific development environment, which basically involves adjusting the weightings of the metrics.
- Generally there is great inconsistency among estimates. Kemerer conducted a study indicating that estimates varied by as much as 85% to 610% between predicted and actual values. Calibration of the model can improve these figures.
- However, models still produce errors of 50% to 100%.
46Empirical Estimation Models
- The effort equation is based on a single variable, usually a measure of size.
- There are several possible variations:
- Effort = A x size + C
- Effort = A x size^B
- Effort = A x size^B + C
- where A, B, and C are constants determined by regression analysis on historical data (a fitting sketch follows this slide).
- Effort may be measured in:
- Staff hours, weeks, months, years . . .
- Size may be measured in:
- Lines of code, modules, I/O formats . . .
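To make the regression step concrete, here is a sketch that fits Effort = A x size^B to historical data by least squares in log-log space; the project data points are invented for illustration:

```python
import math

def fit_power_law(sizes_kloc, efforts_pm):
    """Fit Effort = A * size^B via least squares on log(effort) = log(A) + B*log(size)."""
    xs = [math.log(s) for s in sizes_kloc]
    ys = [math.log(e) for e in efforts_pm]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Invented historical projects: (size in KLOC, effort in person-months).
history = [(10, 26), (25, 70), (50, 146), (120, 380)]
A, B = fit_power_law([s for s, _ in history], [e for _, e in history])
print(round(A, 2), round(B, 2))                 # fitted constants A and B
print(round(A * 40 ** B, 1), "PM for 40 KLOC")  # prediction for a new project
```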
47Empirical Estimation Models
- The empirical data supporting most empirical models is derived from a limited sample of projects.
- NO estimation model is suitable for all classes of software projects.
- USE the results judiciously.
- General model:
- E = A + B x (ev)^C
- where A, B, and C are empirically derived constants, E is effort in person-months, and ev is the estimation variable (either LOC or FP).
48LOC-Oriented Estimation Models
49COCOMO
50COCOMO
- "Cost does not scale linearly with size", is
perhaps the most important principle in
estimation. Barry Boehm used a wide range of
project data and came up the following
relationship of effort versus size - effort C x sizeM
- This is known as the Constructive Cost Model
(COCOMO). C and M are always greater than 1, but
their exact values vary depending upon the
organization and type of project. Typical values
for real-time projects utilizing very best
practices are - C3.6, M1.2.
- Poor software practices can push the value of M
above 1.5. - One fall out of the COCOMO model is that it is
more cost effective to partition a project into
several independent sub-projects each with its
own autonomous team. This "cheats" the
exponential term in the COCOMO model.
51COCOMO Static Adjusted Baseline
- A static single-variable effort equation acts as a baseline equation,
- e.g., effort = A x size^b
- This provides a basic estimate of effort.
- The initial estimate is adjusted by a set of multipliers that attempt to incorporate the effect of important product and process attributes.
- E.g., if the initial estimate is E = 100 staff-months and the complexity of the job is rated higher than normal, a multiplier of 1.1 is associated with it, yielding an adjusted estimate of 110 staff-months.
52The COCOMO Model
- A hierarchy of estimation models:
- Model 1, Basic: computes software development effort (and cost) as a function of size expressed in estimated lines of code.
- Model 2, Intermediate: computes effort as a function of program size and a set of 15 cost drivers that include subjective assessments of product, hardware, personnel, and project attributes.
- Model 3, Advanced: includes all aspects of the intermediate model, with an assessment of the cost drivers' impact on each step (analysis, design, etc.) of the software engineering process.
53Three classes of software projects
- Organic: relatively small and simple. Teams with good application experience work to a set of less rigid requirements.
- Semi-detached: intermediate in terms of size and complexity. Teams with mixed experience levels meet a mix of rigid and less rigid requirements. (Ex: a transaction processing system)
- Embedded: a software project that must be developed within a set of tight hardware, software, and operational constraints. (Ex: flight control software for an aircraft)
54COCOMO Model
- The basic COCOMO model follows the general layout of effort estimation models:
- E = a(S)^b
- and
- TDEV = c(E)^d
- where
- E represents effort in person-months
- TDEV represents project duration in calendar months
- S is the size of the software development in KLOC
- a, b, c, and d are values, derived from past project data, dependent on the development mode
- The a, b, c, and d values are (see the sketch after this slide):
- Organic development mode: a = 2.4, b = 1.05, c = 2.5, d = 0.38
- Semi-detached development mode: a = 3.0, b = 1.12, c = 2.5, d = 0.35
- Embedded development mode: a = 3.6, b = 1.20, c = 2.5, d = 0.32
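A small sketch of the basic model using the constants above; the 32-KLOC example input is arbitrary:

```python
# Basic COCOMO constants per development mode: (a, b, c, d).
BASIC_COCOMO = {
    "organic":       (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded":      (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(size_kloc, mode):
    """Return (effort in person-months, schedule in calendar months)."""
    a, b, c, d = BASIC_COCOMO[mode]
    effort = a * size_kloc ** b   # E = a * S^b
    tdev = c * effort ** d        # TDEV = c * E^d
    return effort, tdev

effort, tdev = basic_cocomo(32, "organic")
print(round(effort, 1), round(tdev, 1))  # -> roughly 91 PM and 14 months
```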
55The COCOMO Model
- The intermediate COCOMO is an extension of the basic COCOMO model, which used only one predictor variable, the size (KLOC) variable.
- The intermediate COCOMO uses 15 more predictor variables, called cost drivers. The manager assigns a value to each cost driver from the range:
- Very low
- Low
- Nominal
- High
- Very high
- Extra high
- Each of these ratings corresponds to a numerical value, which varies with the cost driver.
56The COCOMO Model
- The manager assigns a value to each cost driver according to the characteristics of the specific software project.
- The numerical values that correspond to the manager-assigned ratings for the 15 cost drivers are multiplied together.
- The resulting value I is the multiplier that we use in the intermediate COCOMO formulas for obtaining the effort estimates.
- Thus:
- I = RELY x DATA x CPLX x TIME x STOR x VIRT x TURN x ACAP x AEXP x PCAP x VEXP x LEXP x MODP x TOOL x SCED
- Note that although the effort estimation formulas for the intermediate model are different from those used for the basic model, the schedule estimation formulas are the same.
57The COCOMO Model
- The required effort to develop the software system (E) as a function of the nominal effort (Enom), where E and Enom are expressed in person-months and S in KLOC, is
- E = Enom x I,
- where
- INTERMEDIATE COCOMO MODEL
- MODE: EFFORT
- Organic: Enom = 3.2 (S^1.05)
- Semi-detached: Enom = 3.0 (S^1.12) (see note section)
- Embedded: Enom = 2.8 (S^1.20)
- Note: the intermediate constants differ from the basic model.
58The COCOMO Model
- The number of months estimated for software development (TDEV), where TDEV is expressed in calendar months and E in person-months:
- INTERMEDIATE COCOMO MODEL
- MODE: SCHEDULE
- Organic: TDEV = 2.5 (E^0.38)
- Semi-detached: TDEV = 2.5 (E^0.35)
- Embedded: TDEV = 2.5 (E^0.32)
- Note: the intermediate constants are the same as the basic model.
59COCOMO Cost Drivers
60COCOMO Cost Drivers
61The COCOMO Model
- Source Code Size Used in the COCOMO Model
- The source size (S) is expressed in KLOC, i.e., thousands of delivered lines of code, i.e., the source size of the delivered software (which does not include the size of test drivers or other temporary code).
- If code is reused, then the following formula should be used for determining the equivalent software source size Se, for use in the COCOMO model:
- Se = Sn + (a/100) x Su
- where Sn is the source size of the new code, Su is the source size of the reused code, and a is determined by the formula
- a = 0.4 D + 0.3 C + 0.3 I
- based on the percentage of effort required to adapt the reused design (D) and code (C), as well as the percentage of effort required to integrate the modified code (I). (A short sketch follows this slide.)
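A sketch of the equivalent-size adjustment; the example percentages and sizes are arbitrary:

```python
def equivalent_size(new_kloc, reused_kloc, pct_design_mod, pct_code_mod, pct_integration):
    """Se = Sn + (a/100) * Su, with a = 0.4*D + 0.3*C + 0.3*I."""
    a = 0.4 * pct_design_mod + 0.3 * pct_code_mod + 0.3 * pct_integration
    return new_kloc + (a / 100.0) * reused_kloc

# Example: 20 KLOC new code, 40 KLOC reused code with 10% design change,
# 20% code change, and 30% integration effort.
print(round(equivalent_size(20, 40, 10, 20, 30), 1))  # -> 27.6 KLOC equivalent
```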
62The COCOMO Model
- Other Parameters Used in the COCOMO Model
- TDEV starts when the project enters the product
design phase (successful completion of a software
requirements review) and ends at the end of
software testing (successful completion of a
software acceptance review) - E covers management and documentation efforts,
but not activities such as training, installation
planning, etc. - COCOMO assumes that the requirements
specification is not substantially changed after
the end of the requirements phase - Person-months can be transformed to person-days
by multiplying by 19, and to person-hours by
multiplying by 152
63The COCOMO Model
- Advantages of COCOMO'81
- COCOMO is transparent; you can see how it works, unlike other models such as SLIM.
- Drivers are particularly helpful to the estimator in understanding the impact of the different factors that affect project costs.
- Drawbacks of COCOMO'81
- It is hard to accurately estimate KDSI early in the project, when most effort estimates are required.
- KDSI, actually, is not a size measure; it is a length measure.
- Extremely vulnerable to mis-classification of the development mode.
- Success depends largely on tuning the model to the needs of the organization, using historical data, which is not always available.
64Example
- Mode is organic
- Size = 200 KDSI
- Cost drivers:
- Low reliability => 0.88
- High product complexity => 1.15
- Low application experience => 1.13
- High programming language experience => 0.95
- Other cost drivers assumed to be nominal => 1.00
- I = 0.88 x 1.15 x 1.13 x 0.95 ≈ 1.086
- Effort = 3.2 x (200^1.05) x 1.086 ≈ 906 PM
- Development time = 2.5 x 906^0.38 ≈ 33.24 months (checked in the sketch below)
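The worked example can be checked with a short script using the intermediate organic constants (3.2, 1.05) and the schedule equation from slide 58; the multiplier values are the ones listed above:

```python
import math

def intermediate_cocomo_organic(size_kdsi, cost_driver_multipliers):
    """E = 3.2 * S^1.05 * I and TDEV = 2.5 * E^0.38 (organic mode)."""
    i = math.prod(cost_driver_multipliers)   # effort adjustment multiplier I
    effort = 3.2 * size_kdsi ** 1.05 * i
    tdev = 2.5 * effort ** 0.38
    return i, effort, tdev

i, effort, tdev = intermediate_cocomo_organic(200, [0.88, 1.15, 1.13, 0.95])
print(round(i, 3), round(effort), round(tdev, 2))  # -> 1.086 906 33.24
```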
65COCOMO-II
- There is a new modeling capability, COCOMO II.
- It has the equation form E = A(Size)^B
- where A is calibrated for the local environment
- and B is based upon a smaller set of variables.
- It is done during post-architecture and early design.
- Also, size may be measured in various ways, including function points.
66COCOMO-II
- Constructive Cost Model or COCOMO II is actually
a hierarchy of estimation models that address the
following areas - Application composition model. Used during the
early stages of software engineering, when
prototyping of user interfaces, consideration of
software and system interaction, assessment of
performance, and evaluation of technology
maturity are paramount. - Early design stage model. Used once requirements
have been stabilized and basic software
architecture has been established. - Post-architecture-stage model. Used during the
construction of the software.
67For more information
- The text Software Engineering: Theory and Practice, Shari Lawrence Pfleeger, Chapter 3.3, Effort Estimation, pp. 98-109, has more information on the subject.
68Summary
- Models should be an aid to software development management and engineering, and to evolving the discipline.
- First do a back-of-the-envelope prediction.
- Apply the local model and examine the results.
- Apply the general model, e.g., COCOMO.
- Examine the range of predictions offered by the models.
- Compare the results:
- What do they say about the project and my environment?
- What assumptions did I make, and do I believe them?
- Am I satisfied with the prediction? When should I re-predict?
69Summary
- No size or effort estimation model is appropriate
for all software development environments,
development processes, or application types. - Models must be customized (parameters in the
formula must be altered) so that results from the
model agree with the data from the particular
software development environment.
- An effort estimate is only ever an estimate. Management should treat it with caution.
- To make empirical models as useful as possible,
as much data as possible should be collected from
projects and used to customize (refine) any model
used - The different estimating methods used should be
documented, and all underlying assumptions should
be recorded.
70Next Class
- Topic
- Project Estimation: Project Schedule Estimation; Resource Schedule Estimation; Overly Optimistic Schedules; The Time Value of Money
- Reading
- Kan, chapter 12.2, pp. 343-347
- Articles on the Class Page
- Term Paper Proposal
- Due Monday, October 5, 2009
- Assignment 2
- Due Monday, October 12, 2009
71Journal Exercises
- Read the paper Programmer Productivity: The "Tenfinity Factor" <http://www.devtopics.com/programmer-productivity-the-tenfinity-factor/>
- Comment.
- What about the impact on estimating?
- Also, think about programmer style and lines-of-code measurements.