Evaluation of Information Systems Complexity Metrics and Models

Transcript and Presenter's Notes

1
Evaluation of Information Systems Complexity
Metrics and Models

INFO 630
Glenn Booker

2
Origin

Complexity metrics were developed by computer
scientists and software engineers
Strongly based on empirical (real world)
measurement, with little theory
Primarily broken into internal and external
measures

3
Internal versus External

Internal measures describe the complexity within
a module (number of decisions, loops,
calculations, etc.)
External measures describe relationships among
modules (program or function calls, external
file activities, input/output, etc.)

4
Internal Measures
5
Internal Product Attributes

Size measures
Input to prediction models
Normalizing factor for cost, productivity, etc.
Progress during development
Typically use lines of code (LOC) or function
point counts
LOC is a better measure for predicting cost and
schedule

6
Lines of Code

Simple complexity metric, often based on number
of executable statements or instruction
statements
The highest defect rates often occur in small
modules
Larger modules have a smaller defect rate (if
they exist at all), until they become too cumbersome
Optimum module size is 250 lines

7
Function Points

Function points help avoid biases due to the
programming language(s) used
Provide a fairer basis for comparing
different environments
Focus on how much work the program
accomplishes, not how concisely it is expressed

8
Halstead Metrics

Also known as Software Science, 1977
Examine program as compilable tokens
Tokens are either operators (+, -) or operands
(variables)
Derive metrics such as Vocabulary, Length,
Volume, Difficulty, etc.
Not widely used
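
As a rough illustration of the derived metrics named above, the sketch below uses the standard Software Science definitions (vocabulary = n1 + n2, length = N1 + N2, volume = length x log2(vocabulary), difficulty = (n1 / 2) x (N2 / n2)). The function name and the sample token counts are hypothetical, chosen only for the example.

```python
import math


def halstead(n1, n2, N1, N2):
    """Basic Halstead metrics.

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total number of operator and operand occurrences
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
    }


# Hypothetical token counts for a small module
metrics = halstead(n1=4, n2=4, N1=10, N2=6)
print(metrics)
```

Counting the tokens themselves is language-specific and is not shown here.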

9
Data Structure (Halstead)

Halstead's η2: the number of distinct operands in a
module
Operands include variables, unique constants,
and labels
Operand usage (OU)
OU = η2 / N2, where N2 is the total number of
operand references

10
Software Complexity

Complexity is a characteristic of software that
influences the resources needed to build and maintain it
Many different characteristics of software relate
to complexity
These complexity characteristics revolve around
the structure of the software

11
Types of Structural Measures

Control flow
Addresses sequence in which instructions are
executed
Iteration and looping
Data flow
Follows trail of data as it is created and
handled
Depicts behavior of data as it interacts with the
program

12
Types of Structural Measures

Data structure
Concerned with organization of data itself
Provides information about difficulties in
handling data and in defining test cases

13
Control Flow

Modeled by directed graphs (control flow graphs)
Each node corresponds to a single program
statement
Arcs (directed edges) indicate flow of control
from one statement to another

14
Control Flow

Control flow graphs are useful for
Analysis (estimating number of defects)
Expressing complexity by a single value
Assessing testability and test coverage

15
Basic Control Constructs
16
Cyclomatic Complexity

McCabe, 1976
Based on a program's control flow graph
Related to number of separate graphable areas, or
number of linearly independent paths in the
program
Complexity M = edges - nodes + 2 × (number of
unconnected parts of the graph)
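
A minimal sketch of the edge and node count, using McCabe's formula M = edges - nodes + 2 x (number of unconnected parts). The graph below is a made-up example (an IF/ELSE nested inside a WHILE loop), and the function and node names are assumptions for illustration.

```python
def cyclomatic_complexity(edges, parts=1):
    """M = edges - nodes + 2 * parts, where parts is the number of
    unconnected parts of the control flow graph (1 for a single routine)."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * parts


# Hypothetical control flow graph: an IF/ELSE inside a WHILE loop
edges = [
    ("start", "loop"),
    ("loop", "if"),
    ("if", "then"),
    ("if", "else"),
    ("then", "join"),
    ("else", "join"),
    ("join", "loop"),
    ("loop", "end"),
]

m = cyclomatic_complexity(edges)
print(m)  # 8 edges - 7 nodes + 2 = 3
```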

17
Cyclomatic Complexity

Complexity under 10 generally desired
Can also find M as number of binary decisions
(yes/no) plus one
Multiple choice decisions with n choices count
as (n-1) binary decisions
Ignores differences between specific types of
control structures
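
The decision-counting shortcut can be sketched as follows, using the standard relation that M equals the number of binary decisions plus one, with an n-way decision counted as (n - 1) binary decisions. The function names and the list of branch counts are hypothetical.

```python
def complexity_from_decisions(branch_counts):
    """branch_counts holds the number of choices at each decision point:
    2 for an IF or a loop test, n for an n-way CASE statement."""
    binary_decisions = sum(n - 1 for n in branch_counts)
    return binary_decisions + 1


def exceeds_guideline(m, limit=10):
    """Flag modules that do not stay under the suggested limit of 10."""
    return m >= limit


# Hypothetical module: two IF statements and one 4-way CASE
m = complexity_from_decisions([2, 2, 4])
print(m, exceeds_guideline(m))  # 6 False
```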

18
Cyclomatic Complexity

Uses of complexity metric
Identify complex modules needing detailed
inspection or redesign
Identify simple modules needing minimal
inspection and/or testing
Estimate programming, testing and maintenance
effort
Identify potentially troublesome code

19
Control Flow Representation of Programs

Software programs can be represented by linear
directed segments combined with the basic
control flow constructs
Control flow constructs may be nested, e.g. an IF
statement can be inside of a WHILE loop

20
Control Flow Representation of Programs

Example

21
Control Flow--Linearly Independent Paths

Set of linearly independent paths
b1 = abcg
b2 = abcbcg
b3 = abefg
b4 = adefg
b5 = adfg
Any arbitrary path is equal to a linear
combination of the linearly independent
paths listed above
For example, path abcbefg is
equal to b2 + b3 - b1
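
One way to check the combination above is to treat each path as a count of the edges it traverses, then add and subtract those counts. This is only a sketch: the helper name is hypothetical, and the node letters are taken from the paths listed on the slide.

```python
from collections import Counter


def edge_counts(path):
    """Count how many times each directed edge is used by a path
    written as a string of node letters, e.g. 'abcg'."""
    return Counter(zip(path, path[1:]))


b1, b2, b3 = "abcg", "abcbcg", "abefg"

# Compute b2 + b3 - b1 edge by edge
combo = Counter(edge_counts(b2))
combo.update(edge_counts(b3))
combo.subtract(edge_counts(b1))
combo = {edge: count for edge, count in combo.items() if count != 0}

target = dict(edge_counts("abcbefg"))
print(combo == target)  # True
```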
22
Knots - Control Flow Crossovers

Knot measure -- total number of points at which
control flow lines cross

23
Syntactic Constructs

Examine effect of using specific control
structures on defect rate
Is, by definition, language-specific
Can result in statistically significant
relationships
e.g. Lo used this approach to show that DO WHILE
should be avoided in COBOL

24
External Measures
25
Computational Complexity

Examines algorithmic efficiency and use of
machine resources (memory, I/O, storage)
Studies quantitative aspects of solutions to
computational problems
Examples may include sorting efficiency for a
database, managing I/O constraints across a large
scale network, etc.

26
Psychological Complexity

Concerned with characteristics of software that
affect human performance
Injection of defects (when and why does a
programmer make errors?)
Ease of building the software (effort required)
Ease of maintenance (effort required)

27
Data Structure (Database)

Database size per program size (DBSPPS)
DBSPPS = DBS / PS
where DBS is database size in bytes or
characters, and
PS is program size in source instructions
Used in COCOMO model as a cost driver
Ordinal scale measure derived from DBSPPS
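
A small sketch of the ratio and an ordinal rating derived from it. The rating thresholds (10, 100, 1000) are an assumption recalled from published COCOMO tables for the database size cost driver, and should be checked against the model definition being used. The function names and sample sizes are hypothetical.

```python
def dbspps(db_size_bytes, program_size_instructions):
    """Database size per program size: DBS / PS."""
    return db_size_bytes / program_size_instructions


def data_rating(ratio):
    """Map the ratio onto an ordinal scale.
    Thresholds are assumed, not taken from the slides."""
    if ratio < 10:
        return "Low"
    if ratio < 100:
        return "Nominal"
    if ratio < 1000:
        return "High"
    return "Very High"


# Hypothetical system: 2 MB database, 10,000 source instructions
ratio = dbspps(2_000_000, 10_000)
print(ratio, data_rating(ratio))  # 200.0 High
```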

28
Fan-in and Fan-out

Focus is the interaction among code modules
Fan-in = number of modules which call a given module
Fan-out = number of modules which are called by a
given module
Or, more formally...

29
Fan-in and Fan-out

Fan-in of a module is the number of local flows
terminating at the module, plus the number of
data structures from which info is retrieved by
the module
Fan-out of a module is the number of local flows
that emanate from the module, plus the number of
data structures (tables, arrays) that are updated
by the module

30
Fan-in and Fan-out

Do fan-in and fan-out affect software quality?
Large fan-in modules may be interpolation or
look-up routines - no defect correlation
Large fan-out often relates to high defect rate -
has a high defect correlation
A module with both large fan-in and large fan-out is clearly bad

31
Fan-in and Fan-out

Information flow complexity
Henry and Kafura: Size × (fan-in × fan-out)^2
Shepperd: (fan-in × fan-out)^2
Henry and Kafura measure helps predict the number
of software maintenance problems
Shepperd measure correlates with software
development time

Henry, S. and D. Kafura. 1981. IEEE Transactions on
Software Engineering SE-7(5), pp. 510-518.
Shepperd, M. 1990. Software Engineering Journal
5(1) (January), pp. 3-10.
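
The two information flow measures can be sketched from a module call graph. For simplicity this example uses the informal definitions (fan-in as the number of callers, fan-out as the number of modules called) and leaves out the data structure terms of the formal definition. The call graph, module names, and module size are all hypothetical.

```python
# Hypothetical call graph: module -> modules it calls
calls = {
    "main": ["parse", "report"],
    "parse": ["lookup"],
    "report": ["lookup", "format"],
    "lookup": [],
    "format": [],
}


def fan_out(module):
    """Number of distinct modules called by the given module."""
    return len(set(calls[module]))


def fan_in(module):
    """Number of modules that call the given module."""
    return sum(module in callees for callees in calls.values())


def henry_kafura(size, fi, fo):
    """Size x (fan-in x fan-out)^2."""
    return size * (fi * fo) ** 2


def shepperd(fi, fo):
    """(fan-in x fan-out)^2, without the size term."""
    return (fi * fo) ** 2


fi, fo = fan_in("report"), fan_out("report")
print(fi, fo, shepperd(fi, fo), henry_kafura(50, fi, fo))  # 1 2 4 200
```

Note that a pure look-up routine such as lookup scores zero on both measures here, because its fan-out is zero.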
32
Structure Metrics

Information flow metric (Henry and Selig)
HC = C × (fan-in × fan-out)^2
where C is the cyclomatic complexity

33
Structure Metrics

System complexity (Card and Glass)
Based on structural complexity (average fan-out
squared) and data complexity (based on number of
I/O variables and fan-out)
Quantified effect of complexity on error rate
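
The slide describes this measure only in words. The sketch below assumes the formulation usually attributed to Card and Glass: structural complexity is the mean of squared fan-out, data complexity is the mean of (I/O variables / (fan-out + 1)), and system complexity is their sum. The function name and the module data are hypothetical.

```python
def card_glass(modules):
    """modules: list of (fan_out, io_variables) pairs, one per module.
    Returns (structural, data, system) complexity under the assumed
    formulation described in the lead-in."""
    n = len(modules)
    structural = sum(f ** 2 for f, _ in modules) / n
    data = sum(v / (f + 1) for f, v in modules) / n
    return structural, data, structural + data


# Hypothetical design with four modules
structural, data, system = card_glass([(2, 6), (0, 4), (1, 2), (1, 4)])
print(structural, data, system)  # 1.5 2.25 3.75
```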

34
Module Call Graph

Module - a contiguous sequence of program
statements, bounded by boundary elements, having
an aggregate identifier
Or, a distinct, named group of LOC
The module call graph shows which modules call
each other, and what key information is passed
among them

35
Module Call Graph

Example

36
Module Coupling Measures

Average number of calls per module (ANCPM)
Fraction of modules that make calls (FMC)
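
Both coupling measures can be read off a module call graph. The sketch assumes the literal reading of the names: ANCPM is total calls divided by number of modules, and FMC is the number of modules making at least one call divided by number of modules. The call graph and function names are hypothetical.

```python
# Hypothetical call graph: module -> modules it calls
call_graph = {
    "main": ["parse", "report"],
    "parse": ["lookup"],
    "report": ["lookup", "format"],
    "lookup": [],
    "format": [],
}


def ancpm(graph):
    """Average number of calls per module."""
    return sum(len(callees) for callees in graph.values()) / len(graph)


def fmc(graph):
    """Fraction of modules that make at least one call."""
    return sum(1 for callees in graph.values() if callees) / len(graph)


print(ancpm(call_graph), fmc(call_graph))  # 1.0 0.6
```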

37
Information Flow Measures

Types of information flows
Local direct flow
Module invokes a second module and passes info to it
Invoked module returns result to the caller
Local indirect flow
Invoked module returns info that is subsequently
passed to a second invoked module
Global flow
Info flows from one module to another via a
global data structure

38
IEEE-STD-982

Number of Entries and Exits per Module, m
Like fan-in and fan-out
m = entries + exits
Software Science measures

39
IEEE-STD-982

Graph-Theoretic Complexity
Static Complexity: C = Edges - Nodes + 1
Generalized Static Complexity: Based on summing
resources needed for each module (e.g. storage,
access time, etc.)
Dynamic Complexity: Complexity as it changes over
time across a network

40
IEEE-STD-982

Cyclomatic complexity
Minimal Unit Test Case Determination
Determine number of independent paths through a
module, to get minimum number of test cases for
unit testing
Data or information flow complexity
Fan-in and fan-out of variables

41
IEEE-STD-982

Design Structure
A weighted average of six parameters
Whether designed top down (Y/N)
Module inter-dependence
Module dependence on prior processing
Database size (number of elements)
Database compartmentalization
Module single entrance and exit (Y/N)
Weighting chosen to meet project needs

42
Other Measures

Compiler measures
Size (bytes of compiled code)
Number of symbols and variables
Cross-reference of all labels
Statement count

43
Other Measures

Configuration Management Library Measures
Number of code modules
Number of versions of each module
History of change dates of each module
Module size
Number of related documents for each module

44
Availability Metrics

Most information systems are critical to
day-to-day operations
Witness the recent crash of Google making news
for only 15 minutes of non-availability
Availability depends on 1) how often the system
goes down, and 2) how long it takes to restore it
after a crash
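
The two factors above are commonly combined as availability = MTTF / (MTTF + MTTR), where MTTF is the mean time to failure and MTTR is the mean time to restore. This formula is the standard steady-state definition rather than something stated on the slide, and the function name and figures below are hypothetical.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical system: fails every 999 hours on average, 1 hour to restore
a = availability(mttf_hours=999, mttr_hours=1)
print(f"{a:.3%}")  # 99.900%
```

Halving the restore time improves availability about as much as doubling the time between failures, which is why both factors matter.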

45
Availability Metrics

Perfect availability (100%) is nice to dream of,
but realistically, higher reliability is more
expensive
Often measure availability by the number of 9s
in the desired level of availability
Two nines is 99%, three nines is 99.9%, four
nines is 99.99%, etc.
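
To make the number of nines concrete, the sketch below converts each level into allowed downtime per year, assuming a 365-day year of continuous operation. The function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def availability_for_nines(nines):
    """Two nines -> 0.99, three nines -> 0.999, and so on."""
    return 1 - 10 ** -nines


def downtime_minutes_per_year(nines):
    """Allowed downtime per year at the given number of nines."""
    return MINUTES_PER_YEAR * 10 ** -nines


for n in (2, 3, 4):
    print(n, f"{availability_for_nines(n):.2%}",
          round(downtime_minutes_per_year(n), 1), "minutes/year")
```

Three nines allows about 525.6 minutes of downtime per year, or close to 8.8 hours.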

46
Availability Metrics
47
Achieving High Availability

Many techniques are used to help ensure that high
levels of availability are possible
Duplicate systems (clustering)
RAID data duplication
Duplicate power supplies
Independent power supplies
Uninterruptible power supplies (UPS)

48
Availability and Code Quality

Capers Jones demonstrated a clear connection
between code quality (defect rate) and the
corresponding mean time to failure (MTTF), which
is a key aspect of availability
Consistent methods for measurement and
definitions of terms are needed for further
refinement

49
Customer Outage Data

In order to determine availability, the actual
customer-visible system outage time needs to be
collected
In order to get this data, the customer must
place a very high priority on availability
This data could be used to identify software
components which most reduce availability

50
Availability

We also expect that availability for a new system
should increase over the first couple of years of
its use
Defect causal analysis can help eliminate the root
causes of defects, thereby improving availability
