Modularity and Evolvability - PowerPoint PPT Presentation

1
Modularity and Evolvability
  • Terry Van Belle
  • University of New Mexico
  • Computer Science
  • April 14, 2004

2
Overview
  • How do we write software that's easy to change?
  • Inspired by evolution
  • Introduction
  • The Wagner/Altenberg Model
  • Software Metrics
  • Optimization Algorithms
  • Software Projects to optimize
  • Performance Results
  • Conclusions

3
Introduction
  • Software evolves?
  • Software adapts to environmental changes
  • Beyond version 1.0
  • Environment: user requirements
  • Evolvability: capacity to evolve
  • Evolution driven by good mutations
  • Equal fitness, different evolvabilities
  • Software Evolvability
  • Ability to change software in response to changes
    in requirements

4
Biological Modularity
  • A module is:
  • A complex of genes
  • Single purpose
  • Limited influence on other modules
  • How does biological modularity evolve?
  • Wagner/Altenberg (1995)
  • Modularity improves evolvability
  • Allows for independently evolving traits

5
Wagner/Altenberg Example

6
Wagner/Altenberg Example
[Figure: genes A, B, and C linked to the traits Coloring and Leg Length]

7
Wagner/Altenberg Example
[Figure: genes A, B, and C linked to the traits Left Side and Right Side]

8
Wagner/Altenberg Model
  • An organism's environment contains regularities
  • This leads traits to group together
  • Evolvability is best when the genome is organized
    to change like the environment
  • We want to group the changes together into modules

9
Software Archeology
  • In most software projects, we don't have access
    to traits (features)
  • But we do have the code change history
  • Surrogate for environment and trait changes
  • Exploit regularities in the change history
  • Use modules to group together frequent changes
  • Evolutionary metrics vs. Static metrics

10
Evolutionary Metrics for Modules
  • Why do we use modules?
  • Aggregation
  • Segregation
  • Module design lies in the tension between these
    forces
  • Two metrics to capture these forces
  • Breadth: average number of modules touched per change
  • Weight: average total size of the modules touched per change
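
The two metrics can be sketched in Python as follows. This is a minimal illustration, not the presenter's implementation: it assumes the change history is a list of change sets (each the set of files changed together), that `module_of` maps each file to its module, and that module size is counted in files (the slides do not say whether size means files or lines). The function names and sample data are hypothetical.

```python
from collections import Counter

def touched_modules(change, module_of):
    """Return the set of modules containing at least one changed file."""
    return {module_of[f] for f in change}

def breadth(history, module_of):
    """Average number of modules touched per change."""
    return sum(len(touched_modules(c, module_of)) for c in history) / len(history)

def weight(history, module_of):
    """Average total size (assumed: in files) of the modules touched per change."""
    size = Counter(module_of.values())  # number of files in each module
    return sum(
        sum(size[m] for m in touched_modules(c, module_of)) for c in history
    ) / len(history)

# Hypothetical example: four files in two modules, three recorded changes.
module_of = {"a.java": "m1", "b.java": "m1", "c.java": "m2", "d.java": "m2"}
history = [{"a.java"}, {"a.java", "c.java"}, {"c.java", "d.java"}]

print(breadth(history, module_of))  # (1 + 2 + 1) / 3
print(weight(history, module_of))   # (2 + 4 + 2) / 3
```

Lower values are better for both metrics: a change that touches few, small modules requires less of the code base to be understood.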

11
Calculating Breadth
[Figure: grid of files against time, marking unchanged files, changed files (x), and changed modules]
Modules touched at each time step: 2, 1, 2, 4, 1, 1, 3, 2
Breadth = 2

12
Calculating Weight
[Figure: grid of files against time, marking unchanged files, changed files (x), and changed modules]
Total size of the modules touched at each time step: 5, 4, 3, 10, 4, 1, 6, 6
Weight = 4.875

13
Evolutionary Metrics, continued
  • Breadth is trivially optimized by putting all
    files in one module
  • Weight is trivially optimized by giving every
    file its own module
  • Ideally we want to optimize both
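
The two trivial optima can be checked on a toy history. The sketch below is only an illustration under assumptions: module size is counted in files, and the helper name `metrics` and the sample data are hypothetical.

```python
def metrics(history, module_of):
    """Return (breadth, weight), with module size counted in files (an assumption)."""
    size = {}
    for m in module_of.values():
        size[m] = size.get(m, 0) + 1
    touched = [{module_of[f] for f in change} for change in history]
    n = len(history)
    breadth = sum(len(t) for t in touched) / n
    weight = sum(sum(size[m] for m in t) for t in touched) / n
    return breadth, weight

files = ["a", "b", "c", "d"]
history = [{"a"}, {"a", "c"}, {"c", "d"}]

one_module = {f: "all" for f in files}  # every file in a single module
own_module = {f: f for f in files}      # every file in its own module

print(metrics(history, one_module))  # breadth 1.0, weight 4.0
print(metrics(history, own_module))  # breadth and weight both 5/3
```

With a single module, breadth reaches its minimum of 1 but every change carries the weight of the whole project. With one module per file, weight drops to the number of files changed, but breadth rises to the same number.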

14
Evolutionary Metrics, Correlations
  • Correlation of changes between files
  • We want to group highly correlated files together
  • Use a 2x2 contingency table

            F2    !F2
      F1    A     B
      !F1   C     D

  • $r = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$
  • Set a correlation threshold parameter
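
The correlation r from the 2x2 contingency table can be sketched in Python. The sketch assumes each file's history is a sequence of per-time-step change flags, with A counting steps where both files changed, B where only F1 changed, C where only F2 changed, and D where neither changed. The function name and the choice to return 0 when the denominator is zero are assumptions, not taken from the slides.

```python
import math

def change_correlation(f1_changed, f2_changed):
    """Correlation r between two files' change flags (one flag per time step)."""
    pairs = list(zip(f1_changed, f2_changed))
    a = sum(1 for x, y in pairs if x and y)          # both changed
    b = sum(1 for x, y in pairs if x and not y)      # only F1 changed
    c = sum(1 for x, y in pairs if not x and y)      # only F2 changed
    d = sum(1 for x, y in pairs if not x and not y)  # neither changed
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # assumption: treat an undefined correlation as 0
    return (a * d - b * c) / denom

print(change_correlation([1, 1, 0, 0], [1, 1, 0, 0]))  # 1.0
print(change_correlation([1, 1, 0, 0], [0, 0, 1, 1]))  # -1.0
```

Files that always change together score 1, files that never change together score -1, and unrelated files score near 0.
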
15
Clustering Algorithm
[Figure: example pairwise change correlations between files: 0.5, -0.3, 0.1, 1.0, 0.7, 0.4, 0.7, 0.2, 0.9, -0.2, -0.1, 0.0]

16
Clustering Algorithm
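
The slides do not spell out the clustering procedure, so the following Python sketch is only one plausible reading: files are placed in the same module whenever a chain of pairs with correlation at or above the threshold connects them (connected components, computed with union-find). The function name, the dictionary format for correlations, and the sample values are all hypothetical.

```python
def cluster_by_threshold(files, corr, threshold):
    """Group files into modules: files joined by any pair with r >= threshold."""
    parent = {f: f for f in files}

    def find(f):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    for (f1, f2), r in corr.items():
        if r >= threshold:
            parent[find(f1)] = find(f2)  # merge the two groups

    modules = {}
    for f in files:
        modules.setdefault(find(f), set()).add(f)
    return sorted(sorted(m) for m in modules.values())

files = ["a", "b", "c", "d"]
corr = {("a", "b"): 0.9, ("b", "c"): 0.7, ("c", "d"): 0.2, ("a", "d"): -0.3}
print(cluster_by_threshold(files, corr, 0.5))  # [['a', 'b', 'c'], ['d']]
```

Raising the threshold splits modules apart; lowering it merges them, which is why the threshold acts as the tuning parameter.
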
17
Kernighan-Lin Algorithm
  • A greedy algorithm
  • Originally designed for partitioning a graph into
    two sub-graphs with minimum edge crossings
  • This is an NP-hard problem
  • Adapted to generate module structure
  • Use fitness instead of edge crossings
  • Fitness is a function of breadth and weight
  • Applied recursively
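
A rough Python sketch of one bisection step follows. It is not the presenter's implementation: it uses single-file moves with locking (closer to the Fiduccia-Mattheyses variant than to Kernighan-Lin's original pair swaps), and it assumes a cost of breadth + gamma * weight to be minimised, since the slides do not give the exact fitness formula. All names and the sample history are hypothetical.

```python
from collections import Counter

def cost(history, side, gamma):
    """Assumed stand-in for fitness: breadth + gamma * weight (lower is better)."""
    size = Counter(side.values())  # files on each side of the split
    total = 0.0
    for change in history:
        touched = {side[f] for f in change}
        total += len(touched) + gamma * sum(size[m] for m in touched)
    return total / len(history)

def kl_pass(files, history, side, gamma):
    """One pass: move each file once, remember the best split seen on the way."""
    side = dict(side)
    best_side, best_cost = dict(side), cost(history, side, gamma)
    unlocked = set(files)
    while unlocked:
        def cost_if_moved(f):
            trial = dict(side)
            trial[f] = 1 - trial[f]
            return cost(history, trial, gamma)
        # Take the cheapest move even if it is uphill, then lock that file.
        f = min(sorted(unlocked), key=cost_if_moved)
        side[f] = 1 - side[f]
        unlocked.remove(f)
        c = cost(history, side, gamma)
        if c < best_cost:
            best_side, best_cost = dict(side), c
    return best_side, best_cost

def bisect(files, history, gamma=1.0):
    """Repeat passes from an arbitrary start until no pass improves the cost."""
    side = {f: i % 2 for i, f in enumerate(files)}
    best = cost(history, side, gamma)
    while True:
        new_side, new_cost = kl_pass(files, history, side, gamma)
        if new_cost >= best:
            return side
        side, best = new_side, new_cost

files = ["a", "b", "c", "d"]
history = [{"a", "b"}] * 3 + [{"c", "d"}] * 3
print(bisect(files, history))  # a and b end up on one side, c and d on the other
```

Applying the same bisection again to each resulting half would give the recursive decomposition into modules.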

18
Kernighan-Lin Algorithm
[Figure: step labeled 0.3]

19
Kernighan-Lin Algorithm
[Figure: step labeled 0.2, with one X mark]

20
Kernighan-Lin Algorithm
[Figure: step labeled -0.1, with two X marks]

21
Recursive Decomposition

22
Summary so far
  • Looked at the Wagner/Altenberg model
  • Discussed some metrics
  • Breadth
  • Weight
  • Change Correlation
  • Described two algorithms
  • Clustering (correlation threshold)
  • Kernighan-Lin (gamma)
  • Let's apply it all to the real world

23
Software Projects
  • Three Open-Source Projects
  • Jikes RVM
  • A Java virtual machine
  • Jakarta Tomcat
  • Java servlet container
  • NetBeans
  • An IDE based on JavaBeans
  • The task: re-divide files into new packages
  • Change data from public CVS repository
  • Hourly granularity
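
One way to turn repository data into a change history is sketched below in Python. It assumes that "hourly granularity" means files committed within the same clock hour are treated as one change, and that commit records are already available as (timestamp, path) pairs; parsing real CVS log output is not shown. The function name and sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def hourly_change_sets(records):
    """Group (timestamp, path) records into one set of changed files per hour."""
    buckets = defaultdict(set)
    for stamp, path in records:
        # Truncate the timestamp to the start of its clock hour.
        hour = stamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].add(path)
    # Return the change sets in chronological order.
    return [buckets[h] for h in sorted(buckets)]

records = [
    (datetime(2004, 1, 5, 10, 15), "a.java"),
    (datetime(2004, 1, 5, 10, 40), "b.java"),
    (datetime(2004, 1, 5, 13, 5), "a.java"),
]
print(hourly_change_sets(records))
```

The first two records fall in the same hour and form one change set; the third forms a second one.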

24
Performances, Jikes RVM

25
Performances, Jakarta Tomcat

26
Performances, NetBeans

27
Jikes Evolution

28
Jikes Change Correlations

29
Sample Module, Jikes RVM
  • Module 4
  • rvm/src/vm/arch/intel/runtime/VM_DynamicLinkerHelper.java
  • rvm/src/vm/arch/powerPC/runtime/VM_DynamicLinkerHelper.java
  • rvm/src/vm/compilers/optimizing/ir/util/OPT_BasicBlockEnumeration.java
  • rvm/src/vm/compilers/optimizing/ir/util/OPT_IREnumeration.java
  • rvm/src/vm/compilers/optimizing/ir/util/OPT_InstructionEnumeration.java
  • Note the repeated names
  • Intel and PowerPC architecture-specific files are
    grouped together
  • Group by function, not implementation

30
Existing Structure
vm/arch/powerPC/runtime/VM_DLH.java
vm/arch/intel/runtime/VM_DLH.java
. . .

31
Refactored Structure
vm/arch/runtime/VM_IntelDLH.java
vm/arch/runtime/VM_PowerPCDLH.java
. . .

32
Advantages and Disadvantages
  • Fewer, smaller modules touched
  • Easier to coordinate many programmers
  • Diminished scope of necessary knowledge
  • Applies to languages other than Java
  • New principle of package design
  • Based on these software projects -- may be
    different for other projects
  • A solid, measurable way to determine the best
    design
  • Unfortunately, requires a substantial change
    history

33
Conclusions
  • Modularity improves Evolvability in both Software
    and Biology
  • Software history is a rich source of information
  • Optimized package structure of three projects
  • Clustering didn't produce an improvement
  • Kernighan-Lin did
  • Suggestion: group files by function, not
    implementation

34
Jikes Optimized Modularity