Title: Quality analysis of industrial systems: LaQuSo experience (Serguei Roubtsov)
1 Quality analysis of industrial systems: LaQuSo experience
Serguei Roubtsov
2 LaQuSo: Laboratory for Quality Software
- 9 employees, master students and student assistants (HG 5.91)
- industrial projects
- research projects
- own research in software quality assessment and tooling
3 Analysis approach
- Focus on maintainability
- Static code analysis, architecture assessment, code reviews, tooling
- Based on quantifiable measures: software metrics
- Provides an answer to an analysis question; for this purpose, metrics should reflect quality criteria and thresholds
- Visualisation
4 What code is hard to maintain?
- Hard to understand
- not documented
- cluttered or inconsistently used/developed
- too large
- Difficult to modify
- duplicated
- intertwined
- non-extendable
- non-portable
- Difficult to test / analyse
- too complex
5 What to analyze?
- Architecture
- dependencies (layering)
- dependency cycles
- code external duplication
- dead code
- system documentation
- Code base
- code size and complexity
- duplication metrics
- potential bugs
- documentation
- adherence to standards
6 Our Tooling
Software Quality Analysis and Visualisation Toolset
Name,Class Count,Abstract Class Count,Ca,Ce,A,I,D,V
bsh,0,0,1,0,0,0,1,1
com.caucho.burlap.client,0,0,1,0,0,0,1,1
com.caucho.burlap.io,0,0,1,0,0,0,1,1
com.caucho.burlap.server,0,0,1,0,0,0,1,1
com.caucho.hessian.client,0,0,1,0,0,0,1,1
com.caucho.hessian.io,0,0,1,0,0,0,1,1
com.caucho.hessian.server,0,0,1,0,0,0,1,1
com.ibatis.common.util,0,0,1,0,0,0,1,1
oracle.toplink.essentials.sessions,0,0,1,0,0,0,1,1
oracle.toplink.exceptions,0,0,2,0,0,0,1,1
oracle.toplink.expressions,0,0,1,0,0,0,1,1
oracle.toplink.internal.databaseaccess,0,0,1,0,0,0,1,1
oracle.toplink.jndi,0,0,1,0,0,0,1,1
oracle.toplink.logging,0,0,1,0,0,0,1,1
oracle.toplink.publicinterface,0,0,2,0,0,0,1,1
oracle.toplink.queryframework,0,0,1,0,0,0,1,1
oracle.toplink.sessionbroker,0,0,1,0,0,0,1,1
oracle.toplink.sessions,0,0,2,0,0,0,1,1
oracle.toplink.threetier,0,0,1,0,0,0,1,1
oracle.toplink.tools.sessionconfiguration,0,0,1,0,0,0,1,1
oracle.toplink.tools.sessionmanagement,0,0,1,0,0,0,1,1
org.aopalliance.aop,0,0,9,0,0,0,1,1
org.aopalliance.intercept,0,0,24,0,0,0,1,1
org.apache.axis.encoding.ser,0,0,1,0,0,0,1,1
org.apache.catalina.loader,0,0,1,0,0,0,1,1
org.aspectj.weaver,0,0,2,0,0,0,1,1
org.aspectj.weaver.ast,0,0,1,0,0,0,1,1
org.aspectj.weaver.bcel,0,0,1,0,0,0,1,1
org.aspectj.weaver.internal.tools,0,0,1,0,0,0,1,1
org.aspectj.weaver.loadtime,0,0,1,0,0,0,1,1
org.quartz.spi,0,0,1,0,0,0,1,1
org.quartz.utils,0,0,1,0,0,0,1,1
org.quartz.xml,0,0,1,0,0,0,1,1
org.springframework.aop,24,20,17,6,0.83,0.26,0.09,1
org.springframework.aop.aspectj,39,7,3,24,0.18,0.89,0.07,1
org.springframework.aop.aspectj.annotation,27,3,0,19,0.11,1,0.11,1
org.springframework.aop.aspectj.autoproxy,3,0,1,8,0,0.89,0.11,1
org.springframework.aop.config,17,3,1,15,0.18,0.94,0.11,1
org.springframework.aop.framework,37,9,22,18,0.24,0.45,0.31,1
org.springframework.jdbc.core,53,20,6,20,0.38,0.77,0.15,1
org.springframework.jdbc.core.metadata,22,2,1,10,0.09,0.91,0,1
org.springframework.jdbc.core.namedparam,10,4,3,12,0.4,0.8,0.2,1
org.springframework.jdbc.core.simple,17,6,0,12,0.35,1,0.35,1
org.springframework.jdbc.core.support,8,5,2,14,0.62,0.88,0.5,1
org.springframework.jdbc.datasource,27,7,13,14,0.26,0.52,0.22,1
org.springframework.jdbc.datasource.lookup,8,2,2,13,0.25,0.87,0.12,1
org.springframework.jdbc.object,14,8,0,12,0.57,1,0.57,1
org.springframework.jdbc.support,15,5,12,16,0.33,0.57,0.1,1
org.springframework.jdbc.support.incrementer,15,4,0,8,0.27,1,0.27,1
org.springframework.jdbc.support.lob,18,5,5,12,0.28,0.71,0.02,1
org.springframework.jdbc.support.nativejdbc,10,2,2,7,0.2,0.78,0.02,1
org.springframework.jdbc.support.rowset,4,2,2,6,0.5,0.75,0.25,1
org.springframework.jdbc.support.xml,7,6,0,7,0.86,1,0.86,1
org.springframework.web.servlet.view.xslt,4,2,0,17,0.5,1,0.5,1
org.springframework.web.struts,16,5,0,22,0.31,1,0.31,1
org.springframework.web.util,24,6,26,15,0.25,0.37,0.38,1
org.w3c.dom,0,0,12,0,0,0,1,1
org.xml.sax,0,0,3,0,0,0,1,1
[Figure: toolset diagram; labels: AV Repository, steps (1) to (4)]
7 SQuAVisiT
- Flexible
- Plug-in architecture
- Languages
- C, Cobol, Java, JavaScript, PL/SQL, Delphi, C
- Analysis tools (third party and our own)
- dependency extractors, duplication detectors, error detectors, metrics calculators, parsers, code style checkers
- Visualization tools
- MetricsView, GraphViz, ExTraVis, MatrixZoom, SolidSX, visualization modules of third-party tools
8 Real-life industrial systems
- They are often
- Heterogeneous (C/Assembler, Cobol/PL/SQL, Java/Object mapping to SQL)
- Incomplete (some code is in libraries and third-party components)
- Not compilable and executable within the analysis environment (weird OS, proprietary development environment, ...)
9 Industrial cases
- Range from 150 KLOC to 1.7 MLOC
- Homogeneous and heterogeneous
- Customers usually report the problems they experience
- Need to migrate due to discontinuation of support
- Lack of knowledge about the system due to a high degree of staff rotation
- Danger of architecture deterioration due to extensive changes
- Maintenance (dis)continuation decision
- As an illustration we discuss only some of the analyses carried out in each case.
10 Overview of industrial cases
11 Industrial case: Insurance company's expert system
- What kind of system do we have?
- Heterogeneous: JavaScript, PL/SQL, C, Java, Cobol
- Medium size: 300 KLOC
- 15 years old
- Scarce documentation
- Oracle DB
- Problem reported
- Maintenance (dis)continuation decision
12 Dependencies Model: Matrix View
Data layer
- (Almost) layered: good design
- BUT the data layer is accessed from several layers
- Layers affected by calls from the top layer are visible (red squares)
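The layering check behind this view can be sketched roughly as follows. This is a minimal illustration, not the tool's implementation: it assumes strict layering (a layer may call only the layer directly below it), and the layer names and dependency list are hypothetical.

```python
# Hypothetical layer order, top to bottom; the names are illustrative only.
LAYERS = ["presentation", "business", "data_access", "data"]


def layer_violations(dependencies, layers=LAYERS):
    """Return calls that go upward or skip over an intermediate layer."""
    rank = {name: position for position, name in enumerate(layers)}
    violations = []
    for caller, callee in dependencies:
        step = rank[callee] - rank[caller]
        if step < 0:
            violations.append((caller, callee, "upward call"))
        elif step > 1:
            violations.append((caller, callee, "skips a layer"))
    return violations


# Hypothetical extracted dependencies (caller layer, callee layer).
deps = [
    ("presentation", "business"),
    ("business", "data_access"),
    ("data_access", "data"),
    ("presentation", "data"),  # top layer reaching straight into the data layer
]
print(layer_violations(deps))
```

Under these assumptions only the direct call from the top layer into the data layer is reported, which is the kind of access the red squares highlight.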
13 Dependencies Model: ExTraVis
- Green bubbles: controversial coding approach
- Parameters as names: f(1,3) -> f_1_3
- Absence of a dedicated data access layer is confirmed
14 Code duplication
- Code is polluted with duplication: restructuring would improve maintainability but may change the architecture
CCFinder/Gemini (Toshihiro Kamiya)
15 Summary
- Layered architecture
- System is well-structured but
- JavaScript two-tier architecture could cause serious maintenance problems in the future
- Code polluted by duplication
- Low impact if no major changes are expected
- Analysis advice
- Short term
- Refactor and maintain for a limited amount of time (3-5 years)
- Develop overview documentation
- Long term
- Migrate to three-tier architecture
16 Industrial case: Embedded System
- What kind of code do we have?
- Component system with compile-time binding via make files
- C with embedded Assembler
- Complete
- Medium size: 150 KLOC
- Developers' assumption
- Layered architecture
- Problems reported
- Extensive change. Is architectural purity still preserved?
17 Dependencies Structure
- system is poorly layered
- unexpected cyclic dependencies exist between components
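One common way to find such cyclic dependencies is to compute the strongly connected components of the dependency graph. The sketch below uses Tarjan's algorithm; it is an illustration only, not necessarily the technique used in this analysis, and the component names are hypothetical.

```python
def cyclic_groups(graph):
    """Return groups of components that depend on each other cyclically,
    i.e. strongly connected components with more than one member."""
    index = {}          # discovery order of each node
    low = {}            # lowest discovery index reachable from each node
    stack = []
    on_stack = set()
    groups = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, ()):
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:
            # node is the root of a strongly connected component
            group = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.append(member)
                if member == node:
                    break
            if len(group) > 1:
                groups.append(sorted(group))

    for node in graph:
        if node not in index:
            visit(node)
    return groups


# Hypothetical component dependencies: component -> components it uses.
components = {
    "ui": ["control"],
    "control": ["drivers", "ui"],  # back reference creates a cycle
    "drivers": ["hal"],
    "hal": [],
}
print(cyclic_groups(components))
```

The recursive form is adequate for a component-level graph; a component that depends only on itself is not reported by this sketch.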
18 Summary
- Layered architecture
- System is poorly structured (indications of decay) but
- Code is of good quality, well documented
- The system is NOT large or complex
- Analysis advice
- Reengineer affected parts according to the presumed layered architecture
19 Industrial case: Pension fund
- What kind of system do we have?
- Homogeneous: Cobol
- Large: 1.7 MLOC
- 17 years old
- Oracle DB
- Problem reported
- Need to migrate due to discontinuation of support
20 Dead code?
- Empty spaces in the visualization
- 1216 modules not called by other modules
- Dead code?
- Other (sub)systems?
- 651 are dead
- Confirmed by the developers
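The first step of this analysis, listing modules that no other module calls, can be sketched as below. The module names, the call list and the set of known entry points are hypothetical. As on the slide, an uncalled module is only a candidate: it may be used by another (sub)system, so the result still needs confirmation by the developers.

```python
def uncalled_modules(calls, all_modules):
    """Modules that are never called by any other module."""
    called = {callee for caller, callee in calls if caller != callee}
    return sorted(set(all_modules) - called)


def dead_candidates(calls, all_modules, entry_points):
    """Uncalled modules that are not known entry points for other (sub)systems."""
    return [m for m in uncalled_modules(calls, all_modules) if m not in entry_points]


# Hypothetical call graph of a small system.
modules = ["MAIN", "CALC", "REPORT", "OLDCALC", "BATCH01"]
calls = [
    ("MAIN", "CALC"),
    ("MAIN", "REPORT"),
    ("OLDCALC", "OLDCALC"),  # a self-call does not count as being used
]
entry_points = {"MAIN", "BATCH01"}  # assumed to be started from outside

print(uncalled_modules(calls, modules))
print(dead_candidates(calls, modules, entry_points))
```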
21 Results of Analysis: effort
Halstead metrics
Time to understand (T) is proportional to Halstead Effort (E): T = E / 18 / 3600 (in hours)
[Figure axis: Time to understand, hours]
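For reference, a minimal sketch of how Halstead Effort and the derived time estimate are computed from operator and operand counts, using the standard Halstead definitions. The counts in the example are made up, and the function and key names are illustrative.

```python
import math


def halstead(n1, n2, N1, N2):
    """Standard Halstead measures.

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total number of operator and operand occurrences
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    # E / 18 gives seconds; dividing by 3600 converts to hours.
    hours = effort / 18 / 3600
    return {
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
        "hours": hours,
    }


# Made-up counts for a single module.
result = halstead(n1=10, n2=6, N1=30, N2=20)
print(result)
```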
22 Summary
- Architecture
- System structure is preserved but intertwined in some places
- Dead code is widely spread
- Code is polluted by duplication but
- Percentage of weak (large, complex) parts is low
- Analysis advice
- Short term
- Refactor weak parts, eliminate dead code and maintain for a limited amount of time
- Long term
- Redevelop on a modern platform
23 Industrial case: Insurance company's front-end
- What kind of system do we have?
- Technical data
- Homogeneous: Java
- Large: 750 KLOC
- Oracle DB
- J2EE application (Spring Framework, Hibernate)
- Recently developed by a third party
- No documentation
- Problem reported
- Purchase decision
24 Code not available? What can we do?
Customer
SQuAVisiT
- Install locally.
- Perform measurements.
results
report
25 Understandability: Documentation
- Comments percentage (LOCs counter tool)
- Average 43%
- Ranging from 0% to 1500%. Why?
- large repeated header blocks of comments
- commented out code
- Javadoc (CheckStyle tool)
- 85260 violations
- Missing or malformed declarations
- Documentation generation is impossible, or
- Documentation quality is compromised
- Documentation quality should be reassessed!
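A comment percentage above 100% is possible when comment lines are measured relative to code lines. The sketch below assumes that definition (the exact formula of the LOC counter tool is not given here) and uses a deliberately simple line classifier for Java-style comments that ignores comment markers inside string literals.

```python
def comment_percentage(source):
    """Comment lines as a percentage of code lines (assumed definition)."""
    comment_lines = 0
    code_lines = 0
    in_block = False
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count as neither
        if in_block:
            comment_lines += 1
            if "*/" in line:
                in_block = False
        elif line.startswith("//"):
            comment_lines += 1
        elif line.startswith("/*"):
            comment_lines += 1
            if "*/" not in line[2:]:
                in_block = True
        else:
            code_lines += 1
    if code_lines == 0:
        return float("inf") if comment_lines else 0.0
    return 100.0 * comment_lines / code_lines


# Hypothetical file: a repeated header block plus commented-out code.
sample = """\
/*
 * Repeated header block
 * more header text
 */
// int old = compute();
int x = 1;
"""
print(comment_percentage(sample))
```

Here five comment lines stand against one code line, so header blocks and commented-out code alone push the figure far above 100% without adding any useful documentation.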
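26 Dn: Distance from the main sequence
Abstractness = AbstrClasses / Classes
Instability = Out / (Out + In)
Dn = |Abstractness + Instability - 1|
[Figure: Abstractness (0 to 1) against Instability (0 to 1), showing the main sequence, the zone of uselessness and the zone of pain]
R. Martin, 1994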
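As a worked example, the definitions above can be applied to one row of the package metrics listing on slide 6: org.springframework.aop has 24 classes, 20 of them abstract, Ca = 17 incoming and Ce = 6 outgoing dependencies. The function names are illustrative; treating a package without classes or without dependencies as having a ratio of 0 is an assumption, although it agrees with the D = 1 reported in that listing for external packages such as bsh.

```python
def abstractness(abstract_classes, classes):
    """A = abstract classes / all classes (0 for a package without classes)."""
    return abstract_classes / classes if classes else 0.0


def instability(ca, ce):
    """I = Out / (Out + In), with Out = Ce (efferent) and In = Ca (afferent)."""
    total = ca + ce
    return ce / total if total else 0.0


def normalized_distance(a, i):
    """Dn = |A + I - 1|, the distance from the main sequence."""
    return abs(a + i - 1)


a = abstractness(20, 24)
i = instability(ca=17, ce=6)
dn = normalized_distance(a, i)
print(round(a, 2), round(i, 2), round(dn, 2))  # 0.83 0.26 0.09
```

The rounded values match the A, I and D columns of that row.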
27 Average Dn
[Figure: average Dn on a scale from 0.00 to 1.00, open-source benchmarks compared with our system; labelled value: 0.186]
28 What about distributions?
[Figure: % of packages beyond threshold against Dn threshold value, for an average open-source system and for our system]
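A curve of this kind can be derived from per-package Dn values as sketched below. The Dn values and thresholds are made up for illustration, and "beyond" is assumed to mean strictly greater than the threshold.

```python
def share_beyond(dn_values, threshold):
    """Percentage of packages whose Dn is strictly above the threshold."""
    beyond = sum(1 for d in dn_values if d > threshold)
    return 100.0 * beyond / len(dn_values)


def distribution(dn_values, thresholds):
    """Pairs of (threshold, % of packages beyond it)."""
    return [(t, share_beyond(dn_values, t)) for t in thresholds]


# Made-up per-package Dn values for one system.
dn_values = [0.0, 0.1, 0.2, 0.4, 0.9]
print(distribution(dn_values, [0.0, 0.25, 0.5]))
```

Comparing two such curves shows whether a system's average is driven by a few extreme packages or by many moderately distant ones, which a single average Dn cannot show.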
29 What have we seen?
- Architecture is good
- Documentation should be reassessed
30 Conclusions
- Approach comprising analysis and visualization
- Supported by SQuAVisiT, a flexible tool allowing
- To address different maintainability aspects
- To combine different analysis and visualization techniques
- Confirmed by analysis of several middle-size to large systems (150 KLOC to 1.7 MLOC)
31 Future Work
(Your?)
- Flexible SQuAVisiT data structure
- How to store data (dependencies, metrics) about historically and hierarchically different software artifacts?
- (Fully-featured) parsers/fact extractors for C, C, Delphi, ...
- Improved dependency analysis
- Analysis of dynamic bindings, injected dependencies, dependencies via global and static variables, ...
- More metrics to retrieve
- e.g. Dn-based analysis for different languages, analysis/benchmarking of distributions of different metrics