Title: Reverse Engineering of Design Patterns from Java Source Code
1Reverse Engineering of Design Patterns from Java Source Code
UC DAVIS
- Nija Shi
- shini_at_cs.ucdavis.edu
- Ron Olsson
- olsson_at_cs.ucdavis.edu
2Outline
- Design patterns vs. reverse engineering
- Reclassification of design patterns
- Pattern detection techniques
- PINOT
- Ongoing and future work
3Design Patterns
- A design pattern offers guidelines on when, how, and why an implementation can be created to solve a general problem in a particular context.
- -- Design Patterns: Elements of Reusable Object-Oriented Software, Gang of Four (GoF)
- A few well-known uses
- Singleton: Java AWT's (GUI builder) Toolkit class
- Proxy: CORBA's (middleware) proxy and real objects
- Chain of Responsibility: Tomcat's (application server) request handlers
4Reverse Engineering of Design Patterns
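[Figure: class diagram annotated with recovered pattern roles (Singleton, Composite, AbstractFactory, Strategy, Bridge); one association is labeled layoutMgr with multiplicity 1]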
5Representative Current Approaches
Tools | Language | Techniques | Case Study | Patterns Targeted
SPOOL | C++ | Database query | ET++ | Template Method, Factory Method, Bridge
DP++ | C++ | Database query | DTK | Composite, Flyweight, Class Adapter
Vokac et al. | C++ | Database query | SuperOffice CRM | Singleton, Template Method, Observer, Decorator
Antoniol et al. | C++ | Software metric | Leda, libg++, socket, galib, groff, mec | Adapter, Bridge
SPQR | C++ | Formal semantic | test programs | Decorator
Balanyi et al. | C++ | XML matching | Jikes, Leda, Star Office Calc, Writer | Builder, Factory Method, Prototype, Bridge, Proxy, Strategy, Template Method
PTIDEJ | Java | Constraint solver | java.awt.*, java.net.* | Composite, Facade
FUJABA | Java | Fuzzy logic and dynamic analysis | Java AWT | Bridge, Strategy, Composite
WoP Scanner | Java | AST query | AWT, Swing, JDBC API, etc. | Abstract Factory
HEDGEHOG | Java | Formal semantic | PatternBox, Java 1.1, 1.2 | Most GoF patterns (discussed later)
Heuzeroth et al. | Java | Dynamic analysis | Java Swing | Observer, Mediator, CoR, Visitor
KT | Smalltalk | Dynamic analysis | KT | Composite, Visitor, Template Method
MAISA | UML | UML matching | Nokia DX200 Switching System | Abstract Factory
6Current Approaches
- Limitations
- Misinterpretation of pattern definitions
- Limited detection scope on implementation variants
- Can be grouped as follows
- Targeting structural aspects
- Analyze class/method declarations
- Analyze inter-class relationships (e.g., whether one class extends another)
- Targeting behavioral aspects
- Analyze code semantics (e.g., whether a code segment is single entry)
7Targeting Structural Aspects
- Method
- Extract structural relationships (inter-class analysis)
- For a pattern, check for certain structural properties
- Drawback
- Relies only on structural relationships, which are not the only distinction between patterns
8Targeting Behavioral Aspects
- Method
- Narrow down search space
- using inter-class relationships
- Verify behavior in method bodies
- Dynamic analysis
- Machine learning
- Static program analysis
9Targeting Behavioral Aspects
- Drawback
- Dynamic analysis
- Requires good data coverage
- Verifies program behavior but does not verify the intent
- Complicates the task of detecting patterns that involve concurrency
- Machine learning
- Most patterns have concrete definitions, so learning does not solve the fundamental problem.
10A Motivating Example
    public class Singleton {
        private static Singleton instance;
        private Singleton() {}
        public static Singleton getInstance() { ... }
    }

- Detecting the Singleton Pattern
- As detected by FUJABA
- Common search criteria
- private Singleton()
- private static Singleton instance
- public static Singleton getInstance()
- Problem
- No behavioral analysis on getInstance()
- Solution?
- Possible bodies of getInstance()

    if (instance == null) instance = new Singleton(); return instance;
    instance = new Singleton(); return instance;
    return new Singleton();
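
To make the problem concrete, here is a minimal sketch that applies only the three structural search criteria to three classes, one per candidate body of getInstance(). The class names (Lazy, Overwrite, Fresh) and the helper matchesStructure are hypothetical and use Java reflection for illustration; this is not how FUJABA or PINOT is implemented. All three classes pass the structural check, yet only the first behaves as a Singleton.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class StructuralCheckDemo {
    // Variant A: lazy instantiation; every call returns the same object.
    static class Lazy {
        private static Lazy instance;
        private Lazy() {}
        public static Lazy getInstance() {
            if (instance == null) instance = new Lazy();
            return instance;
        }
    }

    // Variant B: the field is overwritten on every call.
    static class Overwrite {
        private static Overwrite instance;
        private Overwrite() {}
        public static Overwrite getInstance() {
            instance = new Overwrite();
            return instance;
        }
    }

    // Variant C: the field is never used; every call creates a new object.
    static class Fresh {
        private static Fresh instance;
        private Fresh() {}
        public static Fresh getInstance() { return new Fresh(); }
    }

    // Structural criteria only (hypothetical helper): all constructors private,
    // a private static field of the class's own type, and a public static
    // method returning that type. Method bodies are never inspected.
    static boolean matchesStructure(Class<?> c) {
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            if (!k.isSynthetic() && !Modifier.isPrivate(k.getModifiers())) return false;
        }
        boolean hasField = false;
        for (Field f : c.getDeclaredFields()) {
            int m = f.getModifiers();
            if (f.getType() == c && Modifier.isPrivate(m) && Modifier.isStatic(m)) hasField = true;
        }
        boolean hasGetter = false;
        for (Method g : c.getDeclaredMethods()) {
            int m = g.getModifiers();
            if (g.getReturnType() == c && Modifier.isPublic(m) && Modifier.isStatic(m)) hasGetter = true;
        }
        return hasField && hasGetter;
    }

    public static void main(String[] args) {
        System.out.println("Lazy:      structure=" + matchesStructure(Lazy.class)
                + ", same instance=" + (Lazy.getInstance() == Lazy.getInstance()));
        System.out.println("Overwrite: structure=" + matchesStructure(Overwrite.class)
                + ", same instance=" + (Overwrite.getInstance() == Overwrite.getInstance()));
        System.out.println("Fresh:     structure=" + matchesStructure(Fresh.class)
                + ", same instance=" + (Fresh.getInstance() == Fresh.getInstance()));
    }
}
```

Because the check never looks inside getInstance(), it cannot separate the real Singleton from the two look-alikes; that gap is what behavioral analysis is meant to close.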
11GoF Patterns Reclassified
12Language-provided Patterns
- Patterns provided in the language or library
- The Iterator Pattern
- "Provides a way to access the elements of an aggregate object sequentially without exposing its underlying representation" [GoF]
- In Java
- Enumeration since Java 1.0
- Iterator since Java 1.2
- The for-each loop since Java 1.5
- The Prototype Pattern
- Specify the kinds of objects to create using a prototypical instance, and create new objects based on this prototype
- In Java
- The clone() method in java.lang.Object
- Pattern Detection
- Recognizing variants in legacy code
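
As an illustration of the three language-provided forms of the Iterator pattern listed above, the following sketch walks the same Vector with an Enumeration, an explicit Iterator, and a for-each loop. The class and method names are hypothetical; only the java.util types come from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;

class IterationStyles {
    // Java 1.0 style: Enumeration obtained from a Vector.
    static List<String> viaEnumeration(Vector<String> v) {
        List<String> out = new ArrayList<>();
        Enumeration<String> e = v.elements();
        while (e.hasMoreElements()) out.add(e.nextElement());
        return out;
    }

    // Java 1.2 style: explicit java.util.Iterator.
    static List<String> viaIterator(Iterable<String> items) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    // Java 1.5 style: for-each loop, which uses Iterable.iterator() implicitly.
    static List<String> viaForEach(Iterable<String> items) {
        List<String> out = new ArrayList<>();
        for (String s : items) out.add(s);
        return out;
    }

    public static void main(String[] args) {
        Vector<String> v = new Vector<>(Arrays.asList("a", "b", "c"));
        System.out.println(viaEnumeration(v));
        System.out.println(viaIterator(v));
        System.out.println(viaForEach(v));
    }
}
```

All three traversals visit the elements in the same order without exposing how the Vector stores them, which is why a detector may have to recognize each of these variants in legacy code.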
13Structure-driven Patterns
- Patterns that are driven by software architecture.
- Can be identified by inter-class relationships
- The Template Method, Composite, Decorator, Bridge, Adapter, Proxy, Facade patterns
- Inter-class Relationships
- Accessibility
- Declaration
- Inheritance
- Delegation
- Aggregation
- Method invocation
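
A rough sketch of how two of these relationships, inheritance and aggregation, might be combined: a class that declares a field whose type is one of its own supertypes is a candidate wrapper (for example a Decorator or Proxy). The helper holdsOwnSupertype and the example classes are hypothetical, and reflection stands in for a compiler's symbol tables; a real detector would also need the other relationships in the list to tell the patterns apart.

```java
import java.lang.reflect.Field;

class WrapperCandidates {
    interface Shape { String draw(); }

    // Plain implementation: inherits from Shape but aggregates nothing.
    static class Circle implements Shape {
        public String draw() { return "circle"; }
    }

    // Wrapper: inherits from Shape and also holds a Shape it delegates to.
    static class Bordered implements Shape {
        private final Shape inner;
        Bordered(Shape inner) { this.inner = inner; }
        public String draw() { return "[" + inner.draw() + "]"; }
    }

    // Hypothetical structural check: does the class declare a field whose
    // type is a proper supertype of the class itself (other than Object)?
    static boolean holdsOwnSupertype(Class<?> c) {
        for (Field f : c.getDeclaredFields()) {
            Class<?> t = f.getType();
            if (t != Object.class && t != c && t.isAssignableFrom(c)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Circle:   " + holdsOwnSupertype(Circle.class));
        System.out.println("Bordered: " + holdsOwnSupertype(Bordered.class));
        System.out.println(new Bordered(new Circle()).draw());
    }
}
```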
14Behavior-driven Patterns
- Patterns that are driven by system behavior.
- Can be detected using inter-class and program analyses.
- The Singleton, Abstract Factory, Factory Method, Flyweight, CoR, Visitor, Observer, Mediator, Strategy, and State patterns.
- Program analysis techniques
- Program slicing
- Data-flow analysis
- Call trace analysis
15Domain-specific Patterns
- Patterns applied in a domain-specific context
- The Interpreter Pattern
- "Given a language, define a representation for its grammar along with an interpreter that uses the representation to interpret sentences in the language" [GoF]
- Commonly based on the Composite and Visitor patterns
- The Command Pattern
- "Encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations" [GoF]
- Combines the Bridge and Composite patterns to separate the user interface from actual command execution. The Memento pattern is also used to store a history of executed commands
- Pattern Detection
- Requires domain-specific knowledge
16Generic Concepts
- Patterns that are generic concepts
- The Builder Pattern
- "Separate the construction of a complex object from its representation so that the same construction process can create different representations" [GoF]
- System bootstrapping pattern; object creation is not necessary
- The Memento Pattern
- "Without violating encapsulation, capture and externalize an object's internal state so that the object can be restored to this state later" [GoF]
- Implementation of the memo pool and representation of states are not specifically defined.
- Pattern detection
- Lacks implementation trace
17Recognizing the Singleton Pattern
- Structural aspect
- private Singleton()
- private static Singleton instance
- public static Singleton getInstance()
- Behavioral aspect
- Analyze the behavior in getInstance()
- Check if lazy-instantiation is implemented
- Check if instance is returned
- Slice the method body for instance and analyze the sliced program
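
The slicing step can be pictured with a toy model. The sketch below assumes a much-simplified representation of a method body: a flat list of assignments, each recording the variable it defines, the variables it reads, and the variables read by its enclosing if-condition. It computes a backward slice for the returned variable and then checks whether every assignment to that variable is guarded by a test on the same variable. The Stmt class and all helper names are hypothetical; PINOT's analysis works on the Jikes AST and is not reproduced here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

class SliceSketch {
    // One assignment in a simplified method body (hypothetical model).
    static final class Stmt {
        final String text;
        final String def;            // variable assigned by this statement
        final Set<String> uses;      // variables read on the right-hand side
        final Set<String> guardUses; // variables read by the enclosing if-condition
        Stmt(String text, String def, Set<String> uses, Set<String> guardUses) {
            this.text = text; this.def = def; this.uses = uses; this.guardUses = guardUses;
        }
    }

    static Set<String> vars(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Backward slice: keep the statements that can affect `var` at method exit.
    static List<Stmt> backwardSlice(List<Stmt> body, String var) {
        Set<String> relevant = vars(var);
        LinkedList<Stmt> slice = new LinkedList<>();
        for (int i = body.size() - 1; i >= 0; i--) {
            Stmt s = body.get(i);
            if (relevant.contains(s.def)) {
                slice.addFirst(s);
                relevant.addAll(s.uses);
                relevant.addAll(s.guardUses);
            }
        }
        return slice;
    }

    // Lazy instantiation in this toy model: the slice assigns `var` at least
    // once, and every such assignment is guarded by a condition reading `var`.
    static boolean looksLazy(List<Stmt> body, String var) {
        boolean assigned = false;
        for (Stmt s : backwardSlice(body, var)) {
            if (s.def.equals(var)) {
                if (!s.guardUses.contains(var)) return false;
                assigned = true;
            }
        }
        return assigned;
    }

    // if (instance == null) instance = new Singleton(); plus an unrelated counter.
    static List<Stmt> lazyBody() {
        return Arrays.asList(
            new Stmt("calls = calls + 1", "calls", vars("calls"), vars()),
            new Stmt("instance = new Singleton()", "instance", vars(), vars("instance")));
    }

    // instance = new Singleton(); with no guard.
    static List<Stmt> overwriteBody() {
        return Arrays.asList(
            new Stmt("instance = new Singleton()", "instance", vars(), vars()));
    }

    public static void main(String[] args) {
        System.out.println("slice size (lazy): " + backwardSlice(lazyBody(), "instance").size());
        System.out.println("lazy body:      " + looksLazy(lazyBody(), "instance"));
        System.out.println("overwrite body: " + looksLazy(overwriteBody(), "instance"));
    }
}
```

The unrelated counter update drops out of the slice, so the check only has to reason about the statements that actually influence the returned field.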
18Recognizing the Singleton Pattern
    public class SingleSpoon {
        private SingleSpoon() {}
        private static SingleSpoon theSpoon;
        public static SingleSpoon getTheSpoon() {
            if (theSpoon == null)
                theSpoon = new SingleSpoon();
            return theSpoon;
        }
    }
19Pattern INference and recOvery Tool
- PINOT
- A fully automated pattern detection tool
- Designed to be faster and more accurate
- Detects structure-driven and behavior-driven patterns
- How PINOT works
[Figure: Java source code is fed to PINOT, which reports pattern instances as text or as XMI that can be viewed in UML editors]
20Implementation Alternatives
- Program analysis tools
- Extract basic information of the source code
- Class, method, and variable declarations
- Class inheritance
- Method invocations, call trace
- Variable refers-to and refers-by relationships
- Parsers
- Extract the abstract syntax tree (AST)
- Compilers
- Extract the AST and provide related symbol tables
and built-in functions operating on the AST
21Implementation Overview
- A modification of Jikes (open source C++ Java compiler)
- Analysis using Jikes abstract syntax tree (AST) and symbol tables
- Identifying Structure-driven patterns
- Considers Java language constructs
- Considers commonly used Java utility classes: java.util.Collection and java.util.Iterator
- Identifying Behavior-driven patterns
- Applies data-flow analysis, inter-procedural analysis, alias analysis
- PINOT considers related patterns
- Speeds up the process of pattern recognition
- E.g., Strategy and State Patterns, CoR and Decorator, etc.
22PINOT HEDGEHOG FUJABA
Creational Abstract Factory Yes Yes No
Creational Builder
Creational Factory Method Yes Yes No
Creational Prototype No
Creational Singleton Yes Yes Yes
Structural Adapter Yes Yes No
Structural Bridge Yes Yes Yes
Structural Composite Yes Yes No
Structural Decorator Yes Yes No
Structural Facade Yes Yes
Structural Flyweight Yes Yes No
Structural Proxy Yes Yes
Behavioral Chain of Responsibility Yes No
Behavioral Command
Behavioral Interpreter
Behavioral Iterator Yes No
Behavioral Mediator Yes No
Behavioral Memento No
Behavioral Observer Yes Yes No
Behavioral State Yes No
Behavioral Strategy Yes Yes Yes
Behavioral Template Method Yes Yes Yes
Behavioral Visitor Yes Yes
Yes: The tool provides recognition for the pattern and correctly identifies it.
No: The tool provides recognition for the pattern but fails to identify it.
Blank: The tool does not provide recognition for the pattern.
23Benchmarks
- Java AWT (GUI toolkit)
- javac (Sun Java Compiler)
- JHotDraw (GUI framework)
- Apache Ant (Build tool)
- Swing (Java Swing library)
- ArgoUML (UML editor tool)
24PINOT Results
- PINOT works well in terms of accuracy: it recognizes many pattern instances in the benchmarks.
- Like other pattern detection tools, PINOT is not perfect
- False positives
- Prototype vs. Factory Method
- PINOT does not detect the Prototype pattern
- The Prototype pattern involves object creation
- PINOT identifies implementations of clone methods as factory methods
- False negatives
- User-defined data structures
- Container structures are commonly used with Observer, Mediator, Composite, Chain of Responsibility patterns, etc.
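
The false-negative case can be illustrated with a small sketch. It assumes, for illustration only, that a detector recognizes container structures by looking for fields assignable to java.util.Collection; the helper hasCollectionField and both subject classes are hypothetical. The two subjects notify their listeners identically, but the one that keeps them in a hand-written linked list escapes the check.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class ContainerHeuristic {
    interface Listener { void update(String event); }

    // Subject that stores its observers in a java.util collection.
    static class ListSubject {
        private final List<Listener> listeners = new ArrayList<>();
        void attach(Listener l) { listeners.add(l); }
        void fire(String event) { for (Listener l : listeners) l.update(event); }
    }

    // Subject that stores its observers in a user-defined linked list.
    static class LinkedSubject {
        private static final class Node {
            final Listener listener;
            final Node next;
            Node(Listener listener, Node next) { this.listener = listener; this.next = next; }
        }
        private Node head;
        void attach(Listener l) { head = new Node(l, head); }
        void fire(String event) {
            for (Node n = head; n != null; n = n.next) n.listener.update(event);
        }
    }

    // Hypothetical container check: any declared field that is a java.util.Collection.
    static boolean hasCollectionField(Class<?> c) {
        for (Field f : c.getDeclaredFields()) {
            if (Collection.class.isAssignableFrom(f.getType())) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("ListSubject:   " + hasCollectionField(ListSubject.class));
        System.out.println("LinkedSubject: " + hasCollectionField(LinkedSubject.class));
    }
}
```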
25Pattern Interpretation
- Flyweight vs. Immutable
- Immutable classes are sharable singletons
- Mediator vs. Facade
- Colleagues participating in the Mediator pattern can have different types
- A mediator class becomes a facade against an individual colleague class
26PINOT Results
27Timing Results
- A comparison
- PTIDEJ
- 2-3 hours analyzing JHotDraw
- Platform: AMD Athlon 2GHz 64-bit processor
- FUJABA
- 22 minutes analyzing Java AWT
- Platform: Pentium III 933MHz processor with 1G of memory
28Ongoing and Future Work
- Investigate other domain-specific patterns
- High performance computing (HPC) patterns
- Real-time patterns
- Extend usability of PINOT
- Formalize pattern definitions
- Visualizing detection results
29PINOT Eclipse
30Conclusion
- Reverse engineering of design patterns
- Reclassifying the GoF patterns for reverse-engineering
- PINOT: a faster and more accurate pattern detection tool
- Ongoing and future work
- More information on our website: http://www.cs.ucdavis.edu/shini/research/pinot