Title: Huiqing Li
1Refactoring Functional Programs
- Huiqing Li
- Claus Reinke
- Simon Thompson
- Computing Lab, University of Kent
- www.cs.kent.ac.uk/projects/refactor-fp/
2Writing a program
- -- format a list of Strings, one per line
- table :: [String] -> String
- table = concat . format
-
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\t") : fomrat xs
4Writing a program
- -- format a list of Strings, one per line
- table :: [String] -> String
- table = concat . format
-
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\t") : format xs
7Writing a program
- -- format a list of Strings, one per line
- table :: [String] -> String
- table = concat . format
-
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\t") : format xs
8Writing a program
- -- format a list of Strings, one per line
- table :: [String] -> String
- table = concat . format
-
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
9Writing a program
- -- format a list of Strings, one per line
- table :: [String] -> String
- table = concat . format
-
- appNL :: [String] -> [String]
- appNL [] = []
- appNL [x] = [x]
- appNL (x:xs) = (x ++ "\n") : appNL xs
11Writing a program
- -- appNL a list of Strings, one per line
- table :: [String] -> String
- table = concat . appNL
-
- appNL :: [String] -> [String]
- appNL [] = []
- appNL [x] = [x]
- appNL (x:xs) = (x ++ "\n") : appNL xs
12Writing a program
- -- format a list of Strings, one per line
- table :: [String] -> String
- table = concat . appNL
-
- appNL :: [String] -> [String]
- appNL [] = []
- appNL [x] = [x]
- appNL (x:xs) = (x ++ "\n") : appNL xs
13Writing a program
- -- format a list of Strings, one per line
- table :: [String] -> String
- table = concat . appNL
-   where
-   appNL :: [String] -> [String]
-   appNL [] = []
-   appNL [x] = [x]
-   appNL (x:xs) = (x ++ "\n") : appNL xs
14Refactoring
- Refactoring means changing the design of a program
- without changing its behaviour.
- Refactoring comes in many forms:
- micro refactoring as a part of program development,
- major refactoring as a preliminary to revision,
- as a part of debugging, ...
- As programmers, we do it all the time.
15Not just programming
- Paper or presentation
- moving sections about; amalgamate sections; move inline code to a figure; animation; ...
- Proof
- introduce lemma; remove, amalgamate hypotheses, ...
- Program
- the topic of the lecture
16Overview of the talk
- Example refactorings: what do we learn?
- Refactoring functional programs
- Generalities
- Tooling: demo, rationale, design.
- What comes next?
- Conclusions
17Refactoring Functional Programs
- 3-year EPSRC-funded project
- Explore the prospects of refactoring functional programs
- Catalogue useful refactorings
- Look into the difference between OO and FP refactoring
- A real life refactoring tool for Haskell programming
- A formal way to specify refactorings, and a set of proofs that the implemented refactorings are correct.
- Currently mid-project: the second HaRe release is module-aware.
18Refactoring functional programs
- Semantics: can articulate preconditions and
- verify transformations.
- Absence of side effects makes big changes predictable and verifiable, unlike OO.
- Language support: expressive type system, abstraction mechanisms, HOFs, ...
- Opens up other possibilities: proof, ...
19Rename
- f x y = ...
- Renaming to f: the name may be too specific, if the function is a candidate for reuse.
- findMaxVolume x y = ...
- Renaming to findMaxVolume: makes the specific purpose of the function clearer.
- Needs scope information: just change this f and not all fs (e.g. local definitions or variables).
- Needs module information: change f wherever it is imported.
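The scope requirement can be illustrated with a minimal sketch. The definitions below are hypothetical (only the names f and findMaxVolume come from the slide): the top-level function has been renamed, while a local definition that happens to share the old name is left alone.

```haskell
-- Hypothetical module after renaming the top-level f to findMaxVolume.

-- The top-level function, formerly called f.
findMaxVolume :: Int -> Int -> Int
findMaxVolume x y = max x y

-- This use referred to the top-level f, so it is updated by the rename.
useTop :: Int
useTop = findMaxVolume 3 4

-- This f is a different, local binding, so it must NOT be renamed.
useLocal :: Int
useLocal = f 3 4
  where
    f a b = a + b
```

A purely textual search and replace would have changed the local f as well, which is why the refactoring needs scope information rather than just the program text.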
20Lift / demote
- f x y = ... h ...
-   where
-   h = ...
- Demote: hide a function which is clearly subsidiary to f; clear up the namespace.
- f x y = ... (h y) ...
-
- h y = ...
- Lift: makes h accessible to the other functions in the module and beyond.
- Needs free variable information: which of the parameters of f is used in the definition of h?
- Need h not to be defined at the top level, ..., DMR.
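A small sketch of why lifting needs free variable information. The function bodies are invented for illustration, and fLocal, fLifted and hTop are hypothetical names standing for f before the lift, f after the lift, and the lifted h.

```haskell
-- Before lifting: h is local to fLocal and uses the parameter y implicitly.
fLocal :: Int -> Int -> Int
fLocal x y = x + h
  where
    h = y * 2

-- After lifting: y was free in h, so it becomes an explicit parameter
-- and every call site passes it along.
fLifted :: Int -> Int -> Int
fLifted x y = x + hTop y

hTop :: Int -> Int
hTop y = y * 2
```

Both versions compute the same results; the lifted hTop is now visible to the rest of the module, at the cost of one extra parameter.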
21Lessons from the first examples
- Changes are not limited to a single point or even a single module: diffuse and bureaucratic,
- unlike traditional program transformation.
- Many refactorings are bidirectional,
- as there is never a unique correct design.
22How to apply refactoring?
- By hand, in a text editor
- Tedious
- Error-prone
- Depends on extensive testing
- With machine support
- Reliable
- Low cost: easy to make and un-make large changes.
- Exploratory: a full part of the programmer's toolkit.
23Machine support invaluable
- Current practice: editor + type checker (+ tests).
- Our project: automated support for a repertoire of refactorings,
- integrated into the existing development process: Haskell IDEs such as vim and emacs.
24- Demonstration of HaRe, hosted in vim.
25Proof of concept
- To show proof of concept it is enough to
- build a stand-alone tool,
- work with a subset of the language,
- pretty print the refactored source code in a standard format.
26... or a useful tool?
- To make a tool that will be used we must
- integrate with existing program development tools (the program editors emacs and vim): only add to their capabilities
- work with the complete Haskell 98 language
- preserve the formatting and comments in the refactored source code
- allow users to extend and script the system.
27Refactorings implemented in HaRe
- Rename
- Delete
- Lift (top level / one level)
- Demote
- Introduce definition
- Remove definition
- Unfold
- Generalise
- Add and remove parameters
- All these refactorings are module aware.
28Implementing HaRe: an example
- -- This is an example
- module Main where
- sumSquares x y = sq x + sq y
-     where sq :: Int->Int
-           sq x = x ^ pow
-           pow = 2 :: Int
- main = sumSquares 10 20
- Promote the definition of sq to top level
29Implementing HaRe: an example
- -- This is an example
- module Main where
- sumSquares x y = sq x + sq y
-     where sq :: Int->Int
-           sq x = x ^ pow
-           pow = 2 :: Int
- main = sumSquares 10 20
- Identify the definition of sq to be promoted
30Implementing HaRe: an example
- -- This is an example
- module Main where
- sumSquares x y = sq x + sq y
-     where sq :: Int->Int
-           sq x = x ^ pow
-           pow = 2 :: Int
- main = sumSquares 10 20
- Is sq defined at top level, here or in importing modules? Is sq imported from elsewhere?
31Implementing HaRe: an example
- -- This is an example
- module Main where
- sumSquares x y = sq x + sq y
-     where sq :: Int->Int
-           sq x = x ^ pow
-           pow = 2 :: Int
- main = sumSquares 10 20
- Does sq use anything defined locally to sumSquares?
32Implementing HaRe: an example
- -- This is an example
- module Main where
- sumSquares x y = sq pow x + sq pow y
-     where sq :: Int->Int->Int
-           sq pow x = x ^ pow
-           pow = 2 :: Int
- main = sumSquares 10 20
- If so, generalise to add these as parameters, and change type signature.
33Implementing HaRe: an example
- -- This is an example
- module Main where
- sumSquares x y = sq pow x + sq pow y
-     where pow = 2 :: Int
- sq :: Int->Int->Int
- sq pow x = x ^ pow
- main = sumSquares 10 20
- Finally, move the definition to top level.
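As a check that the promotion preserves behaviour, the two end points of this example can be placed side by side. This is only a sketch: sumSquaresBefore is a hypothetical name introduced so that both versions can live in one file, and explicit type signatures have been added.

```haskell
-- Original version: sq is local and uses the local definition pow.
sumSquaresBefore :: Int -> Int -> Int
sumSquaresBefore x y = sq' x + sq' y
  where
    sq' :: Int -> Int
    sq' n = n ^ pow
    pow = 2 :: Int

-- Refactored version: sq lives at top level and takes pow as a parameter.
sumSquares :: Int -> Int -> Int
sumSquares x y = sq pow x + sq pow y
  where
    pow = 2 :: Int

sq :: Int -> Int -> Int
sq pow x = x ^ pow
```

Evaluating sumSquares 10 20 and sumSquaresBefore 10 20 in GHCi should give the same value, 500, which is the kind of equivalence the pre-condition checks are meant to guarantee.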
34The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
35Information needed
- Syntax: replace the function called sq, not the variable sq (parse tree).
- Static semantics: replace this function sq, not all the sq functions (scope information).
- Module information: what is the traffic between this module and its clients (call graph).
- Type information: replace this identifier when it is used at this type (type annotations).
36Infrastructure
- To achieve this we chose to
- build a tool that can interoperate with emacs, vim, ... yet act separately.
- leverage existing libraries for processing Haskell 98, for tree transformation, ... yet
- modify them as little as possible.
- be as portable as possible, in the Haskell space.
37The Haskell background
- Libraries
- parser: many
- type checker: few
- tree transformations: few
- Difficulties
- Haskell 98 vs. Haskell extensions.
- Libraries: proof of concept vs. distributable.
- Source code regeneration.
- Real project
38Programatica
- Project at OGI to build a Haskell system
- with integral support for verification at various levels: assertion, testing, proof etc.
- The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis
- freely available under BSD licence.
39The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
40First steps lifting and friends
- Use the Haddock parser: full Haskell given in 500 lines of data type definitions.
- Work by hand over the Haskell syntax: 27 cases for expressions ...
- Code for finding free variables, for instance ...
41Finding free variables: 100 lines
- instance FreeVbls HsExp where
-   freeVbls (HsVar v) = [v]
-   freeVbls (HsApp f e)
-     = freeVbls f ++ freeVbls e
-   freeVbls (HsLambda ps e)
-     = freeVbls e \\ concatMap paramNames ps
-   freeVbls (HsCase exp cases)
-     = freeVbls exp ++ concatMap freeVbls cases
-   freeVbls (HsTuple _ es)
-     = concatMap freeVbls es
-   ... etc.
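The same idea can be tried on a much smaller syntax. The sketch below is not the HaRe code: Exp is a hypothetical four-constructor expression type standing in for HsExp, and lambda parameters are plain names rather than patterns.

```haskell
import Data.List (nub, (\\))

-- A tiny, hypothetical expression type standing in for HsExp.
data Exp
  = Var String
  | App Exp Exp
  | Lambda [String] Exp
  | Tuple [Exp]
  deriving (Show)

-- Free variables, written by hand with one case per constructor.
-- Results are kept duplicate-free with nub, because (\\) removes only
-- the first occurrence of each element.
freeVbls :: Exp -> [String]
freeVbls (Var v) = [v]
freeVbls (App f e) = nub (freeVbls f ++ freeVbls e)
freeVbls (Lambda ps e) = freeVbls e \\ ps
freeVbls (Tuple es) = nub (concatMap freeVbls es)
```

For example, freeVbls (Lambda ["x"] (App (Var "f") (Tuple [Var "x", Var "y", Var "x"]))) yields ["f","y"]. Even at this size, every constructor needs its own equation, which is the boilerplate problem that grows with the full Haskell syntax.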
42This approach
- Boiler plate code
- 1000 lines for 100 lines of significant code.
- Error prone: significant code lost in the noise.
- Want to generate the boiler plate and the tree traversals
- DrIFT: Winstanley, Wallace
- Strafunski: Lämmel and Visser
43Strafunski
- Strafunski allows a user to write general (read generic), type safe, tree traversing programs
- with ad hoc behaviour at particular points.
- Traverse through the tree accumulating free variables from component parts, except in the case of lambda abstraction, local scopes, ...
- Strafunski allows us to work within Haskell; other options are under development.
44Rename an identifier
- rename :: (Term t) => PName -> HsName -> t -> Maybe t
- rename oldName newName = applyTP worker
-   where
-   worker = full_tdTP (idTP `adhocTP` idSite)
-
-   idSite :: PName -> Maybe PName
-   idSite v@(PN name orig)
-     | v == oldName = return (PN newName orig)
-   idSite pn = return pn
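The shape of this traversal (a generic walk over the whole tree, with ad hoc behaviour at one type) can be sketched without Strafunski. The version below swaps in Data.Data from GHC's base library instead; Expr, mkT, everywhere and renameVar are defined here for illustration and are not part of HaRe or Strafunski.

```haskell
{-# LANGUAGE RankNTypes, DeriveDataTypeable #-}
import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)

-- A small, hypothetical syntax tree.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  deriving (Show, Eq, Data)

-- Turn a function on one type into a function on every type:
-- it acts at its own type and is the identity everywhere else.
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = case cast f of
  Just g -> g
  Nothing -> id

-- Apply a generic transformation at every node, bottom-up.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

-- Rename variable occurrences; binders and scoping are ignored here.
renameVar :: String -> String -> Expr -> Expr
renameVar old new = everywhere (mkT site)
  where
    site (Var v) | v == old = Var new
    site e = e
```

Note that renameVar knows nothing about scope: renaming "x" in App (Var "x") (Lam "x" (Var "z")) gives App (Var "y") (Lam "x" (Var "z")), and it would just as happily rename a variable bound by an inner lambda. Ruling that out is the job of the side conditions, not of the traversal.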
45The coding effort
- Transformations with Strafunski are straightforward
- the chore is implementing conditions that guarantee that the transformation is meaning-preserving.
- This is where the bulk of our code lies.
46The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
47Program rendering example
- -- This is an example
- module Main where
- sumSquares x y = sq x + sq y
-     where sq :: Int->Int
-           sq x = x ^ pow
-           pow = 2 :: Int
- main = sumSquares 10 20
- Promote the definition of sq to top level
48Program rendering example
- module Main where
- sumSquares x y
-   = sq pow x + sq pow y where pow = 2 :: Int
- sq :: Int->Int->Int
- sq pow x = x ^ pow
- main = sumSquares 10 20
- Using a pretty printer: comments lost and layout quite different.
50Program rendering example
- -- This is an example
- module Main where
- sumSquares x y = sq pow x + sq pow y
-     where pow = 2 :: Int
- sq :: Int->Int->Int
- sq pow x = x ^ pow
- main = sumSquares 10 20
- Layout and comments preserved.
51Rendering our approach
- White space and comments in the token stream.
- 2 views of the program: token stream and AST.
- Modification of the AST guides the modification of the token stream.
- After a refactoring, the program source is extracted from the token stream, not the AST.
- Use heuristics to associate comments with semantic entities.
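A toy version of the token-stream idea, under strong simplifying assumptions: the lexer below is hypothetical and far cruder than a Haskell lexer (it only separates identifier-like words from everything else), but it shows how source rendered from tokens keeps white space and comments exactly as written.

```haskell
import Data.Char (isAlphaNum)

-- A token is either an identifier-like word or a run of anything else
-- (white space, operators, comment markers).
data Token = Word String | Other String
  deriving (Show, Eq)

isWordChar :: Char -> Bool
isWordChar c = isAlphaNum c || c == '_' || c == '\''

-- Split the source into tokens without discarding any characters.
tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c : _)
  | isWordChar c = let (w, rest) = span isWordChar s in Word w : tokenize rest
  | otherwise = let (o, rest) = break isWordChar s in Other o : tokenize rest

-- Rendering is just concatenation, so layout is preserved.
render :: [Token] -> String
render = concatMap text
  where
    text (Word w) = w
    text (Other o) = o

-- Rename whole-word occurrences of an identifier in the token stream.
renameToken :: String -> String -> [Token] -> [Token]
renameToken old new = map step
  where
    step (Word w) | w == old = Word new
    step t = t
```

Because render . tokenize is the identity on the source text, only the renamed tokens differ in the output. This sketch would also rename a matching word inside a comment; deciding which tokens to touch is where the AST has to guide the token-stream update.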
52Production tool (version 0)
Programatica parser and type checker
Refactor using a Strafunski engine
Render code from the token stream and syntax tree.
53Production tool (version 1)
Programatica parser and type checker
Refactor using a Strafunski engine
Render code from the token stream and syntax tree.
Pass lexical information to update the syntax
tree and so avoid reparsing
54Module awareness example
- Move a top-level definition f from module A to B.
- -- Is f defined at the top-level of B?
- -- Are the free variables in f accessible within module B?
- -- Will the move require recursive modules?
- -- Remove the definition of f from module A.
- -- Add the definition to module B.
- -- Modify the import/export in module A, B and the client modules of A and B if necessary.
- -- Change uses of A.f to B.f or f in all affected modules.
- -- Resolve ambiguity.
55What have we learned?
- Emerging Haskell libraries make it a practical platform.
- Efficiency issues: type checking large systems.
- Limitations of IDE interactions in vim and emacs.
- Reflections on Haskell itself.
56Reflecting on Haskell
- Cannot hide items in an export list (though you can on import).
- The formal semantics of pattern matching is problematic.
- Ambiguity vs. name clash.
- Tab is a nightmare!
- Correspondence principle fails ...
57Correspondence
- Operations on definitions and operations on expressions can be placed in correspondence
- (R.D.Tennent, 1980)
58Correspondence
- Definitions
- where
- f x y = e
- f x
-   | g1 = e1
-   | g2 = e2
- Expressions
- let
- \x y -> e
- f x = if g1 then e1 else if g2 ...
59Where do we go next?
- Larger-scale examples: ADTs, monads, ...
- An API for do-it-yourself refactorings, or
- a language for composing refactorings
- Detecting bad smells
- Evolving the evidence: GC6.
60What do users want?
- Find and remove duplicate code.
- Argument permutations.
- Data refactorings.
- More traditional program transformations.
- Monadification.
61Monadification (cf Erwig)
- do v1 <- e1
-    v2 <- e2
-    r <- f v1 v2
-    return r
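A concrete instance of this shape, as a sketch only: the Maybe monad and the functions safeDiv and calc are assumptions chosen for illustration, with safeDiv playing the role of f and the two arguments of calc playing the roles of e1 and e2.

```haskell
-- A hypothetical monadic operation standing in for f.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- The monadified computation: each subexpression is bound in turn,
-- then the monadic function is applied to the results.
calc :: Maybe Int -> Maybe Int -> Maybe Int
calc e1 e2 = do
  v1 <- e1
  v2 <- e2
  r <- safeDiv v1 v2
  return r
```

Here calc (Just 10) (Just 2) is Just 5, while calc (Just 1) (Just 0) is Nothing: the sequencing that was implicit in a pure expression such as f e1 e2 has become explicit.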
62Larger-scale examples
- More complex examples in the functional domain often link with data types.
- Dawning realisation that some refactorings can be pretty powerful.
- Bidirectional: no right answer.
63Algebraic or abstract type?
flatten :: Tr a -> [a]
flatten (Leaf x) = [x]
flatten (Node _ s t) = flatten s ++ flatten t
Tr Leaf Node
data Tr a = Leaf a | Node a (Tr a) (Tr a)
64Algebraic or abstract type?
Tr isLeaf isNode leaf left right mkLeaf mkNode
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t = flatten (left t) ++ flatten (right t)
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
isNode = ...
65Algebraic or abstract type?
- Towards an algebraic type:
- Pattern matching syntax is more direct
- but can achieve a considerable amount with field names.
- Other reasons? Simplicity (due to other refactoring steps?).
- Towards an abstract type:
- Allows changes in the implementation type without affecting the client: e.g. might memoise
- Problematic with a primitive type as carrier.
- Allows an invariant to be preserved.
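The two styles can be compared in one runnable sketch. The interface names (isLeaf, isNode, leaf, left, right, mkLeaf, mkNode) come from the slides; their bodies, and the names flattenPM and flattenAbs for the two versions of flatten, are assumptions made for illustration. In a real abstract type the constructors would be hidden by the module's export list.

```haskell
data Tr a = Leaf a | Node a (Tr a) (Tr a)

-- The abstract interface: discriminators, selectors and constructors.
isLeaf, isNode :: Tr a -> Bool
isLeaf (Leaf _) = True
isLeaf _ = False
isNode = not . isLeaf

leaf :: Tr a -> a
leaf (Leaf x) = x
leaf _ = error "leaf: not a Leaf"

left, right :: Tr a -> Tr a
left (Node _ l _) = l
left _ = error "left: not a Node"
right (Node _ _ r) = r
right _ = error "right: not a Node"

mkLeaf :: a -> Tr a
mkLeaf = Leaf

mkNode :: a -> Tr a -> Tr a -> Tr a
mkNode = Node

-- Client code against the algebraic type: direct pattern matching.
flattenPM :: Tr a -> [a]
flattenPM (Leaf x) = [x]
flattenPM (Node _ s t) = flattenPM s ++ flattenPM t

-- Client code against the abstract type: only the interface is used.
flattenAbs :: Tr a -> [a]
flattenAbs t
  | isLeaf t = [leaf t]
  | otherwise = flattenAbs (left t) ++ flattenAbs (right t)
```

Both clients collect the leaf values, so on mkNode 0 (mkLeaf 1) (mkLeaf 2) each returns [1,2]. Only flattenAbs would survive a change of representation unchanged.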
66Outside or inside?
Tr isLeaf isNode leaf left right mkLeaf mkNode
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t = flatten (left t) ++ flatten (right t)
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
67Outside or inside?
Tr isLeaf isNode leaf left right mkLeaf mkNode flatten
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
flatten = ...
68Outside or inside?
- For outside:
- If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten.
- The more outside the better, therefore.
- For inside:
- If inside, can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type.
- Layered types possible: put the utilities in a privileged zone.
69API
Refactorings
Refactoring utilities
Strafunski
Haskell
70DSL
Combining forms
Refactorings
Refactoring utilities
Strafunski
Haskell
71Detecting bad smells
- Work by Chris Ryder
72Evolving the evidence
- Dependable System Evolution is the software engineering grand challenge.
- Build systems with evidence of their dependability
- but this begs the question of how to evolve the evidence in line with the system.
- Refactoring proofs, test coverage data etc.
73Teaching and learning design
- Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.
- Learning a language: learn how you could modify the programs that you have written
- appreciate the design space, and
- the features of the language.
74Conclusions
- Refactoring + functional programming: good fit.
- Practical tool: not yet another type tweak.
- Leverage from available libraries, with work.
- We have begun to use the tool in building itself!
- Much more to do than we have time for.
- Martin Fowler's "Rubicon": "extract definition" in HaRe version 1; fp productivity.
75- www.cs.kent.ac.uk/projects/refactor-fp/