Title: Tools for Refactoring Functional Programs
1Tools for Refactoring Functional Programs
- Simon Thompson
- with
- Huiqing Li
- Claus Reinke
- www.cs.kent.ac.uk/projects/refactor-fp
2Design
- Models
- Prototypes
- Design documents
- Visible artifacts
3All in the code
- Functional programs embody their design in their
code. - This is enabled by their high-level nature
constructs, types
data Message Message Head Body data Head
Head Metadata Title data Metadata Metadata
Tags type Title String
4Evolution
- Successful systems are long lived
- and evolve continuously.
- Supporting evolution of code and design?
5Soft-Ware
- Theres no single correct design
- different options for different situations.
- Maintain flexibility as the system evolves.
6Refactoring
- Refactoring means changing the design or
structure of a program without changing its
behaviour.
Refactor
Modify
7Not just programming
- Paper or presentation
- moving sections about amalgamate sections move
inline code to a figure animation - Proof
- add lemma remove, amalgamate hypotheses,
- Program
- the topic of the lecture
8Splitting a function in two
9Splitting a function in two
10Splitting a function in two
11Splitting a function
- module Split where
- f String -gt String
- f ys foldr () y"\n" y lt- ys
12Splitting a function
- module Split where
- f String -gt String
- f ys foldr () y"\n" y lt- ys
13Splitting a function
- module Split where
- f String -gt String
- f ys join y "\n" y lt- ys
- where
- join foldr ()
14Splitting a function
- module Split where
- f String -gt String
- f ys join y "\n" y lt- ys
- where
- join foldr ()
15Splitting a function
- module Split where
- f String -gt String
- f ys join addNL
- where
- join zs foldr () zs
- addNL y "\n" y lt- ys
16Splitting a function
- module Split where
- f String -gt String
- f ys join addNL
- where
- join zs foldr () zs
- addNL y "\n" y lt- ys
17Splitting a function
- module Split where
- f String -gt String
- f ys join (addNL ys)
- where
- join zs foldr () zs
- addNL ys y "\n" y lt- ys
18Splitting a function
- module Split where
- f String -gt String
- f ys join (addNL ys)
- where
- join zs foldr () zs
- addNL ys y "\n" y lt- ys
19Splitting a function
- module Split where
- f String -gt String
- f ys join (addNL ys)
-
- join zs foldr () zs
- addNL ys y "\n" y lt- ys
20Overview
- Example refactorings what they involve.
- Building the HaRe tool.
- Design rationale.
- Infrastructure.
- Haskell and Erlang.
- The Wrangler tool.
- Conclusions.
21Haskell 98
- Standard, lazy, strongly typed, functional
programming language. - Layout is significant offside rule and
idiosyncratic.
doSwap pnt applyTP (full_buTP (idTP adhocTP
inMatch
adhocTP inExp
adhocTP inDecl)) where inMatch
((HsMatch loc fun pats rhs ds)HsMatchP)
fun pnt case pats of
(p1p2ps) -gt do pats'lt-swap p1 p2 pats
return (HsMatch loc
fun pats' rhs ds) _ -gt
error "Insufficient arguments to swap."
inMatch m return m inExp exp_at_((Exp
(HsApp (Exp (HsApp e e1)) e2))HsExpP)
expToPNT e pnt swap e1 e2 exp
inExp e return e
22Why refactor Haskell?
- The only design artefact is (in) the code.
-
- Semantics of functional languages support
large-scale transformations (?) - Building real tools to support functional
programming heavy lifting. - Platform for research and experimentation.
23Lift / demote
- f x y h
- where
- h
- ?
- Hide a function which is clearly subsidiary to f
clear up the namespace.
- f x y (h y)
-
- h y
- ?
- Makes h accessible to the other functions in the
module and beyond.
Free variables which parameters of f are used in
h? Need h not to be defined at the top level, ,
Type of h will generally change .
24Algebraic or abstract type?
data Tr a Leaf a Node a (Tr a) (Tr a)
flatten Tr a -gt a flatten (Leaf x)
x flatten (Node s t) flatten s flatten
t
Tr Leaf Node
25Algebraic or abstract type?
Tr isLeaf isNode leaf left right mkLeaf mkNode
data Tr a Leaf a Node a (Tr a) (Tr
a) isLeaf isNode
flatten Tr a -gt a flatten t isleaf t
leaf t isNode t flatten (left t)
flatten (right t)
26Information required
- Lexical structure of programs,
- abstract syntax,
- binding structure,
- type system and
- module system.
27Program transformations
- Program optimisation source-to-source
transformations to get more efficient code - Program derivation calculating efficient code
from obviously correct specifications - Refactoring transforming code structure usually
bidirectional and conditional. - Refactoring Transformation Condition
28Conditions renaming f to g
- No change to the binding structure
- No two definitions of g at the same level.
- No capture of g.
- No capture by g.
29Capture of renamed identifier
30Capture by renamed identifier
- h x h f g
- where
- f y f g
- g x
- h x h g g
- where
- g y g g
- g x
31Refactoring by hand?
- By hand in a text editor
- Tedious
- Error-prone
- Implementing the transformation
- and the conditions.
- Depends on compiler for type checking,
- plus extensive testing.
32Machine support invaluable
- Reliable
- Low cost of do / undo, even for large
refactorings. - Increased effectiveness and creativity.
33- Demonstration of HaRe, hosted in vim.
34(No Transcript)
35(No Transcript)
36(No Transcript)
37The refactorings in HaRe
Move def between modules Delete/add to
exports Clean imports Make imports explicit data
type to ADT Short-cut, warm fusion All module
aware
- Rename
- Delete
- Lift / Demote
- Introduce definition
- Remove definition
- Unfold
- Generalise
- Add/remove parameters
38HaRe design rationale
- Integrate with existing development tools.
- Work with the complete language Haskell 98
- Preserve comments and the formatting style.
- Reuse existing libraries and systems.
- Extensibility and scriptability.
39Information required
- Lexical structure of programs,
- abstract syntax,
- binding structure,
- type system and
- module system.
40The Implementation of HaRe
Information gathering
Pre-condition checking
Strafunski
Program transformation
Program rendering
41Finding free variables by hand
- instance FreeVbls HsExp where
- freeVbls (HsVar v) v
- freeVbls (HsApp f e)
- freeVbls f freeVbls e
- freeVbls (HsLambda ps e)
- freeVbls e \\ concatMap paramNames ps
- freeVbls (HsCase exp cases)
- freeVbls exp concatMap freeVbls cases
- freeVbls (HsTuple _ es)
- concatMap freeVbls es
- Boilerplate code 1000 noise 100 significant.
-
42Strafunski
- Strafunski allows a user to write general (read
generic), type safe, tree traversing programs,
with ad hoc behaviour at particular points. - Top-down / bottom up, type preserving / unifying,
full
stop
one
43Strafunski in use
- Traverse the tree accumulating free variables
from components, except in the case of lambda
abstraction, local scopes, - Strafunski allows us to work within Haskell
- Other options? Generic Haskell, Template Haskell,
AG, Scrap Your Boilerplate,
44Rename an identifier
- rename (Term t)gtPName-gtHsName-gtt-gtMaybe t
- rename oldName newName applyTP worker
- where
- worker full_tdTP (idTP adhocTP
idSite) -
- idSite PName -gt Maybe PName
- idSite v_at_(PN name orig)
- v oldName
- return (PN newName orig)
- idSite pn return pn
45The coding effort
- Transformations straightforward in Strafunski
- the chore is implementing conditions that the
transformation preserves meaning. - This is where much of our code lies.
46Program rendering example
- -- This is an example
- module Main where
- sumSquares x y sq x sq y
- where sq Int-gtInt
- sq x x pow
- pow 2 Int
- main sumSquares 10 20
47Token stream and AST
- White space comments only in token stream.
- Modification of the AST guides the modification
of the token stream. - After a refactoring, the program source is
recovered from the token stream not the AST. - Heuristics associate comments with program
entities.
48Work in progress
- Fold against definitions find duplicate code.
- All, some or one? Effect on the interface
- f x e e
- Symbolic evaluation
- Data refactorings
- Interfaces bad smell detection.
49API and DSL
Combining forms
???
Refactorings
Refactoring utilities
Library functions Grammar as data Strafunski
Strafunski
Haskell
50What have we learned?
- Efficiency and robustness of libraries in
question. - type checking large systems,
- linking,
- editor script languages (vim, emacs).
- The cost of infrastructure in building practical
tools. - Reflections on Haskell itself.
51Reflections on Haskell
- Cannot hide items in an export list (cf import).
- Field names for prelude types?
- Scoped class instances not supported.
- Ambiguity vs. name clash.
- Tab is a nightmare!
- Correspondence principle fails
52Correspondence
- Operations on definitions and operations on
expressions can be placed in one to one
correspondence - (R.D.Tennent, 1980)
53Correspondence
- Definitions
- where
- f x y e
- f x
- g1 e1
- g2 e2
- Expressions
- let
- \x y -gt e
- f x if g1 then e1 else if g2
54Function clauses
- f x
- g1 e1
- f x
- g2 e2
- Can fall through a function clause no direct
correspondence in the expression language.
- f x if g1 then e1 else if g2
- No clauses for anonymous functions no reason to
omit them.
55Haskell 98 vs. Erlang generalities
- Haskell 98 a lazy, statically typed, purely
functional programming language featuring
higher-order functions, polymorphism, type
classes and monadic effects.
- Erlang a strict, dynamically typed functional
programming language with support for
concurrency, communication, distribution and
fault-tolerance.
56Haskell 98 vs. Erlang example
-- Factorial In Haskell. module Fact(fac)
where fac Int -gt Int fac 0 1 fac n ngt0
n fac(n-1)
Factorial In Erlang. -module (fact). -export
(fac/1). fac(0) -gt 1 fac(N) when N gt 0 -gt N
fac(N-1).
57Haskell 98 vs. Erlang pragmatics
- Type system makes implementation complex.
- Layout and comment preservation.
- Types also affect the refactorings themselves.
-
- Clearer semantics for refactorings, but more
complex infrastructure.
- Untyped traversals much simpler.
- Use the layout given by emacs.
- Use cases which cannot be understood statically.
- Dynamic semantics of Erlang makes refactorings
harder to pin down.
58Challenges of Erlang refactoring
- Multiple binding occurrences of variables.
- Indirect function call or function spawn
apply (lists, rev, a,b,c) - Multiple arities  multiple functions rev/1
- Concurrency
- Refactoring within a design library OTP.
- Side-effects.
59Generalisation and side-effects
-module (test). -export(f/0). repeat(0) -gt
ok repeat(N) -gt ioformat (hello\n"),
repeat(N-1). f( ) -gt repeat(5).
-module (test). -export(f/0). repeat(A, 0) -gt
ok repeat(A, N) -gt A,
repeat(A,N-1). f( ) -gt repeat (ioformat
(hello\n), 5).
60Generalisation and side-effects
-module (test). -export(f/0). repeat(0) -gt
ok repeat(N) -gt ioformat (hello\n"),
repeat(N-1). f( ) -gt repeat(5).
-module (test). -export(f/0). repeat(A, 0) -gt
ok repeat(A, N) -gt A(),
repeat(A,N-1). f( ) -gt repeat (fun( )-gt
ioformat (hello\n), 5).
61The Wrangler
Program source
Scanner/Parser
Parse Tree
Syntax tools
AST annotated with comments
Refactorer
AST comments binding structure
Program analysis and transformation by the
refactorer
Transformed AST
Pretty printer
Program source
62Teaching and learning design
- Exciting prospect of using a refactoring tool as
an integral part of an elementary programming
course. - Learning a language learn how you could modify
the programs that you have written - appreciate the design space, and
- the features of the language.
63Conclusions
- Refactoring functional programming good fit.
- Real win from available libraries with work.
- Substantial effort in infrastructure.
- De facto vs de jure GHC vs Haskell 98.
- Correctness and verification
- Language independence