Title: Parametric Polymorphism for Popular Programming Languages
1 Parametric Polymorphism for Popular Programming Languages
- Andrew Kennedy
- Microsoft Research Cambridge
2 Or: Forall for all
- Andrew Kennedy
- Microsoft Research Cambridge (joint work with Don Syme)
3 Curriculum Vitae for FOOLs
http://research.microsoft.com/akenn
4 Parametric polymorphism
- Parameterize types and code by types
- Concept: Strachey (1967)
- Language: ML (Milner, 1975), Clu (Liskov, 1975)
- Foundations: System F (Girard, 1971), polymorphic lambda calculus (Reynolds, 1974)
- Engineering benefits are well-known (code re-use, strong typing)
- Implementation techniques are well-researched
5 Polymorphic Programming Languages
Standard ML
Eiffel
OCaml
C++
Ada
Clu
GJ
Haskell
Mercury
Miranda
Pizza
6 Widely-used Polymorphic Programming Languages
C++
7 Widely-used Strongly-typed Polymorphic Programming Languages
8 In 2004?
C#
Visual Basic?
Java
Cobol, Fortran, ?
9 This talk
- The .NET generics project
- What was challenging?
- What was surprising?
- What's left?
10 What is the .NET CLR (Common Language Runtime)?
- For our purposes, the CLR:
- Executes MS-IL (Intermediate Language) programs using just-in-time or way-ahead-of-time compilation
- Provides an object-oriented common type system
- Provides managed services: garbage collection, stack-walking, reflection, persistence, remote objects
- Ensures security through type-checking (verification) and code access security (permissions, stack inspection)
- Supports multiple source languages and interop between them
11 Themes
- Design: Can multiple languages be accommodated by a single design? What were the design trade-offs?
- Implementation: How can run-time types be implemented efficiently?
- Theory: How expressive is it?
- Practice: Would you like to program in it?
- Future: Have we done enough?
12 Timeline of generics project
May 1999: Don Syme presents proposal to C# and CLR teams
Feb 2000: Initial prototype of extension to CLR
Feb 2001: Product release of .NET v1.0
Jan 2002: Our code is integrated into the product team's code base
Nov 2002: Anders Hejlsberg announces generics at OOPSLA'02
late 2004?: Product release of .NET v1.2 with generics
14 Design for multiple languages
C++: Give me template specialization
C++: Can I write class C<T> : T
C#: Just give me decent collection classes
C++: And template meta-programming
Visual Basic: Don't touch my language!
Java: Run-time types please
Eiffel: All generic types covariant please
ML: Functors are cool!
Haskell: Rank-n types? Existentials? Kinds? Type classes?
Scheme: Why should I care?
15 Some design goals
- Simplicity: Don't surprise the programmer with odd restrictions
- Consistency: Fit with the object model of .NET
- Separate compilation: Type-check once, instantiate anywhere
16 Non-goals
- C++ style template meta-programming: Leave this to source-language compilers
- Higher-order polymorphism, existentials: Hey, let's get the basics right first!
17 What's in the design?
- Type parameterization for all declarations
- classes, e.g. class Set<T>
- interfaces, e.g. interface IComparable<T>
- structs, e.g. struct HashBucket<K,D>
- methods, e.g. static void Reverse<T>(T[] arr)
- delegates (first-class methods), e.g. delegate void Action<T>(T arg)
18 What's in the design (2)?
- Bounds on type parameters
- single class bound (must extend), e.g. class Grid<T> where T : Control
- multiple interface bounds (must implement), e.g. class Set<T> where T : IComparable<T>
19 Simplicity => no odd restrictions
interface IComparable<T> { int CompareTo(T other); }
class Set<T> : IEnumerable<T> where T : IComparable<T> {
  private TreeNode<T> root;
  public static Set<T> empty = new Set<T>();
  public void Add(T x) { ... }
  public bool HasMember(T x) { ... }
}
Set<Set<int>> s = new Set<Set<int>>();
Interfaces and superclass can be instantiated
Bounds can reference type parameter (F-bounded polymorphism)
Even statics can use type parameter
Type arguments can be value or reference types
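The points about statics and value-type arguments can be seen in a small sketch. The class name Registry and its members are hypothetical, chosen only for illustration; the sketch assumes the behaviour of shipped C# generics, where each instantiation gets its own copy of a static field.

```csharp
using System;

// Hypothetical example class: statics whose types mention the type parameter.
class Registry<T>
{
    public static T Last;      // static field of type T
    public static int Count;   // one copy per instantiation of Registry<T>

    public static void Record(T x)
    {
        Last = x;
        Count++;
    }
}

static class StaticsDemo
{
    static void Main()
    {
        // Value-type instantiation (int) and reference-type instantiation (string).
        Registry<int>.Record(1);
        Registry<int>.Record(2);
        Registry<string>.Record("a");

        // The two instantiations do not share their statics.
        Console.WriteLine(Registry<int>.Count);
        Console.WriteLine(Registry<string>.Count);
        Console.WriteLine(Registry<int>.Last);
    }
}
```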
20 Consistency => preserve types at run-time
- Type-safe serialization
- Interop with legacy code
- Reflection
Object obj = formatter.Deserialize(file);
LinkedList<int> list = (LinkedList<int>) obj;
// Just wrap existing Stack until we get round to re-implementing it
class GStack<T> {
  Stack st;
  public void Push(T x) { st.Push(x); }
  public T Pop() { return (T) st.Pop(); }
}
object obj; Type ty = obj.GetType().GetGenericArguments()[0];
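The wrapper and reflection fragments on this slide can be put together into a runnable sketch. It assumes the class libraries as they eventually shipped (System.Collections.Stack, LinkedList<T>, Type.GetGenericArguments); the helper FirstTypeArgument is a name introduced here for illustration.

```csharp
using System;
using System.Collections;            // legacy, non-generic Stack
using System.Collections.Generic;

// Wraps the legacy Stack behind a typed interface, as on the slide.
class GStack<T>
{
    private readonly Stack st = new Stack();
    public void Push(T x) { st.Push(x); }
    public T Pop() { return (T)st.Pop(); }   // checked cast back to T
}

static class RuntimeTypeDemo
{
    // Hypothetical helper: reads the first type argument of a generic object.
    public static Type FirstTypeArgument(object obj)
    {
        return obj.GetType().GetGenericArguments()[0];
    }

    static void Main()
    {
        var stack = new GStack<int>();
        stack.Push(42);
        Console.WriteLine(stack.Pop());

        object obj = new LinkedList<int>();
        // The instantiation is preserved at run time, so it can be inspected...
        Console.WriteLine(FirstTypeArgument(obj) == typeof(int));
        // ...and a cast to the exact instantiated type is checked and succeeds.
        LinkedList<int> list = (LinkedList<int>)obj;
        Console.WriteLine(list.Count);
    }
}
```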
21 Separate compilation => restrict generic definitions
- No dispatch through a type parameter
- No inheritance from a type parameter
class C<T> { void meth() { T.othermeth(); } } // don't know what's in T
class Weird<T> : T { ... } // don't know what's in T
23 Compiling polymorphism, as was
- Two main techniques
- Specialize code for each instantiation
- C++ templates, MLton and SML.NET monomorphization
- advantage: good performance
- drawback: code bloat
- Share code for all instantiations
- Either use a single representation for all types (ML, Haskell)
- Or restrict instantiations to pointer types (Java)
- advantage: no code bloat
- drawback: poor performance (extra boxing operations required on primitive values)
24 Compiling polymorphism in the Common Language Runtime
- Polymorphism is built into the intermediate language (IL) and the execution engine
- CLR performs just-in-time type specialization
- Code sharing avoids bloat
- Performance is (almost) as good as hand-specialized code
25 Code sharing
- Rule:
- share field layout and code if type arguments have the same representation
- Examples:
- Representation and code for methods in Set<string> can also be used for Set<object> (string and object are both 32-bit pointers)
- Representation and code for Set<long> is different from Set<int> (int uses 32 bits, long uses 64 bits)
26 Exact run-time types
- We want to support: if (x is Set<string>) ... else if (x is Set<Component>) ...
- But representation and code is shared between compatible instantiations, e.g. Set<string> and Set<Component>
- So there's a conflict to resolve
- and we don't want to add lots of overhead to languages that don't use run-time types (ML, Haskell)
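A minimal sketch of the kind of test the design has to support. It uses the standard List<T> as a stand-in for the slide's Set<T>, which is an assumption made so the example is self-contained; the Describe helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class ExactTypes
{
    // Distinguishes two instantiations whose type arguments are both
    // reference types and so are candidates for shared code.
    public static string Describe(object x)
    {
        if (x is List<string>) return "list of string";
        else if (x is List<object>) return "list of object";
        else return "something else";
    }

    static void Main()
    {
        Console.WriteLine(Describe(new List<string>()));
        Console.WriteLine(Describe(new List<object>()));
        // The instantiations remain distinct types at run time.
        Console.WriteLine(typeof(List<string>) == typeof(List<object>));
    }
}
```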
27 Object representation in the CLR
[Figure: two object layouts]
Normal object representation (type = vtable pointer): vtable ptr, fields
Array representation (type is inside object): vtable ptr, element type, no. of elements, elements
28 Object representation for generics
- Array-style: store the instantiation directly in the object?
- extra word (possibly more for multi-parameter types) per object instance
- e.g. every list cell in ML or Haskell would use an extra word
- Alternative: make vtable copies, store instantiation info in the vtable
- extra space (vtable size) per type instantiation
- expect no. of instantiations << no. of objects
- so we chose this option
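29 Object representation for generics
[Figure: x : Set<string> and y : Set<object> each consist of a vtable ptr and fields. Each object points to its own vtable with slots Add, HasMember and ToArray; the slots of both vtables point to the same code for Add, code for HasMember and code for ToArray. One vtable records the instantiation string, the other object.]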
30 Type parameters in shared code
- Run-time types with embedded type parameters, e.g.
class TreeSet<T> { void Add(T item) { ..new TreeNode<T>(..).. } }
Q: Where do we get T from if the code for the method is shared?
A: It's always obtainable from instantiation info in the this object
Q: How do we look up the type rep for TreeNode<T> efficiently at run-time?
A: We keep a dictionary of such type reps in the vtable for TreeSet<T>
31 Dictionaries in action
class Set<T> {
  public void Add(T x) { ... new TreeNode<T>() ... }
  public T[] ToArray() { ... new T[...] ... }
}
Set<string> s = new Set<string>();
s.Add("a");
Set<Set<string>> ss = new Set<Set<string>>();
ss.Add(s);
Set<string>[] ssa = ss.ToArray();
string[] sa = s.ToArray();
32 Dictionaries in action
[Figure: same code as slide 31. vtable for Set<string>: vtable slots, string]
33 Dictionaries in action
[Figure: same code as slide 31. vtable for Set<string>]
34 Dictionaries in action
[Figure: same code as slide 31. vtable for Set<string>. vtable for Set<Set<string>>: vtable slots, Set<string>]
35 Dictionaries in action
[Figure: same code as slide 31. vtable for Set<string>. vtable for Set<Set<string>>: vtable slots, Set<string>, TreeNode<Set<string>>]
36 Dictionaries in action
[Figure: same code as slide 31. vtable for Set<string>. vtable for Set<Set<string>>: vtable slots, Set<string>, TreeNode<Set<string>>, Set<string>[]]
37 Dictionaries in action
[Figure: same code as slide 31. vtable for Set<string>: vtable slots, string, TreeNode<string>, string[]. vtable for Set<Set<string>>: vtable slots, Set<string>, TreeNode<Set<string>>, Set<string>[]]
38 x86 code for new TreeNode<T>
; Retrieve dictionary entry from vtable
mov ESI, dword ptr [EDI]
mov EAX, dword ptr [ESI+24]
mov EAX, dword ptr [EAX]
add EAX, 4
mov dword ptr [EBP-0CH], EAX
mov EAX, dword ptr [EBP-0CH]
mov EBX, dword ptr [EAX]
; If non-null then skip
test EBX, EBX
jne SHORT G_M003_IG06
; Look up handle the slow way
G_M003_IG05:
push dword ptr [EBP-0CH]
push ESI
mov EDX, 0x1b000002
mov ECX, 0x903ea0
call @RuntimeHandle
jmp SHORT G_M003_IG07
G_M003_IG06:
mov EAX, EBX
; Create the object with run-time type
G_M003_IG07:
mov ECX, EAX
call @newClassSmall
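The four steps annotated above (fetch the dictionary slot, skip if non-null, otherwise do the slow lookup, then allocate) can be imitated at source level. This is only a simulation under stated assumptions: InstantiationInfo is a hypothetical stand-in for a per-instantiation vtable, and reflection (MakeGenericType, Activator.CreateInstance) stands in for the runtime's slow path and allocator. It is not how the CLR itself is implemented.

```csharp
using System;

class TreeNode<T> { }

// Hypothetical stand-in for the vtable of one instantiation Set<T>,
// holding a lazily filled dictionary of type reps.
class InstantiationInfo
{
    private readonly Type typeArg;                 // the T of this instantiation
    private readonly Type[] slots = new Type[1];   // slot 0: type rep for TreeNode<T>
    public int SlowLookups;                        // counts slow-path executions

    public InstantiationInfo(Type typeArg) { this.typeArg = typeArg; }

    public Type TreeNodeType()
    {
        Type cached = slots[0];              // retrieve dictionary entry
        if (cached != null) return cached;   // if non-null then skip
        SlowLookups++;                       // look up the handle the slow way
        cached = typeof(TreeNode<>).MakeGenericType(typeArg);
        slots[0] = cached;                   // fill the slot for next time
        return cached;
    }

    // Create the object with its exact run-time type.
    public object NewTreeNode()
    {
        return Activator.CreateInstance(TreeNodeType());
    }
}

static class DictionaryDemo
{
    static void Main()
    {
        var info = new InstantiationInfo(typeof(string));
        object n1 = info.NewTreeNode();
        object n2 = info.NewTreeNode();
        Console.WriteLine(n1 is TreeNode<string>);
        Console.WriteLine(n2 is TreeNode<string>);
        Console.WriteLine(info.SlowLookups);   // slow path taken only on first use
    }
}
```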
39 Is it worth it?
- With no dictionaries, just run-time look-up
- new Set<T>() is 10x to 100x slower than normal object creation
- With lazy dictionary look-up
- new Set<T>() is 10% slower than normal object creation
40 Shared code for polymorphic methods
- Polymorphic methods
- Specialize per instantiation on demand
- Again share code between instantiations where possible
- Run-time types issue solved by dictionary-passing style
41 Performance
- Non-generic quicksort: void Quicksort(object[] arr, IComparer comp)
- Generic quicksort: void GQuicksort<T>(T[] arr, GIComparer<T> comp)
- Compare on element types int, string, double
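For concreteness, here is one way the two benchmark signatures could be filled in. The slides do not show the bodies, so the partitioning scheme below is an assumption, and the standard IComparer<T> is used in place of the slide's GIComparer<T>. The point of the comparison is that the non-generic version must store ints as boxed objects, while the generic version works on an int[] directly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class QuicksortDemo
{
    // Non-generic: elements are objects, so ints live in the array boxed.
    public static void Quicksort(object[] arr, IComparer comp)
    {
        Sort(arr, 0, arr.Length - 1, comp);
    }

    // Generic: the same algorithm, parameterized by element type T.
    public static void GQuicksort<T>(T[] arr, IComparer<T> comp)
    {
        GSort(arr, 0, arr.Length - 1, comp);
    }

    static void Sort(object[] a, int lo, int hi, IComparer c)
    {
        if (lo >= hi) return;
        object pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j)
        {
            while (c.Compare(a[i], pivot) < 0) i++;
            while (c.Compare(a[j], pivot) > 0) j--;
            if (i <= j) { object t = a[i]; a[i] = a[j]; a[j] = t; i++; j--; }
        }
        Sort(a, lo, j, c);
        Sort(a, i, hi, c);
    }

    static void GSort<T>(T[] a, int lo, int hi, IComparer<T> c)
    {
        if (lo >= hi) return;
        T pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j)
        {
            while (c.Compare(a[i], pivot) < 0) i++;
            while (c.Compare(a[j], pivot) > 0) j--;
            if (i <= j) { T t = a[i]; a[i] = a[j]; a[j] = t; i++; j--; }
        }
        GSort(a, lo, j, c);
        GSort(a, i, hi, c);
    }

    static void Main()
    {
        object[] boxed = { 5, 3, 9, 1, 3 };          // each int is boxed
        Quicksort(boxed, Comparer.Default);
        Console.WriteLine(string.Join(",", boxed));

        int[] ints = { 5, 3, 9, 1, 3 };              // no boxing
        GQuicksort(ints, Comparer<int>.Default);
        Console.WriteLine(string.Join(",", ints));
    }
}
```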
42 Performance
44 Transposing F to C#
- As musical keys, F and C# are far apart
- As programming languages, (System) F and (Generic) C# are far apart
- But: Polymorphism in Generic C# is as expressive as polymorphism in System F
45 System F and C#
46 System F into C#
- Despite the differences, we can formalize a translation from System F into (Generic) C# that
- is fully type-preserving (no loss of information)
- is sound (preserves program behaviour)
- makes crucial use of the fact that polymorphic virtual methods express first-class polymorphism
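A small hand-written instance of the idea, not the formal translation itself: the System F type "forall a. a -> a" becomes an interface with a generic virtual method, and a polymorphic term becomes a class implementing it. The names IForallId, Identity and UseTwice are hypothetical.

```csharp
using System;

// Encodes the System F type "forall a. a -> a":
// a value of this interface can be applied at any type T.
interface IForallId
{
    T Apply<T>(T x);
}

// Encodes the term "Lambda a. lambda x:a. x" as a class.
class Identity : IForallId
{
    public T Apply<T>(T x) { return x; }
}

static class SystemFDemo
{
    // Takes a first-class polymorphic value and instantiates it at two
    // different types, which a plain generic parameter could not express.
    public static string UseTwice(IForallId f)
    {
        int n = f.Apply<int>(3);
        string s = f.Apply<string>("three");
        return n + ":" + s;
    }

    static void Main()
    {
        Console.WriteLine(UseTwice(new Identity()));
    }
}
```

Each polymorphic type in the source program needs its own interface in this style, which gives a feel for why such a translation has to introduce new class names.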
47 Polymorphic virtual methods
- Define an interface or abstract class:
interface Sorter { void Sort<T>(T[] a, IComparer<T> c); }
- Implement the interface:
class QuickSort : Sorter { ... }
class MergeSort : Sorter { ... }
- Use instances at many type instantiations:
void TestSorter(Sorter s, int[] ia, string[] sa) {
  s.Sort<int>(ia, IntComparer);
  s.Sort<string>(sa, StringComparer);
}
TestSorter(new QuickSort(), ...);
TestSorter(new MergeSort(), ...);
48 Compare
- Define an SML signature:
signature Sorter = sig val Sort : 'a array * ('a * 'a -> order) -> unit end
- Define structures that match the signature:
structure QuickSort :> Sorter = ...
structure MergeSort :> Sorter = ...
- Use structures at many type instantiations:
functor TestSorter(S : Sorter) = struct
  fun test (ia, sa) = (S.Sort(ia, Int.compare); S.Sort(sa, String.compare))
end
structure TestQS = TestSorter(QuickSort); TestQS.test(...);
structure TestMS = TestSorter(MergeSort); TestMS.test(...);
49 Or (Russo first-class modules)
- Define an SML signature:
signature Sorter = sig val Sort : 'a array * ('a * 'a -> order) -> unit end
- Define structures that match the signature:
structure QuickSort :> Sorter = ...
structure MergeSort :> Sorter = ...
- Use a function to test the structures:
fun TestSorter (s, ia, sa) =
  let structure S as Sorter = s
  in (S.Sort(ia, Int.compare); S.Sort(sa, String.compare)) end
TestSorter ([structure QuickSort as Sorter], ...)
TestSorter ([structure MergeSort as Sorter], ...)
50 Observations
- Translation from System F to C# is global
- generates new class names for (families of) polymorphic types
- The generics design for Java (GJ) also supports polymorphic virtual methods
- C++ has template methods but not virtual ones
- for good reason: it compiles by expansion
- Distinctiveness of polymorphic virtual methods shows up in (type-passing) implementations (e.g. CLR)
- requires execution-time type application
52 Type inference?
- ML and Haskell have type inference
- C# programs must be explicitly-typed
- Is this a problem in practice?
- not for the most-frequent application: collection classes
- but try parser combinators in C#...
53 Parser combinators (Sestoft)
class SeqSnd<T,U> : Parser<U> {
  Parser<T> tp; Parser<U> up;
  public SeqSnd(Parser<T> tp, Parser<U> up) { this.tp = tp; this.up = up; }
  public Result<U> Parse(ISource src) {
    Result<T> tr = tp.Parse(src);
    if (tr.Success) {
      Result<U> ur = up.Parse(tr.Source);
      if (ur.Success) return new Succ<U>(ur.Value, ur.Source);
    }
    return new Fail<U>();
  }
}
54 On the other hand
- .NET generics are supported by
- debugger
- profiler
- class browser
- GUI development environment
55 Try it!
- Rotor: shared-source release of CLR and C#
- http://msdn.microsoft.com/NET/sscli
- Generics + Rotor = Gyro
- Gyro extends Rotor with generics support in CLR and C#
- http://research.microsoft.com/projects/clrgen
57 Extension: Variance
- Should we add variance? e.g.
- IEnumerator<Button> <: IEnumerator<Component>
- IComparer<Component> <: IComparer<Button>
- Can even use this to support broken Eiffel
Invariant in T:
class Cell<T> { T val; void Set(T newval) { val = newval; } T Get() { return val; } }
Covariant in T:
class Cell<T> { T val; void Set(object newval) { val = (T) newval; } T Get() { return val; } }
Run-time check: the cast (T) newval
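The two subtyping examples can be tried with the variance annotations that later versions of C# (4.0 onwards) added for interfaces and delegates. That syntax is not part of the design described in this talk, and the types ISource, ISink, ButtonSource and ComponentSink below are hypothetical; the sketch only illustrates what the proposed subtyping would allow.

```csharp
using System;

class Component { public string Name = "component"; }
class Button : Component { }

// Covariant: T appears only in output positions.
interface ISource<out T> { T Get(); }

// Contravariant: T appears only in input positions.
interface ISink<in T> { string Accept(T item); }

class ButtonSource : ISource<Button>
{
    public Button Get() { return new Button(); }
}

class ComponentSink : ISink<Component>
{
    public string Accept(Component c) { return c.Name; }
}

static class VarianceDemo
{
    public static string Run()
    {
        // ISource<Button> <: ISource<Component>
        ISource<Component> src = new ButtonSource();
        // ISink<Component> <: ISink<Button>
        ISink<Button> sink = new ComponentSink();
        return sink.Accept(new Button()) + "/" + src.Get().GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```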
58 Extension: Parameterize by superclass
- Can type-check given sufficient constraints
class D { virtual void m1() { ... } virtual void m2() { ... } }
class C<T> : T where T : D {      // T must extend D
  int f;
  override void m2(T x) { ... x.m1(); ... }   // Override method D.m2; know m1 exists because of constraint on T
  new virtual void m3() { ... }   // New method, name can clash with method from T
}
59 Extension: Parameterized by superclass (2)
- Provides a kind of mixin facility
- Unfortunately, implementation isn't easy
- We'd like to share representation and code for C<P> and C<Q> for reference types P and Q, but it may be the case that
- object size of C<P> ≠ size of C<Q>
- field offset of C<P>.f ≠ offset of C<Q>.f
- vtable slot of C<P>.m3 ≠ slot of C<Q>.m3
- => abandon sharing, or do more run-time lookup
60 Open problem
- Most widely used polymorphic library is probably C++ STL (Standard Template Library)
- STL gets expressivity and efficiency from checking and compiling instantiations separately
- Really: ML functors can't match it
- How can we achieve the same expressivity and efficiency with compile-time-checked parametric polymorphism?