A Cluster of Languages for Mathematical Computing

1
A Cluster of Languages for Mathematical Computing
  • Stephen M. Watt
  • Department of Computer Science, Western University,
    London, Ontario, Canada

DIKU, University of Copenhagen, 7 September 2012
2
Moving Windows Around
  • Add a border
  • Add a scroll bar
  • Respond to a button.
  • Derp, derp,
  • We have harder problems now.

3
Declaration of Prejudices
  • Key problem: how to cascade efficient,
    effective abstractions.

4
Mathematics as a Programming Language Canary
5
Why?
  • Complex problems with many parts
  • Complex interactions among the parts
  • Many different levels of abstraction
  • Precise definition
  • Can tell if an answer is right or wrong

6
Examples
  • Garbage collection
  • Lisp → underground → Java, etc.
  • Algebraic expressions
  • Fortran
  • Big integer
  • Crypto
  • Generics
  • → Java, C, ...

7
Computer Algebra
  • Solve problems in terms of symbolic parameters,
    rather than numerically.
  • Having the computer figure out the
    formulas rather than using formulas given by
    humans.
  • Algorithms: computational mathematics
  • Software: mathematical computation

8
Computer Algebra
  • Start with symbols and
    compute with symbols ⇒
  • Exact results
  • Hopefully, insightful results

9
Finding an Answer
  • One day an individual went to the horse races.
    Instead of counting the number of humans and
    horses, she counted 74 heads and 196 legs.
  • How many humans and horses were there?
  • humans + horses = 74, humans × 2 +
    horses × 4 = 196

10
Finding an Answer
  • One day an individual went to the horse races.
    Instead of counting the number of humans and
    horses, she counted 74 heads and 196 legs.
  • How many humans and horses were there?
  • humans + horses = 74, humans × 2 +
    horses × 4 = 196
  • horses = 24, humans = 50

11
Finding an Answer
  • One day an individual went to the horse races.
    Instead of counting the number of humans and
    horses, she counted H heads and L legs.
  • How many humans and horses were there?
  • humans + horses = H, humans × 2 +
    horses × 4 = L
  • horses = -H + L/2, humans = 2H - L/2
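
The parametric answer can be checked with a short sketch. Python is used here purely for illustration (the talk itself is about Maple, Axiom and Aldor), and the function name `solve_races` is hypothetical. It evaluates the closed forms horses = -H + L/2 and humans = 2H - L/2 with exact rational arithmetic.

```python
from fractions import Fraction

def solve_races(heads, legs):
    """Solve humans + horses = heads, 2*humans + 4*horses = legs exactly."""
    H, L = Fraction(heads), Fraction(legs)
    # Closed forms in the symbolic parameters H and L.
    horses = -H + L / 2
    humans = 2 * H - L / 2
    return humans, horses

humans, horses = solve_races(74, 196)
print(humans, horses)  # 50 24
```

The same function answers the question for any other head and leg counts, which is the point of solving in terms of symbolic parameters first.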

12
Computer Algebra
  • A couple of research problems of personal
    interest
  • Symbolic-numeric algorithms
  • Symbolic exponents

13
Approximate Polynomials
14
Symbolic Exponents
15
Examples
  • Maple
  • Axiom
  • Aldor
  • MathML
  • InkML
  • Warning: 3x too much stuff here. We will skip to
    what the audience wants.

16
Language 1: Maple
  • Waterloo, 1980 on
  • Geddes and Gonnet initiators.
  • University, then company. Collaboration.
  • Dynamically typed, interpreted language for
    scripting computer algebra programs.

24
An Example (small)
27
Maple
  • Compiled kernel, interpreted library
  • What was compiled was hand-chosen
  • Support many students on shared 1980s hw
  • Easy to lay down code, quick library growth
  • Language not very structured, so limitations
  • Commercially viable project
  • Company focus: education and CAE

28
Example 2: Axiom
  • 1984 moved from Waterloo to IBM Research
  • Scratchpad in-house research project
  • Jenks and Trager initiators.
  • 1991 released as commercial product by NAG

29
Axiom
  • Main idea: code re-use through abstraction
  • Generic algorithms based on structures of modern
    algebra (groups, rings, algebras, fields).
  • The language is the thing
  • Compiled programming language for writing
    libraries in the large
  • Syntactically similar, dynamically typed
    interpreted language for scripting.

30
Type Inference in Interpreter
31
More Complicated Types
36
Axiom
  • Great concept for building well-structured and
    flexible libraries.
  • Not enough dogfooding.
  • Top-level tried to hide types from user, but was
    not sufficiently successful at doing that.
  • Powerful and flexible, but too complex for most
    users.
  • Now open source.

37
Example 3: Aldor
  • Re-design of Axiom language, 1984 on.
  • Initiator: Watt.
  • The language is the thing, writ large
  • Efficiency, elegance, take no prisoners
  • Nothing special about built-in types
  • Dependent types everywhere
  • Interoperability with C and Lisp

38
Aldor and Its Type System
  • Types and functions are values
  • May be created dynamically
  • Provide representations of mathematical sets and
    functions
  • The type system has two levels
  • Each value belongs to a unique type, its domain,
    known statically.
  • This is an abstract data type that gives the
    representation.
  • The domains are values with domain Domain.
  • Each value may belong to any number of subtypes
    of its domain.
  • Subtypes of Domain are called categories.
  • Categories
  • specify what exports (operations, constants) a
    domain provides.
  • fill the role of OO interfaces or abstract base
    classes.

39
Why Two Levels?
  • OO inheritance problem with multi-argument functions
  • class SG { *: (SG, SG) -> SG }
    DoubleFloat extends SG ...
    Permutation extends SG ...
    x, y ∈ DoubleFloat ⊂ SG
    p, q ∈ Permutation ⊂ SG
  • x * y: OK    p * q: OK
  • p * y: also accepted ??? Bad, Bad, Bad

40
Why Two Levels?
  • OO inheritance problem with multi-argument functions
  • SG == ... *: (%, %) -> %
    DoubleFloat: SG == ...
    Permutation: SG == ...
    x, y ∈ DoubleFloat ∈ SG
    p, q ∈ Permutation ∈ SG
  • x * y: OK    p * q: OK
  • p * y: type error
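
The difference between the two slides can be imitated in a small Python sketch. This is an analogy only: Aldor rejects the mixed product at compile time, whereas the hypothetical helper `checked_mul` below checks at run time. The class names follow the slide; everything else is an assumption made for illustration.

```python
class SG:
    """OO-style semigroup base class: any two SG values look multipliable."""
    def __init__(self, value):
        self.value = value

class DoubleFloat(SG):
    def __mul__(self, other):
        return DoubleFloat(self.value * other.value)

class Permutation(SG):
    def __mul__(self, other):
        # Composition: (self * other)[i] = self[other[i]]
        return Permutation(tuple(self.value[i] for i in other.value))

def checked_mul(a, b):
    """Category-style rule '*: (%, %) -> %': both arguments share one domain."""
    if type(a) is not type(b):
        raise TypeError(
            f"cannot multiply {type(a).__name__} by {type(b).__name__}")
    return a * b

x, y = DoubleFloat(2.0), DoubleFloat(3.0)
p, q = Permutation((1, 0, 2)), Permutation((0, 2, 1))
print(checked_mul(x, y).value)  # 6.0
print(checked_mul(p, q).value)  # (1, 2, 0)
try:
    checked_mul(p, y)
except TypeError as err:
    print(err)  # cannot multiply Permutation by DoubleFloat
```

With plain inheritance, `p * y` satisfies the signature `(SG, SG) -> SG` and is only caught, if at all, somewhere inside the method body. The second level (domains belonging to categories) lets the signature itself say that both operands come from the same domain.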

41
Parametric Polymorphism
  • PP is via category- and domain-producing
    functions.
  • -- A function returning an integer.
  • factorial(n: Integer): Integer == if n = 0
    then 1 else n * factorial(n-1)
  • -- Functions returning a category and a domain.
  • Module(R: Ring): Category == Ring with *: (R,
    %) -> %
  • Complex(R: Ring): Module(R) with
  • complex: (R, R) -> %; real: % -> R; imag: % -> R;
    conj: % -> %; ...
  • == add
  • Rep == Record(real: R, imag: R); 0: % == ...;
    1: % == ...; (x: %) + (y: %): % == ...

42
Dependent Types
  • Give dynamic typing, e.g.
    f: (n: Integer, R: Ring, m: IntegerMod(n)) -> SqMatrix(n, R)
  • Recover OO through dependent products:
    prodl: List Record(S: Semigroup, s: S) ==
    [[DoubleFloat, x], [Permutation, p], [DoubleFloat, y]]
  • With categories, guarantee required operations
    available
  • f(R: Ring)(a: R, b: R): R == a * b + b * a
43
Multi-sorted Algebras
  • Category signature as a dependent product type.
  • ArithmeticModel: Category == with
  • Nat: IntegralDomain
  • Rat: Field
  • /: (Nat, Nat) -> Rat

44
Aldor and Its Type System
  • Type-producing expressions may be conditional:
    UnivariatePolynomial(R: Ring): Module(R) with
  • coeff: (%, Integer) -> R
  • monomial: (R, Integer) -> %
  • if R has Field then EuclideanDomain
  • ...
  • add
  • ...
  • Post facto extensions allow domains to belong to
    new categories after they have been initially
    defined.

45
Without Post Facto Extension for Structuring Libraries
  • DirectProduct(n: Integer, S: Set): Set with
  • component: (Integer, %) -> S
  • new: Tuple S -> %
  • if S has Semigroup then Semigroup
  • if S has Monoid then Monoid
  • if S has Group then Group
  • ...
  • if S has Ring then Join(Ring, Module(S))
  • if S has Field then Join(Ring,
    VectorField(S))
  • ...
  • if S has DifferentialRing then
    DifferentialRing
  • if S has Ordered then Ordered
  • ...
  • add ...

46
Post Facto Extension for Structuring Libraries
  • DirectProduct(n: Integer, S: Set): Set with
  • component: (Integer, %) -> S
  • new: Tuple S -> %
  • add ...
  • extend DirectProduct(n: Integer, S: Semigroup):
    Semigroup == ...
  • extend DirectProduct(n: Integer, S: Monoid):
    Monoid == ...
  • extend DirectProduct(n: Integer, S: Group): Group
    == ...
  • ...
  • extend DirectProduct(n: Integer, S: Ring):
    Join(Ring, Module(S)) == ...
  • extend DirectProduct(n: Integer, S: Field):
    Join(Ring, VectorField(S)) == ...
  • ...
  • extend DirectProduct(n: Integer, S: Field):
    Join(Ring, VectorField(S)) == ...
  • extend DirectProduct(n: Integer, S:
    DifferentialRing): DifferentialRing == ...
  • extend DirectProduct(n: Integer, S: Ordered):
    Ordered == ...
  • ...
  • Normally these extensions would all be in
    separate files.

47
Higher Order Operations
  • E.g. Reorganizing constructions
  • Polynomial(x) Matrix(n) Complex R → Complex
    Matrix(n) Polynomial(x) R
  • Slightly simpler example
  • List Array String R → String Array List R

48
Higher Order Operations
  • Ag ==> (S: BasicType) -> LinearAggregate S
  • swap(X: Ag, Y: Ag)(S: BasicType)(x: X Y S): Y X S ==
    [[s for s in y] for y in x]
  • al: Array List Integer := array(list(i+j-1 for i
    in 1..3) for j in 1..3)
  • la: List Array Integer := swap(Array,
    List)(Integer)(al)
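
A rough Python analogue of `swap` is sketched below, assuming `tuple` stands in for Array and `list` for List. Python has no static types for the constructors, so the element type argument `S` is dropped; the function otherwise mirrors the nested comprehension on the slide, keeping the nesting shape and exchanging the container types at each level.

```python
def swap(outer, inner):
    """Return a function turning outer-of-inner into inner-of-outer.

    `outer` and `inner` are container constructors (e.g. tuple, list).
    """
    def convert(x):
        # Rebuild each inner collection with `outer`, the whole with `inner`.
        return inner(outer(s for s in y) for y in x)
    return convert

# al plays the role of "Array List Integer": a tuple of lists.
al = tuple([i + j - 1 for i in range(1, 4)] for j in range(1, 4))
la = swap(tuple, list)(al)
print(la)  # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```

Because the constructors are ordinary arguments, the same function covers every pair of aggregate types, which is what makes the operation "higher order".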

49
Phew!
50
Using Genericity
  • LinearOrdinaryDifferentialOperator(
  • A: DifferentialRing,
  • M: LeftModule(A) with differentiate: % -> %
  • ): MonogenicLinearOperator(A) with
  • D: %
  • apply: (%, M) -> M
  • ...
  • if A has Field then
  • leftDivide: (%, %) -> (quotient: %,
    remainder: %)
  • rightDivide: (%, %) -> (quotient: %,
    remainder: %)
  • // rgcd, lgcd
  • ...

51
Using Genericity
  • LinearOrdinaryDifferentialOperator(
  • A: DifferentialRing,
  • M: LeftModule(A) with differentiate: % -> %
  • ): ...
  • == SUP(A) add
  • ...
  • if A has Field then
  • Op == OppositeOperator(%, A)
  • DOdiv == NonCommutativeOperatorDivision(%, A)
  • OPdiv == NonCommutativeOperatorDivision(Op, A)
  • leftDivide(a, b) == leftDivide(a, b)$DOdiv
  • rightDivide(a, b) == leftDivide(a, b)$OPdiv
  • ...

52
Design Principles I
  • No compromises on flexibility
  • No compromises on efficiency
  • Use optimization to bridge the gap.
  • Compilation. Separate compilation.
  • Generated intermediate code is platform
    independent, even though word-sizes, etc, vary.
  • Libraries can be distributed, if desired, as
    binary only.
  • Be a good citizen in a multi-language framework.
  • Call and be called by C/C++/Fortran/Lisp/Maple
  • Functional arguments
  • Cooperating memory management

53
Design Principles II
  • Language-defined types should have no privilege
    whatsoever over application-defined types.
  • Syntax, semantics (e.g. in type exprs),
    optimization (e.g. constant folding)
  • Language semantics should be independent of type.
  • E.g. named constants overloaded, not functions
  • Combining libraries should be easy, O(n), not
    O(n^2).
  • Should be able to extend existing things with new
    concepts without touching old files or
    recompiling.
  • Safety through optimization removing run-time
    checks, not by leaving off the checks in the
    first place.

54
The Compiler as an Artefact
  • Written primarily in C (C++ too immature in 1990)
  • 1550 files, 295 K loc C + 65 K loc Aldor
  • Intermediate code (FOAM)
  • Primitive types: booleans, bytes, chars, numeric,
    arrays, closures
  • Primitive operations: data access, control, data
    operations
  • Runtime system
  • Memory management
  • Big integers
  • Stack unwinding
  • Export lookup from domains
  • Dynamic linking
  • Written in C and Aldor

55
Example of Optimization
  • From the domain Segment(E: OrderedAbelianMonoid):
    generator(seg: Segment E): Generator E == generate
  • (a, b) := (low seg, hi seg)
  • while a <= b repeat { yield a; a := a + 1 }
  • From the domain List(S: Set):
  • generator(l: List S): Generator S == generate
  • while not null? l repeat { yield first l; l :=
    rest l }
  • Client code
  • client() ==
  • ar := array(...); l := list(...)
  • s := 0
  • for i in 1..#ar for e in l repeat s := s +
    ar.i * e
  • stdout << s

56
How Generators Work
  • generator(seg: Segment Int): Generator Int ==
    generate
  • a := lo seg
  • b := hi seg
  • while a <= b repeat { yield a; a := a + 1 }
  • client() ==
  • ar := array(...)
  • s := 0
  • for i in 1..#ar repeat s := s + ar.i
  • stdout << s
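
For readers who do not know Aldor, the generator and its client can be approximated in Python, whose `yield` behaves similarly. This is a sketch under the assumption that the segment is inclusive at both ends and that `ar.i` is 1-based; the function names are hypothetical.

```python
def segment(lo, hi):
    """Yield lo, lo+1, ..., hi (inclusive), like the Segment generator."""
    a, b = lo, hi
    while a <= b:
        yield a          # control returns to the consumer here
        a = a + 1        # ... and resumes here on the next request

def client(ar):
    """Sum the elements of ar using 1-based indices from the generator."""
    s = 0
    for i in segment(1, len(ar)):
        s = s + ar[i - 1]   # Python lists are 0-based; ar.i is 1-based
    return s

print(client([10, 20, 30]))  # 60
```

Each `yield` suspends the generator with its local state (`a`, `b`) intact. That saved state and the jump back into the loop body are what the optimizer has to see through when it inlines a generator into its client.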

57
Example of Optimization (again)
  • From the domain Segment(E: OrderedAbelianMonoid):
    generator(seg: Segment E): Generator E == generate
  • (a, b) := (low seg, hi seg)
  • while a <= b repeat { yield a; a := a + 1 }
  • From the domain List(S: Set):
  • generator(l: List S): Generator S == generate
  • while not null? l repeat { yield first l; l :=
    rest l }
  • Client code
  • client() ==
  • ar := array(...); l := list(...)
  • s := 0 -- NOTE PARALLEL TRAVERSAL.
  • for i in 1..#ar for e in l repeat s := s +
    ar.i * e
  • stdout << s

58
Inlined
B0:  ar := array(...); l := list(...)
     segment := 1..#ar; lab1 := B2
     l2 := l; lab2 := B9
     s := 0; goto B1
B1:  goto @lab1
B2:  a := segment.lo; b := segment.hi; goto B3
B3:  if a > b then goto B6 else goto B4
B4:  lab1 := B5; val1 := a; goto B7
B5:  a := a + 1; goto B3
B6:  lab1 := B7; goto B7
B7:  if lab1 = B7 then goto B16 else goto B8
B8:  i := val1; goto @lab2
B9:  goto B10
B10: if null? l2 then goto B13 else goto B11
B11: lab2 := B12; val2 := first l2; goto B14
B12: l2 := rest l2; goto B10
B13: lab2 := B14; goto B14
B14: if lab2 = B14 then goto B16 else goto B15
B15: e := val2; s := s + ar.i * e; goto B1
B16: stdout << s
59
Clone Blocks for 1st Iterator
60
Dataflow
  • lab1 = B2, lab1 = B5, lab1 = B7

61
Resolution of 1st Iterator
62
Clone Blocks for 2nd Iterator
63
Resolution of 2nd Iterator
client() ==
    ar := array(...); l := list(...)
    l2 := l; s := 0
    a := 1; b := #ar
    if a > b then goto L2
L1: if null? l2 then goto L2
    e := first l2
    s := s + ar.a * e
    a := a + 1
    if a > b then goto L2
    l2 := rest l2
    goto L1
L2: stdout << s
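
As a sanity check on the transformation, the generic client and the resolved control flow can both be transcribed into Python and compared. This is a sketch only: it assumes the loop body accumulates `ar.i * e`, models `rest l2` with an index, and uses hypothetical function names. It says nothing about how the Aldor compiler performs the rewrite.

```python
def client_generic(ar, l):
    """Parallel traversal written with two independent iterators."""
    s = 0
    for i, e in zip(range(1, len(ar) + 1), l):
        s = s + ar[i - 1] * e
    return s

def client_resolved(ar, l):
    """Hand transcription of the resolved code (labels L1 and L2)."""
    l2 = list(l)
    pos = 0                          # index standing in for "rest l2"
    s = 0
    a, b = 1, len(ar)
    if a > b:
        return s                     # goto L2
    while True:                      # L1:
        if pos >= len(l2):           # null? l2
            return s                 # goto L2
        e = l2[pos]                  # e := first l2
        s = s + ar[a - 1] * e
        a = a + 1
        if a > b:
            return s                 # goto L2
        pos = pos + 1                # l2 := rest l2

print(client_generic([1, 2, 3], [4, 5, 6]))   # 32
print(client_resolved([1, 2, 3], [4, 5, 6]))  # 32
```

The resolved version has no generator objects, no saved labels and no indirect jumps: both iterators have collapsed into one loop over two plain variables.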
64
Aldor vs C (non-floating pt)
65
Aldor vs C (floating point)
66
Follow-on Research Projects
  • Generic library inter-operability
  • Localized garbage collection
  • Dynamic abstract data types
  • Performance analysis of generics
  • Etc, etc

67
Lessons Learned
  • It is possible to be elegant, abstract and
    high-level without sacrificing significant
    efficiency.
  • Well-known optimization techniques can be
    effectively adapted to the symbolic setting.
  • Optimization of generated C code is not enough.
  • Procedural integration, dataflow analysis,
    subexpression elimination and constant folding
    are the primary wins.
  • Compile-time memory optimization, including data
    structure elimination, is important.
  • Removes boxing/unboxing, closure creation,
    dynamic allocation of local objects, etc. Can
    move hot fields into registers.

68
Aldor Lessons
  • Language design 20 years old.
  • In the meantime, many of the ideas are now
    mainstream.
  • Many still are not.
  • Mathematics is a valuable canary in the coal
    mine of general purpose software.
  • The general world lags in recognizing needs.
  • It has to be free.
  • Free1 is the standard price.
  • Free2 is required for engagement.

70
Example 4: MathML
  • First XML application, ever.
  • Language for exchange of mathematical data.
  • Initially

71
Example 4: MathML
72
MathML
  • OpenMath effort initiated 1993 for data exchange.
  • Unfulfilled <math> element in HTML 3.2, Jan 1997.
  • Initial, unchartered Math WG defining microsyntax
    for <math>.
  • Internecine rivalry between syntax and semantics
    camps coming from TeX, Mathematica and SGML.

73
MathML
  • Convened HTML-native math group to form unified
    proposal.
  • First ever XML application.
  • XML proposed recommendation December 1997.
  • MathML proposed recommendation February 1998.
  • Supported in major browsers, computer algebra
    systems, incorporated in HTML 5.

74
Example 5: InkML
  • Ink Messaging
  • Annotation
  • Archival

75
Pen-Based Math
  • Input for CAS and document processing.
  • 2D editing.
  • Computer-mediated collaboration.

76
Pen-Based Math
  • Does not require learning a special language

\sum_{i=0}^{r} g_{r-i} X^i
sum(g[r-i]*X^i, i = 0..r)
77
Pen-Based Math
  • Different than natural language recognition
  • 2-D layout is a combination of writing and
    drawing.
  • No fixed dictionary.
  • Many similar few-stroke characters.
  • Well segmented.
  • Highly ambiguous

78
Digital Ink Formats
  • Collected by surface digitizer or camera
  • Sequence of (x,y) points sampled at some known
    frequency
  • Possibly other info (angles, pressure, etc)
  • Grouping into traces, letters, words; labelling

80
InkML Concepts
  • Traces, trace groups
  • Device information sampling rate, resolution,
    etc.
  • Pre-defined and application defined channels
  • Trace formats, coordinate transformations
  • Streaming and archival
  • Annotation text and XML

81
InkML Evolution
  • Started as low-level language for traces and
    hardware description. Explicitly disavowed
    semantics.
  • Wanted base language sufficiently rich to support
    full range of digital ink applications. Semantic
    grouping added, annotation, etc.
  • W3C Standard
  • Built in to Microsoft Office 2010

82
Various Language Projects
  • Reflex
  • Alma
  • Java/Aldor/C interop
  • Abstract Objects
  • Local GC
  • WWW GC

83
Research Symbol Recognition
  • Main idea: represent coordinate curves as
    truncated orthogonal series.
  • Advantages
  • Compact: few coefficients needed
  • Geometric: the truncation order is a property
    of the character set; gives a natural metric on
    the space of characters
  • Algebraic: properties of curves can be computed
    algebraically (instead of numerically using
    heuristic parameters)
  • Device independent: resolution of the device is
    not important

84
Inner Product and Basis Functions
  • Choose a functional inner product, e.g.
  • \langle f, g \rangle = \int_a^b f(t) g(t) w(t) dt
  • This determines an orthonormal basis in the
    subspace of polynomials of degree d. Determine
    using Gram-Schmidt (GS) on 1, t, t^2, t^3, ....
  • Can then approximate functions in subspaces
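
The Gram-Schmidt step can be sketched in Python. The weight w(t) = 1 and the interval [a, b] = [-1, 1] are assumptions chosen for illustration (they give the Legendre polynomials up to scaling); the slide leaves both open. The basis is left unnormalized so that all arithmetic stays in exact rationals.

```python
from fractions import Fraction

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(t) q(t) dt, with w(t) = 1.

    Polynomials are coefficient lists, lowest degree first.
    """
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            k = i + j
            if k % 2 == 0:                      # odd powers integrate to 0
                total += pi * qj * Fraction(2, k + 1)
    return total

def gram_schmidt(d):
    """Orthogonalize 1, t, ..., t^d; returns unnormalized polynomials."""
    basis = []
    for n in range(d + 1):
        p = [Fraction(0)] * n + [Fraction(1)]   # the monomial t^n
        for b in basis:
            c = inner(p, b) / inner(b, b)       # projection coefficient
            for k, bk in enumerate(b):
                p[k] -= c * bk                  # remove the component along b
        basis.append(p)
    return basis

for poly in gram_schmidt(3):
    print([str(c) for c in poly])
# ['1']
# ['0', '1']
# ['-1/3', '0', '1']
# ['0', '-3/5', '0', '1']
```

Dividing each polynomial by the square root of its own inner product would give the orthonormal basis the slide refers to.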
85
Like Symbols form Clouds
86
Problems
  • Want fast response: how to work while trace is
    being captured.
  • Low RMS does not mean similar shape.

87
Pb 1. On-Line Ink
  • The main problem: in handwriting recognition,
    the human and the computer take turns thinking
    and sitting idle.
  • We ask: Can the computer do useful work while the
    user is writing and thereby get the answer faster
    after the user stops writing?
  • We show: The answer is Yes!

88
On-Line Series Coefficients
  • If we choose the right basis functions, then the
    series coefficients can be computed on-line.
    [Golubitsky + SMW, CASCON 2008, ICFHR 2008]
  • The series coefficients are linear combinations
    of the moments, which can be computed by
    numerical integration as the points are received.
  • This is the Hausdorff moment problem (1921),
    shown to be unstable by Talenti (1987).
  • It is just fine, however, for the orders we need.
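
A minimal sketch of the idea follows, with several assumptions that are not from the talk: samples start at t = 0, moments are integrated with the trapezoid rule, and the basis is the shifted Legendre polynomials on [0, 1] up to degree 2. The class name `OnlineMoments` is hypothetical, and the cited papers may use a different basis and integration scheme.

```python
SHIFTED_LEGENDRE = [
    [1],            # P0(s) = 1
    [-1, 2],        # P1(s) = 2s - 1
    [1, -6, 6],     # P2(s) = 6s^2 - 6s + 1
]

class OnlineMoments:
    """Accumulate m_k = integral of t^k x(t) dt as samples arrive."""

    def __init__(self, order):
        self.m = [0.0] * (order + 1)
        self.last = None                     # previous sample (t, x)

    def add(self, t, x):
        if self.last is not None:
            t0, x0 = self.last
            for k in range(len(self.m)):     # one trapezoid per moment
                self.m[k] += 0.5 * (t - t0) * (t0 ** k * x0 + t ** k * x)
        self.last = (t, x)

    def coefficients(self):
        """At pen-up: rescale time to [0, 1], then apply a fixed linear map."""
        T = self.last[0]
        mu = [mk / T ** (k + 1) for k, mk in enumerate(self.m)]
        return [(2 * n + 1) * sum(a * mu[k] for k, a in enumerate(row))
                for n, row in enumerate(SHIFTED_LEGENDRE[:len(self.m)])]

# Trace x(t) = t sampled on [0, 2]; in rescaled time x = 2s = P0 + P1.
om = OnlineMoments(2)
for i in range(201):
    om.add(i * 0.01, i * 0.01)
coeffs = om.coefficients()
```

All the per-sample work is a handful of multiply-adds in `add`; the only work left when the pen lifts is the small matrix-vector product in `coefficients`. For this trace the coefficients come out close to 1, 1 and 0, up to the trapezoid-rule error.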

89
Pb 2. Shape vs Variation
  • The corners are not in the right places.
  • Work in a jet space to force coords + derivatives
    close.
  • Use a Legendre-Sobolev inner product
  • 1st jet space ⇒ set µ_i = 0 for i > 1. Choose µ_1
    experimentally to maximize reco rate. Can also
    be done on-line.
  • [Golubitsky + SMW 2008, 2009]

90
Distance Between Curves
  • Approximate the variation between curves by some
    fn of distances between points.
  • May be coordinate curves or curves in a jet
    space.
  • Sequence alignment
  • Interpolation (resampling)
  • Why not just calculate the area?
  • This is very fast in ortho series representation.

91
Distance Between Curves
92
Comparison of Candidate to Models
  • Use Euclidean distance in the coefficient space.
  • Just as accurate as elastic matching.
  • Much less expensive.
  • Linear in d, the degree of the approximation:
    < 3d machine instructions (30 ns) vs several
    thousand!
  • Can trace through SVM-induced cells
    incrementally.
  • Normed space for characters gives other
    advantages.
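
Comparing a candidate to the models then reduces to a few arithmetic operations per coefficient. The sketch below uses made-up three-coefficient vectors and hypothetical labels purely for illustration; real models would hold the series coefficients of each character's coordinate curves.

```python
import math

def distance(u, v):
    """Euclidean distance between two coefficient vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def classify(candidate, models):
    """Return the label of the model closest to the candidate."""
    return min(models, key=lambda label: distance(candidate, models[label]))

# Hypothetical models: label -> coefficient vector.
models = {"a": [0.0, 1.0, 0.0], "b": [1.0, 0.0, 0.5]}
print(classify([0.9, 0.1, 0.4], models))  # b
```

The cost of `distance` is linear in the number of coefficients, which is the source of the speed advantage over elastic matching claimed on the slide.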

93
Distance-Based Classification
94
Distance-Based Classification
95
Geometry
  • Linear homotopies within a class

C = (1 - t) A + t B
  • Can compute distance of a sample to this line
  • Convex hull of a set of models
  • SVM separating planes
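
The distance from a sample to a linear homotopy between two models A and B is a standard point-to-segment computation. The sketch below clamps t to [0, 1] so that the closest point stays between A and B; that clamping is an assumption, since the slide speaks of the distance to "this line". The function name is hypothetical.

```python
import math

def distance_to_homotopy(p, a, b):
    """Distance from p to C(t) = (1 - t) a + t b with 0 <= t <= 1.

    Returns (distance, t) for the closest point on the segment.
    """
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(x * x for x in ab)
    t = 0.0 if denom == 0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))                 # stay between A and B
    c = [(1 - t) * ai + t * bi for ai, bi in zip(a, b)]
    return math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, c))), t

print(distance_to_homotopy([1.0, 1.0], [0.0, 0.0], [2.0, 0.0]))  # (1.0, 0.5)
```

The convex hull of a set of models generalizes this from two models to many.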

96
Distance-Based Classification
97
Distance-Based Classification
98
Error Rates as Fn of Distance
  • SVM Convex Hull
  • Error rate as fn of distance gives confidence
    measure for classifiers [MKM + Golubitsky + SMW]

99
Recognition Summary
  • Database of samples ⇒ set of LS points
  • Character to recognize ⇒
  • Integrate moments as being written
  • Lin. trans. to obtain one point in LS space
  • Classify by distance to convex hull of k-NN.
  • InkML allows natural representation of annotated
    database and real-time input.

100
Overall Conclusions
  • Mathematical problems provide excellent
    challenges for language design.
  • Rich, complex, hard
  • Well-defined
  • Performance matters a lot!
  • Don't be put off by the loud, confident
    proclamations of mass-market language designers.