Title: CS 320: Compiling Techniques
CS 320 Compiling Techniques
People
- David Walker (Professor)
- 412 Computer Science Building
- dpw_at_cs.princeton.edu
- office hours after each class
- Limin Jia, Jay Ligatti (TAs)
- 418a Computer Science Building
- ljia,jligatti_at_cs.princeton.edu
- office hours
- Mondays and Wednesdays (we'll send email to the email list)
Information
- Web site
- www.cs.princeton.edu/courses/archive/spring05/cos320/index.htm
- Mailing list
- To subscribe
- cos320-request_at_lists.cs.princeton.edu
- To post to this list, send your email to
- cos320_at_lists.cs.princeton.edu
Books
- Modern Compiler Implementation in ML, Andrew Appel
- A reference manual for SML
- best choice: online references
- see course web site
- several hardcopy books
- Elements of ML Programming, Jeffrey D. Ullman
Work
- Assignments
- build your own compiler
- approximately a module/week
- 40%
- late penalty 20%/day. Don't be late!
- ask questions of me, TAs, friends on course mailing list
- turn in your own work
- In-class midterm
- 25%
- Final during exam period
- 35%
Assignment 0
- Write your name and other information on the sheet circulating
- Find, skim and bookmark the course web pages
- Subscribe to course e-mail list
- Begin assignment 1
- Figure out how to install, run, and use SML
- Due next Thursday February 16
- If you've never used a functional language like ML, this might be a difficult assignment. Start early!
Onward!
What is a compiler?
- A compiler is a program that translates a program in a source language into an equivalent program in a target language
What is a compiler?
C program:
    while (i > 3) { a[i] = b[i]; i++ }
compiler does this
assembly program:
    mov eax, ebx
    add eax, 1
    cmp eax, 3
    jcc eax, edx
What is a compiler?
Java program:
    class foo { int bar; ... }
compiler does this
C program:
    struct foo { int bar; ... }
What is a compiler?
Java program:
    class foo { int bar; ... }
compiler does this
Java virtual machine program:
    ........ ......... ........
What is a compiler?
LaTeX program:
    \newcommand ....
compiler does this
TeX program:
    \sfd\sf\fadg
What is a compiler?
TeX program:
    \newcommand ....
compiler does this
PostScript program:
    \sfd\sf\fadg
What is a compiler?
- Other places
- Web scripts are compiled into HTML
- assembly language is compiled into machine language
- hardware description language is compiled into a hardware circuit
- ...
Compilers are complex
- front-end: text file to abstract syntax
- lexing, parsing
- middle-end: abstract syntax to intermediate form (IR)
- type checking, analysis, optimizations
- back-end: IR to machine code
- code generation, data layout, register allocation, more optimization
Course project
- front-end: Fun source language
- simple imperative language
- middle-end: only 1 IR (the initial abstract syntax generated by the parser)
- type checking, high-level optimizations
- back-end: code generation
- instruction selection algorithms, register allocation via graph coloring
Standard ML
- Standard ML is a domain-specific language for building compilers
- Support for
- Complex data structures (abstract syntax, compiler intermediate forms)
- Memory management like Java
- Large projects with many modules
- Advanced type system for error detection
Introduction to ML
- You will be responsible for learning ML on your own.
- Today I will cover some basics
- Resources
- Robert Harper's online book "an introduction to ML" is a good place to start
- See course webpage for pointers and info about how to get the software
Preliminaries
- start sml in Unix by typing sml at a prompt
    tux% sml
    Standard ML of New Jersey, Version 110.0.7, September 28, 2000 [CM; autoload enabled]
    -
    (* quit SML by pressing ctrl-D; ctrl-Z sometimes... *)
    (* just so you know, comments can be (* nested *) *)
Preliminaries
- Read Eval Print Loop
    - 3 + 2;
    > 5 : int
    - it + 7;
    > 12 : int
    - it - 3;
    > 9 : int
    - 4 + true;
    stdIn:17.1-17.9 Error: operator and operand don't agree [literal]
      operator domain: int * int
      operand:         int * bool
      in expression:
        4 + true
Preliminaries
- Read Eval Print Loop
    - 3 div 0;
    Failure: Div (run-time error)
Basic Values
    - ();
    > () : unit        (like "void" in C (sort of); the uninteresting value/type)
    - true;
    > true : bool
    - false;
    > false : bool
    - if it then 3+2 else 7;        ("else" clause is always necessary)
    > 7 : int
    - false andalso loop_Forever;
    > false : bool        (andalso, orelse: short-circuit evaluation)
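The short-circuit behaviour of andalso and orelse can be checked with a small sketch. The function loopForever is a hypothetical stand-in for the slide's loop_Forever; it never returns, so the bindings below only finish because the right operand is never evaluated.

```sml
(* Hypothetical helper: a call to this function never returns. *)
fun loopForever () : bool = loopForever ()

(* andalso skips its right operand when the left one is false;
   orelse skips its right operand when the left one is true. *)
val a = false andalso loopForever ()
val b = true orelse loopForever ()
```

Swapping the operands (for example loopForever () andalso false) would hang, since the left operand is always evaluated first.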
Basic Values
- Integers
    - 3 + 2;
    > 5 : int
    - 3 + (if not true then 5 else 7);
    > 10 : int        (no division between expressions and statements)
- Strings
    - "Dave Walker";
    > "Dave Walker" : string
    - print "foo\n";
    foo
    > () : unit
- Reals
    - 3.14;
    > 3.14 : real
Using SML/NJ
- Interactive mode is a good way to start learning and to debug programs, but...
- Type in a series of declarations into a ".sml" file
    - use "foo.sml";
    [opening foo.sml]
    ...
    list of declarations with their types
Larger Projects
- SML has its own built-in interactive "make"
- Pros
- It automatically does the dependency analysis for you
- No crazy makefile syntax to learn
- Cons
- May be more difficult to interact with other languages or tools
Compilation Manager
- sources.cm lists the files in the group (a.sig, b.sml, c.sml):
    Group is
      a.sig
      b.sml
      c.sml
- in the SML interpreter:
    % sml
    - OS.FileSys.chDir "/courses/510/a2";
    - CM.make();        (looks for "sources.cm", analyzes dependencies)
    [compiling...]      (compiles files in group)
    [wrote...]          (saves binaries in ./CM/)
    - CM.make myproj/()        (specify directory)
What is next?
- ML has a rich set of structured values
- Tuples: (17, true, "stuff")
- Records: {name = "Dave", ssn = 332177}
- Lists: 3::4::5::nil or [3,4]@[5]
- Datatypes
- Functions
- And more!
- Rather than list all the details, we will write a couple of programs
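As a quick illustration of the structured values listed above, here is a minimal sketch that builds each one and takes it apart again. The names t, r, xs, ys, second, who and same are chosen only for this example.

```sml
val t = (17, true, "stuff")               (* tuple : int * bool * string *)
val r = {name = "Dave", ssn = 332177}     (* record with fields name and ssn *)
val xs = 3 :: 4 :: 5 :: nil               (* list built with cons *)
val ys = [3, 4] @ [5]                     (* list built with append *)

val second = #2 t        (* select the second tuple component *)
val who = #name r        (* select a record field by name *)
val same = (xs = ys)     (* both notations build the same list *)
```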
An interpreter
- Interpreters are usually implemented as a series of transformers
- stream of characters (concrete syntax) -> [lexing/parsing] -> abstract syntax -> [evaluate] -> abstract value -> [print] -> stream of characters
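The "series of transformers" idea can be sketched as three functions composed with o. This is a toy, not the course's interpreter: the input language is assumed to be just integers separated by "+" signs, and the names parse, evaluate, show and interpret are hypothetical.

```sml
(* lexing/parsing: stream of characters -> abstract syntax (a list of numbers) *)
fun parse (s : string) : int list =
  List.mapPartial Int.fromString
    (String.tokens (fn c => c = #"+" orelse Char.isSpace c) s)

(* evaluate: abstract syntax -> abstract value *)
fun evaluate (ns : int list) : int = foldl (op +) 0 ns

(* print: abstract value -> stream of characters *)
fun show (n : int) : string = Int.toString n

(* the whole interpreter is the composition of the three stages *)
val interpret = show o evaluate o parse
val result = interpret "1 + 2 + 39"
```

Each stage only depends on the type produced by the previous one, which is why the stages can be written and tested separately.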
A little language (LL)
- An arithmetic expression e is
- a boolean value
- an if statement (if e1 then e2 else e3)
- an integer
- an add operation
- a test for zero (isZero e)
LL abstract syntax in ML
    datatype term =
        Bool of bool
      | If of term * term * term
      | Num of int
      | Add of term * term
      | IsZero of term
- vertical bar separates alternatives
LL abstract syntax in ML
- This one declaration creates
- a new type (called term)
- a new set of functions for creating terms (Bool, If, Num, Add, IsZero)
- a new set of patterns you can use in case statements (like C's switch) that check what sort of term object you have
LL abstract syntax in ML
    datatype term =
        Bool of bool
      | If of term * term * term
      | Num of int
      | Add of term * term
      | IsZero of term
- by convention, constructors are capitalized
- constructors can take a single argument of a particular type
- term * term * term is the type of a tuple, in this case a triple of 3 term objects
- vertical bar separates alternatives
LL abstract syntax in ML
- In your program, writing Add (Num 2, Num 3) makes an object tagged with Add containing 2 sub-objects tagged with Num
- it represents the expression 2 + 3
- (tree figure: an Add node with two children, Num 2 and Num 3)
LL abstract syntax in ML
- If (Bool true, Num 0, Add (Num 2, Num 3)) represents "if true then 0 else 2 + 3"
- (tree figure: an If node with three children: Bool true, Num 0, and an Add node whose children are Num 2 and Num 3)
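To make the tree picture concrete, the sketch below builds the If term from the slide and walks it with a recursive function. The function nodeCount and the names e and n are hypothetical, introduced only for this illustration.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

(* The tree for "if true then 0 else 2 + 3". *)
val e = If (Bool true, Num 0, Add (Num 2, Num 3))

(* Count the nodes of a term tree: one per constructor application. *)
fun nodeCount (t : term) : int =
  case t of
    Bool _ => 1
  | Num _ => 1
  | If (t1, t2, t3) => 1 + nodeCount t1 + nodeCount t2 + nodeCount t3
  | Add (t1, t2) => 1 + nodeCount t1 + nodeCount t2
  | IsZero t1 => 1 + nodeCount t1

val n = nodeCount e
```

Here n is 6: the If node, Bool true, Num 0, the Add node, Num 2 and Num 3.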
Function declarations
    fun isValue (t:term) : bool =
      case t of
        Num n => true
      | Bool b => true
      | _ => false
- isValue is the function name
- the function parameter is t, with type term
- the function result type is bool
- Num n, Bool b and _ are patterns
- the default pattern _ matches anything
Function declarations
- ML type inference can infer the types of parameters and results
    fun isValue t =
      case t of
        Num n => true
      | Bool b => true
      | _ => false
A type error
    fun isValue t =
      case t of
        Num n => n
      | _ => false

    ex.sml:22.3-24.15 Error: types of rules don't agree [literal]
      earlier rule(s): term -> int
      this rule: term -> bool
      in rule:
        _ => false
A type error
- Sometimes, ML will give you several errors in a row
    ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
      earlier rule(s): term -> int
      this rule: term -> bool
      in rule:
        _ => true
    ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
      earlier rule(s): term -> int
      this rule: term -> bool
      in rule:
        _ => false
A very subtle error
    fun isValue t =
      case t of
        num => true
      | _ => false
- The code above type checks. But when we test it, the function always returns true. What has gone wrong?
- num is not capitalized (and has no argument)
- ML treats it like a variable pattern (matches anything!)
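The difference between a variable pattern and a constructor pattern can be seen side by side in the sketch below. The names isValueBuggy, wrong and right are hypothetical; the unreachable _ rule is left out of the buggy version so the sketch does not depend on how a given compiler treats a rule that can never match.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

(* Buggy: "num" is a variable pattern, so it binds to every term. *)
fun isValueBuggy (t : term) : bool =
  case t of num => true

(* Fixed: "Num _" and "Bool _" are constructor patterns. *)
fun isValue (t : term) : bool =
  case t of
    Num _ => true
  | Bool _ => true
  | _ => false

val wrong = isValueBuggy (Add (Num 2, Num 3))   (* accepts a non-value *)
val right = isValue (Add (Num 2, Num 3))        (* rejects it *)
```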
Exceptions
    exception Error of string
    fun debug s : unit = raise (Error s)
- in SML interpreter
    - debug "hello";
    uncaught exception Error
      raised at: ex.sml:15.28-15.35
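The session above shows an exception escaping to the top level. As a complement, here is a minimal sketch of catching it with handle, which the slides do not show; the name msg is hypothetical.

```sml
exception Error of string

fun debug (s : string) : unit = raise (Error s)

(* handle catches the exception and recovers the string it carries;
   "no exception" would be the result if debug returned normally. *)
val msg = (debug "hello"; "no exception") handle Error s => "caught: " ^ s
```

Both sides of handle must have the same type (string here), because either one can be the result of the whole expression.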
Evaluator
    fun isValue t = ...
    exception NoRule
    fun eval t =
      case t of
        (Bool _ | Num _) => t
      | ...
Evaluator
    ...
    fun eval t =
      case t of
        (Bool _ | Num _) => t
      | If (t1,t2,t3) =>
          let val v = eval t1 in
            case v of
              Bool b => if b then (eval t2) else (eval t3)
            | _ => raise NoRule
          end
- let statement for remembering temporary results
Evaluator
    exception NoRule
    fun eval1 t =
      case t of
        (Bool _ | Num _) => ...
      | ...
      | Add (t1,t2) =>
          case (eval1 t1, eval1 t2) of
            (Num n1, Num n2) => Num (n1 + n2)
          | (_,_) => raise NoRule
Finishing the Evaluator
    fun eval1 t =
      case t of
        ...
      | ...
      | Add (t1,t2) => ...
      | IsZero t => ...
- be sure your case is exhaustive
Finishing the Evaluator
    fun eval1 t =
      case t of
        ...
      | ...
      | Add (t1,t2) => ...
- What if we forgot a case?
    ex.sml:25.2-35.12 Warning: match nonexhaustive
      (Bool _ | Zero) => ...
      If (t1,t2,t3) => ...
      Add (t1,t2) => ...
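Putting the evaluator slides together, one possible complete version looks like the sketch below. The slides leave the IsZero case as "...", so its behaviour here (return Bool true exactly when the argument evaluates to Num 0) is an assumption, as are the names v and stuck. Separate Bool and Num rules are used instead of an or-pattern so the code stays within plain Standard ML.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

exception NoRule

fun eval (t : term) : term =
  case t of
    Bool _ => t
  | Num _ => t
  | If (t1, t2, t3) =>
      (case eval t1 of
         Bool b => if b then eval t2 else eval t3
       | _ => raise NoRule)
  | Add (t1, t2) =>
      (case (eval t1, eval t2) of
         (Num n1, Num n2) => Num (n1 + n2)
       | _ => raise NoRule)
  | IsZero t1 =>               (* assumed semantics, not given on the slides *)
      (case eval t1 of
         Num n => Bool (n = 0)
       | _ => raise NoRule)

(* if isZero 0 then 2 + 3 else 7 *)
val v = eval (If (IsZero (Num 0), Add (Num 2, Num 3), Num 7))

(* adding a boolean to a number has no rule *)
val stuck = (eval (Add (Bool true, Num 1)); false) handle NoRule => true
```

The parentheses around each inner case matter: without them, the rules that follow would be parsed as part of the inner case. Every constructor of term has a rule, so the outer match is exhaustive.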
Summary
- All ML expressions produce values that have a particular type
- ML doesn't have "statements"
- ML can do type inference (and give you hard-to-decrypt error messages)
- ML data types are super-cool
- a new type name (term)
- new constructors (Bool, If, Num, ...)
- new patterns (Bool b, If (x,y,_), Num _, ...)
- ML has a top-level loop to execute commands and a compilation manager
- type CM.make() to load and compile a project
- edit sources.cm to add new files
Last Things
- Learning to program in SML can be tricky at first
- But once you get used to it, you will never want to go back to imperative languages
- Check out the reference materials listed on the course homepage