Title: Lecture
1Lecture 1, Jan. 9, 2007
- Course Mechanics
- Text Book
- Downloading SML
- Syllabus - Course Overview
- Entrance Exam
- Standard ML
- This week's assignment
- Top to bottom example
- Lexical issues
- Parsing and syntax issues
- Translation issues
2Acknowledgements
- The material taught in this course was made possible by many people. Here is a partial list:
- Andrew Tolmach
- Nathan Linger
- Harry Porter
- Jinke Lee
3Class Web Page
- The CS321 class web page can be found at
- www.cs.pdx.edu/sheard/course/Cs321
- Contents of the page
- Course Syllabus
- Link to the ML home page
- Copies of the PowerPoint slides used in lectures
- Copies of the assignments
- Project Description
- Copies of the SML code illustrated in the lectures
- The web page will be updated after each lecture.
4Today's Assignments
- Reading
- Engineering a Compiler
- Available In the PSU bookstore
- Chapter 1, pp 1-26
- There will be a 5 minute quiz on the reading Wednesday.
- Search
- Find the class webpage
- 1 page programming Assignment
- Due Wednesday, Jan 10, 2007. In Just 2 Days!!
- Log in to some SML system and see how it operates. Type your solutions to the programming problems (In Class Exercises 1 and 2 in this handout) into a file, load them into SML, get them running, print them out, and turn them in on Wednesday. What matters here is that you try out the SML system, not that you get them perfect.
5Course Information
- CS321 - Languages and Compiler Design
- Time: Monday and Wednesday, 18:00-19:50
- Place: PCAT 138
- Instructor: Tim Sheard
- office: room 115, CS Dept, 4th Ave Building, Portland State Univ.
- phone: 503-725-2410 (work), 503-649-7242 (home)
- office hours: before class in my office (5:00-5:50), or by appt.
- Assignments
- Reading from text and handouts (quizzes on reading)
- Daily, 1 page programming assignments
- 3 part programming project
- Grading
- midterm exam (25%)
- 3 parts of project (30%)
- Daily 1 page assignments and quizzes (15%)
- Final exam (30%)
6Examinations
- Entrance Exam.
- Do you know your REs and CFGs?
- Quizzes on Reading Material.
- There is a possible quiz on every reading assignment
- There will be a quiz on Wednesday!
- Mid Term exam
- Wed. Feb 14, 2007. Time: in class.
- Final exam
- Monday, Mar. 19, 2007. Time: 6:00-7:50.
7Text Book
- Text: Engineering a Compiler
- Keith D. Cooper and Linda Torczon
- Other Reference Materials
- Auxiliary Material
- Elements of Functional Programming (SML book)
- by Chris Reade, Addison Wesley, ISBN 0-201-12915-9
- Using the SML/NJ System
- http://www.cs.cmu.edu/petel/smlguide/smlnj.htm
- Class Handouts
- Each class, a copy of that day's slides will be available as a handout.
- I will post files that contain the example programs used in each lecture on the class web page www.cs.pdx.edu/sheard/course/Cs321
- I will post assignments there as well.
8Labs
- Whenever you learn a new language it's great to have someone looking over your shoulder.
- In this spirit I have scheduled some lab times where people can work on learning ML while I am there to help.
- FAB INTEL Lab (FAB 55-17), downstairs by the Engineering and Technology Management departmental offices
- Friday Jan. 12, 2007. 4:00-5:30 PM
- Tuesday Jan. 16, 2007. 4:00-5:30 PM
- Friday Jan. 19, 2007. 4:00-5:30 PM
- Labs are not required, but attendance of at
least one is highly recommended!
9Installing SML
- Software can be obtained at
- http://www.smlnj.org/
- I am using the most recent version 110.60
- but it displays the version 110.57 when it runs
- Browse the documentation and literature section of the SML web page. Find some resources that you can use.
- SML also runs on the PSU Linux and Intel labs
- Linux
- usepkg sml
- then log out, or start a new shell
- type sml
- Intel
- In a command window
- p:\programs\smlnj\addpkg.cmd
- then log out, or start a new command window
- then just type
- N:\sml
10Entrance Exam
- CS321 has some pretty serious prerequisites.
- 1. Write a regular expression for the set of strings that begin with an a, which is followed by an arbitrary number of b's or c's, and end with a d.
- e.g. ad, abbbd, abcbcbcd, etc.
- 2. Transform your regular expression into a DFA.
- 3. Write a context free grammar that recognizes the same set of strings as your RE.
- 4. Transform your CFG into a CFG that is left-recursion free.
11Academic Integrity
- Students are expected to be honest in their academic dealings. Dishonesty is dealt with severely.
- Homework. Pass in only your own work.
- Program assignments. Program independently.
- Examinations. Notes and such, only as each instructor allows.
- OK to discuss how to solve problems with other students, but each student should write up, debug, and turn in his own solution.
12Course Thesis
- This course is about programming languages. We study languages in two ways.
- From the perspective of the user
- From the perspective of the implementer (compiler writer)
- We will learn about some languages you may never have heard of. We will learn to program in one of them (Standard ML). It's good to learn a new language in depth.
- This course is also about programming. There will be extensive programming assignments in SML. If you don't do them, you won't learn.
- You're deluding yourself if you think you can learn the material without doing the exercises!
- We will write a compiler for a Java subset. It's good to understand the implementation details of a language you already know.
13This course is all about programming
- What makes a good program?
- Write at least 3 things on a piece of paper.
14Standard ML
- In this course we will use an implementation of the language Standard ML
- The SML/NJ Homepage has lots of useful information: http://www.smlnj.org/
- You can get a version to install on your own machine there.
- I will use version 110.57 or 110.60 of SML. Earlier versions will probably work as well. I don't foresee any problems with other versions, but if you want to use the identical version that I use in class then this is the one.
15Characteristics of SML
- Applicative style
- input output description of problem.
- First class functions
- pass as parameters
- return as value of a function
- store in data-structures
- Less Importantly
- Automatic memory management (G.C., no new or malloc)
- Use of a strong type system which uses type inference, i.e. no declarations but still strongly typed.
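The characteristics listed on this slide can be illustrated with a minimal sketch. The names twice, addN, steps and applyAll are made up for illustration and are not part of the course code. No type declarations appear, yet every definition is fully typed by inference.

```sml
(* Pass a function as a parameter: apply f two times. *)
fun twice f x = f (f x)

(* Return a function as the value of a function. *)
fun addN n = fn x => x + n

(* Store functions in a data structure (here, a list). *)
val steps = [addN 1, addN 10, twice (addN 100)]

(* Apply each stored function in turn, left to right. *)
fun applyAll fs x = foldl (fn (f, acc) => f acc) x fs

val six = twice (addN 2) 2        (* 2 + 2 + 2 *)
val total = applyAll steps 0      (* 0 + 1 + 10 + 200 *)
```

Loading this into SML prints the inferred types, for example that twice takes a function from some type to itself.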
16Syntactic Elements
- Identifiers start with a letter followed by digits or other letters or primes or underscores.
- Valid examples: a a3 a'b aF
- Invalid examples: 12A
- Identifiers can also be constructed with a sequence of operators like !@
- Reserved words include
- fun val datatype if then else
- if of let in end type
17Interacting
- The normal style for interaction is to start SML, and then type definitions into the window.
- Types of commands
- 4 + 5;
- val x = 34;
- fun f x = x + 1;
- Here are two commands you might find useful.
- val pwd = OS.FileSys.getDir;
- val cd = OS.FileSys.chDir;
- To load a file that has an SML program, type
- use "file.sml";
18The SML Read-Typecheck-Eval-Print Loop
- Standard ML of New Jersey v110.57 [built: Mon Nov 21 21:46:28 2005]
- -
- - 3 + 5;
- val it = 8 : int
- -
- - print "Hi there\n";
- Hi there
- val it = () : unit
- -
- - val x = 22;
- val x = 22 : int
- -
- - x + 5;
- val it = 27 : int
- -
- - val pwd = OS.FileSys.getDir;
- val pwd = fn : unit -> string
- - val cd = OS.FileSys.chDir;
Note the semicolon when you're ready to evaluate. Otherwise commands can spread across several lines.
19In Class Exercise 1
- Define prefix and lastone in terms of head, tail and reverse.
- First make a file S01code.sml
- Start sml
- Change directory to where the file resides
- Load the file ( use "S01code.html" )
- Test the function
fun lastone x = hd (rev x)
fun prefix x = rev (tl (rev x))
Standard ML of New Jersey v110.57
- val cd = OS.FileSys.chDir;
val cd = fn : string -> unit
- cd "D:/work/sheard/courses/PsuCs321/web/notes";
- use "S01code.html";
[opening S01code.html]
val lastone = fn : 'a list -> 'a
val prefix = fn : 'a list -> 'a list
val it = () : unit
- lastone [1,2,3,4];
val it = 4 : int
20In Class Exercise 2
- Define map and filter functions
- mymap f [1,2,3] = [f 1, f 2, f 3]
- filter even [1,2,3,4,5] = [2,4]
- fun mymap f [] = []
-   | mymap f (x::xs) = (f x)::(mymap f xs)
- fun filter p [] = []
-   | filter p (x::xs) =
-       if (p x) then x::(filter p xs) else (filter p xs)
- Sample Session
- - mymap plusone [2,3,4];
- [3, 4, 5]
- - filter even [1,2,3,4,5,6];
- [2, 4, 6]
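The sample session uses plusone and even, which the slide does not define. A minimal sketch of what they might look like follows; both definitions are assumptions. To keep the block self-contained it uses the Basis Library functions map and List.filter, which should behave like the mymap and filter of this exercise.

```sml
(* Assumed helper definitions; they are not given on the slide. *)
fun plusone x = x + 1
fun even n = n mod 2 = 0

(* map and List.filter come from the SML Basis Library and stand in
   for mymap and filter so this file loads on its own. *)
val mapped = map plusone [2,3,4]
val evens = List.filter even [1,2,3,4,5,6]
```

Once your own mymap and filter load, replacing the two library calls with them is a quick way to check that the results agree.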
21Course topics
- Programming Language
- Types of languages
- Data types and languages
- Types and languages
- Compilers
- Lexical analysis
- Parsing
- Translation to abstract syntax using modern parser generator technology.
- Type checking
- identifiers and symbol table organization,
- Next Quarter in the second class of the sequence
- Intermediate representations
- Backend analysis
- Transformations and optimizations for a number
of different kinds of languages
22Multi Pass Compilers
- Passes
- text
- tokens
- syntax trees
- intermediate forms
- (three address code, CPS code, etc)
- assembly code
- machine code
- Each pass translates one form into another, or into the same form; the latter is often called a source to source transformation.
23The Top to Bottom Example
Text: z = x + pi * 12.0
Tokens: id(z) eql id(x) plus id(pi) times float(12.0)
Syntax tree (figure): leaves Id(z), Id(x), Id(pi), float(12.0)
24Passes (cont)
- Three address code
- temp1 = pi * 12.0
- z = x + temp1
- Assembly level code
- ld r1,pi
- ldi r2,12.0
- mul r1,r2
- ld r2,x
- add r1,r2
- st r1,z
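As a rough sketch of how these passes might be represented in SML, the following models the running example z = x + pi * 12.0 as tokens, as a syntax tree, and as three address code. All datatype and function names here (token, exp, stmt, fresh, gen, emit, genStmt) are illustrative assumptions, not the course's project code, and the instructions are plain strings for simplicity.

```sml
(* Tokens, as a lexer might produce them. *)
datatype token = ID of string | EQL | PLUS | TIMES | FLOAT of real
val tokens = [ID "z", EQL, ID "x", PLUS, ID "pi", TIMES, FLOAT 12.0]

(* Syntax trees for expressions and for one statement form. *)
datatype exp = Id of string
             | Float of real
             | Plus of exp * exp
             | Times of exp * exp
datatype stmt = Assign of string * exp

(* Supply of fresh temporary names: temp1, temp2, ... *)
val counter = ref 0
fun fresh () = (counter := !counter + 1; "temp" ^ Int.toString (!counter))

(* gen e returns (name holding the value of e, instructions computing it).
   emit dest oper a b computes "a oper b" directly into dest. *)
fun gen (Id x) = (x, [])
  | gen (Float r) = (Real.toString r, [])
  | gen (Plus (a, b)) = let val t = fresh () in (t, emit t "+" a b) end
  | gen (Times (a, b)) = let val t = fresh () in (t, emit t "*" a b) end
and emit dest oper a b =
  let val (ra, ca) = gen a
      val (rb, cb) = gen b
  in ca @ cb @ [dest ^ " = " ^ ra ^ " " ^ oper ^ " " ^ rb] end

(* The outermost operator stores straight into the assigned variable. *)
fun genStmt (Assign (x, Plus (a, b))) = emit x "+" a b
  | genStmt (Assign (x, Times (a, b))) = emit x "*" a b
  | genStmt (Assign (x, e)) = [x ^ " = " ^ #1 (gen e)]

val tree = Assign ("z", Plus (Id "x", Times (Id "pi", Float 12.0)))
val code = genStmt tree
```

For this tree, code holds two instructions: one that multiplies pi by the constant into temp1, followed by z = x + temp1.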
25Lexical Analysis
- Produces Tokens and Deals with
- white space
- comments
- reserved word identification
- symbol table interface
- Tokens are the terminals of grammars.
- Lexical analysis reads the whole program character by character, so it needs to be efficient. This implies fancy buffering techniques etc. Modern lexer generators handle these problems, so we will ignore them.
26Tokens, Patterns and Lexemes
- Many strings from the input may produce the same TOKEN, e.g. identifiers, integer constants, floats
- A PATTERN is a rule describing which strings are assigned to a token.
- A LEXEME is the exact sequence of input characters matched by a PATTERN.
27Examples
- lexeme    pattern    token
- x                    Id "x"
- abc                  Id "abc"
- 152                  Constant(152)
- then      then       ThenKeyword
- Many lexemes map to the same token, e.g. x and abc.
- Note, some lexemes might match many patterns, e.g. "then" above. Need to resolve ambiguity.
- Since tokens are terminals, they must be "produced" by the lexical phase with synthesized attributes in place (e.g. name of an identifier), e.g. id(x) and constant(152)
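A small hand-written lexer makes the lexeme, pattern and token distinction concrete. This is only a sketch under simplifying assumptions: identifiers are letters followed by letters or digits (no primes or underscores), constants are unsigned integers, and "then" is the only reserved word. The function names word, span and lex are made up for illustration.

```sml
datatype token = Id of string | Constant of int | ThenKeyword

(* Resolve the ambiguity: the lexeme "then" matches the identifier
   pattern too, but the reserved word wins. *)
fun word "then" = ThenKeyword
  | word s = Id s

(* Split off the longest prefix of characters satisfying p. *)
fun span p [] = ([], [])
  | span p (c::cs) =
      if p c then let val (ys, rest) = span p cs in (c::ys, rest) end
      else ([], c::cs)

(* Turn a character list into tokens, skipping white space. *)
fun lex [] = []
  | lex (c::cs) =
      if Char.isSpace c then lex cs
      else if Char.isAlpha c then
        let val (lexeme, rest) = span Char.isAlphaNum (c::cs)
        in word (implode lexeme) :: lex rest end
      else if Char.isDigit c then
        let val (lexeme, rest) = span Char.isDigit (c::cs)
        in Constant (valOf (Int.fromString (implode lexeme))) :: lex rest end
      else raise Fail ("unexpected character " ^ str c)

val tokens = lex (explode "x abc 152 then")
(* tokens = [Id "x", Id "abc", Constant 152, ThenKeyword] *)
```

Each token carries its synthesized attribute (the identifier's name or the constant's value) as the constructor argument.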
28Syntax, Parse Trees and Grammars
- Syntax (the physical layout of the program)
- Grammars describe precisely the syntax of a language. Two kinds of grammars which compiler writers use a lot are regular and context free.
- Informal Definitions of
- Regular
- concatenation, union, star
- Context Free
- only one symbol on the lhs of a production
29Example Grammar
- Sentence -> Subject Verb Object
- Subject -> Proper-noun
- Object -> Article Adjective Noun
- Verb -> ate | saw | called
- Noun -> cat | ball | dish
- Article -> the | a
- Adjective -> big | bad | pretty
- Proper-noun -> tim | mary
- Start Symbol: Sentence
- Example sentence: tim ate the big ball
30Recursive Grammar Examples
- Recursive Grammars describe infinite languages
- list -> [ num morenum ]
- morenum -> , num morenum
-          | (empty)
derives [2], [2,4], [2,4,6] ...
- Exp -> id
-      | Exp + Exp
-      | Exp * Exp
-      | ( Exp )
derives x, x+x, x+x+x, ...
31Parse Trees
- Each nonterminal on the lhs of a production "roots" a tree
- Each node in a tree with all its immediate children is derived from a single production of the grammar
- We desire a program which constructs a parse tree from a string. Such programs are different for every grammar, so we sometimes use tools to construct such programs (yacc).
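To make "a program which constructs a parse tree from a string" concrete, here is a hand-written sketch for the small English grammar from the Example Grammar slide, read as Sentence -> Subject Verb Object, Subject -> Proper-noun, Object -> Article Adjective Noun, with the word lists as alternatives. The tree representation and the helper names (wordClass, production, parse, leaves) are assumptions made for illustration; a tool such as yacc would generate a different kind of parser.

```sml
(* A node is a nonterminal with its immediate children, so each Node
   corresponds to one production; a Leaf is a terminal word. *)
datatype tree = Node of string * tree list | Leaf of string

(* A parser takes the remaining words and returns a tree plus the
   words left over, or NONE on failure. *)
fun wordClass name words (w :: rest) =
      if List.exists (fn v => v = w) words
      then SOME (Node (name, [Leaf w]), rest) else NONE
  | wordClass _ _ [] = NONE

(* Run the parsers for a right-hand side in sequence. *)
fun production name parsers ws =
  let fun go [] kids rest = SOME (Node (name, rev kids), rest)
        | go (p :: ps) kids rest =
            (case p rest of
               SOME (t, rest') => go ps (t :: kids) rest'
             | NONE => NONE)
  in go parsers [] ws end

val properNoun = wordClass "Proper-noun" ["tim", "mary"]
val verb       = wordClass "Verb" ["ate", "saw", "called"]
val noun       = wordClass "Noun" ["cat", "ball", "dish"]
val article    = wordClass "Article" ["the", "a"]
val adjective  = wordClass "Adjective" ["big", "bad", "pretty"]

val subject  = production "Subject" [properNoun]
val object   = production "Object" [article, adjective, noun]
val sentence = production "Sentence" [subject, verb, object]

(* Succeed only if the whole input is consumed. *)
fun parse s =
  case sentence (String.tokens Char.isSpace s) of
    SOME (t, []) => SOME t
  | _ => NONE

(* Read the terminals back off a tree, left to right. *)
fun leaves (Leaf w) = [w]
  | leaves (Node (_, kids)) = List.concat (map leaves kids)

val tree = parse "tim ate the big ball"
```

Because the parser is written directly from the productions, a different grammar needs a different set of definitions, which is the motivation for parser generators.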
32Syntax Directed Translations
- A syntax directed translation traverses a syntax tree and builds a translation in the process.
- Considerations
- Tree Traversal orders
- Left to right?
- right to left?
- in-order, pre-order, or post-order
- Where does the information about what to do in the traversal come from?
- Attribute grammars
- Inherited attributes
- Synthesized attributes
33Example Translation Process
- Translation as an abstract syntax to abstract syntax transformer
- We represent this as a grammar with actions { ... }. The action is performed when that production is reduced.
- Exp -> Term terms
- terms -> + Term { print "+" } terms
-        | (empty)
- Term -> Factor factors
- factors -> * Factor { print "*" } factors
-          | (empty)
- Factor -> id { print id.name }
-         | ( Exp )
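One way to read a grammar with actions is as a set of mutually recursive functions, one per nonterminal, that perform each action at the point where it appears in the production. The sketch below assumes the grammar is the usual infix to postfix scheme (terms and factors introduced by + and *, with actions that print the operator after its operands). Instead of printing, it collects the output in a list so the result can be inspected; the token type and the function names are illustrative assumptions.

```sml
datatype token = ID of string | PLUS | TIMES | LPAREN | RPAREN

(* Each function takes (remaining tokens, output so far in reverse)
   and returns the same pair after consuming its nonterminal. *)
fun exp (ts, out) = terms (term (ts, out))
and terms (PLUS :: ts, out) =
      let val (ts', out') = term (ts, out)
      in terms (ts', "+" :: out') end      (* action: emit "+" *)
  | terms state = state                    (* empty production *)
and term (ts, out) = factors (factor (ts, out))
and factors (TIMES :: ts, out) =
      let val (ts', out') = factor (ts, out)
      in factors (ts', "*" :: out') end    (* action: emit "*" *)
  | factors state = state                  (* empty production *)
and factor (ID name :: ts, out) = (ts, name :: out)  (* action: emit the name *)
  | factor (LPAREN :: ts, out) =
      (case exp (ts, out) of
         (RPAREN :: ts', out') => (ts', out')
       | _ => raise Fail "expected )")
  | factor _ = raise Fail "expected id or ("

(* Translate a whole token list into a postfix string. *)
fun translate ts =
  case exp (ts, []) of
    ([], out) => String.concatWith " " (rev out)
  | _ => raise Fail "unexpected trailing tokens"

val postfix = translate [ID "x", PLUS, ID "pi", TIMES, ID "y"]
(* postfix = "x pi y * +" *)
```

The operator is emitted only after both operands have been translated, which is what makes the output postfix.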
34Semantics
- How do we know what to translate the syntax tree into?
- How do we know if it is correct?
- Semantics
- denotational semantics
- operational semantics
- interpreters
- Very useful in writing compilers since they give
a reference when trying to decide what the
compiler should do in particular cases.
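As a small illustration of an interpreter serving as a reference, the sketch below evaluates expression trees directly. The datatype, the association-list environment and the names lookup and eval are assumptions for illustration, and values are integers to keep it short. A compiler's output for the same expression should compute whatever this evaluator returns.

```sml
datatype exp = Num of int
             | Var of string
             | Plus of exp * exp
             | Times of exp * exp

(* The environment maps variable names to values. *)
fun lookup x [] = raise Fail ("unbound variable " ^ x)
  | lookup x ((y, v) :: rest) = if x = y then v else lookup x rest

(* The meaning of an expression, defined by cases on its syntax. *)
fun eval env (Num n) = n
  | eval env (Var x) = lookup x env
  | eval env (Plus (a, b)) = eval env a + eval env b
  | eval env (Times (a, b)) = eval env a * eval env b

val env = [("x", 2), ("y", 3)]
val result = eval env (Plus (Var "x", Times (Var "y", Num 12)))   (* 2 + 3 * 12 = 38 *)
```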
35Overview
- Compilation is a large process
- It is often broken into stages
- The theories of computer science guide us in writing programs at each stage.
- We must understand what a program means if we are to translate it correctly.
- Many phases of the compiler try to optimize by translating one form into a better (more efficient?) form.
- Most of compiling is about pattern matching, so languages and tools that support pattern matching are very useful.