Title: Lecture
1Lecture 6, Jan. 31, 2007
- Notes about homework
- Project 1
- More about ML-LEX
- String lexers
- File lexers
- Handling exceptions
- Using named REs
- Library functions
- Anonymous functions
- The compile manager
2Survey
- How many intend to take CS322 next quarter?
- How many people would like to see the class in the evening, as it is this quarter (6:00 - 7:50 pm)?
- How many would like to see the class earlier in the day, say 2:00 - 3:50 pm?
- How many would like to see the class later in the afternoon, but not in the evening, say 4:00 - 5:50 pm?
3A note about handing in homework
- When you hand in homework, please hand in the following:
- The complete file as you wrote it, with no extra text, etc. I should be able to load the file (as is) into SML.
- A trace of your test of the program. This should include commands to load the file above, and a sequence of tests that test the code you wrote. Many times I ask you to extend some code. Be sure to test the code you wrote, not my code. My code has enough errors that I don't want to know about new ones, unless they affect your code.
- You should include enough tests to convince me your program works. You should include enough tests so that every statement in your program is exercised at least once in some test. 3 tests per function is the minimum; some functions may require more if they have many paths.
- An optional cover sheet where you provide any additional information I need to grade your assignment.
- Be sure that your name is clearly written on the top left hand side of what you hand in.
- If your program doesn't load, a trace of the errors may help me figure out what went wrong, so I can suggest a fix.
4Project 1
- Project 1 is assigned today (Wed. Jan 31, 2007)
- It is due in 2 weeks, Wed. Feb. 15, 2007
- It can be downloaded off the web page.
- Other useful links can also be found there.
- There will be no homework assigned next Monday or Wednesday. But there is homework assigned today.
- Today's homework is practice to get you started on using sml-lex.
5Example lex file
type lexresult = unit
type pos = int
type svalue = int
exception EOF
fun eof () = (print "eof"; raise EOF)
%%
%%
[\t\ ]+               => ( lex() (* ignore whitespace *) );
Anne|Bob|Spot         => ( print (yytext^" is a proper noun\n") );
a|the                 => ( print (yytext^" is an article\n") );
boy|girl|dog          => ( print (yytext^" is a noun\n") );
walked|chased|ran|bit => ( print (yytext^" is a verb\n") );
[a-zA-Z]+             => ( print (yytext^" Might be a noun?\n") );
.|\n                  => ( print yytext (* Echo the string *) );
lexresult must be defined in every lex program
The function eof must be defined in every lex
program
6Running ml-lex
- Cd to the directory where the foo.lex file resides (or else use the full pathname of the file).
- E.g.
- ..LexYacc\english> ml-lex english.lex
- Number of states = 43
- Number of distinct rows = 33
- Approx. memory size of trans. table = 4257 bytes
- This creates the file english.lex.sml
- Start SML and then load the file.
- LexYacc\english> sml
- Standard ML of New Jersey v110.57 [built: Mon Nov 21 21:46:28 2005]
- - use "english.lex.sml";
- [opening english.lex.sml]
- structure Mlex :
-   sig
-     val makeLexer : (int -> string) -> unit -> Internal.result
-     exception LexError
-     structure Internal : <sig>
This is a synonym for the type lexresult defined
in the lex file.
7Building the lexer
- Consider the function Mlex.makeLexer
- Mlex.makeLexer : (int -> string) -> unit -> lexresult
- It takes a function as an argument. This function feeds the lexical analyzer the input n characters at a time.
- val testString = ref "the boy chased the dog";
- fun feed n =
-   let val ss = !testString
-   in if String.size ss < n
-      then ( testString := ""; ss )
-      else ( testString := String.extract(ss,n,NONE);
-             String.extract(ss,0,SOME n) )
-   end;
- val lex = Mlex.makeLexer feed;
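The feeding idea can be tried without a generated lexer. The following is a minimal sketch, assuming a hypothetical helper makeFeed (not part of ML-Lex) that wraps the feed logic from the slide around a private reference, so each call hands back the next n characters.

```sml
(* makeFeed is a hypothetical helper introduced for illustration only. *)
fun makeFeed (s : string) : int -> string =
  let val rest = ref s                      (* the input not yet consumed *)
  in fn n =>
       let val ss = !rest
       in if String.size ss < n
          then ( rest := ""; ss )           (* hand over whatever is left *)
          else ( rest := String.extract (ss, n, NONE);
                 String.extract (ss, 0, SOME n) )
       end
  end

val feed   = makeFeed "the boy chased the dog"
val chunk1 = feed 8     (* "the boy " *)
val chunk2 = feed 8     (* "chased t" *)
val chunk3 = feed 8     (* "he dog", only 6 characters remained *)
val chunk4 = feed 8     (* "", the input is exhausted *)
```

Returning the empty string is how the feed function tells the lexer that the input has ended.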
8Running the lexer
- - val lex = Mlex.makeLexer feed;
- val lex = fn : unit -> Mlex.Internal.result
- - lex();
- the is an article
- val it = () : Mlex.Internal.result
- - lex();
- boy is a noun
- val it = () : Mlex.Internal.result
- - lex();
- chased is a verb
- val it = () : Mlex.Internal.result
- - lex();
- the is an article
- val it = () : Mlex.Internal.result
- - lex();
- dog is a noun
- val it = () : Mlex.Internal.result
- unexpected end of file
9Exceptions
- exception Error of string;
- fun error s = raise (Error s);
- fun ex3 b =
-   (if b then error "true branch"
-         else "false branch")
-   handle Error message => message
-        | other => raise other;
A new exception is declared, it carries a string
as error information. Exceptions can carry any
kind of data.
Exceptions can be raised, to short-circuit normal
evaluation.
Main computation returns a string
Keyword handle
Handler also returns a string
A handler, like a case, has multiple clauses; each can handle a different kind of error. Note the | separating clauses.
10Syntax of case vs handle
- case is before the computation being analyzed
- case (revonto (g x) 4) of
-     [] => true
-   | (x::xs) => false
- handle is after the computation that might fail
- (compute (g y) (length zs))
-   handle Error s => g s
-        | BadLevel n => n+1
-        | other => raise other
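To see the two forms side by side in code that loads on its own, here is a small sketch. The function level and the messages are assumptions made up for illustration; only the exception names Error and BadLevel come from the slides.

```sml
exception Error of string
exception BadLevel of int

(* level is a hypothetical function that can fail in two ways. *)
fun level n =
  if n < 0 then raise BadLevel n
  else if n > 9 then raise Error "level too large"
  else n

(* case comes before the value it analyzes ... *)
fun isEmpty xs =
  case xs of
      []       => true
    | (_ :: _) => false

(* ... while handle follows the computation that might fail.
   Every clause must return the same type as the main computation. *)
fun describe n =
  (Int.toString (level n))
    handle Error s    => s
         | BadLevel k => "bad level " ^ Int.toString k

val d1 = describe 3      (* "3" *)
val d2 = describe 12     (* "level too large" *)
val d3 = describe ~4     (* "bad level ~4" *)
```

An exception that matches no clause of a handler simply keeps propagating, so a final clause of the form other => raise other only makes that behaviour explicit.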
11Catching Exceptions in the Lexer
- val lex =
-   let val f = Mlex.makeLexer feed
-       fun lex () =
-         (f ()) handle
-             Mlex.UserDeclarations.EOF =>
-               print "\nReached end of file\n"
-           | other => print "Lex Error"
-   in lex end;
Things defined in the first section of a lex file
appear in the inner library
12A String Lexer
- For testing purposes it would be nice to generate lexers that lex a string given as input (rather than one fixed in a variable like testString).
- fun makeStringLexer s =
-   let val testString = ref s
-       fun feed n =
-         let val ss = !testString
-         in if String.size ss < n
-            then ( testString := ""; ss )
-            else ( testString := String.extract(ss,n,NONE);
-                   String.extract(ss,0,SOME n) )
-         end
-   in Mlex.makeLexer feed end;
13Testing a lexer interactively
- - val f = makeStringLexer "the boy sang";
- val f = fn : unit -> Mlex.Internal.result
- - f ();
- the is an article
- val it = () : Mlex.Internal.result
- - f ();
- boy is a noun
- val it = () : Mlex.Internal.result
- - f ();
- sang Might be a noun?
- val it = () : Mlex.Internal.result
- - f ();
- unexpected end of file
- uncaught exception EOF
-   raised at: english.lex.sml:9.53-9.56
14Exercise
- Can we make a string lexer that catches exceptions?
- Try it in class
15Running from a file
- To lex a file, rather than an explicit string, we need to define a function that reads and returns n characters of a file at a time.
- fun inputc h =
-   (fn n =>
-      (TextIO.inputN(h,n)):string);
-
- val lex = let val h = TextIO.openIn "test.english"
-           in Mlex.makeLexer (inputc h) end;
Reads n characters from a file pointed to by
handle
Opens a file and returns a handle
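The reader function can be exercised on its own, without a lexer. This is a sketch under the assumption that writing a temporary file is acceptable; the temporary path and the names read, first, second and third are introduced here only for illustration.

```sml
(* inputc is the reader from the slide. *)
fun inputc h = (fn n => (TextIO.inputN (h, n)) : string)

(* Create a scratch file so the sketch is self-contained (an assumption;
   in the lecture the file test.english already exists). *)
val path = OS.FileSys.tmpName ()
val () = let val out = TextIO.openOut path
         in TextIO.output (out, "the 99 chased the dog");
            TextIO.closeOut out
         end

val h      = TextIO.openIn path
val read   = inputc h
val first  = read 6       (* "the 99" *)
val second = read 100     (* fewer than 100 characters remain: the rest *)
val third  = read 100     (* end of file: the empty string *)
val ()     = TextIO.closeIn h
val ()     = OS.FileSys.remove path
```

TextIO.inputN returns fewer characters than requested only at end of file, so the reader yields the empty string once the file is exhausted.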
16Running lex
- val lex =
-   Mlex.makeLexer
-     (inputc (TextIO.openIn "test.english"));
- - lex();
- the is an article
- val it = () : unit
- - lex();
- 9val it = () : unit
- - lex();
- 9val it = () : unit
- - lex();
- chased is a verb
- val it = () : unit
The file test.english
the 99 chased the dog
17Named REs in SML-Lex
english2.lex
- type lexresult = Token
- type pos = int
- type svalue = int
- fun eof () = EOF
-
- %%
- digit=[0-9];
- number={digit}+;
- %%
- [\t\ ]+ => ( lex() );
- Anne|Bob|Spot|{number} => ( ProperNoun yytext );
- a|the => ( Article yytext );
- boy|girl|dog => ( Noun yytext );
- walked|chased|ran|bit => ( Verb yytext );
Not sure why we need pos and svalue
We can name a RE: Name = RE ;
Use a name by surrounding it in {}s
Note that all unrecognized input will raise the
default exception.
18A driver program
- datatype Token
-   = ProperNoun of string
-   | Noun of string
-   | Verb of string
-   | Article of string
-   | EOF;
- use "english2.lex.sml";
- fun makeStringLexer s =
-   let val testString = ref s
-       fun feed n =
-         let val ss = !testString
-         in if String.size ss < n
-            then ( testString := ""; ss )
-            else ( testString := String.extract(ss,n,NONE);
-                   String.extract(ss,0,SOME n) )
-         end
-       val f = Mlex.makeLexer feed
-   in f end;
Must come after the definition of token or there
will be errors.
19Test it out
- - val f = makeStringLexer "Anne chased the dog";
- val f = fn : unit -> Token
- - f();
- val it = ProperNoun "Anne" : Token
- - f();
- val it = Verb "chased" : Token
- - f();
- val it = Article "the" : Token
- - f();
- val it = Noun "dog" : Token
- - f();
- val it = EOF : Token
20More SML
- In SML we use library functions all the time.
- Int.toString
- List.exists
- The list library functions are particularly useful.
- These library functions often take a function as an argument
- List.map : ('a -> 'b) -> 'a list -> 'b list
- List.find : ('a -> bool) -> 'a list -> 'a option
- List.filter : ('a -> bool) -> 'a list -> 'a list
- List.exists : ('a -> bool) -> 'a list -> bool
- List.all : ('a -> bool) -> 'a list -> bool
- List.foldr : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
- It is worth studying these functions closely
21List.map captures a pattern
- Add one to every element of a list
- fun addone [] = []
-   | addone (x::xs) = (x + 1) :: addone xs
- addone [2,3,4]
- val it = [3,4,5] : int list
- Turn a list of ints into a list of strings
- fun stringy [] = []
-   | stringy (x::xs) = (Int.toString x) :: stringy xs
- stringy [2,5,9]
- val it = ["2","5","9"] : string list
- Negate every element of a list
- fun negL [] = []
-   | negL (x::xs) = (not x) :: negL xs
- negL [true, 3 > 4]
- val it = [false,true] : bool list
22Pattern
- fun addone [] = []
-   | addone (x::xs) = (x + 1) :: addone xs
- fun stringy [] = []
-   | stringy (x::xs) = (Int.toString x) :: stringy xs
- fun negL [] = []
-   | negL (x::xs) = (not x) :: negL xs
- fun map f [] = []
-   | map f (x::xs) = (f x) :: (map f xs)
- val ex1 = map (fn x => (x+1)) [2,3,4]
- val ex1 = [3,4,5] : int list
- val ex2 = map Int.toString [2,5,7]
- val ex2 = ["2","5","7"] : string list
- val ex3 = map not [true, 3 > 4]
- val ex3 = [false,true] : bool list
23Anonymous functions
- Study: (fn x => (x+1))
- It is an anonymous function. A function without a name.
- It has one parameter x
- It adds one to its parameter, and returns the result.
- (fn x => (x+1)) 4;
- val it = 5 : int
- Any non-recursive function can be written anonymously.
- (fn x => x = 5)
- Tests if its parameter is equal to 5
- map (fn x => x=5) [1,4,5,3,5];
- val it = [false,false,true,false,true] : bool list
- (fn x => fn y => (x,y))
- Has two parameters
- Returns a pair
- (fn (x,y) => (not y, x+3))
- What is the type of this function?
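One way to explore questions about types is to write the anonymous functions with explicit type annotations and let the compiler check them. The sketch below does that; the names addOne, isFive, pairUp and f are made up for illustration, the curried pairing function is pinned to int and string although it is really polymorphic, and the last body is assumed to read (not y, x + 3).

```sml
(* One parameter, returns its successor. *)
val addOne : int -> int  = fn x => x + 1

(* One parameter, tests it against 5. *)
val isFive : int -> bool = fn x => x = 5

(* Curried: takes x, then y, and returns the pair. *)
val pairUp : int -> string -> int * string = fn x => fn y => (x, y)

(* A single parameter that is itself a pair. not forces y to be a bool,
   and + 3 forces x to be an int. *)
val f : int * bool -> bool * int = fn (x, y) => (not y, x + 3)

val r1 = addOne 4                        (* 5 *)
val r2 = List.map isFive [1,4,5,3,5]     (* [false,false,true,false,true] *)
val r3 = pairUp 1 "a"                    (* (1,"a") *)
val r4 = f (2, true)                     (* (false,5) *)
```

The difference between pairUp and f is the difference between a curried function, which can be partially applied, and a function that takes one tuple argument.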
24List.find
- Used for searching a list.
- List.find : ('a -> bool) -> 'a list -> 'a option
- Uses a function as a parameter to determine if the search is successful.
- E.g. Is there an even element in a list?
- List.find even [1,3,5]
- val it = NONE : int option
- List.find even [1,3,4]
- val it = SOME 4 : int option
25List.find and anonymous functions
- List.find (fn x => x = "Tim")
-           ["Tom", "Jane"]
- val it = NONE : string option
- List.find (fn x => even x andalso x>10)
-           [2,4,5,12]
- val it = SOME 12 : int option
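The examples use a predicate named even, which is not a Basis Library function. A minimal sketch, assuming the obvious definition, makes the List.find examples loadable as written:

```sml
(* even is an assumed helper; the slides use it without showing it. *)
fun even n = n mod 2 = 0

val f1 = List.find even [1,3,5]                                (* NONE *)
val f2 = List.find even [1,3,4]                                (* SOME 4 *)
val f3 = List.find (fn x => x = "Tim") ["Tom", "Jane"]         (* NONE *)
val f4 = List.find (fn x => even x andalso x > 10) [2,4,5,12]  (* SOME 12 *)

(* find stops at the first element that satisfies the predicate. *)
val f5 = List.find even [2,4,6]                                (* SOME 2 *)
```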
26List.filter
- Filter keeps some elements, and throws away others.
- List.filter : ('a -> bool) -> 'a list -> 'a list
- It uses a function (p) as a parameter to decide which elements to keep (p x = true), and which to throw away (p x = false)
- val ex6 = List.filter even [1,2,3,4,5,6]
- val ex6 = [2,4,6] : int list
27List.filter and anonymous functions
- val people = [("tim",22),("john",18),("jane",25),("tim",8)]
- val ex7 = filter
-             (fn (nm,age) => nm <> "tim" orelse age>10)
-             people
- val ex7 =
-   [("tim",22),("john",18),("jane",25)]
-   : (string * int) list
28List.exists
- exists is like find in that it searches a list
- but rather than returning the element that completes the search, it is only interested in whether such an element exists.
- List.exists : ('a -> bool) -> 'a list -> bool
- Uses a function as a parameter to determine if the search is successful.
- val ex8 = List.exists even [2,3,5]
- val ex8 = true : bool
- Note that even if only 1 element in the list causes the function to be true, exists returns true.
29List.all
- List.all tests elements in a list for a property. It returns true only if every element has that property.
- List.all : ('a -> bool) -> 'a list -> bool
- Uses a function as a parameter to perform the test.
- val ex9 = List.all even [2,4,5]
- val ex9 = false : bool
- List.exists and List.all are related functions. They are duals.
- not(List.all p xs) = List.exists (fn x => not(p x)) xs
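The duality can be spot-checked on a few lists. This is only a sketch: the names even, dual and samples are assumptions introduced for the check, and testing a handful of cases illustrates the law rather than proving it.

```sml
fun even n = n mod 2 = 0     (* assumed helper, as in the slides' examples *)

(* dual p xs is true when both sides of the law agree on xs. *)
fun dual p xs =
  not (List.all p xs) = List.exists (fn x => not (p x)) xs

val samples = [[], [2,4,6], [2,4,5], [1,3,5]]

(* The law should hold for every sample, including the empty list,
   where all is true and exists is false. *)
val holds = List.all (dual even) samples      (* true *)
```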
30List.foldr captures a pattern
- Add up every element in a list.
- fun sum [] = 0
-   | sum (x::xs) = x + (sum xs)
- Compute the maximum element in a list of natural numbers (integers >= 0).
- fun maximum [] = 0
-   | maximum (x::xs) = Int.max(x,maximum xs)
- Compute if every element in a list of booleans is true.
- fun allTrue [] = true
-   | allTrue (x::xs) = x andalso (allTrue xs)
31Pattern
- fun sum [] = 0
-   | sum (x::xs) = x + (sum xs)
- fun maximum [] = 0
-   | maximum (x::xs) = Int.max(x,maximum xs)
- fun allTrue [] = true
-   | allTrue (x::xs) = x andalso (allTrue xs)
- fun foldr acc base [] = base
-   | foldr acc base (x::xs)
-       = acc(x,foldr acc base xs)
32See the pattern in use.
- fun sum [] = 0
-   | sum (x::xs) = x + (sum xs)
- fun sum xs = foldr (op +) 0 xs
- fun maximum [] = 0
-   | maximum (x::xs) = Int.max(x,maximum xs)
- fun maximum xs = foldr Int.max 0 xs
- fun allTrue [] = true
-   | allTrue (x::xs) = x andalso (allTrue xs)
- fun allTrue xs =
-   foldr (fn (a,b) => a andalso b) true xs
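The same pattern covers functions that do not look like simple accumulations at first. The sketch below repeats the three folds using the library's List.foldr and adds two more; count and mapVia are names made up for illustration.

```sml
fun sum xs     = List.foldr (op +) 0 xs
fun maximum xs = List.foldr Int.max 0 xs
fun allTrue xs = List.foldr (fn (a, b) => a andalso b) true xs

(* Ignore each element and add one to the result for the tail. *)
fun count xs = List.foldr (fn (_, n) => n + 1) 0 xs

(* map itself is a fold: cons (f x) onto the result for the tail. *)
fun mapVia f xs = List.foldr (fn (x, acc) => f x :: acc) [] xs

val s = sum [1,2,3,4]                  (* 10 *)
val m = maximum [3,9,2]                (* 9 *)
val a = allTrue [true, 3 > 4]          (* false *)
val c = count ["a","b","c"]            (* 3 *)
val t = mapVia Int.toString [2,5,9]    (* ["2","5","9"] *)
```

In each case the base value is the answer for the empty list, and the function argument says how to combine one element with the answer for the rest.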
33Take another look
- What does this function do?
- fun ok [] = false
-   | ok xs = not(exists (fn ys => xs=ys) (!old))
-             andalso
-             not(exists (fn ys => xs=ys) (!worklist))
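One way to investigate the question is to run the function on sample data. The slide does not show how old and worklist are declared, so the sketch below assumes they are references to lists of int lists and fills them with made-up contents.

```sml
(* Assumed declarations; only the names old and worklist are from the slide. *)
val old      : int list list ref = ref [[1,2],[3]]
val worklist : int list list ref = ref [[4,5]]

fun ok [] = false
  | ok xs = not (List.exists (fn ys => xs = ys) (!old))
            andalso
            not (List.exists (fn ys => xs = ys) (!worklist))

val a = ok []        (* false: the empty list is never accepted *)
val b = ok [1,2]     (* false: already in old *)
val c = ok [4,5]     (* false: already in worklist *)
val d = ok [7]       (* true: non-empty and in neither list *)
```

Under these assumptions, ok accepts a list only when it is non-empty and has not been seen in either collection.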
34The Option Library
- - open Option;
- opening Option
- datatype 'a option = NONE | SOME of 'a
- exception Option
- val getOpt : 'a option * 'a -> 'a
- val isSome : 'a option -> bool
- val valOf : 'a option -> 'a
- val filter : ('a -> bool) -> 'a -> 'a option
- val join : 'a option option -> 'a option
- val app : ('a -> unit) -> 'a option -> unit
- val map : ('a -> 'b) -> 'a option -> 'b option
- val mapPartial : ('a -> 'b option) ->
-                  'a option ->
-                  'b option
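A few calls show how the functions in the Option signature behave. This is a short sketch using only Basis Library functions; the names a through i are arbitrary.

```sml
val a = Option.getOpt (SOME 3, 0)                        (* 3 *)
val b = Option.getOpt (NONE, 0)                          (* 0, the default *)
val c = Option.isSome (SOME "x")                         (* true *)
val d = Option.map (fn x => x + 1) (SOME 4)              (* SOME 5 *)
val e = Option.mapPartial Int.fromString (SOME "12")     (* SOME 12 *)
val f = Option.mapPartial Int.fromString (SOME "abc")    (* NONE *)
val g = Option.join (SOME (SOME 1))                      (* SOME 1 *)
val h = Option.filter (fn x => x > 0) 5                  (* SOME 5 *)

(* valOf raises the exception Option when given NONE. *)
val i = (Option.valOf NONE) handle Option.Option => ~1   (* ~1 *)
```

getOpt is usually the safer choice than valOf, because it forces the caller to say what should happen in the NONE case.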
35Interesting functions that use Options
- Int.fromString : string -> int option
- Int.fromString "234"
- val it = SOME 234 : int option
- Int.fromString "abc"
- val it = NONE : int option
- String.extract : string * int * int option -> string
- String.extract("abcde",1,SOME 3)
- val it = "bcd" : string
- String.extract("abcde",1,NONE)
- val it = "bcde" : string
36More option functions
- List.find : ('a -> bool) -> 'a list -> 'a option
- List.find even [1,3,5]
- val it = NONE : int option
- List.find (fn x => x="tim") ["tom","tim","jane"]
- val it = SOME "tim" : string option
- List.getItem : 'a list -> ('a * 'a list) option
- List.getItem [1,2,3,4]
- val it = SOME (1,[2,3,4])
- List.getItem []
- val it = NONE
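Option-returning functions combine naturally with case. The following sketch uses a hypothetical function sumStrings, which walks a list with List.getItem and adds up the entries that Int.fromString can parse, treating the others as 0.

```sml
(* sumStrings is a made-up example, not a library function. *)
fun sumStrings strs =
  case List.getItem strs of
      NONE => 0                                   (* empty list *)
    | SOME (s, rest) =>
        Option.getOpt (Int.fromString s, 0)       (* 0 if s is not a number *)
        + sumStrings rest

val total = sumStrings ["234", "abc", "6"]        (* 240 *)
val none  = sumStrings []                         (* 0 *)
```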
37Using While Loops
- fun ident c cs =
-   let val xs = ref cs
-       val x = ref c
-       val ans = ref []
-   in while (not(null(!xs))
-             andalso
-             Char.isAlpha (hd (!xs))) do
-        ( ans := !ans @ [!x];
-          x := hd(!xs);
-          xs := tl(!xs) );
-      (Id (String.implode (!ans @ [!x])), !xs)
-   end
Don't forget to test for the empty list
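For comparison, the loop can be set beside a recursive function that computes the same result. In this sketch the datatype Token with its single constructor Id is an assumption (the slide does not show it), and identRec is a name introduced here.

```sml
datatype Token = Id of string     (* assumed; not shown on the slide *)

(* Loop version, following the slide. *)
fun ident c cs =
  let val xs  = ref cs
      val x   = ref c
      val ans = ref []
  in while (not (null (!xs)) andalso Char.isAlpha (hd (!xs))) do
       ( ans := !ans @ [!x];
         x   := hd (!xs);
         xs  := tl (!xs) );
     (Id (String.implode (!ans @ [!x])), !xs)
  end

(* Recursive version: collect letters in reverse, then flip them. *)
fun identRec c cs =
  let fun go acc (y :: ys) =
            if Char.isAlpha y then go (y :: acc) ys
            else (rev acc, y :: ys)
        | go acc [] = (rev acc, [])
      val (chars, rest) = go [c] cs
  in (Id (String.implode chars), rest) end

val (Id name, rest) = ident #"a" (String.explode "bc+d")
(* name is "abc" and rest is [#"+", #"d"] *)
```

The null test in the loop condition plays the same role as the [] clause of go: without it, hd would raise an exception when the input runs out.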
38The Compile-manager
- The compile manager is a make-like facility for SML.
- It has some documentation found here.
- http://www.smlnj.org/doc/CM/index.html
- Peter Lee's notes also contain a brief (but out of date) introduction
- http://www.cs.cmu.edu/~petel/smlguide/smlnj.htm
- The basic approach is that a single file contains a list of all the pieces that comprise a project.
- A complete scan of all the pieces determines a dependency graph of which pieces depend on which other pieces.
- It determines the time-stamp of each piece
- It recompiles all the pieces that are out of date
- It links everything together
39Structures and Libraries
- The compile manager compiles whole compilation units called libraries or structures.
- Break the program into files where each file contains 1 library.
- If a file needs stuff from another library, open that library inside of the file.
- Create a compile manager source file that lists all the libraries.
40Libraries are bracketed by "structure name = struct" and "end"
Name of the library, can be different from the name of the file.
File TypeDecls.sml
structure TypeDecls =
struct
  datatype Token
    = ProperNoun of string
    | Noun of string
    | Verb of string
    | Article of string
    | EOF
end
41Opens libraries with needed components
File Driver.sml
structure Driver =
struct
  open TypeDecls
  fun makeStringLexer s =
    let val testString = ref s
        fun feed n =
          let val ss = !testString
          in if String.size ss < n
             then ( testString := ""; ss )
             else ( testString := String.extract(ss,n,NONE);
                    String.extract(ss,0,SOME n) )
          end
        val f = Mlex.makeLexer feed
        fun lex() = (f ()) handle Mlex.LexError =>
                      (print "Lex error\n"; EOF)
    in lex:(unit -> Token) end
end
42Opens libraries with needed components
File english2.lex
open TypeDecls
type lexresult = Token
type pos = int
type svalue = int
fun eof () = EOF
%%
digit=[0-9];
number={digit}+;
%%
[\t\ ]+                => ( lex() (* ignore whitespace *) );
Anne|Bob|Spot|{number} => ( ProperNoun yytext );
a|the                  => ( Article yytext );
boy|girl|dog           => ( Noun yytext );
walked|chased|ran|bit  => ( Verb yytext );
43The sources file
The name of the file where this list resides
File sources.cm
group (sources.cm) is
  typeDecls.sml
  driver.sml
  english2.lex
  $/basis.cm
  $/smlnj-lib.cm
The extension .lex tells the manager to run ml-lex!
All the user defined pieces
System libraries
44Putting it all together.
- - CM.make "sources.cm";
- [scanning sources.cm]
- [D:\programs\SML110.57\bin\ml-lex english2.lex]
- Number of states = 39
- Number of distinct rows = 33
- Approx. memory size of trans. table = 4257 bytes
- [parsing (sources.cm):english2.lex.sml]
- [compiling (sources.cm):english2.lex.sml]
- [code: 13069, data: 5565, env: 1235 bytes]
- [New bindings added.]
- val it = true : bool
- - open Driver;
- opening Driver
- val makeStringLexer : string -> unit -> Token
- datatype Token
-   = Article of string
-   | EOF
-   | Noun of string
Note how it calls ml-lex to create
english2.lex.sml
It compiles the .sml files
I open the Library Driver to get at the function
makeStringLexer
45Assignments
- Programming exercise 6 is now posted on the website. You should download it.
- It requires you to
- Create a simple lexical analyzer using SML-lex
- Write short functions using library functions and anonymous functions to answer questions about:
- datatype Gender = Male | Female
- datatype Color = Red | Blue | Green | Yellow | Orange
- val people =
-   [("tim",25,Male,Red) ,("mary",30,Female,Orange)
-   ,("john",14,Male,Yellow) ,("bob",55,Male,Green)
-   ,("jane",19,Female,Blue) ,("alexi",25,Male,Green)
-   ,("joan",31,Female,Blue) ,("jill",16,Female,Green)]