Lexical Analysis and Introducing Haskell - PowerPoint PPT Presentation

Provided by: cbo4
Transcript and Presenter's Notes

1
Lexical Analysis and Introducing Haskell
Please read 3.1-3.3
  • Lexical analysis
  • Regular Languages and regular expressions
  • Lex
  • The Haskell programming language

2
Recall the Structure of a Compiler
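character stream
  -> Lexical Analysis -> token stream
  -> Parsing -> syntax tree
  -> Semantic Analysis -> syntax tree
  -> Intermediate Code Generation -> intermediate code
  -> Optimization -> intermediate code
  -> Code Generation -> target machine code
(Diagram labels: Front End beside the upper phases, Back End beside the
lower phases, and a Symbol Table alongside the pipeline.)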
3
Structure of a Compiler
Same pipeline as on the previous slide, with the first phase highlighted:
character stream -> Lexical Analysis -> token stream
Today we're going to look at the implementation of this part!
4
Lexical Analysis
speed speed 10 time
Lexical Analysis

The tool lex creates lexical analyzers. Lexical
analysis is sometimes referred to as lexing or
scanning.
5
Identifying Lexemes
The set of all possible lexemes for a given token
is described by a pattern. Patterns are typically
described using regular expressions.
Example: An identifier in C is a string of letters,
digits, or underscores beginning with a letter or
an underscore.
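The identifier pattern above can be checked directly in code. This is only a minimal sketch, assuming ASCII letters; the names isLetterC and isIdentifier are made up for this illustration and do not come from the slides or from lex.

```haskell
module Main where

import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- A letter in the sense of the slide: A-Z or a-z (ASCII only, an assumption).
isLetterC :: Char -> Bool
isLetterC c = isAsciiUpper c || isAsciiLower c

-- True when the whole string is a C-style identifier:
-- a letter or underscore followed by letters, digits, or underscores.
isIdentifier :: String -> Bool
isIdentifier []       = False
isIdentifier (c : cs) = startOk c && all restOk cs
  where
    startOk x = isLetterC x || x == '_'
    restOk x  = isLetterC x || isDigit x || x == '_'

main :: IO ()
main = mapM_ (print . isIdentifier) ["speed", "_tmp1", "9lives", ""]
```

Here "speed" and "_tmp1" are accepted, while "9lives" and the empty string are rejected.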
6
What we need
1) We need to be able to describe our language's
lexemes unambiguously.
2) We need to create a correct and efficient
scanner that will recognize the lexemes.
We have elegant theory for dealing with (1) and
advanced tools that completely automate (2)!
7
Regular Languages
A formal language over an alphabet Σ is a set of
strings made up of characters drawn from Σ.
The regular languages are the languages that can be
accepted by a deterministic finite state machine.
They are the simplest class in the hierarchy of
formal languages.
Regular languages work well for expressing the
lexemes in a programming language.
In our case, Σ is all characters (including
newline).
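To make "accepted by a deterministic finite state machine" concrete, here is a small sketch of such a machine for one simple regular language: non-empty strings of digits. The state names and the functions step and accepts are assumptions chosen for this example.

```haskell
module Main where

import Data.Char (isDigit)

-- States of a small DFA that accepts non-empty strings of digits.
data State = Start | InNumber | Dead
  deriving (Eq, Show)

-- Transition function: a state and one input character give the next state.
step :: State -> Char -> State
step Start    c | isDigit c = InNumber
step InNumber c | isDigit c = InNumber
step _        _             = Dead

-- Run the machine over the whole input; InNumber is the only accepting state.
accepts :: String -> Bool
accepts input = foldl step Start input == InNumber

main :: IO ()
main = mapM_ (print . accepts) ["2024", "7", "", "12a"]
```

The machine reads each character exactly once and needs no memory beyond its current state, which is why scanners built this way are fast.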
8
Regular Expressions
This is how we will describe our lexemes/tokens.
We'll have a formal notation and a notation
that lex will accept.
Example: C identifiers.
  letter = [A-Za-z]
  digit = [0-9]
  identifier = (letter | _) (letter | digit | _)*
9
Regular Expressions Overview
10
Examples
zero = "0"
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
(we'll abbreviate these ranges as just [0-9] and
often omit the quotation marks)
letter = [A-Za-z]
TT
11
Examples
digits: 0-9
letters: A-Za-z
identifier = (letter | _) (letter | digit | _)*
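The formal notation can itself be turned into a small program. The sketch below represents regular expressions as a data type and matches strings using Brzozowski derivatives. This technique was picked for brevity; it is not how lex works (lex generates a table-driven finite automaton). All names here (Regex, nullable, deriv, matches) are assumptions made for this example.

```haskell
module Main where

import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- Abstract syntax for regular expressions.
data Regex
  = Empty                 -- matches nothing
  | Epsilon               -- matches only the empty string
  | Class (Char -> Bool)  -- matches one character satisfying the predicate
  | Alt Regex Regex       -- r | s
  | Seq Regex Regex       -- r followed by s
  | Star Regex            -- zero or more repetitions of r

-- Does the expression match the empty string?
nullable :: Regex -> Bool
nullable Empty     = False
nullable Epsilon   = True
nullable (Class _) = False
nullable (Alt r s) = nullable r || nullable s
nullable (Seq r s) = nullable r && nullable s
nullable (Star _)  = True

-- Derivative: what remains to be matched after consuming character c.
deriv :: Char -> Regex -> Regex
deriv _ Empty     = Empty
deriv _ Epsilon   = Empty
deriv c (Class p) = if p c then Epsilon else Empty
deriv c (Alt r s) = Alt (deriv c r) (deriv c s)
deriv c (Seq r s)
  | nullable r = Alt (Seq (deriv c r) s) (deriv c s)
  | otherwise  = Seq (deriv c r) s
deriv c (Star r)  = Seq (deriv c r) (Star r)

-- A string matches when the expression left after all characters is nullable.
matches :: Regex -> String -> Bool
matches r = nullable . foldl (flip deriv) r

letter, digit, underscore, identifier :: Regex
letter     = Class (\c -> isAsciiUpper c || isAsciiLower c)
digit      = Class isDigit
underscore = Class (== '_')
identifier = Seq (Alt letter underscore)
                 (Star (Alt letter (Alt digit underscore)))

main :: IO ()
main = mapM_ (print . matches identifier) ["x", "_count2", "2fast", "a-b"]
```

The definition of identifier mirrors the slide's expression term by term: Alt is "|", Seq is juxtaposition, and Star is "*".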
12
Some common extensions
13
Examples
How about 7-digit phone numbers?
A phone number with an optional area code?
More examples in the ToeTipper.
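One possible answer, sketched in code. The slide does not fix a written format, so this assumes local numbers look like ddd-dddd and an optional area code looks like "(ddd) " in front; in extended regular expression notation that would be roughly (\([0-9]{3}\) )?[0-9]{3}-[0-9]{4}. The function names are hypothetical.

```haskell
module Main where

import Data.Char (isDigit)

-- Seven-digit number written as ddd-dddd (this format is an assumption).
isLocalNumber :: String -> Bool
isLocalNumber [a, b, c, '-', d, e, f, g] = all isDigit [a, b, c, d, e, f, g]
isLocalNumber _                          = False

-- Optional area code written as "(ddd) " in front of the local number.
isPhoneNumber :: String -> Bool
isPhoneNumber ('(' : x : y : z : ')' : ' ' : rest) =
  all isDigit [x, y, z] && isLocalNumber rest
isPhoneNumber s = isLocalNumber s

main :: IO ()
main = mapM_ (print . isPhoneNumber)
  ["555-1234", "(406) 555-1234", "5551234", "(40) 555-1234"]
```

The first two inputs are accepted and the last two are rejected under the assumed format.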
TT
14
Regular Expressions for Languages
We'll have one regular expression for each
lexeme/token. Each token will be identified by
an integer defined using #defines:
  #define If 10
  #define Else 11
  #define Integer 12
  #define LeftParen 13
  Etc.
For an Integer token, the value will be the integer's value.
lex will run a short bit of code for each matched
regular expression. You'll create the token in
that code.
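For comparison, here is a sketch of the same idea in Haskell: a token kind, its integer code, and an optional value. The codes 10 to 13 are the ones on the slide; the type and field names (TokenKind, Token, kind, value) are assumptions for this illustration, and only the token kinds spelled out on the slide are included.

```haskell
module Main where

-- Token kinds named on the slide (the slide's list continues with "Etc.").
data TokenKind = TIf | TElse | TInteger | TLeftParen
  deriving (Eq, Show)

-- Integer codes taken from the slide's #define lines.
tokenCode :: TokenKind -> Int
tokenCode TIf        = 10
tokenCode TElse      = 11
tokenCode TInteger   = 12
tokenCode TLeftParen = 13

-- A token pairs a kind with an optional value; only Integer tokens carry one here.
data Token = Token { kind :: TokenKind, value :: Maybe Int }
  deriving (Eq, Show)

main :: IO ()
main = do
  let t = Token TInteger (Just 42)
  print (tokenCode (kind t), value t)
```

In the lex version, the short bit of code attached to the Integer rule would play the role of constructing this token.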
15
Regular Expressions for Languages
Keywords are easy:
  If = "if"
  Else = "else"
  Do = "do"
Each keyword is a single regular expression
consisting of only the keyword.
Note: Ambiguities can occur. What about this
identifier: doWhatYouWant?
Sometimes there is only a single Keyword token, with
the value being the actual keyword.
lex resolves ambiguities by always choosing the
rule that matches the longest piece of input. If
several rules match the same length, the rule
listed first wins.
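The doWhatYouWant case can be sketched in a few lines. This hand-written version takes the longest identifier-shaped prefix first and only then checks whether the whole lexeme is a keyword, which gives the same result as longest match in lex for this example. It is not the code lex generates; scanWord and the keyword list are assumptions for illustration.

```haskell
module Main where

import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

data Token = Keyword String | Identifier String
  deriving (Eq, Show)

-- Keywords from the slide.
keywords :: [String]
keywords = ["if", "else", "do"]

isIdentStart, isIdentChar :: Char -> Bool
isIdentStart c = isAsciiUpper c || isAsciiLower c || c == '_'
isIdentChar c  = isIdentStart c || isDigit c

-- Take the longest identifier-shaped prefix, then decide whether it is a keyword.
-- Returns the token and the unconsumed input, or Nothing if no prefix matches.
scanWord :: String -> Maybe (Token, String)
scanWord (c : cs)
  | isIdentStart c =
      let (more, rest) = span isIdentChar cs
          lexeme       = c : more
          token | lexeme `elem` keywords = Keyword lexeme
                | otherwise              = Identifier lexeme
      in Just (token, rest)
scanWord _ = Nothing

main :: IO ()
main = mapM_ (print . scanWord) ["do", "doWhatYouWant", "do x", "42"]
```

"do" alone becomes a keyword, while "doWhatYouWant" is consumed whole and becomes an identifier, because the scanner never stops at the shorter match.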
16
Whitespace and Error Handling
We often ignore whitespace in the input stream.
We'll often create a Whitespace token that is
just ignored when it is found:
  Whitespace = [ \t\n]+
For an error, create an Error token. If nothing
else is matched, return the Error token.
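Both rules fit into a short hand-written scanner loop. This is a sketch under assumptions: it knows only Integer and LeftParen tokens, and the names tokenize, IntegerTok, and ErrorTok are invented for the example.

```haskell
module Main where

import Data.Char (isDigit)

data Token = IntegerTok Int | LeftParen | ErrorTok Char
  deriving (Eq, Show)

-- Whitespace as on the slide: space, tab, newline.
isWhitespace :: Char -> Bool
isWhitespace c = c `elem` " \t\n"

tokenize :: String -> [Token]
tokenize [] = []
tokenize input@(c : cs)
  | isWhitespace c = tokenize cs                       -- matched, then ignored
  | isDigit c      = let (digits, rest) = span isDigit input
                     in IntegerTok (read digits) : tokenize rest
  | c == '('       = LeftParen : tokenize cs
  | otherwise      = ErrorTok c : tokenize cs          -- nothing else matched

main :: IO ()
main = print (tokenize "( 12\t7 ?\n")
```

The error case comes last on purpose: it fires only when no other rule matches, and the scanner keeps going after reporting it.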
17
Lex
Lex is a tool for creating lexical analyzers. A
widely used free implementation is called flex.
Project 1 will introduce the use of lex/flex and
we'll start building our compiler.
18
Haskell
We're going to be building a compiler for a
subset of Haskell. See http://haskell.org/ and
http://en.wikipedia.org/wiki/Haskell_(programming_language)
Free compilers and interpreters are available.
I'm using WinHugs under Windows and have asked
that it be installed in the labs.
Haskell is a standardized, purely functional
programming language. It uses lazy evaluation:
an expression is evaluated only when its value
is needed.
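A small sketch of what lazy evaluation allows. The list squares is infinite, yet the program terminates because only the demanded elements are computed. The name squares is made up for this example.

```haskell
module Main where

-- An infinite list of squares; nothing is computed until it is demanded.
squares :: [Integer]
squares = map (^ 2) [1 ..]

main :: IO ()
main = do
  -- Only the first five elements are ever evaluated.
  print (take 5 squares)
  -- The second component of the pair is never forced, so undefined is harmless.
  print (fst (1 :: Int, undefined :: Int))
```

In a strictly evaluated language, both lines would either loop forever or fail before printing anything.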
19
Haskell Examples
We'll always start with this:
  module Main where
  fac n = if n == 0 then 1 else n * fac (n - 1)
  main = print (fac 5)
Lots more examples in class.
TT