CS3012 Formal Languages and Compilers - PowerPoint PPT Presentation

Provided by: kenb8
1
CS3012 Formal Languages and Compilers
Original notes by Dr. K. N. Brown. Edited for 2007: Dr J R Lishman
Lectures: Monday 11:00, MT6 or MT4; Thursday 11:00, MT4
Tutorials: Thursday 1pm or 2pm, room 205; Friday 1pm or 2pm, room 205
2
Formal Languages
Languages: English, Spanish, ... PASCAL, C, ...
Problem: How do we define a language? i.e. what sentences are valid in a language?
e.g. Large red cars go quickly.
Colourless green ideas sleep furiously.
Cars red large go quickly.
Coches rojos grandes marchan rápidamente.
Formal languages: Languages with a well-defined membership, based solely on the form of the sentences.
3
Why Study Formal Languages?
part of an underlying science/mathematics of computing
specification of formal languages provides an abstract model of computation that we can reason about
implementable
applications in: compilers, analysis of algorithms, complexity theory, artificial intelligence, pattern recognition, natural language design, ...
4
Example Program
#include <stdio.h>

main()
{
	printf("hello world\n");
}
35 105 110 99 108 117 100 101 32 60 115 116
100 105 111 46 104 62 13 13 109 97 105 110
40 41 32 13 123 13 9 112 114 105 110 116
102 40 34 104 101 108 108 111 32 119 111 114
108 100 92 110 34 41 59 32 13 125 13 eof
#include <stdio.h>   main   (   )   {   printf   (   "hello world\n"   )   ;   }   eof
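The three views on this slide (program text, character codes, tokens) can be reproduced with a short Python sketch. This is not part of the original notes: the names `SOURCE`, `char_codes`, `tokens` and the token pattern are illustrative assumptions, and the pattern covers only what this one program needs, not C in general.

```python
import re

# The program text, with carriage returns (code 13) as line ends and a tab
# (code 9) before printf, matching the character codes listed on the slide.
SOURCE = '#include <stdio.h>\r\rmain() \r{\r\tprintf("hello world\\n"); \r}\r'


def char_codes(text):
    """Return the character code of every character in the text."""
    return [ord(ch) for ch in text]


# A deliberately tiny token pattern (an assumption for illustration only).
TOKEN_PATTERN = re.compile(r'''
    \#include\s*<[^>]+>     # an include directive such as #include <stdio.h>
  | "(?:\\.|[^"\\])*"       # a string literal, allowing backslash escapes
  | [A-Za-z_]\w*            # an identifier such as main or printf
  | [(){};]                 # single-character punctuation
''', re.VERBOSE)


def tokens(text):
    """Split the text into tokens, skipping whitespace, and append eof."""
    return TOKEN_PATTERN.findall(text) + ["eof"]


print(char_codes(SOURCE))
print(tokens(SOURCE))
```

Whitespace characters (codes 32, 13 and 9) appear in the character view but not in the token view, which is the main difference between the two.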
5
Compilers
[Figure: source -> compiler -> target, with error messages]
Compiler: A "black box" that takes a program in one language and translates it into an equivalent program in another language.
Applied in: programming languages, machine architecture, language theory, software engineering, user interfaces, ...
6
Inside the Black Box
[Figure: source -> compiler -> target, with error messages]
7
Inside the compiler
[Figure: source -> compiler -> target, with error messages]
what are legal tokens in this language?
what are valid programs in this language?
what tokens are in this program?
how do they fit together to make this program?
what does this program mean?
generate equivalent code for a target compiler
9
Course Content
basic formal language theory
  alphabets, strings, languages
  finite state automata
  regular expressions and languages
  finite state automata and regular languages
  finite state automata with output
lexical analysis
  using Lex
grammar theory
  languages and grammars
  derivations and ambiguity
  parsing
compilation
  using Yacc
  error handling
  syntax directed translation
  symbol table
  type checking
  run-time environments
  intermediate code generation
10
Structure
basic formal language theory
lexical analysis
grammar theory
compilation
11
Course Details
Lectures with OHPs
  motivation, explanation, demonstration
Handouts
  definitions, algorithms and results, examples
Exercises
  practice using abstract concepts, test of understanding
Practical assignments
  programming, putting theory into practice
Assessment
  25% practical assignments
  75% written exam, testing definitions, understanding, problem solving, application
12
Alphabets and Strings Definitions
A symbol is a basic unit. g, 5 and 1 are symbols.
An alphabet is a finite set of symbols.
$\{a, b, c, d, e, \ldots, z, A, B, \ldots, Z\}$, $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$, $\{0, 1\}$, $\{a, b\}$
A string over an alphabet $T$ is a finite sequence of symbols from $T$. Also called a $T$-string, or simply a string.
aZb, 45637, 0001010, aabababba
The empty string is the string with no symbols, denoted by $\epsilon$.
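As a rough illustration of these definitions (not part of the original notes), an alphabet can be modelled as a Python set of one-character symbols and a string as an ordinary `str`. The helper name `is_string_over` and the constant names are hypothetical.

```python
# Alphabets from the slide, modelled as sets of one-character symbols.
BINARY = {"0", "1"}
AB = {"a", "b"}
DIGITS = set("0123456789")

EMPTY = ""  # the empty string: a string with no symbols


def is_string_over(w, alphabet):
    """Return True if every symbol of w is drawn from the alphabet."""
    return all(symbol in alphabet for symbol in w)


print(is_string_over("0001010", BINARY))    # True
print(is_string_over("aabababba", AB))      # True
print(is_string_over("45637", BINARY))      # False
print(is_string_over(EMPTY, AB))            # True
```

The last call shows that the empty string counts as a string over any alphabet, since it contains no symbol that could fall outside it.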
13
Definitions (cont.)
The length of a string $w$ is the number of symbols in the sequence, denoted $|w|$.
$|4732| = 4$, $|abbca| = 5$
Two strings $w$ and $v$ are equal if they have exactly the same sequence of symbols.
The concatenation of two strings $w$ and $v$ is the sequence of symbols in $w$ followed by the sequence of symbols in $v$, denoted $wv$.
$w = abb$, $v = bab$, $wv = abbbab$. $w\epsilon = w$.
Note: concatenation is not commutative; $wv$ need not equal $vw$. Concatenation is associative: $w(vu) = (wv)u$.
A string $u$ is a substring of $w$ if there exist other strings $x$ and $y$ such that $w = xuy$.
ab is a substring of babba. Note: $\epsilon$ is a substring of every string.
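The definitions on this slide map directly onto Python string operations. The sketch below is an illustration only; `is_substring` is a hypothetical helper written to mirror the definition $w = xuy$ rather than to replace Python's built-in `in` operator.

```python
w, v, u = "abb", "bab", "ba"

# Length: the number of symbols in the sequence.
print(len("4732"), len("abbca"))        # 4 5

# Concatenation: the symbols of w followed by the symbols of v.
print(w + v)                            # abbbab
print(w + "" == w)                      # True: the empty string changes nothing
print(w + v == v + w)                   # False: not commutative for this pair
print(w + (v + u) == (w + v) + u)       # True: associative


def is_substring(u, w):
    """True if w = x + u + y for some (possibly empty) strings x and y."""
    return any(w[i:i + len(u)] == u for i in range(len(w) - len(u) + 1))


print(is_substring("ab", "babba"))      # True
print(is_substring("", "babba"))        # True: the empty string is a substring of every string
```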
14
Definitions (cont.)
If $u$ is a substring of $w$, and $x$ above is $\epsilon$, then $u$ is a prefix of $w$. If $u \neq w$, then $u$ is a proper prefix.
ba is a prefix of babba.
If $u$ is a substring of $w$, and $y$ above is $\epsilon$, then $u$ is a suffix of $w$. If $u \neq w$, then $u$ is a proper suffix.
bba is a suffix of babba.
If $T$ is an alphabet, then $T^*$ is the set of all strings over $T$. $T^+$ is $T^*$ without $\epsilon$.
$T = \{a, b\}$, $T^* = \{\epsilon, a, b, aa, ab, bb, ba, aaa, \ldots\}$
If $a$ is a symbol, then $a^n$ is the string of $n$ a's.
$a^3 = aaa$, $a^* = \{\epsilon, a, aa, aaa, \ldots\}$, $a^+ = \{a, aa, aaa, \ldots\}$
Note: $a^n a^m = a^{n+m}$.
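A minimal sketch of the prefix, suffix, power and star definitions follows. The function names are hypothetical, and because the set of all strings over an alphabet is infinite, `star` is written as a generator that yields strings shortest first rather than building a set.

```python
from itertools import count, islice, product


def is_prefix(u, w):
    """True if w = u + y for some string y."""
    return w.startswith(u)


def is_proper_prefix(u, w):
    """A prefix of w that is not w itself."""
    return is_prefix(u, w) and u != w


def is_suffix(u, w):
    """True if w = x + u for some string x."""
    return w.endswith(u)


def power(a, n):
    """The string made of n copies of the symbol a."""
    return a * n


def star(alphabet):
    """Yield every string over the alphabet, shortest first."""
    symbols = sorted(alphabet)
    for n in count(0):
        for combo in product(symbols, repeat=n):
            yield "".join(combo)


print(is_prefix("ba", "babba"), is_suffix("bba", "babba"))   # True True
print(power("a", 3))                                         # aaa
print(list(islice(star({"a", "b"}), 8)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa']
```

Skipping the first value yielded by `star` gives the same enumeration without the empty string.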
15
Languages
A language over an alphabet $T$ is a set of strings over $T$. Also called a $T$-language, or simply a language.
If $T = \{a, b\}$, then $\{\epsilon, ab, babba, bbbbbbb\}$ is a $T$-language.
Note: $L$ is a $T$-language iff $L \subseteq T^*$.
16
Language Operations
Let $A$ and $B$ be languages over $T$.
$A \cup B$ is the set union of $A$ and $B$.
$A \cap B$ is the set intersection of $A$ and $B$.
$A'$ is the complement of $A$, i.e. all strings in $T^*$ but not in $A$.
$AB$ is the concatenation of $A$ and $B$, i.e. all strings $uv$ where $u \in A$ and $v \in B$. Note: associative, but not commutative.
$A^n$ is the concatenation of $A$ with itself $n$ times. Note: $A^0 = \{\epsilon\}$.
$A^* = A^0 \cup A^1 \cup A^2 \cup \ldots$ This is called the Kleene closure of $A$. Note: $(A^*)^* = A^*$.
$A^+ = A^1 \cup A^2 \cup \ldots$
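For finite languages, these operations can be tried out with Python sets of strings. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the example languages `A` and `B` are made up, and the closure is cut off at a chosen power because the full Kleene closure is usually infinite. The complement is left out for the same reason.

```python
def concat(A, B):
    """All strings uv with u in A and v in B."""
    return {u + v for u in A for v in B}


def power(A, n):
    """A concatenated with itself n times; power(A, 0) is {""}."""
    result = {""}
    for _ in range(n):
        result = concat(result, A)
    return result


def star_up_to(A, max_power):
    """Union of A^0, A^1, ..., A^max_power: a finite slice of the closure."""
    result = set()
    for n in range(max_power + 1):
        result |= power(A, n)
    return result


A = {"a", "ab"}
B = {"b", ""}

print(sorted(A | B))                    # ['', 'a', 'ab', 'b']
print(sorted(A & B))                    # []
print(sorted(concat(A, B)))             # ['a', 'ab', 'abb']
print(concat(A, B) == concat(B, A))     # False: not commutative for this pair
print(sorted(star_up_to(A, 2)))         # ['', 'a', 'aa', 'aab', 'ab', 'aba', 'abab']
```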
17
Orderings
Let $T$ be an alphabet with a given ordering on its symbols, say $T = \{a, b, c, d, \ldots\}$. Strings over $T$ can be ordered in two ways.
Dictionary order: All strings beginning with a are ordered before all strings beginning with b, and b before c, etc. Within groups of strings beginning with the same symbol, strings are ordered by their second symbol, and so on. $\epsilon$ is always the first string.
Lexical order: Strings are ordered by their length, with the shortest first. Within groups of strings of the same length, strings are ordered in dictionary order. $\epsilon$ is always the first string.
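The two orderings can be compared by sorting the same list of strings with two different sort keys. This is a sketch only: `ORDER`, `dictionary_key` and `lexical_key` are hypothetical names, and the symbol ordering is assumed to be given as a string of symbols in ascending order.

```python
ORDER = "abcd"  # assumed ordering on the symbols of the alphabet


def dictionary_key(w, order=ORDER):
    """Rank each symbol; Python compares the resulting lists symbol by symbol."""
    return [order.index(symbol) for symbol in w]


def lexical_key(w, order=ORDER):
    """Compare by length first, then by dictionary order."""
    return (len(w), dictionary_key(w, order))


strings = ["b", "ab", "", "aaa", "a", "ba"]

print(sorted(strings, key=dictionary_key))
# ['', 'a', 'aaa', 'ab', 'b', 'ba']
print(sorted(strings, key=lexical_key))
# ['', 'a', 'b', 'ab', 'ba', 'aaa']
```

In dictionary order aaa comes before b, while in lexical order every string of length 1 comes before any string of length 3.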
18
Specifying Languages
$L_1 = \{x^n \mid n = 1, 2, 3, \ldots\}$ What elements are in $L_1$?
$L_2 = \{x^n \mid n = 1, 4, 9, 16, \ldots\}$ What elements are in $L_2$?
$L_3 = \{x^n \mid n = 1, 4, 9, 48, \ldots\}$ What elements are in $L_3$?
Problem: Devise a clear and precise method for defining infinite languages.
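One way to make such a definition precise is to replace the "..." with an explicit membership test. The sketch below does this under two assumptions that the slide itself leaves open: that the first language allows every exponent from 1 upwards, and that the second language's exponents are the perfect squares. The function names are hypothetical, and no test is offered for the third language because its pattern cannot be read off from the listed exponents.

```python
from math import isqrt


def in_L1(w):
    """Assumed reading: one or more x's and nothing else."""
    return len(w) >= 1 and set(w) == {"x"}


def in_L2(w):
    """Assumed reading: the number of x's is a perfect square (1, 4, 9, 16, ...)."""
    n = len(w)
    return in_L1(w) and isqrt(n) ** 2 == n


print(in_L1("xxx"), in_L2("xxx"))       # True False
print(in_L1("xxxx"), in_L2("xxxx"))     # True True
print(in_L1(""), in_L1("xxy"))          # False False
```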