Introduction to Language Theory - PowerPoint PPT Presentation

Learn more at: https://www.cise.ufl.edu
1
Introduction to Language Theory
Programming Language Translators
  • Prepared by
  • Manuel E. Bermúdez, Ph.D.
  • Associate Professor
  • University of Florida

2
Introduction to Language Theory
  • Definition: An alphabet (or vocabulary) Σ is a
    finite set of symbols.
  • Example: Alphabet of Pascal
  • + - * / < (operators)
  • begin end if var (keywords)
  • <identifier> (identifiers)
  • <string> (strings)
  • <integer> (integers)
  • , ( ) (punctuators)
  • Note: All identifiers are represented by one
    symbol, because Σ must be finite.

3
Introduction to Language Theory
  • Definition: A sequence t = t₁t₂…tₙ of symbols
    from an alphabet Σ is a string.
  • Definition: The length of a string t = t₁t₂…tₙ
    (denoted |t|) is n. If n = 0, the string is ε,
    the empty string.
  • Definition: Given strings s = s₁s₂…sₙ and
    t = t₁t₂…tₘ, the concatenation of s and t,
    denoted st, is the string s₁s₂…sₙt₁t₂…tₘ.

4
Introduction to Language Theory
  • Note: εu = u = uε, and uεv = uv, for any strings u, v
    (including ε).
  • Definition: Σ* is the set of all strings of
    symbols from Σ.
  • Note: Σ* is called the reflexive, transitive
    closure of Σ.
  • Σ* is described by the graph (Σ*, ·), where ·
    denotes concatenation, and there is a designated
    start node, ε.

5
Introduction to Language Theory
  • Example: Σ = {a, b}.
  • (Σ*, ·)
  • Σ* is countably infinite, so we can't compute all of
    Σ*, and can only compute finite subsets of Σ*,
    but we can compute whether a given string is in Σ*.
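
The last point can be sketched in Python. This is only an illustration: the names `sigma_star` and `in_sigma_star` are made up here, and the empty string ε is represented by `""`. The generator never terminates on its own, so only a finite prefix of Σ* is ever materialized, while the membership test always finishes.

```python
from itertools import count, islice, product


def sigma_star(alphabet):
    """Yield every string over `alphabet`, shortest first (epsilon is '')."""
    symbols = sorted(alphabet)
    for length in count(0):
        for combo in product(symbols, repeat=length):
            yield "".join(combo)


def in_sigma_star(string, alphabet):
    """Membership test: every symbol must come from the alphabet."""
    return all(symbol in alphabet for symbol in string)


SIGMA = {"a", "b"}

# A finite subset of Sigma*: the first seven strings in length order.
first_seven = list(islice(sigma_star(SIGMA), 7))
print(first_seven)

# Membership is decidable even though Sigma* is infinite.
print(in_sigma_star("abba", SIGMA), in_sigma_star("abc", SIGMA))
```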

[Figure: the graph (Σ*, ·) for Σ = {a, b}. Start node ε; edges labeled a and b lead to the nodes a, b, aa, ab, ba, bb, aba, abb, and so on.]

6
Introduction to Language Theory
  • Example: Σ = Pascal vocabulary.
  • Σ* = all possible alleged Pascal programs,
    i.e. all possible inputs to a Pascal compiler.
  • Need to specify L ⊆ Σ*, the correct Pascal
    programs.
  • Definition: A language L over an alphabet Σ is a
    subset of Σ*.

7
Introduction to Language Theory
  • Example: Σ = {a, b}.
  • L1 = ∅ is a language
  • L2 = {ε} is a language
  • L3 = {a} is a language
  • L4 = {a, ba, bbab} is a language
  • L5 = {aⁿbⁿ | n ≥ 0} is a language
  • where aⁿ = aa…a, n times
  • L6 = {a, aa, aaa, …} is a language
  • Note: L5 is an infinite language, but described
    finitely.
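
As a small illustration of a finite description paired with a finite inclusion test, here is one possible membership check for L5 in Python. The function name `in_l5` is invented for this sketch.

```python
def in_l5(string):
    """Return True iff string equals 'a' * n + 'b' * n for some n >= 0."""
    n, remainder = divmod(len(string), 2)
    # An odd length can never split into equal runs of a's and b's.
    return remainder == 0 and string == "a" * n + "b" * n


for candidate in ["", "ab", "aabb", "aab", "ba", "abab"]:
    print(repr(candidate), in_l5(candidate))
```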

8
Introduction to Language Theory
  • THIS IS THE MAIN GOAL OF LANGUAGE SPECIFICATION
  • To describe (infinite) programming languages
    finitely, and to provide corresponding finite
    inclusion-test algorithms.

9
Language Constructors
  • Definition: The catenation (or product) of two
    languages L1 and L2, denoted L1L2, is the set
  • {uv | u ∈ L1, v ∈ L2}.
  • Example: L1 = {ε, a, bb}, L2 = {ac, c}
  • L1L2 = {ac, c, aac, ac, bbac, bbc}
  • = {ac, c, aac, bbac, bbc}
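
For finite languages the definition translates directly into a set comprehension. The sketch below assumes languages are Python sets of strings with ε written as `""`; the name `catenate` is chosen here for illustration.

```python
def catenate(l1, l2):
    """Product of two finite languages: every u from l1 followed by every v from l2."""
    return {u + v for u in l1 for v in l2}


L1 = {"", "a", "bb"}
L2 = {"ac", "c"}

# "ac" arises twice (epsilon + "ac" and "a" + "c"); the set keeps one copy.
product_language = catenate(L1, L2)
print(sorted(product_language))
```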

10
Language Constructors
  • Definition: Lⁿ = LL…L (n times),
  • and L⁰ = {ε}.
  • Example: L = {a, bb}
  • L³ = {aaa, aabb, abba,
    abbbb, bbaa, bbabb, bbbba, bbbbbb}
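
The power of a finite language can be computed by repeated catenation, starting from L⁰ = {ε}. This is a minimal sketch; `power` is a name made up for the example.

```python
def power(language, n):
    """Return the n-fold product of a finite language with itself."""
    result = {""}  # L^0 contains only the empty string
    for _ in range(n):
        result = {u + v for u in result for v in language}
    return result


L = {"a", "bb"}
cube = power(L, 3)
print(sorted(cube, key=lambda s: (len(s), s)))
```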

11
Language Constructors
  • Definition: The union of two languages L1 and L2
    is the set L1 ∪ L2 = {u | u ∈ L1} ∪ {v | v ∈ L2}
  • Definition: The Kleene star (L*) of a language L is
    the set L* = ∪ Lⁿ, n ≥ 0.
  • Example: L = {a, bb}
  • L* = {any string composed of a's and
  • bb's}
  • Definition: The Transitive Closure (L⁺) of a
    language L is the set L⁺ = ∪ Lⁿ, n ≥ 1.

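L* is infinite whenever L contains a nonempty string, so it cannot be listed, but membership can still be tested. One way to do this for a finite L (an approach chosen here, not taken from the slides) is dynamic programming over prefixes. The function names are hypothetical.

```python
def in_kleene_star(string, language):
    """True iff string is a concatenation of zero or more strings from language."""
    # reachable[i] is True when string[:i] can be split into pieces from language.
    reachable = [True] + [False] * len(string)
    for end in range(1, len(string) + 1):
        reachable[end] = any(
            reachable[end - len(piece)] and string.endswith(piece, 0, end)
            for piece in language
            if 0 < len(piece) <= end
        )
    return reachable[len(string)]


def in_transitive_closure(string, language):
    """L+ needs at least one factor, so epsilon is in L+ only if it is in L."""
    if string == "":
        return "" in language
    return in_kleene_star(string, language)


L = {"a", "bb"}
print(in_kleene_star("abba", L), in_kleene_star("bbb", L))
print(in_transitive_closure("", L), in_transitive_closure("", {""}))
```
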
12
Language Constructors
  • Note:
  • In general, L* = L⁺ ∪ {ε}, but L⁺ ≠ L* − {ε}.
  • For example, consider L = {ε}. Then
  • {ε} = L⁺ ≠ L* − {ε} = {ε} − {ε} = ∅.

13
Grammars
  • Goal: Providing a means for describing languages
    finitely.
  • Method: Provide a subgraph (Σ*, ⇒) of (Σ*, ·),
    and a start node S, such that the set of
    reachable nodes (from S) are the strings in the
    language.

14
Grammars
  • Example: Σ = {a, b}
  • L = {aⁿbⁿ | n ≥ 0}

[Figure: graph over Σ* for Σ = {a, b}, with edges labeled a and b and nodes including ε, a, aa, aaa, aab, aaba, aabb, ab, ba, bb, bba, bbaa, bbab, bbb.]

15
Grammars
  • ⇒ (derives) is a relation defined by a finite
    set of rewrite rules known as productions.
  • Definition: Given a vocabulary V, a production is
    a pair (u, v) ∈ V* × V*, denoted u → v. u is
    called the left-part; v is called the right-part.

16
Grammars
  • Example: Pseudo-English.
  • V = {Sentence, NP, VP, Adj, N, V, boy, girl,
    the, tall, jealous, hit, bit}
  • Sentence → NP VP (one production)
  • NP → N
  • NP → Adj NP
  • N → boy
  • N → girl
  • Adj → the
  • Adj → tall
  • Adj → jealous
  • VP → V NP
  • V → hit
  • V → bit
  • Note: English is much too complicated to be
    described this way.

17
Grammars
  • Definition:
  • Given a finite set of productions P ⊆ V* × V*,
    the relation ⇒ is defined such that
  • for all α, β, u, v ∈ V*, αuβ ⇒ αvβ iff
  • u → v ∈ P is a production.
  • Example:
  • Sentence → NP VP    Adj → the
  • NP → N              Adj → tall
  • NP → Adj NP         Adj → jealous
  • N → boy             VP → V NP
  • N → girl            V → hit
  •                     V → bit

18
Grammars
  • Sentence ⇒ NP VP
  • ⇒ Adj NP VP
  • ⇒ the NP VP
  • ⇒ the Adj NP VP
  • ⇒ the jealous NP VP
  • ⇒ the jealous N VP
  • ⇒ the jealous girl VP
  • ⇒ the jealous girl V NP
  • ⇒ the jealous girl hit NP
  • ⇒ the jealous girl hit Adj NP
  • ⇒ the jealous girl hit the NP
  • ⇒ the jealous girl hit the N
  • ⇒ the jealous girl hit the boy
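
The derives relation can be checked mechanically: one string derives another in one step when a single occurrence of some left-part is replaced by the matching right-part. The sketch below encodes the pseudo-English productions and replays the derivation above. Strings are represented as tuples of symbols, and `derives_in_one_step` is a name invented for this example.

```python
# Productions of the pseudo-English example as (left-part, right-part) pairs.
PRODUCTIONS = [
    ("Sentence", "NP VP"), ("NP", "N"), ("NP", "Adj NP"),
    ("N", "boy"), ("N", "girl"),
    ("Adj", "the"), ("Adj", "tall"), ("Adj", "jealous"),
    ("VP", "V NP"), ("V", "hit"), ("V", "bit"),
]
RULES = [(tuple(u.split()), tuple(v.split())) for u, v in PRODUCTIONS]


def derives_in_one_step(before, after, rules=RULES):
    """True iff before = alpha u beta and after = alpha v beta for some rule u -> v."""
    for u, v in rules:
        for i in range(len(before) - len(u) + 1):
            if before[i:i + len(u)] == u and before[:i] + v + before[i + len(u):] == after:
                return True
    return False


DERIVATION = [
    "Sentence",
    "NP VP",
    "Adj NP VP",
    "the NP VP",
    "the Adj NP VP",
    "the jealous NP VP",
    "the jealous N VP",
    "the jealous girl VP",
    "the jealous girl V NP",
    "the jealous girl hit NP",
    "the jealous girl hit Adj NP",
    "the jealous girl hit the NP",
    "the jealous girl hit the N",
    "the jealous girl hit the boy",
]
steps = [tuple(line.split()) for line in DERIVATION]

# Every consecutive pair must be related by exactly one production application.
derivation_is_valid = all(derives_in_one_step(x, y) for x, y in zip(steps, steps[1:]))
print(derivation_is_valid)
```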

19
Grammars
  • Definition: A grammar is a 4-tuple G = (Φ, Σ, P,
    S)
  • where
  • Φ is a finite set of nonterminals,
  • Σ is a finite set of terminals,
  • V = Φ ∪ Σ is the grammar's vocabulary,
  • S ∈ Φ is called the start or goal symbol,
  • and P ⊆ V* × V* is a finite set of productions.
  • Example: Grammar for {aⁿbⁿ | n ≥ 0}.
  • G = (Φ, Σ, P, S), where
  • Φ = {S},
  • Σ = {a, b},
  • and P = {S → aSb, S → ε}
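
One possible way to hold the 4-tuple in a program is a small record type that checks the conditions in the definition when it is built. The class and field names below are choices made for this sketch, and each production is a pair of symbol tuples, with the empty tuple standing for ε.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # Phi
    terminals: frozenset     # Sigma
    productions: tuple       # P: pairs (u, v) of symbol tuples
    start: str               # S

    def __post_init__(self):
        vocabulary = self.nonterminals | self.terminals  # V = Phi U Sigma
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a nonterminal")
        for u, v in self.productions:
            if not (set(u) | set(v)) <= vocabulary:
                raise ValueError(f"production {u} -> {v} uses symbols outside V")


# The example grammar: S -> aSb, S -> epsilon.
ANBN = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=((("S",), ("a", "S", "b")), (("S",), ())),
    start="S",
)
print(ANBN.start, len(ANBN.productions))
```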

20
Grammars
  • Derivations:
  • S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ …
  • Applying S → ε to each of these sentential forms
    in turn gives ε, ab, aabb, aaabbb, aaaabbbb, …
  • Note: Normally, grammars are given by simply
    listing the productions.

21
Grammar Conventions
  • TWS convention
  • Upper case letter (identifier): nonterminal
  • Lower case letter (string): terminal
  • Lower case Greek letter: strings in V*
  • Left part of the first production is assumed to
    be the start symbol, e.g.
  • S → aSb
  • S → ε
  • Left part omitted if same as for preceding
    production, e.g.
  • S → aSb
  •   → ε

22
Grammars
  • Example: Grammar for identifiers.
  • Identifier → Letter
  •            → Identifier Letter
  •            → Identifier Digit
  • Letter → a    → A
  •        → b    → B
  •        ...
  •        → z    → Z
  • Digit → 0
  •       → 1
  •       ...
  •       → 9
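
Read as a whole, this grammar says that an identifier is one letter followed by any mix of letters and digits. The sketch below tests that property directly rather than running the grammar; it assumes Letter covers exactly the ASCII letters a to z and A to Z, and `is_identifier` is a name invented here.

```python
import string

LETTERS = set(string.ascii_letters)  # a..z and A..Z
DIGITS = set(string.digits)          # 0..9


def is_identifier(text):
    """True iff text is a Letter followed by zero or more Letters or Digits."""
    if not text or text[0] not in LETTERS:
        return False
    return all(ch in LETTERS or ch in DIGITS for ch in text[1:])


for candidate in ["x", "x1", "Abc9", "9lives", "", "a_b"]:
    print(repr(candidate), is_identifier(candidate))
```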

23
Grammars
  • Definition: The language generated by a grammar
    G is the set L(G) = {α ∈ Σ* | S ⇒* α}
  • Definition: A sentential form generated by a
    grammar G is any string α such that S ⇒* α.
  • Definition: A sentence generated by a grammar G
    is any sentential form α such that α ∈ Σ*.

24
Grammars
  • Example:
  • Sentential forms:
    S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ …
  • Sentences (one derived from each sentential form above):
    ε, ab, aabb, aaabbb, aaaabbbb, …
  • Lemma: L(G) = {α | α is a sentence}
  • Proof: Trivial.

25
Grammars
  • Example: A → aABC
  •   → aBC
  • aB → ab
  • bB → bb
  • bC → bc
  • CB → BC
  • cC → cc

26
Grammars
  • Derivations:
  • A ⇒ aBC ⇒ abC ⇒ abc
  • A ⇒ aABC ⇒ aaBCBC ⇒ aabCBC ⇒ aabBCC ⇒ aabbCC
    ⇒ aabbcC ⇒ aabbcc
  • A ⇒ aABC ⇒ aaABCBC ⇒ aaaBCBCBC ⇒(2) aaaBBCBCC
    ⇒ aaaBBBCCC ⇒ aaabBBCCC ⇒(2) aaabbbCCC
    ⇒ aaabbbcCC ⇒(2) aaabbbccc
  • Here ⇒(2) abbreviates two derivation steps.
  • L(G) = {aⁿbⁿcⁿ | n ≥ 1}

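Derivations in this grammar can be explored by breadth-first search over sentential forms. The sketch below is one way to do that, under two assumptions of this example: each symbol is a single character (upper case for nonterminals, lower case for terminals), and forms longer than a chosen bound are discarded. Because no production of this grammar shortens a string, the bound loses no short sentences. The name `sentences_up_to` is hypothetical.

```python
from collections import deque

PRODUCTIONS = [
    ("A", "aABC"), ("A", "aBC"), ("aB", "ab"), ("bB", "bb"),
    ("bC", "bc"), ("CB", "BC"), ("cC", "cc"),
]


def sentences_up_to(max_length, start="A"):
    """Collect every sentence of length <= max_length reachable from start."""
    seen = {start}
    queue = deque([start])
    sentences = set()
    while queue:
        form = queue.popleft()
        if form.islower():  # only terminals remain, so this form is a sentence
            sentences.add(form)
            continue
        for left, right in PRODUCTIONS:
            position = form.find(left)
            while position != -1:  # try every occurrence of the left-part
                successor = form[:position] + right + form[position + len(left):]
                if len(successor) <= max_length and successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
                position = form.find(left, position + 1)
    return sentences


found = sentences_up_to(9)
print(sorted(found, key=len))
```

Forms such as aabcBC are reached by the search but never become sentences, since no production can rewrite the B that follows c.
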
27
The Chomsky Hierarchy
  • A hierarchy of grammars, the languages they
    generate, and the machines that accept those
    languages.

28
The Chomsky Hierarchy
Type  Language Name               Grammar Name                   Restrictions on Grammar                  Accepting Machine
0     Recursively Enumerable      Unrestricted rewriting system  None                                     Turing Machine
1     Context-Sensitive Language  Context-Sensitive Grammar      For all α → β, |α| ≤ |β|                 Linear Bounded Automaton
2     Context-Free Language       Context-Free Grammar           For all α → β, α ∈ Φ                     Push-Down Automaton (parser)
3     Regular Language            Regular Grammar                For all α → β, α ∈ Φ, β ∈ Σ ∪ ΣΦ ∪ {ε}   Finite-State Automaton
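
The restrictions in the table can be checked production by production. The sketch below assumes the usual textbook readings, which may differ in detail from the original slide: type 1 requires |α| ≤ |β|, type 2 requires α to be a single nonterminal, and type 3 additionally requires β to be ε, one terminal, or one terminal followed by one nonterminal. The helper `rule` and the function `chomsky_type` are names made up here, with one character per symbol.

```python
def rule(left, right):
    """Hypothetical helper: one character per symbol, '' for epsilon."""
    return (tuple(left), tuple(right))


def chomsky_type(productions, nonterminals, terminals):
    """Return the highest type number whose restriction every production meets."""

    def context_free(alpha, beta):
        return len(alpha) == 1 and alpha[0] in nonterminals

    def regular(alpha, beta):
        if not context_free(alpha, beta):
            return False
        if len(beta) <= 1:  # epsilon or a single terminal
            return all(symbol in terminals for symbol in beta)
        return len(beta) == 2 and beta[0] in terminals and beta[1] in nonterminals

    def context_sensitive(alpha, beta):
        return len(alpha) <= len(beta)

    for type_number, allowed in ((3, regular), (2, context_free), (1, context_sensitive)):
        if all(allowed(alpha, beta) for alpha, beta in productions):
            return type_number
    return 0


PHI = {"S", "A", "B", "C"}
SIGMA = {"a", "b", "c"}

a_star = [rule("S", "aS"), rule("S", "")]
anbn = [rule("S", "aSb"), rule("S", "")]
anbncn = [rule("A", "aABC"), rule("A", "aBC"), rule("aB", "ab"), rule("bB", "bb"),
          rule("bC", "bc"), rule("CB", "BC"), rule("cC", "cc")]
contracting = [rule("AB", "A")]

print(chomsky_type(a_star, PHI, SIGMA), chomsky_type(anbn, PHI, SIGMA),
      chomsky_type(anbncn, PHI, SIGMA), chomsky_type(contracting, PHI, SIGMA))
```

The function classifies a grammar, not a language: a language may have a more restricted grammar than the one it was given with.
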
29
Language Hierarchy
0 Recursively Enumerable Languages
1 Context-Sensitive Languages, e.g. {aⁿbⁿcⁿ | n ≥ 0}
2 Context-Free Languages, e.g. {aⁿbⁿ | n ≥ 0}
3 Regular Languages, e.g. {aⁿ | n ≥ 0}
English?
We will deal with type 2 (syntax) and type 3
(lexicon) languages.