Lex and Yacc

Provided by: Dewan7
1
Tutorial on Lex & Yacc
2
Purpose of Tutorial
  • 1. Provide a brief, non-technical, black-box
    introduction to lex and yacc.
  • 2. Show how to run lex and yacc.

3
Lex: what is it?
  1. Lex: a tool for automatically generating a lexer
    (or scanner) from a lex specification (.l file).
  2. A lexer or scanner performs lexical analysis:
    it breaks an input stream up into meaningful
    units, or tokens.
  3. For example, consider breaking a text file up
    into individual words.

4
Skeleton of a lex specification (.l file)
5
The rules section
%%
[RULES SECTION]
<pattern>   { <action to take when matched> }
<pattern>   { <action to take when matched> }
%%
Patterns are specified by regular expressions. For example:
[A-Za-z]+   { printf("this is a word"); }
6
Regular Expression Basics
.      matches any single character except \n
*      matches 0 or more instances of the preceding regular expression
+      matches 1 or more instances of the preceding regular expression
?      matches 0 or 1 of the preceding regular expression
|      matches the preceding or following regular expression
[ ]    defines a character class
( )    groups the enclosed regular expression into a new regular expression
"..."  matches everything within the " " literally
7
Lex Reg Exp (cont)
  • x|y     x or y
  • {i}     definition of i
  • x/y     x, only if followed by y (y not removed from
    input)
  • x{m,n}  m to n occurrences of x
  • ^x      x, but only at beginning of line
  • x$      x, but only at end of line
  • "s"     exactly what is in the quotes (except for
    "\" and the following character)
  • A regular expression finishes with a space, tab
    or newline

8
Meta-characters
  • meta-characters (do not match themselves, because
    they are used in the preceding reg exps)
  • ( ) [ ] { } < > + / , ^ * | . \ " $ ? - %
  • to match a meta-character, prefix with "\"
  • to match a backslash, tab or newline, use \\,
    \t, or \n

9
Regular Expression Examples
  • an integer: 12345
  • [1-9][0-9]*
  • a word: cat
  • [a-zA-Z]+
  • a (possibly) signed integer: 12345 or -12345
  • -?[1-9][0-9]*
  • a floating point number: 1.2345
  • [0-9]*\.[0-9]+   (the dot is escaped so that it matches only a literal ".")

10
Lex Regular Expressions
  • Lex uses an extended form of regular expression
  • (c: character; x, y: regular expressions; s:
    string; m, n: integers; i: identifier).
  • c       any character except meta-characters (see
    the Meta-characters slide)
  • [...]   the list of enclosed chars (may be a range)
  • [^...]  the list of chars not enclosed
  • .       any ASCII char except newline
  • xy      concatenation of x and y
  • x*      zero or more occurrences of x (the Kleene star)
  • x+      one or more occurrences of x (i.e. x* but not the empty string ε)
  • x?      an optional x (same as x | ε)

11
Regular Expression Examples
  • a delimiter for an English sentence
  • "." | "?" | "!"   OR
  • [.?!]
  • C comment: // call foo() here!!
  • "//".*   (the slashes are quoted because / is a meta-character)
  • white space
  • [ \t]+
  • English sentence: Look at this!
  • ([ \t]+|[a-zA-Z]+)+("."|"?"|"!")

12
Special Functions
  • yytext: where the text matched most recently is stored
  • yyleng: number of characters in the text most recently
    matched
  • yylval: associated value of the current token
  • yymore(): append the next string matched to the current
    contents of yytext
  • yyless(n): remove from yytext all but the first n characters
  • unput(c): return character c to the input stream
  • yywrap(): may be replaced by the user. It is called by
    the lexical analyser whenever it reads an EOF as the first
    character when trying to match a regular
    expression

13
Let us run a lex program
14
Yacc: what is it?
  • Yacc: a tool for automatically generating a
    parser from a grammar written in a yacc
    specification (.y file).
  • A grammar specifies a set of production rules,
    which define a language.
  • A production rule specifies a sequence of
    symbols, sentences, which are legal in the
    language.
15
Skeleton of a yacc specification (.y file)
A .c file (y.tab.c by default) is generated after running yacc on x.y.
  %{
    < C global variables, prototypes, comments >
        This part will be embedded into the .c file.
  %}
    [DEFINITION SECTION]
        Contains token declarations. Tokens are recognized in the lexer.
  %%
    [PRODUCTION RULES SECTION]
        Defines how to understand the input language,
        and what actions to take for each sentence.
  %%
    < C auxiliary subroutines >
        Any user code. For example, a main function to
        call the parser function yyparse().
16
Structure of yacc File
  • Definition section: declarations of tokens; type of values
    used on the parser stack
  • Rules section: list of grammar rules with semantic routines
  • User code
17
The Production Rules Section
%%
production : symbol1 symbol2 { action }
           | symbol3 symbol4 { action }
           ;
production : symbol1 symbol2 { action }
           ;
%%
18
An example
statement  : expression                 { printf("= %g\n", $1); }
           ;
expression : expression '+' expression  { $$ = $1 + $3; }
           | expression '-' expression  { $$ = $1 - $3; }
           | NUMBER                     { $$ = $1; }
           ;
According to these two productions, 5 + 4 + 3 + 2
is parsed into
19
Choosing a Grammar
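Unambiguous grammar (precedence encoded in the rules):
  • S -> E
  • E -> E + T
  • E -> E - T
  • E -> T
  • T -> T * F
  • T -> T / F
  • T -> F
  • F -> ( E )
  • F -> ID
Ambiguous grammar (needs precedence and associativity declarations):
  • S -> E
  • E -> E + E
  • E -> E - E
  • E -> E * E
  • E -> E / E
  • E -> ( E )
  • E -> ID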

20
Precedence and Associativity
  • %right '='
  • %left '-' '+'
  • %left '*' '/'
  • %right '^'

21
Defining Values
  expr   : expr '+' term     { $$ = $1 + $3; }
         | term              { $$ = $1; }
         ;
  term   : term '*' factor   { $$ = $1 * $3; }
         | factor            { $$ = $1; }
         ;
  factor : '(' expr ')'      { $$ = $2; }
         | ID
         | NUM
         ;

22
Defining Values
  • $1: the value of the first symbol on the right-hand side,
    e.g. expr in expr '+' term

23
Defining Values
  • $2: the value of the second symbol on the right-hand side,
    e.g. '+' in expr '+' term

24
Defining Values
  • $3: the value of the third symbol on the right-hand side,
    e.g. term in expr '+' term
  • Default action: $$ = $1
25
Example Lex
scanner.l
  %{
  #include <stdio.h>
  #include "y.tab.h"
  %}
  id      [_a-zA-Z][_a-zA-Z0-9]*
  wspc    [ \t\n]+
  semi    [;]
  comma   [,]
  %%
  int      { return INT; }
  char     { return CHAR; }
  float    { return FLOAT; }
  {comma}  { return COMMA; }   /* Necessary? */
  {semi}   { return SEMI; }
  {id}     { return ID; }
  {wspc}   { ; }

26
Example Definitions
decl.y
  %{
  #include <stdio.h>
  #include <stdlib.h>
  %}
  %start line
  %token CHAR COMMA FLOAT ID INT SEMI
  %%

27
Example Rules
decl.y
  /* This production is not part of the "official"
   * grammar. Its primary purpose is to recover from
   * parser errors, so it's probably best if you
   * leave it here. */
  line : /* lambda */
       | line decl
       | line error {
             printf("Failure :-(\n");
             yyerrok;
             yyclearin;
         }
       ;

28
Example Rules
decl.y
  decl : type ID list   { printf("Success!\n"); }
       ;
  list : COMMA ID list
       | SEMI
       ;
  type : INT | CHAR | FLOAT
       ;
  %%

29
Example Supplementary Code
decl.y
  extern FILE *yyin;

  int main(void)
  {
      do {
          yyparse();
      } while (!feof(yyin));
      return 0;
  }

  int yyerror(char *s)
  {
      /* Don't have to do anything! */
      return 0;
  }

30
Let us Run a Program