1
  • LEX (LEXical Analyzer Generator)
  • Features
  • Creating a Lexical Analyzer with Lex
  • Lex Source
  • LEX Regular Expressions
  • Functions


N.K. Srinath srinath_nk_at_yahoo.com 1
RVCE
2
  • LEX: Turning rules into a program
  • Summary of Source Format
  • Problems
  • positive integer and negative integer
  • valid identifiers
  • sentence is simple or compound
  • vowels and consonants
  • words, characters, blanks and lines


4

Lex turns the rules into a program. Any
source not intercepted by Lex is copied into the
generated program. There are three classes of
such things.
1) Any line that is not part of a Lex rule or
action and that begins with a blank or tab is copied
into the Lex-generated program. Such source
input prior to the first %% delimiter will be
external to any function in the code; if it
appears immediately after the first %%, it
appears in an appropriate place for declarations
in the function written by Lex which contains the
actions.
5

2) Anything included between lines
containing only %{ and %} is copied
out as above. The
delimiters are discarded. This format permits
entering text like preprocessor statements that
must begin in column 1, or copying lines that do
not look like programs.
3) Anything after the
second %% delimiter (the user subroutines section), regardless of format,
is copied out after the Lex output.
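
The three pass-through classes can be sketched in one small Lex source file. This is only an illustration under stated assumptions, not part of the original slides: the names line_count, lines_this_call and report are made up for the example, and yywrap is defined by hand so that no Lex library is needed at link time.

```lex
%{
/* Class 2: this delimited block is copied verbatim, so a
   preprocessor line can start in column 1. */
#include <stdio.h>
%}
 /* Class 1: a line starting with a blank is copied as well. Here it
    precedes the rules, so it ends up outside any function. */
 int line_count = 0;   /* hypothetical counter, not from the slides */
%%
 /* Class 1 again: indented code placed before the first rule becomes
    a declaration inside the function that holds the actions. */
 int lines_this_call = 0;   /* hypothetical local variable */
\n      { line_count++; lines_this_call++; }
.       ;
%%
/* Class 3: the final section is copied after the Lex output. */
void report(void) { printf("%d lines\n", line_count); }   /* hypothetical helper */
int yywrap(void) { return 1; }
int main(void) { yylex(); report(); return 0; }
```

Processing this file with lex or flex and compiling the generated C file should give a program that counts the newlines on standard input.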
6

Any line in the definitions section (before the
first %%) that is not contained
between %{ and %}, and that begins in
column 1, is assumed to
define a Lex substitution string. The format of
such lines is
    name translation
and it causes the
string given as the translation to be associated
with the name. The name and translation must be
separated by at least one blank or tab, and the
name must begin with a letter. The translation
can then be called out by the {name} syntax in a
rule.
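
A minimal sketch of substitution strings in use follows. The names DIGIT and LETTER are assumptions chosen for the example; they do not appear in the slides.

```lex
%{
#include <stdio.h>
%}
DIGIT   [0-9]
LETTER  [A-Za-z_]
%%
{DIGIT}+                      { printf("integer: %s\n", yytext); }
{LETTER}({LETTER}|{DIGIT})*   { printf("name: %s\n", yytext); }
.|\n                          ;
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(void) { yylex(); return 0; }
```

Each {DIGIT} or {LETTER} in a rule is replaced by its translation before the rule is compiled, so the first rule behaves exactly like [0-9]+.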
7

Summary of Source Format

The general form of a Lex source file is:

    {definitions}
    %%
    {rules}
    %%
    {user subroutines}

The definitions section contains a combination of:
1) Definitions, in the form "name space translation".
2) Included code, in the form "space code".
3) Included code, in the form
    %{
    code
    %}

8

Regular expressions in Lex use the following
operators:
    x        the character "x".
    "x"      an "x", even if x is an operator.
    \x       an "x", even if x is an operator.
    [xy]     the character x or y.
    [x-z]    the characters x, y, or z.
    [^x]     any character but x.
    .        any character but newline.
    ^x       an x at the beginning of a line.
    <y>x     an x when Lex is in start condition y.
9
  • x$       an x at the end of a line.
  • x?       an optional x.
  • x*       0,1,2, ... instances of x.
  • x+       1,2,3, ... instances of x.
  • x|y      an x or a y.
  • (x)      an x.
  • x/y      an x but only if followed by y.
  • {xx}     the translation of xx from the
    definitions section.
  • x{m,n}   m through n occurrences of x.
10

Examples of Regular Expressions
    [0-9]            a digit.
    [0-9]+           an integer (one or more digits).
    [0-9]*           zero or more digits: nothing, or an integer.
    -?[0-9]+         an optional negative sign followed by an
                     integer.
    [0-9]*\.[0-9]+   matches patterns such as 0.0, 4.5,
                     or .3154.
The \ before the period
makes it a literal period rather than a
wildcard character. The last pattern does not match an
integer.
11

([0-9]+)|([0-9]*\.[0-9]+)
This uses the grouping symbols () to specify what the
operands of the | operation are, so it matches
either an integer or a decimal number.
-?(([0-9]+)|([0-9]*\.[0-9]+))
This is the above number with an optional unary
minus.
[eE][-+]?[0-9]+
A regular expression for
an exponent.
12

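Lex specification for a decimal number:

%%
[\n\t ]   ;
-?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)   { printf("number\n"); }
.         ECHO;
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(void)
{
    yylex();
    return 0;
}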
13

What does the regular expression m+ match?
mmm m mmmmm
m+ is a regular expression that matches any
string of one or more m's.
14

What does the regular expression 7+ match?
7777 77
7+ is a regular expression that matches any string
of one or more 7's.
15

What does the regular expression A(B{1,4})C match?
ABC ABBC ABBBC ABBBBC
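
The bounded repetition can be checked with a short scanner. This is a sketch for experimentation only; the second rule and the messages are assumptions added so that non-matching words are reported too.

```lex
%{
#include <stdio.h>
%}
%%
A(B{1,4})C    { printf("match: %s\n", yytext); }      /* A, then 1 to 4 B's, then C */
[A-Za-z]+     { printf("no match: %s\n", yytext); }   /* any other word */
.|\n          ;
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(void) { yylex(); return 0; }
```

Given the words ABBC, AC and ABBBBBC, the first is reported as a match. The other two are reported as no match, because AC has no B and ABBBBBC has five.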
17

What does the regular expression [A-Za-z0-9*&#]
match?
This is a regular expression that matches any single
letter (whether upper or lower case), any digit,
an asterisk, an ampersand, or a #.
18

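Write a rule to recognize the FORTRAN DO
statement, for example:
DO 50 k = 1 , 20, 1

DO/([ ]*[0-9]+[ ]*[a-zA-Z0-9]+[ ]*=[ ]*[a-zA-Z0-9]+[ ]*,)   { printf("found DO"); }

Only DO is matched. The text after / is trailing context that
must follow it, with optional blanks between the parts.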
19
  • Problems
  • 1. Write a Lex program to find the number of
    positive integers and negative integers.
  • Solution

%{
#include <stdio.h>
int posnum = 0;
int negnum = 0;
%}

20

%%
[\n\t ]     ;
[0-9]+      posnum++;
-?[0-9]+    negnum++;
.           ECHO;
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(void)
{
    yylex();
    printf("Number of positive no. = %d\n", posnum);
    printf("Number of negative no. = %d\n", negnum);
    return 0;
}

Line 1: \n, \t and blanks are ignored.
Line 2: A run of one or more digits with no sign is a positive
number, so posnum is incremented.
Line 3: The - is optional, but since all the positive
numbers are already accepted by line 2, only negative
numbers are accepted by this line, and negnum is incremented.
Line 4: All
the other characters are echoed to the screen.
yylex() runs the scanner generated by Lex. The rest is
ordinary C.
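
The explanation of line 3 relies on how Lex chooses between overlapping rules: it takes the longest match and, when two rules match the same length, the rule listed first. The sketch below traces which rule fires. It is an illustration only, and the counters pos and neg are hypothetical stand-ins for posnum and negnum.

```lex
%{
#include <stdio.h>
int pos = 0, neg = 0;   /* hypothetical counters for this sketch */
%}
%%
[0-9]+     { pos++; printf("rule 1 took %s\n", yytext); }
-?[0-9]+   { neg++; printf("rule 2 took %s\n", yytext); }
.|\n       ;
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(void)
{
    yylex();
    printf("positive = %d, negative = %d\n", pos, neg);
    return 0;
}
```

For the input 12 -7 3, rule 1 takes 12 and 3 because both rules match the same text and rule 1 comes first. Rule 2 takes -7 because only it can match the leading minus sign.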
21

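2. Write a Lex program to find the number
of valid identifiers declared in a C source file.

%{
#include <stdio.h>
int count = 0;
%}
%%
("int ")|("float ")|("double ")|("char ") {
    int ch;
    ch = input();
    for (;;) {
        if (ch == ',')
            count++;
        else

(the action continues on the next slide)

count is declared as an integer and initialized to
zero. The keywords int, float, double and char introduce a
declaration, and the names that follow one of them are counted as valid
identifiers.
The action reads the input one character at a time with input() and stores
each character in the variable ch.
The infinite for loop checks for a comma and
increments the counter each time it finds one.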
22

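        if (ch == ';' || ch <= 0)   /* end of declaration, or end of input */
            break;
        ch = input();
    }
    count++;
}
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(int argc, char *argv[])
{
    if (argc < 2 || (yyin = fopen(argv[1], "r")) == NULL)
        return 1;
    yylex();
    printf("the no of identifiers used is %d\n", count);
    return 0;
}

If the character is a semicolon the loop breaks; otherwise it
reads the next input character.
After the loop the counter is incremented once more for
the last identifier, which is followed by the semicolon rather than a comma.
The program's argument is a file name. yyin is set to
that file, opened in read mode, so the scanner generated by
Lex reads its input from it.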
23

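3. Write a Lex program to find whether the given sentence
is simple or compound.

%{
#include <stdio.h>
int flag = 0;
%}
%%
(" "[aA][nN][dD]" ")|(" "[oO][rR]" ")|(" "[bB][uU][tT]" ")   flag = 1;
.   ;
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(void)
{
    yylex();
    if (flag == 1)
        printf("COMPOUND SENTENCE\n");

Initialize a flag variable which, when set,
indicates that a compound
sentence has been identified.
The rule says that when the scanner finds AND, OR or BUT
(in any letter case, with a blank on each side),
which are conjunctions, the flag is set.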
24

    else
        printf("SIMPLE SENTENCE\n");
    return 0;
}

4. Write a
Lex program to find the number of vowels and
consonants.

%{
#include <stdio.h>
int vowels = 0;
int consonents = 0;
%}

Initialize two variables to zero, one for vowels
and one for consonants.
25

%%
[ \t\n]         ;
[aeiouAEIOU]    vowels++;
[bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ]    consonents++;
.               ;
%%
int yywrap(void) { return 1; }   /* no further input files */
int main(void)
{
    yylex();
    printf(" The number of vowels = %d\n", vowels);
    printf(" The number of consonants = %d\n", consonents);
    return 0;
}

Identify blanks, tabs and newline characters and
ignore them.
Identify the vowels a, e, i, o, u (in either case) and increment vowels.
Identify consonants and increment the variable
consonents.
Neglect the rest.