Title: Examples
1Examples
- The MPEG specification
- Example code from the Tcl computer language
[Diagram: Token Stream -> Parser -> Intermediate Code]
Let's build up an example.
2MPEG Bitstream Format
- Data is a stream of bits, not bytes!
- Uses alignment points called start codes
    next_start_code() {
        while (!bytealigned())
            zero_bit
        while (nextbits() != '0000 0000 0000 0000 0000 0001')
            zero_byte
    }
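The alignment loop above can be sketched in Python. This is only an illustration: the bitstream is modeled as a string of '0' and '1' characters, and the function signature and the sample stream are assumptions made for the example, not part of the MPEG specification.

```python
START_CODE_PREFIX = "0" * 23 + "1"  # 0000 0000 0000 0000 0000 0001


def next_start_code(bits: str, pos: int) -> int:
    """Return the bit position of the next start-code prefix at or after pos.

    `bits` is a string of '0'/'1' characters standing in for the bitstream.
    """
    # Skip zero bits until the position is byte aligned.
    while pos % 8 != 0:
        if bits[pos] != "0":
            raise ValueError(f"expected zero_bit at bit {pos}")
        pos += 1
    # Skip zero bytes until the next 24 bits are the start-code prefix.
    while bits[pos:pos + 24] != START_CODE_PREFIX:
        if bits[pos:pos + 8] != "00000000":
            raise ValueError(f"expected zero_byte at bit {pos}")
        pos += 8
    return pos


# Five data bits, three zero bits, one zero byte, then a start-code prefix.
stream = "10110" + "000" + "00000000" + START_CODE_PREFIX + "00000000"
print(next_start_code(stream, 5))  # → 16
```

Starting at bit 5, the sketch consumes three zero bits to reach a byte boundary, then one zero byte, and stops at bit 16 where the 24-bit prefix begins.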
3Start codes
- Alignment points
- next_start_code()
- 32 bit specific start code
    picture_start_code      00000100
    user_data_start_code    000001b2
    sequence_header_code    000001b3
    sequence_error_code     000001b4
    sequence_end_code       000001b7
    group_start_code        000001b8
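Because start codes are byte aligned, a scanner can look for them in the raw bytes. The following is a minimal sketch, assuming the stream is available as a Python bytes object; the function name find_start_codes and the sample data are hypothetical, and only the six codes listed above are named.

```python
START_CODES = {
    0x00000100: "picture_start_code",
    0x000001B2: "user_data_start_code",
    0x000001B3: "sequence_header_code",
    0x000001B4: "sequence_error_code",
    0x000001B7: "sequence_end_code",
    0x000001B8: "group_start_code",
}

PREFIX = b"\x00\x00\x01"  # every start code begins with these three bytes


def find_start_codes(data: bytes):
    """Yield (byte_offset, name) for each 32-bit start code found in data."""
    pos = data.find(PREFIX)
    while pos != -1 and pos + 4 <= len(data):
        code = int.from_bytes(data[pos:pos + 4], "big")
        yield pos, START_CODES.get(code, "other")
        pos = data.find(PREFIX, pos + 3)


# Made-up sample: header code, two data bytes, GOP, picture, one byte, end.
sample = bytes.fromhex("000001b3" "aabb" "000001b8" "00000100" "cc" "000001b7")
for offset, name in find_start_codes(sample):
    print(offset, name)
```

Codes not in the table are reported as "other" rather than guessed at.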
4Video sequence layer
    video_sequence() {
        next_start_code()
        do {
            sequence_header()            (a block of data)
            do {
                group_of_pictures()      (GOP)
            } while (nextbits() == group_start_code)
        } while (nextbits() == sequence_header_code)
        sequence_end_code
    }
5Sequence header
    sequence_header()            bits
        sequence_header_code      32
        horizontal_size           12
        vertical_size             12
        pel_aspect_ratio           4
        picture_rate               4
        bit_rate                  18
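Since the fields are bit widths rather than byte widths, reading them means shifting and masking. Here is a rough sketch that reads only the fields listed above from the front of a byte string; the function name read_fields and the sample values (352 x 240 and so on) are invented for illustration.

```python
FIELDS = [
    ("sequence_header_code", 32),
    ("horizontal_size", 12),
    ("vertical_size", 12),
    ("pel_aspect_ratio", 4),
    ("picture_rate", 4),
    ("bit_rate", 18),
]


def read_fields(data: bytes, fields=FIELDS) -> dict:
    """Read consecutive big-endian bit fields from the start of data."""
    total = sum(width for _, width in fields)
    if len(data) * 8 < total:
        raise ValueError("not enough data for the listed fields")
    value = int.from_bytes(data, "big")
    remaining = len(data) * 8          # bits not yet consumed
    out = {}
    for name, width in fields:
        remaining -= width
        out[name] = (value >> remaining) & ((1 << width) - 1)
    return out


# Made-up header: 82 bits of fields padded with zeros to 11 bytes.
header = read_fields(bytes.fromhex("000001b3" "1600f0" "15" "02ee00"))
print(header["horizontal_size"], header["vertical_size"])  # → 352 240
```

Note that none of the fields after the start code falls on a byte boundary by itself, which is why treating the data as a stream of bits matters.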
What makes sense as tokens here?
6Tcl
http://www.tcl.tk
Pronounced "tickle". A simple, easy-to-use scripting language. Just something to use as an example for today. We're not going to do real Tcl, only a subset as an example.
7First statement
set x 55
Assigns the value 55 to the variable x. If x does not exist, it creates it.
We're going to want to determine:
- Tokens and regular expressions
- Grammar
- SDD? Syntax tree?
8Tokens
    set x 55

    "set"                     { return SET; }
    [a-zA-Z_][a-zA-Z0-9_]*    { return SYM; }
    [0-9]+                    { return INT; }
    [ \t]                     ;
    \n                        { return '\n'; }
    \r                        ;
    .                         { return yytext[0]; }
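To see what token stream these rules produce, here is a small Python sketch that imitates them with a regular expression. It is not how lex works internally; the names TOKEN_RE and tokenize are made up for the example, and the keyword check after matching a whole word stands in for lex's longest-match behavior.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<WORD>[a-zA-Z_][a-zA-Z0-9_]*)"   # "set" or a symbol
    r"|(?P<INT>[0-9]+)"
    r"|(?P<SKIP>[ \t\r])"                 # ignored, like the empty actions
    r"|(?P<NL>\n)"
    r"|(?P<OTHER>.)"                      # any other single character
)


def tokenize(text):
    """Return the list of token names for text, mirroring the lex rules."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        if kind == "SKIP":
            continue
        if kind == "WORD":
            tokens.append("SET" if m.group() == "set" else "SYM")
        elif kind == "INT":
            tokens.append("INT")
        else:
            tokens.append(m.group())      # '\n' or the character itself
    return tokens


print(tokenize("set x 55\n"))  # → ['SET', 'SYM', 'INT', '\n']
```

A word such as settings comes out as SYM rather than SET followed by SYM, which matches what lex would do with the rules above.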
9Grammar
    program → set sym int
Not much to it, just yet...
10Yacc
    /* Tcl example */
    %{
    #include "stdio.h"
    %}

    /* Define tokens */
    %token SET
    %token SYM
    %token INT

    %%

    /* Define productions */
    program : SET SYM INT '\n'
            ;

The production implements the grammar rule: program → set sym int

Yacc: Yet Another Compiler Compiler. It creates a parser for you, just like lex creates a lexical analyzer.
11Lex that goes with that...
    /* Lex file for tcl.y */
    %{
    #include <stdio.h>
    #include "y.tab.h"        /* created by yacc */
    %}

    %%

    "set"                     { return SET; }
    [a-zA-Z_][a-zA-Z0-9_]*    { return SYM; }
    [0-9]+                    { return INT; }
    [ \t]                     ;
    \n                        { return '\n'; }
    \r                        ;
    .                         { return yytext[0]; }

    %%

    int yywrap() { return 1; }
12Getting Fancier
We want to pass information from the lexical
analyzer to the parser other than just the token
(what information?)
Some tokens have values: integers, symbols, doubles, etc. We're creating synthesized attributes associated with the token.
14Fancier yacc
    %{
    #include "stdio.h"
    %}

    %union {
        int    intval;
        double dval;
        char  *sval;
    }

    %token SET
    %token <sval> SYM
    %token <intval> INT

    %%

    program : SET SYM INT '\n'    { printf("set %s %d\n", $2, $3); }
            ;

- %union { ... } defines the types we can return from lex.
- The %token lines define tokens with or without a type.
- program : SET SYM INT '\n' is the production.
- { printf(...); } is what to do (the rule).

$1, $2, etc. are the values of the production right side symbols in order. Be careful, everything counts! Here SET is $1, SYM is $2, INT is $3, and '\n' is $4.
19Lex for that
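    %{
    #include <stdio.h>
    #include <stdlib.h>       /* atoi */
    #include <string.h>       /* strdup */
    #include "y.tab.h"
    %}

    %%

    "set"                     { return SET; }
    [a-zA-Z_][a-zA-Z0-9_]*    { yylval.sval = strdup(yytext); return SYM; }
    [0-9]+                    { yylval.intval = atoi(yytext); return INT; }
    [ \t]                     ;
    \n                        { return '\n'; }
    \r                        ;
    .                         { return yytext[0]; }

    %%

    int yywrap() { return 1; }

The yylval fields come from the union in the yacc file:

    %union {
        int    intval;
        double dval;
        char  *sval;
    }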
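The idea of a token carrying a value can be illustrated outside lex and yacc. In this Python sketch, each token is a (kind, value) pair where value plays the role of yylval, and the action reads positions 2 and 3 of the matched right side the way $2 and $3 do. The names Token, tokenize, and parse_program are hypothetical, chosen only for this example.

```python
import re
from typing import NamedTuple, Union


class Token(NamedTuple):
    kind: str                       # SET, SYM, INT, or a literal character
    value: Union[int, str, None]    # plays the role of yylval


TOKEN_RE = re.compile(r"([a-zA-Z_][a-zA-Z0-9_]*)|([0-9]+)|([ \t\r])|(\n|.)")


def tokenize(text):
    tokens = []
    for m in TOKEN_RE.finditer(text):
        word, number, skip, other = m.groups()
        if word == "set":
            tokens.append(Token("SET", None))          # no value needed
        elif word is not None:
            tokens.append(Token("SYM", word))          # like yylval.sval
        elif number is not None:
            tokens.append(Token("INT", int(number)))   # like yylval.intval
        elif other is not None:
            tokens.append(Token(other, None))
    return tokens


def parse_program(tokens):
    """Match SET SYM INT newline and run the action on the token values."""
    kinds = [t.kind for t in tokens]
    if kinds != ["SET", "SYM", "INT", "\n"]:
        raise SyntaxError(f"unexpected tokens: {kinds}")
    values = [t.value for t in tokens]
    # values[1] and values[2] correspond to $2 and $3 (yacc counts from 1).
    return "set %s %d" % (values[1], values[2])


print(parse_program(tokenize("set x 55\n")))  # → set x 55
```

The SET token and the newline still occupy positions 1 and 4 even though they carry no value, which is the point of "everything counts".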
20Now let's start building
A Tcl script is a string containing one or more commands. A Tcl command is terminated by a newline or a semicolon. Blank lines are ignored. We'll not worry about comments.

    set x 99
    set y 52
    set z 21
    set p 88

Give me a grammar for this! First just deal with newlines.
21How to do it...
Note: when doing bottom-up parsing (like yacc), use left recursion!

    commands : commands '\n' command
             | command
             ;
    command  : /* empty */
             | SET SYM INT    { printf("set %s %d\n", $2, $3); }
             ;

Now add the semicolons...
22How to do it...
    commands : commands '\n' command
             | commands ';' command
             | command
             ;
    command  : /* empty */
             | SET SYM INT    { printf("set %s %d\n", $2, $3); }
             ;
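What this grammar accepts can be checked with a few lines of Python. This is a hand-written sketch, not generated parser code: the left-recursive commands rule turns into a loop that collects commands left to right, and the empty command alternative is what lets blank lines through. The function name parse_commands and the tuple format are assumptions for the example.

```python
import re

SYM = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
INT = re.compile(r"[0-9]+")


def parse_commands(script):
    """Return a list of ("set", name, value) for each command in script."""
    commands = []
    # commands : commands '\n' command | commands ';' command | command
    for text in re.split(r"[\n;]", script):
        words = text.split()
        if not words:                      # the empty `command` alternative
            continue
        if (len(words) == 3 and words[0] == "set" and words[1] != "set"
                and SYM.fullmatch(words[1]) and INT.fullmatch(words[2])):
            commands.append(("set", words[1], int(words[2])))
        else:
            raise SyntaxError(f"expected 'set SYM INT', got {text!r}")
    return commands


script = "set x 99\n\nset y 52; set z 21\nset p 88\n"
print(parse_commands(script))
# → [('set', 'x', 99), ('set', 'y', 52), ('set', 'z', 21), ('set', 'p', 88)]
```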
23expr
    expr 7 + 3 * 42

expr evaluates a math expression and returns the result. Let's just handle *, /, +, -, <, >, ==, (, ). Assume numbers only for now. What should be the precedence?
24
    command : EXPR expr          { printf("expr\n"); }
            ;
    expr    : expr1
            | expr '<' expr1
            | expr '>' expr1
            | expr EQ expr1
            ;
    expr1   : expr2
            | expr1 '+' expr2
            | expr1 '-' expr2
            ;
    expr2   : expr3
            | expr2 '*' expr3
            | expr2 '/' expr3
            ;
    expr3   : '(' expr ')'
            | INT
            ;

I'll not worry about rules for now...
How would you do unary minus?
25
    command : EXPR expr          { printf("expr\n"); }
            ;
    expr    : expr1
            | expr '<' expr1
            | expr '>' expr1
            | expr EQ expr1
            ;
    expr1   : expr2
            | expr1 '+' expr2
            | expr1 '-' expr2
            ;
    expr2   : expr3
            | expr2 '*' expr3
            | expr2 '/' expr3
            ;
    expr3   : '-' expr3
            | '(' expr ')'
            | INT
            ;
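One way to convince yourself that the layered nonterminals encode precedence is to evaluate with them. The sketch below is a hand-written recursive-descent evaluator in Python, a different technique from the bottom-up parser yacc builds, with one method per nonterminal. Left-recursive rules are turned into loops, integer division is assumed for '/', and all names (ExprParser, evaluate, and so on) are invented for the example.

```python
import operator
import re

TOKEN_RE = re.compile(r"\s*(?:([0-9]+)|(==|[-+*/<>()]))")


def tokenize(text):
    """Split an expression into ints and operator strings."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character at offset {pos}")
        tokens.append(int(m.group(1)) if m.group(1) else m.group(2))
        pos = m.end()
    return tokens


class ExprParser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def left_assoc(self, operand, ops):
        # A left-recursive rule `a : a OP b | b` becomes a loop.
        value = operand()
        while self.peek() in ops:
            value = ops[self.take()](value, operand())
        return value

    def expr(self):    # lowest precedence: < > ==
        return self.left_assoc(self.expr1, {
            "<": lambda a, b: int(a < b),
            ">": lambda a, b: int(a > b),
            "==": lambda a, b: int(a == b),
        })

    def expr1(self):   # + -
        return self.left_assoc(self.expr2, {"+": operator.add, "-": operator.sub})

    def expr2(self):   # * / (integer division assumed)
        return self.left_assoc(self.expr3, {"*": operator.mul, "/": operator.floordiv})

    def expr3(self):   # highest precedence: unary minus, parentheses, INT
        tok = self.take()
        if tok == "-":
            return -self.expr3()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if isinstance(tok, int):
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    parser = ExprParser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("7 + 3 * 42"))   # → 133
print(evaluate("10 - 4 - 3"))   # → 3
```

The deeper a nonterminal sits in the chain expr, expr1, expr2, expr3, the tighter its operators bind, and the loops give left associativity, so 10 - 4 - 3 is (10 - 4) - 3.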
26Next...
    set x [expr 3 + 5]
    expr 3 + [expr 2 * 6]

In Tcl, whatever you put inside brackets is executed as a command and returns a result. Brackets can be used in expressions. Note that any command can be in brackets. This is valid Tcl:

    set x [set y 10]

Note: set x 3 + 5 is NOT valid Tcl syntax.
27Solution....
    command : SET SYM '[' command ']'
            ;
    expr3   : '[' command ']'
            ;
28Variables
To get the value of a variable, put a $ in front of it:

    set y $x
    expr 3 + $y
29Keep going...
    command : SET SYM '$' SYM
            ;
    expr3   : '$' SYM
            ;
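The three forms of set seen so far (an integer, a $ variable, or a bracketed command) can be tried out with a small recursive sketch in Python. It covers only the set command, stores variables in a plain dict, and relies on the fact that set returns the value it assigned; the names tokenize, command, and run are made up for the example.

```python
import re

TOKEN_RE = re.compile(r"\s*([a-zA-Z_][a-zA-Z0-9_]*|[0-9]+|[\[\]$])")


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def command(tokens, variables):
    """command : SET SYM INT | SET SYM '$' SYM | SET SYM '[' command ']'

    Consumes tokens from the front of the list and returns the value set.
    """
    if len(tokens) < 3 or tokens.pop(0) != "set":
        raise SyntaxError("expected: set SYM value")
    name = tokens.pop(0)
    tok = tokens.pop(0)
    if tok == "$":                       # variable reference
        if not tokens:
            raise SyntaxError("expected a variable name after '$'")
        value = variables[tokens.pop(0)]
    elif tok == "[":                     # nested command runs first
        value = command(tokens, variables)
        if not tokens or tokens.pop(0) != "]":
            raise SyntaxError("expected ']'")
    elif tok.isdigit():
        value = int(tok)
    else:
        raise SyntaxError(f"unexpected token {tok!r}")
    variables[name] = value
    return value


def run(text, variables):
    tokens = tokenize(text)
    value = command(tokens, variables)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return value


variables = {}
run("set x [set y 10]", variables)
run("set z $x", variables)
print(variables)  # → {'y': 10, 'x': 10, 'z': 10}
```

The inner command is evaluated before the outer set completes, so y is assigned before x.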
30while
    while {$i > 7} {set i [expr $i + 1]}
31
What we have so far...

    commands : commands '\n' command
             | commands ';' command
             | command
             ;
    command  : /* empty */
             | SET SYM INT              { printf("set %s %d\n", $2, $3); }
             | SET SYM '$' SYM          { printf("set %s %s\n", $2, $4); }
             | SET SYM '[' command ']'
             | EXPR expr                { printf("expr\n"); }
             ;
    expr     : expr1
             | expr '<' expr1
             | expr '>' expr1
             | expr EQ expr1
             ;
    expr1    : expr2
             | expr1 '+' expr2
             | expr1 '-' expr2
             ;
    expr2    : expr3
             | expr2 '*' expr3
             | expr2 '/' expr3
             ;
    expr3    : '-' expr3
             | '(' expr ')'
             | INT
             | '[' command ']'
             | '$' SYM
             ;

    while {$i > 7} {set i [expr $i + 1]}
32
if expr1 ?then? body1 elseif expr2 ?then? body2 elseif ... ?else? ?bodyN?

The if command evaluates expr1 as an expression (in the same way that expr evaluates its argument). The value of the expression must be a boolean (a numeric value, where 0 is false and anything else is true, or a string value such as true or yes for true and false or no for false); if it is true then body1 is executed by passing it to the Tcl interpreter. Otherwise expr2 is evaluated as an expression and if it is true then body2 is executed, and so on. If none of the expressions evaluates to true then bodyN is executed.

The then and else arguments are optional noise words to make the command easier to read. There may be any number of elseif clauses, including zero. BodyN may also be omitted as long as else is omitted too. The return value from the command is the result of the body script that was executed, or an empty string if none of the expressions was non-zero and there was no bodyN.
    if {$i > 7} {set p 7} {set p 9}

    if {$i > 7} {
        set x 8
    } elseif {$i > 9} then {
        set y 7
    } else {
        set p 99
    }
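The selection logic described above can be sketched in a few lines of Python. This is only a model of the control flow, not of Tcl itself: conditions are passed as zero-argument callables so that later ones are not evaluated once a body has been chosen, bodies are plain strings, and only the boolean strings named in the text are handled. The names is_true and choose_body are hypothetical.

```python
def is_true(value):
    """0 is false and other numbers are true; a few strings are accepted."""
    if value in ("true", "yes"):
        return True
    if value in ("false", "no"):
        return False
    return float(value) != 0       # raises ValueError for anything else


def choose_body(clauses, else_body=None):
    """Return the body that `if` would execute, or None if there is none.

    clauses is a list of (condition, body) pairs: the first pair is the
    `if` part and the rest are `elseif` parts.
    """
    for condition, body in clauses:
        if is_true(condition()):   # evaluated lazily, in order
            return body
    return else_body


i = 8
clauses = [(lambda: i > 7, "set x 8"), (lambda: i > 9, "set y 7")]
print(choose_body(clauses, "set p 99"))  # → set x 8
```

With i = 8 the first condition is true, so the second condition is never evaluated and the else body is ignored.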
33
if expr1 ?then? body1 elseif expr2 ?then? body2 elseif ... ?else? ?bodyN?

    command : IF '{' expr '}' '{' commands '}' ifrest    { printf("if\n"); }
            ;
    ifrest  : /* empty */
            | '{' commands '}'                           { printf("else\n"); }
            ;
34
if expr1 ?then? body1 elseif expr2 ?then? body2 elseif ... ?else? ?bodyN?

    command : IF '{' expr '}' '{' commands '}' ifrest    { printf("if\n"); }
            ;
    ifrest  : /* empty */
            | '{' commands '}'                           { printf("else\n"); }
            | ELSEIF '{' expr '}' '{' commands '}' ifrest
            ;
35break
incr varName ?increment?