1
Compiler Construction
  • Main source
  • Compiler Construction lecture notes,
  • Prof. Trevor Mudge and Prof. Mark Hodges,
  • University of Michigan

2
Lexical Analyzer Generators: Lex/Flex
  • Lex helps write programs whose control flow is
    directed by instances of regular expressions in
    the input stream.
  • Flex is a fast scanner generator tool
  • Used for automatic generation of scanners
  • Hand-coded scanners are faster,
  • but tedious to write and error-prone!
  • Lex/Flex:
  • given a specification of regular expressions,
  • generates a table-driven FSA
  • Output is a C program that you compile to produce
    your scanner

3
Lexical Analyzer Generators: Lex/Flex
  • Lex source is a table of regular expressions and
    corresponding program fragments.
  • The table is translated to a program which reads
    an input stream, copying it to an output stream
    and partitioning the input into strings which
    match the given expressions.
  • As each such string is recognized the
    corresponding program fragment is executed.
  • The recognition of the expressions is performed
    by a deterministic finite automaton (DFA)
    generated by Lex.
  • The program fragments written by the user are
    executed in the order in which the corresponding
    regular expressions occur in the input stream.
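
As a sketch of such a table of regular expressions and corresponding program fragments, a minimal Flex specification might look like the following (the token names and actions are illustrative, not taken from the lecture):

```lex
%{
#include <stdio.h>
%}
%%
[0-9]+                  { printf("NUMBER(%s)\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*  { printf("IDENT(%s)\n", yytext); }
[ \t\n]+                { /* skip whitespace */ }
.                       { printf("OTHER(%s)\n", yytext); }
%%
int main(void) { return yylex(); }
int yywrap(void) { return 1; }
```

Running `flex` on this file produces a C scanner (`lex.yy.c`) in which the patterns above have been compiled into a DFA; the brace-enclosed fragments run each time their pattern matches.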

4
How Does Lex Work?
(Diagram: Regular Expressions go into the FLEX "box";
C code comes out. Inside the box, NFA and DFA
construction takes place.)
5
How Does Lex Work?
(Diagram: inside Flex, RE → NFA, NFA → DFA, then DFA
optimization. From the REs for the tokens, the generated
DFA simulation turns the character stream into a token
stream, plus errors.)
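
The DFA simulation step can be sketched in C as a small table-driven automaton. This hypothetical `classify` routine (an illustration, not Flex's generated code) labels a lexeme as an identifier, a number, or an error:

```c
/* Hypothetical DFA classifying a lexeme as an identifier
   ([a-z][a-z0-9]*), a number ([0-9]+), or an error.
   Characters are first mapped to classes, then the transition
   table is indexed by (state, class) -- the table-driven idea. */
enum { START, IDENT, NUMBER, DEAD };

static int step(int state, char c) {
    /* character class: 0 = letter, 1 = digit, 2 = other */
    int cls = (c >= 'a' && c <= 'z') ? 0
            : (c >= '0' && c <= '9') ? 1 : 2;
    static const int table[4][3] = {
        /* letter  digit   other */
        {  IDENT,  NUMBER, DEAD },  /* START  */
        {  IDENT,  IDENT,  DEAD },  /* IDENT  */
        {  DEAD,   NUMBER, DEAD },  /* NUMBER */
        {  DEAD,   DEAD,   DEAD },  /* DEAD   */
    };
    return table[state][cls];
}

/* Run the DFA over a lexeme; IDENT and NUMBER are accepting. */
int classify(const char *s) {
    int state = START;
    for (; *s; s++) state = step(state, *s);
    return state;
}
```

A real generated scanner also tracks the last accepting state so it can return the longest match, but the transition-table core is the same.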
6
Regular Expression to NFA
  • An NFA can be constructed from any regular
    expression by an algorithm. For example,
  • Thompson's construction builds the NFA
    inductively:
  • defines rules for each base RE,
  • combines rules for more complex REs

(Figure: the general machine for an expression E, with
start state s and final state f.)
more on this in supplementary lectures
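
To illustrate the inductive idea, the fragment below hand-builds the Thompson-style NFA for the regex `a|b` (a split state, two single-character branches, and a join state) and simulates it with epsilon-closures. The state layout and names are assumptions made for this example:

```c
#include <string.h>

/* Hand-built Thompson NFA for a|b.
   Each state has up to two epsilon successors (-1 = none)
   and at most one labeled transition (label/to). */
#define NSTATES 6
static const int eps[NSTATES][2] = {
    {1, 3},    /* 0: split, epsilon into both branches */
    {-1, -1},  /* 1: expects 'a' */
    {5, -1},   /* 2: end of 'a' branch, epsilon to join */
    {-1, -1},  /* 3: expects 'b' */
    {5, -1},   /* 4: end of 'b' branch, epsilon to join */
    {-1, -1},  /* 5: overall accept */
};
static const char label[NSTATES] = {0, 'a', 0, 'b', 0, 0};
static const int to[NSTATES]     = {-1, 2, -1, 4, -1, -1};

/* Add state s and everything epsilon-reachable from it. */
static void closure(int s, int *on) {
    if (s < 0 || on[s]) return;
    on[s] = 1;
    closure(eps[s][0], on);
    closure(eps[s][1], on);
}

/* Simulate the NFA; returns 1 iff the accept state is reached. */
int nfa_match(const char *in) {
    int cur[NSTATES] = {0}, nxt[NSTATES];
    closure(0, cur);
    for (; *in; in++) {
        memset(nxt, 0, sizeof nxt);
        for (int s = 0; s < NSTATES; s++)
            if (cur[s] && label[s] == *in)
                closure(to[s], nxt);
        memcpy(cur, nxt, sizeof cur);
    }
    return cur[5];
}
```

The subset construction (NFA → DFA) that Lex performs is exactly this set-of-states simulation done once, ahead of time, with each reachable set becoming one DFA state.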
7
Syntax Analysis: Parser
  • Checks the input stream for syntactic correctness
  • Provides the framework for subsequent semantic
    processing
  • Implemented as a pushdown automaton (PDA)
  • A pushdown automaton is a finite state automaton
    augmented with a stack.
  • Lots of variations:
  • hand-coded,
  • table-driven (top-down or bottom-up)
  • For any non-trivial language, writing a correct
    parser is a challenge!
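
A minimal sketch of the PDA idea, assuming a single-state automaton whose stack checks balanced parentheses and brackets (a language a plain FSA cannot recognize):

```c
/* Single-state PDA sketch: the stack records open delimiters;
   the input is accepted iff every close matches the top of the
   stack and the stack is empty at the end. */
int balanced(const char *s) {
    char stack[256];
    int top = 0;                         /* stack pointer */
    for (; *s; s++) {
        if (*s == '(' || *s == '[') {
            if (top == 256) return 0;    /* overflow: reject */
            stack[top++] = *s;           /* push */
        } else if (*s == ')') {
            if (top == 0 || stack[--top] != '(') return 0;
        } else if (*s == ']') {
            if (top == 0 || stack[--top] != '[') return 0;
        }
    }
    return top == 0;                     /* accept iff stack empty */
}
```

A table-driven parser works the same way, except the stack holds grammar symbols or LR states instead of raw characters.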

8
Parser Generator Tools: Yacc/Bison
  • Yacc (yet another compiler-compiler):
  • given a context-free grammar,
  • generates a parser for that language (again, a C
    program)
  • Bison is a general-purpose parser generator that
    converts a description of a context-free grammar
    into a C program that parses that grammar.
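
As a sketch, the rules section of a Yacc/Bison grammar for simple sums might look like this (the token name and rules are illustrative, not from the lecture):

```yacc
%token NUMBER
%%
expr : expr '+' term   { $$ = $1 + $3; }   /* semantic action */
     | term
     ;
term : NUMBER
     ;
```

Each production pairs a grammar rule with a C action; `$$` is the value of the left-hand side and `$1`, `$3` the values of the matched symbols, mirroring the pattern/fragment pairing of a Lex specification.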

9
Static Semantic Analysis
  • Involves several distinct actions:
  • check definitions of identifiers and ascertain
    that their usage is correct
  • disambiguate overloaded operators
  • translate from source to IR (intermediate
    representation)
  • The standard formalism for defining the
    application of semantic rules is
  • the attribute grammar (AG)

10
Static Semantic Analysis: Attribute Grammar (AG)
  • An AG provides a graph for the migration of
    information around the parse tree.
  • The AG defines the information that must be in
    the parse tree in order to successfully perform
    semantic analysis.
  • This information is stored as attributes of the
    nodes of the tree.
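
A sketch of the simplest case, synthesized attributes, assuming a tiny hand-built expression tree: each node's `val` attribute is computed from its children's attributes by a semantic rule, as an AG would specify:

```c
/* Expression-tree node carrying a synthesized attribute "val".
   op is '+', '*', or 0 for a leaf whose val is given directly. */
typedef struct Node {
    char op;
    int val;                 /* synthesized attribute */
    struct Node *l, *r;
} Node;

/* Bottom-up attribute evaluation: a node's val is defined by
   a semantic rule over its children's val attributes. */
int evaluate(Node *n) {
    if (n->op == 0) return n->val;        /* leaf: attribute given */
    int a = evaluate(n->l), b = evaluate(n->r);
    n->val = (n->op == '+') ? a + b : a * b;  /* semantic rule */
    return n->val;
}
```

Inherited attributes flow the other way (parent to child, e.g. a declared type into a list of identifiers); a full AG evaluator schedules both kinds over the tree.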

11
Revisit the General Structure of a Modern Compiler
(Diagram: the compiler pipeline.)
Front end:
  • Source Program → Lexical Analysis (Scanner) →
    Syntax Analysis (Parser), consulting the Context,
    Symbol Table, and CFG
  • Build high-level IR → Semantic Analysis →
    high-level IR to low-level IR conversion
Back end:
  • Control-flow/Data-flow analysis → Optimization →
    Code Generation → Assembly Code
  • Machine-independent assembly is mapped to
    machine-dependent assembly
12
Front-end
  • So far we have looked at some basic aspects of
    the front-end compiler phases, dealing with
  • statements, loops, etc.
  • These statements must then be broken down into
    multiple assembly statements.

13
Back end
  • Machine-independent assembly code involves
  • three-address code (TAC)
  • Each TAC instruction can be described as a
    quadruple (operator, operand1, operand2, result).
  • Each statement has the general form
  • x = y op z
  • where x, y and z are variables, constants or
    temporary variables generated by the compiler,
    and op represents any operator, e.g. an
    arithmetic operator.
  • Infinite virtual registers, infinite resources
  • Standard opcode repertoire
  • Load/store architecture
  • Goals:
  • optimize code quality
  • map the application to real hardware
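
A minimal sketch of the quadruple representation in C (the struct and field names are illustrative):

```c
#include <stdio.h>

/* Three-address code as a quadruple:
   (operator, operand1, operand2, result). */
typedef struct {
    char op;                           /* e.g. '+', '*' */
    const char *arg1, *arg2, *result;  /* variable/temp names */
} Quad;

/* Render one quadruple in the "result = arg1 op arg2" form. */
void quad_str(const Quad *q, char *out, int n) {
    snprintf(out, (size_t)n, "%s = %s %c %s",
             q->result, q->arg1, q->op, q->arg2);
}
```

A statement like `x = y + z` is then the quadruple `('+', "y", "z", "x")`; larger expressions are broken into chains of such quadruples through compiler-generated temporaries.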

14
Dataflow and Control-flow Analysis
  • Provide the necessary information about variable
    usage and execution behavior to determine when a
    transformation is legal/illegal
  • Dataflow analysis
  • Is a process for collecting run-time information
    about data in programs without actually executing
    them.
  • Identify when variables contain interesting
    values.
  • Identify which instructions create or consume
    values.
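
One classic instance of dataflow analysis, sketched for a single basic block, is backward liveness: walking the statements in reverse, a variable is live if a later statement uses it before redefining it. The bitmask encoding and statement layout here are assumptions made for the example:

```c
/* Liveness on straight-line three-address code.
   Variables are small integers; live sets are bitmasks.
   Field value -1 means "no operand". */
typedef struct { int def, use1, use2; } Stmt;

/* live_out[i] receives the set of variables live just after
   statement i, assuming nothing is live at the end of the block. */
void liveness(const Stmt *code, int n, int *live_out) {
    int live = 0;
    for (int i = n - 1; i >= 0; i--) {
        live_out[i] = live;
        if (code[i].def  >= 0) live &= ~(1 << code[i].def);  /* kill */
        if (code[i].use1 >= 0) live |=   1 << code[i].use1;  /* gen */
        if (code[i].use2 >= 0) live |=   1 << code[i].use2;
    }
}
```

The full analysis iterates this gen/kill computation over the control flow graph until the live sets stop changing.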

15
Control Flow Analysis
  • Execution behavior caused by control statements:
  • ifs, for/while loops, gotos
  • Uses the control flow graph (CFG)
  • Source: http://en.wikipedia.org/wiki/Control_flow_graph
  • An abstract data structure representation of a
    program, maintained internally by a compiler.
  • Each node in the graph represents a basic block,
    i.e. a straight-line piece of code without any
    jumps or jump targets:
  • jump targets start a block, and jumps end a
    block. Directed edges represent jumps in the
    control flow.
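
The basic-block boundaries above can be sketched by marking "leaders": the first instruction, every jump target, and every instruction following a jump. The instruction encoding is an assumption made for this example:

```c
/* Minimal instruction encoding for leader detection. */
typedef struct {
    int is_jump;   /* instruction transfers control */
    int target;    /* index jumped to, or -1 */
} Instr;

/* leader[i] = 1 iff instruction i starts a basic block:
   the first instruction, any jump target, and any
   instruction immediately after a jump. */
void mark_leaders(const Instr *code, int n, int *leader) {
    for (int i = 0; i < n; i++) leader[i] = 0;
    if (n > 0) leader[0] = 1;                  /* first instruction */
    for (int i = 0; i < n; i++) {
        if (code[i].is_jump) {
            if (code[i].target >= 0 && code[i].target < n)
                leader[code[i].target] = 1;    /* jump target */
            if (i + 1 < n) leader[i + 1] = 1;  /* after a jump */
        }
    }
}
```

Each basic block then runs from a leader up to (but not including) the next leader, and the CFG's directed edges connect blocks ending in jumps or fall-throughs to their successor blocks.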

16
Optimization
  • Optimization is about making the code run
    faster. The main classes of optimization:
  • Classical optimizations, which involve
  • dead code elimination: remove useless code
  • common sub-expression elimination: avoid
    recomputing the same thing multiple times
  • Machine-independent:
  • useful for almost all architectures
  • Machine-dependent:
  • depends on the processor architecture
  • memory system, branches, dependencies
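
Common sub-expression elimination can be illustrated on a small fragment; `after` shows the transformation a compiler would perform internally (both functions are illustrative, not generated code):

```c
/* Before CSE: (b + c) is computed twice. */
int before(int b, int c, int d) {
    int x = (b + c) * d;
    int y = (b + c) + d;
    return x + y;
}

/* After CSE: the common sub-expression is computed once
   into a compiler-generated temporary and reused. */
int after(int b, int c, int d) {
    int t = b + c;          /* common sub-expression hoisted */
    int x = t * d;
    int y = t + d;
    return x + y;
}
```

The transformation is legal precisely because dataflow analysis shows neither `b` nor `c` is redefined between the two uses, which is why optimization depends on the analyses of the previous slides.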

17
Code Generation
  • Code generation is the mapping of
    machine-independent assembly code to the target
    architecture.
  • Takes care of virtual-to-physical binding:
  • instruction selection
  • register allocation: infinite virtual registers
    to N physical registers
  • scheduling: binding to resources
  • assembly emission
  • Machine assembly is our output;
  • the assembler and linker then take over to
    create the binary