Language Processors Review (Main Source: Martin Slade's Slides)

Date: 10/3/09

Provided by: martinan3

1
Language Processors Review. Main Source: Martin
Slade's Slides
  • So far we have looked at the language processing
    problem:
  • converting high-level language (HLL) programs,
    which are close to human thought, into machine
    language code (like Simpletron instruction
    numbers)
  • Then we looked at interpreters,
  • which convert one instruction at a time into
    actions carried out by the interpreter program
    itself
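
The idea of carrying out one instruction at a time can be sketched in Java as follows. This is only an illustration, not code from the slides: the class name `MiniInterpreter`, the four opcode numbers and the word layout (opcode * 100 + operand) are assumptions modelled loosely on Simpletron-style instruction numbers.

```java
// Hypothetical Simpletron-style interpreter: each word is opcode * 100 + operand.
// The opcode numbers below are assumptions made for this sketch.
final class MiniInterpreter {
    static final int LOAD = 20, STORE = 21, ADD = 30, HALT = 43;

    // Runs the program held in memory and returns the final accumulator value.
    static int run(int[] memory) {
        int accumulator = 0;
        int counter = 0; // address of the next instruction
        while (true) {
            int word = memory[counter++]; // fetch one instruction
            int opcode = word / 100;      // decode it
            int operand = word % 100;
            switch (opcode) {             // carry out its action straight away
                case LOAD:  accumulator = memory[operand]; break;
                case ADD:   accumulator += memory[operand]; break;
                case STORE: memory[operand] = accumulator; break;
                case HALT:  return accumulator;
                default:
                    throw new IllegalStateException("Unknown opcode " + opcode);
            }
        }
    }
}
```

For example, running the memory image `{2004, 3005, 2106, 4300, 7, 35, 0}` loads the value at address 4, adds the value at address 5, stores the sum at address 6 and halts. No machine code file is produced; the interpreter itself performs each action.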

2
Language Processors Review
  • Then we looked at assemblers as an example of a
    translator,
  • which converts the instructions of a whole file
    into machine code; the machine code is then
    loaded and run by the machine at a later time
  • Before we deal with the phases of compilers in
    some detail, this lecture looks briefly at how a
    compiler works.
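
By contrast with an interpreter, a translator converts the whole file first and runs nothing. A minimal sketch of that idea, assuming a made-up assembly syntax (`MNEMONIC operand`, one instruction per line) and the same assumed opcode numbers; `MiniAssembler` is a hypothetical name, not something from the slides:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical assembler: translates every line of the source into one
// machine word (opcode * 100 + operand) without executing anything.
final class MiniAssembler {
    // Assumed mnemonic-to-opcode table for this sketch.
    private static final Map<String, Integer> OPCODES = new HashMap<>();
    static {
        OPCODES.put("LOAD", 20);
        OPCODES.put("STORE", 21);
        OPCODES.put("ADD", 30);
        OPCODES.put("HALT", 43);
    }

    static int[] assemble(String source) {
        String[] lines = source.trim().split("\n");
        int[] machineCode = new int[lines.length];
        for (int i = 0; i < lines.length; i++) {
            String[] parts = lines[i].trim().split("\\s+");
            Integer opcode = OPCODES.get(parts[0]);
            if (opcode == null) {
                throw new IllegalArgumentException("Unknown mnemonic: " + parts[0]);
            }
            // Instructions without an operand (such as HALT) use operand 0.
            int operand = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
            machineCode[i] = opcode * 100 + operand;
        }
        return machineCode;
    }
}
```

The returned array plays the role of the machine code that is loaded and run at a later time.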

3
Compilers - Recap
  • Compilers convert all of the HLL program file
    (called the source code) into the sequence of
    machine instructions (called the object code)
    that is necessary to carry out the actions
    specified by the HLL program on some specific
    machine (called the target machine)
  • The sequence of machine language instructions is
    held in a file (called an executable file)

4
Compilers - Recap
  • The executable file can then be loaded into the
    memory of the target computer and executed
  • The machine instructions are executed directly by
    the target machine
  • Languages that are usually compiled include C,
    C++, Pascal, Cobol and Fortran

5
Phases (stages) of compilation
  • There are four phases in compilation
  • 1. Lexical analysis - divides up source code
    characters into meaningful units
  • 2. Syntax analysis - constructs a representation
    of the structure of the program that can be used
    to generate machine code

6
Phases (stages) of compilation
  • 3. Code generation - produces machine code that
    implements the effect of the source code program
    on the target machine
  • 4. Optimisation - optional - produces a more
    compact or faster executing version of the
    machine code program

7
Lexical analysis - the problem
  • When a compiler reads a source code file, it just
    reads one character value after another.
  • It does not know the significance of the
    characters, what they mean, or how they are
    grouped together.
  • This is similar to the way a young child may read
    by spelling out individual letters, without
    understanding their significance until the child
    puts them together to give a meaningful word

8
Lexical analysis - the problem
  • The lexical analysis problem is how to read in a
    sequence of characters from the source code file
    and group together the characters that, for the
    purposes of the program, belong together

9
Introduction to Lexical Analysis
  • Lexical rules define when a string of characters
    belongs together to form a lexical unit. The
    lexical units belong to a number of different
    lexical types:
  • keywords, e.g. method,
  • operators,
  • punctuation,
  • identifiers, e.g. variable and method names, and
  • constant literals, e.g. 5

10
Introduction to Lexical Analysis
  • Lexical units of these lexical types are used in
    defining what the program does;
  • they are called terminal symbols
  • There are also lexical types such as white space
    and comment text.
  • Lexical units formed from these lexical types are
    ignored;
  • they are not involved in defining what the
    program does.

11
Introduction to Lexical Analysis
  • To be useful, the information about the nature of
    the terminal symbol has to be stored for later
    use.
  • A token is used to represent information about a
    terminal symbol, for example:
  • the type of terminal symbol (lexical type)
  • the value of the symbol:
  • identifier name, literal value, etc.

12
Introduction to Lexical Analysis
  • So each terminal symbol will be represented by a
    token
  • Thus the whole program will be represented by an
    array of tokens
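
One possible way to represent a token in Java is sketched below. The names `TokenDemo`, `LexicalType` and `Token`, and the statement `count = 5;` used as input, are all assumptions chosen for illustration; the slides do not prescribe a particular layout.

```java
// A sketch of a token: the lexical type plus the value of the terminal symbol.
final class TokenDemo {
    enum LexicalType { KEYWORD, OPERATOR, PUNCTUATION, IDENTIFIER, LITERAL }

    static final class Token {
        final LexicalType type; // the lexical type of the terminal symbol
        final String value;     // identifier name, literal value, etc.

        Token(LexicalType type, String value) {
            this.type = type;
            this.value = value;
        }

        @Override
        public String toString() {
            return type + "(" + value + ")";
        }
    }

    // The array of tokens for the hypothetical statement: count = 5;
    static Token[] example() {
        return new Token[] {
            new Token(LexicalType.IDENTIFIER, "count"),
            new Token(LexicalType.OPERATOR, "="),
            new Token(LexicalType.LITERAL, "5"),
            new Token(LexicalType.PUNCTUATION, ";")
        };
    }
}
```

White space and comments get no token at all, which is why the array holds only terminal symbols.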

13
Lexical Analyser
  • The lexical analyser thus:
  • 1. reads one character at a time from the source
    code
  • 2. discards redundant characters (white space or
    comments)
  • 3. groups the other characters together into
    terminal symbols
  • 4. produces a token for each terminal symbol to
    give an array of tokens
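
The four steps above can be sketched as a small hand-written lexer. This is a simplified illustration under stated assumptions: the keyword list, the operator and punctuation character sets, the `//` comment style and the class name `SimpleLexer` are all chosen for the example, and operators are limited to a single character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SimpleLexer {
    enum Type { KEYWORD, IDENTIFIER, LITERAL, OPERATOR, PUNCTUATION }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + ":" + text; }
    }

    // Assumed keyword and symbol sets; a real language defines its own.
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("if", "else", "while"));
    private static final String OPERATORS = "+-*/=<>";
    private static final String PUNCTUATION = "(){};.,";

    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);                    // step 1: read a character
            if (Character.isWhitespace(c)) {              // step 2: discard white space
                i++;
            } else if (source.startsWith("//", i)) {      // step 2: discard a comment
                while (i < source.length() && source.charAt(i) != '\n') i++;
            } else if (Character.isLetter(c)) {           // step 3: group letters and digits
                int start = i;
                while (i < source.length()
                        && Character.isLetterOrDigit(source.charAt(i))) i++;
                String word = source.substring(start, i);
                Type type = KEYWORDS.contains(word) ? Type.KEYWORD : Type.IDENTIFIER;
                tokens.add(new Token(type, word));        // step 4: produce a token
            } else if (Character.isDigit(c)) {            // step 3: group digits
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) i++;
                tokens.add(new Token(Type.LITERAL, source.substring(start, i)));
            } else if (OPERATORS.indexOf(c) >= 0) {
                tokens.add(new Token(Type.OPERATOR, String.valueOf(c)));
                i++;
            } else if (PUNCTUATION.indexOf(c) >= 0) {
                tokens.add(new Token(Type.PUNCTUATION, String.valueOf(c)));
                i++;
            } else {
                throw new IllegalArgumentException(
                        "Lexical error: unexpected character '" + c + "' at position " + i);
            }
        }
        return tokens;
    }
}
```

Calling `SimpleLexer.tokenize("total = count + 5; // sum")` yields six tokens (identifier, operator, identifier, operator, literal, punctuation); the spaces and the comment leave no trace in the result.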

14
Lexical Analyser
  • e.g. lexical analysis of code
  • if (ch y)
  • x Integer.parseInt(input)
  • gives sequence of terminal symbols
  • if - ( - ch - - - y - - ) - x - - Integer
    - . - parseInt - ( - input - ) -
  • If the lexical analyser finds characters in the
    input stream that are invalid, this is flagged as
    a lexical error, e.g. a double with two or more
    decimal points
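
The decimal point check could look something like the sketch below. The helper `scanNumber` and the choice to report the error by throwing an exception are assumptions for illustration; a real lexical analyser may instead record the error and carry on.

```java
final class NumberScanner {
    // Scans a numeric literal that starts at index 'start' and returns its text.
    // A literal containing more than one decimal point is a lexical error.
    static String scanNumber(String source, int start) {
        int i = start;
        int points = 0;
        while (i < source.length()
                && (Character.isDigit(source.charAt(i)) || source.charAt(i) == '.')) {
            if (source.charAt(i) == '.') {
                points++;
            }
            i++;
        }
        String lexeme = source.substring(start, i);
        if (points > 1) {
            throw new IllegalArgumentException(
                    "Lexical error: malformed number '" + lexeme + "'");
        }
        return lexeme;
    }
}
```

Here `scanNumber("3.14 + x", 0)` returns `3.14`, while `scanNumber("1.2.3", 0)` is rejected as a lexical error.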

15
Syntax
  • Syntax is another word for the grammar of a
    language.
  • The syntax of a language is the set of rules
    that specify how the various lexical units of the
    language can be put together to form a program.
  • e.g. in an if statement, the conditional
    expression to be tested is placed inside a pair
    of brackets after the if

16
Syntax
  • Syntax rules are important.
  • They are necessary to make the meaning of
    statements clear, e.g.
  • the condition to be tested in an if statement is
    to be found within the brackets following the
    if, and not somewhere else

17
Syntax analysis - the problem
  • We now have a sequence of tokens but, as with the
    sequence of characters, the sequence of tokens
    needs some extra work before it can represent
    the program code.
  • We need to know what sort of statement a given
    series of tokens defines, so we can later use
    this information in order to produce machine code
    that will execute the actions specified by the
    statement.

18
Syntax analysis - the problem
  • So the syntax analysis problem is how to break up
    (analyse) the series of tokens into the syntactic
    structures of the program and represent that
    structure for later use
  • A parse tree is used for this

19
Syntax Analysis
  • The structure used to represent the syntactic
    structure of a program is a parse tree.
  • A tree structure is used because syntactic
    structures form hierarchies. For example, given
    the following fragment of code:
  • if (x < 10)
  • y y 2
  • x is a variable, which belongs to a conditional
    expression (along with other things like < and 10)

20
Syntax Analysis
  • The conditional expression in turn forms part of
    an if statement
  • The if statement will be part of some other
    structure
  • e.g. method body, which in turn is part of a
    class body, etc.
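
The hierarchy described on these slides can be sketched as a tree of nodes. The `Node` class, the node labels and the fragment `if (x < 10) y = 0;` are assumptions chosen for this illustration; the statement body in particular is a simplified stand-in, not the one from the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ParseTreeDemo {
    // One node of the parse tree: a label plus zero or more child nodes.
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = new ArrayList<>(Arrays.asList(children));
        }

        // Renders the subtree, indenting each level by two more spaces.
        String render(String indent) {
            StringBuilder out = new StringBuilder(indent).append(label).append('\n');
            for (Node child : children) {
                out.append(child.render(indent + "  "));
            }
            return out.toString();
        }
    }

    // Builds the tree for the hypothetical fragment: if (x < 10) y = 0;
    static Node example() {
        Node condition = new Node("conditional-expression",
                new Node("identifier x"),
                new Node("operator <"),
                new Node("literal 10"));
        Node body = new Node("assignment",
                new Node("identifier y"),
                new Node("literal 0"));
        return new Node("if-statement", condition, body);
    }
}
```

Printing `ParseTreeDemo.example().render("")` shows `identifier x` nested under `conditional-expression`, which is nested under `if-statement`. In a full parse tree the if statement would itself be a child of a method body node, and so on upwards.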