Title: Why study compilers?
1. Why study compilers?
- Ties together many things you already know
  - Theory (finite automata, grammars)
  - Data structures
  - Modularization
  - Utilization of software tools
- You might build a parser.
- The theory of computation and formal languages still applies today.
  - As long as we still program with 1-D text.
- Helps you become a better programmer
2. One-dimensional Text
int x;
cin >> x;
if(x>5) cout << "Hello";
else cout << "BOO";
The formatting has no impact on the meaning of the program:
int x;cin >> x;if(x>5) cout << "Hello"; else cout << "BOO";
3. What is a translator?
- Takes input (SOURCE) and produces output (TARGET)
SOURCE -> translator -> TARGET, plus ERROR output
4. Types of Target Code
- Pure machine code
  - No operating system required.
  - No library routines.
  - Good for developing software for new hardware.
- Augmented code
  - More common
  - Executable code relies on OS-provided support and library routines loaded as the program is prepared to execute.
5. Conventional Translator
skeletal source program -> preprocessor -> source program -> compiler -> target assembly program -> assembler -> relocatable machine code -> loader / linker -> absolute machine code
(the loader / linker also takes in the library and relocatable object files)
6. Types of Target Code (cont.)
- Virtual code
  - Code consists entirely of virtual instructions.
  - Used by retargetable compilers
  - Transporting to a new platform only requires implementing a virtual machine on the new hardware.
  - Similar to interpreters
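The idea of virtual code can be sketched as a tiny stack machine. Everything here is an assumption made for illustration: the instruction set (`Push`, `Add`, `Mul`), the `Instr` struct, and the `run` function are hypothetical and do not correspond to any real virtual machine. The sketch also assumes the code it is given is well formed (enough operands on the stack).

```cpp
#include <vector>

// Hypothetical virtual instruction set for a tiny stack machine.
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    int operand;   // used only by Push
};

// The virtual machine. Moving to new hardware means reimplementing
// only this function; the virtual code itself stays unchanged.
int run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    for (const Instr& in : code) {
        switch (in.op) {
        case Op::Push:
            stack.push_back(in.operand);
            break;
        case Op::Add:
        case Op::Mul: {
            int rhs = stack.back(); stack.pop_back();
            int lhs = stack.back(); stack.pop_back();
            stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
            break;
        }
        }
    }
    return stack.back();   // result is left on top of the stack
}
```

For example, the virtual code Push 1, Push 2, Push 3, Mul, Add evaluates 1 + 2 * 3. The loop in `run` is also why virtual code is described as similar to interpreters: each virtual instruction is decoded and carried out at run time.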
7. Translator for Java
Java source code -> Java compiler -> Java bytecode
Java bytecode -> Java interpreter
Java bytecode -> Bytecode compiler -> absolute machine code
8. Types of Translators
- Compilers
- Conventional (textual source code)
- Imperative, ALGOL-like languages
- Other paradigms
- Interpreters
- Macro processors
- Text formatters
- Silicon compilers
9. Types of Translators (cont.)
- Visual programming language
- Interface
- Database
- User interface
- Operating System
11. Structure of Compilers
Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure -> Semantic Analysis -> Intermediate Representation -> Optimizer -> Code Generator -> Target machine code
Symbol Table (consulted by the phases)
12. Structure of Compilers
Source Program -> Lexical Analyzer (scanner) -> Tokens
What about white space? Does it matter?
int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO";
int x ; cin >> x ; if ( x > 5 ) cout << "Hello" ; else cout << "BOO" ;
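A minimal sketch of why white space does not matter to the later phases. It assumes a hypothetical helper `scan` (not part of the slides) that recognizes only identifiers, numbers, string literals, `<<`, `>>`, and single-character symbols. White space separates tokens and is then discarded.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split source text into token strings, discarding white space.
std::vector<std::string> scan(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }   // white space only separates tokens
        std::size_t start = i;
        if (std::isalnum(c) || c == '_') {        // identifier, keyword, or number
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
        } else if (c == '"') {                    // string literal, up to the closing quote
            ++i;
            while (i < src.size() && src[i] != '"') ++i;
            if (i < src.size()) ++i;
        } else if ((c == '<' || c == '>') && i + 1 < src.size() && src[i + 1] == src[i]) {
            i += 2;                               // two-character operators << and >>
        } else {
            ++i;                                  // any other single-character symbol
        }
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}
```

Under these assumptions, `scan` returns the same token sequence for both layouts of the program, and that sequence is all the parser ever sees.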
13. Tokenize First or as needed?
int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO";
Tokens: meaningful units in a program, stored as value/type pairs
- int: datatype
- x: ID
- cin
- >>: symbol
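The value/type pairing can be sketched as a small struct plus a classifier. The names `Token`, `TokenType`, and `classify` are hypothetical, and the category list is an assumption. In particular, the slide gives no type for `cin`, so this sketch simply treats it as an ordinary identifier.

```cpp
#include <cctype>
#include <string>

enum class TokenType { Datatype, Keyword, Id, Number, Symbol };

struct Token {
    std::string value;   // the text as it appears in the source
    TokenType type;      // the category the parser works with
};

// Hypothetical classifier: assigns a type to an already separated token string.
Token classify(const std::string& text) {
    if (text == "int" || text == "char" || text == "double")
        return {text, TokenType::Datatype};
    if (text == "if" || text == "else")
        return {text, TokenType::Keyword};
    unsigned char first = static_cast<unsigned char>(text.empty() ? ' ' : text[0]);
    if (std::isdigit(first)) return {text, TokenType::Number};
    if (std::isalpha(first) || first == '_') return {text, TokenType::Id};
    return {text, TokenType::Symbol};   // operators and punctuation
}
```

The parser mostly looks at `type` (any ID is acceptable where an identifier is expected), while `value` is kept for the symbol table and for error messages.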
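14. Tokenize First or as needed?
Array<Array<int>> someArray
Tokens: Array, <, int, >> (the closing brackets are scanned as one >> token)
Array<Array<int> > someArray
Tokens: Array, <, int, >, > (the space yields two separate > tokens)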
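The difference between the two strategies can be sketched as follows. All names (`nextToken`, `tokenizeFirst`, `tokenizeAsNeeded`) are hypothetical, and the "parser" is only a stand-in that counts unclosed `<` brackets. Treating every `<` as the start of a template argument list is a deliberate simplification that a real compiler could not make.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical on-demand scanner: the caller says whether it is currently
// inside a template argument list. Returns "" at end of input.
std::string nextToken(const std::string& src, std::size_t& pos, bool inTemplateArgs) {
    while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
    if (pos >= src.size()) return "";
    std::size_t start = pos;
    if (std::isalnum(static_cast<unsigned char>(src[pos]))) {
        while (pos < src.size() && std::isalnum(static_cast<unsigned char>(src[pos]))) ++pos;
    } else if (src[pos] == '>' && !inTemplateArgs &&
               pos + 1 < src.size() && src[pos + 1] == '>') {
        pos += 2;                       // longest match: >> becomes one token
    } else {
        ++pos;                          // single-character symbol
    }
    return src.substr(start, pos - start);
}

// Tokenize first: no parser context is available, so >> is always one token.
std::vector<std::string> tokenizeFirst(const std::string& src) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    for (std::string t = nextToken(src, pos, false); !t.empty();
         t = nextToken(src, pos, false))
        out.push_back(t);
    return out;
}

// Tokenize as needed: the stand-in parser tracks open '<' and passes that down.
std::vector<std::string> tokenizeAsNeeded(const std::string& src) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    int depth = 0;                      // number of unclosed '<'
    for (;;) {
        std::string t = nextToken(src, pos, depth > 0);
        if (t.empty()) break;
        if (t == "<") ++depth;
        else if (t == ">" && depth > 0) --depth;
        out.push_back(t);
    }
    return out;
}
```

With the input `Array<Array<int>> someArray`, `tokenizeFirst` ends the brackets with a single `>>` token, while `tokenizeAsNeeded` produces two `>` tokens even though no space was written.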
15. Structure of Compilers
Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure (Parse Tree)
16. Parse Tree (Parser)
Program
  Data Declaration
    datatype: int
    ID: x
  (leaves for the input statement: cin, >>)
17. Who is responsible for errors?
- int xy
- int 32xy
- 45b
- 45ab
- x = x @ y
Lexical Errors / Token Errors?
18. Who is responsible for errors?
Syntax errors
19. Who is responsible for errors?
- 45ab
  - One wrong token?
  - Two tokens (45 ab)? Is white space needed?
- Either way is okay.
  - The lexical analyzer can catch the illegal token (45ab).
  - The parser can catch the syntax error. Most likely 45 followed by ab will not be syntactically correct.
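Both choices for 45ab can be sketched side by side. The names `Lexeme`, `scanStrict`, and `scanSplit` are hypothetical, and both scanners handle only alphanumeric runs, skipping every other character, to keep the example short.

```cpp
#include <cctype>
#include <string>
#include <vector>

struct Lexeme {
    std::string text;
    bool legal;          // false when the scanner itself rejects the token
};

static bool isAlnum(char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0; }
static bool isDigit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

// Choice 1: read the whole alphanumeric run as one token and flag it as
// illegal if it starts with a digit but also contains letters.
std::vector<Lexeme> scanStrict(const std::string& src) {
    std::vector<Lexeme> out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (!isAlnum(src[i])) { ++i; continue; }        // skip separators in this sketch
        std::size_t start = i;
        bool allDigits = true;
        while (i < src.size() && isAlnum(src[i])) {
            if (!isDigit(src[i])) allDigits = false;
            ++i;
        }
        bool startsWithDigit = isDigit(src[start]);
        out.push_back({src.substr(start, i - start), !startsWithDigit || allDigits});
    }
    return out;
}

// Choice 2: a number ends at the first non-digit, so 45ab becomes 45 and ab.
// Both tokens are legal; rejecting the sequence is left to the parser.
std::vector<Lexeme> scanSplit(const std::string& src) {
    std::vector<Lexeme> out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (!isAlnum(src[i])) { ++i; continue; }
        std::size_t start = i;
        if (isDigit(src[i])) {
            while (i < src.size() && isDigit(src[i])) ++i;
        } else {
            while (i < src.size() && isAlnum(src[i])) ++i;
        }
        out.push_back({src.substr(start, i - start), true});
    }
    return out;
}
```

`scanStrict("45ab")` reports one illegal token, which gives a precise lexical error message. `scanSplit("45ab")` reports two legal tokens and relies on the grammar having no rule in which a number is directly followed by an identifier.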
20. Structure of Compilers
Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure -> Semantic Analysis
int x; cin >> x; if(x>5) x = "SHERRY"; else cout << "BOO";
Symbol Table (used by semantic analysis)
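The program above is lexically and syntactically fine; the problem is that x was declared as int and is then assigned a string. A minimal sketch of how a symbol table lets semantic analysis catch this, assuming a hypothetical `SymbolTable` alias and `checkAssignment` function, and assuming the right-hand side is always a literal:

```cpp
#include <map>
#include <string>

enum class Type { Int, String };

// Hypothetical symbol table: maps each declared name to its declared type.
using SymbolTable = std::map<std::string, Type>;

// Type of a literal. A simplification: anything that does not start with a
// double quote is treated as an integer literal.
Type literalType(const std::string& lit) {
    if (!lit.empty() && lit.front() == '"') return Type::String;
    return Type::Int;
}

// Check the assignment `name = literal;`.
// Returns an error message, or an empty string if the assignment is fine.
std::string checkAssignment(const SymbolTable& table,
                            const std::string& name,
                            const std::string& literal) {
    auto it = table.find(name);
    if (it == table.end()) return "undeclared variable " + name;
    if (it->second != literalType(literal)) return "type mismatch in assignment to " + name;
    return "";
}
```

In this sketch the declaration `int x;` would add the entry x: Int to the table, and checking `x = "SHERRY"` against that entry would then report a type mismatch.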
22. Structure of Compilers
Front-end: Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure -> Semantic Analysis -> Intermediate Representation
Back-end: Intermediate Representation -> Optimizer -> Code Generator -> Target machine code
Symbol Table (consulted by the phases)
23. Translation Steps
- Recognize when input is available.
- Break input into individual components.
- Merge individual pieces into meaningful structures.
- Process structures.
- Produce output.
24. Translation (Compilers) Steps
- Break input into individual components. (lexical analysis)
- Merge individual pieces into meaningful structures. (parsing)
- Process structures. (semantic analysis)
- Produce output. (code generation)
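The steps above can be sketched end to end for a deliberately tiny language: integer expressions with + and *. Everything here is hypothetical (`lex`, `Translator`, `compile`, and the PUSH/ADD/MUL output format), and the semantic analysis step is left out because expressions over integers alone leave nothing to check.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Lexical analysis: break the input into numbers and single-character symbols.
std::vector<std::string> lex(const std::string& src) {
    std::vector<std::string> tokens;
    for (std::size_t i = 0; i < src.size();) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i++;
        if (std::isdigit(c))
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}

// Parsing and code generation for the grammar
//   expr -> term ('+' term)*      term -> number ('*' number)*
// Code is emitted as soon as each structure has been recognized.
class Translator {
public:
    explicit Translator(std::vector<std::string> tokens) : toks_(std::move(tokens)) {}

    std::string translate() {
        expr();
        if (pos_ != toks_.size())
            throw std::runtime_error("syntax error: unexpected " + toks_[pos_]);
        return out_;
    }

private:
    void expr() {
        term();
        while (peek() == "+") { ++pos_; term(); emit("ADD"); }
    }
    void term() {
        number();
        while (peek() == "*") { ++pos_; number(); emit("MUL"); }
    }
    void number() {
        const std::string t = peek();
        if (t.empty() || !std::isdigit(static_cast<unsigned char>(t[0])))
            throw std::runtime_error("syntax error: number expected");
        ++pos_;
        emit("PUSH " + t);
    }
    std::string peek() const { return pos_ < toks_.size() ? toks_[pos_] : ""; }
    void emit(const std::string& instr) { out_ += instr + "\n"; }

    std::vector<std::string> toks_;
    std::size_t pos_ = 0;
    std::string out_;
};

std::string compile(const std::string& src) { return Translator(lex(src)).translate(); }
```

Calling `compile("1 + 2 * 3")` yields the lines PUSH 1, PUSH 2, PUSH 3, MUL, ADD. The multiplication is emitted before the addition because `term` finishes before `expr` resumes, so operator precedence falls out of the grammar's structure.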
25. Compilers
- Two major tasks
  - Analysis of source
  - Synthesis of target
- Syntax-directed translation
  - Compilation process driven by the syntactic structure of the source being translated
26. Interpreters
- Execute the source program without explicitly translating it to target code.
- Control and memory management reside in the interpreter, not the user program.
- Allow
  - Modification of the program as it executes.
  - Dynamic typing of variables
  - Portability
- Huge overhead (time and space)
27. Structure of Interpreters
Source Program + Data -> Interpreter -> Program Output
28. Misc. Compiler Discussions
- History of Modern Compilers
- Front and Back ends
- One pass vs. Multiple passes
- Compiler Construction Tools
  - Compiler-compilers, compiler-generators, translator-writing systems
  - Scanner generator
  - Parser generator
  - Syntax-directed engines
  - Automatic code generator
  - Dataflow engines