Title: Compiler Design Chapter 5
1Compiler Design - Chapter 5
Semantic Analysis
2Semantic Analysis
- se-man-tic: of or relating to meaning in language
- Webster's Dictionary
3Semantic Analysis
- The compilation process is driven by the
syntactic structure of the program as discovered
by the parser.
- Semantic routines interpret the meaning of the program based on its syntactic structure
- finish analysis by deriving context-sensitive information
- begin synthesis by generating the IR or target code
- Semantic analysis connects variable definitions to their uses, checks that each expression has the correct type, etc.
4Semantic Analysis
- What semantic issues can be determined?
- Is X declared before it is used?
- Are any names declared but not used?
- Which declaration of X does this reference?
- Is an expression type-consistent?
- Do the dimensions of a reference match the
declaration?
- Is an array reference in bounds?
5Symbol Tables
- Symbol Tables (Environments)
- Mapping IDs to Types and Locations
- Definitions -> Insert in the table
- Use -> Lookup ID
- Scope
- Where the IDs are visible
- Ex: formal parameters and local variables in MiniJava are visible only inside the method where they are defined
See Figure 5.7 on page 111
6Changing Scope
- Identifiers come into scope at the beginning of a
subprogram/block and go out of scope at the end.
- Example (in C)
- void testfunc ()
- {
-   int a;                  // a enters scope
-   for (int b = 1; b ...)  // b enters scope
-   {
-     int c;                // c enters scope
-   }
-   // b,c leave scope
- }
- // a leaves scope
7Binding
- As the declarations of types, variables, and
functions are processed, identifiers are bound to
meanings in the symbol table.
- A symbol table (environment) is a set of
bindings.
- Example
- environment s0 contains the bindings {g -> string, a -> int}
- a is an integer variable
- g is a string variable
8Environments
- Initial Env s0
- class C {
-   int a; int b; int c;
-   // Env s1 = s0 + {a -> int, b -> int, c -> int}
-   public void m() {
-     System.out.println(a+c);
-     int j = a+b;
-     // Env s2 = s1 + {j -> int}
-     String a = "hello";
-     // Env s3 = s2 + {a -> String}
-     System.out.println(a);
-     System.out.println(j);
-     System.out.println(b);
-   }
-   // Env s1
- }
- // Env s0
- Bindings in the right-hand table override those in the left
- a -> String takes precedence in s3
9Implementing Environments
- Functional Style
- Keep the previous env and create a new one
- When the scope ends, discard the new env and go back to the old one
- Imperative Style
- Destructively update the env (symbol table)
- Modify s1 until it becomes s2.
- While s2 exists, we cannot look things up in s1
- Undo: needs an undo stack
- A symbol is pushed on the undo stack when it is added to the environment
- At the end of a scope, symbols are popped from the undo stack and their latest binding is removed from s
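The two styles can be sketched in Java as follows. This is only an illustration under simplifying assumptions: the class names FunEnv and ImpEnv are hypothetical, identifiers and types are plain strings, and the scope marker is an invented sentinel. It is not the book's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Functional style: extending an environment builds a new object that
// points back at the old one, so the old environment stays usable.
final class FunEnv {
    static final FunEnv EMPTY = new FunEnv(null, null, null);
    private final String id;    // identifier bound by this frame
    private final String type;  // its type
    private final FunEnv rest;  // the environment being extended

    private FunEnv(String id, String type, FunEnv rest) {
        this.id = id; this.type = type; this.rest = rest;
    }

    FunEnv extend(String id, String type) { return new FunEnv(id, type, this); }

    // The newest binding is found first, so it overrides older ones.
    String lookup(String name) {
        for (FunEnv e = this; e != EMPTY; e = e.rest) {
            if (e.id.equals(name)) return e.type;
        }
        return null;
    }
}

// Imperative style: one table is updated in place, and an undo stack
// records which identifiers to unbind when a scope ends.
final class ImpEnv {
    private static final String MARK = "<scope>"; // hypothetical marker, not a legal identifier
    private final Map<String, Deque<String>> table = new HashMap<>();
    private final Deque<String> undo = new ArrayDeque<>();

    void beginScope() { undo.push(MARK); }

    void put(String id, String type) {
        table.computeIfAbsent(id, k -> new ArrayDeque<>()).push(type);
        undo.push(id);
    }

    String lookup(String id) {
        Deque<String> bindings = table.get(id);
        return (bindings == null) ? null : bindings.peek();
    }

    // Pop identifiers back to the most recent marker, removing the
    // latest binding of each one.
    void endScope() {
        while (!undo.isEmpty()) {
            String id = undo.pop();
            if (id.equals(MARK)) return;
            table.get(id).pop();
        }
    }
}

class EnvDemo {
    public static void main(String[] args) {
        FunEnv s1 = FunEnv.EMPTY.extend("a", "int").extend("b", "int").extend("c", "int");
        FunEnv s2 = s1.extend("j", "int");
        FunEnv s3 = s2.extend("a", "String");
        System.out.println(s3.lookup("a")); // String
        System.out.println(s1.lookup("a")); // int, because s1 is untouched
    }
}
```

In the functional version s1 can still be consulted after s3 exists; in the imperative version the older state only comes back after endScope() has undone the newer bindings.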
10Multiple Symbol Tables ML-style
- There can be several active environments at once: each module / class / record has a symbol table s of its own
- structure M = struct
-   structure E = struct
-     val a = 5
-   end                  (* N is compiled in s0 + s2 *)
-   structure N = struct
-     val b = 10
-     val a = E.a + b
-   end                  (* D is compiled in s0 + s2 + s4 *)
-   structure D = struct
-     val d = E.a + N.a
-   end
- end                    (* result: M -> s7 *)
Initial Env s0
s1 = {a -> int}            s2 = {E -> s1}
s3 = {b -> int, a -> int}  s4 = {N -> s3}
s5 = {d -> int}            s6 = {D -> s5}
s7 = s2 + s4 + s6
11Multiple Symbol Tables Java-style
- Java allows forward references (e.g., D.d is legal inside N), so E, N, and D are all compiled in s7, and the result is M -> s7
package M;
class E { static int a = 5; }                            // compiled in s7
class N { static int b = 10; static int a = E.a + b; }   // compiled in s7
class D { static int d = E.a + N.a; }                    // compiled in s7
Initial Env s0
s1 = {a -> int}            s2 = {E -> s1}
s3 = {b -> int, a -> int}  s4 = {N -> s3}
s5 = {d -> int}            s6 = {D -> s5}
s7 = s2 + s4 + s6
12Efficient Imperative Symbol Tables
- Imperative-style environments are usually implemented using hash tables
- The operation s' = s + {a -> t} is implemented by inserting t in the hash table with key a
- Program 5.2 (p106): the ith bucket is a linked list of all the elements whose keys hash to i mod SIZE
- A hash table with external chaining (linked lists) is good for deletion
- delete a -> t to recover s at the end of the scope of a
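A chained hash table of this kind might look like the sketch below. It follows the idea described above rather than reproducing Program 5.2: the class name ChainedTable, the use of String bindings, and the reliance on String.hashCode() are assumptions made for illustration.

```java
// Sketch of an imperative symbol table built on a hash table with
// external chaining. Bindings are plain strings here for brevity.
final class ChainedTable {
    private static final class Bucket {
        final String key;
        final String binding;
        final Bucket next;
        Bucket(String key, String binding, Bucket next) {
            this.key = key; this.binding = binding; this.next = next;
        }
    }

    private static final int SIZE = 256;
    private final Bucket[] table = new Bucket[SIZE];

    // floorMod keeps the index non-negative even for negative hash codes.
    private int index(String key) { return Math.floorMod(key.hashCode(), SIZE); }

    // s' = s + {key -> binding}: push a new bucket on the front of the chain,
    // hiding (but not destroying) any older binding of the same key.
    void insert(String key, String binding) {
        int i = index(key);
        table[i] = new Bucket(key, binding, table[i]);
    }

    // The first match in the chain is the most recent binding.
    String lookup(String key) {
        for (Bucket b = table[index(key)]; b != null; b = b.next) {
            if (b.key.equals(key)) return b.binding;
        }
        return null;
    }

    // Undo the most recent insert of key. This assumes inserts are undone in
    // strict last-in, first-out order, so the bucket for key is at the head.
    void pop(String key) {
        int i = index(key);
        table[i] = table[i].next;
    }
}

class ChainedTableDemo {
    public static void main(String[] args) {
        ChainedTable s = new ChainedTable();
        s.insert("a", "int");
        s.insert("a", "String");           // inner scope shadows a
        System.out.println(s.lookup("a")); // String
        s.pop("a");                        // end of inner scope
        System.out.println(s.lookup("a")); // int
    }
}
```

Because a new bucket always goes on the front of its chain, deletion at scope exit is a single pointer update, which is why chaining suits this use.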
13Efficient Imperative Symbol Tables
[Figure: hash table s holding a -> t1, b -> t3, c -> t2]
Update: s' = s + {d -> t4}
Undo: remove d -> t4 to recover s
See Program 5.2 (p106)
14Efficient Functional Symbol Table
- Efficient Functional Approach
- s' = s + {a -> t}
- would return the new table s + {a -> t}
- i.e., compute s' = s + {a -> t} but still have s available
- If implemented with a hash table, we would have to create O(n) buckets for each scope
- Is this a good idea?
15Implementation by Search Tree
m1 = {bat -> 1, camel -> 2, dog -> 3}
m2 = m1 + {mouse -> 4}
Creating a new tree (sharing some structure with the old one)
O(log n) time using a red-black tree
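The structure sharing can be illustrated with a persistent binary search tree in which an insert copies only the nodes on the search path. The sketch below is deliberately unbalanced, so it does not give the O(log n) guarantee of a red-black tree; the class name PTree and the insertion order used to build m1 are assumptions for the example.

```java
// Immutable binary search tree: insert returns a new root and copies only
// the nodes between the root and the insertion point. Everything else is
// shared with the old tree. No rebalancing is done in this sketch.
final class PTree {
    final String key;
    final int value;
    final PTree left, right;

    PTree(String key, int value, PTree left, PTree right) {
        this.key = key; this.value = value; this.left = left; this.right = right;
    }

    static PTree insert(PTree t, String key, int value) {
        if (t == null) return new PTree(key, value, null, null);
        int c = key.compareTo(t.key);
        if (c < 0) return new PTree(t.key, t.value, insert(t.left, key, value), t.right);
        if (c > 0) return new PTree(t.key, t.value, t.left, insert(t.right, key, value));
        return new PTree(key, value, t.left, t.right); // replace an existing binding
    }

    static Integer lookup(PTree t, String key) {
        while (t != null) {
            int c = key.compareTo(t.key);
            if (c == 0) return t.value;
            t = (c < 0) ? t.left : t.right;
        }
        return null;
    }
}

class PTreeDemo {
    public static void main(String[] args) {
        PTree m1 = null;
        m1 = PTree.insert(m1, "dog", 3);
        m1 = PTree.insert(m1, "bat", 1);
        m1 = PTree.insert(m1, "camel", 2);
        PTree m2 = PTree.insert(m1, "mouse", 4);   // m2 = m1 + {mouse -> 4}

        System.out.println(PTree.lookup(m2, "mouse")); // 4
        System.out.println(PTree.lookup(m1, "mouse")); // null, m1 is unchanged
        System.out.println(m2.left == m1.left);        // true, the left subtree is shared
    }
}
```

Only the root is copied when mouse is added here; the subtree holding bat and camel is the same object in both m1 and m2.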
16Symbols
- Symbol Representation
- Program 5.2: the hash table must examine the strings themselves
- both for hash operations and for lookup
- Convert each string to a symbol (integer number)
- See Program 5.5 on page 109
- Comparing symbols (integers) for equality is fast.
- Extracting an integer hash key is fast.
- Comparing two symbols for greater-than is fast.
- See Program 5.6 on page 110 for implementation
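One way to get these three properties is to intern each string once and hand out a unique object carrying a small integer. The sketch below shows the idea only; it is not Program 5.6, and the class name Sym and its integer id field are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Each distinct string maps to exactly one Sym object, so symbols can be
// compared by reference and ordered or hashed by a small integer.
final class Sym implements Comparable<Sym> {
    private static final Map<String, Sym> TABLE = new HashMap<>();

    final String name;
    final int id; // assigned in order of first appearance

    private Sym(String name, int id) { this.name = name; this.id = id; }

    // The only way to obtain a Sym: strings are examined here and nowhere else.
    static Sym of(String name) {
        Sym s = TABLE.get(name);
        if (s == null) {
            s = new Sym(name, TABLE.size());
            TABLE.put(name, s);
        }
        return s;
    }

    @Override public int hashCode() { return id; }                        // fast hash key
    @Override public int compareTo(Sym other) { return Integer.compare(id, other.id); } // fast ordering
    @Override public String toString() { return name; }
}

class SymDemo {
    public static void main(String[] args) {
        Sym x1 = Sym.of("x");
        Sym x2 = Sym.of(new String("x")); // a different String object, same text
        Sym y = Sym.of("y");
        System.out.println(x1 == x2); // true
        System.out.println(x1 == y);  // false
    }
}
```

After interning, equality is a pointer comparison and no later table operation needs to look at the characters of the name again.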
17Some sample program(I)
- /**
-  * The Table class is similar to java.util.Dictionary,
-  * except that each key must be a Symbol and there is
-  * a scope mechanism.
-  */
- public class Table {
-   private java.util.Dictionary dict = new java.util.Hashtable();
-   private Symbol top;
-   private Binder marks;
-   public Table() {}
18Some sample program(II)
- package Symbol;
- class Binder {
-   Object value;
-   Symbol prevtop;
-   Binder tail;
-   Binder(Object v, Symbol p, Binder t) {
-     value = v; prevtop = p; tail = t;
-   }
- }
[Figure: a Binder holding value v, prevtop p, and tail t]
19Some sample program(III)
- /**
-  * Puts the specified value into the Table,
-  * bound to the specified Symbol.
-  */
- public void put(Symbol key, Object value) {
-   dict.put(key, new Binder(value, top, (Binder)dict.get(key)));
-   top = key;
- }
- /**
-  * Gets the object associated with the specified
-  * symbol in the Table.
-  */
- public Object get(Symbol key) {
-   Binder e = (Binder)dict.get(key);
-   if (e == null) return null;
-   else return e.value;
- }
20Some sample program(IV)
- /**
-  * Remembers the current state of the Table.
-  */
- public void beginScope() {
-   marks = new Binder(null, top, marks);
-   top = null;
- }
- /**
-  * Restores the table to what it was at the most recent
-  * beginScope that has not already been ended.
-  */
- public void endScope() {
-   while (top != null) {
-     Binder e = (Binder)dict.get(top);
-     if (e.tail != null) dict.put(top, e.tail);
-     else dict.remove(top);
-     top = e.prevtop;
-   }
-   top = marks.prevtop;
-   marks = marks.tail;
- }
21Type-Checking in MiniJava
- Binding for type-checking in MiniJava
- Variable and formal parameter
- Variable name -> type of variable
- Method
- Method name -> result type, parameters (including position information), local variables
- Class
- Class name -> variables, method declarations, parent class
22Symbol Table example
- See Figure 5.7 on page 111
- Primitive types
- int -> IntegerType()
- boolean -> BooleanType()
- Other types
- int[] -> IntArrayType()
- class -> IdentifierType(String s)
23SymbolTable for MiniJava
- class SymbolTable {
-   public SymbolTable();
-   public boolean addClass(String id, String parent);
-   public Class getClass(String id);
-   public boolean containsClass(String id);
-   public Type getVarType(Method m, Class c, String id);
-   public Method getMethod(String id, String classScope);
-   public Type getMethodType(String id, String classScope);
-   public boolean compareTypes(Type t1, Type t2);
- }
24SymbolTable for MiniJava
- getVarType(Method m, Class c, String id)
- In c.m, find variable id
- Local variable in method
- Parameter in parameter list
- Variable in the class
- Variable in the parent class
- getMethod(), getMethodType()
- May be defined in the parent classes
- compareTypes()
- Primitive types: int, boolean, IntArrayType
- Subtype: IdentifierType
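The lookup order listed for getVarType can be sketched as below. The classes here (MethodInfo, ClassInfo, VarLookup) are simplified, hypothetical stand-ins for the Method, Class and SymbolTable classes of the slides, with types represented as strings, and the sketch assumes the inheritance chain has no cycles.

```java
import java.util.HashMap;
import java.util.Map;

final class MethodInfo {
    final Map<String, String> params = new HashMap<>(); // parameter name -> type
    final Map<String, String> locals = new HashMap<>(); // local variable name -> type
}

final class ClassInfo {
    final String parent;                                 // null when there is no superclass
    final Map<String, String> fields = new HashMap<>();  // field name -> type
    ClassInfo(String parent) { this.parent = parent; }
}

final class VarLookup {
    final Map<String, ClassInfo> classes = new HashMap<>(); // class name -> class

    // Resolve id as seen from method m of class c. Returns null if id is
    // not declared anywhere along the search path.
    String getVarType(MethodInfo m, ClassInfo c, String id) {
        if (m != null) {
            if (m.locals.containsKey(id)) return m.locals.get(id); // 1. local variable
            if (m.params.containsKey(id)) return m.params.get(id); // 2. parameter
        }
        // 3. field of the class, then 4. fields of the parent classes
        for (ClassInfo k = c; k != null;
             k = (k.parent == null) ? null : classes.get(k.parent)) {
            if (k.fields.containsKey(id)) return k.fields.get(id);
        }
        return null;
    }
}

class VarLookupDemo {
    public static void main(String[] args) {
        VarLookup st = new VarLookup();
        ClassInfo parent = new ClassInfo(null);
        parent.fields.put("x", "int");
        ClassInfo child = new ClassInfo("P");
        child.fields.put("a", "int");
        st.classes.put("P", parent);
        st.classes.put("C", child);

        MethodInfo m = new MethodInfo();
        m.params.put("a", "boolean");

        System.out.println(st.getVarType(m, child, "a")); // boolean, the parameter hides the field
        System.out.println(st.getVarType(m, child, "x")); // int, inherited from P
    }
}
```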
25SymbolTable Class
- class Class {
-   public Class(String id, String parent);
-   public String getId();
-   public Type type();
-   public boolean addMethod(String id, Type type);
-   public Method getMethod(String id);
-   public boolean containsMethod(String id);
-   public boolean addVar(String id, Type type);
-   public Variable getVar(String id);
-   public boolean containsVar(String id);
-   public String parent();
- }
26SymbolTable Variable
- class Variable {
-   public Variable(String id, Type type);
-   public String id();
-   public Type type();
- }
27SymbolTable Method
- class Method {
-   public Method(String id, Type type);
-   public String getId();
-   public Type type();
-   public boolean addParam(String id, Type type);
-   public Variable getParamAt(int i);
-   public Variable getParam(String id);
-   public boolean containsParam(String id);
-   public boolean addVar(String id, Type type);
-   public Variable getVar(String id);
-   public boolean containsVar(String id);
- }
28Type-Checking Two Phases
- Phase I - Build Symbol Table
- Phase II - Type-check statements and expressions
- public class Main {
-   public static void main(String[] args) {
-     try {
-       Program root = new MiniJavaParser(System.in).Goal();
-       BuildSymbolTableVisitor v1 = new BuildSymbolTableVisitor();
-       root.accept(v1);
-
-       root.accept(new TypeCheckVisitor(v1.getSymTab()));
-     }
-     catch (ParseException e) {
-       System.out.println(e.toString());
-     }
-   }
- }
29BuildSymbolTableVisitor()
- Phase I - implemented by a visitor that visits nodes in a MiniJava syntax tree and builds a symbol table
- See Program 5.8 on Page 112
- public class BuildSymbolTableVisitor extends TypeDepthFirstVisitor {
-
-   private Class currClass;
-   private Method currMethod;
-
-   // Type t;
-   // Identifier i;
-   public Type visit(VarDecl n) {
-
-     Type t = n.t.accept(this);
-     String id = n.i.toString();
30BuildSymbolTableVisitor() - Contd
-     if (currMethod == null) {
-       if (!currClass.addVar(id, t)) {
-         System.out.println(id + " is already defined in "
-                            + currClass.getId());
-         System.exit(-1);
-       }
-     } else {
-       if (!currMethod.addVar(id, t)) {
-         System.out.println(id + " is already defined in "
-                            + currClass.getId() + "."
-                            + currMethod.getId());
-         System.exit(-1);
-       }
-     }
-     return null;
-   }
31BuildSymbolTableVisitor() TypeVisitor()
- public Type visit(MainClass n)
- public Type visit(ClassDeclSimple n)
- public Type visit(ClassDeclExtends n)
- public Type visit(VarDecl n)
- public Type visit(MethodDecl n)
- public Type visit(Formal n)
- public Type visit(IntArrayType n)
- public Type visit(BooleanType n)
- public Type visit(IntegerType n)
- public Type visit(IdentifierType n)
-
32TypeCheckVisitor(SymbolTable)
- Phase II - implemented by a visitor that type-checks all statements and expressions.
- See Program 5.9 on page 113
- package visitor;
- import syntaxtree.*;
- public class TypeCheckVisitor extends DepthFirstVisitor {
-   static Class currClass;
-   static Method currMethod;
-   static SymbolTable symbolTable;
-
-   public TypeCheckVisitor(SymbolTable s) {
-     symbolTable = s;
-   }
33TypeCheckVisitor(SymbolTable) - Contd
-
-   // Identifier i;
-   // Exp e;
-   public void visit(Assign n) {
-     Type t1 = symbolTable.getVarType(currMethod, currClass,
-                                      n.i.toString());
-     Type t2 = n.e.accept(new TypeCheckExpVisitor());
-     if (symbolTable.compareTypes(t1, t2) == false) {
-       System.out.println("Type error in assignment to "
-                          + n.i.toString());
-       System.exit(0);
-     }
-   }
- }
34TypeCheckExpVisitor()
- package visitor;
- import syntaxtree.*;
- public class TypeCheckExpVisitor extends TypeDepthFirstVisitor {
-
-   // Exp e1,e2;
-   public Type visit(Plus n) {
-     if (! (n.e1.accept(this) instanceof IntegerType) ) {
-       System.out.println("Left side of Plus must be of type integer");
-       System.exit(-1);
-     }
-     if (! (n.e2.accept(this) instanceof IntegerType) ) {
-       System.out.println("Right side of Plus must be of type integer");
-       System.exit(-1);
-     }
-     return new IntegerType();
-   }
- }
35TypeCheckVisitor Visitor()
- public void visit(MainClass n)
- public void visit(ClassDeclSimple n)
- public void visit(ClassDeclExtends n)
- public void visit(MethodDecl n)
- public void visit(If n)
- public void visit(While n)
- public void visit(Print n)
- public void visit(Assign n)
- public void visit(ArrayAssign n)
-
36TypeCheckExpVisitor() TypeVisitor()
- public Type visit(And n)
- public Type visit(LessThan n)
- public Type visit(Plus n)
- public Type visit(Minus n)
- public Type visit(Times n)
- public Type visit(ArrayLookup n)
- public Type visit(ArrayLength n)
- public Type visit(Call n)
- public Type visit(IntegerLiteral n)
- public Type visit(True n)
- public Type visit(False n)
- public Type visit(IdentifierExp n)
- public Type visit(This n)
- public Type visit(NewArray n)
- public Type visit(NewObject n)
- public Type visit(Not n)
37Overloading of Operators, ...
- When operators are overloaded, the compiler must explicitly generate the code for the type conversion.
- integer + integer
- double + double
- integer + double?
- For an assignment statement, both sides must have the same type.
- When we allow extension of classes, the right-hand side may be a subtype of the left-hand side.
- e.g., a Parent variable ObjP can be assigned a Child object ObjC
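The assignment rule with subclasses can be sketched as a walk up the inheritance chain from the right-hand side's class. The class SubtypeCheck, its parentOf map, and the string encoding of types are hypothetical; this is not the compareTypes() of the slides, only an illustration of the rule.

```java
import java.util.HashMap;
import java.util.Map;

final class SubtypeCheck {
    // child class name -> parent class name (classes without a parent are absent)
    final Map<String, String> parentOf = new HashMap<>();

    static boolean isPrimitive(String t) {
        return t.equals("int") || t.equals("boolean") || t.equals("int[]");
    }

    // True when a value of type rhs may be assigned to a variable of type lhs.
    boolean assignable(String lhs, String rhs) {
        // int, boolean and int[] must match exactly.
        if (isPrimitive(lhs) || isPrimitive(rhs)) return lhs.equals(rhs);
        // Class types: rhs must be lhs itself or one of its descendants.
        for (String t = rhs; t != null; t = parentOf.get(t)) {
            if (t.equals(lhs)) return true;
        }
        return false;
    }
}

class SubtypeCheckDemo {
    public static void main(String[] args) {
        SubtypeCheck types = new SubtypeCheck();
        types.parentOf.put("Child", "Parent");
        System.out.println(types.assignable("Parent", "Child")); // true
        System.out.println(types.assignable("Child", "Parent")); // false
        System.out.println(types.assignable("int", "boolean"));  // false
    }
}
```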
38Method Calls e.m()
- Look up the method in the SymbolTable to get its parameter list and result type
- Find m in the class of e
- The parameter types must be matched against the actual arguments.
- The result type becomes the type of the method call as a whole.
- Etc., etc.
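The steps above can be sketched as follows. All names here (MethodSig, CallChecker, addMethod, checkCall) are hypothetical, types are strings, and arguments are required to match parameter types exactly; a fuller checker would also accept subtypes as arguments.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MethodSig {
    final String resultType;
    final List<String> paramTypes; // in declaration order
    MethodSig(String resultType, String... paramTypes) {
        this.resultType = resultType;
        this.paramTypes = Arrays.asList(paramTypes);
    }
}

final class CallChecker {
    // class name -> (method name -> signature)
    private final Map<String, Map<String, MethodSig>> methods = new HashMap<>();
    final Map<String, String> parentOf = new HashMap<>();

    void addMethod(String cls, String name, MethodSig sig) {
        methods.computeIfAbsent(cls, k -> new HashMap<>()).put(name, sig);
    }

    // Find m in the receiver's class, or else in the nearest parent class.
    MethodSig findMethod(String cls, String m) {
        for (String c = cls; c != null; c = parentOf.get(c)) {
            Map<String, MethodSig> ms = methods.get(c);
            if (ms != null && ms.containsKey(m)) return ms.get(m);
        }
        return null;
    }

    // Type of the call e.m(args), where receiverType is the class of e.
    // Returns null when the method is missing or the arguments do not match.
    String checkCall(String receiverType, String m, List<String> argTypes) {
        MethodSig sig = findMethod(receiverType, m);
        if (sig == null) return null;
        if (sig.paramTypes.size() != argTypes.size()) return null;
        for (int i = 0; i < argTypes.size(); i++) {
            if (!sig.paramTypes.get(i).equals(argTypes.get(i))) return null;
        }
        return sig.resultType; // the call as a whole has the method's result type
    }
}

class CallCheckerDemo {
    public static void main(String[] args) {
        CallChecker cc = new CallChecker();
        cc.parentOf.put("Child", "Parent");
        cc.addMethod("Parent", "sum", new MethodSig("int", "int", "int"));
        System.out.println(cc.checkCall("Child", "sum", Arrays.asList("int", "int")));     // int
        System.out.println(cc.checkCall("Child", "sum", Arrays.asList("int", "boolean"))); // null
    }
}
```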
39Error Handling
- For a type error or an undeclared identifier, the compiler should print an error message.
- And it must go on.
- Recovery from type errors?
- Do as if it were correct.
- Not a big deal in the examples we will examine, but the reality is a bit more complicated.
- int i = new C();
- int j = i + i;
- ...
- enter i in the symbol table as an integer and go on
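A checker that reports the error and keeps going might be organized as in this sketch. The class RecoveringChecker and its methods are hypothetical and model only the two statements of the example, with types as strings; the checkers shown earlier in these slides call System.exit on the first error instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RecoveringChecker {
    final Map<String, String> env = new HashMap<>();  // variable name -> type
    final List<String> errors = new ArrayList<>();    // messages collected so far

    // Check "declaredType id = <initializer of type initType>;".
    void declare(String declaredType, String id, String initType) {
        if (!declaredType.equals(initType)) {
            errors.add("Type error in initialization of " + id + ": expected "
                       + declaredType + ", found " + initType);
        }
        // Recovery: enter id with its declared type anyway, so that later
        // uses of id do not trigger a cascade of further messages.
        env.put(id, declaredType);
    }

    // Type of "a + b" for two variables; both operands must be int.
    String plus(String a, String b) {
        if (!"int".equals(env.get(a))) errors.add("Left side of Plus must be of type integer");
        if (!"int".equals(env.get(b))) errors.add("Right side of Plus must be of type integer");
        return "int"; // assume int so that checking can continue
    }
}

class RecoveringCheckerDemo {
    public static void main(String[] args) {
        RecoveringChecker tc = new RecoveringChecker();
        tc.declare("int", "i", "C");                 // int i = new C();  one error
        tc.declare("int", "j", tc.plus("i", "i"));   // int j = i + i;    no further errors
        System.out.println(tc.errors.size());        // 1
    }
}
```

Only the bad initializer is reported; because i is still entered as an int, the use of i + i on the next line checks cleanly.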