Title: The Procedure Abstraction
1The Procedure Abstraction
2Procedure Abstraction
- Part of the compile-time vs. run-time interface
- a.k.a. static versus dynamic
- Most issues arise with procedures
- Issues
- Compile-time versus run-time behavior
- Finding storage for everything
- Mapping names to addresses
- Generating code to compute addresses
- Interfaces with other programs and the OS
- Efficiency of implementation
3Procedure Abstractions
- Control Abstraction
- Well-defined entries and exits
- Mechanism to return control to caller
- Parameter passing
- Clean Name Space
- Writing to locally visible names
- Local names mask identical non-locals
- Local names cannot be seen outside
- External Interface
- Access by procedure name and parameters
- Protection for both caller and callee
4The Procedure ...
- Procedures are key to building large systems
- Require system-wide agreement on
- memory layout and protection
- resource allocation
- code for calling sequences
- target architecture and O/S
- Establish a private context
- private storage for each invocation
- Encapsulate control flow and data abstractions
5The Procedure ...
- Procedures allow separate compilation
- Separate compilation allows us to build non-trivial programs
- Keeps compile times reasonable
- Lets multiple programmers collaborate
- A procedure linkage convention
- Each proc. has a valid run-time environment
- A caller's environment is restored on return
- Compiler must generate code to ensure this
6Procedure Abstraction
- A procedure is an abstract software structure
- Underlying hardware understands
- bits, bytes
- integers, reals, addresses
- Underlying hardware does not understand
- Entries and exits
- Interfaces
- Call and return mechanisms
- might be able to save context at call
- Name space
12Procedure Abstraction
- Procedures have well-defined control-flow
- Most languages allow recursion !!!
int p(a,b,c)
  int a, b, c;
{
  int d;
  d = q(c,b);
  ...
}
int q(x,y)
  int x,y;
{
  return x + y;
}
...
s = p(10,t,u);
13Procedure Abstraction
- Needs code to
- save and restore the return address
- map actual to formal parameters
- create storage for locals (and parameters)
- Compiler includes code to do this at run time
int p(a,b,c)
  int a, b, c;
{
  int d;
  d = q(c,b);
  ...
}
int q(x,y)
  int x,y;
{
  return x + y;
}
...
s = p(10,t,u);
14Procedure Abstraction
- Must preserve p's state while q executes
- recursion causes the real problem here
- keep a stack of activations
- Compiler emits code that causes all this to happen at run time
int p(a,b,c)
  int a, b, c;
{
  int d;
  d = q(c,b);
  ...
}
int q(x,y)
  int x,y;
{
  return x + y;
}
...
s = p(10,t,u);
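The stack of activations can be made visible with a small C sketch. The names `struct activation`, `stack`, `top`, `max_depth_seen`, and `fact` are illustrative assumptions, not from the slides. A real compiler emits this bookkeeping as part of the calling sequence instead of writing it in source form.

```c
#include <assert.h>

#define MAX_DEPTH 64

/* State one invocation must keep alive while its callee runs. */
struct activation {
    int n;       /* parameter of this invocation */
    int result;  /* value this invocation will return */
};

struct activation stack[MAX_DEPTH];  /* hypothetical explicit activation stack */
int top = 0;                         /* number of live activations */
int max_depth_seen = 0;              /* deepest nesting reached so far */

/* Recursive factorial that keeps its state in the explicit stack,
   so each push and pop of an activation is visible. */
int fact(int n)
{
    assert(top < MAX_DEPTH);
    struct activation *ar = &stack[top++];    /* push a fresh activation */
    ar->n = n;
    if (top > max_depth_seen)
        max_depth_seen = top;

    if (ar->n <= 1)
        ar->result = 1;
    else
        ar->result = ar->n * fact(ar->n - 1); /* ar->n survives the recursive call */

    int r = ar->result;
    top--;                                    /* pop on return */
    return r;
}
```

Calling `fact(5)` pushes five activations, one per invocation, and pops them all again before the outermost call returns.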
15Procedure Abstraction
- Each procedure call creates its own activation
- Any name can be declared locally
- Local names mask identical non-local names
- Local names can't be seen outside the procedure
- Nested procedures are "inside" the procedure that declares them
- Such rules are called lexical scoping
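C blocks follow the same masking rules, so a short sketch can show them in action. The function `scopes` and the variable values are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

int x = 10;   /* non-local x, declared at file scope */

/* Records which x is visible at three points and returns the x
   that is visible once the innermost block has been left. */
int scopes(int out[3])
{
    assert(out != NULL);
    out[0] = x;        /* no local x declared yet: the file-scope x */

    int x = 20;        /* local x masks the file-scope x from here on */
    out[1] = x;
    {
        int x = 30;    /* nested block: masks the enclosing local x */
        out[2] = x;
    }
    return x;          /* the block's x is gone; the local x is visible again */
}
```

Each use of `x` binds to the nearest enclosing declaration, and the compiler can decide which one that is without running the program.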
16Procedure Abstraction
- Why introduce lexical scoping?
- Gives a compile-time mechanism for binding free variables
- Gives rules for naming and resolves conflicts
- How can the compiler keep track of all these names?
- At point p, which declaration of x is current?
- At run-time, where is x found?
- At change of scopes, how is x deleted ?
- Symbol Tables !!!
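One way to answer the three questions above is a scoped symbol table. The sketch below assumes a simple array-based table; every name in it (`struct symbol`, `enter_scope`, `exit_scope`, `declare`, `lookup`) is hypothetical, and production compilers usually use hashing instead of a linear search.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS   128
#define MAX_SCOPES 16

struct symbol {
    const char *name;
    int level;    /* lexical nesting level of the declaring scope */
    int offset;   /* byte offset within that scope's data area */
};

struct symbol syms[MAX_SYMS];   /* visible declarations, innermost last */
int nsyms = 0;
int scope_start[MAX_SCOPES];    /* index in syms where each open scope begins */
int next_offset[MAX_SCOPES];    /* next free offset in each open scope */
int level = -1;                 /* current nesting level; -1 means none open */

void enter_scope(void)
{
    assert(level + 1 < MAX_SCOPES);
    level++;
    scope_start[level] = nsyms;
    next_offset[level] = 0;
}

/* Leaving a scope deletes its names by truncating the table. */
void exit_scope(void)
{
    assert(level >= 0);
    nsyms = scope_start[level];
    level--;
}

/* Record a declaration in the current scope and assign it an offset. */
void declare(const char *name, int size)
{
    assert(level >= 0 && nsyms < MAX_SYMS);
    syms[nsyms].name = name;
    syms[nsyms].level = level;
    syms[nsyms].offset = next_offset[level];
    next_offset[level] += size;
    nsyms++;
}

/* Search innermost-first, so a local name masks an identical non-local one. */
const struct symbol *lookup(const char *name)
{
    for (int i = nsyms - 1; i >= 0; i--)
        if (strcmp(syms[i].name, name) == 0)
            return &syms[i];
    return NULL;
}
```

`lookup` tells the compiler which declaration of a name is current, the stored level and offset say where it will be found at run time, and `exit_scope` removes the names when the scope closes.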
17Variable Locations
- Locals/Parameters (i.e. automatics)
- Kept in activation record or in a register
- Thus, lifetime matches the procedure's lifetime
- Global
- One or more named global data areas
- One per variable
- Static
- keep as a global, but add an identifying prefix
- e.g., proc.mystaticvar or file.mystaticproc
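The three storage classes can be contrasted in a few lines of C. This is a minimal sketch; the names `calls_global`, `calls_static`, `calls_auto`, and `count_call` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

int calls_global = 0;   /* global: one named location, visible everywhere */

/* Increments one counter of each storage class and reports the
   automatic and static counts through the output pointers. */
void count_call(int *auto_out, int *static_out)
{
    static int calls_static = 0;  /* static: lives for the whole run, name private to this procedure */
    int calls_auto = 0;           /* automatic: fresh storage in every activation */

    assert(auto_out != NULL && static_out != NULL);
    calls_global++;
    calls_static++;
    calls_auto++;
    *auto_out = calls_auto;
    *static_out = calls_static;
}
```

After three calls the automatic counter still reads 1, because each activation gets new storage, while the static and global counters both read 3.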
18Placement
- Better utilization if stack and heap grow toward each other
- Very old result (Knuth)
- Code and data separate or interleaved
- Uses address space, not allocated memory
Single Logical Address Space, from address 0 to high:
Code | Static & Global | Heap -> (free) <- Stack
- Code, static, and global data have known size
- Use symbolic labels in the code
- Heap and stack both grow and shrink over time
- This is a virtual address space; interface with O/S to grow it
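The utilization argument can be illustrated with a toy arena in which the heap grows up from the low end and the stack grows down from the high end. The 64-byte size and the names `arena`, `heap_alloc`, `stack_push`, `stack_pop`, and `free_bytes` are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 64

unsigned char arena[ARENA_SIZE];   /* stand-in for the data part of the address space */
size_t heap_top = 0;               /* first free byte above the heap; grows upward */
size_t stack_top = ARENA_SIZE;     /* lowest byte used by the stack; grows downward */

/* Bytes still available to either region. */
size_t free_bytes(void)
{
    return stack_top - heap_top;
}

/* Take n bytes from the low end; fails only when the regions would collide. */
void *heap_alloc(size_t n)
{
    if (n > free_bytes())
        return NULL;
    void *p = &arena[heap_top];
    heap_top += n;
    return p;
}

/* Take n bytes from the high end; same collision test. */
void *stack_push(size_t n)
{
    if (n > free_bytes())
        return NULL;
    stack_top -= n;
    return &arena[stack_top];
}

/* Give n bytes back to the free space in the middle. */
void stack_pop(size_t n)
{
    assert(n <= ARENA_SIZE - stack_top);
    stack_top += n;
}
```

Because both regions draw on the same free space in the middle, neither runs out until the whole arena is used. A fixed split would let one region fail while the other still had room.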
19The Big Picture
Compiler's view: one virtual address space per program, laid out as Code | Static & Global | Heap | Stack
OS's view: many such virtual address spaces side by side
Hardware's view: a single physical address space, from 0 to high
20Activation Records
- Need a data area per invocation
- We call this an activation record (AR)
- Compiler can also store control data here
- One AR per procedure instance
- AR can be derived from the symbol table
21Translating Local Names
- How does a compiler find an instance of x ?
- Name is translated into a static coordinate
- <level, offset> pair
- offset is unique within that scope
- level is nesting level of the procedure
- emitted code will use static coordinate
- to generate addresses
- to generate references
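The sketch below shows how a static coordinate might be turned into an address at run time. It assumes each frame keeps a link to the frame of its lexically enclosing procedure, which is only one of several possible schemes. `struct frame`, `access_link`, and `resolve` are hypothetical names, and offsets are counted in slots, not bytes.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 8

struct frame {
    int level;                  /* lexical nesting level of the owning procedure */
    struct frame *access_link;  /* frame of the lexically enclosing procedure */
    int slots[SLOTS];           /* data area of this frame */
};

/* Turn the static coordinate <level, offset> into an address, starting
   from the frame of the currently executing procedure. */
int *resolve(struct frame *current, int level, int offset)
{
    assert(current != NULL && level <= current->level);
    assert(offset >= 0 && offset < SLOTS);
    while (current->level > level) {      /* walk outward one level at a time */
        current = current->access_link;
        assert(current != NULL);
    }
    return &current->slots[offset];
}
```

The level picks the frame and the offset picks the slot within it. The compiler knows both numbers at compile time, and only the frame addresses are run-time values.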
22Storage for Blocks within a Single Procedure
- Fixed-length data can always be at a constant offset from the beginning of a procedure
- In our example, the a declared at level 0 will always be the first data element, stored at byte 0 in the fixed-length data area
- The x declared at level 1 will always be the sixth data item, stored at byte 20 in the fixed data area
- The x declared at level 2 will always be the eighth data item, stored at byte 28 in the fixed data area
- But what about the a declared in the second block at level 2?
23Variable-Length Data
- Arrays
- If size is fixed at compile time, store in fixed-length data area
- If size is variable, store descriptor in fixed-length area, with pointer to variable-length area
- Variable-length data area is assigned at the end of the fixed-length area for the block in which it is allocated
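B0: { int a, b
      ... assign value to a ...
  B1: { int v(a), b, x
    B2: { int x, y(8)
          ...
    }
  }
}
Fixed-length data, in order: a | b | v | b | x | x | y(8)
Variable-length data, placed after it: v(a)
- The variable-length area includes variable-length data for all blocks in the procedure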
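The descriptor scheme can be mimicked in C with a flexible array member. The sketch assumes 4-byte integers are not required and uses `malloc` to stand in for space that a compiler would place in the activation record. `struct descriptor`, `struct proc_data`, and `make_proc_data` are illustrative names, and `b1`, `x1`, `x2` distinguish variables that share a name in different blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fixed-size stand-in for a variable-length array. */
struct descriptor {
    size_t length;   /* number of elements, known only at run time */
    int *data;       /* points into the variable-length area */
};

/* Fixed-length data for blocks B0, B1, B2, followed by the variable area. */
struct proc_data {
    int a, b;              /* B0 */
    struct descriptor v;   /* B1: descriptor for v(a) */
    int b1, x1;            /* B1: b and x */
    int x2;                /* B2: x */
    int y[8];              /* B2: y(8), size fixed at compile time */
    int var_area[];        /* variable-length area (C99 flexible array member) */
};

/* Allocate the data area once a is known and point the descriptor at
   the storage that follows the fixed-length part. */
struct proc_data *make_proc_data(int a)
{
    assert(a > 0);
    struct proc_data *d = malloc(sizeof *d + (size_t)a * sizeof(int));
    if (d == NULL)
        return NULL;
    d->a = a;
    d->v.length = (size_t)a;
    d->v.data = d->var_area;
    return d;
}
```

Every fixed-length item, including the descriptor for `v`, keeps a compile-time offset. Only the contents of the descriptor depend on the run-time value of `a`.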
24AR Basics
One AR for each invocation of a procedure. The ARP (activation record pointer) points into this record. Fields:
- parameters: space for parameters to the current routine
- register save area: saved register contents
- return value: if function, space for return value
- return address: address to resume caller
- addressability: help with non-local access
- caller's ARP: to restore caller's AR on a return
- local variables: space for local values and variables
25AR Details
- How does the compiler find the variables?
- At known offsets from AR pointer (ARP)
- The static coordinate leads to a loadAI
- Level specifies ARP, offset is the constant
- Variable-length data
- put below local variables
- Leave a pointer to it at an offset from ARP
- Otherwise, put on the heap
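An address-immediate load such as loadAI reads memory at a base register plus a constant. The sketch below models that in C with the ARP as the base. `load_ai` and `store_ai` are hypothetical C stand-ins, and a 32-bit value size is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the 32-bit value stored at arp + offset. */
int32_t load_ai(const unsigned char *arp, int offset)
{
    int32_t value;
    assert(arp != NULL);
    memcpy(&value, arp + offset, sizeof value);   /* memcpy avoids alignment issues */
    return value;
}

/* Write a 32-bit value at arp + offset. */
void store_ai(unsigned char *arp, int offset, int32_t value)
{
    assert(arp != NULL);
    memcpy(arp + offset, &value, sizeof value);
}
```

With a 32-byte record and the ARP placed at byte 16, `store_ai(arp, 4, 42)` followed by `load_ai(arp, 4)` reads back the same value. Negative offsets reach the fields laid out on the other side of the ARP.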
26Parameter Passing
- Call-by-reference
- passes a pointer to actual parameter
- Requires slot in the AR (for address of parameter)
- Multiple names with the same address?
- Call-by-value
- passes a copy of its value at time of call
- Arrays passed by reference, not value
- Each name gets a unique location
- Requires slot in the AR
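C itself passes every argument by value, so the sketch below imitates call-by-reference by passing a pointer explicitly. The function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Call-by-value: n is a private copy, so the caller's variable is unchanged. */
void incr_by_value(int n)
{
    n = n + 1;
    (void)n;   /* the updated copy is simply discarded */
}

/* Call-by-reference, imitated with a pointer: the callee updates the
   caller's variable through its address. */
void incr_by_reference(int *n)
{
    assert(n != NULL);
    *n = *n + 1;
}

/* With by-reference parameters two names may share one address, so the
   write through a is visible when reading through b. */
int bump_and_sum(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    *a = *a + 1;
    return *a + *b;
}
```

When `bump_and_sum` receives the same address twice, the increment shows up in both operands. A compiler has to allow for this aliasing whenever parameters are passed by reference.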
27Procedure Linkages
- How do procedure calls actually work?
- At compile time, callee may not be available
- Calls may be in other compilation units
- May not know system call from user call
- All calls must use the same protocol
- Must use a standard sequence of operations
- Divides responsibility between caller and callee
- Enforces control and data abstractions
- ... Usually a system-wide agreement
28Procedure Linkages
- Standard procedure linkage
- Procedure has standard prolog and standard epilog
- Each call involves a pre-call sequence and post-return sequence
- Exact code depends on the number and type of the actual parameters
procedure p: prolog, pre-call, post-return, epilog
procedure q: prolog, epilog
(p's pre-call transfers control to q's prolog; q's epilog returns to p's post-return)
29Procedure Linkages
- Pre-call Sequence
- Sets up callee's basic AR
- Helps preserve its own environment
- The Details
- Allocate space for the callee's AR
- not space for local variables (yet)
- store each parameter's value (or address)
- Save return address and caller's ARP in callee's AR
- Save any caller-save registers into caller's AR
- Jump to address of callee's prolog code
30Procedure Linkages
- Post-return Sequence
- Finish restoring caller's environment
- Place any value back where it belongs
- The Details
- Copy return value from callee's AR
- Free the callee's AR
- Restore any caller-save registers
- Restore by-reference parameters to registers
- Copy back call-by-value/result parameters
- Continue execution after the call
31Procedure Linkages
- Prolog Code
- Finish setting up the callee's environment
- Preserve parts of the caller's environment that will be disturbed
- The Details
- Preserve any callee-save registers
- Allocate space for local data
- Easiest scenario is to extend the AR
- Find any static data areas referenced in the callee
- Handle any local variable initializations
32Procedure Linkages
- Epilog Code
- Wind up the business of the callee
- Start restoring the caller's environment
- The Details
- return value is set by the return IR code
- Restore callee-save registers
- Free space for local data
- Load return address from AR
- Restore caller's ARP
- Jump to the return address
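The four linkage pieces can be acted out in C with an explicit pool of activation records and a pretend register file. This is a sketch under several assumptions: registers 0 and 1 are treated as caller-save and 2 and 3 as callee-save, the return address is left to C's own call mechanism, and all names (`struct ar`, `pool`, `regs`, `pre_call`, `prolog`, `epilog`, `post_return`, `p_calls_q`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARS 16

struct ar {
    int params[2];
    int return_value;
    int caller_saved[2];    /* owner saves regs[0..1] here before calling out */
    int callee_saved[2];    /* owner's prolog saves regs[2..3] here */
    struct ar *caller_arp;
    int local;
};

struct ar pool[MAX_ARS];    /* ARs are allocated from here in stack order */
int live = 0;               /* number of ARs currently allocated */
struct ar *arp = NULL;      /* the current ARP */
int regs[4];                /* pretend machine registers */

/* Pre-call: runs in the caller. */
struct ar *pre_call(int a0, int a1)
{
    assert(arp != NULL && live < MAX_ARS);
    struct ar *callee = &pool[live++];     /* allocate the callee's basic AR */
    callee->params[0] = a0;                /* store each parameter's value */
    callee->params[1] = a1;
    callee->caller_arp = arp;              /* save caller's ARP in callee's AR */
    arp->caller_saved[0] = regs[0];        /* caller-save registers go into caller's AR */
    arp->caller_saved[1] = regs[1];
    return callee;
}

/* Prolog: runs first in the callee. */
void prolog(struct ar *self)
{
    arp = self;
    self->callee_saved[0] = regs[2];       /* preserve callee-save registers */
    self->callee_saved[1] = regs[3];
    self->local = 0;                       /* set up local data */
}

/* Epilog: runs last in the callee. */
void epilog(struct ar *self)
{
    regs[2] = self->callee_saved[0];       /* restore callee-save registers */
    regs[3] = self->callee_saved[1];
    arp = self->caller_arp;                /* restore caller's ARP */
}

/* Post-return: runs in the caller after control comes back. */
int post_return(struct ar *callee)
{
    int value = callee->return_value;      /* copy return value from callee's AR */
    live--;                                /* free the callee's AR */
    regs[0] = arp->caller_saved[0];        /* restore caller-save registers */
    regs[1] = arp->caller_saved[1];
    return value;
}

/* Callee: q(x, y) returns x + y and deliberately clobbers two registers. */
void q(struct ar *self)
{
    prolog(self);
    regs[0] = 111;
    regs[2] = 222;
    self->local = self->params[0] + self->params[1];
    self->return_value = self->local;
    epilog(self);
}

/* Caller: sets up a stand-in AR for p, then calls q through the linkage. */
int p_calls_q(int x, int y)
{
    assert(live < MAX_ARS);
    struct ar *self = &pool[live++];
    self->caller_arp = NULL;
    arp = self;
    regs[0] = 10;
    regs[2] = 20;

    struct ar *callee = pre_call(x, y);
    q(callee);                             /* "jump" to the callee's prolog */
    int s = post_return(callee);

    live--;                                /* release p's own stand-in AR */
    arp = NULL;
    return s;
}
```

After `p_calls_q(3, 4)` returns 7, register 0 holds 10 again because the post-return sequence restored it, and register 2 holds 20 again because the callee's epilog restored it. This is the division of responsibility that the linkage convention fixes.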