Title: Type Compatibility
1Type Compatibility
- Another type checking issue: what are equivalent
  types?
- Need to be concerned with how a type is named
  and/or defined
- Equivalence by name -> name type compatibility
- Equivalence through definition -> structure type
  compatibility
2Type Compatibility
- Name Type Compatibility
- Two variables have compatible types only if they
  are defined in declarations that use the same
  type name.
- Easy to implement
  - Checking of type bindings stored in the symbol table
- Safe
  - Different names, different meanings
3Type Compatibility
- Strict name type compatibility is very
  restrictive
- The following is an Ada declaration
  - type Indextype is range 1..100;
  - count : Integer;
  - index : Indextype;
- If Ada used strict name type compatibility, we
  could not assign count to index or vice-versa
4Type Compatibility
- Name type compatibility clearly makes sense in
  one direction
- index := count should be illegal (as Indextype is
  a subset of the integers)
- count := index should be alright, so we may need
  to augment name type compatibility to allow this
  to happen
5Type Compatibility
- Structure type compatibility
- Two variables have compatible types only if their
  types have identical structures
- Usually, for aggregates, requires that the grouping
  type is composed of the same types in the same order
  - Same memory layout
- More flexible than name compatibility, but
  difficult to implement
- Implementation
  - Instead of just checking names in the symbol
    table, the entire structure of two types must be
    compared.
6Type Bindings
- Type binding also provides us with information on
  how much memory is required for a variable
- Primitive types: a specific number of bytes
  - int x -> 4 bytes
- Aggregate types: sum over the requirements of the
  grouped primitives or other aggregates
  - int x[100] -> 400 bytes
7Memory Layouts
Note the layout of structs: it supports additive
movement through structs and arrays
8Type Compatibility
- Checking type structures
- Arrays?
- Same data-type, same length?
- Same data-type, same length, different array
  indices?
- Same data-type, different length?
- Self-referencing types?
  - struct LinkedListNode { int data; LinkedListNode *next; };
9Type Compatibility
- Checking type structures
- Are these two types compatible?
  - struct person { string name; int age; };
  - struct vehicle { string name; int numberOfTires; };
- Are these two types compatible?
  - struct PersonType1 { int age; string name; };
  - struct PersonType2 { string name; int age; };
- How do you differentiate?
  - struct person { string name; int age; };
  - struct vehicle { string name; int age; };
- Different names generally signal a desire for
  different abstractions; maybe structure
  compatibility is too flexible?
10Type Compatibility
- Languages you are using
- In general, Java and C++ use name type
  compatibility
- Plain old vanilla C, for all types except structs
  and unions, uses structural type compatibility
11C Type Compatibility
- In most cases, plain, original C uses structural
  type compatibility
- Every struct and union declaration creates a new
  type incompatible with other types, even if they
  are structurally the same.
  - Exception: structs and unions declared in separate
    files do use structural type equivalence.
- Any type defined with typedef is equivalent to
  its parent type (as typedef just provides an
  alias).
12Example of C structural compatibility vs C++ name
compatibility
Same program written in C and C++. Enums are
constructions over integers (each enum item maps
to an integer). C (structural): they're both ints,
feel free to assign away! C++ (name): two
different things (Suits and Colors).
Differences arise when compiled using the
respective compilers
13Coercion/Casting
- Most modern languages support automatic coercion
between types
- Coercion: automatic translation of one type to
  another type to fulfill type checking
  requirements
- A more specific type is translated to the more
  general type
  - int i = 2; double v = i;
  - Truck t = new Truck(); Automobile a = t;
- Guaranteed that operations defined on the more
  general type are defined on the more specific type,
  as all items of the specific type are also items of
  the general type
  - Doubles are a larger set of numbers than
    integers.
  - Automobiles are a larger set of vehicles than
    Trucks.
14Coercion/Casting
- Think of coercion as automated casting.
- Casting: programmer-specified translation of
  types
- Translation from more general to more specific
  - double c = 2.5; int i = (int)c;
  - Truck t = (Truck)myVector.elementAt(0);
- Java containers store Objects, the super-class of
  everything (most general)
15Coercion/Casting
- Moving to a more general type: widening
  conversion
- Widening is almost always safe (not losing
  information), so it is usually performed
  automatically.
- Moving to a more specific type: narrowing
  conversion
- Narrowing can lose information (preciseness of
  the number: 2.5 as a double vs 2 as an int), so it
  usually requires a programmer request.
See pg. 323 (Section 7.4) for more info on this
topic
16Coercion/Casting
- Liskov Substitution Principle
- I view it as an argument for why automatic
  coercion is safe
- "Let q(x) be a property provable about objects x
  of type T. Then q(y) should be true for objects y
  of type S where S is a subtype of T"
- From Wikipedia: based on the idea of
  substitutability; that is, if S is a subtype of
  T, then objects of type T in a program may be
  replaced with objects of type S without altering
  any of the desirable properties of that program
  (e.g. correctness)
http://en.wikipedia.org/wiki/Liskov_substitution_principle
17Implementation of casting/coercion
Casting instructions are hardware-based: a good
thing.
18Semantic checks
Why is this check required? Can it raise runtime
errors?
19Reinterpret casts
- reinterpret_cast
- static_cast
- dynamic_cast
  - Used in subtype casts; like Java,
    it does a runtime check
20Coercion/Casting
- While C/C++/Java do support automatic coercion,
  not all languages do
- An example?
  - ML
21Location Bindings
- Location binding: binding a variable to a
  particular location in memory
- Also referred to as storage binding
- Two parts of the process, with familiar terms
  - Allocation: reserving a block from a pool of
    available memory
  - Deallocation: returning a bound block to the pool
    of available memory
22Pools of Memory
- Two areas from which memory can be allocated
  - Stack
  - Heap
- The stack is allocated in a strict order
- The heap is much more like a free pool of memory
- Hope they don't collide with each other
(Diagram: heap and stack at opposite ends of the
memory allocated to the program, growing toward
each other)
Another example (Windows related):
http://www.nostarch.com/download/greatcode2_ch8.pdf
23Lifetime
- The period for which a variable is bound to a
  particular memory address is its lifetime.
- There are four common storage/lifetime bindings
  for variables
  - Static
  - Stack-dynamic
  - Explicit heap-dynamic
  - Implicit heap-dynamic
24Static Variables
- Static variables
- Bound to memory cells before execution, released
  at termination
- Compile time binding of storage
- Placed in a spot that can be accessed directly;
  the rest of the code can ask for that spot
  specifically
- Commonly allocated from (around) the heap area
- Used for global variables in languages that
  support globals, and for true static variables in
  C (where the value is persistent across function
  calls)
25Static Variables
- Fast to use: no runtime overhead to
  create/delete, fixed direct addressing
- But, if you have only static variables in your
  language, you can't support recursion
- In recursion, the same variables are used multiple
  times, and previous uses are still important and
  must be maintained
26Stack Dynamic Variables
- Run time binding of storage
- Allocated when declaration is encountered
- Deallocated when moving out of the scope of the
  declaration (no longer visible)
- Types are already statically bound via
  compilation
- Used for allocation of local variables in
  subprograms
  - Usually all local variables, even if declared in the
    middle of a function, are allocated space at the
    start of the function call
  - Deallocation when the subprogram terminates
- Allocated from the stack part of memory
- We'll talk about stack management more later
  (Chapter 10)
27Stack Dynamic Variables
- Important for recursion
- Allows each recursive call to allocate memory
  from the stack for the variable instances in that
  particular call to the subprogram
- Disadvantages
- Overhead of allocation, deallocation at runtime
  (every method call)
  - Not that expensive though: the compiler can ask for
    a whole chunk, as it can precompute the amount it
    needs to ask for (again, wait until Chapter 10)
- Requires indirect addressing (relative position in
  the stack)
  - Don't know where your method is being put on the
    stack until the method starts
- Does not allow history-sensitive variables like
  static does
28Explicit Heap-Dynamic Variables
- Allocated, deallocated explicitly by the
programmer via special instructions - Referenced through pointer/reference variables
- Indirect addressing (2 memory accesses)
- Run time binding of storage
29Explicit Heap-Dynamic Variables
- C++: new and delete statements, usable on all
  types (scalars, aggregates)
  - int *intnode;
  - intnode = new int;
  - delete intnode;
- Java: every object (instance of a class)
  - PrintWriter pw = new PrintWriter();
  - // no delete for Java; we'll see this again
    (Chapter 6)
30Explicit Heap-Dynamic Variables
- Advantages
- Useful for dynamic structures (linked lists,
  trees) that can adapt to the program's data
  requirements
- Disadvantages
- Multiple memory accesses for pointers
- Difficulty of programming with pointers correctly
- Heap management: we'll come back to this, not
  trivial (Chapter 6)
31Implicit Heap-Dynamic Variables
- Bound to heap storage only when assigned values
- Essentially, these are dynamically typed
  variables, where all features (type, value,
  location) are bound upon assignment
- JavaScript example again
  - list = [10.2, 3.5];
  - list = [47];  -> reallocated storage?
  - list = [10.2, 3.5, 28.2];  -> reallocation
    (bigger)
- While flexible, suffers from the usual dynamic
  binding problems discussed earlier, as well as the
  heap management problems mentioned on the previous
  slide
32Examples
33Constants
- Constants are interesting; two key ways of
  implementing
  - Placement in special read-only memory
  - Compiler verification: won't allow changes after
    the constant is defined
- Any guess on what C does?
34Example: Constant
35Pointers
- Pointer definition
- A data object (variable) whose value is
- The memory location of another data object, or
- Null, a general term for a pointer to nowhere
- Pointers, when available, are at the same level
as other types - Can declare pointer variables
- Can hold pointers in an array
- Can have a pointer as part of an aggregate
  data-structure (ListNode for example)
36Pointer Specifications
- Attributes
- Pointer variable name
- Type of data being pointed to
- double *myDoublePointer;
- Truck *myTruckObjectPointer;
- Values pointer can take on
- Any addressable memory address
- Usually any integer
- (64 bit architectures?)
37Pointer Specifications
- Operations: Declaration
- Sets up space for the pointer
- Could come from the stack: stack-dynamic
  (if the pointer is a local variable)
- Could come from the heap: heap-dynamic
  (if requested at runtime)
38Pointer Specifications
- Operations: Assignment with object creation
- A very common use of pointers is when objects and
  variables are heap-dynamic.
- C++ syntax
  - double *myDoublePtr = new double;
- The RHS requests allocation of a fixed size data
  object
- The return value from the new statement is the
  address of the data object just created.
- The address is stored in the pointer variable
(Diagram: the pointer variable, stored at address
2000, holds the value 2012; the actual data, 2.345,
sits at address 2012)
39Pointer Specifications
- Operations: Assignment with object creation
  - double *myDoublePtr = new double;
- Data objects created on the RHS are anonymous
  - Not bound to a name in the program
  - Thus, they can be lost as well
- Creation and assignment can be performed at
  any time during program execution
40Review Question
- Imagine this is a part of a method you are
  writing
  - List *l = new List();
- Which part(s) are stack dynamic and which are
  heap dynamic?
- Which part(s) are allocated at method startup
  and which are allocated at statement execution
  time?
41Pointer Specifications
- Operations: Dereferencing
- The operation that allows the data referenced by
  the pointer to be accessed
- In C/C++, this uses the * (asterisk) operator
- Syntactic sugar: as a shortcut to a field/method
  of an aggregate data structure, can use the ->
  (arrow) operator
  - cout << *myDoublePointer << endl;
  - cout << (*myAutomobilePointer).getYear() << endl;
  - cout << myAutomobilePointer->getYear() << endl;
42Pointer Specifications
- Operations: Assignment with addressing
- Pointers can be used to reference variables that
  aren't created using new
- To obtain the address of an arbitrary variable,
  use the addressing operator.
- C/C++: use & (ampersand)
  - int myIntVariable = 5;
  - int *myIntPointer = &myIntVariable;
  - *myIntPointer = *myIntPointer + 1;
43Pointer Specifications
- Operations: Arithmetic
- Mathematical operations that work directly on the
  pointer
  - int *ptr = new int;
  - ptr = ptr + 1;
- Changes the value of the pointer, not what is
  being referenced
- Doesn't necessarily update the address by one byte
  (as it syntactically would appear)
44Pointer Specifications
- Pointers reference a particular type
  - i.e. an integer pointer references an item 4
    bytes long
- For all fixed size data structures (primitives,
  classes, structs, unions, etc.) the compiler can
  figure out the size beforehand
- Pointer arithmetic moves the pointer up by the
  size of the data item being pointed to
  - (i.e. it moves completely over that item)
45Example
46Pointer Specifications
- Many systems implement arrays using pointers
  - int list[5]; int *ptr = list;
  - // int *list = new int[5]; (dynamic allocation)
  - cout << *ptr << endl;        // print list[0]
  - cout << *(ptr + 1) << endl;  // print list[1]
  - // print first 5 items
  - int i = 0; while (i < 5) { cout << *(ptr++) << endl; i++; }
  - void updateArray(int list[])  // array parameter
47Review Question
- What is the difference in outputs between these
  two sets of code?
  - int *list = new int[3]; list[0] = 24; list[1] = 33; list[2] = 52;
  - for (int i = 0; i < 3; i++)
  - {
  -   *list++;
  - }
  - // print list next
  - int *list = new int[3]; list[0] = 24; list[1] = 33; list[2] = 52;
  - for (int i = 0; i < 3; i++)
  - {
  -   (*list)++;
  - }
  - // print list next
48Pointer Specifications
49Pointers with Arrays
50Pointer specifications
- Operations: Object deletion through pointer
- Free up memory from dynamically allocated objects
  by calling delete on a pointer to the object
  - double *myDoublePtr = new double;
  - delete myDoublePtr;
  - Truck *myTruckPtr = new Truck();
  - delete myTruckPtr;
  - int *myIntegerArray = new int[10];
  - delete [] myIntegerArray;
51Object deletion example
52Pointer Specifications
- Most languages specify that a pointer points to a
  particular type
- Would it be reasonable to also allow a pointer
  to point to any type?
- What would this require the language to do?
53Pointer Specifications
- Object oriented languages allow a third technique
  for pointers
- Can point to any type that is a subtype of the
  original pointer type
- Doesn't hold for primitive type-subtype
  double/int relationships
What does this mean the system has to do
for pointers to objects? Runtime
bindings? Dynamic method calls?
54void pointers
- In C/C++, can use void pointers
- Syntax for dealing with pointers where the
  referenced type is unimportant
- Used in malloc, free (old school memory
  allocation)
- Used in functions where we want arbitrary types to
  be accepted
- Have to cast to the appropriate type of pointer
  before dereferencing or doing arithmetic
55void pointers Examples
- stdlib.h defines qsort (quicksort) as
  - void qsort(void *base, size_t num_elements,
    size_t element_size, int (*compare)(void const
    *a, void const *b));
- Takes an array of any type of object; you just
  have to make sure you send it how to compare
  two of those objects.
- Compare must return < 0 if a < b, 0 if a == b,
  > 0 if a > b
56void pointers Examples
57Function Pointers
- C/C++ also allow you to use function pointers
  - /* function returning pointer to int */
  - int *func(int a, float b);
  - /* pointer to function returning int */
  - int (*func)(int a, float b);
58Function Pointers
- void qsort(void *base, size_t num_elements,
  size_t element_size, int (*compare)(void const
  *a, void const *b));
- qsort(studentArray, numberOfStudents,
  sizeof(Student), compareByName);
- Array names and function names are converted to
  addresses
59Function pointers
61Problems with Pointers
- Dangling Pointer
- When a pointer continues to hold the memory
address of a heap-allocated variable that has
been deallocated. - Why is this a problem?
- Memory pointed to could now be in use by another
variable.
62Dangling Pointers
- A new variable in the old memory spot may have a
  different type than the previous one, and almost
  definitely a new value
- Type checking is going to OK everything: "oh,
  he's adding two integers through integer pointers,
  that's fine"
- But the meaning of the underlying data may be
  different
- Writing into that spot could mess up the other
  variable
63Dangling Pointers
- Code which results in dangling pointers:
  - int *p1 = new int;
  - int *p2 = p1;
  - delete p1;
- Some languages will set p1 to 0, but won't touch
  p2 (not my version of C++, though; it doesn't
  even clean up p1).
64Problems with Pointers
- Lost heap dynamic variables
- When the address of the variable being pointed to
  is lost
- Why is this a problem?
- Prevents further use of the variable (we don't
  know how to get to it)
- Can't delete and reuse that part of memory
  (already allocated)
65Lost Heap Dynamic Variables
- How do you end up with lost heap dynamic
  variables?
  - int *p1 = new int;
  - p1 = new int;
- Is there a common name for this?
- Commonly referred to as a memory leak
- See memory eaten away
66Solutions to Pointer Problems
- Ways to work around dangling pointers
- 1. Don't allow the user control (Java)
  - Requires system-controlled memory management,
    garbage collection
- 2. Safety algorithms
  - These don't prevent the problem, but prevent the
    user from fiddling with memory they shouldn't by
    throwing an error
67Safety Algorithms
- If you trust the programmer, dangling pointers
  can be resolved if
- A programmer always sets all pointers to a
  variable to null after the variable is
  de-allocated
- A system is likely to only set one to zero (the
  one through which the deletion occurred); the rest
  are up to the programmer
- Letting the system do it all: large overhead of
  recording who is pointing to whom
68Safety Algorithms
- A way to work around dangling pointers
- Tombstones: each heap-dynamic variable, when
  allocated, is also given another memory location
  called the tombstone.
- This tombstone memory location is a pointer to
  the variable.
- All user defined pointers to the variable
  actually get the address of the tombstone
69Tombstones
(Diagram: without tombstones, the user pointers
reference the data directly; with a tombstone, all
user pointers reference the tombstone, which in turn
points to the data)
70Tombstones
- When the variable is de-allocated, the tombstone
  remains and is set to null.
- Only one pointer has to be set to zero (the
  tombstone)
- All of the pointers that were pointing to the
  variable now hit the tombstone's zero value
- If a reference is made through any of those
  pointers, they refer to a zero address, which is
  an error
71Tombstones
- Tombstones work around the dangling pointer
  problem at the expense of
- An extra 4 bytes per variable allocated from the
  heap
  - Those extra 4 bytes generally can't be
    deallocated.
- An extra memory access (another layer of
  indirection) is required every time the variable
  is used
- Not found in any popular modern languages
  - Maybe it should be, with loads of fast memory
    available
72Lock and Key for Dangling Pointers
- When a variable is allocated off the heap
  - Allocate storage for the structure
  - Allocate a memory cell for an integer which holds
    a lock value
  - Return a pointer to the variable as a pair
    (integer key, integer address)
  - The key value in the pointer is set to the lock
    value
73Lock and Key
- When a pointer is copied (with an assignment
  statement),
  - copy the key and address
(Diagram: the pointer from the initial new (allocate)
statement and a copy of it both hold the same key and
the same address of the data, which sits next to its
lock in memory)
74Lock and Key
- When a pointer is de-referenced
- Verify the key stored with the pointer matches
the lock out in memory next to the item being
pointed to. - If the lock and key dont match, throw an error
- When would the lock and key not match?
75Lock and Key
- When a variable from the heap is de-allocated,
  set the lock to an illegal key value
- Overhead
  - An integer comparison to check lock and key
  - Extra space to hold the lock, which can't be
    de-allocated
  - Extra space to hold the key in each pointer
- Implemented in versions of Pascal
76Heap Management
- Heap
- The portion of computer memory from which space
  for dynamically allocated variables is taken.
- Varying levels at which the programmer can
  interact with the heap.
- Java: everything is handled for you
- C++: new, delete allow the programmer to ask for
  memory, but the system still controls how the
  heap is managed
77Heap Management
- Look at two different implementations
- Heap as a group of fixed, single size cells
  - More likely to be seen when the language system is
    requesting from the heap: implicit heap-dynamic
    languages
- Heap as a group of variable sized segments
  - Required to support programmer requests (an array
    of arbitrary size needs contiguous memory):
    explicit heap-dynamic languages
- Two primary uses
  - Obtaining memory (allocation) from the heap
  - Returning memory (de-allocation) to the heap
78Heaps Single Sized Cells
- Define a cell as a unit that contains space for
  the item of interest and a pointer
- Often implemented as a circular linked list of
  cells
- The available heap is often called the free list
(Diagram: a cell, with space labeled "Data Goes
Here" plus a pointer to the next cell)
79Heaps Single Sized Cells
- Allocation
  - Remove a cell from the front of the free list
- De-allocation
  - Attach released cells to the front of the free
    list
(Diagrams: the free list before allocation, and the
updated free list after allocation)
80Heaps Variable Sized Cells
- More applicable to most programming languages'
  needs
- General approach
- Have an "AvailableStart" pointer initially point
  to a single cell that is sized to be all of
  available free memory.
- Allocation: when a request is made
  - If the cell at the front of the list is large
    enough, break the cell into two pieces, one being
    the requested size and the other being everything
    else
  - For a while, this technique will work fine
  - If the front cell isn't large enough, what to do?
    Let's look at deallocation first
81Heaps Variable Sized Cells
- Deallocation
  - Reclaimed, variable sized cells are added onto
    the list
  - May check to see if directly adjacent neighbors
    can be coalesced together with these cells, OR we
    might wait to do this until we need to
- Allocation
  - If the front cell isn't large enough, try the next
    free block(s) on the list until we find one that
    is large enough.
82Heaps Variable Sized Cells
- This approach does entail list overhead
- Requires searching through lists to check and see
  if there is a block of appropriate size available
- May hit a point where we only have lots of small
  blocks sitting around
  - Requires joining small blocks that were from
    adjacent parts of memory back together
- Any over-allocations also waste space
- Does this sound familiar? (CSC 241)
  - Internal/external fragmentation
83Heaps Variable Sized Cells
- Implementation questions
- Do you
  - Take the first block that is big enough to handle
    the request (first-fit)?
  - Look for the "best fit" block, which could
    require looking at every block (best-fit)?
    - Costly, tends to leave small leftover blocks
- Do you keep the list of different size blocks in
  sorted order by size?
84Heaps Allocation
- Costs of heap search are based on the number of
  items in the heap (linear)
- Some languages maintain heaps for different size
  requests. Why?
  - Searching through a smaller list!
  - Move broken-off chunks of a large allocation onto
    the smaller lists
85Heaps Compaction
- Compaction
- Moving items that are already allocated in memory
  to different locations
- Can free up larger chunks of contiguous space
- Costly: requires updating all pointers pointing
  to a particular spot in memory
  - Tombstones? Only need to update the tombstones!
86Heap Management
- Approach 1: Reference Counters
- Eager approach: incremental reclamation as soon
  as cells become free
- Requirements
  - In every cell, additional space has to be
    reserved to hold an integer
  - The integer, the reference count, holds the
    number of pointers pointing to the cell
  - If the reference count ever hits zero, the cell
    can be returned to the free list.
87Reference Counters
- The initial allocation of memory and assignment
  of the returned pointer sets the reference count
  to 1
- Reference count management involves overhead:
  adding code to pointer operations to ensure
  counts are updated
- Whenever a pointer is connected to the variable,
  including via a copy, the reference count is
  incremented
- Whenever a pointer is disconnected from a
  variable, the reference count is decremented
  - Disconnects: explicit re-assignment, local stack
    variable disappearance, a pointer inside an object
    being cleaned up
88Reference Counters
- Reference counter example
(Diagram: ListHead points to a chain of three blocks,
each with a reference count of 1, the last pointing
to null)
Remove the ListHead pointer: block 1's ref count goes
to 0, return block 1 to the free list; block 2's ref
count goes to 0, return block 2 to the free list;
block 3's ref count goes to 0, return block 3 to the
free list
89Reference Counters
- Reference counters can help work around dangling
  pointers
- Even if the user calls free through one pointer,
  the reference counter will see that there are
  other pointers directed towards the data
- Forces the programmer to assign all pointers
  elsewhere (to 0?) and then call free before free
  actually works, disposing of the data
90Reference Counters
- Reference counter concerns
- Additional instruction overhead for reference
  management (previous slide)
  - In some languages (LISP) nearly every instruction
    causes the system to change pointers around
- Increased memory usage
  - A reference counter on each item allocated
- Handling circularly connected cells?
91Reference Counters
- Circularly connected cells
- Every cell in the list has a reference counter of
  at least 1. When can you delete cells that keep
  their counts up only through their own circular
  references?
Without the circular link, setting the ListHead
pointer to null would cause a cascade of cleanups.
With the circular link, the cells sit there with ref
counts of 1. There are alternatives, but they are not
as intuitive to program.
(Diagram: ListHead pointing into a circularly linked
list)
92Reference Counters
- Reference counts can also help us implement
  dangling pointer protection: they provide a means
  for removing tombstones
- If all pointers to a tombstone have been moved
  elsewhere, the tombstone can be freed.
93Garbage Collection
- Garbage collection: a periodic process
- Garbage accumulates, and is cleaned up at regular
  intervals or as necessary
- Remember, ref counting was incremental: cleaned
  up as soon as possible
- A garbage collector has to examine the heap,
  find anything allocated but not actively being
  used, and free up that memory.
94Garbage Collection
- Every heap cell has an extra bit or field
(indicator) that is exploited by the garbage
collector - 3 phase collection process
- Initialize
- Trace and Mark
- Sweep and Clean
95Garbage Collection
- Initialize: every cell in the heap is marked as
  garbage in its indicator field
- Trace and Mark: a trace from every active pointer
  in the program is made to see if a cell is
  reachable from a valid pointer. If so, the
  indicator is set to "not garbage".
  - "Active" needs a definition
- Sweep and Clean: return to the free list any
  cells still marked as garbage.
96Garbage Collection
- An element is active if it is
- Referenced by a pointer on the function call
stack - Referenced by a pointer from another active part
of the heap
97Garbage Collection
References from the function call stack
References from inside of objects
Basic view of the results of the mark and trace GC
algorithm for Java (circa 1998):
http://java.sun.com/developer/technicalArticles/ALT/RefObj/
98Garbage Collection
- GC costs depend on
- Total size of heap memory
- Initialization
- Sweep and clean
- Number of active pointers
- Trace and Mark
99Garbage Collection
- Within a process, GC is often implemented as a
  thread
- Stops other parts of the program from executing
  when it uses the CPU
- Should not be interrupted itself
- If the GC is interrupted, the whole process
  should be restarted, as the other code executed
  may have made changes to memory (which the GC
  worked hard to gather statistics on).
Why? Often, when the GC runs, your Java program
hiccups
100An Application of Garbage Collection Java
- Original versions of Java used Trace and Mark on
  a large heap
- Now allows generational collection
  - Exploits common properties of programs and object
    lifetimes.
http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html
101An Application of Garbage Collection Java
- Java uses multiple generations to store objects
- New objects are stored in the Young generation
- Objects that exist long enough are moved to the
  Tenured generation
- Young and Tenured are GCed when they fill up.
- Exploits "infant mortality": many objects are
  deleted soon after being allocated
- Finally, some objects, known to exist through the
  whole program, are in the Permanent generation,
  which never needs to be garbage collected.
Have we seen this general idea before?
Decomposing your work area into smaller pieces?
102Garbage Collection
- Copying GC
- Separate the heap into two large blocks:
  block A / block B
- Initially all data is allocated from block A
- When block A fills up
  - it is relabeled as block B (and vice versa)
  - Copy all directly (function-call-stack-)pointed-to
    items from block B into block A
  - Copy all items pointed to by items in block A
    into block A
  - Allocate from the new block A
103Garbage Collection
- Implicitly doing marking (as not-garbage) by
  moving
- Costs: two large blocks of memory reserved for
  each program's heap, one of which is essentially
  empty
- Benefits
  - No per-object bloat for a mark tag
  - No separate clean-up phase
  - Automatic compaction
104GC Recursion
- Is recursion an issue for tracing garbage
  collection?
- If it's a recursive heap object not pointed to
  actively, it will never get set to "not garbage"
  and will thus get cleaned up.
- Can realize if you've already marked something
  "not garbage", so you shouldn't get stuck in a
  loop with that marking.
105Structured Data Types
- Array (vector) data type
- Fixed number of components
- Declared by the user with a size (C arrays:
  A[10]) or lower and upper bounds (Pascal:
  A[-5..5])
- Homogeneous in type
  - Declared by user
- Allocated linearly in memory
  - Managed by system
- Big question: how is component access
  implemented?
106Structured Data Types
- Work under the assumption the user can specify
  the lower bound (such as -5, or 1, or 10)
- Zero as the lower bound is just an instance of
  this assumption, with some nice properties
- General formula
  - address(A[I]) = base + (I - LB) * E
107Structured Data Types
- address(A[I]) = base + ((I - LB) * E)
- base: starting location of the array
  - Could be on the stack or in the heap
- I: index of interest
- LB: lower bound on indices
- E: size of an element
108Structured Data Types
- address(A[I]) = base + ((I - LB) * E)
- Assume indices are [-3, 3]
- Holds doubles (8 bytes each)
- Base is 00032
- Calling A[1]
  - A[1] is actually the 5th element, because
    indexing starts at -3
  - Address = 00032 + ((1 - (-3)) * 8)
    = 00032 + (4 * 8) = 00064

  Address  Index
  00032    -3
  00040    -2
  00048    -1
  00056     0
  00064     1
  00072     2
  00080     3
109Structured Data Types
- address(A[I]) = alpha + (I - LB) * E
- is equivalent to
- address(A[I]) = (alpha - (LB * E)) + (I * E)
- Immediately after allocation, could compute
  (alpha - (LB * E)) once and re-use it
- Use it as a base; the offset is index * size.
  Called the virtual origin (where A[0] would lie)
  - A[0] might not even be valid for accessing!
110Structured Data Types
- C/C++
- Implementation
- of subscripting
111Structured Data Types
- C/C++
- Implementation
- of subscripting
Direct addressing
Offset addressing: -24(%ebp) is the base; %eax
holds I (the index)
112Structured Data Types
- Multi-dimensional arrays
- Generalization of single-dimensional (standard)
  arrays
- Declaration syntax requires a size or upper and
  lower bounds for each dimension
- Accessing a single element requires a subscript
  entry for each dimension
- Accessing a subarray requires entries for only a
  partial set of dimensions, but you need to specify
  them contiguously, starting with the first
  dimension
113Structured Data Types
- Multidimensional arrays
- Memory itself is linear, so we map the
  n-dimensional array into a linear format
- Two major memory layouts
  - Row major
  - Column major
Example: a 3x3 2-D array
114Structured Data Types
115Structured Data Types
Of the major languages, only Fortran uses
column-major order
116Structured Data Types
- Statically allocated arrays in C: a true
contiguous layout
117Structured Data Types
- Dynamically allocated arrays in C also have to hold
pointer references
Figure: a column of pointers to the data, then the actual data
blocks; note the gaps between the separately allocated rows
118Structured Data Types
- Why is knowing the order of multi-dimensional arrays
important?
- If using pointer operations, what does pointer
arithmetic get you (over 1, or down 1)?
- Imagine you need to perform some operation on
each element in the array, where the order of the work is
unimportant to the results
- Accessing elements in the order the language stores
them is typically more efficient: data
locality
- Paging for large arrays?
- Cache loading?
119Structured Data Types
- Virtual memory: your program may only have a
(few) page(s) of memory allocated to it
- Other pages are stored on disk until needed, and are
brought in as others are swapped out
- Large arrays may fill multiple pages
- Sequential access is likely to stay in the same
page
120Structured Data Types
for (int i = 0; i < 3; i++)
  for (int k = 0; k < 3; k++)
    cout << data[i][k];
(row order: finishes with each page before moving on)

for (int i = 0; i < 3; i++)
  for (int k = 0; k < 3; k++)
    cout << data[k][i];
(column order: bounces between pages on nearly every
access -- Page 1, Page 3, Page 2, Page 1, ...)
121Structured Data Types
- Address in a multi-dimensional array
- For 2-D, the general approach is:
- Declaration A[-5..5][0..4] =>
- LB1 = -5, UB1 = 5, LB2 = 0, UB2 = 4
- address(A[I][J]) = alpha + ((I - LB1) * S) + ((J - LB2)
* E)
- S is the size of a row
- E is the size of an element
- How do we come up with S?
- S = (UB2 - LB2 + 1) * E
- (number of columns in a row * size of a column entry)
122Structured Data Types
Declared A[-1..1][-1..1]; where is A[0][0]? (It holds the
value 2 in the figure.) S = (1 - (-1) + 1) * 4 = 3 * 4 => 12 (every 12 bytes is a
new row). A[0][0] = 100 + ((0 - (-1)) * 12) +
((0 - (-1)) * 4) => 100 + 12 + 4 = 116
Generalizes to higher dimensions: have to take
into account the size of the lower-dimensional unit (point, row,
plane, cube, ...)
123Structured Data Types
- Can still use the same virtual origin trick to cut
out some repeated computation
- For a 1-D array: alpha - (LB * E)
- For a 2-D array: alpha - (LB1 * S) - (LB2 * E)
- Verification: for the array on the previous slide, LB1 is
-1, LB2 is -1, S = 12, E is 4
- alpha - (-1 * 12) - (-1 * 4) => alpha + 16
124Back to stack management
- Stack management: a few final details
- Storing the old ebp, eip
- Return values
- Debugging support
- Optimization
125Back to the stack: subprograms
- Activation Records
- Store the data state of the subprogram as it is
executing
- Created each time the subprogram is called
- Removed each time the subprogram completes
- Note
- 1 instance of the subprogram code segment
- Multiple instances of subprogram activation
records
126Pools of Memory
Figure: the memory allocated to a program on the Intel
architecture, from Address 0 to Address 2048 -- the heap at one
end and the code segment (main and the other functions, with EIP
at the executing instruction) at the other, the stack frame
bracketed by EBP and ESP, and unused space between stack and heap
127Subprograms
- At a subprogram call, need to store
- The old EIP
- Which instruction to return to
- The old EBP
- Where on the stack to return to
- Implementation
- These are themselves stored directly on the stack
128Subprograms
- The CALL instruction pushes the old EIP onto the stack
(implicitly), changing ESP
- PUSHL EBP pushes the old base pointer onto the top of the
stack, modifying ESP
- The updated ESP (top of stack) is set as the new EBP
(the base for the next function)
129Subprograms
- Leaving a function
- Reset ESP: the local variables are no longer needed
- Reset EBP: point back to the caller's entries on the
stack
- Reset EIP: point back to the next instruction in the
caller
130Subprograms
- Returning from a subprogram call
- LEAVE is a macro for
- mov ebp, esp
- Copies the base pointer into the stack pointer,
discarding the locals
- pop ebp
- The item on top of the stack is put into ebp
- esp moves another 4 bytes (the stack shrinks)
- RET pops the old EIP off the stack and jumps back to that
instruction
Essentially undoes all of the actions from the function
setup
131Subprograms
- Note that stack memory is not actually scrubbed
between function calls
- No re-initialization (no setting to all zeros)
- Just a simple pointer replacement (esp, ebp)
- That part is simply no longer in use
- Can lead to some tricky debugging situations
132Subprograms
- A second function call is laid right on top of the
previous call
- All its variables line up in the same places
- Nothing seems strange at all!
133Subprograms
- With this approach
- We have a stack of activation records
- Each AR can reference the one below it
- There are instruction pointers nestled below each
function call (recording the line in the calling
function)
- It is exactly this type of information that GDB and Java
exception handling use to let you trace
program execution by viewing the function call stack
134Subprograms
135Subprograms
- Return value
- For simple return values, the compiler chooses a
register and says:
- The callee should store the return value in that
register
- The caller should look in that register when using the
return value
Caller and callee share eax for the return value
136Subprograms
Returning a struct of two ints: 2 registers are
used
137Subprograms
- Sometimes, the overhead of a function call is
more than the cost of the instructions themselves
- Overhead: the instructions that manage the function call
- Saving and resetting state, parameter passing,
return values
- Inline functions
- Can tag functions with the inline keyword
- Prompts the compiler to use the copy rule to replace a
call to the function with the function code itself
- In C++, it is up to the compiler to decide whether or
not it actually inlines (you can suggest,
but it makes the final decision)
138Subprograms
139Left: not inlined. Right: inlined (required g++
-O2)