Title: Type Compatibility
1Type Compatibility
- Another type checking issue: what are equivalent
  types?
- Need to be concerned with how a type is named
  and/or defined
- Equivalence by name -> name type compatibility
- Equivalence through definition -> structure type
  compatibility
2Type Compatibility
- Name Type Compatibility
- Two variables have compatible types only if they
  are defined in declarations that use the same
  type name.
- Easy to implement
  - Checking of type bindings stored in the symbol table
- Safe
  - Different names, different meanings
3Type Compatibility
- Strict name type compatibility is very
  restrictive
- The following is an Ada declaration
  - type Indextype is range 1..100;
  - count : Integer;
  - index : Indextype;
- If Ada used strict name type compatibility, we
  could not assign count to index or vice-versa
4Type Compatibility
- Name type compatibility clearly makes sense in
  one direction
- index := count should be illegal (as Indextype is
  a subset of the integers)
- count := index should be alright, so we may need
  to augment name type compatibility to allow this
  to happen
5Type Compatibility
- Structure type compatibility
- Two variables have compatible types only if their
  types have identical structures
- Usually, for aggregates, requires that the grouping
  type is composed of the same types in the same order
  - Same memory layout
- More flexible than name compatibility, but
  difficult to implement
- Implementation
  - Instead of just checking names in the symbol
    table, the entire structure of two types must be
    compared.
6Type Bindings
- Type binding also provides us with information on
  how much memory is required for a variable
- Primitive types: a specific number of bytes
  - int x -> 4 bytes
- Aggregate types: sum over the requirements of the
  grouped primitives or other aggregates
  - int x[100] -> 400 bytes
7Memory Layouts
Note the layout of structs: it supports additive
movement through structs and arrays
8Type Compatibility
- Checking type structures
- Arrays?
- Same data-type, same length?
- Same data-type, same length, different array
  indices?
- Same data-type, different length?
- Self-referencing types?
  - struct LinkedListNode { int data; LinkedListNode *next; };
9Type Compatibility
- Checking type structures
- Are these two types compatible?
  - struct person { string name; int age; };
  - struct vehicle { string name; int numberOfTires; };
- Are these two types compatible?
  - struct PersonType1 { int age; string name; };
  - struct PersonType2 { string name; int age; };
- How do you differentiate?
  - struct person { string name; int age; };
  - struct vehicle { string name; int age; };
- Different names generally signal a desire for
  different abstractions; maybe structure
  compatibility is too flexible?
10Type Compatibility
- Languages you are using
- In general, Java and C++ use name type
  compatibility
- Plain old vanilla C, for all types except structs
  and unions, uses structural type compatibility
11C Type Compatibility
- In most cases, plain, original C uses structural
  type compatibility
- Every struct and union declaration creates a new
  type incompatible with other types, even if they
  are structurally the same.
  - Exception: structs and unions declared in separate
    files do use structural type equivalence.
- Any type defined with typedef is equivalent to
  its parent type (as typedef just provides an
  alias).
12Example of C structural compatibility vs C++ name
compatibility
Same program written in C and C++. Enums are
constructions over integers (each enum item maps
to an integer). C (structural): they're both ints,
feel free to assign away! C++ (name): two
different things (Suits and Colors).
Differences arise when compiled using the
respective compilers
13Coercion/Casting
- Most modern languages support automatic coercion
between types
- Coercion: automatic translation of one type to
  another type to fulfill type checking
  requirements
- A more specific type is translated to the more
  general type
  - int i = 2; double v = i;
  - Truck t = new Truck(); Automobile a = t;
- Guaranteed that operations defined on the more
  general type are defined on the more specific type,
  as all items of the specific type are also items of
  the general type
  - Doubles are a larger set of numbers than
    integers.
  - Automobiles are a larger set of vehicles than
    Trucks.
14Coercion/Casting
- Think of coercion as automated casting.
- Casting: programmer-specified translation of
  types
- Translation from more general to more specific
  - double c = 2.5; int i = (int)c;
  - Truck t = (Truck)myVector.elementAt(0);
- Java containers store Objects, the super-class of
  everything (most general)
15Coercion/Casting
- Moving to a more general type: widening
  conversion
- Widening is almost always safe (not losing
  information), so it is usually performed
  automatically.
- Moving to a more specific type: narrowing
  conversion
- Narrowing can lose information (preciseness of
  the number: 2.5 as a double vs 2 as an int), so it
  usually requires a programmer request.
See pg. 323 (Section 7.4) for more info on this
topic
16Coercion/Casting
- Liskov Substitution Principle
- I view it as an argument for why automatic
  coercion is safe
- "Let q(x) be a property provable about objects x
  of type T. Then q(y) should be true for objects y
  of type S where S is a subtype of T"
- From Wikipedia: based on the idea of
  substitutability; that is, if S is a subtype of
  T, then objects of type T in a program may be
  replaced with objects of type S without altering
  any of the desirable properties of that program
  (e.g. correctness)
http://en.wikipedia.org/wiki/Liskov_substitution_principle
17Implementation of casting/coercion
Casting instructions are hardware-based: a good
thing.
18Semantic checks
Why is this check required? Can it raise runtime
errors?
19Reinterpret casts
- reinterpret_cast
- static_cast
- dynamic_cast
  - Used in subtype casts; like Java,
    it does a runtime check
20Coercion/Casting
- While C/C++/Java do support automatic coercion,
  not all languages do
- An example?
  - ML
21Location Bindings
- Location binding: binding a variable to a
  particular location in memory
- Also referred to as storage binding
- Two parts of the process, with familiar terms
  - Allocation: reserving a block from a pool of
    available memory
  - Deallocation: returning a bound block to the pool
    of available memory
22Pools of Memory
- Two areas from which memory can be allocated
  - Stack
  - Heap
- The stack is allocated in a strict order
- The heap is much more like a free pool of memory
- Hope they don't collide with each other
(Diagram: heap and stack at opposite ends of the
memory allocated to the program, growing toward
each other)
Another example (Windows related):
http://www.nostarch.com/download/greatcode2_ch8.pdf
23Lifetime
- The period for which a variable is bound to a
  particular memory address is its lifetime.
- There are four common storage/lifetime bindings
  for variables
  - Static
  - Stack-dynamic
  - Explicit heap-dynamic
  - Implicit heap-dynamic
24Static Variables
- Static variables
- Bound to memory cells before execution, released
  at termination
- Compile time binding of storage
- Placed in a spot that can be accessed directly;
  the rest of the code can ask for that spot
  specifically
- Commonly allocated from (around) the heap area
- Used for global variables in languages that
  support globals, and for true static variables in
  C (where the value is persistent across function
  calls)
25Static Variables
- Fast to use: no runtime overhead to
  create/delete, fixed direct addressing
- But, if you have only static variables in your
  language, you can't support recursion
- In recursion, the same variables are used multiple
  times, and previous uses are still important and
  must be maintained
26Stack Dynamic Variables
- Run time binding of storage
- Allocated when declaration is encountered
- Deallocated when moving out of the scope of the
  declaration (no longer visible)
- Types are already statically bound via
  compilation
- Used for allocation of local variables in
  subprograms
  - Usually all local variables, even if declared in the
    middle of a function, are allocated space at the
    start of the function call
  - Deallocation when the subprogram terminates
- Allocated from the stack part of memory
- We'll talk about stack management more later
  (Chapter 10)
27Stack Dynamic Variables
- Important for recursion
- Allows each recursive call to allocate memory
  from the stack for the variable instances in that
  particular call to the subprogram
- Disadvantages
- Overhead of allocation, deallocation at runtime
  (every method call)
  - Not that expensive though: the compiler can ask for
    a whole chunk, as it can precompute the amount it
    needs to ask for (again, wait until Chapter 10)
- Requires indirect addressing (relative position in
  the stack)
  - Don't know where your method is being put on the
    stack until the method starts
- Does not allow history-sensitive variables like
  static does
28Explicit Heap-Dynamic Variables
- Allocated, deallocated explicitly by the
programmer via special instructions - Referenced through pointer/reference variables
- Indirect addressing (2 memory accesses)
- Run time binding of storage
29Explicit Heap-Dynamic Variables
- C++: new and delete statements, usable on all
  types (scalars, aggregates)
  - int *intnode;
  - intnode = new int;
  - delete intnode;
- Java: every object (instance of a class)
  - PrintWriter pw = new PrintWriter();
  - // no delete for Java; we'll see this again
    (Chapter 6)
30Explicit Heap-Dynamic Variables
- Advantages
- Useful for dynamic structures (linked lists,
  trees) that can adapt to the program's data
  requirements
- Disadvantages
- Multiple memory accesses for pointers
- Difficulty of programming with pointers correctly
- Heap management: we'll come back to this, not
  trivial (Chapter 6)
31Implicit Heap-Dynamic Variables
- Bound to heap storage only when assigned values
- Essentially, these are dynamically typed
  variables, where all features (type, value,
  location) are bound upon assignment
- JavaScript example again
  - list = [10.2, 3.5];
  - list = [47];  -> reallocated storage?
  - list = [10.2, 3.5, 28.2];  -> reallocation
    (bigger)
- While flexible, suffers from the usual dynamic
  binding problems discussed earlier, as well as the
  heap management problems mentioned on the previous
  slide
32Examples
33Constants
- Constants are interesting; two key ways of
  implementing
  - Placement in special read-only memory
  - Compiler verification: won't allow changes after
    the constant is defined
- Any guess on what C does?
34Example: Constant
35Pointers
- Pointer definition
- A data object (variable) whose value is
- The memory location of another data object, or
- Null, a general term for a pointer to nowhere
- Pointers, when available, are at the same level
as other types - Can declare pointer variables
- Can hold pointers in an array
- Can have a pointer as part of an aggregate
  data-structure (ListNode for example)
36Pointer Specifications
- Attributes
- Pointer variable name
- Type of data being pointed to
- double *myDoublePointer;
- Truck *myTruckObjectPointer;
- Values pointer can take on
- Any addressable memory address
- Usually any integer
- (64 bit architectures?)
37Pointer Specifications
- Operations: Declaration
- Sets up space for the pointer
- Could come from the stack: stack-dynamic
  (if the pointer is a local variable)
- Could come from the heap: heap-dynamic
  (if requested at runtime)
38Pointer Specifications
- Operations: Assignment with object creation
- A very common use of pointers is when objects and
  variables are heap-dynamic.
- C++ syntax
  - double *myDoublePtr = new double;
- The RHS requests allocation of a fixed size data
  object
- The return value from the new statement is the
  address of the data object just created.
- The address is stored in the pointer variable
(Diagram: the pointer variable, stored at address
2000, holds the value 2012; the actual data, 2.345,
sits at address 2012)
39Pointer Specifications
- Operations: Assignment with object creation
  - double *myDoublePtr = new double;
- Data objects created on the RHS are anonymous
  - Not bound to a name in the program
  - Thus, they can be lost as well
- Creation and assignment can be performed at
  any time during program execution
40Review Question
- Imagine this is a part of a method you are
  writing
  - List *l = new List();
- Which part(s) are stack dynamic and which are
  heap dynamic?
- Which part(s) are allocated at method startup
  and which are allocated at statement execution
  time?
41Pointer Specifications
- Operations: Dereferencing
- The operation that allows the data referenced by
  the pointer to be accessed
- In C/C++, this uses the * (asterisk) operator
- Syntactic sugar: as a shortcut to a field/method
  of an aggregate data structure, can use the ->
  (arrow) operator
  - cout << *myDoublePointer << endl;
  - cout << (*myAutomobilePointer).getYear() << endl;
  - cout << myAutomobilePointer->getYear() << endl;
42Pointer Specifications
- Operations: Assignment with addressing
- Pointers can be used to reference variables that
  aren't created using new
- To obtain the address of an arbitrary variable,
  use the addressing operator.
- C/C++: use & (ampersand)
  - int myIntVariable = 5;
  - int *myIntPointer = &myIntVariable;
  - *myIntPointer = *myIntPointer + 1;
43Pointer Specifications
- Operations: Arithmetic
- Mathematical operations that work directly on the
  pointer
  - int *ptr = new int;
  - ptr = ptr + 1;
- Changes the value of the pointer, not what is
  being referenced
- Doesn't necessarily update the address by one byte
  (as it syntactically would appear)
44Pointer Specifications
- Pointers reference a particular type
  - i.e. an integer pointer references an item 4
    bytes long
- For all fixed size data structures (primitives,
  classes, structs, unions, etc.) the compiler can
  figure out the size beforehand
- Pointer arithmetic moves the pointer up by the
  size of the data item being pointed to
  - (i.e. it moves completely over that item)
45Example
46Pointer Specifications
- Many systems implement arrays using pointers
  - int list[5]; int *ptr = list;
  - // int *list = new int[5]; (dynamic allocation)
  - cout << *ptr << endl;        // print list[0]
  - cout << *(ptr + 1) << endl;  // print list[1]
  - // print first 5 items
  - int i = 0; while (i < 5) { cout << *(ptr++) << endl; i++; }
  - void updateArray(int list[])  // array parameter
47Review Question
- What is the difference in outputs between these
  two sets of code?
  - int *list = new int[3]; list[0] = 24; list[1] = 33; list[2] = 52;
  - for (int i = 0; i < 3; i++)
  - {
  -   *list++;
  - }
  - // print list next
  - int *list = new int[3]; list[0] = 24; list[1] = 33; list[2] = 52;
  - for (int i = 0; i < 3; i++)
  - {
  -   (*list)++;
  - }
  - // print list next
48Pointer Specifications
49Pointers with Arrays
50Pointer specifications
- Operations: Object deletion through pointer
- Free up memory from dynamically allocated objects
  by calling delete on a pointer to the object
  - double *myDoublePtr = new double;
  - delete myDoublePtr;
  - Truck *myTruckPtr = new Truck();
  - delete myTruckPtr;
  - int *myIntegerArray = new int[10];
  - delete [] myIntegerArray;
51Object deletion example
52Pointer Specifications
- Most languages specify that a pointer points to a
  particular type
- Would it be reasonable to also allow a pointer
  to point to any type?
- What would this require the language to do?
53Pointer Specifications
- Object oriented languages allow a third technique
  for pointers
- Can point to any type that is a subtype of the
  original pointer type
- Doesn't hold for primitive type-subtype
  double/int relationships
What does this mean the system has to do
for pointers to objects? Runtime
bindings? Dynamic method calls?
54void pointers
- In C/C++, can use void pointers
- Syntax for dealing with pointers where the
  referenced type is unimportant
- Used in malloc, free (old school memory
  allocation)
- Used in functions where we want arbitrary types to
  be accepted
- Have to cast to the appropriate type of pointer
  before dereferencing or doing arithmetic
55void pointers Examples
- stdlib.h defines qsort (quicksort) as
  - void qsort(void *base, size_t num_elements,
    size_t element_size, int (*compare)(void const
    *a, void const *b));
- Takes an array of any type of object; you just
  have to make sure you send it how to compare
  two of those objects.
- Compare must return < 0 if a < b, 0 if a == b,
  > 0 if a > b
56void pointers Examples
57Function Pointers
- C/C++ also allow you to use function pointers
  - /* function returning pointer to int */
  - int *func(int a, float b);
  - /* pointer to function returning int */
  - int (*func)(int a, float b);
58Function Pointers
- void qsort(void *base, size_t num_elements,
  size_t element_size, int (*compare)(void const
  *a, void const *b));
- qsort(studentArray, numberOfStudents,
  sizeof(Student), compareByName);
- Array names and function names are converted to
  addresses
59Function pointers
61Problems with Pointers
- Dangling Pointer
- When a pointer continues to hold the memory
address of a heap-allocated variable that has
been deallocated. - Why is this a problem?
- Memory pointed to could now be in use by another
variable.
62Dangling Pointers
- A new variable in the old memory spot may have a
  different type than the previous one, and almost
  definitely a new value
- Type checking is going to OK everything: "oh,
  he's adding two integers through integer pointers,
  that's fine"
- But the meaning of the underlying data may be
  different
- Writing into that spot could mess up the other
  variable
63Dangling Pointers
- Code which results in dangling pointers:
  - int *p1 = new int;
  - int *p2 = p1;
  - delete p1;
- Some languages will set p1 to 0, but won't touch
  p2 (not my version of C++, though; it doesn't
  even clean up p1).
64Problems with Pointers
- Lost heap dynamic variables
- When the address of the variable being pointed to
  is lost
- Why is this a problem?
- Prevents further use of the variable (we don't
  know how to get to it)
- Can't delete and reuse that part of memory
  (already allocated)
65Lost Heap Dynamic Variables
- How do you end up with lost heap dynamic
  variables?
  - int *p1 = new int;
  - p1 = new int;
- Is there a common name for this?
- Commonly referred to as a memory leak
- See memory eaten away
66Solutions to Pointer Problems
- Ways to work around dangling pointers
- 1. Don't allow the user control (Java)
  - Requires system-controlled memory management,
    garbage collection
- 2. Safety algorithms
  - These don't prevent the problem, but prevent the
    user from fiddling with memory they shouldn't by
    throwing an error
67Safety Algorithms
- If you trust the programmer, dangling pointers
  can be resolved if
- A programmer always sets all pointers to a
  variable to null after the variable is
  de-allocated
- A system is likely to only set one to zero (the
  one through which the deletion occurred); the rest
  are up to the programmer
- Letting the system do it all: large overhead of
  recording who is pointing to whom
68Safety Algorithms
- A way to work around dangling pointers
- Tombstones: each heap-dynamic variable, when
  allocated, is also given another memory location
  called the tombstone.
- This tombstone memory location is a pointer to
  the variable.
- All user defined pointers to the variable
  actually get the address of the tombstone
69Tombstones
(Diagram: without tombstones, the user pointers
reference the data directly; with a tombstone, all
user pointers reference the tombstone, which in turn
points to the data)
70Tombstones
- When the variable is de-allocated, the tombstone
  remains and is set to null.
- Only one pointer has to be set to zero (the
  tombstone)
- All of the pointers that were pointing to the
  variable now hit the tombstone's zero value
- If a reference is made through any of those
  pointers, they refer to a zero address, which is
  an error
71Tombstones
- Tombstones work around the dangling pointer
  problem at the expense of
- An extra 4 bytes per variable allocated from the
  heap
  - Those extra 4 bytes generally can't be
    deallocated.
- An extra memory access (another layer of
  indirection) is required every time the variable
  is used
- Not found in any popular modern languages
  - Maybe it should be, with loads of fast memory
    available
72Lock and Key for Dangling Pointers
- When a variable is allocated off the heap
  - Allocate storage for the structure
  - Allocate a memory cell for an integer which holds
    a lock value
  - Return a pointer to the variable as a pair
    (integer key, integer address)
  - The key value in the pointer is set to the lock
    value
73Lock and Key
- When a pointer is copied (with an assignment
  statement),
  - copy the key and address
(Diagram: the pointer from the initial new (allocate)
statement and a copy of it both hold the same key and
the same address of the data, which sits next to its
lock in memory)
74Lock and Key
- When a pointer is de-referenced
- Verify the key stored with the pointer matches
the lock out in memory next to the item being
pointed to. - If the lock and key dont match, throw an error
- When would the lock and key not match?
75Lock and Key
- When a variable from the heap is de-allocated,
  set the lock to an illegal key value
- Overhead
  - An integer comparison to check lock and key
  - Extra space to hold the lock, which can't be
    de-allocated
  - Extra space to hold the key in each pointer
- Implemented in versions of Pascal
76Heap Management
- Heap
- The portion of computer memory from which space
  for dynamically allocated variables is taken.
- Varying levels at which the programmer can
  interact with the heap.
- Java: everything is handled for you
- C++: new, delete allow the programmer to ask for
  memory, but the system still controls how the
  heap is managed
77Heap Management
- Look at two different implementations
- Heap as a group of fixed, single size cells
  - More likely to be seen when the language system is
    requesting from the heap: implicit heap-dynamic
    languages
- Heap as a group of variable sized segments
  - Required to support programmer requests (an array
    of arbitrary size needs contiguous memory):
    explicit heap-dynamic languages
- Two primary uses
  - Obtaining memory (allocation) from the heap
  - Returning memory (de-allocation) to the heap
78Heaps Single Sized Cells
- Define a cell as a unit that contains space for
  the item of interest and a pointer
- Often implemented as a circular linked list of
  cells
- The available heap is often called the free list
(Diagram: a cell, with space labeled "Data Goes
Here" plus a pointer to the next cell)
79Heaps Single Sized Cells
- Allocation
  - Remove a cell from the front of the free list
- De-allocation
  - Attach released cells to the front of the free
    list
(Diagrams: the free list before allocation, and the
updated free list after allocation)
80Heaps Variable Sized Cells
- More applicable to most programming languages'
  needs
- General approach
- Have an "AvailableStart" pointer initially point
  to a single cell that is sized to be all of
  available free memory.
- Allocation: when a request is made
  - If the cell at the front of the list is large
    enough, break the cell into two pieces, one being
    the requested size and the other being everything
    else
  - For a while, this technique will work fine
  - If the front cell isn't large enough, what to do?
    Let's look at deallocation first
81Heaps Variable Sized Cells
- Deallocation
  - Reclaimed, variable sized cells are added onto
    the list
  - May check to see if directly adjacent neighbors
    can be coalesced together with these cells, OR we
    might wait to do this until we need to
- Allocation
  - If the front cell isn't large enough, try the next
    free block(s) on the list until we find one that
    is large enough.
82Heaps Variable Sized Cells
- This approach does entail list overhead
- Requires searching through lists to check and see
  if there is a block of appropriate size available
- May hit a point where we only have lots of small
  blocks sitting around
  - Requires joining small blocks that were from
    adjacent parts of memory back together
- Any over-allocations also waste space
- Does this sound familiar? (CSC 241)
  - Internal/external fragmentation
83Heaps Variable Sized Cells
- Implementation questions
- Do you
  - Take the first block that is big enough to handle
    the request (first-fit)?
  - Look for the "best fit" block, which could
    require looking at every block (best-fit)?
    - Costly, tends to leave small leftover blocks
- Do you keep the list of different size blocks in
  sorted order by size?
84Heaps Allocation
- Costs of heap search are based on the number of
  items in the heap (linear)
- Some languages maintain heaps for different size
  requests. Why?
  - Searching through a smaller list!
  - Move broken-off chunks of a large allocation onto
    the smaller lists
85Heaps Compaction
- Compaction
- Moving items that are already allocated in memory
  to different locations
- Can free up larger chunks of contiguous space
- Costly: requires updating all pointers pointing
  to a particular spot in memory
  - Tombstones? Only need to update the tombstones!
86Heap Management
- Approach 1: Reference Counters
- Eager approach: incremental reclamation as soon
  as cells become free
- Requirements
  - In every cell, additional space has to be
    reserved to hold an integer
  - The integer, the reference count, holds the
    number of pointers pointing to the cell
  - If the reference count ever hits zero, the cell
    can be returned to the free list.
87Reference Counters
- The initial allocation of memory and assignment
  of the returned pointer sets the reference count
  to 1
- Reference count management involves overhead:
  adding code to pointer operations to ensure
  counts are updated
- Whenever a pointer is connected to the variable,
  including via a copy, the reference count is
  incremented
- Whenever a pointer is disconnected from a
  variable, the reference count is decremented
  - Disconnects: explicit re-assignment, local stack
    variable disappearance, a pointer inside an object
    being cleaned up
88Reference Counters
- Reference counter example
(Diagram: ListHead points to a chain of three blocks,
each with a reference count of 1, the last pointing
to null)
Remove the ListHead pointer: block 1's ref count goes
to 0, return block 1 to the free list; block 2's ref
count goes to 0, return block 2 to the free list;
block 3's ref count goes to 0, return block 3 to the
free list
89Reference Counters
- Reference counters can help work around dangling
  pointers
- Even if the user calls free through one pointer,
  the reference counter will see that there are
  other pointers directed towards the data
- Forces the programmer to assign all pointers
  elsewhere (to 0?) and then call free before free
  actually works, disposing of the data
90Reference Counters
- Reference counter concerns
- Additional instruction overhead for reference
  management (previous slide)
  - In some languages (LISP) nearly every instruction
    causes the system to change pointers around
- Increased memory usage
  - A reference counter on each item allocated
- Handling circularly connected cells?
91Reference Counters
- Circularly connected cells
- Every cell in the list has a reference counter of
  at least 1. When can you delete cells that keep
  their counts up only through their own circular
  references?
Without the circular link, setting the ListHead
pointer to null would cause a cascade of cleanups.
With the circular link, the cells sit there with ref
counts of 1. There are alternatives, but they are not
as intuitive to program.
(Diagram: ListHead pointing into a circularly linked
list)
92Reference Counters
- Reference counts can also help us implement
  dangling pointer protection: they provide a means
  for removing tombstones
- If all pointers to a tombstone have been moved
  elsewhere, the tombstone can be freed.
93Garbage Collection
- Garbage collection: a periodic process
- Garbage accumulates, and is cleaned up at regular
  intervals or as necessary
- Remember, ref counting was incremental: cleaned
  up as soon as possible
- A garbage collector has to examine the heap,
  find anything allocated but not actively being
  used, and free up that memory.
94Garbage Collection
- Every heap cell has an extra bit or field
(indicator) that is exploited by the garbage
collector - 3 phase collection process
- Initialize
- Trace and Mark
- Sweep and Clean
95Garbage Collection
- Initialize: every cell in the heap is marked as
  garbage in its indicator field
- Trace and Mark: a trace from every active pointer
  in the program is made to see if a cell is
  reachable from a valid pointer. If so, the
  indicator is set to "not garbage".
  - "Active" needs a definition
- Sweep and Clean: return to the free list any
  cells still marked as garbage.
96Garbage Collection
- An element is active if it is
- Referenced by a pointer on the function call
stack - Referenced by a pointer from another active part
of the heap
97Garbage Collection
References from the function call stack
References from inside of objects
Basic view of the results of the mark and trace GC
algorithm for Java (circa 1998):
http://java.sun.com/developer/technicalArticles/ALT/RefObj/
98Garbage Collection
- GC costs depend on
- Total size of heap memory
- Initialization
- Sweep and clean
- Number of active pointers
- Trace and Mark
99Garbage Collection
- Within a process, GC is often implemented as a
  thread
- Stops other parts of the program from executing
  when it uses the CPU
- Should not be interrupted itself
- If the GC is interrupted, the whole process
  should be restarted, as the other code executed
  may have made changes to memory (which the GC
  worked hard to gather statistics on).
Why? Often, when the GC runs, your Java program
hiccups
100An Application of Garbage Collection Java
- Original versions of Java used Trace and Mark on
  a large heap
- Now allows generational collection
  - Exploits common properties of programs and object
    lifetimes.
http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html
101An Application of Garbage Collection Java
- Java uses multiple generations to store objects
- New objects are stored in the Young generation
- Objects that exist long enough are moved to the
  Tenured generation
- Young and Tenured are GCed when they fill up.
- Exploits "infant mortality": many objects are
  deleted soon after being allocated
- Finally, some objects, known to exist through the
  whole program, are in the Permanent generation,
  which never needs to be garbage collected.
Have we seen this general idea before?
Decomposing your work area into smaller pieces?
102Garbage Collection
- Copying GC
- Separate the heap into two large blocks:
  block A / block B
- Initially all data is allocated from block A
- When block A fills up
  - it is relabeled as block B (and vice versa)
  - Copy all directly (function-call-stack-)pointed-to
    items from block B into block A
  - Copy all items pointed to by items in block A
    into block A
  - Allocate from the new block A
103Garbage Collection
- Implicitly doing marking (as not-garbage) by
  moving
- Costs: two large blocks of memory reserved for
  each program's heap, one of which is essentially
  empty
- Benefits
  - No per-object bloat for a mark tag
  - No separate clean-up phase
  - Automatic compaction
104GC Recursion
- Is recursion an issue for tracing garbage
  collection?
- If it's a recursive heap object not pointed to
  actively, it will never get set to "not garbage"
  and will thus get cleaned up.
- Can realize if you've already marked something
  "not garbage", so you shouldn't get stuck in a
  loop with that marking.
105Structured Data Types
- Array (vector) data type
- Fixed number of components
- Declared by the user with a size (C arrays:
  A[10]) or lower and upper bounds (Pascal:
  A[-5..5])
- Homogeneous in type
  - Declared by user
- Allocated linearly in memory
  - Managed by system
- Big question: how is component access
  implemented?
106Structured Data Types
- Work under the assumption the user can specify
  the lower bound (such as -5, or 1, or 10)
- Zero as the lower bound is just an instance of
  this assumption, with some nice properties
- General formula
  - address(A[I]) = base + (I - LB) * E
107Structured Data Types
- address(A[I]) = base + ((I - LB) * E)
- base: starting location of the array
  - Could be on the stack or in the heap
- I: index of interest
- LB: lower bound on indices
- E: size of an element
108Structured Data Types
- address(A[I]) = base + ((I - LB) * E)
- Assume indices are [-3, 3]
- Holds doubles (8 bytes each)
- Base is 00032
- Calling A[1]
  - A[1] is actually the 5th element, because
    indexing starts at -3
  - Address = 00032 + ((1 - (-3)) * 8)
    = 00032 + (4 * 8) = 00064

  Address  Index
  00032    -3
  00040    -2
  00048    -1
  00056     0
  00064     1
  00072     2
  00080     3
109Structured Data Types
- address(A[I]) = alpha + (I - LB) * E
- is equivalent to
- address(A[I]) = (alpha - (LB * E)) + (I * E)
- Immediately after allocation, could compute
  (alpha - (LB * E)) once and re-use it
- Use it as a base; the offset is index * size.
  Called the virtual origin (where A[0] would lie)
  - A[0] might not even be valid for accessing!
110Structured Data Types
- C/C++
- Implementation
- of subscripting
111Structured Data Types
- C/C++
- Implementation
- of subscripting
Direct addressing
Offset addressing: -24(%ebp) is the base; %eax
holds I (the index)
112Structured Data Types
- Multi-dimensional arrays
- Generalization of single-dimensional (standard)
  arrays
- Declaration syntax requires a size or upper and
  lower bounds for each dimension
- Accessing a single element requires a subscript
  entry for each dimension
- Accessing a subarray requires entries for only a
  partial set of dimensions, but you need to specify
  them contiguously, starting with the first
  dimension
113Structured Data Types
- Multidimensional arrays
- Memory itself is linear, so we map the
  n-dimensional array into a linear format
- Two major memory layouts
  - Row major
  - Column major
Example: a 3x3 2-D array
114Structured Data Types
115Structured Data Types
Of the major languages, only Fortran uses
column-major order
116Structured Data Types
- Statically allocated arrays in C: a true
contiguous layout
117Structured Data Types
- Dynamically allocated arrays in C also have to hold
pointer references
Figure: a column of pointers to the data, then the actual data
blocks; note the gaps between the separately allocated rows
118Structured Data Types
- Why is knowing the order of multi-dimensional arrays
important?
- If using pointer operations, what does pointer
arithmetic get you (over 1, or down 1)?
- Imagine you need to perform some operation on
each element in the array, where the order of the work is
unimportant to the results
- Accessing elements in the order the language stores
them is typically more efficient: data
locality
- Paging for large arrays?
- Cache loading?
119Structured Data Types
- Virtual memory: your program may only have a
(few) page(s) of memory allocated to it
- Other pages are stored on disk until needed, and are
brought in as others are swapped out
- Large arrays may fill multiple pages
- Sequential access is likely to stay in the same
page
120Structured Data Types
for (int i = 0; i < 3; i++)
  for (int k = 0; k < 3; k++)
    cout << data[i][k];
(row order: finishes with each page before moving on)

for (int i = 0; i < 3; i++)
  for (int k = 0; k < 3; k++)
    cout << data[k][i];
(column order: bounces between pages on nearly every
access -- Page 1, Page 3, Page 2, Page 1, ...)
121Structured Data Types
- Address in a multi-dimensional array
- For 2-D, the general approach is:
- Declaration A[-5..5][0..4] =>
- LB1 = -5, UB1 = 5, LB2 = 0, UB2 = 4
- address(A[I][J]) = alpha + ((I - LB1) * S) + ((J - LB2)
* E)
- S is the size of a row
- E is the size of an element
- How do we come up with S?
- S = (UB2 - LB2 + 1) * E
- (number of columns in a row * size of a column entry)
122Structured Data Types
Declared A[-1..1][-1..1]; where is A[0][0]? (It holds the
value 2 in the figure.) S = (1 - (-1) + 1) * 4 = 3 * 4 => 12 (every 12 bytes is a
new row). A[0][0] = 100 + ((0 - (-1)) * 12) +
((0 - (-1)) * 4) => 100 + 12 + 4 = 116
Generalizes to higher dimensions: have to take
into account the size of the lower-dimensional unit (point, row,
plane, cube, ...)
123Structured Data Types
- Can still use the same virtual origin trick to cut
out some repeated computation
- For a 1-D array: alpha - (LB * E)
- For a 2-D array: alpha - (LB1 * S) - (LB2 * E)
- Verification: for the array on the previous slide, LB1 is
-1, LB2 is -1, S = 12, E is 4
- alpha - (-1 * 12) - (-1 * 4) => alpha + 16
124Back to stack management
- Stack management: a few final details
- Storing the old ebp, eip
- Return values
- Debugging support
- Optimization
125Back to the stack: subprograms
- Activation Records
- Store the data state of the subprogram as it is
executing
- Created each time the subprogram is called
- Removed each time the subprogram completes
- Note
- 1 instance of the subprogram code segment
- Multiple instances of subprogram activation
records
126Pools of Memory
Figure: the memory allocated to a program on the Intel
architecture, from Address 0 to Address 2048 -- the heap at one
end and the code segment (main and the other functions, with EIP
at the executing instruction) at the other, the stack frame
bracketed by EBP and ESP, and unused space between stack and heap
127Subprograms
- At a subprogram call, need to store
- The old EIP
- Which instruction to return to
- The old EBP
- Where on the stack to return to
- Implementation
- These are themselves stored directly on the stack
128Subprograms
- The CALL instruction pushes the old EIP onto the stack
(implicitly), changing ESP
- PUSHL EBP pushes the old base pointer onto the top of the
stack, modifying ESP
- The updated ESP (top of stack) is set as the new EBP
(the base for the next function)
129Subprograms
- Leaving a function
- Reset ESP: the local variables are no longer needed
- Reset EBP: point back to the caller's entries on the
stack
- Reset EIP: point back to the next instruction in the
caller
130Subprograms
- Returning from a subprogram call
- LEAVE is a macro for
- mov ebp, esp
- Copies the base pointer into the stack pointer,
discarding the locals
- pop ebp
- The item on top of the stack is put into ebp
- esp moves another 4 bytes (the stack shrinks)
- RET pops the old EIP off the stack and jumps back to that
instruction
Essentially undoes all of the actions from the function
setup
131Subprograms
- Note that stack memory is not actually scrubbed
between function calls
- No re-initialization (no setting to all zeros)
- Just a simple pointer replacement (esp, ebp)
- That part is simply no longer in use
- Can lead to some tricky debugging situations
132Subprograms
- A second function call is laid right on top of the
previous call
- All its variables line up in the same places
- Nothing seems strange at all!
133Subprograms
- With this approach
- We have a stack of activation records
- Each AR can reference the one below it
- There are instruction pointers nestled below each
function call (recording the line in the calling
function)
- It is exactly this type of information that GDB and Java
exception handling use to let you trace
program execution by viewing the function call stack
134Subprograms
135Subprograms
- Return value
- For simple return values, the compiler chooses a
register and says:
- The callee should store the return value in that
register
- The caller should look in that register when using the
return value
Caller and callee share eax for the return value
136Subprograms
Returning a struct of two ints: 2 registers are
used
137Subprograms
- Sometimes, the overhead of a function call is
more than the cost of the instructions themselves
- Overhead: the instructions that manage the function call
- Saving and resetting state, parameter passing,
return values
- Inline functions
- Can tag functions with the inline keyword
- Prompts the compiler to use the copy rule to replace a
call to the function with the function code itself
- In C++, it is up to the compiler to decide whether or
not it actually inlines (you can suggest,
but it makes the final decision)
138Subprograms
139Left: not inlined. Right: inlined (required g++
-O2)