Title: Garbage Collection Introduction and Overview
1Garbage Collection Introduction and Overview
- Excerpted from presentation by Christian Schulte
- Programming Systems Lab
- Universität des Saarlandes, Germany
- schulte_at_ps.uni-sb.de
2Garbage Collection
is concerned with the automatic reclamation of
dynamically allocated memory after its last use
by a program
3Garbage collection
- Dynamically allocated memory
- Last use by a program
- Examples for automatic reclamation
4Kinds of Memory Allocation
static int i;
void foo(void) {
  int j;
  int *p = (int *) malloc(...);
}
5Static Allocation
- By compiler (in static data area)
- Available through entire runtime
- Fixed size
6Automatic Allocation
- Upon procedure call (on stack)
- Available during execution of call
- Fixed size
7Dynamic Allocation
- Dynamically allocated at runtime (on heap)
- Available until explicitly deallocated
- Dynamically varying size
8Dynamically Allocated Memory
- Also called heap-allocated memory
- Allocation: malloc, new, ...
- before first usage
- Deallocation: free, delete, dispose, ...
- after last usage
- Needed for
- C++, Java: objects
- SML: datatypes, procedures
- anything that outlives procedure call
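A minimal C sketch of the last point, assuming two hypothetical helpers `make_cell` and `use_cell` that are not part of the slides: the automatic variable disappears when the call returns, while the heap cell stays valid until it is explicitly freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns a heap cell that outlives this call; the caller must free it. */
int *make_cell(int value) {
    int local = value;                 /* automatic: gone when make_cell returns */
    int *cell = malloc(sizeof *cell);  /* dynamic: lives until free(cell) */
    if (cell != NULL)
        *cell = local;
    return cell;                       /* returning &local instead would dangle */
}

/* Allocation before first use, deallocation after last use. */
int use_cell(void) {
    int *cell = make_cell(42);
    assert(cell != NULL);              /* sketch only: real code should handle failure */
    int result = *cell;                /* last use */
    free(cell);
    return result;
}
```

Calling `use_cell()` allocates, reads, and frees exactly one cell.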
9Getting it Wrong
- Forget to free (memory leak)
- program eventually runs out of memory
- long-running programs: OSs, servers, ...
- Free too early (dangling pointer)
- lucky: illegal access detected by OS
- horror: memory reused, in simultaneous use
- programs can behave arbitrarily
- crashes might happen much later
- Estimates of effort
- Up to 40%! [Rovner, 1985]
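Both failure modes can be sketched in C. The counter `live_blocks` and the wrappers `alloc_int` and `free_int` are hypothetical helpers, introduced here only to make the leak observable. The dangling access itself is left as a comment, because executing it would be undefined behaviour.

```c
#include <assert.h>
#include <stdlib.h>

static int live_blocks;  /* hypothetical: blocks allocated and not yet freed */

static int *alloc_int(int value) {
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = value;
        live_blocks++;
    }
    return p;
}

static void free_int(int *p) {
    if (p != NULL) {
        free(p);
        live_blocks--;
    }
}

/* Forget to free: the only pointer to the first block is overwritten. */
int leak_demo(void) {
    int *p = alloc_int(1);
    assert(p != NULL);   /* sketch only */
    p = alloc_int(2);    /* the block holding 1 can no longer be freed */
    assert(p != NULL);
    free_int(p);
    return live_blocks;  /* the leaked block is still counted */
}

/* Free too early: q still holds the address after the block is gone. */
void dangling_demo(void) {
    int *p = alloc_int(3);
    int *q = p;          /* second pointer to the same block */
    free_int(p);
    /* Reading *q here would be undefined behaviour. */
    q = NULL;            /* defensive: drop the stale pointer */
    (void)q;
}
```

After one call to `leak_demo()`, one block remains allocated with no pointer left to free it.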
10Nodes and Pointers
- Node n
- Memory block, cell
- Pointer p
- Link to node
- Node access: *p
- Children: children(n)
- set of pointers to nodes referred to by n
11Mutator
- Abstraction of program
- introduces new nodes with pointer
- redirects pointers, creating garbage
12Shared Nodes
- Nodes referred to by several pointers
- Makes manual deallocation hard
- local decision impossible
- respect other pointers to node
- Cycles: an instance of sharing
13Last Use by a Program
- Question: When is node M no longer used by the program?
- Let P be any program not using M
- New program sketch:
- Execute P; Use M
- Hence:
- M used ⇔ P terminates
- We are doomed: halting problem!
- So "last use" is undecidable!
14Safe Approximation
- Decidable and also simple
- What does safe mean?
- only unused nodes freed
- What does approximation mean?
- some unused nodes might not be freed
- Idea
- nodes that can be accessed by mutator
15Reachable Nodes
- Reachable from root set
- processor registers
- static variables
- automatic variables (stack)
- Reachable from reachable nodes
16Summary: Reachable Nodes
- A node n is reachable iff
- n is an element of the root set, or
- n is an element of children(m) and m is reachable
- A reachable node is also called live
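The two rules can be read as a fixed-point computation: start with the roots and keep adding children of reachable nodes until nothing changes. The sketch below assumes a hypothetical node layout with at most two child pointers and a `reachable` flag; none of these names come from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2

/* Hypothetical node layout: fixed number of child pointers, NULL if unused. */
typedef struct node {
    struct node *child[MAX_CHILDREN];
    bool reachable;
} node;

/* Applies the two rules until nothing changes:
   (1) every root is reachable;
   (2) every child of a reachable node is reachable. */
void compute_reachable(node *heap, size_t n_nodes, node **roots, size_t n_roots) {
    assert(n_roots == 0 || roots != NULL);
    for (size_t i = 0; i < n_nodes; i++)
        heap[i].reachable = false;
    for (size_t r = 0; r < n_roots; r++)
        if (roots[r] != NULL)
            roots[r]->reachable = true;
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n_nodes; i++) {
            if (!heap[i].reachable)
                continue;
            for (int c = 0; c < MAX_CHILDREN; c++) {
                node *m = heap[i].child[c];
                if (m != NULL && !m->reachable) {
                    m->reachable = true;
                    changed = true;
                }
            }
        }
    }
}

/* Example graph: root -> a, a <-> b (cycle), c -> a, d isolated.
   Returns the number of reachable nodes. */
int reachable_demo(void) {
    node heap[4] = {0};               /* a, b, c, d */
    node *a = &heap[0], *b = &heap[1], *c = &heap[2];
    a->child[0] = b;
    b->child[0] = a;
    c->child[0] = a;                  /* c points to a, but nothing points to c */
    node *roots[] = { a };
    compute_reachable(heap, 4, roots, 1);
    int count = 0;
    for (int i = 0; i < 4; i++)
        count += heap[i].reachable;
    return count;
}
```

In the example only a and b are live: c points into the live graph but is itself unreachable, and d is isolated.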
17Mark and Sweep
- Compute set of reachable nodes
- Free nodes known to be not reachable
18Reachability: Safe Approximation
- Safe
- access to unreachable node impossible
- depends on language semantics
- but C/C++? later
- Approximation
- reachable node might never be accessed
- programmer must know about this!
- have you been aware of this?
19Example Garbage Collectors
- Mark-Sweep
- Others
- Mark-Compact
- Reference Counting
- Copying
- see Chapters 1 & 2 of [Lins & Jones, 96]
20The Mark-Sweep Collector
- Compute reachable nodes: Mark
- tracing garbage collector
- Free unreachable nodes: Sweep
- Run when out of memory: Allocation
- First used with LISP [McCarthy, 1960]
21Allocation
node new() {
  if (free_pool is empty)
    mark_sweep();
  return allocate();
}
23The Garbage Collector
void mark_sweep() {
  for (r in roots)
    mark(r);
  /* all live nodes marked */
}
25Recursive Marking
void mark(node n) {
  if (!is_marked(n)) {
    set_mark(n);
    for (m in children(n))
      mark(m);
  }
  /* nodes reachable from n marked */
}
- i-th recursion: nodes on path with length i marked
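A runnable C rendering of the recursive marker, as a sketch under assumptions: the node layout (two child slots, one mark bit) and the demo graph are hypothetical. The mark-bit test is what stops the recursion on shared nodes and on cycles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2

/* Hypothetical node layout: child pointers plus a mark bit. */
typedef struct node {
    struct node *child[MAX_CHILDREN];
    bool marked;
} node;

/* Marks n and everything reachable from n. */
void mark(node *n) {
    if (n == NULL || n->marked)
        return;                      /* already visited: stop here */
    n->marked = true;
    for (int i = 0; i < MAX_CHILDREN; i++)
        mark(n->child[i]);
}

/* Graph with sharing and a cycle; returns the number of marked nodes. */
int mark_demo(void) {
    node a = {0}, b = {0}, c = {0}, d = {0};
    a.child[0] = &b;
    a.child[1] = &c;
    b.child[0] = &c;                 /* c is shared by a and b */
    c.child[0] = &a;                 /* cycle a -> c -> a */
    mark(&a);                        /* a plays the role of the only root */
    assert(!d.marked);               /* d is not reachable from a */
    return a.marked + b.marked + c.marked + d.marked;
}
```

The recursion depth can grow with the longest path in the graph, which is why memory for the recursion has to be available.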
28The Garbage Collector
void mark_sweep() {
  for (r in roots)
    mark(r);
  sweep();
  /* all nodes on heap live and not marked */
}
31Eager Sweep
void sweep() {
  node n = heap_bottom;
  while (n < heap_top) {
    if (is_marked(n))
      clear_mark(n);
    else
      free(n);
    n += sizeof(n);
  }
}
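The linear pass can be sketched in C over a small contiguous heap. Everything below is an assumption made for illustration: the `header` layout, sizes counted in header-sized units, the `is_free` flag standing in for a real free pool, and the helper `place_node`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node header; units is the node length in header-sized
   slots, header included, so the next node starts at n + n->units. */
typedef struct header {
    size_t units;
    bool marked;
    bool is_free;
} header;

#define HEAP_UNITS 16
static header heap[HEAP_UNITS];
static header *heap_bottom = heap;
static header *heap_top = heap;      /* first slot past the last node */

/* Bump-allocates a node; returns NULL when the heap is full. */
header *place_node(size_t units, bool marked) {
    if (units == 0 || units > (size_t)((heap + HEAP_UNITS) - heap_top))
        return NULL;
    header *n = heap_top;
    n->units = units;
    n->marked = marked;
    n->is_free = false;
    heap_top += units;
    return n;
}

/* Eager sweep: one pass from heap_bottom to heap_top.
   Returns the number of nodes freed in this pass. */
int sweep(void) {
    int freed = 0;
    header *n = heap_bottom;
    while (n < heap_top) {
        assert(n->units > 0);        /* size must be known to make progress */
        if (n->marked) {
            n->marked = false;       /* live: reset for the next collection */
        } else if (!n->is_free) {
            n->is_free = true;       /* garbage: give back to the allocator */
            freed++;
        }
        n += n->units;               /* step to the next node */
    }
    return freed;
}

/* Three nodes, the middle one unmarked. */
int sweep_demo(void) {
    heap_top = heap_bottom;          /* start from an empty heap */
    header *a = place_node(2, true);
    header *b = place_node(3, false);
    header *c = place_node(1, true);
    assert(a != NULL && b != NULL && c != NULL);
    return sweep();
}
```

The pass visits live and garbage nodes alike, so its cost depends on the heap size rather than on the number of live nodes.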
33The Garbage Collector
void mark_sweep() {
  for (r in roots)
    mark(r);
  sweep();
  if (free_pool is empty)
    abort("Memory exhausted");
}
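Putting allocation, marking, and sweeping together gives a toy collector. This is a sketch under stated assumptions, not the presenter's implementation: nodes have a fixed size, the root set is an explicit array `roots`, the free pool is threaded through the `left` field, and `gc_new` returns NULL instead of aborting so that exhaustion can be observed. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_NODES 4
#define MAX_ROOTS 2

typedef struct node {
    struct node *left, *right;       /* child fields; left also links free nodes */
    bool marked;
} node;

static node heap[HEAP_NODES];
static node *free_pool;              /* singly linked through left */
static node *roots[MAX_ROOTS];       /* hypothetical explicit root set */
static int collections;              /* number of collections run so far */

static void mark(node *n) {
    if (n == NULL || n->marked)
        return;
    n->marked = true;
    mark(n->left);
    mark(n->right);
}

static void sweep(void) {
    free_pool = NULL;                /* rebuild the pool from scratch */
    for (size_t i = 0; i < HEAP_NODES; i++) {
        node *n = &heap[i];
        if (n->marked) {
            n->marked = false;       /* live: clear for the next collection */
        } else {
            n->left = free_pool;     /* garbage: push onto the free pool */
            n->right = NULL;
            free_pool = n;
        }
    }
}

static void mark_sweep(void) {
    for (size_t r = 0; r < MAX_ROOTS; r++)
        mark(roots[r]);
    sweep();
    collections++;
}

void gc_init(void) {
    for (size_t i = 0; i < HEAP_NODES; i++)
        heap[i].marked = false;
    for (size_t r = 0; r < MAX_ROOTS; r++)
        roots[r] = NULL;
    collections = 0;
    sweep();                         /* nothing marked: all nodes become free */
}

/* Allocation; collects only when the free pool is empty. */
node *gc_new(void) {
    if (free_pool == NULL)
        mark_sweep();
    if (free_pool == NULL)
        return NULL;                 /* memory exhausted: every node is live */
    node *n = free_pool;
    free_pool = n->left;
    n->left = n->right = NULL;
    return n;
}

int gc_demo(void) {
    gc_init();
    node *a = gc_new(), *b = gc_new(), *c = gc_new(), *d = gc_new();
    assert(a && b && c && d);
    roots[0] = a;
    a->left = b; b->left = a;        /* live cycle */
    c->left = d; d->left = c;        /* garbage cycle: no root reaches it */
    node *e = gc_new();              /* pool empty: collects c and d */
    if (e == NULL || collections != 1) return -1;
    a->right = e;                    /* keep e live */
    node *f = gc_new();              /* takes the other reclaimed node */
    if (f == NULL) return -2;
    e->left = f;                     /* keep f live */
    /* All four nodes are reachable now, so the next request must fail. */
    return gc_new() == NULL ? collections : -3;
}
```

Note that a freshly allocated node is only safe once a root reaches it: local C variables are not part of `roots` in this sketch, whereas a real collector would also scan the stack and registers.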
34Assumptions
- Nodes can be marked
- Size of nodes known
- Heap contiguous
- Memory for recursion available
- Child fields known!
35Assumptions: Realistic
- Nodes can be marked
- Size of nodes known
- Heap contiguous
- Memory for recursion available
- Child fields known
36Assumptions: Conservative
- Nodes can be marked
- Size of nodes known
- Heap contiguous
- Memory for recursion available
- Child fields known
37Mark-Sweep Properties
- Covers cycles and sharing
- Time depends on
- live nodes (mark)
- live and garbage nodes (sweep)
- Computation must be stopped
- non-interruptible stop/start collector
- long pause
- Nodes remain unchanged (as not moved)
- Heap remains fragmented
38Software Engineering Issues
- Design goal in SE
- decompose systems
- in orthogonal components
- Clashes with letting each component do its own memory management
- liveness is a global property
- leads to local leaks
- lacking power of modern gc methods
39Typical Cost
- Early systems (LISP)
- up to 40% [Steele, 75; Gabriel, 85]
- "garbage collection is expensive" myth
- Well-engineered systems of today
- 10% of entire runtime [Wilson, 94]
40Areas of Usage
- Programming languages and systems
- Java, C#, Smalltalk, ...
- SML, Lisp, Scheme, Prolog, ...
- Perl, Python, PHP, JavaScript
- Modula 3, Microsoft .NET
- Extensions
- C, C++ (Conservative)
- Other systems
- Adobe Photoshop
- Unix filesystem
- Many others in [Wilson, 1996]
41Understanding Garbage Collection Benefits
- Programming garbage collection
- programming systems
- operating systems
- Understand systems with garbage collection (e.g. Java)
- memory requirements of programs
- performance aspects of programs
- interfacing with garbage collection (finalization)
42References
- Garbage Collection. Richard Jones and Rafael Lins, John Wiley & Sons, 1996.
- Uniprocessor garbage collection techniques. Paul R. Wilson, ACM Computing Surveys. To appear.
- Extended version of IWMM 92, St. Malo.