Title: Derivation And Evaluation of Concurrent Collectors
1 Derivation And Evaluation of Concurrent Collectors
Martin T. Vechev
University of Cambridge
David F. Bacon, Perry Cheng and Dave Grove
IBM T.J. Watson Research Center
2 Outline
- Motivation and Benefits
- New Generalizations
- Abstract Algorithms
- Practical Algorithms
- Derivations
- Evaluation
3 Motivation
- Concurrent Collectors
- Difficult to Construct Correctly
- Initial Errors in Dijkstra and Steele Algorithms
- Difficult to Understand
- Difficult to Implement
- No systematic comparisons, largely folklore
4 Contributions
- Generalization Of Existing Mechanisms
- Abstract Collectors Based on Generalizations
- Precise, but inefficient
- New Algorithm
- Derived from the power of generalizations
- Experimental Evaluation Of 4 Concurrent GC
5 Benefits of Generalization
[Figure: the Steele, Dijkstra, Hybrid and Yuasa collectors.]
6 Assumptions
- Single Collector Thread
- Multiple Mutator Threads
- Atomic Write Barrier
- Non-Moving
7 Why Is It Hard?
[Figure: heap objects A, B, C and D with pointer R1. Timeline: GC marks B.]
8 Why Is It Hard?
[Figure: same heap with pointer R2 added. Timeline: GC marks B; Mutator creates R2.]
9 Why Is It Hard?
[Figure: same heap with pointer R1 crossed out. Timeline: GC marks B; Mutator creates R2; Mutator removes R1.]
10 Why Is It Hard?
[Figure: final state of the heap. Timeline: GC marks B; Mutator creates R2; Mutator removes R1; GC reclaims C, D live: WRONG!]
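The race on these slides can be replayed in a few lines. This is only a sketch under assumptions the transcript does not confirm: the heap shape (A points to C through R1, C points to D, R2 goes from B to D) and the function names are hypothetical, chosen to match the slide labels.

```python
def reachable(heap, roots):
    """Objects reachable from the roots right now."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(heap[obj])
    return seen


def lost_objects():
    # Hypothetical heap: A -> C (pointer R1), C -> D, and B with no fields set.
    heap = {"A": ["C"], "B": [], "C": ["D"], "D": []}
    roots = ["A", "B"]
    marked = set()

    # Collector: marks B and scans its (currently empty) fields.
    marked.add("B")
    pending = [c for c in heap["B"] if c not in marked]

    # Mutator runs concurrently: creates R2 (B -> D), then removes R1 (A -> C).
    heap["B"].append("D")
    heap["A"].remove("C")

    # Collector: finishes tracing from the remaining root. B is never rescanned.
    pending.append("A")
    while pending:
        obj = pending.pop()
        if obj not in marked:
            marked.add(obj)
            pending.extend(heap[obj])

    # Anything reachable but unmarked would be reclaimed while still live.
    return reachable(heap, roots) - marked
```

With no write barrier, `lost_objects()` returns the set containing D: the object is still reachable through B, but the collector never sees it.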
11 Wavefront
[Figure: object graph with nodes A to G and pointers R1 and R2, showing the collector's wavefront.]
12 Protection
[Figure: two panels over the same object graph (nodes A to G, pointers R1 and R2, one pointer crossed out). Panel titles: Installation; Deletion. Captions: Remember Crossing Pointer R1; Remember R2.]
13 New Generalizations
- Precise Wavefront
- Shade
- Precise Counting Of Cross Pointers
- Scanned Reference Count (S-RC)
14 Collector Progress (Shade)
[Figure: object graph with nodes A to G and pointer R2; label: SHADE 0.]
15 Collector Progress (Shade)
[Figure: same graph; label: SHADE 1.]
16 Collector Progress (Shade)
[Figure: same graph; label: SHADE 2.]
17 Collector Progress (Shade)
[Figure: same graph; label: SHADE 3.]
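One way to read the SHADE 0 to SHADE 3 labels is as a per-object counter of how many fields the collector has already scanned. The sketch below encodes that reading only; the class `Obj` and the helper names are hypothetical, and the transcript does not confirm that this is the exact definition used in the talk.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    fields: list      # outgoing references (Obj or None)
    shade: int = 0    # assumed meaning: number of fields scanned so far


def scan_next_field(obj):
    """Advance the wavefront through one field of obj and return its referent."""
    target = obj.fields[obj.shade]
    obj.shade += 1
    return target


def behind_wavefront(obj, index):
    """True if field `index` of obj has already been scanned."""
    return index < obj.shade


def tricolor(obj, reached):
    """Collapse the precise shade into the classic three colors."""
    if not reached:
        return "white"
    return "black" if obj.shade == len(obj.fields) else "gray"
```

Under this reading the tri-color abstraction is a lossy view of Shade: every partially scanned object is gray, regardless of how far the collector has progressed through it.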
18 Shade Observations
- Computed by Collector
- Generalization of the tri-color abstraction
- Different Granularities
- Different Objects
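19 Scanned Reference Count (S-RC)
[Figure: object graph with nodes A to G; label: S-RC 0.]
20 Scanned Reference Count (S-RC)
[Figure: same graph; label: S-RC 1.]
21 Scanned Reference Count (S-RC)
[Figure: same graph; label: S-RC 2.]
22 Scanned Reference Count (S-RC)
[Figure: same graph with one pointer crossed out; label: S-RC 1.]
23 Scanned Reference Count (S-RC)
[Figure: same graph with one pointer crossed out; label: S-RC 0.]
24 Scanned Reference Count (S-RC)
[Figure: same graph; label: S-RC 0.]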
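The 0, 1, 2, 1, 0 sequence on these slides can be reproduced with a small model. It assumes that S-RC counts references to an object held in fields the collector has already scanned, and that only mutator writes update it. The class, the attribute names (`shade`, `src`) and the `write` helper are hypothetical, not taken from the talk.

```python
class Obj:
    def __init__(self, name, nfields):
        self.name = name
        self.fields = [None] * nfields
        self.shade = 0   # fields [0, shade) are assumed to be behind the wavefront
        self.src = 0     # scanned reference count


def write(obj, index, new):
    """Mutator store that keeps S-RC precise for scanned fields only."""
    old = obj.fields[index]
    if index < obj.shade:            # the field has already been scanned
        if old is not None:
            old.src -= 1             # a scanned reference to `old` disappears
        if new is not None:
            new.src += 1             # a scanned reference to `new` appears
    obj.fields[index] = new


def demo():
    a, b, d, e = Obj("A", 0), Obj("B", 1), Obj("D", 1), Obj("E", 1)
    b.shade = d.shade = 1            # B and D fully scanned; E not scanned yet
    trace = []
    for src_obj, target in [(b, a), (d, a), (b, None), (d, None), (e, a)]:
        write(src_obj, 0, target)
        trace.append(a.src)
    return trace
```

`demo()` yields `[1, 2, 1, 0, 0]`: the final store goes into an unscanned field of E, so it leaves the count unchanged.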
25 Outline
- Motivation and Benefits
- New Generalizations
- Abstract Algorithms
- Practical Algorithms
- Derivations
- Evaluation
26 Abstract Algorithms
- Utilize Shade and S-RC
- Installation-Based and Deletion-Based
- Mutator nominates candidates
- Does not mark objects
27 Concurrent System Structure
COLLECT():
  do
    mark()
    processNominated()
  while (!finished)

MUTATE(obj, field, target):
  obj.field = target
  nominate(target)
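The COLLECT and MUTATE structure can be exercised with a single-threaded simulation. This is a simplified sketch, not the abstract algorithm itself: it treats every nominated object as live rather than consulting S-RC, and the `mutator_rounds` parameter (callables run between mark rounds to imitate concurrency) is an invention of this example.

```python
def mutate(heap, nominated, obj, field, target):
    """MUTATE: perform the store, then nominate the target. No marking here."""
    heap[obj][field] = target
    if target is not None:
        nominated.append(target)


def collect(heap, roots, nominated, mutator_rounds=()):
    """COLLECT: repeat mark() and processNominated() until nothing is left."""
    marked, pending = set(), list(roots)
    rounds = iter(mutator_rounds)
    while True:
        # mark(): trace everything currently pending.
        while pending:
            obj = pending.pop()
            if obj not in marked:
                marked.add(obj)
                pending.extend(t for t in heap[obj].values() if t is not None)
        # Simulated interleaving: let the mutator run one round, if any remain.
        step = next(rounds, None)
        if step is not None:
            step()
        # processNominated(): unmarked candidates become new tracing work.
        while nominated:
            candidate = nominated.pop()
            if candidate not in marked:
                pending.append(candidate)
        if not pending and step is None:   # finished
            return marked
```

For example, with `heap = {"A": {"f": None}, "E": {"f": None}}`, root `A`, and one mutator round that stores `E` into `A.f`, the collector ends with both `A` and `E` marked even though `E` was installed after `A` had been scanned.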
28 Mutator Nominates (Installation)
[Figure: object graph with nodes A to G; label: S-RC 0; nominated object buffer: empty.]
29 Mutator Nominates (Installation)
[Figure: same graph; label: S-RC 1; nominated object buffer: A.]
30 Mutator Nominates (Installation)
[Figure: same graph; labels: S-RC 1 and S-RC 1; nominated object buffer: C, A.]
31 Mutator Nominates (Installation)
[Figure: same graph with one pointer crossed out; labels: S-RC 1 and S-RC 0; nominated object buffer: C, A.]
32 After Mark (Installation)
COLLECT():
  do
    mark()
    processNominated()
  while (!finished)
[Figure: object graph with nodes A to G; labels: S-RC 1 and S-RC 0; nominated object buffer: C, A.]
33 After Find (Installation)
COLLECT():
  do
    mark()
    processNominated()
  while (!finished)
[Figure: object graph with nodes A to G; labels: S-RC 1 and S-RC 0; nominated object buffer: C, A.]
34 Allocation
- In Installation-Based Collectors
- No difference
- In Deletion-Based Collectors
- Remembered Upon Allocation
35 Allocation
[Figure: object graph with nodes A to G; nominated object buffer: empty.]
36 Allocation
[Figure: same graph with a new node N; label: S-RC 1; nominated object buffer: N.]
37 Outline
- Motivation and Benefits
- New Generalizations
- Abstract Algorithms
- Practical Algorithms
- Derivations
- Evaluation
38 Practical Algorithms
- Stacks
- Non-Barriered Region
- Scanned Object behind wavefront
- S-RC affected
- Stack rescanning
- S-RC and Shade compression (tri-color)
- Reachability Effect
39 Compressing S-RC (sticky bit)
[Figure: object graph with nodes A to G; label: S-RC 0.]
40 Compressing S-RC (sticky bit)
[Figure: same graph; label: S-RC 1.]
41 Compressing S-RC (sticky bit)
[Figure: same graph with one pointer crossed out; label: S-RC 1.]
42 Compressing S-RC (sticky bit)
[Figure: same graph; label: S-RC 1.]
Object A Unreachable but kept Alive
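The loss of precision from the sticky bit can be shown by replaying the same barrier events against a full counter and a one-bit counter. This is a minimal sketch; the event names `"install"` and `"delete"` and both function names are made up for illustration.

```python
def precise_src(events):
    """Full scanned reference count: installs increment, deletions decrement."""
    count = 0
    for event in events:
        count += 1 if event == "install" else -1
    return count


def sticky_src(events):
    """S-RC compressed to one sticky bit: set on install, never cleared."""
    bit = 0
    for event in events:
        if event == "install":
            bit = 1   # saturates; later deletions cannot be recorded
    return bit
```

For the sequence install then delete, `precise_src` returns 0 while `sticky_src` stays at 1, which corresponds to the slide's outcome: the object is unreachable but kept alive until a later collection cycle.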
43 Compressing Shade
[Figure: object graph with nodes A to G; label: SHADE 0.]
44 Collector Progress (Shade)
[Figure: same graph; label: SHADE 3.]
45 Collector Progress (Shade)
[Figure: same graph with pointer R1; labels: S-RC 1; SHADE 3, PRECISE 1.]
46 Collector Progress (Shade)
[Figure: same graph with pointer R1 crossed out; labels: S-RC 1 (NOT decremented); SHADE 3, PRECISE 1.]
47 Collector Progress (Shade)
[Figure: same graph; labels: SHADE 3, PRECISE 1.]
Object C Unreachable but kept Alive
48 Outline
- Motivation and Benefits
- New Generalizations
- Abstract Algorithms
- Practical Algorithms
- Derivations
- Evaluation
49 Deriving Dijkstra
INSTALLATION-BASED GC
Stack Regions
RESCANNED STACKS
Compress S-RC to sticky bit for ALL objects
INSTALLATION with 1-bit
Compress Shade to 1-bit for ALL objects
DIJKSTRA
50 Deriving Yuasa
DELETION-BASED GC
Stack Regions
RESCANNED STACKS
Compress S-RC to 0-bit for ALL objects
DELETION with 0-bits
NO S-RC needed => NO rescanning
Deletion with NO rescanning
Compress Shade to 1-bit for ALL objects
YUASA
51 Deriving a New Collector (Hybrid)
DELETION-BASED GC
Stack Regions
RESCANNED STACKS
Compress S-RC to sticky bit for Allocated objects
Compress S-RC to 0-bit for Existing objects
MIXED DELETION
Compress Shade to 1-bit for ALL objects
HYBRID
52 Algorithms
[Figure: the Steele, Dijkstra, Hybrid and Yuasa collectors.]
53 Evaluation
- First Systematic Comparison Of Concurrent Collectors
- IBM J9 Production Virtual Machine
- J2ME Profile, microJit
- 512MB RAM, Pentium 4, 3GHz
- Comparison in terms of Execution Time And Space Overhead
- Dijkstra, Steele, Yuasa, Hybrid
- Which benchmarks
- SpecJVM98 (-s100)
- Work-Based Incremental Scheme
- Collect 9K for every 6K allocated.
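A work-based incremental scheme ties collector progress to allocation. The sketch below shows one plausible form of the 9K-per-6K pacing. It assumes "K" means kilobytes; the class name `WorkBasedPacer` and the `on_allocate` hook are hypothetical and do not describe the J9 implementation.

```python
ALLOC_QUANTUM = 6 * 1024   # bytes allocated per increment (assuming K = KB)
TRACE_QUANTUM = 9 * 1024   # bytes of collection work owed per increment


class WorkBasedPacer:
    def __init__(self):
        self.allocated_since_increment = 0

    def on_allocate(self, nbytes):
        """Return how many bytes of collection work this allocation triggers."""
        self.allocated_since_increment += nbytes
        quanta, self.allocated_since_increment = divmod(
            self.allocated_since_increment, ALLOC_QUANTUM
        )
        return quanta * TRACE_QUANTUM
```

Allocating 4096 bytes twice triggers nothing on the first call and 9216 bytes of collection work on the second, with 2048 bytes carried over toward the next increment.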
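54 Maximum Space Usage
[Chart: maximum space usage for javac, mtrt, jack, db and jess, plus the geometric mean.]
55 Execution Time
[Chart: execution time for javac, mtrt, jack, db and jess, plus the geometric mean.]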
56 Summary
- Generalization Of Existing Mechanisms
- S-RC, Shade
- Abstract Collectors Based on Generalizations
- Precise, but inefficient (S-RC, Shade)
- New Concrete Algorithm
- Combines good properties of Yuasa and Dijkstra
- Suitable for Real-Time Domains
- Experimental Evaluation
57 On-Going Work
- More Transformations
- Formal Proof Of Correctness
- Transformations
- Unified Abstract Collector
- Formal Relation Between Algorithms
59 IF TIME PERMITS SLIDES
62 Abstract Object Layout
[Figure: object layout with the fields Barrier Reference Count, Don't Sweep, Recorded, Marked, Shade and DATA; annotations: "Computed By Mutator" and "Computed By Collector".]
63 The Transitive Loss
[Figure: four snapshots over time of heap objects A, B, C and D with pointers P1 and P2; captions: "Thread Working" (twice) and "Collector Working" (twice).]
- C was not seen
- Reclaims C is OK
- Reclaims D live!
- Installs P2
- Marks B as live
- Deletes P1
- C becomes unreachable
64 Common Structure Example
YUASA
WriteBarrier(Obj, field, New):
  if (Phase == Tracing):
    Old = Obj[field]
    Remember(Old)
  Obj[field] = New

DIJKSTRA
WriteBarrier(Obj, field, New):
  if (Phase == Tracing):
    Remember(New)
  Obj[field] = New
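The two barriers can be compared on the same interleaving that loses an object without protection. This is a sketch under assumptions: the heap shape (A points to C, C points to D), the single field name `f`, and the helpers `write_barrier`, `trace` and `run` are hypothetical, and the tracing phase is taken to be active throughout.

```python
def write_barrier(kind, heap, remembered, obj, field, new):
    """Store `new` into obj.field, remembering a pointer as the barrier dictates."""
    if kind == "yuasa":              # deletion barrier: remember the old value
        old = heap[obj][field]
        if old is not None:
            remembered.append(old)
    elif kind == "dijkstra":         # installation barrier: remember the new value
        if new is not None:
            remembered.append(new)
    heap[obj][field] = new


def trace(heap, pending, marked):
    """Mark everything reachable from the pending list."""
    while pending:
        obj = pending.pop()
        if obj not in marked:
            marked.add(obj)
            pending.extend(t for t in heap[obj].values() if t is not None)


def run(kind):
    heap = {"A": {"f": "C"}, "B": {"f": None}, "C": {"f": "D"}, "D": {"f": None}}
    marked, remembered = set(), []
    trace(heap, ["B"], marked)                            # collector scans B first
    write_barrier(kind, heap, remembered, "B", "f", "D")  # mutator installs B -> D
    write_barrier(kind, heap, remembered, "A", "f", None) # mutator deletes A -> C
    trace(heap, ["A"] + remembered, marked)               # collector finishes
    return marked
```

In this scenario `run("dijkstra")` marks A, B and D, while `run("yuasa")` marks A, B, C and D. Both keep the live object D; the deletion barrier additionally retains the now-unreachable C until the next cycle.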