Title: Comparison of JVM Phases on Data Cache Performance
1 Comparison of JVM Phases on Data Cache Performance
- Shiwen Hu and Lizy K. John
- Laboratory for Computer Architecture
- The University of Texas at Austin
2 Motivation
- Execution of Java programs consists of distinct JVM phases:
- JIT compilation
- Garbage collection
- Execution
- Efficient execution of Java programs necessitates a comparative study of the requirements and characteristics of JVM phases
3 Outline
- Experimental methodology
- Varying cache metrics
- Cache size, set associativity, block size
- Decomposition by miss types
- Time varying cache behavior
- Conclusion
4 Methodology
- LaTTe JVM
- An open-source, state-of-the-art JVM
- Memory management of LaTTe JIT compiler
- Reusable initial stack: 50KB
- Allocate dynamic stacks when necessary (recyclable)
- Heap management
- Large object area: indexed by a hash table
- Small object area: heads indicating object sizes
5 Methodology (Cont.)
- Experimental workloads: SPECjvm98 benchmarks
- Using s10 data set
- Cache simulator
- Based on Cachesim5 from Sun's Shade V6 tool suite
- A JVM phase-aware cache simulator
- Default configuration
- 64KB, 32B blocks, 4-way set associative
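To make the default configuration concrete, here is a minimal sketch of a phase-aware cache simulator. It is not the Cachesim5-based tool used in the study. The class name, the LRU replacement policy, the phase labels, and the sample trace are all assumptions made for illustration. With 64KB capacity, 32B blocks, and 4 ways, the cache has 64KB / (32B x 4) = 512 sets.

```python
from collections import OrderedDict, defaultdict


class PhaseAwareCache:
    """Set-associative data cache with per-phase reference and miss counters.

    LRU replacement is an assumption; the slides do not state the policy.
    """

    def __init__(self, size_bytes=64 * 1024, block_bytes=32, ways=4):
        self.block_bytes = block_bytes
        self.ways = ways
        self.num_sets = size_bytes // (block_bytes * ways)
        # One OrderedDict per set: keys are tags, insertion order tracks recency.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.refs = defaultdict(int)
        self.misses = defaultdict(int)

    def access(self, address, phase):
        """Simulate one data reference tagged with a JVM phase; True on a hit."""
        block = address // self.block_bytes
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        self.refs[phase] += 1
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            return True
        self.misses[phase] += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = True
        return False

    def miss_rate(self, phase):
        refs = self.refs[phase]
        return self.misses[phase] / refs if refs else 0.0


cache = PhaseAwareCache()
# Hypothetical trace of (address, phase) pairs; labels are illustrative only.
trace = [(0x1000, "jit"), (0x1004, "jit"), (0x1020, "gc"), (0x1000, "exec")]
for address, phase in trace:
    cache.access(address, phase)

for phase in ("jit", "gc", "exec"):
    print(phase, cache.miss_rate(phase))
```

Because all phases share one cache state, a block brought in during one phase can produce a hit in another, which is the effect a phase-aware simulator needs to capture.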
6 Breakdown of JVM phases
- JIT compilation and execution phases dominate
- In terms of instruction counts, data references, and data misses
- Garbage collector has the highest miss rates
- Large working set (heap) and pointer chasing
- But it rarely affects overall cache performance
7 Breakdown of JVM phases (Cont.)
8 Varying cache size
- Increasing cache size is more effective on JIT compilation than on garbage collection
- Larger working set of garbage collector
- Pointer-chasing access pattern of garbage collector
- Stacks of most JIT compilations can be held in a 128KB cache
- Varying effect on execution phase
- More effective on mpegaudio than on db
9 Varying cache size (Cont.)
10 Varying set associativity
- Increasing set associativity rarely affects JIT compilation and garbage collection
- Negligible conflict misses due to uniform accesses to heap or stacks
- Dominated by capacity misses
- Short lives of JIT objects
- Varying effectiveness on execution phase
- mtrt: 52% of misses eliminated
- db and javac: 13% of misses eliminated
11 Varying set associativity (Cont.)
12 Varying block size
- Increasing block size is effective on JIT compilation and garbage collection
- JIT compilation: good spatial locality due to stack initialization
- Garbage collection: good spatial locality during sweep phase
- Varying effectiveness on execution phase
- Favor larger blocks: db, jess, mpegaudio, and mtrt
- Favor smaller blocks: compress, jack, javac
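The link between spatial locality and block size can be illustrated with a small sketch. It assumes a fully associative LRU cache of 64KB and two synthetic access patterns that are not taken from the study: a sequential sweep over a 1MB region (loosely resembling stack initialization or a heap sweep) and a shuffled walk over 64-byte nodes (loosely resembling pointer chasing). Function and variable names are hypothetical.

```python
import random
from collections import OrderedDict


def miss_rate(addresses, cache_bytes, block_bytes):
    """Miss rate of a fully associative LRU cache (an assumed simplification)."""
    capacity = cache_bytes // block_bytes
    lines = OrderedDict()
    misses = 0
    total = 0
    for address in addresses:
        total += 1
        block = address // block_bytes
        if block in lines:
            lines.move_to_end(block)
        else:
            misses += 1
            if len(lines) >= capacity:
                lines.popitem(last=False)  # evict the least recently used block
            lines[block] = True
    return misses / total


REGION = 1 << 20   # 1MB synthetic region
CACHE = 64 * 1024  # 64KB cache

sweep = range(0, REGION, 4)         # sequential 4-byte accesses
nodes = list(range(0, REGION, 64))  # one access per 64-byte node
random.Random(0).shuffle(nodes)     # fixed seed for a repeatable walk

results = {}
for block_bytes in (32, 64, 128):
    results[block_bytes] = (
        miss_rate(sweep, CACHE, block_bytes),
        miss_rate(nodes, CACHE, block_bytes),
    )
    print(block_bytes, results[block_bytes])
```

In this sketch the sweep's miss rate is exactly 4 / block size, so it halves each time the block size doubles, while the shuffled walk gains little because neighboring bytes in a block are rarely used before eviction.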
13 Varying block size (Cont.)
14 Decomposition by miss types - JIT
- Capacity misses dominate
- Fewer compulsory misses
- Reusable initial stack
- Overlapped dynamic stacks
- Negligible conflict misses
- Splitting cache rarely affects miss type composition
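The decomposition into compulsory, capacity, and conflict misses can be sketched with the common textbook procedure; the slides do not say how the study's simulator classified misses, so this is an assumption. Compulsory misses are first references to a block, capacity misses are the remaining misses of a fully associative LRU cache of the same size, and conflict misses are the extra misses of the set-associative cache. The function names and the tiny example trace are hypothetical.

```python
from collections import OrderedDict


def lru_misses(blocks, num_sets, ways):
    """Count misses of an LRU cache with the given number of sets and ways."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in blocks:
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used block
            lines[block] = True
    return misses


def classify_misses(addresses, size_bytes=64 * 1024, block_bytes=32, ways=4):
    """Split misses into compulsory, capacity, and conflict counts."""
    blocks = [address // block_bytes for address in addresses]
    num_blocks = size_bytes // block_bytes
    total = lru_misses(blocks, num_blocks // ways, ways)
    fully_associative = lru_misses(blocks, 1, num_blocks)
    compulsory = len(set(blocks))  # each distinct block misses once
    return {
        "compulsory": compulsory,
        "capacity": fully_associative - compulsory,
        "conflict": total - fully_associative,
    }


# Tiny direct-mapped cache (4 blocks of 32B); addresses 0 and 128 share a set.
result = classify_misses([0, 128, 0, 128], size_bytes=128, block_bytes=32, ways=1)
print(result)
```

Note that the conflict count is an aggregate difference between two simulations, so for some traces it can come out negative; it is an accounting convention rather than a per-access label.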
15 Decomposition by miss types - GC
- Fewest compulsory misses in unified cache
- Cache blocks already accessed during execution phase
- More compulsory misses in split cache
- Uniform heap sweeping
16 Decomposition by miss types - EXEC
- Relatively more compulsory misses
- Heap object allocation and initialization
- Variety reveals program characteristics
- Splitting cache rarely affects miss type composition
17 Time varying behavior
- Importance of separating JVM activities from application activities
- Java programs execute on JVMs, unlike C/C++ programs
- Correlating performance results with JVM or application characteristics is important for designing better JVMs
18 Time varying behavior (Cont.)
- JVM-specific operations dominate the startup and end of application execution
- Class loading, method compilation
- Few garbage collections
- Corresponding to bursts of cache misses
- Four passes of JIT compilation correspond to four bursts of cache misses
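One way to expose such bursts is to bucket miss events into fixed windows of executed instructions and flag windows well above the average. The sketch below assumes each miss is recorded with the dynamic instruction count at which it occurred; the function names, the window length, the threshold factor, and the sample data are hypothetical and not taken from the study.

```python
def misses_per_interval(miss_instruction_counts, interval):
    """Count misses in consecutive windows of `interval` instructions."""
    if not miss_instruction_counts:
        return []
    buckets = [0] * (max(miss_instruction_counts) // interval + 1)
    for count in miss_instruction_counts:
        buckets[count // interval] += 1
    return buckets


def find_bursts(series, factor=1.5):
    """Indices of windows whose miss count exceeds `factor` times the mean."""
    if not series:
        return []
    mean = sum(series) / len(series)
    return [i for i, value in enumerate(series) if value > factor * mean]


# Hypothetical miss events, given as instruction counts at the time of each miss.
events = [5, 7, 8, 9, 150, 310, 312, 315, 318, 399]
series = misses_per_interval(events, interval=100)
bursts = find_bursts(series)
print(series, bursts)
```

Tagging each window with the dominant JVM phase would then allow bursts to be matched against JIT compilation passes or garbage collections.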
19 Time varying behavior - compress
- Fewer GC and JIT activities
- Cyclic behavior
- Two phases during execution
20 Time varying behavior - mtrt
- More GC and JIT activities
- No cyclic behavior
21 Time varying behavior - startup
- Identical behavior during startup
- First 110 million instructions
- Sharing of harness classes among SPECjvm98 benchmarks prolongs the duration
22 Conclusion
- Comparative study of cache performance of distinct JVM phases
- Deterministic characteristics of cache behavior
- JIT compilation: traversing intermediate data structures
- Garbage collection: large working set and pointer chasing
23 Conclusion (Cont.)
- Near-identical cache performance of JIT compilation among applications
- Varying cache behavior during execution phase reveals characteristics of applications
24 Thanks