Title: Java Virtual Machine and Jikes RVM
1Java Virtual Machine and Jikes RVM
- Topics
- Virtual Machine
- Performance
- Jikes RVM
2Virtual machine (Abstract computer)
- JVM (Java Virtual Machine)
- Collection of libraries that runs Java code
- CLR (Common Language Runtime)
- Similar to JVM but more target languages
- VMM (Virtual Machine Monitor)
- PVM (Parallel Virtual Machine)
- Makes a network of computers look like a single parallel machine
3Background
- JVM in Fig1 (layered abstraction)
- Bytecode in Fig 2 (machine language of JVM)
- A bytecode stream is a sequence of instructions; each has a 1-byte opcode and zero or more operands
- Mnemonic assembly language in JVM
Fig 1
Fig 2
4Bytecode example
Method count()
   0 aload_0
   1 invokespecial 1
   4 return
Method void main(java.lang.String[])
   0 iconst_2
   1 istore_1
   2 iconst_0
   3 istore_2
   4 iconst_0
   5 istore_3
   6 goto 16
   9 iload_1
  10 iload_3
  11 iadd
  12 istore_2
  13 iinc 3 1
  16 iload_3
  17 iconst_5
  18 if_icmplt 9
  21 getstatic 2
  24 iload_2
  25 invokevirtual 3 println(int)
  28 return
Java source:
class count {
  public static void main(String[] s) {
    int x = 2, y = 0, i;
    for (i = 0; i < 5; i++)
      y = x + i;
    System.out.println(y);
  }
}
5JVM Facts
- Stack machine
- Object oriented
- Method invocation (static, virtual, interface)
- Exceptions
- Type checking
- Multithreading
- Memory management
- Class loading
6Virtual Hardware
- Four basic parts
- registers, stack, garbage-collected heap, method area
- 32-bit addresses, 4 GB of addressable memory; an integer is 32 bits long
- Stack machine (vs. register machine)
- Registers: pc, optop, frame, and vars
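To make the stack-machine model concrete, here is a minimal sketch of an interpreter loop with a pc, an operand stack, and a local-variable array (vars). The class name, the `Op` enum, and the `Insn` helper are hypothetical names made up for this illustration; branch targets are instruction indices rather than byte offsets, and only the few integer instructions used by the loop on slide 4 are modelled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy interpreter for a handful of JVM-style integer instructions (illustrative only). */
final class StackMachineSketch {
    // Hypothetical opcode set for this sketch; not the real JVM opcode numbering.
    enum Op { ICONST, ILOAD, ISTORE, IADD, IINC, GOTO, IF_ICMPLT, RETURN }

    /** One instruction: an opcode plus up to two integer operands. */
    static final class Insn {
        final Op op; final int a; final int b;
        Insn(Op op, int a, int b) { this.op = op; this.a = a; this.b = b; }
    }

    static Insn insn(Op op, int... args) {
        return new Insn(op, args.length > 0 ? args[0] : 0, args.length > 1 ? args[1] : 0);
    }

    /** Runs the code and returns the local-variable array when RETURN is reached. */
    static int[] run(Insn[] code, int numLocals) {
        int[] vars = new int[numLocals];
        Deque<Integer> operands = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            Insn in = code[pc];
            int next = pc + 1;
            switch (in.op) {
                case ICONST: operands.push(in.a); break;
                case ILOAD:  operands.push(vars[in.a]); break;
                case ISTORE: vars[in.a] = operands.pop(); break;
                case IADD:   operands.push(operands.pop() + operands.pop()); break;
                case IINC:   vars[in.a] += in.b; break;
                case GOTO:   next = in.a; break;
                case IF_ICMPLT: {
                    int right = operands.pop();
                    int left = operands.pop();
                    if (left < right) next = in.a;
                    break;
                }
                case RETURN: return vars;
            }
            pc = next;
        }
    }

    /** The loop from slide 4: x = 2 (var 1), y = 0 (var 2), i = 0 (var 3); y = x + i while i < 5. */
    static Insn[] slideFourLoop() {
        return new Insn[] {
            insn(Op.ICONST, 2), insn(Op.ISTORE, 1),
            insn(Op.ICONST, 0), insn(Op.ISTORE, 2),
            insn(Op.ICONST, 0), insn(Op.ISTORE, 3),
            insn(Op.GOTO, 12),
            insn(Op.ILOAD, 1), insn(Op.ILOAD, 3), insn(Op.IADD), insn(Op.ISTORE, 2),
            insn(Op.IINC, 3, 1),
            insn(Op.ILOAD, 3), insn(Op.ICONST, 5), insn(Op.IF_ICMPLT, 7),
            insn(Op.RETURN)
        };
    }

    public static void main(String[] args) {
        int[] vars = run(slideFourLoop(), 4);
        System.out.println("y = " + vars[2]); // prints "y = 6"
    }
}
```

The frame and optop registers are implicit here: the `Deque` plays the role of the operand stack, and a single `vars` array stands in for one frame.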
7JVM performance
- Do not expect Java compiler/Jikes to perform
many clever optimization due to Javas rather
strict sequencing and thread semantics, there is
a little the compiler can safely do, in contrast
to compilers for less strictly defined languages
such C/Fortran.
11JVM Runtime state
- Method area
- like .text, store bytecode (pc)
- Stack
- stack frame storing params and result
- 3 sections: local vars, execution env, operand stack
- Heap
- objects created dynamically by the new operator; the runtime env tracks them
13JVM Instruction-set family
- Load and Store
- load, store, push
- Arithmetic and Logic
- add, sub, mul, div, rem, neg, shl, shr, or, xor
- Conversions
- i2l, i2f, i2b, i2c, i2s
- Objects
- new, newarray, getfield, putfield, aload, astore, instanceof
- Stack management
- pop, dup, dup_x1, swap
14- Control transfer
- if_icmpeq, ifeq, iflt, ifnull, tableswitch, cmp, goto, jsr, ret, athrow
- Method invocation
- invokevirtual, invokeinterface, invokespecial, invokestatic
- Multiple encodings
- iconst_m1, iconst_0, bipush, sipush, ldc
- Constant pool
- Utf8, Integer, Float, Class, NameAndType, Fieldref, Methodref, InterfaceMethodref
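The "multiple encodings" bullet can be illustrated by how an int constant might be pushed with the shortest available instruction. This is a sketch: the class and method names are hypothetical, and the `ldc` case prints the value itself for readability, whereas the real instruction carries a constant-pool index.

```java
/** Picks the shortest JVM instruction for pushing an int constant (illustrative only). */
final class ConstEncodingSketch {
    /** Returns a mnemonic form of the chosen instruction. */
    static String encode(int value) {
        if (value == -1) return "iconst_m1";                      // 1 byte, value implied by opcode
        if (value >= 0 && value <= 5) return "iconst_" + value;   // 1 byte, value implied by opcode
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush " + value;                             // opcode + 1-byte immediate
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush " + value;                             // opcode + 2-byte immediate
        }
        // Assumption for readability: a real ldc operand is a constant-pool index, not the value.
        return "ldc " + value;
    }

    public static void main(String[] args) {
        for (int v : new int[] { -1, 5, 6, 200, 100000 }) {
            System.out.println(v + " -> " + encode(v));
        }
    }
}
```

Small constants get dedicated one-byte opcodes because they are so common; larger values fall back to immediates and finally to the constant pool.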
15Jikes RVM Introduction
- Formerly the Jalapeño project at IBM
- Based on JIT (just-in-time) compilation
- Not a full VM; meant for research
- Unrelated to the Jikes source-to-bytecode compiler
- Implemented in Java (203k Java, Feb 02) with GNU Classpath
- Runs on AIX/PowerPC and Linux/IA32
16Jikes RVM subsystem
- Two compilers (baseline and optimizing) and the Adaptive Optimization System (AOS)
- Major Java features
- Bootstrapping, method invocation, method dispatch, interface method invocation, exceptions, dynamic type checking, memory management, garbage collection, threads, class loading, etc.
17Baseline Compiler
- Based on Just-In-Time compiler
- Two main components
- Generate GC map for safe point
- Generate executable (209 switch statement)
- Generate machine code
- Emulate stack machine
- Compiles faster, but the generated code is much slower (5x) than the optimizing compiler's
18Optimizing compiler
- Translates code into an IR (intermediate representation) and performs optimizations
- Three levels of IR, multiple optimization levels (many classical, some novel), 100K
- Template driven, instruction selection, IA32 assembler
- Three groups of optimizations: local (basic block), global (method level), and inter-procedural (across methods)
- Interface to the adaptive optimization system
19Transformation
- Four transformations
- - HIR (High IR)
- - LIR (Low IR)
- - MIR (Machine IR)
- - Final assembly to machine code
- Transformation
- - Preserve original meaning
- - Measurable speedup
- - Worth effort
20Bytecode to IRs
21- High level IR (HIR)
- - Architecture independent; operators resemble the bytecodes
- - Uses a register transfer language (instead of the JVM stack abstraction)
- - Array bound check
- - Inline of bytecode subroutine
- - Devirtualization
- Low level IR (LIR)
- - Arch independent; complex HIR operators (new, call virtual) are expanded
- Machine level IR (MIR)
- - Arch specific, defining MIR file (29 format)
- - Inst selection, register allocation
- - One-to-one mapping between operators and target ISA
22- Common subexpression
- source (inside a for loop over i): x = a[i]; a[i] = a[j]
- before: t6 = 4*i; x = a[t6]; t7 = 4*i; a[t7] = t9
- after: t6 = 4*i; x = a[t6]; a[t6] = t9
- Dead-code
- if (debug) print
- Loop optimization (induction variables)
- while (i
- while (i
-
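The common-subexpression idea on this slide can be sketched as a pass over three-address code within one basic block. This is a simplified illustration, not the Jikes RVM implementation: the `Quad` class and the `"copy"` operator are hypothetical, only binary arithmetic is modelled (no array accesses), and a recomputed expression is replaced by a copy from the variable that already holds it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Local common-subexpression elimination over one basic block (illustrative only). */
final class LocalCseSketch {
    /** Three-address instruction: dest = lhs op rhs, or dest = lhs when op is "copy". */
    static final class Quad {
        final String dest, op, lhs, rhs;
        Quad(String dest, String op, String lhs, String rhs) {
            this.dest = dest; this.op = op; this.lhs = lhs; this.rhs = rhs;
        }
        @Override public String toString() {
            return op.equals("copy") ? dest + " = " + lhs : dest + " = " + lhs + " " + op + " " + rhs;
        }
    }

    /** True if the expression key "lhs op rhs" uses the given variable as an operand. */
    private static boolean mentions(String key, String var) {
        String[] parts = key.split(" ");
        return parts[0].equals(var) || parts[2].equals(var);
    }

    static List<Quad> eliminate(List<Quad> block) {
        Map<String, String> available = new HashMap<>(); // expression -> variable holding it
        List<Quad> out = new ArrayList<>();
        for (Quad q : block) {
            Quad emitted = q;
            String key = null;
            if (!q.op.equals("copy")) {
                key = q.lhs + " " + q.op + " " + q.rhs;
                String holder = available.get(key);
                if (holder != null) emitted = new Quad(q.dest, "copy", holder, null);
            }
            out.add(emitted);
            // Assigning to dest kills every expression that reads dest or is held in dest.
            String dest = q.dest;
            available.entrySet().removeIf(e -> e.getValue().equals(dest) || mentions(e.getKey(), dest));
            // Record the new expression unless it reads its own destination (e.g. i = i + 1).
            if (key != null && emitted == q && !q.lhs.equals(dest) && !q.rhs.equals(dest)) {
                available.put(key, dest);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Quad> block = new ArrayList<>();
        block.add(new Quad("t6", "*", "4", "i"));
        block.add(new Quad("t7", "*", "4", "i"));   // same expression as t6
        block.add(new Quad("t8", "+", "t6", "t7"));
        for (Quad q : eliminate(block)) System.out.println(q);
    }
}
```

A follow-up copy-propagation pass (not shown) would then rewrite uses of `t7` to `t6`, which is what turns `a[t7]` into `a[t6]` in the slide's example.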
23HIR (BC2IR)
- Transform to HIR
- Abstract interpretation of bytecode
- Bytecode subroutines inlined
- Translate stack inst to 3-address inst
- On-the-fly dataflow optimizations
- (const folding, branch optimization, unreachable code elimination, const and type propagation, ...)
- Build FCFG (factored control flow graph)
- Ref: ACM Java Grande 99
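The "translate stack inst to 3-address inst" step can be sketched as abstract interpretation with a symbolic operand stack: instead of pushing values, the translator pushes the names of the locals or temporaries that would hold them. The class name and the `l1`/`t0` naming scheme are assumptions for this illustration, only four instruction forms are handled, and a real translator must also cope with a local being overwritten while its name is still on the symbolic stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Translates a few stack bytecodes into three-address statements (illustrative only). */
final class StackToThreeAddressSketch {
    static List<String> translate(List<String> bytecodes) {
        Deque<String> symbolic = new ArrayDeque<>(); // names of operands, not their values
        List<String> out = new ArrayList<>();
        int nextTemp = 0;
        for (String bc : bytecodes) {
            if (bc.startsWith("iload_")) {
                symbolic.push("l" + bc.substring("iload_".length()));   // local n becomes "ln"
            } else if (bc.startsWith("iconst_")) {
                symbolic.push(bc.substring("iconst_".length()));        // handles iconst_0..iconst_5 only
            } else if (bc.startsWith("istore_")) {
                out.add("l" + bc.substring("istore_".length()) + " = " + symbolic.pop());
            } else if (bc.equals("iadd")) {
                String right = symbolic.pop();
                String left = symbolic.pop();
                String temp = "t" + nextTemp++;
                out.add(temp + " = " + left + " + " + right);
                symbolic.push(temp);
            } else {
                throw new IllegalArgumentException("not modelled in this sketch: " + bc);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Loop body from slide 4: y = x + i, with x, y, i in locals 1, 2, 3.
        List<String> body = Arrays.asList("iload_1", "iload_3", "iadd", "istore_2");
        for (String stmt : translate(body)) System.out.println(stmt);
        // prints:
        // t0 = l1 + l3
        // l2 = t0
    }
}
```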
24HIR Example
25FCFG example
- Reduced number of nodes
- Improves basic block analysis
- Adapts forward/backward analysis
26LIR (HIR2LIR)
- Convert to LIR
- Introduce Jikes RVM detail into IR
- VM services
- (allocation, locks, type checks, write barriers)
- Object model
- (field, array, static layout, method invocation)
- Other low-level expansion
- (switch, compare/branch, exception checks)
27LIR Example
t1 = new java.lang.Object
return t1

public static Object quickNewScalar(int size, Object tib, boolean hasFinalizer)
    throws OutOfMemoryError {
  Object ret = VM_Allocator.allocateScalar(size, tib);
  if (hasFinalizer) VM_Finalizer.addElement(ret);
  return ret;
}

public static Object allocateScalar(int size, Object tib)
    throws OutOfMemoryError {
  VM_Address region = getHeapSpaceFast(size);
  Object newObj = VM_ObjectModel.initializeScalar(region, tib, size);
  return newObj;
}
28MIR (LIR2MIR)
- Convert to MIR
- Instruction selection rules for each architecture
- Each rule in LIR2MIR.rules is defined by a four-line record (PRODUCTION, COST, FLAGS, TEMPLATE)
- Ref: bottom-up rewrite system (Fraser 92, Sarkar 01)
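The role of the COST field can be illustrated with a much-simplified, cost-driven rule choice over an expression tree. Everything here is an assumption made for illustration: the three rules, their costs, and the `Node` class are invented and are not taken from LIR2MIR.rules, and a real bottom-up rewrite system tracks several nonterminals per node rather than a single "value in a register" goal.

```java
/** Cost-driven instruction selection over a tiny expression tree (illustrative only). */
final class InstructionSelectionSketch {
    /** Expression tree node; op is one of "REG", "CONST", "ADD" in this sketch. */
    static final class Node {
        final String op; final Node left; final Node right;
        Node(String op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
    }

    static Node reg() { return new Node("REG", null, null); }
    static Node constant() { return new Node("CONST", null, null); }
    static Node add(Node l, Node r) { return new Node("ADD", l, r); }

    /**
     * Minimum cost of getting the node's value into a register, using hypothetical rules:
     *   reg <- REG             cost 0 (already in a register)
     *   reg <- CONST           cost 1 (load immediate)
     *   reg <- ADD(reg, reg)   cost 1 (register-register add)
     *   reg <- ADD(reg, CONST) cost 1 (add with immediate; the constant needs no load)
     */
    static int costToReg(Node n) {
        switch (n.op) {
            case "REG":
                return 0;
            case "CONST":
                return 1;
            case "ADD": {
                int best = costToReg(n.left) + costToReg(n.right) + 1;      // ADD(reg, reg)
                if (n.right.op.equals("CONST")) {
                    best = Math.min(best, costToReg(n.left) + 1);           // ADD(reg, CONST)
                }
                return best;
            }
            default:
                throw new IllegalArgumentException("unknown operator: " + n.op);
        }
    }

    public static void main(String[] args) {
        System.out.println(costToReg(add(reg(), constant()))); // prints 1 (add-immediate rule wins)
        System.out.println(costToReg(add(constant(), reg()))); // prints 2 (constant must be loaded)
    }
}
```

Costs are computed bottom-up so that the cheapest cover of the whole tree is chosen, not just the cheapest rule at each node in isolation.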
29Code optimization techniques
- Three address code
- Keep heavily used var in registers
- Control-flow improvement
- Common subexpression
- Dead-code elimination
- Constant folding
- Loop optimization
- Basic block optimization (flow graph)
- Data Flow analysis and flow graph
30More
- Flow insensitive optimizations (register list)
- Inlining any method with on-the-fly optimization (guarded inlining)
- Load Elimination
- Loop Normalization
- Loop Unrolling
- Global code placement
- Commoning (separate exception from computation)
- Heap Array SSA
- FCFG forward analysis
- Simple Escape analysis (use register list to identify object)
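Guarded inlining can be pictured at the source level as follows. This is a hand-written sketch of the effect, not compiler output: the `Shape`, `Square`, and `Rect` types are hypothetical, and the guard shown is a simple class test that falls back to the normal virtual call when it fails.

```java
/** Source-level picture of guarded inlining of a virtual call (illustrative only). */
final class GuardedInliningSketch {
    interface Shape { int area(); }

    static final class Square implements Shape {
        final int side;
        Square(int side) { this.side = side; }
        @Override public int area() { return side * side; }
    }

    static final class Rect implements Shape {
        final int w, h;
        Rect(int w, int h) { this.w = w; this.h = h; }
        @Override public int area() { return w * h; }
    }

    /** What the programmer wrote: an ordinary virtual (interface) call. */
    static int areaVirtual(Shape s) {
        return s.area();
    }

    /** What an optimizer might conceptually produce if Square is the expected receiver. */
    static int areaGuarded(Shape s) {
        if (s.getClass() == Square.class) {   // guard: is the receiver the predicted class?
            Square q = (Square) s;
            return q.side * q.side;           // inlined body of Square.area()
        }
        return s.area();                      // guard failed: fall back to the virtual call
    }

    public static void main(String[] args) {
        System.out.println(areaGuarded(new Square(3)));  // prints 9 via the inlined path
        System.out.println(areaGuarded(new Rect(2, 5))); // prints 10 via the fallback call
    }
}
```

The guard keeps the transformation safe even if other receiver classes show up later, which matters in a VM where classes can be loaded dynamically.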
31Final Assembly
- Machine code generation
- GC maps
- Exception tables
- Machine code info for
- Online profiling, stackframe inspection,
debugging, dynamic linking, lazy compilation
32Other interesting work
- GNU Kissme JVM
- Porting the Jikes RVM to Linux/IA32
- Cache performance in JVM
- Not covered here: Java memory management, Jikes-specific optimizations