Transcript and Presenter's Notes

Title: Adaptive Optimization in the Jalapeño JVM


1
  • Adaptive Optimization in the Jalapeño JVM

M. Arnold, S. Fink, D. Grove, M. Hind, P. Sweeney
Presented by Andrew Cove 15-745 Spring 2006
2
  • Jalapeño JVM
  • Research JVM developed at IBM T.J. Watson
    Research Center
  • Extensible system architecture based on a
    federation of threads that communicate
    asynchronously
  • Supports adaptive multi-level optimization with
    low overhead
  • Statistical sampling

3
  • Contributions
  • Extensible adaptive optimization architecture
    that enables online feedback-directed
    optimization
  • Adaptive optimization system that uses multiple
    optimization levels to improve performance
  • Implementation and evaluation of
    feedback-directed inlining based on low-overhead
    sample data
  • Doesn't require programmer directives

4
  • Jalapeño JVM - Details
  • Written in Java
  • Optimizations applied not only to application and
    libraries, but to JVM itself
  • Bootstrapped
  • Boot image contains core Jalapeño services
    precompiled to machine code
  • Doesn't need to run on top of another JVM
  • Subsystems
  • Dynamic Class Loader
  • Dynamic Linker
  • Object Allocator
  • Garbage Collector
  • Thread Scheduler
  • Profiler
  • Online measurement system
  • 2 Compilers

5
  • Jalapeño JVM - Details
  • 2 Compilers
  • Baseline
  • Translates bytecodes directly into native code by
    simulating Java's operand stack (sketched below)
  • No register allocation
  • Optimizing Compiler
  • Linear scan register allocation
  • Converts bytecodes into IR, which it uses for
    optimizations
  • Compile-only
  • Compiles all methods to native code before
    execution
  • 3 levels of optimization
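A minimal Java sketch (invented names, not Jalapeño's actual code) of the operand-stack simulation idea behind a baseline-style compiler: each stack bytecode is lowered to a simple three-address operation by tracking what the Java operand stack would hold at compile time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of lowering stack bytecodes to simple
// three-address operations by simulating the operand stack at compile time.
public class BaselineSketch {
    public static void main(String[] args) {
        // Bytecode for: x = a + 5;  (a in local 1, x in local 2)
        String[] bytecodes = {"iload 1", "iconst 5", "iadd", "istore 2"};
        Deque<String> stack = new ArrayDeque<>();   // simulated operand stack
        List<String> emitted = new ArrayList<>();   // stand-in for native code
        int temp = 0;
        for (String bc : bytecodes) {
            String[] parts = bc.split(" ");
            switch (parts[0]) {
                case "iload":  stack.push("local" + parts[1]); break;
                case "iconst": stack.push(parts[1]);           break;
                case "iadd": {
                    String rhs = stack.pop(), lhs = stack.pop();
                    String t = "t" + temp++;
                    emitted.add(t + " = " + lhs + " + " + rhs);
                    stack.push(t);
                    break;
                }
                case "istore":
                    emitted.add("local" + parts[1] + " = " + stack.pop());
                    break;
            }
        }
        emitted.forEach(System.out::println); // t0 = local1 + 5; local2 = t0
    }
}
```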

6
  • Jalapeño JVM - Details
  • Optimizing Compiler (without online feedback)
  • Level 0: Optimizations performed during
    conversion (example below)
  • Copy, Constant, Type, Non-Null propagation
  • Constant folding, arithmetic simplification
  • Dead code elimination
  • Inlining
  • Unreachable code elimination
  • Eliminate redundant null checks
  • Level 1
  • Common Subexpression Elimination
  • Array bounds check elimination
  • Redundant load elimination
  • Inlining (size heuristics)
  • Global flow-insensitive copy and constant
    propagation, dead assignment elimination
  • Scalar replacement of aggregates and short arrays
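A hedged before/after pair (hypothetical Java source, not the compiler's IR) illustrating what copy/constant propagation, constant folding, and dead-code elimination accomplish on a small method.

```java
// Before/after illustration only; the optimizing compiler performs these
// transformations on its intermediate representation, not on Java source.
public class Level0Example {
    // As written by the programmer.
    static int before(int n) {
        int a = 4;
        int b = a;            // copy of a
        int c = b * 2;        // foldable once b is known to be 4
        int unused = n * 100; // dead: never read
        return c + n;
    }

    // Roughly what remains after propagation, folding, and dead-code removal.
    static int after(int n) {
        return 8 + n;
    }

    public static void main(String[] args) {
        System.out.println(before(3) + " == " + after(3)); // 11 == 11
    }
}
```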

7
  • Jalapeño JVM - Details
  • Optimizing Compiler (without online feedback)
  • Level 2
  • SSA-based flow-sensitive optimizations
  • Array SSA optimizations

8
  • Jalapeño JVM - Details

9
  • Jalapeño Adaptive Optimization System (AOS)
  • Sample based profiling drives optimized
    recompilation
  • Exploit runtime information beyond the scope of a
    static model
  • Multi-level and adaptive optimizations
  • Balance optimization effectiveness with
    compilation overhead to maximize performance
  • 3 Component Subsystems (Asynchronous threads)
  • Runtime Measurement
  • Controller
  • Recompilation
  • Database (3 + 1 = 3?)

10
  • Jalapeño Adaptive Optimization System (AOS)

11
  • Subsystems: Runtime Measurement
  • Sample driven program profile
  • Instrumentation
  • Hardware monitors
  • VM instrumentation
  • Sampling
  • Timer interrupts trigger yields between threads
  • Method-associative counters updated at yields
  • Triggers controller at threshold levels
  • Data processed by organizers
  • Hot method organizer
  • Tells the controller which time-dominant methods
    aren't fully optimized
  • Decay organizer
  • Decreases sample weights to emphasize recent data
    (sketched below)
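A minimal sketch of sample counting at thread yields plus half-life decay, assuming a hypothetical yield hook; Jalapeño's real listeners, organizers, and scheduler are considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Sample-based profiling with decay (illustrative names and structure).
public class SampleProfile {
    private final Map<String, Double> samples = new HashMap<>();

    // Called when a timer interrupt forces a thread yield: credit the method
    // that was executing with one sample.
    void recordSampleAtYield(String methodName) {
        samples.merge(methodName, 1.0, Double::sum);
    }

    // Decay organizer: halve the weight of old samples every halfLife seconds
    // so that recent behavior dominates the profile.
    void decay(double elapsedSeconds, double halfLifeSeconds) {
        double factor = Math.pow(0.5, elapsedSeconds / halfLifeSeconds);
        samples.replaceAll((m, w) -> w * factor);
    }

    double weight(String methodName) {
        return samples.getOrDefault(methodName, 0.0);
    }

    public static void main(String[] args) {
        SampleProfile p = new SampleProfile();
        for (int i = 0; i < 100; i++) p.recordSampleAtYield("Foo.hotLoop");
        p.decay(1.7, 1.7);                           // one half-life elapses
        System.out.println(p.weight("Foo.hotLoop")); // ~50.0
    }
}
```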

12
  • Hotness
  • A hot method is one in which the program spends a
    large share of its time
  • Hot edges are used later on to determine good
    function calls to inline
  • In both cases, hotness is a function of the
    number of samples that are taken
  • In a method
  • In a given callee from a given caller
  • The system can adaptively adjust hotness
    thresholds
  • To reduce optimization in startup
  • To encourage optimization of more methods
  • To reduce analysis time when too many methods are
    hot (see the sketch below)
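A hedged sketch of an adaptive hotness threshold: a method counts as hot when it holds more than a given fraction of all samples, and the threshold is nudged so the organizer reports neither too few nor too many methods. The constants and adjustment rule are invented for illustration, not taken from the paper.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class HotnessSketch {
    double threshold = 0.01;   // start: 1% of all samples (placeholder value)

    List<String> hotMethods(Map<String, Double> samples, int desiredMax) {
        double total = samples.values().stream()
                              .mapToDouble(Double::doubleValue).sum();
        List<String> hot = samples.entrySet().stream()
                .filter(e -> e.getValue() / total > threshold)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        // Adapt: raise the bar if analysis would be swamped, lower it if
        // nothing qualifies (e.g., during startup).
        if (hot.size() > desiredMax) threshold *= 2.0;
        else if (hot.isEmpty())      threshold /= 2.0;
        return hot;
    }

    public static void main(String[] args) {
        HotnessSketch h = new HotnessSketch();
        System.out.println(h.hotMethods(Map.of("a", 90.0, "b", 5.0, "c", 5.0), 2));
    }
}
```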

13
  • Subsystems: Controller
  • Orchestrates and conducts the other components of
    AOS
  • Directs data monitoring
  • Creates organizer threads
  • Chooses to recompile based on data and
    cost/benefit model

14
  • Subsystems: Controller
  • To recompile or not to recompile?
  • Find the level j that minimizes Cj + Tj, the cost
    of recompiling m at level j plus the expected
    future running time of the recompiled m
  • If Cj + Tj < Ti, recompile m at level j
  • Assume, arbitrarily, that the program will run for
    twice its current duration (so expected future
    running time Tf equals the time run so far)
  • Ti = Tf * Pm, where Pm is the estimated percentage
    of future time spent in m

15
  • Subsystems: Controller
  • System estimates the effectiveness (speedup) of
    each optimization level as a constant, based on
    offline measurements
  • Uses a linear model of compilation speed for each
    optimization level as a function of method size
    (see the sketch below)
  • Linearity of higher level optimizations?
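A sketch of the controller's cost/benefit test under the model above. The speedup and compile-rate constants are placeholders, not the paper's measured values; only the structure of the decision is taken from the slides.

```java
// Ti = Tf * Pm is the expected future time in m at its current level,
// Tj = Ti * S_cur / S_j is that time if recompiled at level j, and
// Cj is the estimated compile cost (linear in method size).
public class ControllerSketch {
    // speedup[k] = speed of code at level k relative to baseline (index 0)
    static final double[] SPEEDUP = {1.0, 4.0, 5.0, 5.5};         // placeholders
    // bytecodes compiled per millisecond at each optimization level
    static final double[] COMPILE_RATE = {400.0, 10.0, 5.0, 2.0}; // placeholders

    /** Returns the best level to recompile at, or -1 to leave m alone. */
    static int chooseLevel(int curLevel, double futureTimeMs,
                           double pctOfTimeInMethod, int methodSizeBytecodes) {
        double ti = futureTimeMs * pctOfTimeInMethod;  // future time in m as-is
        int bestLevel = -1;
        double best = ti;
        for (int j = curLevel + 1; j < SPEEDUP.length; j++) {
            double cj = methodSizeBytecodes / COMPILE_RATE[j];   // compile cost
            double tj = ti * SPEEDUP[curLevel] / SPEEDUP[j];     // improved time
            if (cj + tj < best) { best = cj + tj; bestLevel = j; }
        }
        return bestLevel;
    }

    public static void main(String[] args) {
        // Assume the program has run 10 s (so 10 s more is expected) and m
        // (500 bytecodes, baseline-compiled) takes 20% of that time.
        System.out.println(chooseLevel(0, 10_000, 0.20, 500));
    }
}
```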

16
  • Subsystems: Recompilation
  • In theory
  • Multiple compilation threads that invoke
    compilers
  • Can occur in parallel to the application
  • In practice
  • Single compilation thread
  • Some JVM services require the master lock
  • Multiple compilation threads are not effective
  • Lock contention between compilation and
    application threads
  • Left as a footnote!
  • Recompilation times are stored to improve time
    estimates in the cost/benefit analysis (sketched
    below)
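A sketch of the single-compilation-thread design in its practical form: one background thread drains a queue of compilation plans and records observed compile times so later cost estimates can be refined. Class and method names are invented for this sketch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class RecompilationSketch {
    record Plan(String methodName, int optLevel) {}

    private final BlockingQueue<Plan> queue = new LinkedBlockingQueue<>();

    void submit(Plan plan) { queue.add(plan); }

    void startSingleCompilerThread() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    Plan plan = queue.take();
                    long start = System.nanoTime();
                    compile(plan);   // stand-in for invoking the optimizing compiler
                    long elapsedUs = (System.nanoTime() - start) / 1_000;
                    System.out.println(plan + " compiled in " + elapsedUs + " us");
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "AOS-compilation");
        t.setDaemon(true);
        t.start();
    }

    private void compile(Plan plan) { /* placeholder for the real compiler */ }

    public static void main(String[] args) throws InterruptedException {
        RecompilationSketch r = new RecompilationSketch();
        r.startSingleCompilerThread();
        r.submit(new Plan("Foo.bar", 1));
        Thread.sleep(100); // give the daemon thread time to drain the queue
    }
}
```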

17
  • Feedback-Directed Inlining
  • Statistical samples of method calls are used to
    build a dynamic call graph (sketched below)
  • Traverse call stack at yields
  • Identify hot edges
  • Recompile caller methods with inlined callee
    (even if the caller was already optimized)
  • Decay old edges
  • Adaptive Inlining Organizer
  • Determine hot edges and hot methods worth
    recompiling with inlined method call
  • Weight inline rules with boost factor
  • Based on number of calls on call edge and
    previous study on effects of removing call
    overhead
  • Future work more sophisticated heuristic
  • Seems obvious: new inlining decisions don't
    eliminate old inlines
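A sketch of a sampled dynamic call graph with edge decay and a boost-factor inlining test. The threshold and boost values are illustrative only; the paper's organizer uses a richer rule based on call-overhead measurements.

```java
import java.util.HashMap;
import java.util.Map;

public class DynamicCallGraph {
    private final Map<String, Double> edgeWeight = new HashMap<>();
    private double totalWeight = 0.0;

    // Called when the sampler walks the call stack at a yield point.
    void sampleEdge(String caller, String callee) {
        edgeWeight.merge(caller + "->" + callee, 1.0, Double::sum);
        totalWeight += 1.0;
    }

    // Decay old edges so recent call behavior dominates.
    void decay(double factor) {
        edgeWeight.replaceAll((e, w) -> w * factor);
        totalWeight *= factor;
    }

    // Hot-edge test: fraction of all sampled calls, scaled by a boost factor
    // reflecting the expected benefit of removing the call overhead.
    boolean worthInlining(String caller, String callee,
                          double boost, double hotEdgeThreshold) {
        double w = edgeWeight.getOrDefault(caller + "->" + callee, 0.0);
        return totalWeight > 0 && (w / totalWeight) * boost > hotEdgeThreshold;
    }

    public static void main(String[] args) {
        DynamicCallGraph dcg = new DynamicCallGraph();
        for (int i = 0; i < 80; i++) dcg.sampleEdge("Foo.run", "Bar.step");
        for (int i = 0; i < 20; i++) dcg.sampleEdge("Foo.run", "Baz.log");
        System.out.println(dcg.worthInlining("Foo.run", "Bar.step", 1.5, 0.5)); // true
        System.out.println(dcg.worthInlining("Foo.run", "Baz.log", 1.5, 0.5));  // false
    }
}
```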

18
  • Experimental Methodology
  • System
  • Dual 333MHz PPC processors, 1 GB memory
  • Timer interrupts at 10 ms intervals
  • Recompilation organizer runs from 2 times per
    second down to once every 4 seconds
  • DCG and adaptive inline organizer every 2.5
    seconds
  • Method sample half-life: 1.7 seconds
  • Edge weight half-life: 7.3 seconds
  • SPECjvm98
  • Jalapeño Optimizing Compiler
  • Volano chat room simulator
  • Startup and Steady-State measurements

19
  • Results
  • Compile-time overhead plays a large role in startup

20
  • Results
  • Multilevel Adaptive does well (and JITs don't
    have overhead)

21
  • Results
  • Startup doesn't reach a high enough optimization
    level to benefit

22
  • Questions
  • Assuming execution time will be twice the current
    duration is completely arbitrary, but has a nice
    outcome (less optimization at startup, more at
    steady state)
  • Meaningless measurements of optimizations vs.
    phase shifts
  • Due to execution time estimation

23
  • Questions
  • Does it scale?
  • More online-feedback optimizations
  • More threads needing cycles
  • Organizer threads
  • Recompilation threads
  • More data to measure
  • Especially slow if there can only be one
    recompilation thread
  • More complicated cost/benefit analysis
  • Potential speed-ups and estimated compilation
    times

24
  • Questions

25
  • Questions