The Java Virtual Machine Internal Architecture and Function - PowerPoint PPT Presentation

Provided by: davidg64
1
The Java Virtual Machine: Internal Architecture
and Function
  • Catalin Constantin

2
Contents
  • Overview
  • The Architecture
  • Class Loader Subsystem
  • Method Area
  • Method Tables
  • The Heap
  • Object and Class Data Representation
  • Object Representation
  • Local Variables Representation
  • Resolution, Exceptions, and Abrupt Method
    Completion
  • Execution Engine and Execution Techniques
  • The Instruction Set
  • Native methods interaction
  • Execution Order and Optimizations
  • Summary

3
Overview
  • The Java Virtual Machine is the environment for
    running Java programs. It is called a virtual
    machine, because it is an abstract computer
    defined by a specification.
  • Can be defined in three ways: as an abstract
    specification, as a concrete implementation, or
    as a runtime instance
  • Each Java application runs in its own virtual
    machine, and has exclusive access to the
    structures created by the virtual machine to
    accommodate its runtime.

4
The Architecture
  • The JVM architecture is based on subsystems,
    memory areas, data types and instructions,
    organized as:
  • Class Loader Subsystem: a mechanism for loading
    types (classes and interfaces) given their fully
    qualified names.
  • Execution Engine: the mechanism responsible for
    executing the instructions contained in the
    methods of loaded classes.
  • Runtime Data Areas: organized memory units used
    to store bytecodes loaded from class files,
    object instances, method parameters, return
    values, local variables and intermediate results.
  • Native Method Interface: not always publicly
    available to the programmer. Some
    implementations hide this part, while others
    emphasize it to make code optimization more
    feasible.
  • Note: there are no general-purpose registers;
    instead, the JVM uses a stack to simulate
    register-like operations
6
Class Loader Subsystem
  • Responsible for
  • Loading: finding and importing the binary data
    for each type
  • Linking: verification, preparation, resolution
  • Initializing: invoking Java code that performs
    initialization
  • Two kinds of loaders
  • Bootstrap class loader: part of the virtual
    machine implementation
  • User-defined class loader: part of the running
    application
  • Classes loaded by each class loader are placed in
    separate namespaces
  • Loaders must be able to recognize and load
    classes stored in files that conform to the Java
    class file format

7
  • Bootstrap class loader
  • Loads trusted classes, including the Java API
  • Is unique and has its own namespace
  • User-defined class loader
  • Is not necessarily unique
  • Inherits from java.lang.ClassLoader the gateway
    methods into the JVM: defineClass() imports a
    type from an array of bytes, findSystemClass()
    asks the system loader for a type by name, and
    resolveClass() accepts a reference to a Class
    instance and links the type it represents
  • By using namespaces, the JVM can load multiple
    types with the same fully qualified name through
    different loaders
  • When the JVM resolves a symbolic reference from
    one class to another, it requests the referenced
    class from the same loader that loaded the
    referencing class
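The namespace rule above can be observed from plain Java code. The following is a minimal sketch, not a production loader: IsolatingLoader and Payload are hypothetical names introduced for illustration, and the sketch assumes the compiled class files are reachable as resources on the class path.

```java
import java.io.IOException;
import java.io.InputStream;

class NamespaceDemo {
    // Hypothetical type that is loaded several times below.
    static class Payload {}

    // Hypothetical user-defined loader. Its only parent is the bootstrap
    // loader, so application classes are defined by this loader itself.
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader() { super(null); }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            // Assumption: the class file is visible to the loader of NamespaceDemo.
            try (InputStream in = NamespaceDemo.class.getClassLoader().getResourceAsStream(path)) {
                if (in == null) throw new ClassNotFoundException(name);
                byte[] bytes = in.readAllBytes();
                // defineClass() is the gateway that imports the bytes as a type.
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // True when two loaders produce two distinct types with the same name.
    static boolean sameNameDistinctTypes() {
        try {
            String name = Payload.class.getName();
            Class<?> a = new IsolatingLoader().loadClass(name);
            Class<?> b = new IsolatingLoader().loadClass(name);
            return a.getName().equals(b.getName()) && a != b && a != Payload.class;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Java API classes come from the bootstrap loader, reported as null.
    static boolean apiLoadedByBootstrap() {
        return String.class.getClassLoader() == null;
    }

    public static void main(String[] args) {
        System.out.println("same name, distinct types: " + sameNameDistinctTypes());
        System.out.println("String loaded by bootstrap: " + apiLoadedByBootstrap());
    }
}
```

Each IsolatingLoader instance is its own namespace, which is why the two Class objects differ even though their fully qualified names match.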

8
Method Area
  • Properties
  • Complex data area
  • Stores information about each loaded type
  • The methods of instantiated objects are kept in
    the method area, not in the heap along with the
    rest of the objects' content
  • Shared among all running threads, so access must
    be thread safe
  • When two threads request the same type, only one
    of the requests actually loads the type while
    the other waits
  • Not fixed in size
  • Garbage collected
  • The idea here is slightly different from
    collecting unreferenced objects: what gets
    collected is the data of types that are no
    longer reachable (class unloading)
9
  • Contents
  • Basic Information
  • Fully qualified name
  • Relationship to the superclass
  • Type modifiers (public, abstract, final, etc.)
  • Advanced Information
  • Constant Pool
  • Ordered set of constants used by the type:
    literals and symbolic references to types,
    fields and methods. It plays a major role in
    dynamic linking. Entries are referenced by index,
    much like elements of an array
  • Field Information
  • Method Information
  • Static variables
  • Static variables of a loaded class must
    retain changes across multiple calls. The fact
    that two related classes are in the same
    namespace ensures that subsequent accesses to
    static variables are not memory-less, i.e. they
    see earlier updates
  • References to ClassLoader and Class
10
Method Tables
  • A method table is an array of direct
    references to all the instance methods that may
    be invoked on a class instance, including the
    inherited methods.
  • Properties
  • Allows the virtual machine quick access to
    instance methods
  • Each instantiated object will have a reference to
    the method table associated with the class
  • In conjunction with information stored in the
    heap, plays an important role in dynamic linking
    and polymorphism
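The role of the method table in polymorphism shows up in ordinary Java code. In this small sketch (Shape and Circle are hypothetical classes), the call goes through a Shape reference, yet the method that runs is chosen from the actual class of the object:

```java
class DispatchDemo {
    static class Shape {
        String name() { return "shape"; }
        // Calls name() virtually: the target depends on the runtime class.
        String describe() { return "a " + name(); }
    }

    static class Circle extends Shape {
        @Override
        String name() { return "circle"; }
    }

    static String viaBaseReference() {
        Shape s = new Circle();   // static type Shape, runtime class Circle
        return s.describe();      // describe() is inherited, name() is overridden
    }

    public static void main(String[] args) {
        System.out.println(viaBaseReference()); // prints "a circle"
    }
}
```

Conceptually, the table for Circle holds the inherited describe() next to Circle's own name(), so the lookup is a single indexed access.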

11
The Heap
  • Is the location where all class instances and
    arrays (which are also viewed as objects) are
    instantiated.
  • Properties
  • One common heap for each running instance of the
    JVM
  • The JVM has an instruction for allocating space
    on the heap, but has no explicit instruction for
    de-allocating space. Just as an object cannot be
    freed in Java code, it cannot be freed explicitly
    in virtual machine code either
  • The garbage collector is solely responsible for
    eliminating unreferenced objects from the heap

12
Object and Class Data Representation
  • Object Representation
  • The JVM specification is not strict about object
    representation
  • Given an object instance, the JVM must be able to
    quickly locate the instance data and the class
    data
  • Memory allocated for an object in the heap must
    contain a pointer into the method area, where the
    class data is stored.
  • Two most important models are presented next
  • Arrays are represented as objects
  • Local variables representation
  • Local variables are stored in the Java stack
    frame associated with each method invocation
  • Each running thread gets its own Java stack, and
    each method invocation pushes a frame onto the
    stack of the thread in which it is called
  • Passing parameters is done through the Java stack
  • The storage size is one entry for int, float,
    reference and returnAddress
  • The storage size is two entries for long and
    double, which are addressed by the index of the
    first entry
  • Note: variables of type byte, short and char
    are stored as int on the Java stack. The boolean
    type is not directly supported by the JVM; it is
    translated into int
13
Object Representation (model 1)
  • Divide the heap into two parts: the handle pool
    and the object pool
  • An object reference is a native pointer to a
    handle pool entry
  • Each handle pool entry has two components
  • A pointer to instance data (in the object pool)
  • A pointer to class data (in the method area)
  • Advantage
  • Makes it easy to combat fragmentation. When an
    object is moved, only one pointer (the one in
    its handle) needs to be changed
  • Disadvantage
  • Each access to an object's instance data
    requires dereferencing two pointers: one to
    the handle and another one to the data
15
Object Representation (model 2)
  • An object reference is a native pointer to a
    bundle of data that contains the object instance
    data and a pointer to class data
  • Advantage
  • Dereferencing only once to access the instance
    data from the object reference's native pointer.
  • Disadvantage
  • Moving objects to prevent fragmentation becomes
    more complicated. When the Java virtual machine
    moves an object within the heap, it must update
    every reference to that object anywhere in the
    runtime data areas.

17
  • Object Representation and Casts
  • The main reason the JVM needs to get from an
    object reference to the class data is to check
    attempted casts
  • It must check whether the type being cast to is
  • Either the actual type of the object: the cast
    is allowed instantly
  • Or one of its ancestors: the procedure involves
    checking the superclasses up the class
    inheritance tree
  • Note
  • Earlier we noted that class data records the
    relationship to the superclass. Imagine a cast
    this way. The Java virtual machine attempts a
    cast. If the object's real type is the same as
    the type being cast to, then the cast is allowed
    instantly. If the two do not match, the Java
    virtual machine can follow the reference to the
    superclass. It then checks again, and so on up to
    the Object class. A successful type cast looks
    like a direct path between the actual type and
    the cast type up the class tree.
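The walk up the class tree described in the note can be written down with reflection. This is a simplified sketch: CastCheckSketch is a hypothetical name, and the loop follows superclasses only, so it ignores interfaces and array types, which a real cast check must also handle.

```java
class CastCheckSketch {
    // Returns true if target lies on the path from actual up to Object.
    static boolean castAllowed(Class<?> actual, Class<?> target) {
        for (Class<?> c = actual; c != null; c = c.getSuperclass()) {
            if (c == target) {
                return true; // found the cast type on the way up
            }
        }
        return false; // reached the top of the tree without a match
    }

    public static void main(String[] args) {
        Object value = Integer.valueOf(7);
        System.out.println(castAllowed(value.getClass(), Integer.class)); // true
        System.out.println(castAllowed(value.getClass(), Number.class));  // true
        System.out.println(castAllowed(value.getClass(), String.class));  // false
    }
}
```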

18
Most Common Model (2)
19
Similarities and Differences: Java vs. C++
  • Java object representation is somewhat similar to
    the VTBL (virtual table) structure in C++.
  • In Java, objects are represented by instance
    data and a pointer to class data (and implicitly
    the method table)
  • In C++, objects are represented by instance
    data and an array of pointers to any virtual
    functions that can be invoked on the object
  • The main difference between Java objects and C++
    objects is that, while in C++ functions are
    virtual only when declared so, in Java instance
    methods act as virtual by default.
  • If Java adopted the same layout as the VTBL of
    C++, it would need to store (redundantly)
    pointers to all instance methods
  • Java can accomplish the same result by storing
    only one pointer to the class data
  • It hurts performance (by dereferencing twice),
    but it saves on memory.

20
Array Representation
  • Java arrays are objects, they are stored in the
    heap, and they are associated with a class type
  • Example: a one-dimensional array of int
    elements and a two-dimensional array of int
    elements have different class types. Symbolically
    they are represented as [I and [[I. A
    two-dimensional array of objects would be
    symbolically represented as [[Ljava.lang.Object;
  • Multidimensional arrays are represented as arrays
    of arrays, so the elements of such an array are
    themselves arrays and can take part in
    assignments or casts to other array types
  • The length of an array or any of its dimensions
    does not determine the type of the array; it is
    only instance data (a field)
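The symbolic names of array classes can be checked directly with Class.getName(). A short sketch (ArrayTypeDemo is a hypothetical name):

```java
class ArrayTypeDemo {
    static String intArray1D() { return int[].class.getName(); }         // "[I"
    static String intArray2D() { return int[][].class.getName(); }       // "[[I"
    static String objectArray2D() { return Object[][].class.getName(); } // "[[Ljava.lang.Object;"

    // Arrays of different lengths share one class: length is instance data.
    static boolean lengthIsNotPartOfType() {
        return new int[3].getClass() == new int[7].getClass();
    }

    // A row of a two-dimensional array is itself a one-dimensional array.
    static boolean rowIsAnArray() {
        int[][] grid = new int[2][5];
        Object row = grid[0];
        return row instanceof int[];
    }

    public static void main(String[] args) {
        System.out.println(intArray1D() + " " + intArray2D() + " " + objectArray2D());
        System.out.println(lengthIsNotPartOfType() + " " + rowIsAnArray());
    }
}
```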

21
Array Representation
22
Local Variables Representation
  • Local variables can be laid out in any order
    by the compiler inside the Java stack frame
    associated with a method
  • Some locations on the stack can be reused for
    other local variables once earlier ones go out
    of scope.
  • Parameters are also passed using the Java stack,
    and they are pushed onto the stack in the order
    they are encountered from left to right
  • There is one important difference between the
    Java stack frames of class (static) methods and
    instance methods
  • An instance method has, in the first entry of
    its local variables, a reference corresponding
    to the hidden this, used to access the instance
    data in the heap associated with the invoking
    object

23
    class Example {
        public static int runClassMethod(int i, long l,
                float f, double d, Object o, byte b) {
            return 0;
        }
        public int runInstanceMethod(char c, double d,
                short s, boolean b) {
            return 0;
        }
    }
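The slot rules for local variables (one entry for most types, two for long and double, and a hidden this in entry 0 for instance methods) can be applied to the two methods of Example. The sketch below computes the starting entry of each parameter with reflection; SlotLayout, parameterSlots and slotsOf are hypothetical helpers, and the result follows the rules stated in this presentation rather than being read from a class file.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

class SlotLayout {
    // Returns the index of the first local variable entry of each parameter.
    static int[] parameterSlots(Method m) {
        Class<?>[] types = m.getParameterTypes();
        int[] slots = new int[types.length];
        // Entry 0 holds the hidden this for instance methods.
        int next = Modifier.isStatic(m.getModifiers()) ? 0 : 1;
        for (int k = 0; k < types.length; k++) {
            slots[k] = next;
            // long and double occupy two entries, everything else one.
            next += (types[k] == long.class || types[k] == double.class) ? 2 : 1;
        }
        return slots;
    }

    // Convenience lookup by name on the Example class below.
    static int[] slotsOf(String methodName) {
        for (Method m : Example.class.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return parameterSlots(m);
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(slotsOf("runClassMethod")));    // [0, 1, 3, 4, 6, 7]
        System.out.println(Arrays.toString(slotsOf("runInstanceMethod"))); // [1, 2, 4, 5]
    }
}

class Example {
    public static int runClassMethod(int i, long l, float f, double d, Object o, byte b) {
        return 0;
    }

    public int runInstanceMethod(char c, double d, short s, boolean b) {
        return 0;
    }
}
```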

24
Resolution, Exceptions, and Abrupt Method
Completion
  • Resolution
  • The references to types, fields and methods in
    the constant pool are initially symbolic.
  • When the JVM first needs to use one of them, it
    is still in symbolic form, and the virtual
    machine must resolve it
  • Resolution is performed using data from the
    Method Area together with information obtained
    from the class loaders
  • Exceptions
  • The JVM uses exception tables to handle
    exceptions
  • Exception table entries describe ranges within
    the bytecode of a method that are protected by a
    handler for a certain exception type
  • Entries contain a starting and ending point, and
    also a pointer to the exception handler
  • Abrupt Method Completion
  • Every exception that is not caught within a
    method causes an abrupt method completion
  • The JVM uses the Java frame data in the
    processing of abrupt method completion to restore
    the stack of the invoking method and rethrow the
    exception there; the running thread terminates
    only if no method on its stack handles it
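Abrupt completion can be traced in a few lines of Java. In this sketch (AbruptCompletionDemo and its method names are hypothetical), inner() throws, middle() has no matching handler and so completes abruptly, and outer() has a handler whose range covers the call:

```java
class AbruptCompletionDemo {
    // Throws, so its frame is discarded without a return value.
    static int inner(StringBuilder trace) {
        trace.append("inner;");
        throw new IllegalStateException("boom");
    }

    // No catch clause for the exception: completes abruptly after finally runs.
    static int middle(StringBuilder trace) {
        try {
            return inner(trace);
        } finally {
            trace.append("middle-finally;");
        }
    }

    // The call to middle() lies inside the range protected by this handler.
    static String outer(StringBuilder trace) {
        try {
            middle(trace);
            return "normal";
        } catch (IllegalStateException e) {
            trace.append("caught;");
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        StringBuilder trace = new StringBuilder();
        System.out.println(outer(trace)); // prints "boom"
        System.out.println(trace);        // prints "inner;middle-finally;caught;"
    }
}
```

Running javap -c on the compiled class shows the corresponding exception table entries for middle() and outer().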

25
Execution Engines and Execution Techniques
  • The Execution Engine is part of the core of any
    JVM. Its specification is made up of the
    instruction set and states what an implementation
    should do, not how it should do it.
  • Possible implementations can interpret,
    just-in-time compile, execute natively, or use a
    combination of these
  • Each thread of a running Java application is a
    distinct instance of the execution engine in
    action
  • Important aspects
  • The Instruction Set
  • Native Method Interaction
  • Execution Order and Optimization

26
The Instruction Set
  • Each instruction is a one-byte opcode followed by
    zero or more operands
  • The opcode indicates the operation and the
    operands supply the data needed by the JVM to
    complete the operation. The number of operands
    is implied by the opcode itself
  • The execution engine processes one opcode at a
    time
  • When running, the execution engine has direct
    access to the current constant pool, current
    frame, and current operand stack
  • The operand stack is part of the Java stack,
    organized as an array of words, accessed solely
    by push and pop operations, and used as a
    workspace to perform stack based register-like
    operations.
  • All instructions in the JVM are associated with
    mnemonics. Disassembling a class file produces
    an assembly-like listing
  • To understand how the JVM works, we can look
    inside a class file using the javap program
    distributed with any Java 2 SDK
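The opcode-plus-operands format and the operand stack can be illustrated with a toy interpreter. This is a sketch only: TinyInterpreter is a hypothetical class that understands four instructions (using their real opcode values) and has no frames, local variables or constant pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyInterpreter {
    static final int BIPUSH = 0x10;  // one operand: a signed byte to push
    static final int IADD = 0x60;    // no operands: pop two ints, push the sum
    static final int IMUL = 0x68;    // no operands: pop two ints, push the product
    static final int IRETURN = 0xAC; // no operands: pop the result and return it

    static int run(byte[] code) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int pc = 0; // index of the next opcode
        while (true) {
            int opcode = code[pc++] & 0xFF; // every opcode is one byte
            switch (opcode) {
                case BIPUSH:
                    operandStack.push((int) code[pc++]); // operand follows the opcode
                    break;
                case IADD:
                    operandStack.push(operandStack.pop() + operandStack.pop());
                    break;
                case IMUL:
                    operandStack.push(operandStack.pop() * operandStack.pop());
                    break;
                case IRETURN:
                    return operandStack.pop();
                default:
                    throw new IllegalArgumentException("unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // bipush 2, bipush 3, iadd, bipush 4, imul, ireturn  =>  (2 + 3) * 4
        byte[] program = {0x10, 2, 0x10, 3, 0x60, 0x10, 4, 0x68, (byte) 0xAC};
        System.out.println(run(program)); // prints 20
    }
}
```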

27
Example A: class method, primitive types
28
Example B: class method, object types
29
Example C: instance method, primitive types
30
Native Methods Interaction
  • The execution engine may be asked to invoke a
    native method
  • Depending on the implementation, a virtual
    machine may or may not be able to invoke native
    methods
  • Implementations that allow it provide an
    interface (JNI). The execution engine must be
    able to invoke a native method, wait until
    the native method returns, and then continue the
    execution of bytecodes. It must also be able to
    deal with exceptions that come from the native
    method
  • This adds a layer of complexity, because the
    native methods themselves need to be able to
    access information in the JVM while running
    native code

31
Execution Order and Optimizations
  • Execution Order
  • Execution engines are responsible for determining
    the next instruction to be executed
  • Generally the flow is straightforward: most
    instructions are executed in order
  • Instructions like goto and return use data to
    specify the next instruction
  • The only abnormal paths of execution are in the
    case of exception handling
  • Optimizations
  • Interpretation: first generation JVM
  • Just-in-time compilation: second generation JVM
  • Adaptive optimization: contemporary trend
  • Native execution: a form of JIT and Adaptive
    Optimization

32
Adaptive Optimization
  • Implemented by most modern versions of the JVM,
    like Sun's HotSpot virtual machine
  • The trade-offs of pure interpretation and of pure
    just-in-time compilation are too extreme if
    either is implemented in absolute terms
  • A purely interpreted program will be slow at
    runtime, but it does not take extra time to get
    started
  • JIT compilation allows for fast execution, but
    would delay the beginning of execution by the
    time needed to compile the bytecode to native
    code
  • In Adaptive Optimization, the JVM takes advantage
    of information available at runtime and attempts
    to combine the bytecode interpretation with
    compilation to native code.

33
  • Based on a clever observation
  • Most programs spend 80 to 90 percent of their
    time executing 10 to 20 percent of the code
  • The JVM
  • Begins by interpreting the bytecodes
  • Monitors execution of that code
  • Identifies the hot spots of the code and starts a
    background thread to compile that code to native
    code
  • Avoids premature optimization, which is typical
    of static compilers
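The interpret, monitor, then compile cycle can be mimicked with an invocation counter. This is a toy model with stated assumptions, not HotSpot's mechanism: AdaptiveSketch and THRESHOLD are hypothetical, the "compiled" version is just a second function supplied by the caller, and the switch happens synchronously instead of on a background thread.

```java
import java.util.function.IntUnaryOperator;

class AdaptiveSketch {
    static final int THRESHOLD = 3; // hypothetical hot-spot threshold

    private final IntUnaryOperator interpreted; // stands in for the interpreter path
    private final IntUnaryOperator compiled;    // stands in for native code
    private int invocations = 0;

    AdaptiveSketch(IntUnaryOperator interpreted, IntUnaryOperator compiled) {
        this.interpreted = interpreted;
        this.compiled = compiled;
    }

    // Counts every call and switches paths once the method is hot.
    int invoke(int argument) {
        invocations++;
        return isHot() ? compiled.applyAsInt(argument) : interpreted.applyAsInt(argument);
    }

    boolean isHot() {
        return invocations > THRESHOLD;
    }

    public static void main(String[] args) {
        AdaptiveSketch doubler = new AdaptiveSketch(x -> x + x, x -> x << 1);
        for (int call = 1; call <= 5; call++) {
            int result = doubler.invoke(21);
            System.out.println("call " + call + ": " + result + ", hot=" + doubler.isHot());
        }
    }
}
```

Both paths must compute the same result; only the cost differs, which is why code that runs rarely never pays the compilation price.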

34
  • Too good to be true? Correct!
  • There are some issues with this, and depending
    on how well these issues are dealt with,
    implementations can differ greatly in
    performance
  • Known Issues
  • Adaptive optimization does not work well across
    method invocations.
  • Inlining? This can have issues too once
    polymorphism is involved
  • Going in and out of the hot spot

35
Summary
36
Practice
  • Play with javap to determine
  • A) The assembly-like listing of a compiled class
    using javap -c
  • B) The method signatures (public and protected) of
    a compiled class using javap -s
  • C) The complete profile (including the Constant
    Pool) of a compiled class using javap -verbose