Title: Dynamic Linking and the Java Reflection API
1. Dynamic Linking and the Java Reflection API
2. Linking
- We'll leave Java for a while and talk about traditional linkers.
- Linking and dynamic linking in Java are heavily influenced by the JVM.
- Recall that the output of the compiler (assembler) is object code.
- A linker combines several object files into a single executable file.
3. Linker Responsibilities
- A linker has two basic tasks:
  - Resolve external symbols (including searching libraries).
  - Stitch together a complete program with a coherent memory map.
- We are going to assume an operating system with virtual memory (paging).
- Virtual memory permits the linker to use absolute addressing.
4. Object File Format
- An object file contains the unlinked machine code generated from a program.
- Some instructions (e.g., call, mov) make reference to symbols.
- If the assembler/compiler can calculate the address of the symbol, it will do so. For example, any variable stored on the stack has a known address (relative to the stack frame).
- If the address is not yet known (typically the case with functions and global variables), then a placeholder is written into the machine code.
- Every object file contains a map of the
  - unresolved symbols
  - exported symbols
5. A Simple Object File
[Figure: an example program, its machine code, and its symbol maps]
machine code:
  ld r0, x
  ld r1, y
  add r0, r1, r0
  st r0, x
  call doit
- Unresolved symbols are chained in the machine code. The bits of machine code that should hold the true address of x (which is unknown) instead hold the offset of the next occurrence of x.
6. Symbol Resolution
- Given the maps of unresolved symbols and exported symbols, a linker can track down all global variables and functions.
- Variables declared extern in a source file result in unresolved symbols in the object file; each must be an exported symbol in exactly one other file.
- Libraries are collections of object files that the linker can search when attempting to resolve symbols.
7. Libraries (static)
- Libraries are simply collections of object files. To a first approximation, using a library is no different than including all the object files in your project.
  - Except that the linker uses libraries only after it has tried to resolve the symbol using the regular object files.
- When a linker finds a symbol it needs in a library, the entire object module containing that symbol is added to the project.
  - This object module may contain unresolved symbols!
  - The linker re-searches the libraries to resolve any new symbols.
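The library search described above can be sketched as a work-list loop. This is only a toy model, not how any particular linker is implemented: the names ToyLinker and ObjectModule, and the file names in demo(), are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of static-library resolution; every name here is illustrative.
class ToyLinker {
    // An object module: the symbols it defines and the symbols it still needs.
    static final class ObjectModule {
        final String name;
        final Set<String> exported;
        final Set<String> unresolved;

        ObjectModule(String name, Set<String> exported, Set<String> unresolved) {
            this.name = name;
            this.exported = exported;
            this.unresolved = unresolved;
        }
    }

    // Returns the names of the modules that end up in the program, in link order.
    static List<String> link(List<ObjectModule> objects, List<ObjectModule> library) {
        List<String> linked = new ArrayList<>();
        Set<String> defined = new HashSet<>();
        Deque<String> needed = new ArrayDeque<>();
        for (ObjectModule m : objects) {          // regular object files always go in
            linked.add(m.name);
            defined.addAll(m.exported);
            needed.addAll(m.unresolved);
        }
        while (!needed.isEmpty()) {
            String symbol = needed.poll();
            if (defined.contains(symbol)) {
                continue;                         // already resolved
            }
            ObjectModule provider = library.stream()
                    .filter(m -> m.exported.contains(symbol))
                    .findFirst()
                    .orElseThrow(() -> new IllegalStateException("undefined symbol: " + symbol));
            // Pull in the whole module; its own unresolved symbols join the work list.
            linked.add(provider.name);
            defined.addAll(provider.exported);
            needed.addAll(provider.unresolved);
        }
        return linked;
    }

    static List<String> demo() {
        List<ObjectModule> objects = List.of(
                new ObjectModule("main.o", Set.of("main"), Set.of("printf")));
        List<ObjectModule> library = List.of(
                new ObjectModule("printf.o", Set.of("printf"), Set.of("write")),
                new ObjectModule("write.o", Set.of("write"), Set.of()),
                new ObjectModule("unused.o", Set.of("qsort"), Set.of()));
        return link(objects, library);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch unused.o never reaches the output, because nothing asks for a symbol it exports.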
8. Symbol Relocation
- The total size of the program is not known until the linker has resolved the last symbol.
  - Hence, the final addresses for variables cannot be determined until link time.
- The linker works in two passes (similar to an assembler). The resolution pass is followed by a relocation pass.
- Relocation is the process of patching up the machine code to use the correct, final addresses for all variables and functions.
- Once all the symbols have been resolved, the linker lays out the object files, determines addresses for every symbol, and then walks the chains, replacing each link with the actual address.
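Walking the chains can be illustrated with a small simulation. The word layout, the opcode numbers, the addresses, and the use of -1 as an end-of-chain marker are all assumptions made for this sketch; real object formats differ.

```java
import java.util.Map;

// Toy relocation pass; the encoding and the -1 end marker are assumptions.
class ToyRelocator {
    // code:       machine words; a placeholder word holds the index of the
    //             next use of the same symbol, or -1 at the end of the chain.
    // chainHeads: for each unresolved symbol, the index of its first placeholder.
    // addresses:  the final address the layout step chose for each symbol.
    static int[] relocate(int[] code, Map<String, Integer> chainHeads,
                          Map<String, Integer> addresses) {
        int[] patched = code.clone();
        for (Map.Entry<String, Integer> entry : chainHeads.entrySet()) {
            int address = addresses.get(entry.getKey());
            int slot = entry.getValue();
            while (slot != -1) {
                int next = patched[slot];   // read the link before overwriting it
                patched[slot] = address;
                slot = next;
            }
        }
        return patched;
    }

    static int[] demo() {
        // ld r0, x / ld r1, y / add / st r0, x / call doit
        // Word 1 and word 6 refer to x (chain 1 -> 6 -> end),
        // word 3 refers to y, word 8 refers to doit.
        int[] code = {10, 6, 11, -1, 20, 12, -1, 30, -1};
        Map<String, Integer> chainHeads = Map.of("x", 1, "y", 3, "doit", 8);
        Map<String, Integer> addresses = Map.of("x", 1000, "y", 1004, "doit", 2000);
        return relocate(code, chainHeads, addresses);
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

The link is read before the slot is overwritten because patching destroys the chain as it goes.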
9. Dynamic Linking
- Libraries and linking have been a standard part of programming for many decades.
- As languages and operating systems became richer, the fraction of an application made up of library code became larger and larger.
  - For example, "hello world" in C is about 30 bytes of machine code (in the object file), but well over 300 KB for the complete executable.
- It seems redundant to store copies of the same OS and language libraries (e.g., printf) in every application.
- Dynamic linking is a relatively recent solution to this problem.
10. Dynamic Linking
- A dynamic link library (DLL) is essentially a conventional library. It contains a set of object files, each with exported symbols.
- With a traditional DLL, the linker will peruse the library during resolution, but will not perform relocation with any symbol in the library.
- Special "glue code" is linked into the application in place of the actual library. The glue code is very short (we'll see some in a minute).
- When the OS launches the application, the DLL is loaded into memory (or mapped using virtual memory), and the glue code will be capable of accessing the correct machine code.
11. Glue Code Using Pointers to Functions
- One way to build the glue code is with a table of pointers.
- The OS will initialize the table when it launches the application (since the OS loads the actual library into memory, it knows the actual addresses where the functions are stored).
- The glue code uses the pointers to access the functions.
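The same indirection can be mimicked in Java, with a functional-interface field standing in for a C function pointer. This is an analogy only: GlueDemo, squarePtr, and loadLibrary are hypothetical names, and a real OS loader fills in machine addresses rather than lambdas.

```java
import java.util.function.IntUnaryOperator;

// Java analogy for pointer-table glue code; all names are illustrative.
class GlueDemo {
    // Slot filled in at "load time"; stands in for a C function pointer.
    private static IntUnaryOperator squarePtr;

    // Glue routine linked into the application: it only forwards through the pointer.
    static int square(int x) {
        if (squarePtr == null) {
            throw new IllegalStateException("library not loaded");
        }
        return squarePtr.applyAsInt(x);
    }

    // Stand-in for the OS loader, which knows where the real routine lives.
    static void loadLibrary() {
        squarePtr = x -> x * x;
    }

    public static void main(String[] args) {
        loadLibrary();
        System.out.println(square(7));
    }
}
```

The application always calls square(); which implementation it reaches is decided only when loadLibrary() runs.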
12. Glue Code For CVI
- The glue code is linked with the application just like a conventional library routine.
- When the DLL is loaded, terminate_Ptr is set to the address of the actual terminate routine.
glue code:
  static void (*terminate_Ptr)(void);   /* filled in when the DLL is loaded */
  void terminate(void) {
      (*terminate_Ptr)();               /* forward to the real routine */
  }
original code (inside DLL):
  #include <stdio.h>
  void terminate(void) {
      char buff[2];
      printf("press <enter> to terminate");
      fgets(buff, 2, stdin);
  }
13. Benefits of DLLs
- Reduce the size of the applications on disk when several applications share the same library.
- Allow libraries to be updated without releasing new applications.
  - Applications will use the new version of the library automatically the next time they're launched!
14. Java Class Loading
- Java does all linking dynamically.
- .class files are object files in the traditional sense. There are no executable files with Java.
- Linking is performed when .class files are loaded.
- Linking is performed by the JVM; access to this functionality is through a ClassLoader.
15. Loading
- The first step in linking is to load the actual bytes from the .class file into the JVM.
- The JVM System ClassLoader will load all .class files that are referenced by the program, starting with the class that contains the main method.
- The search path for the System ClassLoader is defined by the CLASSPATH environment variable (or the -classpath command-line option).
- If a referenced class cannot be found, a NoClassDefFoundError is thrown. (An explicit request by name, such as Class.forName, reports the failure as a ClassNotFoundException instead.)
- You can write your own ClassLoader (more on this later).
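A small probe makes the loading step visible from ordinary code. The class LoadingDemo and the missing class name no.such.Widget are made up for this sketch; Class.forName and the java.class.path system property are standard.

```java
// Probing the loader by name; LoadingDemo and "no.such.Widget" are illustrative.
class LoadingDemo {
    // Asks the JVM to load a class by name; returns null if no loader can find it.
    static Class<?> tryLoad(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The search path the System ClassLoader was started with.
        System.out.println("search path: " + System.getProperty("java.class.path"));
        System.out.println(tryLoad("java.util.ArrayList"));
        System.out.println(tryLoad("no.such.Widget"));   // assumed not to exist
    }
}
```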
16. Linking
- After the .class file has been read, the next step is linking.
- Linking (and all subsequent stages) is performed in response to ClassLoader.resolveClass() being called. This is a final method implemented by the JVM internals (i.e., it's not Java code).
- Linking a class can result in loading (and linking) additional classes as symbols are resolved.
17. Resolution (Linking Phase A)
- Each .class file contains a symbol table.
- The JVM's linker can be implemented to resolve symbols either aggressively or lazily.
- Aggressive resolution is similar to traditional linkers: load as many .class files as necessary to resolve all symbols (recursively).
- Lazy resolution resolves symbols only when they are used, for example when a static method is called or an instance of a class is created with new.
18. Verification (Linking Phase B)
- Each .class file loaded is verified.
- During verification, the Java bytecode is examined for errors:
  - no invalid opcodes
  - branch targets are at the start of an instruction (recall that Java instructions can be multiple bytes long)
  - methods have reasonable signatures
  - some type checking
19. Trust, but Verify
- As I understand it, the purpose of verification is to ensure that the language remains type sound even in binary form.
- If I could write bytecode by hand (instead of using javac), and if verification were not performed, then
  - potentially I could write applet .class files that could escape their sandbox.
20. Preparation (Linking Phase C)
- After the bytecode is verified, the class is prepared by
  - Creating static fields and initializing those fields to their default values (null for objects, 0 for ints).
  - Checking that this class did not somehow become abstract.
    - If a base class was changed after this subclass was compiled, one of the methods we used to inherit may have become abstract in our base class.
  - Building the method table (analogous to the virtual function table in C++).
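The default values assigned during preparation can be observed by reading a static field before its own initializer has run. PreparationDemo and its fields are hypothetical names chosen for this sketch.

```java
// Observing preparation defaults; all names here are illustrative.
class PreparationDemo {
    // Initializers run in textual order, so these two "early" fields are
    // assigned while count and name still hold their prepared defaults.
    static int early = peekCount();
    static String earlyName = peekName();

    static int count = 5;
    static String name = "ready";

    static int peekCount() {
        return count;
    }

    static String peekName() {
        return name;
    }

    public static void main(String[] args) {
        System.out.println(early + " " + earlyName + " " + count + " " + name);
    }
}
```

Here early captures 0 and earlyName captures null, the defaults from preparation, while count and name end up with their assigned values once initialization finishes.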
21. Initialization
- After the class has been loaded and fully linked, static initializers are run, and initialization (assignment) is performed for all static fields.
- Only after initialization is complete (for all recursively loaded classes) does resolveClass finally return.
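One observable consequence is that a class's static initializers run exactly once, and no earlier than the first use of that class. The sketch below records the order of events; InitializationDemo and Helper are names invented for the example, and the recorded order assumes run() is the first code to touch Helper.

```java
import java.util.ArrayList;
import java.util.List;

// Recording when a static initializer runs; all names are illustrative.
class InitializationDemo {
    static final List<String> events = new ArrayList<>();

    static class Helper {
        static int value = 42;
        static {
            events.add("Helper initialized");
        }
    }

    static List<String> run() {
        events.add("before first use");
        int first = Helper.value;    // first use of Helper triggers its initializer
        events.add("read " + first);
        int second = Helper.value;   // already initialized; nothing runs again
        events.add("read " + second);
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```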
22. Comparing Java and Traditional Dynamic Linking
- Java's natural linking (via the System ClassLoader) is most similar to DLLs.
  - Any changes to the library will be reflected in the application the next time it is run.
  - In Java, essentially every .class file is part of the "library".
- By implementing your own ClassLoader class, you can achieve run-time linking.
23. The ClassLoader Class
- Imagine a networked application where you want to download updates over the net (instead of writing them onto the disk).
- We can write our own ClassLoader that defines the findClass method.
- findClass is invoked by the loadClass method.
24. Delegation, Parents and loadClass
- The standard Java ClassLoader design supports the equivalent of search paths.
- Each ClassLoader has a parent (except the loader at the root of the hierarchy).
- loadClass asks the parent ClassLoader to load the desired class; if the parent cannot find or load it, then findClass is called.
- With this model, the System ClassLoader (and its own parents) is always tried first. Only if that fails will other loaders be tried.
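The parent chain and the delegation rule can both be inspected at run time. In this sketch DelegationDemo and EmptyLoader are made-up names; getSystemClassLoader, getParent, and loadClass are standard ClassLoader methods. A null parent conventionally stands for the JVM's built-in bootstrap loader.

```java
import java.util.ArrayList;
import java.util.List;

// Inspecting the delegation chain; DelegationDemo and EmptyLoader are illustrative.
class DelegationDemo {
    // Walks from the system loader up through its parents until the chain ends.
    static List<ClassLoader> parentChain() {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader cl = ClassLoader.getSystemClassLoader(); cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        return chain;
    }

    // A loader that finds nothing itself, so every request is answered by its parents.
    static class EmptyLoader extends ClassLoader {
        EmptyLoader() {
            super(ClassLoader.getSystemClassLoader());
        }
    }

    // Returns the class found through delegation, or null if no loader in the chain has it.
    static Class<?> loadThroughChild(String name) {
        try {
            return new EmptyLoader().loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parentChain());
        System.out.println(loadThroughChild("java.lang.String"));
        System.out.println(loadThroughChild("no.such.Widget"));   // assumed not to exist
    }
}
```

Because EmptyLoader does not override findClass, the inherited version simply throws ClassNotFoundException once the parents have failed.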
25. Writing findClass
  class MyClassLoader extends ClassLoader {
      public Class<?> findClass(String name) {
          byte[] obj_code;
          /* open socket, and read .class file
             from remote site
          */
          return defineClass(name, obj_code, 0, obj_code.length);
      }
  }
- After this method returns, loadClass calls resolveClass if it was asked to resolve the class.
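A compilable variant of this outline needs some source for the bytes. The sketch below swaps the socket for an in-memory map, which is purely an assumption of the example; MapClassLoader, classBytes, and canLoad are hypothetical names, while findClass, defineClass, and loadClass are the standard ClassLoader methods.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a custom loader; the in-memory map stands in for the remote site.
class MapClassLoader extends ClassLoader {
    // Hypothetical store: fully qualified class name -> bytes of its .class file.
    private final Map<String, byte[]> classBytes;

    MapClassLoader(Map<String, byte[]> classBytes) {
        this.classBytes = classBytes;
    }

    // Called by loadClass only after the parent loaders have failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] objCode = classBytes.get(name);
        if (objCode == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, objCode, 0, objCode.length);
    }

    // Convenience check used by the demo below.
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = new MapClassLoader(new HashMap<>());
        System.out.println(canLoad(loader, "java.lang.String"));   // found by a parent
        System.out.println(canLoad(loader, "no.such.Plugin"));     // not in the empty map
    }
}
```

With an empty map, only classes visible to the parent loaders can be loaded; filling the map with real .class bytes is left to whatever transport the application uses.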
26. Class? What the heck is Class?
- Run-time loading poses a problem.
  - If the class(es) loaded by the application (i.e., main) referenced the remote class, then the JVM would have tried to load the remote class with the System ClassLoader (which would have failed).
  - If the application never references the remote class, how do we use this code? That is, we're never calling any static methods and never creating any objects.
- We could, in principle, put all of our code inside static initializers (yuck).
- The Class class and the reflection API provide a better way to use run-time loaded classes.
27. Using Class
- Using Class, we can
  - determine what methods a class has (including constructors)
  - determine what fields a class has
  - create instances of this class
  - invoke public methods and set public fields.
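These operations can be tried on any class known only by name. The sketch below uses java.util.ArrayList as the target simply because it is always available; UsingClassDemo and its methods are invented for the example.

```java
import java.lang.reflect.Method;

// Using a class known only by name; UsingClassDemo is an illustrative name.
class UsingClassDemo {
    // Creates an instance of the named class, adds two elements, and asks for its size,
    // all through reflection. Assumes the class has a public no-argument constructor
    // plus public add(Object) and size() methods, as the java.util list classes do.
    static int sizeAfterTwoAdds(String className) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method add = type.getMethod("add", Object.class);
            add.invoke(instance, "first");
            add.invoke(instance, "second");
            Method size = type.getMethod("size");
            return (Integer) size.invoke(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reports whether the named class has a public method with the given name.
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            for (Method m : Class.forName(className).getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterTwoAdds("java.util.ArrayList"));
        System.out.println(hasPublicMethod("java.util.ArrayList", "size"));
    }
}
```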
28. ClassLoaders again
- OK, so Class will (somehow) allow us to invoke methods, but what if those methods invoke methods on other classes?
  - If our purpose was to update the program dynamically, we'd like all of the new stuff to come from the remote site.
- Java supports this objective by remembering which ClassLoader was used to load each class.
- If, during symbol resolution, new classes must be loaded, the same ClassLoader will be used.
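The remembered loader is available through Class.getClassLoader. In the sketch below, DefiningLoaderDemo and Helper are hypothetical names; the null result for core classes such as String reflects how common JVMs represent the built-in bootstrap loader.

```java
// Asking a class which loader defined it; all names here are illustrative.
class DefiningLoaderDemo {
    static class Helper {
    }

    // Describes the loader the JVM recorded for a class.
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return type.getName() + " <- " + (loader == null ? "bootstrap" : loader.getClass().getName());
    }

    // True when both classes were defined by the same loader.
    static boolean sameLoader(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader();
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(DefiningLoaderDemo.class));
        System.out.println(sameLoader(DefiningLoaderDemo.class, Helper.class));
    }
}
```

Helper is referenced only from DefiningLoaderDemo, so it is loaded through the loader that defined DefiningLoaderDemo.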
29. Field, Method and Constructor
- In addition to Class, there are other classes included in the Reflection API.
  - class Field is used to describe and provide access to data members
  - class Method is used to describe and provide access to member functions
  - class Constructor is used to describe and provide access to constructors
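A short sketch shows the three classes working together. Counter is a made-up target class with one public field, one public constructor, and one public method; ReflectionMembersDemo is likewise just a name for the example.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Field, Method and Constructor in one place; all names are illustrative.
class ReflectionMembersDemo {
    // Hypothetical target class.
    public static class Counter {
        public int count;

        public Counter(int start) {
            count = start;
        }

        public int increment() {
            return ++count;
        }
    }

    static int run() {
        try {
            Class<?> type = Counter.class;

            // Constructor: create an instance as if by "new Counter(10)".
            Constructor<?> ctor = type.getConstructor(int.class);
            Object counter = ctor.newInstance(10);

            // Field: overwrite the public data member.
            Field count = type.getField("count");
            count.setInt(counter, 40);

            // Method: call the member function twice.
            Method increment = type.getMethod("increment");
            increment.invoke(counter);
            increment.invoke(counter);

            return count.getInt(counter);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```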
30. Getting Class objects
- There are no constructors (no public ones, anyway) defined for Class.
  - The Class objects are created by the JVM during linking.
- You can obtain a reference to a Class object in several ways:
  - by calling ClassLoader.defineClass
  - by appending .class to the name of the class
  - by invoking the static method Class.forName(String)
  - by invoking Object.getClass
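Three of these routes can be compared directly; for a given class and loader they yield the same Class object. ClassObjectsDemo is an illustrative name, and defineClass is left out because it needs raw .class bytes.

```java
// Three ways to reach the same Class object; ClassObjectsDemo is illustrative.
class ClassObjectsDemo {
    static boolean allSame() {
        try {
            Class<?> literal = String.class;                      // class literal
            Class<?> byName = Class.forName("java.lang.String");  // lookup by name
            Class<?> fromInstance = "hello".getClass();           // from an existing object
            return literal == byName && byName == fromInstance;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(allSame());
    }
}
```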
31. Uses of Reflection and ClassLoading
- RMI and serialization.
- Runtime loading
  - For example, plug-ins can be discovered and installed on the fly.
- Java applications can discover the interfaces to code.
  - Useful for program-builder programs. Do not need access to source code.
- In principle, an application can compile/write itself!