Friday, May 5, 2023

Reflection for the modern Java programmer

Take advantage of the rich metadata the Java compiler provides to the JVM.


Reflection is one of the most flexible and powerful techniques Java developers have at their command. However, it is frequently misunderstood or misused, and certain aspects of its API and design are rather clunky and, in places, out of date.

Nevertheless, to become an advanced Java developer, gaining a solid understanding of reflection (also known as the Core Reflection API) is an essential step. Let’s start with an important point of history that sometimes goes unnoticed.

When Java was first released over 25 years ago, the two dominant programming languages were probably C and C++. As most Java developers know, the syntax of Java is heavily influenced by these languages, especially C++. Despite the similar syntax, however, Java is a very different programming environment from C++. The difference lies in how programs are built and executed.

◉ C++ compiles to a native machine code binary that can be directly executed.
◉ Java compiles to a portable, intermediate format (bytecode) that requires a runtime container to execute.

The Java runtime execution container, that is, the Java Virtual Machine, provides external support for certain concerns that Java programs delegate to the runtime. The usual example of these concerns is garbage collection: The Java programmer creates objects with new but has no direct control over where they are placed in the heap or when their memory is recovered. Instead, the reclamation process is under the complete control of the JVM.

This is an example of what is meant by describing the JVM as a managed environment. Java programmers give up control of the precise low-level details of handling memory and in return the JVM manages those (boring, pedantic, exacting) details on their behalf.

When it comes to reflection, there is another aspect of the JVM’s managed execution environment to focus on: the rich runtime type information that is present in every executing Java program. Consider a piece of code like the following:

List<String> ls = getSomeStrings();
System.out.println(ls.size());

Without knowledge at runtime of the inheritance and interface implementation hierarchy of the program, how can the JVM work out which size() method to call? After all, the List interface only declares size(); it does not provide an implementation of it.

The details of the hierarchy are contained in the class files because each individual class refers to the types it depends upon. The JVM, during the class loading process, assembles this data into a representation of a graph that describes the class inheritance and interface implementation of every type in the system. This includes JDK types, third-party libraries, and custom, user-defined classes of the application.
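A rough way to see part of this graph from inside a running program is to walk it using methods on java.lang.Class. The following is only a minimal sketch; the class name HierarchyDemo and the helper superclassChain() are made up for illustration and are not part of any API.

```java
import java.util.ArrayList;
import java.util.List;

class HierarchyDemo {
    // Returns the names of the runtime class of obj and all of its
    // superclasses, from the most specific class up to java.lang.Object.
    static List<String> superclassChain(Object obj) {
        List<String> chain = new ArrayList<>();
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            chain.add(c.getName());
        }
        return chain;
    }

    public static void main(String[] args) {
        // The static type is List, but the runtime type is ArrayList.
        List<String> ls = new ArrayList<>(List.of("a", "b"));
        System.out.println(superclassChain(ls));
        // The interface relationship is also known at runtime.
        System.out.println(List.class.isAssignableFrom(ls.getClass())); // prints true
    }
}
```

The static type of ls says only List, yet the chain starts at java.util.ArrayList, which is the information the JVM needs to pick the right size() implementation.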

This inheritance metadata is always present for every type, and nothing a Java programmer can do will prevent it from being created and available at runtime. This stands in contrast to C++, which by default does not preserve type metadata until runtime. Instead, C++ uses keywords such as virtual and override to allow the programmer to selectively opt in to limited runtime behavior (such as method overriding).

In Java the runtime metadata is always present on every object, although the exact details of that metadata will depend upon the JVM implementation. For the rest of this article, when I’m talking about internals, I will discuss only the HotSpot JVM, which is the implementation used by OpenJDK and Oracle’s JDK.

A discussion of the Core Reflection API is, however, completely applicable to any JVM implementation, as the API is part of the java.* namespace and thus is a standard.

In the HotSpot JVM, the metadata is held in an object header, which separates the metadata into type-specific and instance-specific metadata.

The instance-specific metadata is known as the mark word, and it is used to keep track of various important pieces of information such as whether an object’s intrinsic lock (also known as the monitor or synchronization lock) is being held by a thread. The mark word also contains information used during garbage collection (which is where the name comes from, after the concept of mark-and-sweep garbage collection).

On the other hand, the shared, type-specific metadata is stored in an area called metaspace and every Java object header contains a pointer to the metadata for the class that the object belongs to. This is known as the klass word of the header.

The fact that every object has a pointer to the shared class metadata means that during code execution, the JVM can always traverse the pointer and access the runtime type information of the class that the object belongs to.

Therefore, one way of thinking about reflection is that it’s posing the following question: “What if the execution environment exposed the runtime metadata (which is guaranteed to exist) and allowed Java programmers to access and use it during the execution of their programs?”

A brief history of reflection


In the first version of Java, the Core Reflection API did not exist, but there were already methods on java.lang.Class for accessing some metadata about the runtime types of objects, such as the name of the class. For example, code such as the following would already work in Java 1.0:

Object o = getObject();
System.out.println(o.getClass().getName());

The next release of Java, version 1.1, introduced full reflection capabilities for the first time. The package java.lang.reflect became part of the standard distribution and allowed developers to access a full range of reflective operations. This included such things as calling methods on objects of unknown type that had been reflectively created.

Note that the Core Reflection API predates the arrival of Java’s Collections API (which appeared in Java 1.2). As a result, types such as List do not appear in the Core Reflection API; instead, arrays are used. This makes some parts of the API rather more awkward to use than they otherwise would be.

Of course, Java was not the first language to provide reflective programming capabilities; others, such as Smalltalk, had pioneered the reflective approach. However, Java was the language that really brought reflective programming into mainstream software development.

The Java ecosystem warmly embraced reflection, and many of the most popular and powerful Java frameworks rely upon these capabilities to implement their core functionality.

Reflection in action


Here are some tasks you can do with reflection.

◉ Create an object reflectively.
◉ Locate a method on the object.
◉ Call the located method.

Those tasks are accomplished with code such as the following. In this first example, exception handling is ignored, for now, to improve code clarity.

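// Start from the class object of the current instance
Class<?> selfClazz = getClass();
// Look up the public no-argument constructor and create an instance
Constructor<?> ctor = selfClazz.getConstructor();
Object o = ctor.newInstance();
// Look up toString() by name and call it with o as the receiver
Method toStr = selfClazz.getMethod("toString");
Object str = toStr.invoke(o);
System.out.println(str);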

The code starts from a class object, which you have obtained via getClass() in this example. In practice, this class object can also be obtained from a class loader or, more rarely, as a class literal (such as String.class). Reflection can be combined with class loading to provide the capability for Java programs to work with code that was completely unknown at compile time.
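The three routes to a class object mentioned here can be sketched side by side. This is an illustrative example only: ClassObjectDemo and its helper methods are hypothetical names, and java.lang.String is used simply because it is always available.

```java
class ClassObjectDemo {
    // 1. From an existing instance.
    static Class<?> fromInstance(Object o) {
        return o.getClass();
    }

    // 2. From a class literal, which the compiler resolves.
    static Class<?> fromLiteral() {
        return String.class;
    }

    // 3. By name through a class loader; the name may only be known at runtime.
    static Class<?> fromLoader(String binaryName) throws ClassNotFoundException {
        return ClassLoader.getSystemClassLoader().loadClass(binaryName);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> a = fromInstance("hello");
        Class<?> b = fromLiteral();
        Class<?> c = fromLoader("java.lang.String");
        // All three routes yield the same class object.
        System.out.println(a == b && b == c); // prints true
    }
}
```

Only the third route works when the class name is not known until the program is running, which is why it is the one that matters for dynamically loaded code.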

This approach is the basis of plugin architectures and other very dynamic mechanisms that Java can make use of.

No matter how you obtained it, once you have a class object, you may call getConstructor() on it. This method is variadic—that is, it may take a variable number of arguments—to cope with the possibility of multiple constructors for the class.

The example above looks up the public no-argument constructor (sometimes loosely called the default or void constructor). The result is a constructor object from which you can obtain an instance of the class, just as if you had called the constructor directly.

From the constructor object, calling newInstance() causes a new object of the appropriate type to be created. This method is also variadic, because in the nondefault case you would need to supply the correct parameters to the call.

By the way, the class object also declares a newInstance() method that can be used to create new objects directly, without creating an intermediate constructor object. (Class.newInstance() is deprecated due to its exception handling behavior; this will be discussed shortly.)

You may also call getMethod() to retrieve a method object from the class object. This represents the capability to call a method that belongs to the class that the method object was created from. Once you have the Method object, you can use invoke() to call the method it represents.

Instance methods need to be called reflectively with the receiver object (the object that the method is being called upon) as the initial argument to invoke(), with any method parameters following. In the example above, o is the receiver and there are no parameters for the call to toString().

It is also possible to access fields reflectively; the API and syntax are very similar to the case of methods. The core class is java.lang.reflect.Field and the fields can be accessed via Class.getField() and other similar methods. Due to the similarity of the code, I’ll skip over this case.
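For completeness, here is a minimal sketch of the field case. The Point class and the readAndUpdateX() helper are hypothetical and exist only for this illustration; the reflective calls themselves (Class.getField(), Field.getInt(), and Field.setInt()) are part of the Core Reflection API.

```java
import java.lang.reflect.Field;

class FieldDemo {
    // Hypothetical class with public fields, for illustration only.
    public static class Point {
        public int x = 3;
        public int y = 4;
    }

    // Reads the current value of x reflectively, then overwrites it.
    // Returns the value that x held before the update.
    static int readAndUpdateX(Point p, int newValue) throws ReflectiveOperationException {
        Field xField = Point.class.getField("x"); // public fields only
        int old = xField.getInt(p);               // p is the receiver
        xField.setInt(p, newValue);
        return old;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Point p = new Point();
        System.out.println(readAndUpdateX(p, 10)); // prints 3
        System.out.println(p.x);                   // prints 10
    }
}
```

As with Method.invoke(), the receiver object is passed as the first argument to the get and set calls.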

Variadic methods in the Core Reflection API


Here’s a second and slightly more complex example, where the constructors and method calls require arguments.

public record Person(String firstName, String lastName, int age) {}

Class<?> selfClazz = Class.forName("javamag.reflection.ex1.Person");
Constructor<?> ctor = selfClazz.getConstructor(String.class, String.class, int.class);
Object o = ctor.newInstance("Robert", "Smith", 63);
Method equalsMethod = selfClazz.getMethod("equals", Object.class);
Object other = new Object();
Object isEqual = equalsMethod.invoke(o, other);
System.out.println(isEqual);

Here, the calls to getConstructor() and getMethod() are provided with a number of class objects. These represent the type signature of the constructors or methods.

The calls to Constructor.newInstance() and invoke() are, similarly, provided with objects that are the parameters to the calls. These parameters should, of course, match the expected types of the arguments.

In addition to this requirement to use variadic methods to get reflection objects, the type system and nature of Java come into play in several places within the reflective APIs. These concerns include:

◉ Dealing with exceptions
◉ Overloading
◉ Access control
◉ Autoboxing and casting

Each of these concerns can complicate the writing of reflective Java code.

Dealing with exceptions. Until now, I’ve ignored the need to handle exceptions within the reflective code examples. To move into the real world, I’ll show part of the first example with exception handling code, as follows:

try {
    Class<?> selfClazz = getClass();
    Constructor<?> ctor = selfClazz.getConstructor();
    Object o = ctor.newInstance();
    // ...
} catch (NoSuchMethodException e) {
    // Could be thrown by getConstructor()
    e.printStackTrace();
} catch (IllegalAccessException e) {
    // Could be thrown by newInstance()
    e.printStackTrace();
} catch (InstantiationException e) {
    // Could be thrown by newInstance()
    e.printStackTrace();
} catch (InvocationTargetException e) {
    // Could be thrown by newInstance()
    e.printStackTrace();
}

Of these four possible exceptions, NoSuchMethodException can be thrown by getConstructor() if the class does not have a constructor with a signature that matches the class objects passed in. The remaining exceptions could be thrown by Constructor.newInstance() as follows:

◉ IllegalAccessException: Access control is enforced, and the caller does not have the correct access.

◉ InstantiationException: The constructor can be accessed, but an object cannot be created, for example, because the underlying class is abstract.

◉ InvocationTargetException: The execution of the constructor body threw an exception.

The final exception, InvocationTargetException, wraps the original exception that was thrown by the constructor body, and this underlying exception can be retrieved by calling getCause() (or the older, equivalent getTargetException()). It is this feature that explains why this approach is preferred over the deprecated Class.newInstance() method: The deprecated method does not wrap the underlying exception. Instead, it propagates whatever the constructor throws, including checked exceptions, which bypasses the compile-time exception checking the compiler would otherwise perform.
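The wrapping behavior of Constructor.newInstance() can be sketched as follows. Fragile and causeOfFailure() are hypothetical names used only for this example; getCause() returns the same exception as getTargetException().

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

class UnwrapDemo {
    // Hypothetical class whose constructor always fails.
    public static class Fragile {
        public Fragile() {
            throw new IllegalStateException("constructor failed");
        }
    }

    // Calls the constructor reflectively and returns the exception that
    // the constructor body threw, or null if construction succeeded.
    static Throwable causeOfFailure() throws ReflectiveOperationException {
        Constructor<Fragile> ctor = Fragile.class.getConstructor();
        try {
            ctor.newInstance();
            return null;
        } catch (InvocationTargetException e) {
            // The reflective call itself worked; the target code failed.
            return e.getCause();
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Throwable cause = causeOfFailure();
        System.out.println(cause.getClass().getName()); // prints java.lang.IllegalStateException
        System.out.println(cause.getMessage());         // prints constructor failed
    }
}
```

Catching InvocationTargetException separately makes it possible to distinguish a failure in the reflective machinery from a failure in the code being called.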

Of the remaining reflective calls in this example, getMethod() can throw NoSuchMethodException, and invoke() can throw IllegalAccessException and InvocationTargetException.

Overloading. The existence of method overloading in the Java language means that when a method is looked up via getMethod() (or a constructor via getConstructor()), passing the name is not sufficient; instead, the method signature, represented as a sequence of class objects (which will be converted to a Class[]), must also be provided.

I consider usage of Class[] (and Object[] when a method is called reflectively) as something of a design flaw, and it does make the API more difficult to work with. This is really just an unfortunate accident of history: The Core Reflection API was added before the Collections libraries existed, and the API must now be maintained in its original form for backwards compatibility reasons.
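To illustrate how the signature selects among overloads, the sketch below looks up two of the overloads of Math.max() by parameter type. The class OverloadDemo and the helper callMax() are hypothetical names for this example. Because Math.max() is static, the receiver passed to invoke() is null.

```java
import java.lang.reflect.Method;

class OverloadDemo {
    // Looks up the overload max(paramType, paramType) on java.lang.Math
    // and calls it. The primitive result comes back boxed as an Object.
    static Object callMax(Class<?> paramType, Object a, Object b)
            throws ReflectiveOperationException {
        Method max = Math.class.getMethod("max", paramType, paramType);
        return max.invoke(null, a, b); // static method, so no receiver
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // Same name, different signatures, different Method objects.
        System.out.println(callMax(int.class, 3, 7));        // prints 7
        System.out.println(callMax(double.class, 3.0, 7.5)); // prints 7.5
    }
}
```

The name "max" alone would be ambiguous; it is the pair of class objects that identifies which overload is meant.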

Access control. Another problem relates to access control: The API provides two different methods, getMethod() and getDeclaredMethod(), to access methods reflectively.

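The first of these two possibilities, getMethod(), is used for looking up public methods, including those inherited from superclasses and superinterfaces. By contrast, getDeclaredMethod() can be used to find any method declared directly on a class, even private methods, but it does not return inherited methods.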

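By default, reflective code still respects the access control semantics of the Java language, but it is possible to override those semantics because the API provides the setAccessible() method, which can be called on methods, constructors, and fields. Once setAccessible(true) has been called successfully on an accessible object, the access control modifiers will be ignored for calls made through that object. (Since Java 9, the call itself can fail with an InaccessibleObjectException if the target belongs to a module that does not open its package to the caller.)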

In my opinion, the setAccessible() method is fundamentally dangerous, because it allows programmers to selectively turn off parts of the access control system when they are working reflectively.

It represents a compromise in the reflective subsystem: Sometimes there is just no other way to get the access that is required, but when it’s overused it can cause all sorts of security and safety issues.

It can be argued that the compromise is not too bad, because by the time you have obtained a Method object or another accessible object, the corresponding class has definitely already been loaded, which means that the bytecode of the class has already passed verification by the time you make your reflective call.

Nevertheless, using setAccessible() to gain access to methods that are otherwise inaccessible represents a violation of encapsulation and a usage that the original code author did not intend.
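The difference between the two lookup methods, and the effect of setAccessible(), can be seen in a small sketch. Vault and its secret() method are hypothetical and exist only for this example, which also assumes both classes live in the same package and are not in a named module.

```java
import java.lang.reflect.Method;

class AccessDemo {
    // getMethod() only sees public methods, so the private one is not found.
    static boolean foundByGetMethod() {
        try {
            Vault.class.getMethod("secret");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // getDeclaredMethod() finds the private method; setAccessible(true)
    // then switches off the access check for this one Method object.
    static Object callPrivate(Vault v) throws ReflectiveOperationException {
        Method m = Vault.class.getDeclaredMethod("secret");
        m.setAccessible(true);
        return m.invoke(v);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(foundByGetMethod());       // prints false
        System.out.println(callPrivate(new Vault())); // prints hidden
    }
}

// Hypothetical class with a private method, for illustration only.
class Vault {
    private String secret() {
        return "hidden";
    }
}
```

Note that the override applies only to the particular Method object it was called on; a second lookup of the same method yields a fresh object with access checks enabled again.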

Autoboxing and casting. The Core Reflection API contains several variadic methods for lookup and invocation. For example, Constructor.newInstance() and invoke() both take Object... as their variadic parameter, and this immediately raises the question of what to do about primitive values when you call code reflectively.

First off, the lookup methods (such as getConstructor() and getMethod()) take Class<?> objects as parameters, so you can simply pass the class literals corresponding to primitive types, such as int.class.

It’s worth noting here that almost all class literals are singletons. For example, String.class is the only instance of the type Class<String>. However, there are two instances of Class that are parameterized by each wrapper type (such as Integer): one for the wrapper type and one for the corresponding primitive.

That is, the following code prints false:

Class<Integer> intClz = int.class;
Class<Integer> integerClz = Integer.class;
System.out.println(intClz == integerClz);

The second example above provided a clue as to how this applies to reflective code in practice.

Class<?> selfClazz = Class.forName("javamag.reflection.ex1.Person");
Constructor<?> ctor = selfClazz.getConstructor(String.class, String.class, int.class);
Object o = ctor.newInstance("Robert", "Smith", 63);

This code looks up the canonical constructor for the record type and then instantiates an object by passing a primitive value as the third argument. This argument will need to be boxed to match the signature of Constructor.newInstance() (which takes Object...), so the call is really the following:

Object o = ctor.newInstance("Robert", "Smith", Integer.valueOf(63));

The final argument will then be unboxed in the reflection implementation code prior to the actual call to the constructor being made. This approach is fairly simple once you get used to it, but it is a little clumsy, and it does require boxing operations that a direct call would not need.
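The asymmetry between lookup (where int.class and Integer.class are different) and invocation (where a boxed Integer is accepted for an int parameter) can be sketched as follows. BoxingDemo and its helpers are hypothetical names; the Person record mirrors the one used earlier but is nested here so that the example is self-contained.

```java
import java.lang.reflect.Constructor;

class BoxingDemo {
    public record Person(String firstName, String lastName, int age) {}

    // Lookup is exact: only int.class matches the int component.
    static boolean hasConstructor(Class<?> ageType) {
        try {
            Person.class.getConstructor(String.class, String.class, ageType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Invocation takes Object..., so the age always arrives boxed and
    // is unboxed by the reflection implementation before the real call.
    static Person create(Object age) throws ReflectiveOperationException {
        Constructor<Person> ctor =
                Person.class.getConstructor(String.class, String.class, int.class);
        return ctor.newInstance("Robert", "Smith", age);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(hasConstructor(int.class));     // prints true
        System.out.println(hasConstructor(Integer.class)); // prints false
        System.out.println(create(63).age());              // prints 63
    }
}
```

Passing an argument that cannot be unboxed to the expected primitive type, such as the string "63", causes newInstance() to throw an IllegalArgumentException.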

The return value of reflective calls also requires careful handling; the return type of invoke() is Object, so any return value needs to be downcast to an expected, more useful, type.

This cast operation, of course, may fail with a ClassCastException (which is a runtime exception).
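One way to keep the downcast in a single place is a small generic helper built on Class.cast(). This is only a sketch: ReturnCastDemo and invokeAs() are hypothetical names, and the helper handles only public no-argument methods on public classes.

```java
import java.lang.reflect.Method;

class ReturnCastDemo {
    // Calls a public no-argument method by name and casts the result to
    // the expected type. Throws ClassCastException if the types differ.
    static <T> T invokeAs(Class<T> expected, Object receiver, String methodName)
            throws ReflectiveOperationException {
        Method m = receiver.getClass().getMethod(methodName);
        Object result = m.invoke(receiver);
        return expected.cast(result);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        String upper = invokeAs(String.class, "abc", "toUpperCase");
        System.out.println(upper); // prints ABC

        // A primitive int return value comes back boxed as an Integer.
        Integer length = invokeAs(Integer.class, "abc", "length");
        System.out.println(length); // prints 3
    }
}
```

Asking for the wrong type, for example invokeAs(String.class, "abc", "length"), still fails with a ClassCastException, but the failure now occurs at one well-defined point.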

Source: oracle.com
