Friday, June 23, 2023

Method handles: A better way to do Java reflection


Java 18 reimplemented core reflection on top of method handles, which offer a better way to do reflective programming.


What is reflection? Why does it matter? The previous articles in this series (see “Reflection for the modern Java programmer” and “The performance implications of Java reflection”) explored the topic and discussed the implementation of reflection as it shipped in versions of Java up to and including Java 17. This series concludes by building upon this base and explaining the new implementation that ships with Java 18 and later.

Note that to maintain backward compatibility, the Reflection API must be maintained with all its flaws and historical baggage, so I am talking only about internal changes here.

Why is there a new implementation?


Java 17 and earlier implementations of reflection rely on a delegation pattern—specifically a class called DelegatingMethodAccessorImpl. The delegate for this class starts off as a class that relies on native code to perform reflective invocation. However, once a threshold value is passed, the delegate is replaced by a custom class (it’s said that it has been patched out). This custom class is created dynamically at runtime, and this is a relatively expensive operation, which is why it is not performed until an invocation threshold is passed.

This implementation is sometimes referred to as the inflating implementation because the native delegate is inflated to a custom bytecode implementation, removing the need for a native call. This works well enough, but the inflating implementation is rather complex, as it has two separate code paths for reflection: one that goes through native code and one that goes through the custom classes, which are dynamically generated bytecode stubs.
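To make the delegation pattern concrete, here is a toy model of inflation. It is only a sketch: every class name, the threshold of 3, and the string-returning accessors are invented for illustration, and the real JDK classes are considerably more involved (and thread-safe, which this sketch is not).

```java
// Toy model of the "inflating" pattern; every name here is hypothetical.
class InflationSketch {
    interface Accessor {
        String invoke(String arg);
    }

    // Stand-in for the native-code accessor: cheap to create, slower to call.
    static final class NativeLikeAccessor implements Accessor {
        public String invoke(String arg) {
            return "native:" + arg;
        }
    }

    // Stand-in for the dynamically generated class: costly to create, fast to call.
    static final class GeneratedLikeAccessor implements Accessor {
        public String invoke(String arg) {
            return "generated:" + arg;
        }
    }

    // Counts calls and swaps ("patches out") its delegate once the threshold is passed.
    static final class DelegatingAccessor implements Accessor {
        private static final int THRESHOLD = 3; // illustrative value only
        private Accessor delegate = new NativeLikeAccessor();
        private int calls;

        public String invoke(String arg) {
            if (++calls > THRESHOLD && delegate instanceof NativeLikeAccessor) {
                delegate = new GeneratedLikeAccessor();
            }
            return delegate.invoke(arg);
        }
    }

    public static void main(String[] args) {
        Accessor accessor = new DelegatingAccessor();
        for (int i = 1; i <= 5; i++) {
            System.out.println(i + " -> " + accessor.invoke("x"));
        }
    }
}
```

In this model the first three calls go through the native-like delegate and every later call goes through the generated-like one, which is the behavior that gives inflation its two code paths.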

Not only that, but due to the complexities of bytecode verification, the dynamically spun reflective method accessors need to be special-cased by the JVM through the MagicAccessorImpl class. This adds yet more complexity to an already somewhat baroque system.

Overall, these details mean that there is a lot of code to maintain, especially alongside the new MethodHandles API, which also provides similar capabilities for reflection.

This leads to an intriguing possibility: What if the entirety of the inflating implementation could be replaced by an equivalent capability based upon method handles? The relevant plan for this idea is JEP 416: Reimplement core reflection with method handles, which was delivered in Java 18.

The rest of this article will dive deeper into the new implementation and contrast it with the existing inflating implementation, initially by looking at a relevant security mechanism.

Enhanced field filtering


Since the very first release of OpenJDK 7, the reflection subsystem has been able to hide certain fields from users, even when reflection and setAccessible() are used. This is achieved by maintaining a set of field names in an internal Reflection helper class; these field names are not allowed to appear in the value returned by getDeclaredFields().

In Java 7 and Java 8, not many fields are filtered—basically only those required to protect a Java security manager (if one is set) and the filtering mechanism itself.

Java 11 produced the famous warning, “An illegal reflective access operation has occurred,” but the filtering mechanism did not change much from Java 8, except that it moved into a new, nonexported jdk.internal.reflect package in the java.base module.

However, the arrival of Java 12 in March 2019 altered things significantly. The code in jdk.internal.reflect.Reflection was updated to substantially increase the number of fields that are inaccessible to reflective code, as you can see in the following:

/** Used to filter out fields and methods from certain classes from public
    view, where they are sensitive or they may contain VM-internal objects.
    These Maps are updated very rarely. Rather than synchronize on
    each access, they use copy-on-write */
private static volatile Map<Class<?>, Set<String>> fieldFilterMap;
private static volatile Map<Class<?>, Set<String>> methodFilterMap;
private static final String WILDCARD = "*";
public static final Set<String> ALL_MEMBERS = Set.of(WILDCARD);

static {
    fieldFilterMap = Map.of(
        Reflection.class, ALL_MEMBERS,
        AccessibleObject.class, ALL_MEMBERS,
        Class.class, Set.of("classLoader"),
        ClassLoader.class, ALL_MEMBERS,
        Constructor.class, ALL_MEMBERS,
        Field.class, ALL_MEMBERS,
        Method.class, ALL_MEMBERS,
        Module.class, ALL_MEMBERS,
        System.class, Set.of("security")
    );
    methodFilterMap = Map.of();
}
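The comment in that listing mentions copy-on-write. As a rough illustration of the idea (a toy model, not the JDK's code), the sketch below keeps an immutable map behind a volatile field, replaces the whole map on each update, and applies a "*" wildcard when filtering. The class FilterSketch and its methods register() and visible() are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Toy model of a copy-on-write member filter; not the JDK's actual code.
class FilterSketch {
    private static final String WILDCARD = "*";

    // Readers always see a complete, immutable snapshot through the volatile field.
    private static volatile Map<Class<?>, Set<String>> filterMap = Map.of();

    // Writers copy the current map, modify the copy, and publish it in one write.
    static synchronized void register(Class<?> owner, Set<String> names) {
        Map<Class<?>, Set<String>> copy = new HashMap<>(filterMap);
        copy.put(owner, Set.copyOf(names));
        filterMap = Map.copyOf(copy);
    }

    // Returns the member names that survive filtering for the given class.
    static List<String> visible(Class<?> owner, List<String> names) {
        Set<String> hidden = filterMap.get(owner);
        if (hidden == null) {
            return names;
        }
        if (hidden.contains(WILDCARD)) {
            return List.of();
        }
        return names.stream()
                .filter(name -> !hidden.contains(name))
                .collect(Collectors.toList());
    }
}
```

The design choice being illustrated is that reads, which are frequent, need no locking, while the rare updates pay the cost of copying the map.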

Note that access to these fields is handled differently than access to general fields, as the following runs of the same program on Java 11 and on Java 12 show:

$ j11
openjdk version "11.0.13" 2021-10-19
OpenJDK Runtime Environment Temurin-11.0.13+8 (build 11.0.13+8)
OpenJDK 64-Bit Server VM Temurin-11.0.13+8 (build 11.0.13+8, mixed mode)

$ java javamag.reflection.ex2.ReflectTheReflect
class java.lang.reflect.Method
Hello world
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by javamag.reflection.ex2.ReflectTheReflect (file:/Users/ben/projects/writing/Oracle/Articles/reflection/src/main/java/) to field java.lang.reflect.Method.methodAccessor
WARNING: Please consider reporting this to the maintainers of javamag.reflection.ex2.ReflectTheReflect
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
class jdk.internal.reflect.DelegatingMethodAccessorImpl

$ j12
openjdk version "12.0.2" 2019-07-16
OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.2+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 12.0.2+10, mixed mode, sharing)

$ java javamag.reflection.ex2.ReflectTheReflect
class java.lang.reflect.Method
Hello world
java.lang.NoSuchFieldException: methodAccessor
   at java.base/java.lang.Class.getDeclaredField(Class.java:2416)
   at javamag.reflection.ex2.ReflectTheReflect.main(ReflectTheReflect.java:16)

The practical effect of these changes is that from Java 12 onwards, some fields in some key classes in java.base are completely inaccessible to user code, even when reflection is used. Even Unsafe doesn’t help; dangerous techniques such as the “static offset” approach no longer work, as they require a reflective Field object to obtain the offset.
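A quick way to observe the filter from user code is to ask for a filtered field by name. The sketch below assumes Java 12 or later; FilterDemo and isHidden() are names made up for this example, while the field names come from the filter map and the exception shown above.

```java
import java.lang.reflect.Method;

class FilterDemo {
    // Returns true if the named field cannot be found reflectively on the class.
    static boolean isHidden(Class<?> owner, String fieldName) {
        try {
            owner.getDeclaredField(fieldName);
            return false;
        } catch (NoSuchFieldException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Both of these fields are listed in the filter map (Java 12 or later assumed)
        System.out.println("Method.methodAccessor hidden: "
                + isHidden(Method.class, "methodAccessor"));
        System.out.println("Class.classLoader hidden: "
                + isHidden(Class.class, "classLoader"));

        // For contrast, an ordinary private field can still be looked up by name
        System.out.println("String.value hidden: "
                + isHidden(String.class, "value"));
    }
}
```

Note that the filtered fields do not fail with an access error; as far as reflection is concerned, they simply do not exist.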

The field and method filters are a security requirement. Although they do encapsulate the internals, note that they are not directly related to the strong encapsulation provided by modularity (which is what people typically think of when they think of strong encapsulation of Java’s internals).

For example, as you can see from the example above, this change significantly predates the general enforcement of JEP 396: Strongly encapsulate JDK internals by default, which appeared in Java 16 (March 2021). Field filtering also can’t be worked around with options (such as --add-opens), while the encapsulation from modularity typically can be.

(The basics of strong encapsulation can be found in “A peek into Java 17: Encapsulating the Java runtime internals.”)

One result of the enhanced field filter is that the only real way for a Java developer to see the new implementation of reflection is by using an IDE debugger. These tools can access internal JVM data structures at a much deeper level than is possible with Java code, but they are much less convenient than just running some reflective Java code.

Method handles


Method handles are intended for use in the same kinds of circumstances in which a programmer might reach for the Reflection API.

However, by gleaning from over 20 years of practical experience since reflection debuted, this second attempt at a reflective invocation capability is intended to provide (among other things) a better API for direct use.

Method type objects. One useful starting point for this better API is to consider that reflection represents the type signature of methods as instances of Class[]. As noted in “Reflection for the modern Java programmer,” this is because the Reflection API predates the Collections API (and, unfortunately, given Java’s extreme focus on backward compatibility, this is an API that cannot be withdrawn).

However, in the MethodHandles API, this is done with the MethodType class instead, which avoids the problem of using arrays as domain objects. MethodType is a class that has immutable instances that are created from a static factory method that is variadic in class objects, as follows:

public static MethodType methodType(Class<?> rtype, Class<?> ptype0, Class<?>... ptypes) {
    // ...
}

For example, the object that represents the compare() method for a string comparator is obtained as follows:

var mtStringComparator = MethodType.methodType(int.class, String.class, String.class);

The first argument to methodType() is the return type of the method—which is, in this case, int. After that, the types of the arguments to the method follow in positional order.

Notice that this MethodType doesn’t include either the method name or the type of the object that the method will be called upon. This actually makes it a more useful abstraction: It can be used in cases such as lambdas (where the method’s name doesn’t matter), and the same object can describe both static and virtual methods.
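Here is a small, self-contained sketch that exercises the comparator method type described above. MethodTypeDemo and comparatorType() are names chosen for this example; the MethodType calls themselves are standard API.

```java
import java.lang.invoke.MethodType;

class MethodTypeDemo {
    // The method type of a string comparator's compare(): (String,String)int
    static MethodType comparatorType() {
        return MethodType.methodType(int.class, String.class, String.class);
    }

    public static void main(String[] args) {
        MethodType mt = comparatorType();
        System.out.println(mt);                   // (String,String)int
        System.out.println(mt.returnType());      // int
        System.out.println(mt.parameterCount());  // 2

        // Instances are immutable: "changing" one produces a different object
        MethodType boxed = mt.changeReturnType(Integer.class);
        System.out.println(boxed);                // (String,String)Integer
        System.out.println(mt);                   // (String,String)int

        // Two method types built from the same classes are equal
        System.out.println(mt.equals(comparatorType())); // true
    }
}
```

Compare this with reflection, where the parameter types come back as a Class[] that any caller is free to modify.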

Lookup objects. A second issue that method handles resolve is the question of setAccessible(), which lets you break the rules of the Java language by selectively disabling access control.

Having this capability was an explicit nongoal for method handles. Instead, the designers wanted to make sure that encapsulation could be properly protected in the new mechanism.

To achieve this protection, a new concept called a lookup context was introduced. To understand how this works, recall that all Java code executes inside a method, and that every method lives inside a class. Therefore, executing code runs within a class—and so there is a set of methods that the code could call.

For example, the code could call

◉ Any private method of the current class (or an enclosing class)
◉ Any public method of a public class

Similar rules apply to package-private and protected methods, according to the rules of the Java language. (The ability to call public methods is also affected by the module system in recent versions of Java.)

The point of a lookup context is to encapsulate the knowledge of which methods it is legal to call at the point where the lookup object is created. Inaccessible methods are not visible from the lookup context, which removes the distinction that reflection has between getMethod() and getDeclaredMethod().

Because the lookup object encapsulates a capability to look up methods and fields, you need to be careful about sharing the object, especially with untrusted code.

The MethodHandles class also has a publicLookup() method that may be preferable, as it is a minimally trusted lookup that can be freely shared but is restricted in what it can look up.

The most common way to obtain a lookup context is to call the static helper method MethodHandles.lookup(), which returns an object that represents methods accessible from the current class. Once you have the lookup object, you can obtain method handles from it by calling one of its find*() methods, such as findVirtual() or findConstructor().
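To see how a lookup context carries the access rights of the class that created it, consider the following sketch. The classes Vault and LookupDemo and the method tryFind() are hypothetical; both classes are assumed to live in the same package, and Vault.secret() is private.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class Vault {
    private String secret() {
        return "s3cret";
    }

    // A lookup created here carries Vault's own access rights.
    static MethodHandles.Lookup ownLookup() {
        return MethodHandles.lookup();
    }
}

class LookupDemo {
    static final MethodType NO_ARGS_STRING = MethodType.methodType(String.class);

    // Describes what happens when the given lookup tries to resolve Vault.secret().
    static String tryFind(MethodHandles.Lookup lookup) {
        try {
            lookup.findVirtual(Vault.class, "secret", NO_ARGS_STRING);
            return "found";
        } catch (IllegalAccessException e) {
            return "no access";
        } catch (NoSuchMethodException e) {
            return "no such method";
        }
    }

    public static void main(String[] args) {
        System.out.println("Vault's own lookup: " + tryFind(Vault.ownLookup()));
        System.out.println("LookupDemo's lookup: " + tryFind(MethodHandles.lookup()));
        System.out.println("publicLookup(): " + tryFind(MethodHandles.publicLookup()));
    }
}
```

Only the lookup created inside Vault should be able to resolve the private method; the other two are refused. This is also why handing Vault.ownLookup() to untrusted code would be a mistake: Whoever holds that object holds Vault's access rights.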

For example, you can obtain a method handle for toString() as follows:

// Define a method type object corresponding to toString()
// This is the method type for methods that return String and take no parameters
var mt = MethodType.methodType(String.class);
System.out.println("MT: "+ mt);

// Create a lookup object for the current class context
var lk = MethodHandles.lookup();

try {
  // Create a method handle for toString()
  var mh = lk.findVirtual(getClass(), "toString", mt);
  System.out.println("mh.MT: "+ mh.type());

  // ... do something with the method handle

} catch (NoSuchMethodException | IllegalAccessException mhx) {
  throw new RuntimeException(mhx);
}

Because toString() is always public, the lookup should always find the method. However, there’s another point about method types here, which you can see by looking at the output of the following two println() statements:

MT: ()String
mh.MT: (MTMain)String

These show that after a MethodHandles.Lookup.findVirtual() call, the MethodType given by MethodHandle.type() will have the receiver type as the first argument type. In other words, lookups don’t require the receiver type to be manually added, but once a method handle has been resolved, the receiver type (assuming there is one) is known.
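The same effect can be seen with JDK methods. In the sketch below (ReceiverTypeDemo and its helper methods are names invented for this example), a virtual method that takes no parameters and a static method that takes one String end up with method handles of the same type, because the receiver of the virtual method becomes the leading parameter.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ReceiverTypeDemo {
    // String.length() is virtual: looked up as ()int, resolved as (String)int
    static MethodHandle virtualLength() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Integer.parseInt(String) is static: there is no receiver to prepend
    static MethodHandle staticParseInt() {
        try {
            return MethodHandles.lookup()
                    .findStatic(Integer.class, "parseInt",
                            MethodType.methodType(int.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(virtualLength().type());  // (String)int
        System.out.println(staticParseInt().type()); // (String)int
    }
}
```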

Invoking a method handle. In the example above, I used findVirtual() because I want to call the correct override of toString() when invoking the method handle. To see this in action, define a simple class, MTMain, with an implementation of toString(), as follows:

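import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class MTMain {
    @Override
    public String toString() {
        return "MTMain {}";
    }

    public static void main(String[] args) {
        var main = new MTMain();
        var mh = getToStringHandle(main);
        try {
            var s = mh.invoke(main);
            System.out.println(s);
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }

    public static MethodHandle getToStringHandle(Object o) {
        // Performs the lookup shown in the previous example
        // and returns the method handle
        var mt = MethodType.methodType(String.class);
        try {
            return MethodHandles.lookup().findVirtual(o.getClass(), "toString", mt);
        } catch (NoSuchMethodException | IllegalAccessException mhx) {
            throw new RuntimeException(mhx);
        }
    }
}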

This, as expected, has the same overall effect as calling System.out.println(main.toString()). However, there are some important differences. Consider the very similar (but reflective) class RefMain.

import java.lang.reflect.Method;

public class RefMain {
    @Override
    public String toString() {
        return "RefMain {}";
    }

    public static void main(String[] args) {
        var main = new RefMain();
        var mh = getToStringMethod(main);
        try {
            var s = mh.invoke(main);
            System.out.println(s);
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }

    public static Method getToStringMethod(Object o) {
        Class<?> clazz = o.getClass();
        try {
            return clazz.getMethod("toString");
        } catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
    }
}

Here’s the bytecode.

public static void main(java.lang.String[]);
    Code:
       0: new           #9                  // class javamag/reflection/ex1/RefMain
       3: dup
       4: invokespecial #11                 // Method "<init>":()V
       7: astore_1
       8: aload_1
       9: invokestatic  #12                 // Method getToStringMethod:(Ljava/lang/Object;)Ljava/lang/reflect/Method;
      12: astore_2
      13: aload_2
      14: aload_1
      15: iconst_0
      16: anewarray     #2                  // class java/lang/Object
      19: invokevirtual #16                 // Method java/lang/reflect/Method.invoke:(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
      22: astore_3
      23: getstatic     #22                 // Field java/lang/System.out:Ljava/io/PrintStream;
      26: aload_3
      27: invokevirtual #28                 // Method java/io/PrintStream.println:(Ljava/lang/Object;)V

It’s quite similar to the method handle case’s bytecode, up until bytecode offset 15.

public static void main(java.lang.String[]);
    Code:
       0: new           #9                  // class javamag/reflection/mh/MTMain
       3: dup
       4: invokespecial #11                 // Method "<init>":()V
       7: astore_1
       8: aload_1
       9: invokestatic  #12                 // Method getToStringHandle:(Ljava/lang/Object;)Ljava/lang/invoke/MethodHandle;
      12: astore_2
      13: aload_2
      14: aload_1
      15: invokevirtual #16                 // Method java/lang/invoke/MethodHandle.invoke:(Ljavamag/reflection/mh/MTMain;)Ljava/lang/Object;
      18: astore_3
      19: getstatic     #22                 // Field java/lang/System.out:Ljava/io/PrintStream;
      22: aload_3
      23: invokevirtual #28                 // Method java/io/PrintStream.println:(Ljava/lang/Object;)V

Notice there’s something different about the signature of Method.invoke() compared to MethodHandle.invoke() in the bytecode. The reflective case has a signature of

Method java/lang/reflect/Method.invoke:(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;

which exactly corresponds to the following source code in Method.java:

// Method.java
    public Object invoke(Object obj, Object... args)
        throws IllegalAccessException, IllegalArgumentException,
           InvocationTargetException
    {
        // ...
    }

This is quite different from the MethodHandle case, which (from the bytecode) has a signature like the following, which means: “a method that acts on an object of type MTMain and returns Object.”

Method java/lang/invoke/MethodHandle.invoke:(Ljavamag/reflection/mh/MTMain;)Ljava/lang/Object;

However, the source code for invoke() on a method handle is declared as follows:

// MethodHandle.java
public final native @PolymorphicSignature Object invoke(Object... args) throws Throwable;

What’s going on? The answer is in the annotation @PolymorphicSignature—which indicates that invoke() (and a couple of other methods of MethodHandle) is signature polymorphic.

Per the Java Language Specification, a signature polymorphic method is one that can operate with essentially any argument and return types. When the Java source code compiler encounters a call to a signature polymorphic method, the code will compile, regardless of how the method is being called. Effectively, it is as though a signature polymorphic method is not a single method, but an entire family of methods with every possible signature available.
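One consequence is that the compiler records whatever argument and return types it sees at the call site, and it is then up to the runtime to decide whether they fit. The sketch below illustrates this with invoke() and its stricter sibling invokeExact(), which is also signature polymorphic. PolymorphicDemo and its helper methods are names chosen for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

class PolymorphicDemo {
    // Handle for String.toUpperCase(); its resolved type is (String)String
    static final MethodHandle TO_UPPER = findToUpper();

    private static MethodHandle findToUpper() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "toUpperCase", MethodType.methodType(String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // The cast makes the call site (String)String, which matches exactly.
    static String exactCall() {
        try {
            return (String) TO_UPPER.invokeExact("abc");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Without the cast the call site is (String)Object: it compiles, but
    // invokeExact() rejects it at runtime.
    static boolean inexactCallFails() {
        try {
            Object ignored = TO_UPPER.invokeExact("abc");
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // invoke() is more forgiving: it adapts the (Object)Object call site.
    static Object lenientCall() {
        try {
            return TO_UPPER.invoke((Object) "abc");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(exactCall());
        System.out.println(inexactCallFails());
        System.out.println(lenientCall());
    }
}
```

All three calls compile without complaint; the difference between them shows up only when the code runs.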

To illustrate this, the following code alludes to the string comparator example shown earlier:

public int compareTwoStrings(String s1, String s2) {
    return 0;
}

private static MethodHandle getStringCompHandle() {
    // MethodType of compareTwoStrings()
    var mt = MethodType.methodType(int.class, String.class, String.class);
    var lk = MethodHandles.lookup();

    try {
        return lk.findVirtual(lk.lookupClass(), "compareTwoStrings", mt);
    } catch (NoSuchMethodException | IllegalAccessException mhx) {
        throw new RuntimeException(mhx);
    }
}

This example uses the lookupClass() method present on the lookup object. Here, this method returns the class object of the current class; unlike getClass(), it works from both static and instance methods.

Here’s a bit of code to drive this example.

var mhSc = getStringCompHandle();
try {
    var s = mhSc.invoke(main, "foo", "bar");
    System.out.println(s);
} catch (Throwable e) {
    e.printStackTrace();
}

Sure enough, the bytecode for this snippet shows the signature polymorphic call to invoke().

Method java/lang/invoke/MethodHandle.invoke:(Ljavamag/reflection/mh/MTMain;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/Object;

This is all fine at compile time, but what about at runtime? For example, what happens if you try to invoke the method handle with the wrong arguments, as in the following:

var s = mhSc.invoke(main, "foo");

Well, this will cause an exception and, helpfully, the JVM reports that the method handle you are attempting to call is expecting three arguments—one of type MTMain and two strings—but it was called with only an MTMain and one String.

java.lang.invoke.WrongMethodTypeException: cannot convert MethodHandle(MTMain,String,String)int to (MTMain,String)Object
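For readers who want to try this, here is one way the fragments above could be assembled into a single runnable class. The class name StringCompDemo and the two call*() helper methods are assumptions made for this sketch, and compareTwoStrings() delegates to String.compareTo() rather than returning 0 so that the result is more interesting.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

class StringCompDemo {
    public int compareTwoStrings(String s1, String s2) {
        return s1.compareTo(s2);
    }

    static MethodHandle getStringCompHandle() {
        var mt = MethodType.methodType(int.class, String.class, String.class);
        var lk = MethodHandles.lookup();
        try {
            return lk.findVirtual(lk.lookupClass(), "compareTwoStrings", mt);
        } catch (NoSuchMethodException | IllegalAccessException mhx) {
            throw new RuntimeException(mhx);
        }
    }

    // Correct shape: receiver plus two strings; the int result comes back boxed.
    static Object callWithAllArguments() {
        try {
            return getStringCompHandle().invoke(new StringCompDemo(), "foo", "bar");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Wrong shape: one string is missing, so the call fails at runtime.
    static String callWithMissingArgument() {
        try {
            getStringCompHandle().invoke(new StringCompDemo(), "foo");
            return "no exception";
        } catch (WrongMethodTypeException e) {
            return e.getClass().getSimpleName();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithAllArguments());
        System.out.println(callWithMissingArgument());
    }
}
```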

These examples show that the method handle approach has more information in the bytecode to help the runtime, and it avoids some of the overhead that is present in reflective calls. In particular, the boxing of primitive method arguments and collecting the arguments into an array can be avoided because the shape of the method call is known.
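The point about boxing is easiest to see with a method that takes and returns primitives. The following sketch calls Math.max(int, int) both ways; BoxingDemo and its two helper methods are hypothetical names. It shows the shape of each call only and makes no claim about relative speed.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class BoxingDemo {
    static int viaMethodHandle(int a, int b) {
        try {
            MethodHandle max = MethodHandles.lookup().findStatic(Math.class, "max",
                    MethodType.methodType(int.class, int.class, int.class));
            // The call site is (int,int)int: no boxing and no argument array
            return (int) max.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static int viaReflection(int a, int b) {
        try {
            Method max = Math.class.getMethod("max", int.class, int.class);
            // The arguments are boxed to Integer and collected into an Object[],
            // and the result comes back as a boxed Integer
            Object result = max.invoke(null, a, b);
            return (Integer) result;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaMethodHandle(3, 7)); // 7
        System.out.println(viaReflection(3, 7));   // 7
    }
}
```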

Rebasing reflection on top of method handles


Method handles can be used as a replacement for reflection in modern Java code, because they have essentially the same set of capabilities. There are some differences—method name changes and, in particular, the use of lookup objects—but overall, Java programmers who are already familiar with reflection should not experience any serious difficulties coming to grips with method handles.

All of this raises an intriguing possibility—could method handles be used as the engine to perform introspection operations, with the old Reflection API retrofitted on top of it?

Yes! In fact, JDK 18 did indeed switch over to using method handles to implement reflection, without changing the interface.

The old (Java 1.1) reflection interface must be maintained because Java always tries to retain backward compatibility, but the implementing code can be changed if that constraint holds.

The MethodAccessor interface is maintained, but instead of the MethodAccessorImpl classes that you are already familiar with, the accessor behind java.lang.reflect.Method is now built from method handles, as you can see in the JDK source code that creates it (jdk.internal.reflect.ReflectionFactory).

public MethodAccessor newMethodAccessor(Method method, boolean callerSensitive) {
    // use the root Method that will not cache caller class
    Method root = langReflectAccess.getRoot(method);
    if (root != null) {
        method = root;
    }

    if (useMethodHandleAccessor()) {
        return MethodHandleAccessorFactory.newMethodAccessor(method, callerSensitive);
    } else {
        if (noInflation() && !method.getDeclaringClass().isHidden()) {
            return generateMethodAccessor(method);
        } else {
            NativeMethodAccessorImpl acc = new NativeMethodAccessorImpl(method);
            return acc.getParent();
        }
    }
}

The method handles implementation is the default—and so useMethodHandleAccessor() will default to true—but the old inflating implementation is temporarily still available. It can be activated via the following switch, but this is purely for compatibility reasons and will be removed in a future JDK release:

-Djdk.reflect.useDirectMethodHandle=false
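To give a feel for what layering the old API over method handles involves, here is a minimal sketch of an accessor with a reflection-style invoke(receiver, args) shape that is driven by a method handle. It is not the JDK's implementation: SimpleAccessor is a hypothetical stand-in for the internal MethodAccessor interface, the sketch handles only instance methods, and it ignores caller-sensitive methods and exception wrapping.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class HandleBackedAccessorSketch {
    // Hypothetical stand-in for the JDK-internal MethodAccessor interface.
    interface SimpleAccessor {
        Object invoke(Object receiver, Object[] args) throws Throwable;
    }

    // Wraps an instance method in an accessor that is driven by a method handle.
    static SimpleAccessor accessorFor(Method method) {
        try {
            // For an instance method the handle's type is (Receiver, params...)Ret
            MethodHandle target = MethodHandles.lookup().unreflect(method);

            // Adapt it to the reflection-style shape (Object, Object[])Object
            MethodHandle adapted = target
                    .asSpreader(Object[].class, method.getParameterCount())
                    .asType(MethodType.methodType(Object.class, Object.class, Object[].class));

            return (receiver, args) -> (Object) adapted.invokeExact(receiver, args);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object demo() {
        try {
            Method concat = String.class.getMethod("concat", String.class);
            return accessorFor(concat).invoke("foo", new Object[] {"bar"});
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // foobar
    }
}
```

The adaptation step is the interesting part: The method handle keeps its precise type internally, and only the outermost layer is reshaped to accept an Object receiver and an Object[] of arguments, as Method.invoke() requires.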

Unfortunately, demonstrating the existence of the new implementation from within Java code is not very easy. As you saw earlier in the article, recent Java releases have started removing certain fields from the list of defined fields returned by introspection. This has the effect of making these fields impossible to access reflectively, and so you can’t do the sort of tricks shown in “The performance implications of Java reflection” to show the method handles present inside Method objects.

Finally, a word on performance: It might be tempting to pose the question, “How does the new implementation compare with the original, inflating code?” There is no simple, well-defined answer. Many aspects of the reflection subsystem have changed (such as access control, boxing, and volatile access to accessors), and the overall aggregate effect is impossible to reason about.

I could provide some Java Microbenchmark Harness (JMH) benchmarks that purport to show the difference between the two implementations, but everything I wrote in the previous article about microbenchmarks—that they represent individual data points and not a representation of some deeper underlying truth—continues to apply.

Instead, remember the somewhat mundane truth at the heart of performance analysis: If you want to know the performance impact of a particular language or JVM feature on your specific application within a certain range of inputs, you will have to test your specific application.

Conclusion

This series of articles provided an in-depth tour of reflection. As well as discussing the APIs, it covered reflection as a runtime technology that shows the real difference between the statically typed Java language and the dynamic nature of the JVM.

I’ve discussed how reflection is implemented, and I introduced a little of the JVM internals that support the capability. This included showing how recent releases of Java have closed off access to the reflection internals from meddling programmers, which allows the platform developers to make changes more freely.

This final article introduced the new MethodHandles API and showed how this has been used to reimplement the Reflection API. It also showed how it can be used directly as a more convenient and modern way to do introspection.

Source: oracle.com
