Tuesday, May 24, 2022

Bruce Eckel on pattern matching in Java

You can currently use pattern variables for instanceof and for switch. See how they work.

JDK 16 finalized JDK Enhancement Proposal (JEP) 394, Pattern matching for instanceof. Below you can see the old and new ways of doing the same thing.

// enumerations/SmartCasting.java
// {NewFeature} Since JDK 16

public class SmartCasting {
  static void dumb(Object x) {
    if(x instanceof String) {
      String s = (String)x;
      if(s.length() > 0) {
        System.out.format(
          "%d %s%n", s.length(), s.toUpperCase());
      }
    }
  }
  static void smart(Object x) {
    if(x instanceof String s && s.length() > 0) {
      System.out.format(
        "%d %s%n", s.length(), s.toUpperCase());
    }
  }
  static void wrong(Object x) {
    // "Or" never works:
    // if(x instanceof String s || s.length() > 0) {}
    // error: cannot find symbol   ^
  }
  public static void main(String[] args) {
    dumb("dumb");
    smart("smart");
  }
}
/* Output:
4 DUMB
5 SMART
*/

(The {NewFeature} comment tag excludes this example from the Gradle build that uses JDK 8.)

In dumb(), once instanceof establishes that x is a String, you must explicitly cast it to String s. Otherwise, you would be inserting casts throughout the rest of the function. But in smart(), notice the x instanceof String s.

This automatically creates a new variable s of type String. Note that s is available throughout the scope, even within the remainder of the if conditional, as you see in && s.length() > 0. This produces more-compact code.

In contrast, wrong() shows that a pattern variable introduced by a successful match can be used after && but not after ||. Using || would mean that x is an instanceof String or that s.length() > 0. The right side of || is evaluated only when the left side is false, which means x is not a String, in which case Java could not smart-cast x to create s; therefore, s is not available on the right side of the ||.
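The same flow rule has a mirror image that is worth seeing once: if the instanceof test is negated, the right side of || runs only when the match succeeded, so the pattern variable is available there. The following is a minimal sketch of that case; the class name OrScopeSketch and the helper blankOrNotString() are made up for illustration and are not part of the original examples.

```java
// OrScopeSketch.java
// Illustrative sketch, not from the original article. Requires JDK 16+.

public class OrScopeSketch {
  // Hypothetical helper: true unless x is a non-empty String.
  static boolean blankOrNotString(Object x) {
    // The right operand of || is evaluated only when the left
    // operand is false, i.e. only when x IS a String, so s has
    // definitely been matched by the time s.isEmpty() runs.
    return !(x instanceof String s) || s.isEmpty();
  }
  public static void main(String[] args) {
    System.out.println(blankOrNotString("text"));
    System.out.println(blankOrNotString(""));
    System.out.println(blankOrNotString(42));
  }
}
/* Output:
false
true
true
*/
```

So the compiler is not special-casing && versus ||; it tracks on which paths the match is known to have succeeded.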

JEP 394 calls s a pattern variable.

Although this feature does clean up some messy if statements, that wasn’t the motivation for adding it. It was included as a building block for pattern matching, as you will see shortly.

By the way, this feature can produce some strange scoping behavior such as the following:

// enumerations/OddScoping.java
// {NewFeature} Since JDK 16

public class OddScoping {
  static void f(Object o) {
    if(!(o instanceof String s)) {
      System.out.println("Not a String");
      throw new RuntimeException();
    }
    // s is in scope here!
    System.out.println(s.toUpperCase());  // line [1]
  }
  public static void main(String[] args) {
    f("Curiouser and Curiouser");
    f(null);
  }
}
/* Output:
CURIOUSER AND CURIOUSER
Not a String
Exception in thread "main" java.lang.RuntimeException
        at OddScoping.f(OddScoping.java:8)
        at OddScoping.main(OddScoping.java:15)
*/

In line [1], s is in scope only if you don’t include the statement that throws the exception. If you comment out throw new RuntimeException(), the compiler tells you it can’t find s in line [1], which is the behavior you normally would expect.

Initially this might look like a bug, but it is designed that way; this behavior is explicitly described in JEP 394. Although this is arguably a corner case, you can imagine how difficult it might be to track down a bug caused by this behavior.
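One place where this scoping rule is commonly put to deliberate use is a guard clause, for example in an equals() method. The sketch below is one possible illustration, not code from the original article; the GuardClauseSketch and Point names are assumptions introduced here.

```java
// GuardClauseSketch.java
// Illustrative sketch, not from the original article. Requires JDK 16+.
import java.util.Objects;

public class GuardClauseSketch {
  // Hypothetical value class used only for this illustration.
  static final class Point {
    private final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
    @Override public boolean equals(Object o) {
      // Guard clause: leave early unless o is a Point.
      // A null argument also fails the instanceof test.
      if(!(o instanceof Point p))
        return false;
      // Because the if body always exits, the match must have
      // succeeded on this path, so p is in scope here.
      return x == p.x && y == p.y;
    }
    @Override public int hashCode() {
      return Objects.hash(x, y);
    }
  }
  public static void main(String[] args) {
    System.out.println(new Point(1, 2).equals(new Point(1, 2)));
    System.out.println(new Point(1, 2).equals("1,2"));
    System.out.println(new Point(1, 2).equals(null));
  }
}
/* Output:
true
false
false
*/
```

As in OddScoping.java, p is usable after the if only because the if body cannot complete normally; replace the return with a plain statement and the compiler no longer finds p.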

Pattern matching

Now that all the foundational pieces are in place, you can look at the bigger picture of pattern matching in Java.

At the time of this writing, pattern matching for instanceof was delivered in JDK 16 as JEP 394. Pattern matching for switch was a preview feature in JDK 17 as JEP 406, and there was a second preview in JDK 18 as JEP 420. This means pattern matching for switch probably won’t change significantly in future versions, but it hasn’t been finalized. By the time you read this, the features shown here could be out of preview, so you might not need the compiler and runtime flags specified in the example comments.

Violating the Liskov Substitution Principle

Inheritance-based polymorphism performs type-based behavior but requires those types to be in the same inheritance hierarchy. Pattern matching allows you to perform type-based behavior on types that don’t all have the same interface or are not in the same hierarchy.

This resembles reflection in that you still determine types at runtime, but it is a more formalized and structured way to do so.

If the types of interest all have a common base type and use only methods defined in that common base type, you are conforming to the Liskov Substitution Principle (LSP). In this case, pattern matching is unnecessary, because you can just use normal inheritance polymorphism, as follows:

// enumerations/NormalLiskov.java
import java.util.stream.*;

interface LifeForm {
  String move();
  String react();
}

class Worm implements LifeForm {
  @Override public String move() {
    return "Worm::move()";
  }
  @Override public String react() {
    return "Worm::react()";
  }
}

class Giraffe implements LifeForm {
  @Override public String move() {
    return "Giraffe::move()";
  }
  @Override public String react() {
    return "Giraffe::react()";
  }
}

public class NormalLiskov {
  public static void main(String[] args) {
    Stream.of(new Worm(), new Giraffe())
      .forEach(lf -> System.out.println(
        lf.move() + " " + lf.react()));
  }
}
/* Output:
Worm::move() Worm::react()
Giraffe::move() Giraffe::react()
*/

All methods are neatly defined in the LifeForm interface, and no new methods are added in any of the implemented classes.

But what if you need to add methods that can’t easily be placed in the base type? Some worms can reproduce when they are divided, for example, and a giraffe certainly can’t do that. A giraffe can kick, but it’s hard to imagine how you’d represent that in the base class in a way that didn’t make the Worm implementation problematic.

The Java Collections library ran into this problem and attempted to solve it by adding optional methods in the base type that were implemented in some subclasses but not in others. This approach conformed to LSP but produced a confusing design.
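To make the "optional methods" point concrete, here is a small sketch showing that add() is declared on every List but supported by only some implementations. The class name OptionalOpsSketch and the helper tryAdd() are hypothetical names used for this illustration; List.of() requires JDK 9 or later.

```java
// OptionalOpsSketch.java
// Illustrative sketch, not from the original article. Requires JDK 9+.
import java.util.ArrayList;
import java.util.List;

public class OptionalOpsSketch {
  // Hypothetical helper: reports whether this List supports add().
  static String tryAdd(List<String> list) {
    try {
      list.add("x");
      return "added";
    } catch(UnsupportedOperationException e) {
      // The method exists in the interface, but this
      // implementation declines to support it.
      return "unsupported";
    }
  }
  public static void main(String[] args) {
    System.out.println(tryAdd(new ArrayList<>()));
    System.out.println(tryAdd(List.of("a")));
  }
}
/* Output:
added
unsupported
*/
```

Both calls compile against the same interface, so the difference shows up only at runtime, which is the confusing part of the design.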

Java was fundamentally inspired by Smalltalk, a dynamic language that reuses code by taking existing classes and adding methods. A Smalltalk design for a Pet hierarchy might end up looking like the following:

// enumerations/Pet.java

public class Pet {
  void feed() {}
}

class Dog extends Pet {
  void walk() {}
}

class Fish extends Pet {
  void changeWater() {}
}

You can take the basic Pet functionality and extend that class by adding methods as you need them. This is different from what is normally advocated for Java (and shown in NormalLiskov.java), where you carefully design the base type to include all possible methods needed throughout the hierarchy, thus conforming to LSP. Although this is a nice aspirational goal, it can be impractical.

Attempting to force the dynamically typed Smalltalk model into the statically typed Java system is bound to create compromises. In some cases, those compromises might be unworkable. Pattern matching allows you to use Smalltalk’s approach of adding new methods to derived classes while still maintaining most of the formality of LSP. Basically, pattern matching allows you to violate LSP without creating unmanageable code.

With pattern matching, you can deal with the non-LSP nature of the Pet hierarchy by checking for, and writing different code for, each possible type.

// enumerations/PetPatternMatch.java
// {NewFeature} Preview in JDK 17
// Compile with javac flags:
//   --enable-preview --source 17
import java.util.*;

public class PetPatternMatch {
  static void careFor(Pet p) {
    switch(p) {
      case Dog d -> d.walk();
      case Fish f -> f.changeWater();
      case Pet sp -> sp.feed();
    };
  }
  static void petCare() {
    List.of(new Dog(), new Fish())
      .forEach(p -> careFor(p));
  }
}

The p in switch(p) is called the selector expression. Prior to pattern matching, a selector expression could be only an integral primitive type (char, byte, short, or int), the corresponding boxed form (Character, Byte, Short, or Integer), String, or an enum type. With pattern matching, the selector expression is expanded to include any reference type. Here, the selector expression can be a Dog, Fish, or Pet.

Notice that this is similar to dynamic binding within an inheritance hierarchy, but instead of putting the code for different types within the overridden methods, you’ll put it in the different case expressions.
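For comparison, the same dispatch can be written with only the finalized JDK 16 feature, as a chain of instanceof patterns. This is a rough sketch rather than the author's code: the Pet classes are redeclared as nested classes so the file stands alone, and their methods return a String (an assumption made here so the result is visible) instead of being void.

```java
// InstanceofChainSketch.java
// Illustrative sketch, not from the original article. Requires JDK 16+.

public class InstanceofChainSketch {
  // Local stand-ins for the Pet hierarchy; the String return
  // values are an assumption added for this illustration.
  static class Pet {
    String feed() { return "feed"; }
  }
  static class Dog extends Pet {
    String walk() { return "walk"; }
  }
  static class Fish extends Pet {
    String changeWater() { return "changeWater"; }
  }
  static String careFor(Pet p) {
    // Each test both checks the type and binds a typed variable.
    if(p instanceof Dog d)
      return d.walk();
    else if(p instanceof Fish f)
      return f.changeWater();
    else
      return p.feed();  // Plays the role of "case Pet sp"
  }
  public static void main(String[] args) {
    System.out.println(careFor(new Dog()));
    System.out.println(careFor(new Fish()));
    System.out.println(careFor(new Pet()));
  }
}
/* Output:
walk
changeWater
feed
*/
```

The chain works, but nothing forces it to cover every type: drop the final else and the method simply fails to compile for lack of a return, with no hint about which types were missed. The switch form lets the compiler check coverage directly.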

The compiler forced the addition of case Pet because that class can legitimately exist without being a Dog or a Fish. Without case Pet, then, the switch didn’t cover all possible input values. Using an interface for the base type eliminates this constraint but adds a different one. The following example is placed in its own package to prevent name clashes:

// enumerations/PetPatternMatch2.java
// {NewFeature} Preview in JDK 17
// Compile with javac flags:
//   --enable-preview --source 17
package sealedpet;
import java.util.*;

sealed interface Pet {
  void feed();
}

final class Dog implements Pet {
  @Override public void feed() {}
  void walk() {}
}

final class Fish implements Pet {
  @Override public void feed() {}
  void changeWater() {}
}

public class PetPatternMatch2 {
  static void careFor(Pet p) {
    switch(p) {
      case Dog d -> d.walk();
      case Fish f -> f.changeWater();
    };
  }
  static void petCare() {
    List.of(new Dog(), new Fish())
      .forEach(p -> careFor(p));
  }
}

If Pet is not sealed, the compiler complains that the switch statement does not cover all possible input values. In this case, it’s because the interface Pet could be implemented by anyone else in any other file, breaking the exhaustive coverage by the switch statement. By making Pet sealed, the compiler can ensure that the switch covers all possible Pet types.
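The list of permitted subtypes that the compiler relies on is also recorded in the class file and can be inspected at runtime. The sketch below assumes JDK 17, where sealed types and the Class.isSealed() and Class.getPermittedSubclasses() methods are final. The SealedSketch class and the permittedNames() helper are hypothetical names, and the permits clause is written out explicitly here, although Java lets you omit it when the implementing classes are in the same source file, as in PetPatternMatch2.java.

```java
// SealedSketch.java
// Illustrative sketch, not from the original article. Requires JDK 17+.
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SealedSketch {
  // Explicit permits clause; optional when everything is in one file.
  sealed interface Pet permits Dog, Fish {}
  static final class Dog implements Pet {}
  static final class Fish implements Pet {}

  // Hypothetical helper: sorted simple names of the permitted
  // subtypes, or an empty list if the type is not sealed.
  static List<String> permittedNames(Class<?> type) {
    Class<?>[] permitted = type.getPermittedSubclasses();
    if(permitted == null)  // null means "not sealed"
      return List.of();
    return Arrays.stream(permitted)
      .map(Class::getSimpleName)
      .sorted()
      .collect(Collectors.toList());
  }
  public static void main(String[] args) {
    System.out.println(Pet.class.isSealed());
    System.out.println(permittedNames(Pet.class));
    System.out.println(permittedNames(Object.class));
  }
}
/* Output:
true
[Dog, Fish]
[]
*/
```

Because Dog and Fish are the only permitted implementations, a switch that handles both has nothing left to cover.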

Pattern matching doesn’t constrain you to a single hierarchy the way inheritance polymorphism does; that is, you can match on any type. To do this, you can pass an Object into the switch, as follows:

// enumerations/ObjectMatch.java
// {NewFeature} Preview in JDK 17
// Compile with javac flags:
//   --enable-preview --source 17
// Run with java flag: --enable-preview
import java.util.*;

record XX() {}

public class ObjectMatch {
  static String match(Object o) {
    return switch(o) {
      case Dog d -> "Walk the dog";
      case Fish f -> "Change the fish water";
      case Pet sp -> "Not dog or fish";
      case String s -> "String " + s;
      case Integer i -> "Integer " + i;
      case String[] sa -> String.join(", ", sa);
      case null, XX xx -> "null or XX: " + xx;
      default -> "Something else";
    };
  }
  public static void main(String[] args) {
    List.of(new Dog(), new Fish(), new Pet(),
      "Oscar", Integer.valueOf(12),
      Double.valueOf("47.74"),
      new String[]{ "to", "the", "point" },
      new XX()
    ).forEach(
      p -> System.out.println(match(p))
    );
  }
}
/* Output:
Walk the dog
Change the fish water
Not dog or fish
String Oscar
Integer 12
Something else
to, the, point
null or XX: XX[]
*/

When passing an Object parameter to switch, a default is required by the compiler—again, to cover all possible input values (except for null, for which a case is not required by the compiler, even though it can happen).

It’s possible to combine the null case with a pattern, as seen in case null, XX xx. This works because an object reference can be null.
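It helps to remember what happens when no case handles null: the switch throws a NullPointerException on a null selector, which is the long-standing behavior for String and enum switches. The sketch below shows that behavior with a classic String switch so it runs without preview flags; the class name NullSelectorSketch and both method names are assumptions introduced for this illustration.

```java
// NullSelectorSketch.java
// Illustrative sketch, not from the original article.

public class NullSelectorSketch {
  // A classic switch: a null selector throws NullPointerException
  // before any case label is considered.
  static String classic(String s) {
    switch(s) {
      case "dog": return "Walk the dog";
      default:    return "Something else";
    }
  }
  // The traditional workaround is a null test ahead of the switch,
  // which is the job "case null" takes over in a pattern switch.
  static String guarded(String s) {
    if(s == null)
      return "null";
    return classic(s);
  }
  public static void main(String[] args) {
    System.out.println(guarded("dog"));
    System.out.println(guarded(null));
    try {
      classic(null);
    } catch(NullPointerException e) {
      System.out.println("NullPointerException");
    }
  }
}
/* Output:
Walk the dog
null
NullPointerException
*/
```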

Source: oracle.com
