Friday, November 18, 2022

Build smarter Java types with records and enums


A type defines a set of values. Historically, developers haven’t been very good at using encapsulation to ensure that objects stay within a type’s set of values. In response, this article introduces a functional approach to Java type design that uses Java’s record keyword to guarantee that each constructed object is a legal value.


In this approach, you validate the object in one place, at construction. Because record fields are automatically final, an object cannot be morphed into an illegal value. Such a typed object never needs to be rechecked by any function that receives it as an argument or returns it as a result.

I’ll begin with the following simple utility that makes the examples cleaner and easier to read by providing an abbreviation for console output:

// util/Show.java
package util;

public class Show {
  public static void show(Object msg) {
    System.out.println(msg);
  }
}

Say you want to test and display invalid objects. Normally you would throw an exception when an object is invalid, but here you want to see the output from the rest of the program after the failure is reported. The following utility does that:

// util/Check.java
package util;
import static util.Show.show;

public class Check {
  public static boolean
  valid(boolean exp, String errInfo) {
    if (!exp) {
      show("Type failure: " + errInfo);
      return false;
      // Should actually throw an
      // exception, but this allows you
      // to see complete example output.
    }
    return true;
  }
  public static boolean
  range(boolean exp, Object errInfo) {
    return valid(exp,
        errInfo + " out of range");
  }
}

Here, range() expects a compound Boolean expression that tests whether your object is within a particular range.
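As the comment in Check notes, production code would throw rather than return false. A minimal sketch of such a variant follows; the class name StrictCheck and the choice of IllegalArgumentException are assumptions made for illustration and are not part of the utility above.

```java
// StrictCheck.java (hypothetical name)
public class StrictCheck {
  // Throws instead of printing, so an invalid
  // value stops the caller immediately.
  public static void
  valid(boolean exp, String errInfo) {
    if (!exp)
      throw new IllegalArgumentException(
        "Type failure: " + errInfo);
  }
  public static void
  range(boolean exp, Object errInfo) {
    valid(exp, errInfo + " out of range");
  }
  public static void main(String[] args) {
    int ok = 6, bad = 11;
    range(0 < ok && ok <= 10, ok); // Passes silently
    try {
      range(0 < bad && bad <= 10, bad);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
/*
Type failure: 11 out of range
*/
```

The examples in this article keep the non-throwing version so that the complete output remains visible.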

Star ratings


Consider how you might manage star ratings for surveys and feedback mechanisms that allow users to provide a rating between 1 and 10 stars. An int representing the number of stars might seem like the most straightforward solution.

// example1/Starred.java
// 1 to 10 stars for giving feedback
package example1;
import util.Check;
import static util.Show.show;

public class Starred {
  static int f1(int stars) {
    Check.range(
      0 < stars && stars <= 10, stars);
    return stars * 2;
  }
  static int f2(int stars) {
    Check.range(
      0 < stars && stars <= 10, stars);
    return stars + 4;
  }
  public static void main(String[] args) {
    int stars1 = 6;
    show(stars1);
    show(f1(stars1));
    show(f2(stars1));
    int stars2 = 11;
    show(f1(stars2));
    stars1 = 99;
    show(f2(stars1));
  }
}

/*
6
12
10
Type failure: 11 out of range
22
Type failure: 99 out of range
103
 */

There are three problems with this approach.

◉ f1() and f2() accept an int representing the number of stars. Anyone who writes such a function must remember to validate that argument, and an unknown future programmer maintaining or extending the code must perform the check and also understand that check.

◉ f1() and f2() look like they might be returning stars but because the return type is just int, there’s no way to be sure. If they are returning stars, they are not testing the return expressions; so, the resulting star values will sometimes be outside the range of 1 to 10.

◉ If the meaning of stars changes, all the code that validates this concept-in-the-form-of-an-int must be modified and debugged.

Instead of int, what you need is a new type that specifically defines the behavior of stars. This new type should have its own name, so it is distinguished from int. Enter the object-oriented promise of encapsulation.

// example2/Encapsulation.java
// Encapsulation with validation checks
package example2;
import util.Check;
import static util.Show.show;

class Stars {
  private int n;
  static void validate(int s) {
    Check.range(0 < s && s <= 10, s);
  }
  Stars(int n) {
    validate(n);
    this.n = n;
  }
  int n() { return n; }
  Stars f1(Stars stars) {
    n = n % 5 + stars.n * 2;
    validate(n);
    return this;
  }
  Stars f2(Stars stars) {
    n = n % 5 + stars.n + 2;
    validate(n);
    return this;
  }
  static Stars f3(Stars s1, Stars s2) {
    return new Stars(s1.n + s2.n);
  }
  @Override public String toString() {
    return "Stars(" + n + ")";
  }
}

public class Encapsulation {
  public static void main(String[] args) {
    Stars[] s = {
      new Stars(1), new Stars(3),
      new Stars(2), new Stars(6),
      new Stars(11),
    };
    show(Stars.f3(s[1], s[3]));
    var s1 = s[1];
    show(s1);
    show(s1.f1(s[2]));
    show(s1.f2(s[3]));
    show(s1.f2(s1));
    show(s1.f2(s1));
  }
}
/*
Type failure: 11 out of range
Stars(9)
Stars(3)
Stars(7)
Stars(10)
Type failure: 12 out of range
Stars(12)
Type failure: 16 out of range
Stars(16)
*/

Things seem much better. The stars concept is now represented by a Stars class. The number of stars is the private int n, which means only Stars methods can modify and validate n. By encapsulating n, you can protect it from outside meddling. Methods that modify n are restricted to those defined within the class, making it easier to track down functions that change n into an invalid value. Everything about the concept is neatly packaged and controlled within the class.

This approach is a big part of the promise of object-oriented programming, and it genuinely helps. But it did not turn out to be a panacea. Consider the Stars class: Each method must perform validate() checks to ensure n is within its proper range of values. Although the idea of “stars have a value from 1 to 10” is restricted to this class, the idea is still spread throughout the class. And if you are concerned about performance, running those validate() checks everywhere seems inefficient.

There’s a bigger problem, too: Because objects can be modified (mutated), it can be difficult to understand the behavior of an object—as you can see from the output.

What about immutability?


Consider the functional-programming practice of making everything immutable. What if the value of n is initialized correctly and then never changes? In that case, a method that accepts a Stars argument no longer needs to validate it; the method knows the Stars value can’t be created incorrectly, and it cannot be mutated into an invalid state. This is the concept of making illegal values unrepresentable: If it’s impossible to create an illegal value of that type, you can safely ignore the issue of whether an object of that type is correct.

Java lets you create such types using the record keyword, introduced in Java 16. Using record instead of class produces numerous benefits, but here I am focusing on immutability.

// example3/RecordValidation.java
// Creating a type using JDK 16 records
package example3;
import java.util.HashMap;
import util.Check;
import static util.Show.show;

record Stars(int n) {
  Stars {
    Check.range(0 < n && n <= 10, n);
  }
}

public class RecordValidation {
  static Stars f1(Stars stars) {
    return new Stars(stars.n() * 2);
  }
  static Stars f2(Stars stars) {
    return new Stars(stars.n() + 4);
  }
  static Stars f3(Stars s1, Stars s2) {
    return new Stars(s1.n() + s2.n());
  }
  public static void main(String[] args) {
    Stars[] s = {
      new Stars(1), new Stars(3),
      new Stars(4), new Stars(6),
      new Stars(11),
    };
    show(s[1]);
    show(f1(s[1]));
    show(f1(s[3]));
    show(f2(s[2]));
    show(f2(s[3]));
    show(f3(s[1], s[3]));
    show(f3(s[3], s[3]));

    // Records can be keys in hashed data
    // structures because they define
    // equals() and hashCode():
    var m = new HashMap<Stars, String>();
    m.put(s[1], "one");
    m.put(s[2], "two");
    show(m);
  }
}
/*
Type failure: 11 out of range
Stars[n=3]
Stars[n=6]
Type failure: 12 out of range
Stars[n=12]
Stars[n=8]
Stars[n=10]
Stars[n=9]
Type failure: 12 out of range
Stars[n=12]
{Stars[n=3]=one, Stars[n=4]=two}
 */

The record Stars statement automatically creates an object field for its argument, int n. That’s not all: The record also creates a constructor that takes int n and assigns it to the object field. The compact constructor seen here allows you to validate the constructor arguments before they are assigned to the object fields.

The fields generated from the record arguments are automatically final—that’s how record works—so they cannot be modified. Thus, when a record object is created correctly, it cannot later be mutated into an incorrect state.

The benefit is that f1(), f2(), and f3() no longer need to perform validity checks, because they know that a Stars object is always created correctly and it always stays that way. Notice that the meaning of a Stars object as being in the range of 1 to 10 is now defined in only one spot: within the compact constructor.
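Because Check.range() only prints a message, the listing above still produces objects such as Stars[n=12]. The guarantee becomes real once the compact constructor throws. The following is a sketch under that assumption; the file name StarsStrict and the inline IllegalArgumentException are illustrative choices, not the article's code.

```java
// StarsStrict.java (hypothetical name)
record Stars(int n) {
  Stars {
    // Reject illegal values outright, so no Stars
    // object outside 1..10 can ever exist.
    if (n < 1 || n > 10)
      throw new IllegalArgumentException(
        n + " out of range");
  }
}

public class StarsStrict {
  // No validation of the argument is needed here:
  // a Stars object is already known to be in range.
  static Stars f2(Stars stars) {
    return new Stars(stars.n() + 4);
  }
  public static void main(String[] args) {
    System.out.println(f2(new Stars(4)));
    try {
      f2(new Stars(8)); // 12 is not a legal value
    } catch (IllegalArgumentException e) {
      System.out.println(
        "Rejected: " + e.getMessage());
    }
  }
}
/*
Stars[n=8]
Rejected: 12 out of range
*/
```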

As an aside, the immutability of a record also means that, together with the automatic definition of equals() and hashCode() in that record, the record can be used as a key in a data structure such as HashMap.
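To see why this matters for lookups, here is a small sketch (the class name RecordKeys is made up for this illustration). A second object with the same component value is equal to the first and hashes the same way, so it finds the same map entry.

```java
// RecordKeys.java (hypothetical name)
import java.util.HashMap;
import java.util.Map;

record Stars(int n) {}

public class RecordKeys {
  public static void main(String[] args) {
    Map<Stars, String> m = new HashMap<>();
    m.put(new Stars(3), "three");
    // A different object with the same component
    // value is equal and has the same hash code:
    Stars other = new Stars(3);
    System.out.println(other.equals(new Stars(3)));
    System.out.println(
      other.hashCode() == new Stars(3).hashCode());
    System.out.println(m.get(other));
  }
}
/*
true
true
three
*/
```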

Automatic argument assignment


It’s important to be aware that the automatic assignment of arguments to corresponding fields of the same name does not happen until after the compact constructor is finished.

// watchout/RecordConstructor.java
package watchout;
import static util.Show.show;

record Stars(int n) {
  Stars {
    show("In compact constructor:");
    show("n: " + n + ", n(): " + n());
    // show("this.n: " + this.n);
    // Variable 'this.n' might not have
    // been initialized
    x();
  }
  void x() {
    show("n: " + n + ", n(): " + n());
    show("this.n: " + this.n);
  }
}

public class RecordConstructor {
  public static void main(String[] args) {
    var s = new Stars(10);
    show("After construction:");
    s.x();
  }
}
/*
In compact constructor:
n: 10, n(): 0
n: 0, n(): 0
this.n: 0
After construction:
n: 10, n(): 10
this.n: 10
*/

Within the compact constructor, n refers not to the field but to the argument, while n() accesses the field (this.n). The Java compiler is smart enough to emit an error message if you access this.n within the constructor, but it doesn’t produce the same error message for n().

Inside method x(), n may only refer to this.n. The output shows that the field n has not been initialized (beyond the automatic default behavior of zeroing memory) when you are inside the compact constructor.

What if you assign to this.n within the compact constructor?

package watchout2;

record Stars(int n) {
  Stars {
    // this.n = 42;
    // Cannot assign a value to final variable 'n'
  }
}

In an ordinary class, you can assign to a final within the constructor, but a compact constructor prevents that—this makes sense because that’s an essential part of what a record does. But it’s important to note that field initialization doesn’t happen until after the compact constructor is called, so object validation within a compact constructor is limited to checking the constructor arguments, not the initialized fields.

According to Brian Goetz, the reason for this design decision was to allow validation or normalization on proposed component values. For example, normalization can include clamping (if the value is greater than x, clamp to x), rounding, and related transforms. That’s why the values are written after you’ve had a chance to normalize the parameters.
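Clamping can be sketched as follows. Inside a compact constructor the parameter n can be reassigned, and the reassigned value is what ends up in the field. The record name ClampedStars and the decision to clamp rather than reject are assumptions made for this illustration.

```java
// Clamping.java (hypothetical name)
record ClampedStars(int n) {
  ClampedStars {
    // Normalize the parameter; the field is assigned
    // from it after the compact constructor finishes.
    n = Math.max(1, Math.min(10, n));
  }
}

public class Clamping {
  public static void main(String[] args) {
    System.out.println(new ClampedStars(7));
    System.out.println(new ClampedStars(99));
    System.out.println(new ClampedStars(-3));
  }
}
/*
ClampedStars[n=7]
ClampedStars[n=10]
ClampedStars[n=1]
*/
```

Whether to clamp or to reject is a design decision: clamping silently changes the caller's value, so it fits cases where any in-range result is acceptable.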

There’s a way to guarantee that the type is correct without testing it everywhere. An immutable record with a validation constructor has suddenly made the entire correctness issue vanish; anywhere this new type is used, you know it is correct.

Automatically applying correctness


It gets better: When you compose a record from another record, this guarantee is automatically applied. For example, you can create a record called Person using more than one record as component parts, as follows:

// example4/People.java
// Composing records using records
package example4;
import util.Check;
import static util.Show.show;

record FullName(String name) {
  FullName {
    show("Checking FullName " + name);
    Check.valid(name.split(" ").length > 1,
      name + " needs first and last names");
  }
}

record BirthDate(String dob) {
  BirthDate {
    show("TODO: Check BirthDate " + dob);
  }
}

record EmailAddress(String address) {
  EmailAddress {
    show("TODO: Check EmailAddress " + address);
  }
}

record Person(FullName name,
              BirthDate dateOfBirth,
              EmailAddress email) {
  Person {
    show("TODO: Check Person");
  }
}

public class People {
  public static void main(String[] args) {
    var person = new Person(
      new FullName("Bruce Eckel"),
      new BirthDate("7/8/1957"),
      new EmailAddress("mindviewinc@gmail.com")
    );
    show(person);
  }
}
/*
Checking FullName Bruce Eckel
TODO: Check BirthDate 7/8/1957
TODO: Check EmailAddress mindviewinc@gmail.com
TODO: Check Person
Person[
  name=FullName[name=Bruce Eckel],
  dateOfBirth=BirthDate[dob=7/8/1957],
  email=EmailAddress[address=mindviewinc@gmail.com]
]
 */

Each record component knows how to validate itself. When you combine FullName, BirthDate, and EmailAddress into the record Person, the creation of each argument performs its validity check, so you create a Person from three other validated objects. Afterwards, Person performs its own validation test within its compact constructor.
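With throwing validation, a Person can never hold an invalid part, because the failing component stops construction before the Person constructor is reached. The sketch below assumes that behavior; the class name StrictPeople, the trimmed-down Person, and the deliberately simplistic "@" check are illustrative only.

```java
// StrictPeople.java (hypothetical name)
record FullName(String name) {
  FullName {
    if (name.trim().split("\\s+").length < 2)
      throw new IllegalArgumentException(
        name + " needs first and last names");
  }
}

record EmailAddress(String address) {
  EmailAddress {
    // Placeholder check for illustration only
    if (!address.contains("@"))
      throw new IllegalArgumentException(
        address + " is not an email address");
  }
}

// Person needs no checks of its own for the parts:
// each component type has already validated itself.
record Person(FullName name, EmailAddress email) {}

public class StrictPeople {
  public static void main(String[] args) {
    System.out.println(new Person(
      new FullName("Bruce Eckel"),
      new EmailAddress("mindviewinc@gmail.com")));
    try {
      new Person(
        new FullName("Bruce"),
        new EmailAddress("mindviewinc@gmail.com"));
    } catch (IllegalArgumentException e) {
      System.out.println(
        "Rejected: " + e.getMessage());
    }
  }
}
/*
Person[name=FullName[name=Bruce Eckel], email=EmailAddress[address=mindviewinc@gmail.com]]
Rejected: Bruce needs first and last names
*/
```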

By the way, a record can be used only for composition; you cannot inherit from a record, so the complexities of inheritance are fortunately not an issue.

Another approach: Enums


I’ve been using a record with constructor validation to define a set of legitimate objects and calling this set of values a type. If a type comprises a small number of values that can be predefined, there’s a second way to define a type: an enum. The code below expands upon the BirthDate from the previous example by creating types for Day, Month, and Year, where Month is defined as an enum.

// example5/DateOfBirth.java
// An enum is also a type, and is preferable
// when you have a smaller set of values.
// "Leap years are left as an exercise."
package example5;
import util.Check;
import static util.Show.show;

record Day(int n) {
  Day {
    Check.range(0 < n && n <= 31, this);
  }
}

enum Month {
  JANUARY(31),
  FEBRUARY(28),
  MARCH(31),
  APRIL(30),
  MAY(31),
  JUNE(30),
  JULY(31),
  AUGUST(31),
  SEPTEMBER(30),
  OCTOBER(31),
  NOVEMBER(30),
  DECEMBER(31),
  // Only needed for this example:
  NONE(0);
  final int maxDays;
  Month(int maxDays) {
    this.maxDays = maxDays;
  }
  public static Month number(int n) {
    if (Check.range(1 <= n && n <= 12,
        "Month.number(" + n + ")"))
      return values()[n - 1];
    return NONE;
  }
  void checkDay(Day day) {
    Check.range(day.n() <= maxDays,
      this + ": " + day);
  }
}

record Year(int n) {
  Year {
    Check.range(1900 < n && n <= 2022, this);
  }
}

record BirthDate(Month m, Day d, Year y) {
  BirthDate {
    m.checkDay(d);
  }
}

public class DateOfBirth {
  static void test(int m, int d, int y) {
    show(m + "/" + d + "/" + y);
    show(new BirthDate(
      Month.number(m), new Day(d), new Year(y)
    ));
  }
  public static void main(String[] args) {
    test(7, 8, 1957);
    test(0, 32, 1857);
    test(2, 31, 2022);
    test(9, 31, 2022);
    test(4, 31, 2022);
    test(6, 31, 2022);
    test(11, 31, 2022);
    test(12, 31, 2022);
    test(13, 31, 2022);
  }
}
/*
7/8/1957
BirthDate[m=JULY, d=Day[n=8], y=Year[n=1957]]
0/32/1857
Type failure: Month.number(0) out of range
Type failure: Day[n=0] out of range
Type failure: Year[n=0] out of range
Type failure: NONE: Day[n=32] out of range
BirthDate[m=NONE, d=Day[n=32], y=Year[n=1857]]
2/31/2022
Type failure: FEBRUARY: Day[n=31] out of range
BirthDate[m=FEBRUARY, d=Day[n=31], y=Year[n=2022]]
9/31/2022
Type failure: SEPTEMBER: Day[n=31] out of range
BirthDate[m=SEPTEMBER, d=Day[n=31], y=Year[n=2022]]
4/31/2022
Type failure: APRIL: Day[n=31] out of range
BirthDate[m=APRIL, d=Day[n=31], y=Year[n=2022]]
6/31/2022
Type failure: JUNE: Day[n=31] out of range
BirthDate[m=JUNE, d=Day[n=31], y=Year[n=2022]]
11/31/2022
Type failure: NOVEMBER: Day[n=31] out of range
BirthDate[m=NOVEMBER, d=Day[n=31], y=Year[n=2022]]
12/31/2022
BirthDate[m=DECEMBER, d=Day[n=31], y=Year[n=2022]]
13/31/2022
Type failure: Month.number(13) out of range
Type failure: NONE: Day[n=31] out of range
BirthDate[m=NONE, d=Day[n=31], y=Year[n=2022]]
*/

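The Day represents a day of the month, and the most general thing you can know about it is that it must be greater than zero and less than or equal to 31. Of course, that’s not enough of a constraint to ensure correctness for any specific month (such as June), but it’s a good starting point. Notice how effortless it is to define Day as a record with a constructor test, because you know that this rule will be followed throughout your code without having to explain or enforce it.

In the output, failed Day and Year checks print Day[n=0] and Year[n=0] rather than the rejected value. This is the pitfall described earlier: the compact constructor passes this to Check.range() before the field has been assigned, so toString() sees the default value of zero. Passing n instead of this would report the actual argument.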

Because there are only 12 months, it makes sense to predefine each of them as an enum. The constructor stores the maximum number of days for a month inside that Month’s element. The Month elements are immutable, so you need to make only one of each.

If you use the static method number() to look up a Month that is out of range, number() returns NONE rather than throwing an exception. This way, you can see all the messages instead of halting the program with an exception. (If you throw exceptions, you don’t need NONE.)

The checkDay() method verifies that a particular Day is within range for this Month. It is used in BirthDate once you have both a Month and a Day.

The enum values are created and checked before they can be used, so you’ll encounter fewer surprises when you use this approach; you won’t get an exception at some later time the way you can when someone tries to create an invalid record. Also, your IDE can autocomplete the values of an enum.

As an exercise, try representing Month using a record instead of an enum. Also, you may have noticed that leap years are not accounted for, which means there is a problem with February. That is also left as an exercise for you.
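One possible starting point for the leap-year exercise is sketched below. It is not the article's solution: the method names daysIn() and isLeap() and the class name LeapYears are hypothetical, and the year is passed as a plain int to keep the sketch independent of the Year record above. The leap rule is the standard Gregorian one.

```java
// LeapYears.java (hypothetical name)
enum Month {
  JANUARY(31), FEBRUARY(28), MARCH(31),
  APRIL(30), MAY(31), JUNE(30),
  JULY(31), AUGUST(31), SEPTEMBER(30),
  OCTOBER(31), NOVEMBER(30), DECEMBER(31);
  private final int maxDays;
  Month(int maxDays) {
    this.maxDays = maxDays;
  }
  // Gregorian rule: divisible by 4, except century
  // years that are not divisible by 400.
  static boolean isLeap(int year) {
    return (year % 4 == 0 && year % 100 != 0)
        || year % 400 == 0;
  }
  // Number of days in this month for a given year
  int daysIn(int year) {
    if (this == FEBRUARY && isLeap(year))
      return 29;
    return maxDays;
  }
}

public class LeapYears {
  public static void main(String[] args) {
    System.out.println(Month.FEBRUARY.daysIn(2020));
    System.out.println(Month.FEBRUARY.daysIn(2022));
    System.out.println(Month.FEBRUARY.daysIn(1900));
    System.out.println(Month.JUNE.daysIn(2020));
  }
}
/*
29
28
28
30
*/
```

A checkDay() that accounts for leap years would then need the year as well as the day, which suggests moving that check into BirthDate, where all three values are available.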

Source: oracle.com
