Friday, September 8, 2023

Quiz yourself: Collectors, comparators, and type inferencing in Java



Your colleague is working on an application that must find a most-frequently used word in a Stream<String>, and each element of that stream is a single word. The stream is provided by the reference strm.

Which of these pipeline expressions can perform this task? Choose two.

A.

strm.collect(Collectors.groupingBy(Function.identity(),
Collectors.counting())).entrySet().stream().sorted((e1,
e2)-> e2.getValue().compareTo(e1.getValue())).findFirst()

B.

strm.collect(Collectors.groupingBy((String s) -> s, 
Collectors.counting())).entrySet().stream().sorted(
Map.Entry.comparingByValue()).findFirst()

C.

strm.collect(Collectors.groupingBy(a -> a,
Collectors.counting())).entrySet().stream().sorted(
Map.Entry::comparingByValue).findFirst()

D. 

strm.collect(Collectors.groupingBy(Function.identity(), 
Collectors.counting())).entrySet().stream().sorted((e1, 
e2)-> e2.getValue() - e1.getValue()).findFirst()

E.

strm.collect(Collectors.groupingBy(___ -> ___, 
Collectors.counting())).entrySet().stream().max(Comparator.
comparing(e -> e.getValue()))

Answer. This question investigates aspects of the Collectors utilities, ordering using a Comparator, type conversion rules, and type inferencing in Java’s generics system.

All but one of the question’s code fragments take the following four-step approach to the problem, with variations in the implementation details:

Step 1. Take each word and use the groupingBy collector to build a map in which each key is a word from the stream, and the value accumulates the number of occurrences of the word in the stream. The groupingBy method is a factory that builds a collector, and in the form used here it takes two arguments. The first derives a key for the resulting map from the object in the stream. In this case, the object in the stream is the word you want to use as the key, so no change is necessary. The second argument is Collectors.counting(), which is used as a downstream collector that modifies the value stored against the key in the map to be a count of the number of times the key has been seen, instead of being a list of the objects that produced that key.
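To make the role of the downstream collector concrete, here is a minimal sketch that contrasts the one-argument and two-argument forms of groupingBy. The class name GroupingDemo, its method names, and the sample words are made up for illustration and are not part of the quiz.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupingDemo {
    // One-argument form: each key maps to the list of elements that produced it.
    static Map<String, List<String>> groupOnly(Stream<String> strm) {
        return strm.collect(Collectors.groupingBy(Function.identity()));
    }

    // Two-argument form with counting(): each key maps to a Long count instead.
    static Map<String, Long> groupAndCount(Stream<String> strm) {
        return strm.collect(
            Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Iteration order of the resulting maps is not guaranteed.
        System.out.println(groupOnly(Stream.of("to", "be", "to")));
        System.out.println(groupAndCount(Stream.of("to", "be", "to")));
    }
}
```

With the sample input, the first map associates "to" with a two-element list, while the second associates it with the count 2.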

Step 2. The code then extracts a stream from that map. The Map interface cannot directly provide a Stream, but its entrySet method provides access to a Set<Map.Entry<K, V>>. A Map.Entry<K, V> is a single key-value pair (essentially a tuple of K and V, although Java does not provide tuples at the language syntax level). A Set can produce a Stream directly, which is the next step.

Step 3. Sort the stream of entries. This must be done in descending order of the value part of the entry since that’s the count of occurrences of the word represented in the key.

Step 4. Descending order is essential because the final step pulls the first entry from the resulting stream using the findFirst method. If the stream were sorted in ascending order, you’d get the least-used word rather than the most-used one. If two or more words tie for top place in usage, you’ll get one of them, with no control over which one. You can ignore that possibility in this question: The specification says “a most-frequently used word” rather than “the most-frequently used word.”
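The four steps can be sketched with one intermediate variable per step, which makes the types at each stage visible. This is only an illustrative breakdown; the class name FourSteps, the method mostFrequent, and the sample words are assumptions, not part of the quiz.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FourSteps {
    static Optional<Map.Entry<String, Long>> mostFrequent(Stream<String> strm) {
        // Step 1: count the occurrences of each word.
        Map<String, Long> counts = strm.collect(
            Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Step 2: a Map has no stream() method, so go through its entry set.
        Set<Map.Entry<String, Long>> entries = counts.entrySet();

        // Step 3: order the entries by count, largest first.
        Comparator<Map.Entry<String, Long>> byCountDescending =
            (e1, e2) -> e2.getValue().compareTo(e1.getValue());

        // Step 4: take the first entry; the Optional is empty for an empty input.
        return entries.stream().sorted(byCountDescending).findFirst();
    }

    public static void main(String[] args) {
        mostFrequent(Stream.of("a", "b", "a", "c", "a", "b"))
            .ifPresent(e -> System.out.println(e.getKey() + " appears " + e.getValue() + " times"));
        // prints "a appears 3 times"
    }
}
```

Note that the result is an Optional, because the input stream might contain no words at all.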

Option A is a correct implementation of this approach. The first argument to the groupingBy factory method should be a function that takes the stream object and returns the key. As noted, in this situation the key is the same as the stream object, and Function.identity() is a factory for a Function object that returns its own argument unchanged. The second argument is the counting factory, and that’s exactly what was described above. Next, the sorting is performed using a Comparator explicitly coded as a lambda expression. The effect is to compare the values of the two entry objects, but because the code is written as e2 … compareTo … e1, the ordering will be reversed and, therefore, descending.

Option B is incorrect. Although it is syntactically valid, the stream is sorted in ascending order; therefore, it will return an entry with a least-frequently used word. You could correct this code by reversing the ordering of the comparator, as follows:

strm.collect(Collectors.groupingBy((String s) -> s, 
Collectors.counting())).entrySet().stream().sorted(Map.Entry.<String, 
Long>comparingByValue().reversed()).findFirst();

Notice the use of <String, Long> before comparingByValue().reversed(). The code fails to compile without this because the Java compiler cannot infer the generic type parameters in this situation. Therefore, this syntax is used to specify the types explicitly.
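The explicit type arguments are one way to get a descending comparator. As a hedged sketch, the following compares that form with two alternatives that avoid chaining .reversed() onto a call that has no target type to infer from. The class ReversedByValue, its method names, and the sample entries are hypothetical, and Map.entry assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class ReversedByValue {
    // Explicit type arguments, so reversed() is called on a fully typed Comparator.
    static Comparator<Map.Entry<String, Long>> withTypeWitness() {
        return Map.Entry.<String, Long>comparingByValue().reversed();
    }

    // Alternative: hand a reversed value comparator to the one-argument overload.
    static Comparator<Map.Entry<String, Long>> withReversedValues() {
        return Map.Entry.comparingByValue(Comparator.reverseOrder());
    }

    // Alternative: wrap the ascending comparator; the return type drives inference.
    static Comparator<Map.Entry<String, Long>> withReverseOrderWrapper() {
        return Collections.reverseOrder(Map.Entry.comparingByValue());
    }

    // Hypothetical helper: sorts three sample entries and returns the first key.
    static String firstKey(Comparator<Map.Entry<String, Long>> order) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>();
        entries.add(Map.entry("x", 1L));
        entries.add(Map.entry("y", 3L));
        entries.add(Map.entry("z", 2L));
        entries.sort(order);
        return entries.get(0).getKey();
    }

    public static void main(String[] args) {
        // Each comparator puts the entry with the largest value first.
        System.out.println(firstKey(withTypeWitness()));         // y
        System.out.println(firstKey(withReversedValues()));      // y
        System.out.println(firstKey(withReverseOrderWrapper())); // y
    }
}
```

In the two alternatives, the comparingByValue call sits in a return or argument position, so the compiler has a target type to work from.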

Another difference in the approach taken by option B, which is valid in isolation, is the replacement of Function.identity() with an explicit lambda: (String s) -> s. This is valid, and in this code has the same effect of returning its argument unchanged. The argument type is redundant, but it is valid because the objects in the stream are of String type.

Option C also changes Function.identity() to an explicit lambda: a -> a. As with option B, this has the same effect and is not a problem.

Looking at the argument to the sorted method, you know this must be an object of Comparator type. However, Map.Entry::comparingByValue is a method reference; if the code were compilable in this location, it would create an object equivalent to a lambda expression such as () -> Map.Entry.comparingByValue(). That would be a Supplier that returns a Comparator. However, a Supplier is not valid at this point, because the code needs an actual Comparator object. From this you can see that option C is incorrect.
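To illustrate the gap between referring to a factory method and invoking it, here is a small sketch in which the same method reference is accepted where a zero-argument factory is expected. The class MethodRefVsCall, the method leastFrequentKey, and the sample entries are invented for this example, and List.of and Map.entry assume Java 9 or later.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class MethodRefVsCall {
    static String leastFrequentKey(List<Map.Entry<String, Long>> entries) {
        // As a Supplier, the method reference compiles:
        // get() takes no arguments and returns a Comparator.
        Supplier<Comparator<Map.Entry<String, Long>>> factory = Map.Entry::comparingByValue;

        // sorted(...) needs the Comparator itself, so the factory must be invoked.
        return entries.stream()
            .sorted(factory.get())
            .findFirst()
            .map(Map.Entry::getKey)
            .orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(leastFrequentKey(
            List.of(Map.entry("the", 3L), Map.entry("quiz", 1L)))); // quiz
    }
}
```

The ordering here is ascending, as in option B, so the sketch returns the least-frequent key; it is meant only to show which types line up.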

Option D is also incorrect. As already mentioned, Collectors.counting() returns a collector that counts the number of occurrences of elements. The total is accumulated as a Long. Later in the hand-coded comparator, Long is subtracted from Long, and this gives a long result after both operands are autounboxed. However, the return type of the comparator’s compare method must be int, and it’s not legal to convert a Long (or even a long) to an int without an explicit cast. The following code includes the necessary cast and would work correctly, provided the difference between the two counts fits in an int:

(e1, e2)-> (int)(e2.getValue() - e1.getValue())
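As an aside, a comparator can avoid both the subtraction and the narrowing cast by delegating to Long.compare. The sketch below puts the two forms side by side; the class DescendingCount, its field names, and the sample entries are hypothetical, and the very large count is contrived purely to show where the cast-based form stops being reliable.

```java
import java.util.Comparator;
import java.util.Map;

class DescendingCount {
    // Subtract-and-cast form: correct while the difference fits in an int.
    static final Comparator<Map.Entry<String, Long>> BY_CAST =
        (e1, e2) -> (int) (e2.getValue() - e1.getValue());

    // Long.compare form: no subtraction, so no narrowing is involved.
    static final Comparator<Map.Entry<String, Long>> BY_LONG_COMPARE =
        (e1, e2) -> Long.compare(e2.getValue(), e1.getValue());

    public static void main(String[] args) {
        Map.Entry<String, Long> rare = Map.entry("rare", 1L);
        Map.Entry<String, Long> common = Map.entry("common", 5L);
        // Both results are positive: "rare" sorts after "common".
        System.out.println(BY_CAST.compare(rare, common) > 0);         // true
        System.out.println(BY_LONG_COMPARE.compare(rare, common) > 0); // true

        Map.Entry<String, Long> none = Map.entry("none", 0L);
        Map.Entry<String, Long> huge = Map.entry("huge", 3_000_000_000L);
        // The difference no longer fits in an int, so the cast flips the sign.
        System.out.println(BY_CAST.compare(none, huge) > 0);         // false
        System.out.println(BY_LONG_COMPARE.compare(none, huge) > 0); // true
    }
}
```

For realistic word counts the two comparators agree, so this is a robustness consideration rather than a correction to the quiz answer.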

Option E changes two things. First, it uses an explicit lambda instead of Function.identity(). You saw this in two previous examples, but here the variable name is odd. It’s ___ (which is a triple underscore) rather than something more normal such as s or a. It turns out that this is actually a valid identifier in Java. A single underscore is a reserved keyword (since Java 9), but a leading underscore followed by more characters (even if they’re just more underscores) or an underscore embedded in the middle of a name is legal. It’s worth noting that using a triple underscore as a variable name would likely get some raised eyebrows in a code review, so don’t tell anyone we said it’s a good idea. We definitely are not suggesting that it’s anything other than a curiosity!
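For the curious, this short sketch shows such identifiers compiling. The class UnderscoreNames and its members are invented for illustration only.

```java
import java.util.function.Function;

class UnderscoreNames {
    // Three underscores form an ordinary identifier for a lambda parameter.
    static final Function<String, String> IDENTITY = ___ -> ___;

    // A leading underscore and an embedded underscore are both legal.
    static int sum(int _leading, int embedded_underscore) {
        return _leading + embedded_underscore;
    }

    public static void main(String[] args) {
        System.out.println(IDENTITY.apply("word")); // word
        System.out.println(sum(2, 3));              // 5
    }
}
```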

Another difference in this example is the use of the max terminal operation rather than ordering followed by findFirst. This works correctly and might be more efficient because, unlike sorted, max does not need to build a structure containing references to all the elements of the second stream; it only needs to keep a reference to the largest element seen so far. In any case, option E is correct.
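A max-based version can be sketched as follows. It uses Map.Entry.comparingByValue() in place of the Comparator.comparing lambda from option E, which is a substitution made for this example; the class MaxByCount, the method mostFrequent, and the sample words are likewise assumptions.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MaxByCount {
    static Optional<Map.Entry<String, Long>> mostFrequent(Stream<String> strm) {
        return strm
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
            .entrySet().stream()
            // max wants the natural (ascending) ordering; no reversal is needed.
            .max(Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        mostFrequent(Stream.of("x", "y", "x"))
            .ifPresent(e -> System.out.println(e.getKey() + "=" + e.getValue())); // x=2
    }
}
```

Because max is given the comparator as a method argument, the compiler can infer the entry's type parameters without explicit type arguments.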

Here are three interesting side notes.

First: You can read about the difference between Function.identity() and a -> a in our earlier quiz “Quiz yourself: Mixing and matching Java primitives with generics in a stream.”

Second: In the case of English, the most-frequently used word is generally “the,” but in any given body of text, that might not be the case. Of course, this quiz’s code could be used to analyze text in any language.

Third: The stem of the question mentions that the items listed are expressions. As such they’re not complete Java code but must be used in some larger context, perhaps to provide a value to be assigned to a variable or as an argument to a method invocation. However, a general guide for the exam is that if code is valid as shown but incomplete, you should assume there’s enough supporting code around it to allow it to work if it can do so; you’re being asked solely about the validity of what’s shown, not about what you can’t see. The Java SE 17 Developer exam (1Z0-829) has a section called “Assume the following” under the expandable tab “Review exam topics.” Among other things, it specifies the following:

If sample code does not include package or import statements, and the question does not explicitly refer to these missing statements, then assume that all sample code is in the same package, or import statements exist to support them.

Conclusion. The correct answers are options A and E.

Source: oracle.com
