Monday, November 15, 2021

Primitive data types in Java are a matter of precision


Why you can assign a char to an int, but you can’t assign a double to a float



In Java, you can assign a char to an int, but you can’t assign a double to a float. And Java doesn’t stop you from performing a mathematical assignment or other operation that might lose precision.

This article explores those topics and explains the difference between data type storage and effective storage. It also explains why float and double variables store values in two parts: a significand and an exponent.

Primitive type assignments


Generally, Java permits assignments of numeric primitive data types based on whether the values will work reliably—that is, if the values will fit into their destination.

◉ If the entire range of values that can be represented by the type of a given expression can be represented by the type of a given destination type, the assignment from the expression to the destination type is permitted.
◉ If an expression might represent values that are outside the range of the type of a destination, such an assignment is rejected.

In this way, the compiler rejects an assignment where there could be a loss of gross value, that is, where the new value might not fit.

Figure 1 shows the primitive data types, their effective storage, and their ranges of possible values.

Figure 1. The ranges of Java’s primitive types

The following things should be clear:

◉ A boolean cannot be assigned to or from a numeric expression at all.
◉ For the main integer numeric types, assignment is possible from smaller to larger, that is, from byte to short to int to long.
◉ For floating-point types, an assignment from float to double is possible.
◉ Assignment from any of the integer types to either of the floating-point types is fine also.
◉ You can see that float and double do some magic to be able to store a vastly bigger range in the same amount of storage.
◉ Perhaps a little surprising is the fact that assignment between short and char cannot be performed in either direction. This is because a char can represent values greater than the maximum value of a short (65,535 > 32,767), while a short can represent negative values, which are not representable by a char.

You can look up the formal specification of these rules (and more) in Java Language Specification section 5.1, “Kinds of conversion,” specifically subsections 5.1.2 and 5.1.3.
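
As a rough illustration of the rules listed above, the following sketch shows a widening chain that compiles without any cast, plus the explicit casts that short and char need in both directions. The class name WideningDemo and its method names are made up for this example and do not come from the original article.

```java
// Illustrative sketch; WideningDemo and its method names are hypothetical.
final class WideningDemo {

    // Widening chain: each assignment compiles without a cast.
    static long widenByte(byte b) {
        short s = b;
        int i = s;
        long l = i;
        return l;
    }

    // short -> char needs a cast; negative values wrap into the char range.
    static char shortToChar(short s) {
        return (char) s;
    }

    // char -> short needs a cast; values above 32,767 become negative.
    static short charToShort(char c) {
        return (short) c;
    }

    public static void main(String[] args) {
        System.out.println(widenByte((byte) -5));           // -5
        System.out.println((int) shortToChar((short) -1));  // 65535
        System.out.println(charToShort('\uffff'));          // -1
    }
}
```

Removing either cast in shortToChar or charToShort should make the compiler reject the code, which matches the last bullet in the list.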

Let’s see how these rules work in practice.

You can’t assign a double value to a float. From Figure 1, it’s clear that a double can represent a much greater range of values than a float, so the following assignment is not permitted, and compilation will fail:

double d = 5.0D;
float f = 4.0F;
f = d;

Of course, you might happen to know that in this case, the actual value stored in the double is small enough to be represented properly by the float. If you are 100% confident that the value about to be assigned will never overflow the capacity of the destination, you can use a cast to persuade the compiler to let you perform the assignment. In this case, the resulting code would look like the following:

f = (float)d;
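
To see what the cast does when the value fits and when it does not, here is a minimal sketch. The class name NarrowingCastDemo and the helper toFloat are hypothetical names chosen for this illustration.

```java
// Illustrative sketch; NarrowingCastDemo and toFloat are hypothetical names.
final class NarrowingCastDemo {

    // The cast tells the compiler to accept the narrowing conversion.
    static float toFloat(double d) {
        return (float) d;
    }

    public static void main(String[] args) {
        // 5.0 fits in a float exactly.
        System.out.println(toFloat(5.0));         // 5.0
        // 1e40 is beyond the float range, so the result is infinity.
        System.out.println(toFloat(1e40));        // Infinity
        // 0.1 fits in range, but the float is a coarser approximation.
        System.out.println(toFloat(0.1) == 0.1);  // false
    }
}
```

The second line is the kind of completely wrong value that the compiler is guarding against when it rejects the assignment without a cast.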

You can assign a char value to an int. Every value that can be represented by a char can be represented perfectly by an int. Therefore, the compiler permits the following assignment, the result is reliably accurate, and everyone is happy.

char c = '1';
int i = 2;
i = c;
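
One thing worth noticing is which int value you actually get. The sketch below (CharToIntDemo and codeOf are hypothetical names) suggests that the assignment copies the character's UTF-16 code unit, not the digit it displays.

```java
// Illustrative sketch; CharToIntDemo and codeOf are hypothetical names.
final class CharToIntDemo {

    // Implicit widening from char to int; no cast is needed.
    static int codeOf(char c) {
        int i = c;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('1'));                     // 49
        System.out.println(Character.getNumericValue('1'));  // 1
        System.out.println(codeOf(Character.MAX_VALUE));     // 65535
    }
}
```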

Can you assign a long value to a float? The range of a float is much greater than the range of a long and, therefore, the compiler allows the following assignment:

long l = 3L;
float f = 4.0F;
f = l;

But wait a moment: A long has 8 bytes of effective storage, while a float has only 4 bytes. How can you assign an 8-byte value to a 4-byte space? Shouldn’t that operation fail?

This assignment succeeds because there is no loss of gross value. There is only a loss of precision. The float will contain the entire value from the long, although it might not store that value completely accurately. In fact, the float may be only an approximation of the long’s value. The Java compiler doesn’t care about that.
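
A small round trip makes the approximation visible. In this sketch, the class name LongToFloatDemo and the helper roundTrip are assumptions made for the example; the sample values are chosen only because they are easy to check.

```java
// Illustrative sketch; LongToFloatDemo and roundTrip are hypothetical names.
final class LongToFloatDemo {

    // Assign a long to a float (allowed without a cast), then cast back.
    static long roundTrip(long value) {
        float f = value;   // implicit widening; may lose precision
        return (long) f;   // explicit narrowing back to long
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(3L));           // 3
        System.out.println(roundTrip(16_777_217L));  // 16777216
        long big = 123_456_789_012_345_678L;
        System.out.println(roundTrip(big) == big);   // false
    }
}
```

Small values survive unchanged, while larger ones come back as a nearby value rather than the original.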

At this point, you’ve seen that assignments that might result in a completely wrong value—based on the ranges representable by the types—will be rejected by the compiler, yet you can force the compiler’s hand by using a cast. A cast operation is safe if you are sure the value will fit.

You’ve also seen that an assignment that risks a loss of precision, but not a loss of gross value, is permitted. Let’s dig into that further and consider how a 4-byte float value can store a larger range than an 8-byte long value, but perhaps only as an approximation.

Significands, exponents, and loss of precision


It’s easy to understand losing the fractional part when you assign (through a cast) a float to an int, but what does it mean to lose precision when you assign, for example, a long to a float? The answer lies in how the float manages to have a wider range than the long, despite using only half the storage.

Floating-point numbers in Java (in fact, in most computer languages) don’t count in units of 1 the way that integer values do. Instead, floating-point numbers count in what might be called a variable chunk size. (Chunk is a term used in this article, but it’s not the correct technical term.)

Although it’s possible that a floating-point value is counted in chunks of 1, it might be counted with chunk sizes of 1/16 or of 1/512. This behavior is how floating-point values handle fractions.

But similarly, the number might be counted in chunks of 256. This is how floating-point values can be a huge number and, thus, floating-point values have much greater ranges than might be expected.

In essence, a floating-point number’s storage is split into two parts. One part stores the number of chunks (technically called the significand) and the other part indicates the size of each chunk (called the exponent).

Because Java works with binary numbers, everything is represented in powers of 2.

A floating-point value is indicated as significand * 2 ^ exponent. Or, to write it out in a more conventional mathematical format:

Floating-point value = significand × 2^exponent

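With that background on floating-point number representation, let’s get back to the idea of loss of precision. It turns out that for a float value in Java, the significand (the chunk count) is stored in 23 bits, and the exponent is stored in 8 bits. The remaining bit is the sign, which indicates whether the floating-point number is positive or negative. For normal values, the significand also has an implicit leading 1 bit that is not stored, so a float effectively carries 24 bits of precision.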
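
If you want to look at the two parts yourself, the following sketch pulls them out of a float with Float.floatToIntBits and rebuilds the value from them. The class FloatParts and its methods are hypothetical names for this illustration. The sketch assumes a normal float (not zero, subnormal, infinite, or NaN) and treats the significand as a whole-number chunk count by including the implicit leading 1 bit that IEEE 754 does not store.

```java
// Illustrative sketch; FloatParts and its methods are hypothetical names.
// Assumes the argument is a normal float (not 0, subnormal, infinite, or NaN).
final class FloatParts {

    // The top bit: 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // The 23 stored bits plus the implicit leading 1, as a whole-number count.
    static int significand(float f) {
        int bits = Float.floatToIntBits(f);
        return (bits & 0x7FFFFF) | 0x800000;
    }

    // The power of 2 that scales the whole-number significand.
    // 150 is the IEEE 754 bias (127) plus the 23 stored significand bits.
    static int exponent(float f) {
        int bits = Float.floatToIntBits(f);
        return ((bits >>> 23) & 0xFF) - 150;
    }

    // Rebuild the value as significand x 2^exponent, with the sign applied.
    static double reconstruct(float f) {
        double magnitude = Math.scalb((double) significand(f), exponent(f));
        return sign(f) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        System.out.println(significand(6.5f));  // 13631488
        System.out.println(exponent(6.5f));     // -21
        System.out.println(reconstruct(6.5f));  // 6.5
    }
}
```

Here 6.5 is stored as 13,631,488 chunks of size 2^-21, which is one way to read the formula above.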

Thus, a float can represent every whole number exactly only up to 16,777,216 (that is, 2^24). This is much lower than the largest value of an int and very much lower than the upper limit for a long.

If you try to use a float value to store a number bigger than that, the exponent, that is, the chunk size, must be increased. Beyond 16,777,216 the count is by twos, and beyond 2 × 16,777,216 the count is by fours. In other words, it starts out as follows:

Floating-point value = significand × 2^0 (up to 16,777,216)

Floating-point value = significand × 2^1 (up to 2 × 16,777,216)

Floating-point value = significand × 2^2 (up to 4 × 16,777,216)

And so on.

However, the significand keeps the same fixed width (23 stored bits) no matter how large the exponent gets, which limits its precision. This is what separates floating-point values from integers, which have a smaller range but greater precision.
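
The standard library can report the chunk size directly: Math.ulp returns the gap between a float and the next larger float. The sketch below uses a hypothetical class ChunkSizeDemo and helper chunkSize to show the gap growing as the values pass 16,777,216.

```java
// Illustrative sketch; ChunkSizeDemo and chunkSize are hypothetical names.
final class ChunkSizeDemo {

    // Math.ulp gives the distance from f to the next larger float.
    static float chunkSize(float f) {
        return Math.ulp(f);
    }

    public static void main(String[] args) {
        System.out.println(chunkSize(16_777_215f));  // 1.0
        System.out.println(chunkSize(16_777_216f));  // 2.0
        System.out.println(chunkSize(33_554_432f));  // 4.0
        System.out.println(chunkSize(67_108_864f));  // 8.0
    }
}
```

Each doubling of the magnitude doubles the gap between neighboring float values.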

There are immediately demonstrable consequences to this. First, if you try to assign to a float the literals 16,777,216; 16,777,217; 16,777,218; and 16,777,219, you see that it takes on the values 16,777,216 (accurate); 16,777,216 (rounded down by one); 16,777,218 (correct); and 16,777,220 (rounded up by one).

You can demonstrate this easily by casting those literal values to floats, and then casting them back to ints, like this:

System.out.println((int)((float)(16777216)));
System.out.println((int)((float)(16777217)));
System.out.println((int)((float)(16777218)));
System.out.println((int)((float)(16777219)));

Another effect that is perhaps even more significant is what happens in this loop. Try to guess how many times the following will print the string incrementing. Then copy the following code and find out if you were right.

float count = 33554432;
while (count < 33554435) {
    System.out.println("incrementing");
    count = count + 2;
}
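
If you would rather experiment with a version that is guaranteed to stop (spoiler ahead), here is a capped variant of the same loop. The class StuckLoopDemo, the method countIterations, and the cap parameter are additions made for this sketch and are not part of the original code.

```java
// Illustrative sketch; StuckLoopDemo, countIterations, and cap are
// hypothetical additions so that the experiment always terminates.
final class StuckLoopDemo {

    // Same loop as above, but it gives up after 'cap' iterations.
    static int countIterations(int cap) {
        float count = 33554432;
        int iterations = 0;
        while (count < 33554435 && iterations < cap) {
            iterations++;
            count = count + 2;  // the sum is rounded to the nearest float
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(countIterations(1_000));           // 1000
        // At this magnitude, floats are 4 apart, so adding 2 changes nothing.
        System.out.println(33554432f + 2 == 33554432f);       // true
    }
}
```

The loop always runs until the cap because count never moves: the exact sum falls halfway between two representable values and is rounded back to where it started.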

Double-precision floating-point numbers have essentially the same behavior, although the boundary numbers are different. Where a float has 23 stored significand bits (24 bits of precision), a double has 52 stored significand bits (53 bits of precision), which allows it to accurately represent the full range of values of a standard 32-bit int. And where a float uses 8 bits for the exponent, a double uses 11 bits.
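
The boundary for a double can be probed the same way as for a float. In this sketch, DoubleBoundaryDemo, viaDouble, and viaFloat are hypothetical names, and 9,007,199,254,740,992 is 2^53.

```java
// Illustrative sketch; DoubleBoundaryDemo and its helpers are hypothetical.
final class DoubleBoundaryDemo {

    // Round trip a long through a double.
    static long viaDouble(long value) {
        double d = value;
        return (long) d;
    }

    // Round trip an int through a float.
    static int viaFloat(int value) {
        float f = value;
        return (int) f;
    }

    public static void main(String[] args) {
        System.out.println(viaDouble(9_007_199_254_740_992L));  // 9007199254740992
        System.out.println(viaDouble(9_007_199_254_740_993L));  // 9007199254740992
        // Every int survives a trip through a double...
        System.out.println(viaDouble(Integer.MAX_VALUE));       // 2147483647
        // ...but not necessarily a trip through a float.
        System.out.println(viaFloat(16_777_217));               // 16777216
    }
}
```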

The point here is that loss of precision means just that: When you assign an int to a float or a long to either a float or a double, you might end up with an approximation of your original number. This approximation might have practical consequences for your code. (Side note: double is the default data type for floating-point numbers in Java.)

If you’re interested in more detail, Java uses the IEEE Standard for Floating-Point Arithmetic (IEEE 754) for floating-point representations. Before Java 17, exactly compliant behavior was guaranteed only in strictfp mode; from Java 17 onward, all floating-point arithmetic is strict. IEEE 754 is not specific to Java, and explaining it and strictfp is far beyond the scope of this article; a Wikipedia page provides more details.

Effective storage for numeric data types


This discussion is now complete, right? Well, no. Look back at Figure 1. The second column is labeled Effective storage. Why not simply use the title Storage? The difference is significant because the physical storage space used for a variable isn’t specified by the language or the virtual machine specification.

For example, on modern 64-bit Intel or Arm processors, it’s possible that the hardware cannot efficiently address single bytes or perhaps even 4-byte words. In this situation, a particular implementation is free to allocate more storage than is strictly needed for the data.

What the Java specification mandates is that the integral data types must behave as if they are two’s complement binary numbers with the specified amount of storage.
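
That mandated behavior is easy to observe, whatever the physical storage happens to be. The following sketch uses a hypothetical class TwosComplementDemo and helper next to show the wraparound and the bit widths that the language guarantees.

```java
// Illustrative sketch; TwosComplementDemo and next are hypothetical names.
final class TwosComplementDemo {

    // int arithmetic wraps around silently on overflow.
    static int next(int value) {
        return value + 1;
    }

    public static void main(String[] args) {
        System.out.println(next(Integer.MAX_VALUE) == Integer.MIN_VALUE);  // true
        // 128 does not fit in a byte; the low 8 bits are read as -128.
        System.out.println((byte) 128);                                    // -128
        // -1 is all ones in two's complement: 32 bits for an int.
        System.out.println(Integer.toBinaryString(-1).length());           // 32
        System.out.println(Integer.SIZE + " " + Long.SIZE);                // 32 64
    }
}
```

These results describe how the values behave, not how many bytes a particular virtual machine sets aside for each variable.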

For floating-point values, Java versions before 17 mandate that the behavior must be exactly compliant with the IEEE 754 specification only if the class or method carries the modifier strictfp; from Java 17 onward, that behavior is always required and the modifier is redundant. The bottom line is that although the table describes what the numerical behavior will be, you simply cannot assume that allocating 1,000 int variables will reliably allocate 4,000 bytes of memory.

This discrepancy can be even more startling with boolean values. It’s possible that an array of 64 boolean values might be packed into a single 8-byte word, but it’s also possible (though perhaps not very likely) that each individual boolean might actually take 8 bytes. The reality is likely somewhere in between, but neither the language specification nor the virtual machine specification mandates this behavior. Practically speaking, you probably don’t care in most cases.

Source: oracle.com
