Wednesday, December 1, 2021

Java on Arm: The AArch64 hardware, software, cloud, and JDK

Learn about the hardware and software that can help bring your AArch64 development project to a new level in the cloud or on the desktop.

[The software world is buzzing about the Arm 64-bit architecture, and of course, Java programmers are no different. The good news is that as a Java developer, you don’t need to worry about the deployment hardware as long as a first-class JVM is available, and that’s certainly the case for AArch64. It’s business as usual: Sure, you might expect a performance gain; but at the end of the day, it’s your plain old Java bytecode.

Still, many developers want to know more about processor architecture, the hardware, and of course, the performance of their Java applications on those machines. Enjoy. —Ed.]

The technology of chip designer Arm Holdings is at the heart of most smartphones worldwide, but Arm chips can be found in many devices beyond phones. Indeed, the Arm platform has been a thriving alternative to x86 chips from Intel and AMD in desktops and servers for a couple of years.

You’ve heard praise for Arm’s speed and productivity; this new silicon is easier on thermals and batteries while showing equal or higher efficiency than x86 processors. And still, if you are an enterprise Java developer who wants to switch your back-end Java operations to Arm entirely or partially, there are no proper and concise instructions for such a migration.

This article serves as a step-by-step guide for x86-to-Arm migration, where you’ll get to choose

◉ A new AArch64-based machine

◉ The operating system

◉ The software stack

◉ Either an x86 emulation or an Arm-native JDK

The wide sea of Arm

In 2018, Arm Holdings announced a public roadmap for the upcoming years with separate mobile and infrastructure domains. All infrastructure products now fall under the so-called Neoverse, the company’s umbrella for cloud-to-edge workloads and technologies inside the ecosystem.

The Neoverse includes three generations of processors and three lineups: E-, V-, and N-series, as you can see in Figure 1. The important change for the IT industry is that from now on, Arm chip design for external manufacturers is practically focused in three directions. Cloud server processor cores are relatives of mobile chipset cores. For instance, the Neoverse N1 core model is similar to Cortex-A76, and the next generation will share a design with X cores. Neoverse N2 adopts the ARMv9.0-A specification, which extends the current 8.x architecture with numerous enhancements.

Figure 1. The Arm computing roadmap (source: Arm Holdings)

Hardware for Java developers


The first option for AArch64-targeted development is to continue to use your current x86-based computer. Unless you are working at a very low level, such as when building Java hardware compatibility kits, there is almost never any need to run emulation because Java is “write once run anywhere.” So just develop your Java artifacts on one platform and then test and deploy them on another. You never need to stop using your favorite IDE, build system, and other tools.

In some cases, such as when testing or performance tuning, it is easy to use remote AArch64 machines via the Secure Shell (SSH) protocol. You can prepare all native AArch64 artifacts, including Java Native Interface (JNI) binaries and container images, using cross-compilation and container cross-builds on an x86 developer’s machine. Since it’s not always convenient to build and test your code in different environments, you can consider Arm hardware options for a developer—such as buying a shiny new machine.
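When you log in to a remote machine for testing or tuning, it can help to confirm which platform the JVM actually reports before you trust any measurements. The following is a minimal sketch that relies only on the standard os.name and os.arch system properties; the class name ArchReport and the helper isAarch64 are illustrative and not part of any library.

```java
import java.util.Locale;

// Illustrative helper (hypothetical name): prints the platform the JVM reports.
class ArchReport {

    // OpenJDK reports "aarch64" for 64-bit Arm; "arm64" is accepted defensively.
    static boolean isAarch64(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        return arch.equals("aarch64") || arch.equals("arm64");
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        String arch = System.getProperty("os.arch");
        System.out.println("OS: " + os + ", arch: " + arch
                + ", AArch64: " + isAarch64(arch));
    }
}
```

Running the same class file on the x86 developer machine and on the AArch64 target is a quick way to see that the bytecode is unchanged while the reported architecture differs.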

The second possible (and very topical) choice for your soon-to-be-Arm project is Apple silicon. This concept consists of three components.

◉ M1, the first Apple-designed system on a chip (SoC) for Apple computers, which is intended to replace x86 processors

◉ Specific macOS products that use M1 processors; several were announced in November 2020, and more appeared in October 2021, with a choice of laptop and desktop configurations

◉ A hardware/software combination between M1 and the new macOS Big Sur (released November 2020) and macOS Monterey (October 2021) that brings extremely high performance and efficiency for native AArch64 applications

The only downside to the Apple configuration is an almost imperative use of macOS. Yes, you can run Linux or Windows on AArch64 Macs, but be prepared for a drop in performance. The first option is to run a virtual machine where Arm is emulated on Arm (the operating system is different). You’ll pay a virtualization and emulation penalty, though not as big as with emulation of x86 on Arm, which is also possible.

There is also an option of installing the system on Apple’s bare metal. There are two drawbacks: This is a work in progress for Linux, and native drivers are required, unlike with the emulation environment. Therefore, as a Java developer, you’ll need the software stack and JDK native to macOS—but more on that later.

Finally, when you don’t need all the power in the world, look into low-end processors for affordable and energy-efficient laptops. They are not very effective for software development, but they certainly have growth potential. I’m speaking of desktop-class chipsets: Samsung Exynos, MediaTek Helio, Huawei Kirin, and certain Qualcomm Snapdragon series.

These SoC systems, which are prevalent in smartphones, have started taking bites out of the laptop market inside the products of major vendors such as HP, Lenovo, and ASUS. Typically, vendors’ interest in Arm is long battery life, which explains why it is mostly seen in laptops and notebooks.

Choosing an operating system and JDK


The next step is to pick an operating system and the JDK that would be compatible with the chosen device and most relevant to your project.

The first AArch64 porting project was JEP 237: Linux/AArch64 port, which was part of Java 9. This JEP still greatly affects present-day work on Java as a programming language.

In Java 11, plenty of optimizations in the port were implemented in JEP 315: Improve Aarch64 intrinsics. These improvements are CPU-specific and, thus, help all operating systems. Thanks to contributions from Arm, there are also some optimizations for the future hardware, such as Scalable Vector Extension (SVE) vectorization code.

Due to Java 16’s Windows/AArch64 (JEP 388) and Java 17’s macOS/AArch64 (JEP 391) OpenJDK ports, you have full-fledged Java options on almost all popular operating systems.

Operating systems for the cloud. My clear, unequivocal recommendation is to run Linux for your cloud workloads and for hosting a JVM.

When you choose an OS image for a cloud JVM, it is better to have one from a vendor that has clearly confirmed its support for the AArch64 architecture along with other hardware platforms. Fortunately, major Linux distribution providers do that now. The same is true for cloud provider-branded images such as ones from Oracle. Such an OS can be a container host or run JVMs directly. Docker and Kubernetes work well on Arm servers.

When you choose an operating system and JVM combination for a container, it is sometimes useful to minimize the size of the image. As a result of engineering efforts from the BellSoft team and the OpenJDK community, JDK 16 received support for the tiny Alpine Linux with JEP 386. The use of musl (as the C standard library) natively without a glibc layer shows many benefits for cloud-native Java. One example is smaller images, which can lead to higher overall software performance and lower development costs.

JDK 11 musl binaries have been available for a while; thanks to the community’s efforts, JDK 8 musl binaries appeared shortly after JEP 386 was completed.

Profiling tools such as async-profiler work on Linux AArch64 either with JVMs started directly or with containers.

Using the very same package dependencies, you can use all Linux variants mentioned above to produce and run GraalVM native images, such as by using Spring Native.

Software for AArch64 Java development


Remember that you can continue using your familiar environment, and that includes the OS. Whether it’s Windows, Linux, or macOS on x86, keep your JDK, IDE, and other tools set up for Java development. For cross-compilation of C/C++ code, you have amd64-to-aarch64 64-bit cross-compilers in all the above cases.

However, cross-compiling between operating systems, say, from Windows to Linux, can be tricky. Here, containers come to help. If you don’t yet use containers to get a predictable, native build environment, maybe it is time to start. Once you have a Linux x86 build container, it won’t be difficult to make it cross-compile to Linux AArch64, which is the target for the cloud deployment.

Windows has not been left out from the Arm trend, as you can see with JDK 16’s JEP 388 port. The Windows port for Arm is also extended from the Linux/AArch64 port, has optimized intrinsics from JEP 315, and is ready for production.

Needless to say, Windows support and integration into the mainline JDK means more Windows-based binaries for Arm 64-bit processors. As mentioned earlier, there are many Arm-based laptops, and they mostly have or support Windows 10 on Arm. As for JDK binaries, multiple vendors provide compatible builds. No full-fledged native builds of IntelliJ IDEA or NetBeans are out yet. However, other Java tools can be found for Windows.

Some Arm-based laptops with Windows 10 on Arm can be reimaged to run Linux. Moreover, the market also offers developer boards powerful enough to be used as a developer machine with Linux and modern GUIs.

If you’ve picked an Apple M1-based computer for your project, I suggest going for macOS, especially because the JEP 391 macOS/AArch64 port is part of JDK 17. It will likely share some code with the Windows port of the JDK, which is expected to reduce the macOS-specific AArch64 code further down the line.

Using Docker on Macs that have Apple silicon can be quite convenient for building native images suitable for Arm servers from your Java applications.

Native build versus emulation


Why is the Apple silicon native build better with macOS than with Linux? When BellSoft first released native Liberica JDK builds for M1 in March 2021, it ran several DaCapo benchmarks using default JVM parameters on two implementations: macOS-x86_64 on Rosetta 2 (a special translation environment bundled with Big Sur) and macOS-aarch64. The answer to the question posed here is evident in Figure 2. The native build makes a major difference in performance: It is up to 2.5 times faster.

Figure 2. DaCapo benchmarks in milliseconds (less is better) with JDK 17 EA build 25, MacBook Air M1, and macOS 11.4

On-the-fly emulation of a different CPU architecture consumes a significant part of computer resources. Also, as you can see in this article about AArch64 port optimization, there are many cases where OpenJDK HotSpot (and other compilers and runtimes) does special performance tricks that are highly platform-dependent.
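If you suspect that a JVM on an Apple silicon Mac is an x86_64 build running under Rosetta 2, one way to check is to query the macOS sysctl key sysctl.proc_translated, which reports 1 for a translated process. The sketch below assumes that the sysctl child process inherits the translation state of the JVM that starts it, which is the same assumption shell scripts make when they use this key. The class and method names are illustrative.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Illustrative check (hypothetical names): is this JVM translated by Rosetta 2?
class RosettaCheck {

    // sysctl prints "1" for a translated process and "0" for a native one.
    static boolean parseTranslated(String sysctlOutput) {
        return sysctlOutput != null && sysctlOutput.trim().equals("1");
    }

    static boolean isRunningUnderRosetta() {
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        if (!os.contains("mac")) {
            return false; // Rosetta 2 exists only on macOS
        }
        try {
            Process p = new ProcessBuilder("sysctl", "-n", "sysctl.proc_translated")
                    .redirectErrorStream(true)
                    .start();
            String out = new String(p.getInputStream().readAllBytes(),
                    StandardCharsets.UTF_8);
            // A nonzero exit code means the key is unknown (for example, on Intel Macs).
            return p.waitFor() == 0 && parseTranslated(out);
        } catch (IOException e) {
            return false; // sysctl could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("os.arch: " + System.getProperty("os.arch"));
        System.out.println("Under Rosetta 2: " + isRunningUnderRosetta());
    }
}
```

A translated JVM also reports x86_64 as os.arch, so the two printed lines together show whether you are measuring the native build or the emulated one.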

The cloud ecosystem for Java


Just like Linux and JDK, many projects require porting to work efficiently, or to just work in general, on specific hardware. The software also needs to be built and distributed. For example, if you need the NGINX load balancer, simply check whether an AArch64 version of the appropriate package resides in the repository used by the OS package manager (most likely it does).

If you work with big data via Hadoop, there may be native accelerator libraries, such as the Hadoop Native Library (HNL) or the Intel Intelligent Storage Acceleration Library (ISA-L), which may even be third-party libraries. Ensure that platform-specific binaries are properly installed along with the main software installation, which can autodiscover them or might require extra explicit configuration. If you don’t add them to your ecosystem, the software continues to work but has worse performance. This is especially true for older Java versions.

The same situation applies to libraries that come as dependencies to your projects. For instance, consider the Netty network input/output client/server framework. Netty has native epoll transport (netty-transport-native-epoll). How can you check whether such libraries are effective in the Arm cloud, and how can you use them? Go to Maven Central and look at artifacts. In the case of Netty, platform-specific native libraries are packaged inside dedicated .jar artifacts in META-INF/native, for example, netty-transport-native-epoll-4.1.65.Final-linux-aarch_64.jar. There are a few such artifacts with different classifiers specifying different build and runtime targets.
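To see which classifier applies to the machine you are on, you can map the JVM's os.name and os.arch values to the naming used in such artifacts. The sketch below covers only a few common values and assumes the naming convention visible in the Netty artifact above (linux-aarch_64, linux-x86_64); it is a simplification with illustrative names, not the logic of any build plugin.

```java
import java.util.Locale;

// Illustrative mapping (hypothetical names) from JVM properties to a classifier.
class NativeClassifier {

    static String normalizeOs(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("linux")) return "linux";
        if (os.startsWith("mac") || os.startsWith("osx")) return "osx";
        if (os.startsWith("windows")) return "windows";
        return "unknown";
    }

    static String normalizeArch(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) return "x86_64";
        if (arch.equals("aarch64") || arch.equals("arm64")) return "aarch_64";
        return "unknown";
    }

    static String classifier(String osName, String osArch) {
        return normalizeOs(osName) + "-" + normalizeArch(osArch);
    }

    public static void main(String[] args) {
        System.out.println(classifier(
                System.getProperty("os.name"), System.getProperty("os.arch")));
    }
}
```

If the printed classifier has no matching artifact on Maven Central, the library most likely falls back to its pure-Java code path on that platform, or it does not support the platform at all.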

What about Docker multiplatform images? Today it is easy to cross-build a container image, for example, for AArch64 or x86 systems. A Docker plugin called Buildx has a --platform option you can use to specify multiple platforms, including the two mentioned; see the documentation. My “Java in Docker on Apple Silicon” blog post shows an example of making a cross-build for x86 on Apple silicon. Remember that your parent image must be available for the desired CPU architecture. Ready-to-use parent images with Liberica JDK on top of a few different OS layers are provided at BellSoft’s public repository.

Cross-compilation and publishing for Java Native Interface


Take a look at this problem from the other side: Your project might be a component that could be embedded into another project. In this case, you should produce and publish binaries that include a native part that is, by definition, platform-specific.

For instance, consider a Maven project with one JNI library where some native methods are implemented, and you also have an x86 development machine (or container) with Linux. The project is built by Maven; the resulting artifacts are published in the Maven repository. What happens now is that the project is going to be targeted both to x86 and AArch64. So, it’s good to produce native artifacts doubling as Maven dependencies with a proper CPU architecture classifier. The code that loads native libraries like JNI libraries should do it correctly on all target platforms.

To create a dual-target application, you need a regular compilation toolchain and a cross-toolchain to compile and link both the x86 and AArch64 shared libraries on x86.

Use your system’s package manager, such as the Advanced Package Tool (APT) on Debian-based Linux systems, to install a native toolchain. For example, on Ubuntu Linux, type the following:

apt-get install gcc

Verify that the installation is valid, and that it has the correct version, with this command.

gcc -v

Some systems support cross-toolchain installation via a package manager. Others require separate downloads. The package manager can be used on Ubuntu Linux, as follows:

apt-get install gcc-aarch64-linux-gnu

Verify that the installation is valid, and that it has the correct version, with this command.

aarch64-linux-gnu-gcc -v

Next, ensure your configuration for native-maven-plugin has gcc as the compiler and linker for building x86 parts, and that it has aarch64-linux-gnu-gcc for building AArch64 parts.

For x86, check the configuration for the following:

<compilerExecutable>gcc</compilerExecutable>
<linkerExecutable>gcc</linkerExecutable>

For AArch64, check the configuration for the following:

<compilerExecutable>aarch64-linux-gnu-gcc</compilerExecutable>
<linkerExecutable>aarch64-linux-gnu-gcc</linkerExecutable>

You can (but don’t have to) assemble binaries for different CPU targets in distinct profiles. Next, the Java code that loads the JNI library can look like the following:

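package com.bellsoft.demo;

public class HelloWorld {
  // Implemented in C; see the JNI function that follows.
  native void helloFromC();

  static {
    // Load the library built for the current platform. On Linux/AArch64 the
    // name resolves to helloworld-Linux-aarch64, that is, the file
    // libhelloworld-Linux-aarch64.so somewhere on java.library.path.
    System.loadLibrary("helloworld-" + System.getProperty("os.name") + "-" + System.getProperty("os.arch"));
  }

  public static void main(String[] args) {
    HelloWorld hw = new HelloWorld();
    hw.helloFromC();
  }
}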

Here’s the C method.

#include <jni.h>
#include <stdio.h>

JNIEXPORT void JNICALL Java_com_bellsoft_demo_HelloWorld_helloFromC
      (JNIEnv *env, jobject obj) {
  printf("Hello world!\n");
  return;
}

As Hadoop and Netty do, you can also provide a fallback Java implementation for platforms where the native library is unavailable; the loading code looks similar to the above.
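The fallback idea can be sketched as follows: Try to load the platform-specific library, and if the JVM cannot find it, use a plain Java code path instead. This is only an illustration under stated assumptions. The class name HelloWithFallback and its methods are hypothetical, and the C function would need a matching JNI symbol name for this class before the native path could work.

```java
// Hypothetical class; a real JNI symbol name must match the declaring class.
class HelloWithFallback {

    // True only if the platform-specific library was found and loaded.
    static final boolean NATIVE_LOADED = tryLoad("helloworld-"
            + System.getProperty("os.name") + "-" + System.getProperty("os.arch"));

    native void helloFromC();

    // Returns false instead of failing when the library is missing.
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    // Pure-Java code path used when no native library is available.
    void helloFromJava() {
        System.out.println("Hello world!");
    }

    void hello() {
        if (NATIVE_LOADED) {
            helloFromC();
        } else {
            helloFromJava();
        }
    }

    public static void main(String[] args) {
        System.out.println("Native library loaded: " + NATIVE_LOADED);
        new HelloWithFallback().hello();
    }
}
```

The benefit of this design is that the same .jar file keeps working on a platform for which you have not yet published native binaries, at the cost of the slower Java path.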

Here is how to build the artifact.

mvn package

Check that everything works on both platforms.

java -Djava.library.path=./native-linux-amd64/target:./native-linux-aarch64/target -jar ./core/target/helloworld-core-1.0-SNAPSHOT.jar
Hello world!

The resulting .so files can be published as standalone artifacts or be packaged inside .jar files, similar to what Netty does. Cross-builds and regular builds can be defined in the same profile or in different profiles and, thus, performed together or selected according to the environment and parameters. You can also skip cross-builds entirely: Platform-specific artifacts can still be built and published with the right classifiers when the build process is executed a few times on machines with different CPU architectures. Choose what is best for your product and for your continuous integration.
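If you package the .so files inside a .jar file, the loading code has to copy the right binary out of the archive before the JVM can load it. The following is a minimal sketch of that approach. The resource layout under META-INF/native, the file naming scheme, and the class name BundledNativeLoader are assumptions made for illustration, and the .so suffix limits the sketch to Linux.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Illustrative loader (hypothetical names and resource layout).
class BundledNativeLoader {

    // Assumed layout: META-INF/native/lib<name>-<os>-<arch>.so
    static String resourcePath(String baseName, String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT).replace(" ", "");
        String arch = osArch.toLowerCase(Locale.ROOT);
        return "META-INF/native/lib" + baseName + "-" + os + "-" + arch + ".so";
    }

    // Copies the bundled library to a temporary file and loads it from there.
    static void load(String baseName) throws IOException {
        String resource = resourcePath(baseName,
                System.getProperty("os.name"), System.getProperty("os.arch"));
        try (InputStream in = BundledNativeLoader.class.getClassLoader()
                .getResourceAsStream(resource)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("No bundled library at " + resource);
            }
            Path tmp = Files.createTempFile("lib" + baseName, ".so");
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            System.load(tmp.toAbsolutePath().toString());
        }
    }

    public static void main(String[] args) {
        System.out.println(resourcePath("helloworld",
                System.getProperty("os.name"), System.getProperty("os.arch")));
    }
}
```

With this approach, users of your component do not need to set java.library.path, because the library travels inside the dependency they already have on the classpath.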

Source: oracle.com
