Choosing the best runtime switches can greatly affect the performance of your workload.
Java developers don’t need to worry about the deployment hardware as long as a first-class JVM is available, and that’s certainly the case for Arm processors. Whether you’re writing the code or running it, it’s business as usual: The JVM executes the same Java bytecode regardless of the underlying processor.
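To see this portability in practice, here is a minimal sketch that reports which architecture a JVM is running on and which runtime switches it was started with. The class name RuntimeInfo and its helper methods are hypothetical names chosen for illustration; the calls themselves are standard Java SE APIs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class for illustration; not taken from Arm's or Oracle's posts.
public class RuntimeInfo {

    // Returns the CPU architecture name the JVM reports through the
    // standard "os.arch" system property (typically "aarch64" on 64-bit Arm Linux).
    static String architecture() {
        return System.getProperty("os.arch");
    }

    // Returns the switches passed to this JVM at startup,
    // such as heap-size or garbage-collector options.
    static List<String> jvmSwitches() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Architecture: " + architecture());
        System.out.println("JVM: " + System.getProperty("java.vm.name")
                + " " + System.getProperty("java.version"));
        System.out.println("Available processors: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("JVM switches: " + jvmSwitches());
    }
}
```

The same compiled class file should run unchanged on an Arm-based or x86-based host; only the reported values differ. Printing the startup switches is a simple way to confirm which tuning options a deployment is actually using.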
Of course, many Java developers and systems administrators want to know more, and there are several excellent resources, especially two posts from Arm.
Java performance on Ampere A1 Compute
Arguably the most important post for the deployment of processor-intensive applications is a November 2021 post from Arm senior performance engineer Shiyou (Alan) Huang. His post, “Improving Java performance on OCI Ampere A1 Compute instances,” begins with the following:
Oracle Cloud Infrastructure (OCI) has recently launched the Ampere A1 Compute family of Arm Neoverse N1-based VMs and bare-metal instances. These A1 instances use Ampere Altra CPUs that were designed specifically to deliver performance, scalability, and security for cloud applications. The A1 Flex VM family supports an unmatched number of VM shapes that can be configured with 1-80 cores and 1-512GB of RAM (up to 64GB per core). Ampere A1 compute is also offered in bare-metal configurations with up to 2-sockets and 160-cores per instance.
In this blog, we investigate the performance of Java using SPECjbb2015 on OCI A1 instances. We tuned SPECjbb2015 for best performance by referring to the configurations used by the online SPECjbb submissions. Those Java options may not apply to all Java workloads due to the very large heap size and other unrealistic options. The goal here is to see the best scores we can achieve on A1 using SPECjbb. We compared the performance results of SPECjbb2015 over different versions of OpenJDKs to identify a list of patches that improve the performance. As SPECjbb is a latency-sensitive workload, we also presented the impact of Arm LSE (Large System Extensions) on the performance in this blog.
Huang’s post uses the two SPECjbb2015 metrics to evaluate JVM performance: max-jOPS, which measures maximum throughput, and critical-jOPS, which measures throughput under the response-time constraints of service-level agreements (SLAs).
One of Huang’s charts, reproduced below as Figure 1, shows the SPECjbb scores from OpenJDK 11 to 16 using Arm’s tuned configurations.