Pros and Cons for Using GraalVM Native-Images
Native Image is a utility that converts Java applications into fully compiled binary code, which is called a native image.
Java is becoming very progressive with its new release policy: every six months we receive new features, enhancements, or just previews that we can try out and give feedback on, and that feedback can be taken into account in further development.
The second line of development might be even more interesting. It is the GraalVM project, which notably contains a new JIT compiler (a replacement for C2) and multi-language support based on the Truffle framework. There is one more technology that deserves our attention: GraalVM Native-Image. Native image is a utility for converting Java applications into fully compiled binary code, which is called a native image. The process of creating a native image is called ahead-of-time compilation.
In this article, I would like to discuss the advantages and disadvantages of native-image and then follow up with a separate article dedicated to Profiling Java Native Images. There are already a lot of articles on how to compile our code into a native image, for example, GraalVM Native Image. Let's look at it from a performance point of view.
When to Prefer Native-Image Over a Regular Java App?
This might be very interesting mainly for two types of applications: command-line apps and serverless functions. Both types actually have a lot in common. They are usually small, they can be started many times within a very short period, and their lifetime is pretty short.
This is why we need a very quick startup and do not want to wait for the provisioning of JVM features that might never actually be used (e.g., on a JVM our code first has to become hot before the JIT compiler is triggered).
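One way to make the startup difference visible is to let the application report how long it took to reach `main()`. The following is only a minimal sketch: the class name `StartupProbe` and the helper `sinceStart` are made up for this illustration, and it assumes that `ProcessHandle` reports a start time on your platform (the resolution of that value is coarse on some operating systems, and I have not verified its behavior inside a native binary).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper for this article, not part of GraalVM.
public final class StartupProbe {

    // Time elapsed between the process start and "now"; empty if the
    // operating system did not report a start time for the process.
    public static Optional<Duration> sinceStart(Optional<Instant> processStart, Instant now) {
        return processStart.map(start -> Duration.between(start, now));
    }

    public static void main(String[] args) {
        // Take the timestamp first so that the lookup below is not counted.
        Instant now = Instant.now();
        Optional<Instant> start = ProcessHandle.current().info().startInstant();

        System.out.println(sinceStart(start, now)
                .map(d -> "main() reached after " + d.toMillis() + " ms")
                .orElse("process start time not available on this platform"));
    }
}
```

Running the same class once on a regular JVM and once as a native binary gives a rough first impression; for anything more precise, an external tool such as `time` is the safer choice.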
What Makes the Startup Faster?
- No classloading
  - All classes have already been loaded, linked, and even partially initialized. However, this means that only the classes and methods that were traced during the image build are included in the binary and can be used at runtime.
- No interpreted code
  - The generated native code is not necessarily fully efficient, because it is compiled without the runtime profile that drives the optimizations of the C2 compiler (GraalVM Enterprise can include a profile collected from a previous run in the generation of the native image). However, we don't have to initialize an interpreter and interpret our bytecode.
- No CPU burnt on profiling and JIT-ing, and a much simpler GC to start (SerialGC)
  - We don't have to start the JIT compiler and compile our code to make it performant.
- Image heap generated during the native-image build
  - As mentioned above, the native application is partially initialized. This means that we can run the initialization of specific classes at build time (that is, execute their static initializer blocks) to prepare part of the heap in advance and speed up the startup. Please read the article by Christian Wimmer about Class Initialization in GraalVM Native Image.
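To illustrate the last point about the image heap, here is a small sketch of a class whose static initializer does some up-front work. The class `PrimeTable` is invented for this example. On a regular JVM the sieve runs at first use of the class; if the class is initialized at build time (the `native-image` tool has an `--initialize-at-build-time` option for that, although the defaults differ between GraalVM versions), the already filled array can be stored in the image heap instead.

```java
import java.util.Arrays;

// Hypothetical example class, not taken from GraalVM.
public final class PrimeTable {

    private static final boolean[] IS_PRIME = new boolean[10_000];

    // This block is the class initializer. Whatever it computes is exactly
    // the kind of state that build-time initialization can precompute.
    static {
        Arrays.fill(IS_PRIME, 2, IS_PRIME.length, true);
        for (int i = 2; i * i < IS_PRIME.length; i++) {
            if (IS_PRIME[i]) {
                // Cross out every multiple of i, starting at i * i.
                for (int j = i * i; j < IS_PRIME.length; j += i) {
                    IS_PRIME[j] = false;
                }
            }
        }
    }

    // Returns false for values outside the precomputed range.
    public static boolean isPrime(int n) {
        return n >= 0 && n < IS_PRIME.length && IS_PRIME[n];
    }

    public static void main(String[] args) {
        System.out.println("9973 is prime: " + isPrime(9973));
    }
}
```

The trade-off is that a build-time initializer must not capture anything that is only valid on the build machine (open files, timestamps, random seeds), which is why this has to be decided per class.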
Lower Memory Footprint
If we compile our code into a native image, we can throw away a lot of stuff from our executable: the JVM features that exist only to make our code more efficient at runtime are no longer needed. However, keep in mind that the RSS memory of the native-image process is reduced only by the JVM-related structures and class metadata; the heap memory stays exactly the same.
Due to the nature of the GC currently used in native images, larger heaps can become awkward to use because the GC pauses can impact our latency. But if we keep the application small, we can benefit from the efficiency of the native-image GC (currently implemented as a generational scavenger).
What Makes the Memory Footprint Lower?
- No metadata for loaded classes
  - We still need to keep the compiled code in our non-heap memory, but that is much more space-efficient than keeping all the metadata for dynamically loaded classes in Metaspace (and no memory reclamation in Metaspace is needed).
- No profiling data for JIT optimizations, no interpreter code, no JIT structures
  - The JVM collects profiling data about our application to figure out which optimizations could be applied. This is not needed because our bytecode has already been compiled to native code. Therefore, we can throw away the entire Segmented Code Cache along with the profiling data and the interpreter. The optimized methods are stored in the binary file in a different way.
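The claim that only the non-heap part shrinks can be checked roughly by printing the heap usage from inside the application and comparing it with the RSS reported by the operating system. The class `HeapReport` below is a made-up sketch that relies only on `java.lang.Runtime`; it assumes these calls behave comparably in a native image, which is worth verifying on your GraalVM version.

```java
// Hypothetical helper for this article, not part of GraalVM.
public final class HeapReport {

    // Heap currently in use, as seen by the runtime itself.
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Allocate 64 MiB and keep it reachable so that it shows up as used heap.
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];
        }

        long mib = 1024 * 1024;
        Runtime rt = Runtime.getRuntime();
        System.out.printf("blocks kept: %d, used heap: %d MiB, committed heap: %d MiB, max heap: %d MiB%n",
                blocks.length, usedHeapBytes() / mib, rt.totalMemory() / mib, rt.maxMemory() / mib);
    }
}
```

If the used heap is similar in both runs while the RSS of the native binary is clearly lower, the difference comes from the JVM structures listed above and not from the application data.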
Native-Image and the Bright Future of Performance
The very last note of this section is dedicated to a new feature: Isolates. Very briefly, isolates are a technology that partitions our heap into smaller independent "heaps". This can be very efficient, e.g., for request processing.
If the processing allocates a lot of objects on the heap, we can put the request (actually, the thread that the request runs on) into a separate isolate. The big advantage is that an isolate can be allocated and thrown away very quickly without doing any GC (we know that all its objects become garbage when the thread leaves the isolate because those objects cannot be referenced from other isolates).
This technology is still under development, but if you want to know more, read Isolates by Christian Wimmer.
What Is the Price to Pay With the Native-Image?
Native-image is a fantastic tool for smaller applications that can help us with startup time and memory footprint. However, we need to pay a price for that and adjust our applications to be compliant with ahead-of-time compilation. We can find an exhaustive list of supported and unsupported features in SubstrateVM Limitations. I will just briefly point out the most painful parts:
- No JVMTI, Java Agents, JMX, or JFR support
  - This is probably the biggest gap between applications running on a JVM and native images. Native-image does not support any of the well-known profilers that connect via JVMTI or JMX, so we need to rely on kernel features to figure out what happens in our application.
  - Java Agents, commonly used for instrumenting classes (very often to measure execution time or allocation rate), are not supported either. Even Java Flight Recorder along with Java Mission Control, which would be priceless in this situation, is not available.
- Efficient only for smaller heaps
  - Because of the SerialGC, we should avoid bigger heaps; otherwise, we could badly impact our latency.
- Generated native code is not fully efficient
  - The JIT will always be more efficient because it has access to the application's runtime profile, can speculate, and just inserts traps for deoptimization. We can mitigate the difference between AOT and JIT by using the Profile-Guided Optimization feature that is available in GraalVM Enterprise Edition.
- Obscurity with reflection
  - I am not a fan of reflection at all, but a lot of frameworks are based on it. They must either be adjusted or use additional configuration files that define which classes and methods are supposed to end up in the binary file. I can recommend a very detailed page about SubstrateVM Reflection.
- No thread dump and heap dump support
  - In the next article, I'll show how to get at least some information about threads using Linux kernel features. As for heap dumps, there is a (not so convenient) way to generate them, but again only in GraalVM Enterprise Edition. Please read Generation Native Heap Dumps.
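To make the reflection point from the list above more concrete, here is a minimal sketch of the kind of code that causes trouble. The classes `ReflectiveCall` and `Greeter` are invented for this example. Because the class name arrives as a plain string at runtime, the image build generally cannot tell from the code alone which class has to be kept, so it has to be told through the additional reflection configuration mentioned above.

```java
import java.lang.reflect.Method;

// Hypothetical example, not taken from any framework.
public final class ReflectiveCall {

    // Stand-in for user code that a framework would discover by name.
    public static final class Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Looks up the class and the method by name and invokes the method on a
    // fresh instance. Works out of the box on a JVM; in a native image the
    // lookups can only succeed for elements that were included in the binary.
    public static Object invoke(String className, String methodName, String arg) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName, String.class);
            return method.invoke(instance, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call failed for " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name may come from the command line, so it is unknown at build time.
        String className = args.length > 0 ? args[0] : Greeter.class.getName();
        System.out.println(invoke(className, "greet", "native-image"));
    }
}
```

Frameworks that resolve such names at build time, or generate the configuration automatically, avoid this problem; hand-written reflective lookups are where it usually surfaces.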
Thank you for reading my article and please leave comments below. If you would like to be notified about new posts, then start following me on Twitter!
Stay tuned for the follow-up article called Profiling Java Native Images.