Multi-release JARs - Good or bad idea?

December 19, 2017

With Java 9 came a new feature of the Java runtime called multi-release jars. For us at Gradle, it’s probably one of the most controversial additions to the platform. TL;DR: we think it’s the wrong answer to a real problem. This post will explain why we think so, but also explain how you can build such jars if you really want to.

Multi-release JARs, aka MRJARs, are a new feature of the Java platform, included in the Java 9 JDK. In this post, we will elaborate on the significant risks of adopting this technology and show how one can produce and consume multi-release JARs with Gradle, if desired.

In a nutshell, multi-release jars allow you to package several versions of the same class, for consumption by different runtimes. For example, if you run on JDK 8, the Java runtime would use the Java 8 version of the class, but if you run on Java 9, it would use the Java 9 specific implementation. Similarly, if a version is built for the upcoming Java 10 release, then the runtime would use it instead of the Java 9 and default (Java 8) versions.
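The selection rule described above can be sketched in a few lines of Java. This is only an illustration of the lookup order, not the JDK's actual implementation: the names `VersionSelection` and `select` are hypothetical, and representing the default (unversioned) classes with the number 8 is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating which copy of a class a runtime would pick
// from a multi-release jar. Not the JDK's actual implementation.
final class VersionSelection {
    // Marker for the default classes at the root of the jar (an assumption of this sketch).
    static final int BASE = 8;

    // Returns the highest packaged version that does not exceed the runtime
    // version, falling back to the default classes when none applies.
    static int select(List<Integer> packagedVersions, int runtimeVersion) {
        int best = BASE;
        for (int version : packagedVersions) {
            if (version >= 9 && version <= runtimeVersion && version > best) {
                best = version;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A jar that ships default classes plus Java 9 and Java 10 specific ones.
        List<Integer> packaged = Arrays.asList(9, 10);
        for (int runtime : new int[] {8, 9, 10, 11}) {
            System.out.println("Java " + runtime + " runtime uses version " + select(packaged, runtime));
        }
    }
}
```

Note that a runtime newer than every packaged version (Java 11 here) still picks the most recent version it can find, which is why a Java 10 specific class takes precedence over the Java 9 and default ones.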

Use Cases for multi-release JARs

Optimized runtime. This answers a problem that lots of developers have faced in the real world: when you develop an application, you don’t know in what runtime it’s going to be executed. However, you know that for some runtimes you can implement optimized versions of the same class. For example, imagine that you want to display the Java version number that your application is currently executed on. For Java 9, you can use the Runtime.version() method. However, this is a new method only available if you run on Java 9+. If you target more runtimes, say, Java 8, then you need to parse the java.version property. So you end up with 2 different implementations of the same feature.
Conflicting APIs. Another common use case is to handle conflicting APIs. For example, you need to support 2 different runtimes, but one has deprecated APIs. There are currently 2 widely used solutions to this problem:
- The first one is to use reflection. One could for example define a VersionProvider interface, then 2 concrete classes Java8VersionProvider and Java9VersionProvider, the right one being loaded at runtime (note that, funnily enough, to be able to choose between the 2 you might have to parse the version number!). A variant of this solution is just to have a single class, but different methods, accessing and calling different methods by reflection.
- A more advanced variant of this first solution would be to use method handles, if technically applicable. Most likely, you would see reflection as both painful to implement and slow, and you would most likely be right.
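To make the reflection approach concrete, here is a minimal sketch of what it could look like. The VersionProvider, Java8VersionProvider and Java9VersionProvider names come from the description above. Everything else (the enclosing `VersionProviders` class, the `parseMajor` and `create` helpers, and the choice to probe for the API rather than parse the version) is an assumption made for illustration.

```java
// Sketch of the reflection-based approach. The file compiles for Java 8
// because the Java 9 API is only ever reached through reflection.
final class VersionProviders {
    interface VersionProvider {
        int majorVersion();
    }

    // Works on any runtime by parsing the java.version system property.
    static final class Java8VersionProvider implements VersionProvider {
        @Override
        public int majorVersion() {
            return parseMajor(System.getProperty("java.version"));
        }
    }

    // Calls Runtime.version().major() reflectively (both were added in Java 9).
    static final class Java9VersionProvider implements VersionProvider {
        @Override
        public int majorVersion() {
            try {
                Object version = Runtime.class.getMethod("version").invoke(null);
                return (Integer) version.getClass().getMethod("major").invoke(version);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Runtime.version() is not available", e);
            }
        }
    }

    // Handles both the legacy "1.8.0_151" scheme and the newer "9.0.1" or "10-ea" scheme.
    static int parseMajor(String javaVersion) {
        String[] parts = javaVersion.split("[._+-]");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    // Picks an implementation by probing for the Java 9 method at runtime.
    static VersionProvider create() {
        try {
            Runtime.class.getMethod("version");
            return new Java9VersionProvider();
        } catch (NoSuchMethodException e) {
            return new Java8VersionProvider();
        }
    }

    public static void main(String[] args) {
        VersionProvider provider = create();
        System.out.println(provider.getClass().getSimpleName() + " reports Java " + provider.majorVersion());
    }
}
```

Even in this tiny example, the cost is visible: the Java 9 code path is not checked by the compiler, and a typo in a method name would only show up at runtime.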

Well known alternatives to multi-release JARs

The second solution, easier to maintain and reason about, is to provide 2 different jars, aimed at 2 different runtimes. Basically, you would write the 2 implementations for the same class in your IDE, and it’s the build tool’s responsibility to compile, test and package them correctly into 2 different artifacts. This is the approach that some tools like Guava or Spock for example have been using for years. But it’s also what some languages like Scala need, because there are so many variants of the compiler and the runtime that binary compatibility is almost impossible to maintain.

But there are more reasons to prefer separate jars:

a jar is just packaging
- it’s an artifact of the build that happens to package classes, but not only: resources would typically be bundled into a jar too. Packaging, as well as processing resources, has a cost. What we’re trying to do with Gradle is to improve the performance of builds, and reduce the amount of time a developer has to wait to see results of compilation, tests, and in general the whole build process. By forcing a jar to be built too early in the process, you create a redundant point of synchronization. For example, to compile downstream consumers, the only thing the consumer needs is the .class files. It doesn’t need the jar, nor does it need the resources in the jar. Similarly, to execute tests, all Gradle needs is the class files, plus the resources. There’s no need to actually create the jar to execute tests. The jar is only needed once an external consumer requires it (in short, publishing). But as soon as you consider the artifact as a requirement, then you’re blocking some tasks from running concurrently, and you’re slowing down the whole build. While for small projects this might not be an issue, for enterprise-scale builds, this is a major blocker.
more importantly, as an artifact, a jar shouldn’t carry information about dependencies.
- There’s absolutely no reason why the runtime dependencies of your Java 9 specific class would be the same as the Java 8 one. In our very simplistic example they would, but for larger projects this is wrong modeling: typically, users would import a backport library of a Java 9 feature, and use it to implement the Java 8 version of the class. However, if you package both versions in the same jar, then you’re mixing things that don’t have the same dependency trees into a single artifact. It means, typically, that if you happen to run on Java 9, you’re bringing a dependency that you would never ever use. Worse, it can (and will) pollute your classpath, possibly creating conflicts for consumers.

Eventually, for a single project, you can produce different jars, aimed at different usages:

one for the API
one for Java 8 runtime
one for Java 9
one with native bindings
…

Abuse of the classifier leads to inconsistent things being referred to using the same mechanism. Typically, the sources or javadocs jars are posted as classifiers, but don’t really have any dependency.

we don’t want to create a mismatch depending on how you get your classes. In other words, using multi-release jars has the side effect that consuming from a jar and consuming from a class directory are no longer equivalent. There’s a semantic difference between the 2, which is terrible!
depending on the tool that is going to create the jar, you may produce inconsistent jars! The only tool so far that guarantees that if you package the same class twice in a jar, both of them have the same public API, is the jar tool itself, which, for lots of good reasons, is not necessarily used by build tools, or even users. A jar, in practice, is just an envelope. It’s a zip in disguise. So depending on how you build it, you would have different behavior, or you could just produce wrong artifacts and never notice.

Better ways to manage separate JARs

The main reason developers don’t use separate jars is that they are impractical to produce and consume. The fault is on build tools, which, until Gradle, have dramatically failed at handling this. In particular, developers who have used this solution had no other choice than relying on the very poor classifier feature of Maven to publish additional artifacts. However, classifiers are very bad at modelling the complexity of the situation. They are used for a variety of different aspects, from publishing sources, documentation, javadocs, to publishing variants of a library (guava-jdk5, guava-jdk7, …) or different usages (api, fat jar, …). And in practice, there’s no way to indicate that the dependency tree of a classifier is not the one of the project itself. In other words, the POM is broken, as it represents both how the component is built, and what artifacts it produces. Say that you want to produce 2 different jars: one classic jar, and one fat jar that bundles all dependencies. In practice Maven would consider that the 2 artifacts have equal dependency trees, even if it’s plain wrong! It’s super obvious in this case, but the situation is exactly the same with multi-release jars!

The solution is to handle variants properly. That’s what we call variant-aware dependency management, and Gradle knows how to do it. So far this feature has only been enabled for Android development, but we’re currently developing it for Java and native too!

Variant-aware dependency management is the idea that modules and artifacts are different beasts. With the same source files, you can target different runtimes, with different requirements. For the native world it has been obvious for years: we compile for i386 and amd64, and there’s no way you can mix the dependencies of an i386 library with the ones of amd64! Transposed to the Java world, it means that if you target Java 8, you should produce a Java 8 version of your jar, with classes targeting the Java 8 class format. This artifact would have metadata attached so that Java 8 consumers know what dependencies to use. And if you target Java 9, then the Java 9 dependencies would be selected. It’s as simple as that (well, in practice it’s not because the runtime is only one dimension of the variants, and you can combine multiple).

Of course, nobody has ever done this before because it’s complex to handle: Maven would for sure never let you do such a complex thing. But Gradle makes it possible. And the good news is that we’re also developing a new metadata format that will let consumers know which variant they should use. Simply said, the build tool needs to deal with the complexity of compiling, testing, packaging, but also consuming such modules. For example, say that you want to support Java 8 and Java 9 as runtimes. Then, ideally, you need to compile 2 versions of your library. Which means 2 different compilers (to avoid using the Java 9 APIs while targeting Java 8), 2 different class directories, and 2 different jars in the end. But also, you will probably want to test the 2 different runtimes. Or, you might want to build the 2 jars, but still want to test what the behavior of the Java 8 version is when executed on a Java 9 runtime (because it may happen in production!).

We’ve made significant progress towards modelling this, and even if we’re not ready yet, it explains why we are not so keen on using multi-release jars: while they fix a problem, they are fixing it the wrong way, and Maven Central is going to be bloated with libraries that do not declare their dependencies properly!

How to create a multi-release JAR with Gradle

It’s not ready so what should I do? The good news is that the path to generate correct artifacts is the same. Until this new feature is ready for the Java ecosystem, you have 2 different options:

do it the old way, using reflection or distinct jars.
use multi-release jars (being aware that you may be making the wrong decision here, even if there are good use cases)

Whichever solution you choose, the separate jars route or the multi-release jar route, both use the same setup. Multi-release jars are only the wrong (default) packaging: they should be an option, not a goal. Technically, the source layout is the same for both separate jars and multi-release jars. This repository explains how you can create a multi-release jar with Gradle, but here is how it works in a nutshell.

First, you must understand that we as developers often have a very bad habit: we tend to run Gradle (or Maven) using the same Java version as the one targeted by the artifacts we want to produce. Sometimes it’s even worse, when we use a more recent version to run Gradle, and compile using an older API level. But there’s no good reason to do this. Gradle supports cross-compilation. It allows you to explain where a JDK is found, and fork compilation to use this specific JDK to compile a component. A reasonable way to set up different JDKs is to configure the path to the JDKs through environment variables, which is what we are doing in this file. Then we only need to configure Gradle to use the appropriate JDK based on the source/target compatibility. It’s worth noting that starting from JDK 9, it’s no longer necessary to provide older JDKs to perform cross-compilation. A new javac option, --release, does exactly that. Gradle will recognize this option and configure the compiler accordingly.
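To see what the --release option buys you independently of any build tool, the following sketch drives javac through the standard javax.tools API and compiles the same source against two API levels with a single JDK. It assumes it runs on a JDK 9 or later that still supports targeting Java 8. The `ReleaseFlagDemo` class and the tiny `UsesJava9Api` source are hypothetical and only exist for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Sketch: checks whether a source file compiles against a given Java API level.
final class ReleaseFlagDemo {
    // Source that relies on an API introduced in Java 9.
    static final String SOURCE =
        "public class UsesJava9Api {\n"
      + "    static Object version() { return Runtime.version(); }\n"
      + "}\n";

    // Returns true when SOURCE compiles with "--release <release>".
    static boolean compiles(int release) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No compiler available: a JDK is required, not a JRE");
        }
        try {
            // Temporary directory holding both the source and the compiled classes.
            Path dir = Files.createTempDirectory("release-demo");
            Path source = dir.resolve("UsesJava9Api.java");
            Files.write(source, SOURCE.getBytes(StandardCharsets.UTF_8));
            // Capture javac diagnostics instead of printing them to the console.
            ByteArrayOutputStream diagnostics = new ByteArrayOutputStream();
            int exitCode = compiler.run(null, diagnostics, diagnostics,
                "--release", String.valueOf(release),
                "-d", dir.toString(),
                source.toString());
            return exitCode == 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("compiles with --release 8: " + compiles(8));
        System.out.println("compiles with --release 9: " + compiles(9));
    }
}
```

The important point is that the Java 8 compilation is rejected because Runtime.version() does not exist in the Java 8 API, even though the compiler itself comes from a more recent JDK. Setting only source and target levels would not catch this.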

The second key concept is the notion of source set. A source set represents a set of sources that are going to be compiled together. A jar is built from the result of the compilation of one or more source sets. For each source set, Gradle will automatically create a corresponding compile task that you can configure. This means that if we have sources for Java 8 and sources for Java 9, then they should live in separate source sets. That’s what we do by creating a Java 9 specific source set that will contain the specialized version of our class. This matches reality, and doesn’t force you to create a separate project like Maven would require. But more importantly, it allows us to precisely configure how this source set is going to compile.

Part of the challenge of multiple versions of a single class is that it’s very rare that such a class is totally independent from the rest of the code (it has dependencies on classes that are found in the main source set). For example, its API would use classes that don’t need to have Java 9 specific sources. Yet, you don’t want to recompile all those common classes, nor do you want to package Java 9 versions of all those classes. They are really shared and should stay separate. This is what this line is about: it will configure the dependency between the Java 9 source set and the main source set, making sure that when we compile the Java 9 specific version, all common classes are on the compile classpath.

The next step is really simple: we need to explain to Gradle that the main source set is going to target Java 8 language level, and that the Java 9 source set is going to target Java 9 language level.

All the steps we have described so far support both approaches described previously: publishing separate jars, or publishing a multi-release jar. Since this is the topic of this blog post, let’s see how we can now tell Gradle that we will only generate a multi-release jar:

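jar {
   // Package the Java 9 specific classes under the versioned directory
   into('META-INF/versions/9') {
      from sourceSets.java9.output
   }

   // Flag the jar as multi-release in its manifest
   manifest.attributes(
      'Multi-Release': 'true'
   )
}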

This configuration block does 2 separate things:

- bundle the Java 9 specific classes into the META-INF/versions/9 directory, which is expected in a MRJar
- add the multi-release flag to the manifest
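If you are curious about what this layout means at runtime, the sketch below builds a tiny jar with the same structure using the java.util.jar API and then reads one entry twice: once as a Java 8 runtime would see it, and once as a Java 9 runtime would. It requires JDK 9 or later to run. The `MultiReleaseJarDemo` class and the `greeting.txt` entry are hypothetical, and a text resource stands in for a class file to keep the example short.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipFile;

// Sketch: a jar with a default entry, a Java 9 specific copy of it, and the manifest flag.
final class MultiReleaseJarDemo {
    static File buildJar() {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().put(new Attributes.Name("Multi-Release"), "true");
        try {
            File jar = File.createTempFile("mrjar-demo", ".jar");
            jar.deleteOnExit();
            try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar.toPath()), manifest)) {
                addEntry(out, "greeting.txt", "default version");
                addEntry(out, "META-INF/versions/9/greeting.txt", "Java 9 version");
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void addEntry(JarOutputStream out, String name, String content) throws IOException {
        out.putNextEntry(new JarEntry(name));
        out.write(content.getBytes(StandardCharsets.UTF_8));
        out.closeEntry();
    }

    // Reads greeting.txt the way a runtime of the given version would resolve it.
    static String read(File jar, Runtime.Version version) {
        try (JarFile jarFile = new JarFile(jar, true, ZipFile.OPEN_READ, version);
             InputStream in = jarFile.getInputStream(jarFile.getJarEntry("greeting.txt"))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File jar = buildJar();
        System.out.println("Java 8 view: " + read(jar, JarFile.baseVersion()));
        System.out.println("Java 9 view: " + read(jar, Runtime.Version.parse("9")));
    }
}
```

The same entry name resolves to different content depending on the runtime version, and only when the content is read from a jar: a plain class directory has no such lookup.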

And that’s it, you’ve built your first MRJar! However we’re not done yet, unfortunately. If you are familiar with Gradle, you would know that if you apply the application plugin you can also run the application directly with a run task. However, because as usual Gradle tries to perform the minimal amount of work to do what you need, the run task is wired to use the class directories as well as the processed resources directories. And for multi-release jars, that’s a problem, because you need the jar now! So instead of relying on this plugin we have no choice but to create our own task, which is another reason not to use multi-release jars.

Last but not least, we said we probably also want to test the 2 versions of our class. For this, you have no choice but to use forked VMs, because there’s no equivalent to the --release flag for the Java runtime. The idea, here, is that you write a single unit test, but it’s going to be executed twice: once with Java 8, the other with Java 9 runtime. This is the only way to make sure that your substituted classes work properly. By default, Gradle only creates a single test task, and it will also use the class directories instead of the jar. So we need to do two things:

- create a Java 9 specific test task
- configure both test tasks so that they use the jar and specific Java runtimes

This can be achieved simply by doing this:

test {
   // Run against the packaged jar rather than the class directories
   dependsOn jar
   def jdkHome = System.getenv("JAVA_8")
   classpath = files(jar.archivePath, classpath) - sourceSets.main.output
   // Fork the test JVM from the Java 8 installation
   executable = file("$jdkHome/bin/java")
   doFirst {
       println "$name runs test using JDK 8"
   }
}

// Same tests, executed a second time on a Java 9 runtime
task testJava9(type: Test) {
   dependsOn jar
   def jdkHome = System.getenv("JAVA_9")
   classpath = files(jar.archivePath, classpath) - sourceSets.main.output
   executable = file("$jdkHome/bin/java")
   doFirst {
       println classpath.asPath
       println "$name runs test using JDK 9"
   }
}

check.dependsOn(testJava9)

Now if you run the check task, Gradle will compile each source set using the proper JDK, build a multi-release jar, then run unit tests using this jar on both JDKs. Future versions of Gradle will help you do this in a more declarative way.

Conclusion

In conclusion, we’ve seen that multi-release jars address a real problem that a significant number of library designers face. However, we think this is the wrong solution to the problem. Once you take into account correct modeling of dependencies, the coupling of artifacts and variants, and performance (the ability to execute more tasks concurrently), they are a poor man’s solution to a problem we are fixing the right way, using variant-aware dependency management. However, we recognize that for simple use cases, knowing that variant-aware dependency management for Java is not yet completed, it may be convenient to produce such a jar. In that case, and only in that case, this post should help you understand how you can do this, and how the philosophy of Gradle differs from Maven in this case (source set vs project).

Finally, we don’t deny that there are cases where multi-release jars do make sense: applications for which the runtime is not known in advance, for example, but those are exceptional and should be considered as such. Most issues are for library designers: we’ve covered common problems they face, and how multi-release JARs attempt to solve some of them. Modeling dependencies correctly as variants improves performance (via finer-grained parallelism) and reduces maintenance overhead (avoiding accidental complexity) over the use of multi-release JARs. Your situation may dictate that MRJARs be used; rest assured that they are still supported by Gradle. See this mrjar-gradle example project to try this today.