“Batteries included” distribution using Maven shaded JAR’s


JAR’s that act like WAR’s?

One of the nice things about web application development on the JVM is the WAR file format.  With a WAR, libraries and other dependencies can be neatly packaged up with your code into a single file… ready for deployment.  You can deploy multiple applications to the same server, and each one basically gets its own clean and isolated classpath(*).

(*) Okay, you can still run into issues with the servlet engine’s common classpath.  But still, the JEE world has nowhere near the dependency-hell of Perl CGI back in the day, where all applications shared the same libraries.  It’s even better than the options offered by more popular scripting languages today, such as having to configure a Python virtualenv or Ruby RVM to give each application a clean environment.

However, what about non-web applications on the JVM (including Scala, Groovy, etc)?  Batch processing applications, desktop GUI applications, etc?

Traditionally, non-web applications are built as JAR files, containing only their own compiled code (and perhaps some static resources such as property files).  The batteries are “not included”, in terms of all your other library dependencies.  You must tell the JVM at runtime, rather than build time, what other libraries should be in the classpath:

java -classpath "MyApp.jar:dependencyA.jar:dependencyB.jar:dependencyC.jar" com.mypackage.MyApp

Kinda clunky, huh?  Of course, you would put this in a shell script (or Windows batch file)… but now you have yet another file in the mix, on top of all these multiple JAR files!  It’s a shame that all this couldn’t be bundled up, WAR file style, into a single executable JAR file that users could run with a double-click in most environments.

Eclipse method

These days, you can.  The Eclipse IDE, starting with the “Galileo” release, features “Runnable JAR file” as a project export option.  This wizard performs the basic task of populating “META-INF/MANIFEST.MF”, so that the correct “main” class will be called when the JAR is double-clicked or called from the shell without specifying a main class.

However, the much cooler role of this wizard is to bundle all of your library dependencies into a single monolithic JAR with your own code.  By selecting the “Extract required libraries into generated JAR” option in the second step, the wizard will basically:

  1. Unzip each of your dependency JAR’s
  2. Copy all their contents, along with your own compiled classes and “META-INF/MANIFEST.MF”, to a single location (hopefully seeing a useful error message if there is a collision!)
  3. Zip it all back up into a unified new JAR
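To make those three steps concrete, here is a rough Java sketch of the same idea.  It is not what Eclipse actually runs internally; the class name JarMerger and its “first copy wins” collision policy are assumptions made purely for illustration, and it needs Java 9 or later for InputStream.transferTo.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper: illustrates the unzip / copy / re-zip idea only.
public class JarMerger {

    /**
     * Copies every entry of each input JAR into a single output JAR.
     * When a name shows up more than once, the first copy is kept
     * (assumed policy) and the name is reported in the returned set.
     */
    public static Set<String> merge(List<Path> inputs, Path output) throws IOException {
        Set<String> seen = new HashSet<>();
        Set<String> collisions = new TreeSet<>();
        try (JarOutputStream jarOut = new JarOutputStream(Files.newOutputStream(output))) {
            for (Path input : inputs) {
                try (JarFile jar = new JarFile(input.toFile())) {
                    Enumeration<JarEntry> entries = jar.entries();
                    while (entries.hasMoreElements()) {
                        JarEntry entry = entries.nextElement();
                        String name = entry.getName();
                        if (!seen.add(name)) {
                            // Duplicate directories are harmless; duplicate files are not.
                            if (!entry.isDirectory()) {
                                collisions.add(name);
                            }
                            continue;
                        }
                        jarOut.putNextEntry(new JarEntry(name));
                        if (!entry.isDirectory()) {
                            try (InputStream in = jar.getInputStream(entry)) {
                                in.transferTo(jarOut);
                            }
                        }
                        jarOut.closeEntry();
                    }
                }
            }
        }
        return collisions;
    }
}
```

Listing your own application JAR first means its “META-INF/MANIFEST.MF” is the one that survives, and the returned set tells you which files were silently dropped.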

The result is a single executable JAR file, with batteries included.  All of your required library classes are present, without having to manually declare a classpath.

Maven method

Obviously, this approach is Eclipse-specific.  That may be fine for small projects, not intended to be shared with non-Eclipse developers or built on a continuous-integration server.  However, if you want to build a batteries-included JAR distribution outside of Eclipse, then you’ll want a standardized build system.  As luck would have it, the most widely adopted “enterprisey” build system(*) has a standard plugin for doing the same thing with much more flexibility.

(*) I know a lot of people hate Maven due to its steep learning curve.  Similar plugins may or may not exist for Buildr, Gradle, SBT, or other systems that haven’t been around long enough to show you their warts. However, while Maven might be superseded in the years ahead… as of 2012, I don’t know how you can develop in the JVM world without at least basic Maven literacy.  The principle here will probably translate elsewhere.

What Eclipse calls a “Runnable JAR file” is known in the Maven world as a “shaded” JAR.  Assuming that your project is structured as a Maven project, you would add a snippet like this underneath <project><build><plugins>:

...
<plugin>
	<groupId>org.apache.maven.plugins</groupId>
	<artifactId>maven-shade-plugin</artifactId>
	<version>1.7</version>

	<executions>
		<execution>
			<phase>package</phase>
			<goals>
				<goal>shade</goal>
			</goals>
			<configuration>
				<transformers>
					<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
						<mainClass>com.mypackage.MyApp</mainClass>
					</transformer>
				</transformers>
			</configuration>
		</execution>
	</executions>
</plugin>
...

Under the <configuration> section, note the <transformer> element whose “implementation” attribute names the “ManifestResourceTransformer” class.  This element expects a <mainClass> child element, which contains the fully-qualified name of the class that starts your application (i.e. the one with the “static void main” method).
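The end result of that transformer is a “Main-Class” entry in the JAR’s “META-INF/MANIFEST.MF”.  If you want to sanity-check a built JAR, a small sketch like the following reads that entry back using the standard java.util.jar API.  The class name MainClassCheck is just an illustrative assumption, not part of Maven or the shade plugin.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper: reports which class "java -jar" would launch.
public class MainClassCheck {

    /** Returns the Main-Class manifest attribute, or null if it is absent. */
    public static String mainClassOf(Path jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null; // JAR has no META-INF/MANIFEST.MF at all
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java MainClassCheck path/to/some.jar
        System.out.println(mainClassOf(Path.of(args[0])));
    }
}
```

For a JAR built with the snippet above, you would expect this to print the value you put in <mainClass>.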

The shade plugin is bound to the “package” phase of the Maven lifecycle.  So when you execute:

mvn package

… Maven will generate a monolithic “shaded” JAR under your project’s “/target” directory.  You will actually find two JAR files there:

original-MyApp-1.0.0.jar
MyApp-1.0.0.jar

As should be obvious from the file sizes, the one with “original” in the filename is the ordinary (non-“shaded”) JAR file that would be built in any normal Maven build.  The file without the “original” prefix is the “shaded” JAR, suitable for distribution as-is with all dependencies bundled.  You can run your application with a command like this:

java -jar MyApp-1.0.0.jar

… or even use a platform-native wrapper such as Launch4j to package up your app as an executable.

Advanced Cases

When you build a “shaded” JAR file… you are basically unzipping all your dependency JAR’s, copying their contents to the same directory root, and zipping that back up.  What about cases where more than one dependency contains a file with the same name?

Well, one of those will overwrite the other(s), which may not be what you want to happen.  In particular, I’ve discovered that this causes problems when you bundle up Apache CXF into a shaded JAR (see this StackOverflow question for more detail).

The Maven “shade” plugin comes with the concept of “transformers”, which enable you to merge conflicting files together in the monolithic JAR rather than having one overwrite the other. There are different types of transformers… with the most common being:

  • org.apache.maven.plugins.shade.resource.AppendingTransformer — simply appends one text file onto the end of another
  • org.apache.maven.plugins.shade.resource.XmlAppendingTransformer — appends XML together while keeping the format sane

Now when you change your pom.xml snippet to look like this:

...
<plugin>
	<groupId>org.apache.maven.plugins</groupId>
	<artifactId>maven-shade-plugin</artifactId>
	<version>1.7</version>

	<executions>
		<execution>
			<phase>package</phase>
			<goals>
				<goal>shade</goal>
			</goals>
			<configuration>
				<transformers>
					<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
						<mainClass>com.mypackage.MyApp</mainClass>
					</transformer>
					<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
						<resource>META-INF/spring.handlers</resource>
					</transformer>
					<transformer implementation="org.apache.maven.plugins.shade.resource.XmlAppendingTransformer">
						<resource>META-INF/extensions.xml</resource>
					</transformer>
				</transformers>
			</configuration>
		</execution>
	</executions>
</plugin>
...

… your “shaded” JAR file will contain a “META-INF/spring.handlers” file… made up of all the “META-INF/spring.handlers” files found in all your dependencies, appended together as one.
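If it helps to picture what “appended together” means, here is a simplified Java sketch of the idea.  It is not the plugin’s actual source; the class name ResourceAppender and the newline written after each file are assumptions for illustration (Java 9 or later, for InputStream.transferTo).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: mimics the effect of appending same-named resources.
public class ResourceAppender {

    /**
     * Concatenates the named resource from every JAR that contains it,
     * in the order the JARs are listed. JARs without the resource are skipped.
     */
    public static byte[] append(List<Path> jars, String resource) throws IOException {
        ByteArrayOutputStream combined = new ByteArrayOutputStream();
        for (Path jarPath : jars) {
            try (JarFile jar = new JarFile(jarPath.toFile())) {
                JarEntry entry = jar.getJarEntry(resource);
                if (entry == null) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    in.transferTo(combined);
                }
                // Assumed separator, so the last line of one file
                // does not fuse with the first line of the next.
                combined.write('\n');
            }
        }
        return combined.toByteArray();
    }
}
```

Calling append with your dependency JARs and “META-INF/spring.handlers” gives you roughly the content you should expect to see in the shaded JAR’s copy of that file.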

Likewise, you will have one composite “META-INF/extensions.xml”.  Maven will assume that all the versions it finds contain XML data, and will append their contents together in a way that hopefully keeps the XML layout sane.

Presto!  One monolithic artifact to deliver, with batteries included!