Monday, February 13, 2012

Bundler + Maven for your JRuby projects!

I recently came across a blog post by Joe Kutner describing the first version of an integration between Bundler and Maven ( http://jpkutner.blogspot.com/2011/04/ease-of-bundler-power-of-maven.html ). Please take a look at his post for a complete understanding of what he did, but to summarize:

Gemfile Snippet:

gem "mvn:org.slf4j:slf4j-simple", "1.6.1"

Ruby File Snippet (outside a Rails console context):

require 'java'
require 'rubygems'
require 'bundler/setup'
require 'mvn:org.slf4j:slf4j-simple'
logger = org.slf4j.LoggerFactory.getLogger("world")
...

This looked really promising but there were a few things that didn't quite look right, namely this code line:
require 'mvn:org.slf4j:slf4j-simple'
It didn't look Ruby-ish to me. Also, since it's hard to create a file or folder with ":" in its name on most file systems, the gem's file name would have to differ from the require line, which seems to break convention. Reading further and looking at his modified Bundler source, I saw that it used the maven_gemify library that is included in JRuby (try require 'rubygems/maven_gemify' at the jirb prompt). It was really cool that something had already been written to integrate the two; however, on closer inspection, there were a few things I didn't like:
  • It used a custom-written, third-party Maven plugin to resolve and download dependencies, which is what Maven does out of the box.
  • It packaged the jars into the gem directly.
My second point could be somewhat moot, as I am relatively new to Maven and haven't worked on deploying a Java project that relies on it. My issue with packaging the jars directly into the gem is that in my development environment I have both a local Maven repository and a local Ruby gems repository. Rather than downloading a jar that may already exist in my Maven repo, why not have the Ruby gem simply point to (i.e. require) the jar located in my Maven repo?

Reading Joe Kutner's follow-up post (http://jpkutner.blogspot.com/2011/09/bundlermaven-workaround.html), I found that he had stopped development of his integration in favor of a workaround that, while I am sure it works, didn't quite satisfy my desire for a clean integration between Maven and Bundler. While writing this post, I also stumbled across http://codingiscoding.wordpress.com/2012/02/08/ruby-bundler-maven-gemfile-maven-plugin/ which seems to do the reverse of what I am proposing (i.e. calling Bundler from Maven).

Revamped maven_gemify library

The first thing I did was revamp the maven_gemify library to eliminate this third-party dependency and rely on the local Maven repository. I took a fairly brute-force approach: create a temporary pom.xml that defines the repositories and dependencies, then programmatically invoke Maven's dependency plugin to obtain the locations of the jars. Finally, the library writes a wrapper Ruby script that requires these jars and packages that file into a gem. When gemifying a Maven dependency, you specify the name as "mvn:<group id>:<artifact id>" and pass the version as the second argument. The corresponding Ruby gem name drops the "mvn:" prefix and replaces the characters "." and ":" with "_". For example, "mvn:org.apache.lucene:lucene-core" becomes "org_apache_lucene_lucene-core".
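To make those steps a bit more concrete, here is a minimal sketch of the three pure-Ruby pieces: deriving the gem name, building a temporary pom.xml, and generating the wrapper script. It is not the actual maven_gemify2 code. The method names (maven_to_gem_name, temp_pom_xml, wrapper_script) and the placeholder project coordinates in the pom are assumptions made up for illustration, and the step that actually invokes Maven's dependency plugin is left out.

```ruby
# Hypothetical helpers illustrating the gemify steps; not the real
# maven_gemify2 implementation.

# "mvn:org.apache.lucene:lucene-core" -> "org_apache_lucene_lucene-core"
def maven_to_gem_name(maven_name)
  # Drop the "mvn:" prefix, then turn every "." and ":" into "_".
  maven_name.sub(/\Amvn:/, "").tr(".:", "_")
end

# Build a throwaway pom.xml that declares a single dependency.
# The project coordinates (example.gemify / temp-gemify) are placeholders.
def temp_pom_xml(group_id, artifact_id, version)
  <<-XML
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>example.gemify</groupId>
  <artifactId>temp-gemify</artifactId>
  <version>0.0.0</version>
  <dependencies>
    <dependency>
      <groupId>#{group_id}</groupId>
      <artifactId>#{artifact_id}</artifactId>
      <version>#{version}</version>
    </dependency>
  </dependencies>
</project>
  XML
end

# Given jar locations already resolved by Maven (resolution itself is
# omitted here), emit a wrapper script that requires each jar in place
# instead of copying it into the gem.
def wrapper_script(jar_paths)
  lines = ["require 'java'"]
  jar_paths.each { |path| lines << "require #{path.inspect}" }
  lines.join("\n") + "\n"
end

puts maven_to_gem_name("mvn:org.apache.lucene:lucene-core")
```

The wrapper file would be saved under the derived gem name (for example org_apache_lucene_lucene-core.rb) so that the require line and the file name match.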

Bundler Integration

The hardest part of this was the Bundler integration; however, after a few iterations (and thanks to what Joe Kutner did in his attempts), I was able to isolate the changes to a few files: lib/bundler/dsl.rb and lib/bundler/source.rb, plus the new lib/bundler/maven_gemify2.rb. There are two ways to pull in Maven dependencies:
  1. gem "mvn:org.apache.lucene:lucene-core","3.5.0", :mvn=>"default"
  2. mvn "default" do
    gem "mvn:org.apache.lucene:lucene-core","3.5.0"
    end
The only difference between the two approaches is that #2 lets you specify several Maven dependencies from the same repository in one block. To avoid having to spell out the default Maven repository, the keyword "default" represents the standard Maven repo URL; otherwise, you can specify the URL of the repository.
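As a rough illustration of how the two forms can end up equivalent, here is a toy DSL that records the same repository for a dependency whether it is declared with the :mvn option or inside an mvn block. This is only a sketch and not the code in lib/bundler/dsl.rb. The class name GemfileSketch is made up, and the DEFAULT_REPO value is an assumption about what "default" expands to.

```ruby
# Toy stand-in for the Gemfile DSL; not Bundler's real implementation.
class GemfileSketch
  # Assumption: "default" expands to Maven Central.
  DEFAULT_REPO = "https://repo1.maven.org/maven2"

  attr_reader :dependencies

  def initialize
    @dependencies = []
    @current_repo = nil
  end

  # Form 1: gem "mvn:...", "version", :mvn => "default"
  # Inside an mvn block the repository comes from the enclosing block.
  def gem(name, version, options = {})
    repo = options[:mvn] || @current_repo
    repo = DEFAULT_REPO if repo == "default"
    @dependencies << { :name => name, :version => version, :mvn => repo }
  end

  # Form 2: mvn "default" do ... end
  def mvn(repo)
    previous = @current_repo
    @current_repo = repo
    yield
  ensure
    @current_repo = previous
  end
end

sketch = GemfileSketch.new
sketch.instance_eval do
  gem "mvn:org.apache.lucene:lucene-core", "3.5.0", :mvn => "default"
  mvn "default" do
    gem "mvn:org.slf4j:slf4j-simple", "1.6.1"
  end
end

sketch.dependencies.each { |dep| puts "#{dep[:name]} #{dep[:version]} from #{dep[:mvn]}" }
```

Both entries end up tagged with the same repository URL, which is why the block form is just a convenience for grouping.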

Note: Maven-based gems will be auto-required just as regular gems are upon Rails load or require 'bundler/setup'

Summary

Gemfile Snippet:

gem "mvn:org.apache.lucene:lucene-core", "3.5.0", :mvn=>"default"

Ruby File Snippet (outside a Rails console context):

require 'java'
require 'rubygems'
require 'bundler/setup'
require 'org_apache_lucene_lucene-core' #Notice the more "rubyish" way of requiring the gem. The directory name of the gem's contents is the same.
d=org.apache.lucene.store.SimpleFSDirectory.new(java.io.File.new("."))
...

Next Steps
  • Grab a copy of my modified Bundler (https://github.com/anithian/bundler) and give it a shot. Running rake install generates the gem so you can install it directly. I have some examples in the README file in addition to those above. The Maven DSL will only work with JRuby, but the modified Bundler should still behave properly without JRuby.
  • Testing of the integration with more cases. I couldn't get the rspecs to run properly on my system and couldn't tell if it was a JRuby thing or a Windows thing.
  • Validation of what I have proposed in both the integration and the revamped maven_gemify library. Some feedback on how others properly deploy Java applications using Maven would be helpful. Does the application's classpath point to jars in the Maven repo?
  • Right now, when re-executing Bundler on your project, something is amiss in the use of the lock file and the internal cache isn't checked for Maven dependencies so it'll go through the gemification process each time. Since my maven_gemify2 library relies on Maven's dependency plugin directly, the jars won't re-download (as it would have already downloaded to your repo) but still it would be good to make the re-execution behavior consistent.
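
For the last item, one possible direction is to check whether a gem with the derived name and version is already installed before going through gemification again. This is only a sketch and not something the fork does today. The helper name already_gemified? is hypothetical, and it assumes the Maven version is a plain string like "3.5.0" that RubyGems can parse as a version.

```ruby
require 'rubygems'

# Hypothetical guard: true if a gem with this name and exact version is
# already installed, in which case gemification could be skipped.
def already_gemified?(gem_name, version)
  !Gem::Specification.find_all_by_name(gem_name, version).empty?
end

if already_gemified?("org_apache_lucene_lucene-core", "3.5.0")
  puts "already installed, skipping gemification"
else
  puts "not installed, gemification needed"
end
```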