[Maven shade] Problems with dependencies in multi module project...

Niels Basjes
Hi,

In many of my projects I create the functionality and the wrappers for
various processing systems in separate modules within the same project.
So I have a multi-module project (for example
https://github.com/nielsbasjes/yauaa) that effectively consists of
1) a module with a library that holds "the functionality" (in this case
'analyzer')
2) a set of modules, each with a different framework-specific "UDF" wrapper
(Hadoop, Pig, Hive, Nifi, Flink, etc.)

The library has dependencies on things like Guava, Antlr and Spring which
are common causes of versioning problems within downstream projects that
want to use this.

So in the library I use Maven Shade to include these dependencies and
relocate them into a different package (in this case under the prefix
nl.basjes.shaded.) to "get them out of the way".

I recently got a bug report that these dependencies are apparently included
TWICE in the Hive UDF and therefore still cause problems: once relocated
and once via a transitive dependency.

Apparently the dependency-reduced-pom.xml (which has the right
dependencies) is not included in the shaded jar. This is a known problem:
https://issues.apache.org/jira/browse/MSHADE-36

As a quick workaround I created a shell script that replaces the pom.xml in
the jar file with the dependency-reduced-pom.xml.
So the jar of the library seems good now.
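
A minimal sketch of what such a script could look like (not the exact script used here). It assumes the embedded pom sits at the standard Maven location META-INF/maven/<groupId>/<artifactId>/pom.xml inside the jar and that the Info-ZIP `zip` tool is installed. The function names `pom_entry_path` and `replace_pom_in_jar` are made up for this illustration.

```shell
#!/bin/sh
# Sketch only: swap the pom.xml embedded in a jar for the dependency-reduced one.
# Assumption: the jar uses the standard Maven layout
#   META-INF/maven/<groupId>/<artifactId>/pom.xml
# Assumption: the Info-ZIP `zip` command is available.

# Print the path of the embedded pom.xml for a given groupId and artifactId.
pom_entry_path() {
    printf 'META-INF/maven/%s/%s/pom.xml\n' "$1" "$2"
}

# Usage: replace_pom_in_jar <jar> <dependency-reduced-pom.xml> <groupId> <artifactId>
replace_pom_in_jar() {
    # Resolve the jar to an absolute path because we change directory below.
    jar_file=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    entry=$(pom_entry_path "$3" "$4")

    # Recreate the entry's directory structure in a temporary staging area.
    staging=$(mktemp -d) || return 1
    mkdir -p "$staging/$(dirname "$entry")"
    cp "$2" "$staging/$entry"

    # zip adds the file, replacing an existing entry with the same name.
    (cd "$staging" && zip -q "$jar_file" "$entry")
    status=$?

    rm -rf "$staging"
    return $status
}
```

It would be called with the library jar, the generated dependency-reduced-pom.xml, and the module's own groupId and artifactId, after the shade step has run.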

However ...

I found that if I build the project from the root (i.e. mvn clean install),
the library gets the expected dependencies in shaded form:

    $ unzip -l analyzer/target/yauaa-5.12-SNAPSHOT.jar | fgrep org/springframework/core/io/ResourceLoader.class
          494  2019-08-23 12:26   nl/basjes/shaded/org/springframework/core/io/ResourceLoader.class

But in the Hive UDF I see this:

    $ unzip -l udfs/hive/target/yauaa-hive-5.12-SNAPSHOT-udf.jar | fgrep org/springframework/core/io/ResourceLoader.class
          494  2019-08-23 12:26   nl/basjes/shaded/org/springframework/core/io/ResourceLoader.class
          487  2019-02-13 05:32   org/springframework/core/io/ResourceLoader.class
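
To check a whole jar rather than one class at a time, the same idea can be generalised. The following is only a sketch: it reads an `unzip -l` listing on stdin and prints every class that shows up both under the relocation prefix and at its original path. The function name `find_unrelocated_duplicates` is made up for this illustration, and it assumes entry names contain no spaces.

```shell
#!/bin/sh
# Sketch only: report classes that are present both relocated and unrelocated.
# Usage: unzip -l some.jar | find_unrelocated_duplicates nl/basjes/shaded/
find_unrelocated_duplicates() {
    awk -v prefix="$1" '
        # In an `unzip -l` listing the entry name is the last field of the line.
        $NF ~ /\.class$/ {
            name = $NF
            if (index(name, prefix) == 1) {
                # Remember the relocated class under its original path.
                shaded[substr(name, length(prefix) + 1)] = 1
            } else {
                plain[name] = 1
            }
        }
        END {
            # A class is a duplicate if both variants were seen.
            for (name in plain) {
                if (name in shaded) print name
            }
        }
    ' | sort
}
```

An empty result means no class was found in both forms. Any output lists the original paths that were also bundled unrelocated.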

Apparently the multi module build looks at the dependencies defined in the
library pom.xml instead of the pom.xml that is in the jar file (or even the
dependency-reduced-pom.xml).

I verified this by first doing a mvn install (i.e. the final library jar
goes into ~/.m2), then going into the subdirectory of the UDF and doing a
mvn package there. With that I get this:

    $ unzip -l target/yauaa-hive-5.12-SNAPSHOT-udf.jar | fgrep org/springframework/core/io/ResourceLoader.class
          494  2019-08-23 12:26   nl/basjes/shaded/org/springframework/core/io/ResourceLoader.class


I have done experiments with marking these dependencies with various scopes
and even with optional.
All of these had various unwanted effects (in one case the build succeeded
but IntelliJ gave a lot of errors).

My question to you: What is the correct pattern to handle this?

--
Best regards / Met vriendelijke groeten,

Niels Basjes