Shading all modules in a project


Nick Dimiduk
Hello,

Over in Apache HBase, we have an interesting problem that I'm hoping the
shade plugin can help us solve. It all started with a performance
enhancement to reduce garbage by creating a subclass of a protobuf type
[0]. It works fine for us most of the time. However, when this code is used
from a Hadoop MapReduce job, things get strange. Due to the access
permissions on the parent class and the way Hadoop uses "fat" jars, we now
have classloader troubles [1]. We were able to work around the issue by
documenting the required classpath construction. However, this workaround
is not possible for all users. This is where the shade plugin comes in.

I gave shade a shot [2], though I admit I'm not terribly familiar with it
or with the inner workings of Maven. As far as I can tell, the trouble is
that the plugin invocation is limited to a single module in our multi-module
project. I think shade is doing what I want for the hbase-protocol module,
but I also need it to rewrite the bytecode of the modules that consume that
module. Luckily, this detail is not part of our public API, so the change
should be transparent to our users.
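
For readers unfamiliar with the mechanism, here is a minimal sketch of a
shade relocation as it might appear in one module's pom.xml. It is not the
configuration from the patch on [2]. The relocated package
(com.google.protobuf), the included artifact, and the target package
(org.example.shaded.protobuf) are assumptions chosen for illustration, and
the plugin version is assumed to be managed elsewhere.

```xml
<!-- Illustrative sketch only; package and artifact names are assumptions. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- Run the shade goal when the module's jar is packaged. -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <!-- Bundle only this dependency into the shaded jar. -->
            <include>com.google.protobuf:protobuf-java</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <!-- Move the package and rewrite references to it. -->
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>org.example.shaded.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites references only in the classes that end up in the jar
produced by this execution. Classes compiled in other modules keep their
original references unless those modules apply the same relocation
themselves, which is the limitation described above.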

My patch is posted on [2]. Any pointers or advice would be greatly appreciated.

Thanks in advance,
Nick

[0]: https://issues.apache.org/jira/browse/HBASE-9867
[1]: https://issues.apache.org/jira/browse/HBASE-10304
[2]: https://issues.apache.org/jira/browse/HBASE-11118

Re: Shading all modules in a project

Nick Dimiduk
Ping. Any advice on this?



Re: Shading all modules in a project

james northrup
In recent years I have had better luck using maven-dependency-plugin to
create a classpath pointing at the ~/.m2 artifacts than with uberjar, shade,
or assembly. Not every production app can build a Maven project in place in
order to run, but where it can, a) startup can be faster by using the test
"target/" output to skip the compile on rerun, and b) dependency version
collisions and mismatches are easier to trace using that classpath.
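
A minimal sketch of that approach, assuming the build-classpath goal is the
one meant here (the post does not name a goal). The execution id, the phase,
and the output file name are illustrative choices, not taken from the thread.

```xml
<!-- Illustrative sketch only; id, phase, and file name are assumptions. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>write-classpath</id>
      <phase>generate-resources</phase>
      <goals>
        <!-- Resolves dependencies and emits them as a classpath string. -->
        <goal>build-classpath</goal>
      </goals>
      <configuration>
        <!-- Write the classpath to a file instead of the build log. -->
        <outputFile>${project.build.directory}/classpath.txt</outputFile>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The resulting file lists the resolved jars in the local repository, so each
dependency version on the classpath stays visible as a separate entry rather
than being merged into one archive.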


--
Jim Northrup  *  (408) 837-2270 *

Re: Shading all modules in a project

Nick Dimiduk
In general, I agree with you regarding the perils of this "fat jar"
approach. However, for legacy reasons, this is a feature that we must
support. Given this context, is there any advice on how to use the
shade plugin to rewrite all references to the moved class in dependent
submodules? Is this a feature the shade plugin supports? Assuming this is a
reasonable use of the shade plugin, we are willing to contribute a
patch.

Thank you for the replies.
Nick

On Wed, May 21, 2014 at 10:44 AM, Martin Gainty <[hidden email]> wrote:

> > > from a Hadoop mapreduce job, things get strange. Due to the
> > > permissions on the parent object and the way Hadoop uses "fat" jars
> MG> An uberjar (also called KitchenSinkJar, FatJar, or SealedJar) forces
> MG> the situation that whenever any of the dependencies you pack into the
> MG> fat jar changes, you need to trigger a repackage of the fat jar.
>
> > > project. I think what's happening is shade is doing what I want for the
> > > hbase-protocol module
> MG> Yes, but every time the original (non-shaded) project you are shading
> MG> is updated, you will need a mechanism to propagate that update to your
> MG> shaded jar (to grab those deltas).
>
> > > My patch is posted on [2]. Any pointers/advice is greatly appreciated.
> MG> Breaking the fat jars into discrete, individual Maven artifacts is what
> MG> I would suggest.

Re: Shading all modules in a project

james northrup
Anecdotally, I can only suggest that where assembly and shade might fail,
ProGuard might be slightly smarter, if scarier.


--
Jim Northrup  *  (408) 837-2270 *