Re: Reproducible Builds for war files and Tomcat expectations

Re: Reproducible Builds for war files and Tomcat expectations

Hervé BOUTEMY
On Friday, 17 April 2020, 22:20:11 CEST, Michael Osipov wrote:

> On 2020-04-17 at 22:00, [hidden email] wrote:
> > Reproducible Builds is now implemented in many plugins: it's time to work
> > on reproducible war files.
> >
> > I created MWAR-432 issue and implemented classical Reproducible jar output
> > in corresponding branch.
> >
> > But in our discussion in November [1], an issue was reported about
> > unchanged timestamps in SNAPSHOT wars and the algorithm Tomcat uses to
> > detect changed content.
> If you are referring to Tomcat's ETag calculation, it uses both
> timestamp and file size. The weak ETag will change as soon as the file
> size changes. Of course, this is a problem: if the file content changes
> but the content length does not, the ETag won't change either.
>
> From a Tomcat perspective this does not matter because every deployment
> has its own class loader and in-memory cache.
>
> From a client's perspective it does matter: the client will receive a 304
> and read from cache although the file has been changed.
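
To make the ETag point concrete, here is a minimal Java sketch. It assumes a weak validator derived only from content length and last-modified time; the class name, method name, sample contents and timestamp value are all hypothetical, and this is not Tomcat's code.

```java
import java.nio.charset.StandardCharsets;

final class WeakETagSketch {

    // Assumption: the validator depends only on size and last-modified time.
    static String weakETag(long contentLength, long lastModifiedMillis) {
        return "W/\"" + contentLength + "-" + lastModifiedMillis + "\"";
    }

    public static void main(String[] args) {
        // Hypothetical fixed output timestamp shared by two SNAPSHOT builds.
        long fixedTimestamp = 1587153600000L;

        // Two versions of a static resource: different content, same length.
        byte[] before = "color: red;".getBytes(StandardCharsets.UTF_8);
        byte[] after = "color: tan;".getBytes(StandardCharsets.UTF_8);

        String etagBefore = weakETag(before.length, fixedTimestamp);
        String etagAfter = weakETag(after.length, fixedTimestamp);

        // Same size and same timestamp give the same validator, so a
        // conditional request could be answered with 304 for stale content.
        System.out.println(etagBefore + " vs " + etagAfter);
        System.out.println("validators equal: " + etagBefore.equals(etagAfter));
    }
}
```

With a per-build timestamp the second component would differ between builds, which is what the fixed timestamp takes away.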
>
> > Should we just disable reproducible wars for SNAPSHOTs and enable it only
> > for releases? Should we add a boolean option for people to decide whether
> > they want reproducible SNAPSHOTs?
> It should be a user choice.
>
> > Is there any idea on how we could manage output timestamp for SNAPSHOTs?
>
> You could do baselining:
>
> * All files which were last changed before the output timestamp will get
> the output timestamp
> * All other files will retain their timestamp.
>
> A changed file will be immediately reflected by its new timestamp.
That means "not reproducible". If we're at that level, let's just not apply
any timestamp calculation: it amounts to disabling reproducible output for
SNAPSHOTs.
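
A rough Java sketch of the baselining rule quoted above, as I read it, to show where reproducibility is lost. The class and method names and the sample dates are hypothetical.

```java
import java.time.Instant;

final class BaselineTimestampSketch {

    // Baselining rule: files last modified before the output timestamp get
    // the output timestamp; all other files keep their own timestamp.
    static Instant entryTimestamp(Instant lastModified, Instant outputTimestamp) {
        return lastModified.isBefore(outputTimestamp) ? outputTimestamp : lastModified;
    }

    public static void main(String[] args) {
        Instant outputTimestamp = Instant.parse("2020-04-17T20:00:00Z");

        // An untouched file is normalized to the baseline.
        Instant untouched = Instant.parse("2020-04-10T08:00:00Z");
        System.out.println(entryTimestamp(untouched, outputTimestamp));

        // The same changed file, generated at different times on two machines,
        // keeps its local modification time, so the two archives differ.
        Instant changedOnMachineA = Instant.parse("2020-04-18T09:00:00Z");
        Instant changedOnMachineB = Instant.parse("2020-04-18T09:05:00Z");
        System.out.println(entryTimestamp(changedOnMachineA, outputTimestamp));
        System.out.println(entryTimestamp(changedOnMachineB, outputTimestamp));
    }
}
```

Any entry newer than the baseline carries a build-local value, which is why I consider this equivalent to turning reproducible output off.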

>
> Please also ask on [hidden email], I might have missed other use cases.
good idea, thank you

>
> Michael
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]






Re: Reproducible Builds for war files and Tomcat expectations

Hervé BOUTEMY
On Saturday, 18 April 2020, 22:59:42 CEST, Romain Manni-Bucau wrote:

> > If you tell me every servlet container uses the timestamp of entries in
> > wars to change its behaviour, I'm ok and will stop talking about Tomcat
> > only, and talk more about "servlet containers" (or better, if you propose
> > another term). I can understand how this is useful for the client HTTP
> > cache of static content. If there is also such handling in a servlet
> > container for classes in a war, or jars in wars, I'm eager to learn (and,
> > for jars in wars, to learn whether only the .class timestamps inside the
> > jars are used, or also the timestamp of the jars in the war).
>
> Yes, all do, even java -jar mywebapp.jar.
> I also know several batch jobs or standalone apps that use it to log the
> deployed version (very useful in prod, where a date helps more than version
> numbers for CD).
ok
perhaps adding a paragraph or a Q/A to
https://maven.apache.org/guides/mini/guide-reproducible-builds.html
would be the best strategy, to describe the different ways of configuring
timestamps in archives, depending on usage

>
> > > Now, technically, a new phase in release plugin adding the timestamp
> > > (and eol?) in the pom in prepare phase works I think and is (almost)
> > > portable for the jar, war and ear packagings for users, no? I can
> > > envision an <enforceReproducible> in release plugin maybe.
> > >
> > > Am I missing something?
> >
> > I just don't understand how what you describe is different from what was
> > done in maven-release-plugin in MRELEASE-1029 [1].
> > And I don't understand what "enforceReproducible" can mean in release
> > plugin (enforcing is really hard, believe me: I still don't fully
> > understand why the maven-site-plugin 3.9.0 release was not reproducible,
> > because of a maven-invoker-plugin interaction during the reference release
> > build).
>
> Enforcing the timestamp and non portable params helps. It must be hardcoded
> in the pom of the tag imo, otherwise it wouldn't exist and be really
> reproduc*a*ble.
ok, I understand the part about checking for a fixed timestamp in the archive,
but not really what you call "non portable params".
And sorry, but even checking fixed timestamps won't always work: in the case of
maven-shade-plugin, the algorithm used when shading from libraries was not to
override the timestamp but to keep the original one.
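
As an illustration of what such a check could look like, here is a minimal sketch using only java.util.zip. The class name, helper names, entry names and timestamp values are hypothetical, and it is not what any Maven plugin does today: it just collects the distinct entry timestamps of an archive, so a shaded jar that kept original library timestamps would show more than one value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ArchiveTimestampCheckSketch {

    // Collects the distinct last-modification times of all entries.
    static Set<Long> distinctEntryTimes(InputStream archive) {
        Set<Long> times = new TreeSet<>();
        try (ZipInputStream zip = new ZipInputStream(archive)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                times.add(e.getTime());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return times;
    }

    // Test helper: builds an in-memory archive with one entry per given time.
    static byte[] zipOf(long... entryTimes) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (int i = 0; i < entryTimes.length; i++) {
                ZipEntry entry = new ZipEntry("entry-" + i + ".txt");
                entry.setTime(entryTimes[i]);
                zip.putNextEntry(entry);
                zip.write(("content " + i).getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        long fixed = 1587153600000L;             // hypothetical output timestamp
        long original = fixed - 86_400_000L;     // entry that kept an older time

        byte[] uniform = zipOf(fixed, fixed);
        byte[] mixed = zipOf(fixed, original);   // e.g. a shaded library entry

        System.out.println("uniform archive, distinct times: "
                + distinctEntryTimes(new ByteArrayInputStream(uniform)).size());
        System.out.println("mixed archive, distinct times: "
                + distinctEntryTimes(new ByteArrayInputStream(mixed)).size());
    }
}
```

A check like this would reject the mixed archive even though its content may be perfectly reproducible, which is the limitation I mean.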

>
> > thanks for the interest and strong feedback: it's hard but useful
> >
> > Regards,
> >
> > Hervé
> >
> > [1] https://maven.apache.org/guides/mini/guide-reproducible-builds.html
> >
> > On Saturday, 18 April 2020, 19:21:07 CEST, Romain Manni-Bucau wrote:
> > > Hervé, can you clarify why you mention Tomcat so strongly?
> > >
> > > Just off the top of my head: tomcat, undertow, vertx, netty, camel, all
> > > cxf transports (including standalone), talend sdk, wildfly, tomee,
> > > meecrowave, restlet, quarkus, helidon, jetty, resin, play, akka-http,
> > > xnio and many more are affected at war and jar levels, so why would
> > > tomcat be specific?
> > > It just impacts any application using timestamps of what is packaged.
> > >
> > > To rephrase it: all impls are quite equivalent.
> > >
> > > Now, technically, a new phase in release plugin adding the timestamp
> > > (and
> > > eol?) in the pom in prepare phase works I think and is (almost) portable
> > > for the jar, war and ear packagings for users, no? I can envision an
> > > <enforceReproducible> in release plugin maybe.
> > >
> > > Am I missing something?
> > >
> > > On Sat, 18 Apr 2020 at 15:44, Hervé BOUTEMY <[hidden email]> wrote:
> > > > SNAPSHOTs and releases are similar from certain points of view and
> > > > different from others: what interests me is that we vote on releases,
> > > > which are the only official states we publish (as a source-release.zip
> > > > tarball disconnected from scm), but we don't do it on SNAPSHOTs, which
> > > > are only intermediate states we only keep on scm.
> > > >
> > > > But I understand that from a Tomcat perspective, hosting SNAPSHOTs or
> > > > releases is quite similar.
> > > >
> > > > Perhaps this question of Reproducible Builds and the impact of fixed
> > > > timestamp in jar/wars would be something that would require some
> > > > Tomcat-specific documentation (that perhaps other servlet containers
> > > > don't implement the same way), that we could point to in
> > > > maven-war-plugin?
> > > >
> > > > We should probably switch this discussion to Tomcat ML, users or dev.
> > > >
> > > > Regards,
> > > >
> > > > Hervé
> > > >
> > > > On Saturday, 18 April 2020, 12:34:01 CEST, Romain Manni-Bucau wrote:
> > > > > Snapshots and releases are not different until you force a
> > > > > pre-release step to hardcode the timestamp in the pom, which would
> > > > > defeat the release process until it is done in the release plugin
> > > > > IMHO.
> > > > > Using scm meta is not that bad in that aspect.
> > > > >
> > > > > Side note: this is not a war plugin issue; it affects the jar plugin
> > > > > too, because jars regularly host web resources too (through standard
> > > > > servlet layout or not).
> > > > >
> > > > > So overall it sounds to me like a transversal archiver fix, not a
> > > > > per-type one.
> > > > >
> > > > > On Sat, 18 Apr 2020 at 12:14, Hervé BOUTEMY <[hidden email]> wrote:
> > > > > > keeping SNAPSHOTs reproducible with a calculated value for the
> > > > > > timestamp is an option that can be chosen by some people
> > > > > > assuming SNAPSHOTs are not reproducible seems a reasonable option:
> > > > > > the requirement for reproducible SNAPSHOTs is really different from
> > > > > > the requirement for releases
> > > > > >
> > > > > > IMHO, the misc strategies available for war files, against the way
> > > > > > some web servers use timestamp, will require dedicated
> > > > > > documentation in maven-war-plugin
> > > > > >
> > > > > > and one key war-specific feature: disable reproducible output for
> > > > > > SNAPSHOTs, which is one strategy that can be chosen (Jira issue yet
> > > > > > to be created)
> > > > > >
> > > > > > Will require help from others to write this doc before the
> > > > > > maven-war-plugin release.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Hervé
> > > > > >
> > > > > > On Saturday, 18 April 2020, 10:30:21 CEST, Romain Manni-Bucau wrote:

> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I got the same issue for my current work webapps (not at archive
> > > > > > > level but docker image level - I'm using jib and it enforces a
> > > > > > > "constant" timestamp).
> > > > > > >
> > > > > > > I solved it by using last commit timestamp as file timestamp.
> > > > > > > Indeed it can miss a "strict" cache hit but globally it is a good
> > > > > > > compromise and guess it can be reused for reproducible builds.
> > > > > > > In any case, a project using a scm will be reproducible only
> > > > > > > regarding a commit so it sounds like the least bad compromise.
> > > > > > > wdyt?
> > > > > > >
> > > > > > > Romain Manni-Bucau
> > > > > > > @rmannibucau <https://twitter.com/rmannibucau> | Blog
> > > > > > > <https://rmannibucau.metawerx.net/> | Old Blog
> > > > > > > <http://rmannibucau.wordpress.com> | Github
> > > > > > > <https://github.com/rmannibucau> | LinkedIn
> > > > > > > <https://www.linkedin.com/in/rmannibucau> | Book
> > > > > > > <https://www.packtpub.com/application-development/java-ee-8-high-performance>
> > > > > >




