Re: [DISCUSS] configuration for Reproducible Builds


Hervé BOUTEMY
Very good question: this was a key question for me, and it led me to build the
PoC to test and see.
The PoC showed a basic fact: a POM inherits the value from its parent.

Then, once a parent POM has a "reproducibility" timestamp, child POMs inherit
the reproducible configuration and can optionally override the value.
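
The inherit-or-override behaviour can be modelled outside Maven with plain java.util.Properties defaults. This is only a sketch of the lookup semantics, not Maven's model code; the property name project.build.timestamp is one of the candidates floated in this thread, and the timestamp values are made up.

```java
import java.util.Properties;

public class InheritedTimestampSketch {
    // Hypothetical property name, borrowed from the naming ideas in this thread.
    static final String KEY = "project.build.timestamp";

    /** Creates a child property set that falls back to the parent's values. */
    static Properties childOf(Properties parent) {
        return new Properties(parent);
    }

    public static void main(String[] args) {
        Properties parent = new Properties();
        parent.setProperty(KEY, "2019-09-25T10:00:00Z"); // illustrative value

        // A child that defines nothing sees the parent's value.
        Properties inheriting = childOf(parent);

        // A child that defines its own value hides the parent's value.
        Properties overriding = childOf(parent);
        overriding.setProperty(KEY, "2019-09-29T12:00:00Z"); // illustrative value

        System.out.println("inheriting child: " + inheriting.getProperty(KEY));
        System.out.println("overriding child: " + overriding.getProperty(KEY));
    }
}
```

The parent keeps its own value in both cases, which matches the idea that only the module that overrides the property is affected.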

I imagine that we'll add a maven-release-plugin feature to update the value to
match the release timestamp.
But we can start by updating the value by hand if we want a value that is more
meaningful than the inherited one.

Regards,

Hervé

On Saturday, 28 September 2019, 22:52:04 CEST, Enrico Olivelli wrote:

> Hervé
> When will you set this value? During release:prepare and modify the pom?
>
> Enrico
>
> On Sat, 28 Sep 2019, 17:55, Hervé BOUTEMY <[hidden email]> wrote:
> > Achieving Reproducible Builds requires only one parameter: plugins that
> > create zip or tar archives need a fixed timestamp for entries.
> >
> > Putting that parameter in a POM property with a well-known name and value
> > format lets every packaging plugin share the configuration.
> > It also has the advantage that child POMs inherit the parent's value
> > and can optionally override it.
> >
> > The question is: *what property name and what value format should we
> > keep?*
> >
> > For the PoC, I chose to extrapolate from a convention of the Reproducible
> > Builds project, which is very Linux-oriented: the SOURCE_DATE_EPOCH
> > environment variable, which I turned into the source-date-epoch property
> > name, keeping the "date +%s" value format:
> > https://reproducible-builds.org/docs/source-date-epoch/
> >
> >
> > But I feel we can offer a more user-readable solution by choosing another
> > name and format, like "reproducible-build-timestamp" with an ISO-8601
> > combined date and time representation.
> >
> >
> > WDYT? Any other idea?
> >
> > Regards,
> >
> > Hervé
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [hidden email]
> > For additional commands, e-mail: [hidden email]






Re: [DISCUSS] configuration for Reproducible Builds

Enrico Olivelli
On Sun, 29 Sep 2019, 12:16, Robert Scholte <[hidden email]> wrote:

> I would think that all project.* properties represent the pom.xml and are
> immutable. To be more precise: the same pom.xml should effectively stay
> the same with every build.
> Instead, this seems more related to the (Maven) session, right?
>

If I understand correctly, the value of this property is to be committed in
the source code.
Some thoughts:
- it should default to 'now', sampled at the beginning of the session
- it can be overridden with a value written in the pom or in a source file
- it must be sampled and committed as the final value during a 'release process'
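
Read as a resolution order, these three points could be sketched roughly as below. This is a hypothetical helper, not an existing Maven API: the class and method names are invented, and the ISO-8601 value format is only one of the formats proposed in this thread.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Optional;

public class BuildTimestampResolver {

    /**
     * Returns the configured timestamp when one is present (from the pom or a
     * source file), otherwise the instant sampled at the start of the session.
     */
    static Instant resolve(Optional<String> configured, Instant sessionStart) {
        return configured
                .map(String::trim)
                .filter(value -> !value.isEmpty())
                .map(Instant::parse)          // expects ISO-8601, e.g. 2019-09-28T15:55:24Z
                .orElse(sessionStart);
    }

    /** Text a release step could write back into the pom as the final value. */
    static String freezeForRelease(Instant resolved) {
        return resolved.truncatedTo(ChronoUnit.SECONDS).toString();
    }

    public static void main(String[] args) {
        Instant sessionStart = Instant.now();

        // Nothing configured: fall back to 'now' for this session.
        System.out.println(resolve(Optional.empty(), sessionStart));

        // Value committed in the sources: it wins over the session start.
        System.out.println(resolve(Optional.of("2019-09-28T15:55:24Z"), sessionStart));

        // Release: sample once and commit the result.
        System.out.println(freezeForRelease(sessionStart));
    }
}
```

Only the second case gives the same answer on every build, so a build without a committed value would still not be reproducible.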


Enrico



> Robert
>
> On Sun, 29 Sep 2019 11:19:45 +0200, Hervé BOUTEMY <[hidden email]> wrote:
>
> > Regarding the property name, I had an idea:
> >
> > why not do what we already did for ${project.build.sourceEncoding}, i.e.
> > mimic a future element in pom.xml, in build?
> >
> > It could be project.build.timestamp?
> >

Re: [DISCUSS] configuration for Reproducible Builds

Hervé BOUTEMY
In reply to this post by Hervé BOUTEMY
On Sunday, 29 September 2019, 16:54:04 CEST, Emmanuel Bourg wrote:

> On 28/09/2019 at 17:55, Hervé BOUTEMY wrote:
> > Putting that parameter in a POM property with a well-known name and value
> > format lets every packaging plugin share the configuration.
> > It also has the advantage that child POMs inherit the parent's value
> > and can optionally override it.
>
> It seems a bit odd to me to tie the build timestamp to the pom. The fact
> that it could be inherited is disturbing, i.e. if you forget to override
> it in a subproject your build time will be wrong.
What does "wrong" or "right" mean?
I know of some reproducible-build solutions that blindly set zip entry
timestamps to 1970-01-01.
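
To make the effect of a fixed entry timestamp concrete, here is a minimal sketch using only java.util.zip. It is not what any Maven plugin does internally; the class name, entry name and timestamp are illustrative. Note that ZipEntry.setTime stores a DOS-style local time, so the comparison below is only meant for two runs in the same environment.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class FixedTimestampZip {

    /** Writes a one-entry zip whose entry carries the given modification time. */
    static byte[] zip(String name, String content, long entryTimeMillis) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            ZipEntry entry = new ZipEntry(name);
            entry.setTime(entryTimeMillis); // fixed value instead of "now"
            out.putNextEntry(entry);
            out.write(content.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        long fixed = Instant.parse("2019-09-28T15:55:24Z").toEpochMilli();

        byte[] first = zip("hello.txt", "hello", fixed);
        byte[] second = zip("hello.txt", "hello", fixed);
        byte[] later = zip("hello.txt", "hello", fixed + 60_000L);

        // Same content and same timestamp: the archives can be compared byte by byte.
        System.out.println("same timestamp, same bytes: " + Arrays.equals(first, second));
        // Same content, different timestamp: the archive bytes differ.
        System.out.println("later timestamp, same bytes: " + Arrays.equals(first, later));
    }
}
```

Whether the fixed value is 1970-01-01 or a release date does not matter for the comparison; what matters is that every rebuild uses the same value.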

>
> > WDYT? Any other idea?
>
> I thought the timestamp would rather go to a separate file deployed
> along the pom and capturing the build environment. What Maven needs then
> is a command line parameter to force a specific build time (and/or
> support for the SOURCE_DATE_EPOCH environment variable).
A separate file? Oh no!!!
Please try the Maven core "reproducible" branch and you'll see Maven-native
Reproducible Builds in action:
https://github.com/apache/maven/tree/reproducible
The timestamp was defined 4 days ago, and will remain defined for as long as
nobody changes it in git.

Regards,

Hervé

>
> Emmanuel Bourg
>






Re: [DISCUSS] configuration for Reproducible Builds

Hunter C Payne
In reply to this post by Hervé BOUTEMY
What if that timestamp were based on the SCM's last commit timestamp instead of
the time of the build?

Hunter

On Sunday, September 29, 2019, 10:25:41 AM PDT, Tibor Digana <[hidden email]> wrote:

Can somebody explain a realistic USE CASE where you trigger two consecutive
builds (with no changes in sources) and expect identical MD5s for the build
artifacts (JAR)?
This can be achieved only when the "timeStamp" property is fixed, unmodified,
in the POM.
Does it make sense to call it timeStamp if it is not related to the build time
at all, but to some other kind of time? I don't know which one, because time
keeps advancing and does not stop.
It is also not very practical to control, because you as the USER have to
correlate the MD5 with the algorithm that produces the timeStamp and its
changes.
If no MD5 has changed, the timeStamp would not change either. And here is
the problem: what if this rule is broken? And another problem: how do you want
to ensure that the timeStamp changes if and only if changes have been made?
This is somehow connected to the previous discussion about the
Incremental Build.

On Sun, Sep 29, 2019 at 5:55 PM Romain Manni-Bucau <[hidden email]>
wrote:

> Hi all,
>
> Wonder if it can't "just" (this is not a small task but in terms of design
> it is small ;)) be a flag on higher level archiver plugins
> (maven-jar-plugin being the first one we'll all have in mind).
> I take as a reference jib here which takes into account a creation time for
> that case (
>
> https://github.com/GoogleContainerTools/jib/blob/master/jib-maven-plugin/src/main/java/com/google/cloud/tools/jib/maven/JibPluginConfiguration.java#L190
> ).
> Long story short, they have ~3 modes:
>
> 1. set epoch + 1s (there are issues with setting epoch directly)
> 2. set a constant time configured by the user
> 3. respect the file time (not reproducible, but it lets you disable the
>    feature)
>
> In the end it means we don't need a project.build.* property; we just need
> to enrich plugins (maybe starting with the jar one) to handle that.
>
> I also wonder if I'm too biased on the topic, but if I had to work on
> it now with our current ecosystem, I would "just" (again ;)) use
> maven-shade-plugin and a set of transformers to handle all files that can
> have non-deterministic changes.
> This gives us the feature immediately without anything specific in
> maven core and even handles files generated by external plugins through
> external transformers - a real reproducible build feature would need this
> extension anyway, think about frontend resources included in
> META-INF/resources for example.
> The only missing piece in the shade plugin is the Jar reproducibility
> handling, but this is likely very doable since we already have a
> JarOutputStream impl at apache which can host it IMHO.
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github <
> https://github.com/rmannibucau> |
> LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
> <
> https://www.packtpub.com/application-development/java-ee-8-high-performance
> >
>
>
> On Sun, 29 Sep 2019 at 17:45, Hervé BOUTEMY <[hidden email]> wrote:
>
> > On Sunday, 29 September 2019, 12:29:47 CEST, Karl Heinz Marbaise wrote:
> > > Hi Hervé,
> > >
> > > On 29.09.19 11:19, Hervé BOUTEMY wrote:
> > > > Regarding the property name, I had an idea:
> > > >
> > > > why not do what we already did for ${project.build.sourceEncoding},
> > > > i.e. mimic a future element in pom.xml, in build?
> > > >
> > > > It could be project.build.timestamp?
> > >
> > > This sounds like the best idea...
> > >
> > >
> > > This would mean to define something like this:
> > >
> > > <properties>
> > >    <project.build.timestamp>..</project.build.timestamp>
> > > </properties>
> > >
> > > But now some questions come up:
> > >
> > > * Is that the real value to be used?
> > > * Or should it activate the mechanism? (boolean?)
> > We can define both a boolean and a timestamp,
> > but the timestamp de facto also acts as a boolean: defined means true,
> > undefined means false.
> >
> > >
> > > <properties>
> > >    <project.build.timestamp.usage>true</project.build.timestamp.usage>
> > > </properties>
> > >
> > >
> > > * Or should we use it by default, giving the user the opportunity
> > >    to override the current timestamp with a fixed timestamp for building?
> > >    This means we would define only the real time to be used during
> > >    building. No need for any kind of activation etc.
> > >    So you could call Maven via:
> > >
> > >    mvn -Dproject.build.timestamp=... package
> > >
> > >
> > > * Or do we need a combination of the above?
> > >
> > >    First activate, then define the format and the timestamp to be used.
> > >
> > >
> > > Furthermore, do we also need to define a format, which could look like
> > > this:
> > >
> > > <properties>
> > >    <project.build.timestamp.usage>true</project.build.timestamp.usage>
> > >    <project.build.timestamp>..</project.build.timestamp>
> > >    <project.build.timestamp.format>ISO-8601</project.build.timestamp.format>
> > > </properties>
> > Leaving the format as a third parameter is of course feasible, but it adds
> > complexity: is it really necessary? Isn't ISO-8601 sufficient for you?
> >
> > >
> > >
> > > Kind regards
> > > Karl Heinz Marbaise
> > >

Re: [DISCUSS] configuration for Reproducible Builds

Tibor Digana-3
Let's not talk about technologies but about the use case, because the
use case will uncover the purpose behind this request.
Romain, that is why I was not initially talking about a concrete SCM, but
Harald was, and it means the SCM can be one of the implementations
(strategies). Then we will be closer to the incremental build...

The problem is that "buildTimeStamp" means one thing to you and
something else to others, e.g. build-helper:timestamp-property.
Therefore it's better to list all use cases, and then we will see whether we
really need a build timestamp or another type of timestamp, and what the
guarantees are. Or is this solution only glue code between the POM and the
user? Just a question. The use cases would uncover all practices with this
property, and whether it is realistic and reliable.

T

On Sun, Sep 29, 2019 at 8:07 PM Romain Manni-Bucau <[hidden email]>
wrote:

> SCM does not work because one common use case is to rebuild the same
> artifacts from source (Debian rebuilds from source AFAIK, even Java apps).
> Since an SCM can be "proxied", copied, etc., the source repository can
> differ and therefore the commits can be different while the content is the
> same; this is why jib uses epoch+1s to enforce reproducibility.
> That said, once you have the timestamp the code is the same, so let's not
> block on that; worst case, we would allow plugging in a value resolver with
> a few default strategies.
> This is not the central part of the feature IMHO.
>
> Romain Manni-Bucau
>
>
> On Sun, 29 Sep 2019 at 19:58, Tibor Digana <[hidden email]> wrote:
>
> > Yes Hunter, exactly, this was one possibility.
> > The names of the property can be just like the HTTP headers:
> > Last-Modified
> > If-Modified-Since -> maybe here can be also the commit hash, not only
> > the time in millis/UTC
> > ETag
> >
> > and then every module may have a different value ;-) and then the SCM...
> > has to resolve "If-Modified-Since" to the TIME.
> > The missing "..." is some layer useful in the incremental build too.
> >
>


--
Cheers
Tibor

Re: [DISCUSS] configuration for Reproducible Builds

Emmanuel Bourg
In reply to this post by Hervé BOUTEMY
On 29/09/2019 at 20:46, Hervé BOUTEMY wrote:

> This is exactly how I see Reproducible Builds for the future:
> - select versions of plugins that bring reproducible output
> - either inherit or define a local timestamp
>
> et voilà, it's so easy (once plugins support)...

How do you plan to capture the elements of the build environment
necessary to build identical artifacts (JDK used, command line
parameters)? As project properties too?

Emmanuel Bourg


Re: [DISCUSS] configuration for Reproducible Builds

Hervé BOUTEMY
On Sunday, 29 September 2019, 21:31:58 CEST, Emmanuel Bourg wrote:

> On 29/09/2019 at 20:46, Hervé BOUTEMY wrote:
> > This is exactly how I see Reproducible Builds for the future:
> > - select versions of plugins that bring reproducible output
> > - either inherit or define a local timestamp
> >
> > et voilà, it's so easy (once plugins support)...
>
> How do you plan to capture the elements of the build environment
> necessary to build identical artifacts (JDK used, command line
> parameters)? As project properties too?
Currently, the JDK version is recorded by m-jar-p in the MANIFEST, and it's
easy to tell whether the build ran on Windows or not; so for people who know
what command to launch, it won't be hard.
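
As an illustration of reading that information back, here is a small sketch with java.util.jar.Manifest. The attribute name Build-Jdk and the sample value are assumptions for the example; the exact attribute depends on the maven-jar-plugin / maven-archiver version that produced the jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

public class ManifestJdkReader {

    /** Returns a main-section attribute of the given manifest text, or null if absent. */
    static String mainAttribute(String manifestText, String name) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up manifest content; a real one would come from META-INF/MANIFEST.MF.
        String text = "Manifest-Version: 1.0\r\n"
                + "Build-Jdk: 1.8.0_222\r\n"
                + "\r\n";

        // "Build-Jdk" is an assumed attribute name, see the note above.
        System.out.println("JDK recorded at build time: " + mainAttribute(text, "Build-Jdk"));
    }
}
```

For a jar on disk, java.util.jar.JarFile.getManifest() gives the same Manifest object without parsing the text by hand.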

For the wider case, where we'd like to rebuild anything that we don't precisely
know, even with build tools other than Maven, we wrote a proposal 1 year ago:
https://reproducible-builds.org/docs/jvm/

I stopped working on the proposal, since it was not really useful until we
were able to do reproducible builds. In a few weeks, once Maven can do
Reproducible Builds quite easily, we can continue working on this proposal,
with a chance that people will provide feedback based on real experience.

Regards,

Hervé

>
> Emmanuel Bourg
>