Talk: Bootstrapping the Java Ecosystem

Talk: Bootstrapping the Java Ecosystem

Björn Höfling
Dear Maven Developers,

More than 4 years ago, I naively asked you how to build Maven from
source without using Maven.

If you are interested in a declarative, bootstrappable, reproducible
and effectively executable answer to this question, Julien Lepiller
recorded a video on how he bootstrapped Maven and a maven-build-system
for GNU Guix using only the ant-build-system. He shows how to
bootstrap Maven purely from source, what difficulties he ran into, and
how he overcame the dependency cycles and other problems.

You can find a link to the video recordings in this announcement:

https://guix.gnu.org/en/blog/2020/online-guix-day-announce-2/

If you have any questions, you can join the discussions on Guix day;
the discussion for this talk will be on Sunday 2020-11-22, 16:00–16:25
UTC.

Happy Hacking,

Björn


Re: Talk: Bootstrapping the Java Ecosystem

Markus KARG-3
In fact, as a long-term user of Maven and Linux, I actually dislike that binary JARs are rebuilt from scratch, as this is not what WORA (write once, run anywhere) was invented for. It adds no benefit; it just adds new bugs that happen on one distro but not on another. If you ask me, stop doing that immediately and simply use the one original build from Apache. This is not C++, this is Java.

And if you want to build Maven from scratch, then build it using Maven. That is the intended way.

-Markus


-----Original Message-----
From: Tamás Cservenák [mailto:[hidden email]]
Sent: Thursday, 19 November 2020 09:51
To: Maven Developers List
Subject: Re: Talk: Bootstrapping the Java Ecosystem

Hi Björn and Emmanuel,

Without wanting to start any flame wars, I am really curious: why are
you repackaging Maven?

I'd understand it for OS/distro native packages, but
why do you rebuild JVM bytecode as well?

Again, I am not trying to start a flame war, I am just curious!

I have been a Linux user since '98 (I first worked with S.u.S.E at
university around '96), but since then I have used "dirty" distros
such as openSUSE, then Ubuntu, and today Mint.

While watching the video, several questions arose in my head:

1) Instead of building something once (let's say a LOC of 1x), you
built it several, if not ten, times (so a LOC of 10x). So I guess
you had to review the codebase 10x? Otherwise you cannot be
sure what you built, since I assume you build from source
precisely to be sure of, and able to see (among other things),
what you are building. How long did the review process take?

2) Similarly, in a process like this, how do you track
vulnerabilities or any other outstanding bugs? Or do the
"intermediate" bootstrapped dependencies simply "not
matter", and only the final "output" (let's say Maven)
matters?

3) What are you really building? In the video, it is said
several times that you "mutilate" some package to build
it, then use it to "bootstrap" some other package, and finally
rebuild the target package. Given that the process at one point
involved a "mutilated" tool, how can you be certain that the
output of the build is correct (I have no doubts about
reproducibility)? How do you prove that the output is what
it is thought/assumed to be?

4) (Joker) What is the overall CO2 footprint of distros like
these? I believe you did not use an Apple M1 for this work... :)


Thanks in advance,
T

On Thu, Nov 19, 2020 at 9:15 AM Emmanuel Bourg <[hidden email]> wrote:

> Hi Björn,
>
> Nice presentation. The packaging of Maven in Debian followed a similar
> path, but we never documented the process. Did you go as far as recording
> the exact steps and build order required to build from scratch?
>
> Spoiler for the next part of your quest toward packaging the Android
> SDK: Maven was the easy part; Gradle and Kotlin are many leagues above
> in terms of circular dependencies and headaches. We have been trying to
> bootstrap Kotlin for 2 years in Debian and haven't found the right path
> yet.
>
> Emmanuel Bourg

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Talk: Bootstrapping the Java Ecosystem

Emmanuel Bourg
In reply to this post by Björn Höfling
On 19/11/2020 09:51, Tamás Cservenák wrote:

> Without wanting to start any flame wars, I am really curious: why are
> you repackaging Maven?
>
> I'd understand it for OS/distro native packages, but
> why do you rebuild JVM bytecode as well?
>
> Again, I am not trying to start a flame war, I am just curious!

Short answer: why not? This is an Open Source project, not an Open
Binary project. Anyone should be able to rebuild the code and, in an
ideal world where every project is reproducible, get byte-identical
binaries.

Long answer: Debian, Fedora, and I assume Guix are "closed" ecosystems
where you can rebuild every component from sources without needing tools
or libraries outside of the distribution. If you were alone on a desert
island with just a laptop, the sources and no internet connection, you
would be able to rebuild any part of the distribution from scratch.

This really goes to the roots of the open source philosophy: open source
projects are meant to be built from sources, and if that is not possible
then there is a problem somewhere. Assuming every project becomes
reproducible at some point (see https://reproducible-builds.org for why
it matters), the question of who produced the binaries becomes
irrelevant, because everyone gets the exact same binaries.


> 3) What are you really building? In the video, it is said
> several times that you "mutilate" some package to build
> it, then use it to "bootstrap" some other package, and finally
> rebuild the target package. Given that the process at one point
> involved a "mutilated" tool, how can you be certain that the
> output of the build is correct (I have no doubts about
> reproducibility)? How do you prove that the output is what
> it is thought/assumed to be?

In Debian, the Maven package we rebuild from sources is itself used to
build all the other Maven-based projects packaged in Debian (that's over
600 projects currently), so regressions are caught pretty quickly (it's
rare, but it sometimes happens when binary compatibility is broken in
a core library like maven-shared-utils).


> 4) (Joker) What is the overall CO2 footprint of distros like
> these? I believe you did not use an Apple M1 for this work... :)

Probably a tiny fraction of what bitcoin mining, Travis CI and
YouTube/Netflix 4K videos generate ;)

Emmanuel Bourg


Re: Talk: Bootstrapping the Java Ecosystem

Tamás Cservenák
Maybe my last message was not clear: you have now "hopped on the
bandwagon" and "joined" the line of Maven releases. You have 3.6.3, the
same version the Maven devs use. So you will be able to use it to build
3.6.4, then use 3.6.4 to build 3.6.5, and so on...

Hence, no need to redo all this for EVERY Maven version, right?

T

On Fri, Nov 20, 2020 at 10:06 AM Tamás Cservenák <[hidden email]>
wrote:

> Thanks for the answers!
>
> AFAIK, we at Apache also "vote for source", while we provide binaries
> as well.
>
> Given that the video mentions that Maven `-sources` artifacts are NOT
> buildable (which is true, they are mainly used by IDEs to display library
> sources while debugging, for example), I am unsure -- at least for ASF
> artifacts -- why you do not use the source release bundles instead? For example:
> https://dist.apache.org/repos/dist/release/maven/maven-3/3.6.3/source/
>
> Also, according to your explanation, the problem is now solved once and
> for all, right? You (the distros you mention, like Guix) now have Maven
> 3.6.3 built, so you do not have to repeat this anymore?
>
> My point is that Maven devs also use Maven 3.6.3 currently, and that
> version will be used to build any future Maven release as well (i.e. 3.6.4
> or 4.0.0 and so on). So you just had to "hop on" the bandwagon and do this
> "dance" once, but from now on all this work can be scrapped, right?
>
> Thanks
> Tamas

Re: Talk: Bootstrapping the Java Ecosystem

Hervé BOUTEMY
In reply to this post by Björn Höfling
Wow, nice results: kudos to Julien.
I'll have a deep look, because (as I already discussed with Julien at some
Reproducible Builds events), if we combine his work on this bootstrapping
approach with my own Reproducible Builds for Java / Reproducible Central [1]
to get reproducible binary artifacts in Central Repository directly provided
by upstream projects, we should have the best of both worlds:
- a proof of rebuild capacity from source, which is important for deep safety,
but not easy for normal users
- a proof of content for reference binaries taken from Central Repository, to
keep binary consumption easy

What progress in one year. It's sad we can't meet and discuss as we did last
year.
I'll try to join Guix day.

Regards,

Hervé

[1] https://repo.maven.apache.org/maven2/
