Re: [VOTE] Maven incremental build for BIG-sized projects with local and remote caching

Romain Manni-Bucau
There are several possible kinds of incremental support:

1. SCM-related: run a status and rebuild the downstream reactor.
2. Full and module build graph: this seems to be the one you target, i.e.
bypass modules without changes. Note that it only works if the upstream
graph is taken into account.
3. Full build: each mojo has incremental support, so the full build gets it.
The issue is that each mojo must either know whether it needs to be executed
or give the mojo executor enough information to decide (Gradle requires all
inputs/outputs to be declared to infer this state, which is still just a
heuristic and not 100% reliable).

In the current state, 2 sounds like a good option, since 3 can require a lot
of work for external plugins (today's builds use many more third-party
plugins than core Maven plugins).
Now, we should be able to activate it or not, so having a cacheLocation
config in settings.xml could be good.
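
To make option 2 concrete, here is a minimal sketch of how the set of modules to rebuild could be derived from a module dependency graph. It is only an illustration in Python, not Maven code: the function name, the graph representation and the sample reactor are all assumptions made for this example.

```python
from collections import deque


def modules_to_rebuild(upstream, changed):
    """Return the changed modules plus every module downstream of them.

    `upstream` maps each module to the list of modules it depends on
    (a hypothetical stand-in for the reactor graph).
    """
    # Invert the graph: module -> modules that depend on it.
    downstream = {module: set() for module in upstream}
    for module, deps in upstream.items():
        for dep in deps:
            downstream.setdefault(dep, set()).add(module)

    to_rebuild = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in downstream.get(current, ()):
            if dependent not in to_rebuild:
                to_rebuild.add(dependent)
                queue.append(dependent)
    return to_rebuild


# Hypothetical reactor: app -> service -> core, and docs stands alone.
reactor = {"core": [], "service": ["core"], "app": ["service"], "docs": []}
print(sorted(modules_to_rebuild(reactor, ["service"])))  # ['app', 'service']
```

Everything outside the returned set is a candidate for bypassing, which is exactly where the side notes below start to matter.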

Side notes:

1. Having it on by default will break builds: the reactor is deterministic,
and bypassing a module can break a build because that module may, for
example, initialize Maven properties used by later modules.
2. You can't find all input/output paths from the POM in general, so your
algorithm is not generic; a meta config may be needed in .mvn.
3. We should let a mojo disable it and replace the default logic with its own
(Surefire is a good example where it must be refined, and it can save hours
there ;)).
4. Let's try to implement it as a Maven extension first; then, if it works
well on multiple big projects, move it to core?

Romain



On Fri, Sep 13, 2019 at 23:18, Tibor Digana <[hidden email]> wrote:

> In theory, the incremental compiler would make it faster.
> But that can only be claimed if you present a demo project which has trivial
> tests taking much less time to complete than the compiler.
>
> In reality, the tests in huge projects take significantly longer than
> the compiler.
> Some developers say "switch off all the tests" in the release phase, but
> that's wrong because then quality goes down and methodologies are
> broken.
>
> I see a big problem in that we do not have an interface between Surefire
> and the Compiler plugin for negotiating which tests have been modified,
> including modules and classes in the entire structure.
>
> Having an incremental compiler is easy: just use compiler:3.8.1 or the
> Takari compiler.
> But IMO the biggest performance benefit would come from having a truly
> incremental test executor.
>
> On Fri, Sep 13, 2019 at 10:46 PM Maximilian Novikov <
> [hidden email]> wrote:
>
> > Hi All,
> >
> >
> >
> > *We want to create an upstream change to Maven* to support true
> > incremental builds for big-sized projects.
> >
> > To raise a pull request we have to pass a long chain of Deutsche Bank’s
> > internal procedures. So, *before starting the process we would like to
> > get your feedback regarding this feature*.
> >
> >
> >
> > *Motivation:*
> >
> >
> >
> > Our project is hosted in a mono-repo and contains ~600 modules. All modules
> > have the same SNAPSHOT version.
> >
> > There is a lot of test automation around this; everything is tested before
> > merging into the release branch.
> >
> > The current setup helps us simplify build/release/dependency management
> > for the 10+ teams that contribute to the codebase. We can release
> > everything in 1 click.
> >
> > The major drawback of this approach is build time: *a full local build took
> > 45-60 min (-T8), a CI build ~25 min (-T16)*.
> >
> >
> >
> > To speed up our build we needed 2 features: incremental build and a shared
> > cache.
> >
> > Initially we started to think about migrating to Gradle or Bazel. As the
> > migration costs for those tools were too high, we decided to add
> > similar functionality to Maven.
> >
> > Current results: *1-2 min for a local build (-T8) if the build was
> > cached by CI, CI build ~5 min (-T16).*
> >
> >
> >
> > *Feature description:*
> >
> >
> >
> > The idea is to calculate a checksum of the inputs and save the outputs in a cache.
> >
> > [image: image2019-8-27_20-0-14.png]
> >
> > Each node checksum is calculated from:
> >
> > - Effective POM hash
> > - Sources hash
> > - Dependencies hash (dependencies within the multi-module project)
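
A node checksum built from these three ingredients could look roughly like the sketch below. This is not the code from the proposal: the function names, the use of SHA-256 and the in-memory dictionaries standing in for effective POMs, source files and intra-project dependencies are all assumptions made for illustration.

```python
import hashlib


def sha256_hex(*parts):
    """Hash a sequence of strings into one hex digest."""
    digest = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def node_checksum(module, effective_poms, sources, dependencies, memo=None):
    """Combine POM hash, sources hash and dependency checksums for a module.

    Assumes the module graph is acyclic, as a Maven reactor is.
    """
    memo = {} if memo is None else memo
    if module in memo:
        return memo[module]
    pom_hash = sha256_hex(effective_poms[module])
    # Sort by path so the result does not depend on file listing order.
    source_items = sorted(sources[module].items())
    sources_hash = sha256_hex(*[path + "\0" + text for path, text in source_items])
    dep_hashes = [
        node_checksum(dep, effective_poms, sources, dependencies, memo)
        for dep in sorted(dependencies[module])
    ]
    memo[module] = sha256_hex(pom_hash, sources_hash, *dep_hashes)
    return memo[module]


# Hypothetical two-module project: app depends on core.
poms = {"core": "<project>core</project>", "app": "<project>app</project>"}
srcs = {"core": {"A.java": "class A {}"}, "app": {"B.java": "class B {}"}}
deps = {"core": [], "app": ["core"]}
print(node_checksum("app", poms, srcs, deps))
```

Because a module's checksum includes the checksums of its dependencies, a change in core changes the checksum of app as well, while a change in app leaves core untouched.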
> >
> >
> >
> > Project source inputs are searched for inside the project, plus all paths
> > from the plugin configuration:
> >
> > [image: image2019-8-30_10-28-56.png]
> >
> > How it works in practice:
> >
> > 1. CI: runs builds and stores outputs in the shared cache
> > 2. CI: reuses outputs for the same inputs, so build time decreases
> > 3. Locally: when I check out a branch and run ‘install’ for the whole
> > project, I get all current snapshots from the remote cache for this branch
> > 4. Locally: if I change multiple modules in the tree, only the changed
> > subtree is rebuilt
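
The store-and-reuse flow in these four steps can be sketched as follows. The sketch assumes a cache that is simply a directory keyed by input checksum; `BuildCache`, `build_module` and `demo_build` are hypothetical names, and a real shared cache would sit behind a remote service rather than a local folder. It needs Python 3.8+ for `dirs_exist_ok`.

```python
import shutil
import tempfile
from pathlib import Path


class BuildCache:
    """Directory-backed cache: one sub-directory of outputs per checksum."""

    def __init__(self, root):
        self.root = Path(root)

    def restore(self, checksum, target_dir):
        entry = self.root / checksum
        if not entry.is_dir():
            return False
        shutil.copytree(entry, target_dir, dirs_exist_ok=True)
        return True

    def store(self, checksum, target_dir):
        entry = self.root / checksum
        if entry.exists():
            shutil.rmtree(entry)
        shutil.copytree(target_dir, entry)


def build_module(checksum, target_dir, cache, run_build):
    """Restore outputs for known inputs, otherwise build and store them."""
    if cache.restore(checksum, target_dir):
        return "restored"
    run_build(target_dir)
    cache.store(checksum, target_dir)
    return "built"


def demo_build(target_dir):
    """Stand-in for a real module build: writes one fake artifact."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    (Path(target_dir) / "out.jar").write_text("demo")


work = Path(tempfile.mkdtemp())
cache = BuildCache(work / "cache")
print(build_module("abc123", work / "ci" / "target", cache, demo_build))   # built
print(build_module("abc123", work / "dev" / "target", cache, demo_build))  # restored
```

The first call plays the role of CI populating the cache; the second plays a developer checkout with identical inputs, which gets the outputs without running the build.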
> >
> >
> >
> > The impact on the current Maven codebase is very localized (MojoExecutor,
> > where we injected a cache controller).
> >
> > Caching can be activated/deactivated by a property, so the current Maven
> > flow will work as is.
> >
> > And the big plus is that you don’t need to rework your current project.
> > Caching should work out of the box; you just need to add a config in the
> > .mvn folder.
> >
> > Please let us know what you think. We are ready to invest in this
> > feature and address any further feedback.
> >
> >
> >
> > Kind regards,
> >
> > Max

Alexander Ashitkin
Let us evaluate this approach. But if we go the extension way, there will be less motivation to make it part of Maven. And I'm not sure what the long-term strategy for Maven is, but without incremental builds it becomes less and less attractive in our multi-branched world.

Thank you

On 2019/09/14 08:48:00, Romain Manni-Bucau <[hidden email]> wrote:

> On Sat, Sep 14, 2019 at 08:00, Alexander Ashitkin <[hidden email]> wrote:
>
> > Indeed, we have a kind of option 2 with variations. The current
> > implementation is an opt-in feature driven by configuration, with some
> > metadata describing the required cache behavior and hints.
> >
> > A Maven extension is an option, but we would love to have it in Maven
> > itself, which is in the interest of the Maven community, I believe. An
> > extension is the route we are trying to avoid, and we are not even sure it
> > could be implemented as an extension, as it requires changes in Maven core.
> >
>
> No real change is required in Maven core here, since Guice lets you override
> any bean, or you can even just rewrite the POM to remove modules and rebuild
> only the minimum set (keeping downstream projects).
>
> The only challenge is an exhaustive test suite, since your current
> implementation can easily fake a passing build (as Gradle does today if you
> don't disable the daemon and state cache on the CI).
>
> Side note: test relationship discovery is close to AOT in terms of
> implementation and very, very slow, so it can be worse than running the full
> suite in simple projects, and it still raises the IT question.
>
> So, due to the numerous open questions around a core solution, an extension
> is the way to go. Now, if a Guice bean in core can help you write your
> extension, that can surely be reviewed more easily IMHO.
>
> Hope it helps.
>
>
> > Thanks in advance, Aleks

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Tibor Digana
In reply to this post by Romain Manni-Bucau
Robert, I understand the discussion. There is the requirement (1) to cache
the targets, and a second discussion (2) about switching unchanged modules
on/off in a multi-module project.


I used (1) at Software AG and it led to a really unpredictable build with
hundreds of POMs in the project. I would never recommend caching targets and
local repos. We as developers were unable to run the local build; you had to
rely on a matching SCM revision and hash in the cache.

The solution for (2) is simple. There are already plugins and extensions. We
only have to tell Maven to switch off building unmodified modules, i.e.
skip the lifecycle in a module.
At least (2) does not look like a workaround, because nothing would change
in an unchanged module. Of course this feature should be disabled by
default and enabled explicitly by the user.

I agree with Enrico that hundreds of modules in a multi-module project
indicate bad project design and that the structure should be split into
separate SCM projects.

I do not agree with some users saying that separate dependencies (in
segregated SCM projects) should declare SNAPSHOT versions. Again, I have
commercial experience at Scheidt & Bachmann, where this approach broke
the consistency of the main product and slowed down software
development. The solution was to teach developers to execute the command
"mvn release:prepare release:perform", which I did in the company.
I think Maven is about best practices, and we should keep promoting
them all the time.

On Sat, Sep 14, 2019 at 3:44 PM Robert Scholte <[hidden email]> wrote:

> Tibor, it seems like you're missing the bigger picture.
> The question is similar to what we've discussed in the past: can we
> define whether Surefire should be executed or not?
>
> We should define incremental builds as "should a goal be executed or
> not?", e.g. based on the results of the previous build.
> First of all: calling 'clean' makes it impossible to do incremental builds.
> Next, it is the *plugin developer* who knows best whether the goal should be
> executed or not. Today that logic still lives inside the plugin, but if the
> plugin API understood inputs and outputs, we could leave it up to Maven to
> decide whether a goal should be executed.
> The build plan now gives us a graph of Maven projects, but theoretically
> with such changes we could make a graph of goals. And it could detect
> useless calls of goals, because their output is never used.
> Some might recognize a Gradle concept here, and that's correct. On this
> point they were able to design something that works better than
> Maven. For their build cache extension they had to analyze the plugin
> descriptors, marking all parameters as either input or output. And that
> boosts builds with their extension.
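
As a rough illustration of the two ideas in this message (deciding per goal whether it must run, and spotting goals whose output nobody uses), here is a small sketch. It is not the Maven Plugin API or Gradle's implementation: the `Goal` type, the fingerprint dictionaries and the sample goals are hypothetical, and it assumes every goal declares its inputs and outputs as paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    name: str
    inputs: frozenset
    outputs: frozenset


def should_execute(goal, current, previous):
    """Run the goal if any declared input or output fingerprint changed.

    `current` and `previous` map paths to content fingerprints; a path
    missing from a mapping counts as "absent".
    """
    return any(current.get(p) != previous.get(p) for p in goal.inputs | goal.outputs)


def needed_goals(goals, wanted_outputs):
    """Walk backwards from the wanted outputs to the goals that produce them."""
    needed, seen = set(), set()
    pending = set(wanted_outputs)
    while pending:
        path = pending.pop()
        if path in seen:
            continue
        seen.add(path)
        for goal in goals:
            if path in goal.outputs and goal.name not in needed:
                needed.add(goal.name)
                pending.update(goal.inputs)
    return needed


# Hypothetical goal graph for one module.
compile_goal = Goal("compile", frozenset({"src/main/java"}), frozenset({"target/classes"}))
jar_goal = Goal("jar", frozenset({"target/classes"}), frozenset({"target/app.jar"}))
javadoc_goal = Goal("javadoc", frozenset({"src/main/java"}), frozenset({"target/apidocs"}))
goals = [compile_goal, jar_goal, javadoc_goal]

used = needed_goals(goals, {"target/app.jar"})
print(sorted(g.name for g in goals if g.name not in used))  # ['javadoc']
```

Note that the up-to-date check depends on the previous outputs still being on disk, which is why running 'clean' defeats it.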
>
> thanks,
> Robert
>
>
> On Sat, 14 Sep 2019 13:37:40 +0200, Tibor Digana <[hidden email]> wrote:
>
> > Oh yeah, exactly the opposite.
> > Jenkins has several ways to create a Maven build configuration, and it
> > knows where the repo and workspace are, where to store the archive, and
> > when the build failed.
> > We cannot take that responsibility, because the build may fail for whatever
> > reason and we do not know whether to keep the folders, delete all
> > "/target" folders, or delete only the failed one. The user knows that.
> > We cannot archive the folders, because we could cause very high disk usage
> > outside the control of the CI. And we cannot take responsibility for the
> > lifetime of these archives. That all belongs to Jenkins, which has features
> > and management plugins by which the workspace may be retained for a certain
> > period of time and archives are limited in some way. The archives can be
> > stored in another folder, and we should not adopt these responsibilities,
> > because then we would suddenly need all the knowledge of a distributed
> > system, and we as the Maven project would end up unmaintainable, with many
> > more Jira issues and requirements than we would have spare time to work on.
> >
> > On Sat, Sep 14, 2019 at 1:25 PM Romain Manni-Bucau <[hidden email]> wrote:
> >
> >> Tibor, Maven is the only one with the logic to give any cache the data it
> >> needs. Jenkins alone can't, since it owns neither the reactor nor the
> >> plugin I/O values.
> >>
> >> On Sat, Sep 14, 2019 at 12:45, Tibor Digana <[hidden email]> wrote:
> >>
> >> > But I do not understand why Maven should be responsible for the project
> >> > cache control/management of "/target" directories.
> >> > That is the responsibility of the build manager, which is Jenkins.
> >> > Jenkins has the ability to archive files, and that feature already
> >> > exists in Jenkins.
> >> >
> >> > So Jenkins has full knowledge of:
> >> >
> >> > 1. how long the workspace content remains intact
> >> > 2. what the commit hash is for the last build/job/branch
> >> > 3. which commit was successful
> >> >
> >> > If the target directories remain intact (or are restored from the
> >> > archive) in the workspace for a very long time and the workspace is
> >> > reused by the next build, then I would say that the improvement should
> >> > work as it is at the CI level.
> >> >
> >> > Maybe all that is needed is an improvement in Maven by which it obtains
> >> > the list of modules or directories changed in the current commit.
> >> > Then Maven can highly optimize its own build steps and build only those
> >> > modules which have changed, plus their dependent modules.
> >> > So an interface between CI and Maven is needed, as some kind of
> >> > extension, or the class MavenCli can be extended with some new entry
> >> > point.
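
Without any new interface, a CI job can already approximate this with Maven's existing `-pl` (`--projects`) and `-amd` (`--also-make-dependents`) options. The sketch below maps changed file paths to module directories and builds such a command line; the helper names and the sample paths are hypothetical, and it assumes module directories are given relative to the repository root.

```python
from pathlib import PurePosixPath


def owning_module(changed_file, module_dirs):
    """Return the deepest module directory containing the file, or None."""
    parents = PurePosixPath(changed_file).parents
    best = None
    for module in module_dirs:
        module_path = PurePosixPath(module)
        if module_path in parents:
            if best is None or len(module_path.parts) > len(PurePosixPath(best).parts):
                best = module
    return best


def maven_command(changed_files, module_dirs, goal="install"):
    """Build an `mvn -pl ... -amd` command for the changed modules.

    Returns None when no changed file belongs to a known module; a real
    job would have to decide what to do then (e.g. a root POM change
    probably calls for a full build).
    """
    modules = {owning_module(f, module_dirs) for f in changed_files}
    modules.discard(None)
    if not modules:
        return None
    return ["mvn", "-pl", ",".join(sorted(modules)), "-amd", goal]


changed = ["core/src/main/java/A.java", "README.md"]
print(maven_command(changed, ["core", "app", "services/api"]))
# ['mvn', '-pl', 'core', '-amd', 'install']
```

The list of changed files would typically come from the SCM, for example the diff between the current commit and the last successful build.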
> >> >
> >> > But I do not think that Maven has to take on the responsibilities of CI
> >> > (project cache management); that's not our task, I would say, and we as
> >> > Maven would never know all the miscellaneous CI specifics and therefore
> >> > would not cope with CI-related troubles.
> >> >
> >> > Cheers
> >> > Tibor17
> >> >
> >> >
> >> >
> >> > On Sat, Sep 14, 2019 at 11:08 AM Robert Scholte <[hidden email]> wrote:
> >> >
> >> > > Did anyone Google for "maven extension build cache"? There are already
> >> > > commercial solutions for it.
> >> > > Even though I would like to see improvements in this area, the old
> >> > > architecture of Maven makes it quite hard to get there.
> >> > > First of all, it requires changes to the Plugin API (without breaking
> >> > > backwards compatibility) to have support out of the box.
> >> > >
> >> > > Robert

Falko Modler
In reply to this post by Romain Manni-Bucau
Hi there,

I must admit that I did not read everything but in case you are using
Git this extension might help:

https://github.com/vackosar/gitflow-incremental-builder

It is a Maven extension and it is _not_ limited to Gitflow setups!

Disclaimer: I am not the owner of this project but I am a "Collaborator"
(I can cut releases etc.).

Feedback is very much appreciated.


PS: I hope this works, never posted to a mailing list before. :-O


Best regards,

Falko



Enrico Olivelli
In reply to this post by Romain Manni-Bucau
Hi Maximilian,
is there any way to see this work? Is it already open source? (I am sorry,
maybe I missed some email with links)

Enrico

On Fri, Sep 20, 2019 at 19:30, Alexander Ashitkin <[hidden email]> wrote:

> Hi Martijn,
> thanks for the positive feedback.
>
> Regarding the IDE part: yes, you're right about the integration, but there
> are still important cases where the cache helps:
> 1) you need to navigate less within the project, as top-level targets are
> fast enough that you don't have to drill down
> 2) if you need to build part of the project (say, only rest of wicket), you
> need to provide up-to-date rest dependencies which are not active in the
> subproject, and the cache restores the missing pieces for you without
> rebuilding the remaining part of the project
> 3) if you need to test the project and invoke tests, the cache saves you
> time (as Gradle does) on unchanged pieces
> 4) and because tests run faster, you can try running slow tests which are
> often too expensive in rapid development
>
> So Maven integration in IntelliJ works nicely. There is nothing super smart
> here, just sharing how I benefit from the cache in everyday IDE work.
>
> Thank you!
>
> On 2019/09/19 11:28:48, Martijn Dashorst <[hidden email]> wrote:
> > On Thu, Sep 19, 2019 at 7:48 AM Alexander Ashitkin
> > <[hidden email]> wrote:
> > > Configuration:
> > > * verify -T4 -P default,all-shapshots-repos
> > > * my project config (might be suboptimal for wicket)
> > > * scala tests disabled in 2 modules (caused bytecode version conflict on my machine)
> > >
> > > Results
> > > Clean state (cache disabled):                      15:58 min
> > > Second run, target up to date (cache disabled):    10:20 min
> > > Fully cached (no changes):                         17.507 s
> > > wicketstuff-jwicket-tooltip-wtooltips changed:     34.936 s
> > > wicketstuff-rest-utils changed:                    54.040 s
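
To put these timings on one scale, the small sketch below converts them to seconds and expresses each run as a speedup over the clean build. The figures are the ones reported above; the helper name and the short labels are only for this example.

```python
def to_seconds(text):
    """Parse '15:58 min' or '17.507 s' into seconds."""
    value, unit = text.split()
    if unit == "min":
        minutes, seconds = value.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(value)


baseline = to_seconds("15:58 min")  # clean state, cache disabled: 958.0 s
runs = [
    ("second run, cache disabled", "10:20 min"),
    ("fully cached", "17.507 s"),
    ("jwicket-tooltip-wtooltips changed", "34.936 s"),
    ("rest-utils changed", "54.040 s"),
]
for label, timing in runs:
    print(f"{label}: {baseline / to_seconds(timing):.1f}x faster than clean build")
```

By this measure the fully cached run is roughly 55 times faster than the clean build, and the two single-module changes land at roughly 27 and 18 times faster.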
> > >
> > > If you want to try other modules - please let me know.
> >
> > Nice results!
> >
> > > Regarding the IDE: it's a usual Maven installation, so any IDE with Maven
> > > integration should benefit from the cache when a Maven action is invoked.
> >
> > My instinct says that an IDE such as Eclipse won't benefit much from it, as
> > it has its own build lifecycle. Only when you invoke a command-line
> > Maven action (such as generate-sources) might one see a benefit.
> >
> > So in day-to-day life the caching might not be as beneficial for
> > developers, but command-line builds happen often enough to make this
> > matter.
> >
> > Martijn