Using maven profiles for easily parallel testing?

Using maven profiles for easily parallel testing?

Kevin Burton
We use TeamCity internally (which is great btw) for Maven testing.

We have about 2000 tests which we continually integrate on every commit.

The problem is that testing takes about 15 minutes from start to end.

We use -T 16 on our tests and our boxes have 8 cores so this allows some
tests to block on IO while others execute.

However, I'm not happy with this.  I want our tests to finish in 2 minutes.

The only way this could happen is for us to hand parallelize everything (no
fun), or do it automatically, or buy REALLY expensive hardware (no fun).

I think one could do this with JUnit categories and maven profiles.

I think what one could do is hash the name of the test and then based on
the hash prefix, create N buckets.

Right now we have 4 CI boxes so we would run 4 profiles, one per box.  Then
TeamCity could integrate the results.

I think this is technically possible now but I think a lot of plugins would
need to cooperate here.

It would be better if Maven didn't have to split these out explicitly.

Maybe update surefire to have a way to only run 1/Nth of the tests and make
the set deterministic?

This way you can use two build boxes and each have two sets of tests with
no intersection.
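
A minimal sketch of the bucketing idea, assuming the build can list its test class names up front. Surefire has no such option as far as this thread establishes; the class and method names below (TestBuckets, bucketOf, select) and the example test names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: assign each test class to one of N buckets by hashing its name. */
final class TestBuckets {

    /** Returns a bucket index in [0, buckets) that is the same on every machine. */
    static int bucketOf(String testClassName, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        // String.hashCode() is specified by the Java API, so every CI box
        // computes the same value; floorMod keeps negative hashes in range.
        return Math.floorMod(testClassName.hashCode(), buckets);
    }

    /** Selects the test classes that one bucket (for example one CI box) should run. */
    static List<String> select(List<String> testClassNames, int bucket, int buckets) {
        List<String> selected = new ArrayList<>();
        for (String name : testClassNames) {
            if (bucketOf(name, buckets) == bucket) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical class names, for illustration only.
        List<String> all = List.of(
                "com.example.FooTest", "com.example.BarTest",
                "com.example.io.ReaderTest", "com.example.io.WriterTest");
        int buckets = 4; // one per CI box
        for (int b = 0; b < buckets; b++) {
            System.out.println("bucket " + b + ": " + select(all, b, buckets));
        }
    }
}
```

Because each name maps to exactly one index, the buckets are disjoint and together cover every test. Hashing balances the number of classes per bucket only roughly, and it knows nothing about how long each test runs.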

Thoughts on this approach?  I think it would be pretty awesome.

I'm sure someone here knows the gang at TeamCity and could recommend this
to the proper decision makers.

--

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
RE: Using maven profiles for easily parallel testing?

Sander Verhagen
Apparently "includes" supports fairly complex syntax, such as regexes: http://maven.apache.org/surefire/maven-surefire-plugin/test-mojo.html#includes

I'm thinking of a clunky approach of including tests with names starting with [A-Ma-m] in one build configuration, and those with names starting with [^A-Ma-m] in another build configuration. These build configurations could have most of their actual configuration in a build template, and only specify the differences separately, which would be the regex.

My organization is also using TeamCity and Maven, and is about to embark on using (again) build chains in TeamCity, which would allow us to define these two (or however many you desire) build configurations, and pull them together somehow.

We would be using this approach to split up, and parallelize unit tests vs. (module) integration tests vs. some end to end tests (cross-module integration tests). Some of these tests live in a separate Maven module already, which I'd call a best practice for cross-module integration tests. The simpler (module) integration tests live among our unit tests, and we've done a decent job at categorizing these (JUnit @Category annotation). (Actually, we have a unit test to make sure that all unit tests have said annotation.) We're using this mixed bag of approaches already to define subsequent build steps within the same build configuration, so it should be trivial to break them out into a build chain.
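
A rough sketch of what such an "every test is categorized" check might look like. To stay dependency-free it declares its own stand-in @Category annotation and marker interfaces instead of using JUnit's; all class names here are hypothetical, and a real check would have to discover the test classes (for example by scanning the test output directory) rather than list them by hand.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

final class CategoryCheck {

    /** Stand-in for JUnit's @Category so that the sketch has no dependencies. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Category {
        Class<?>[] value();
    }

    // Marker interfaces used as category names (hypothetical).
    interface UnitTests {}
    interface IntegrationTests {}

    @Category(UnitTests.class)
    static class ParserTest {}

    @Category(IntegrationTests.class)
    static class ClusterIT {}

    // Deliberately left without a category.
    static class ForgottenTest {}

    /** Returns the classes that carry no @Category annotation. */
    static List<Class<?>> missingCategory(List<Class<?>> testClasses) {
        List<Class<?>> missing = new ArrayList<>();
        for (Class<?> c : testClasses) {
            if (!c.isAnnotationPresent(Category.class)) {
                missing.add(c);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<Class<?>> tests = List.of(ParserTest.class, ClusterIT.class, ForgottenTest.class);
        // A real guard test would fail the build if this list is not empty.
        System.out.println("uncategorized: " + missingCategory(tests));
    }
}
```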

I've burned myself in the past with build chains in TeamCity, as I didn't get the feeling that they roll up very well, if some part of the chain gets stuck (fails). But this might be a bad observation of mine, and I'll be revisiting this shortly. And this also seems out of scope for this mailing list.

I'm curious to hear what you come up with and/or settle on. Please report back!



Sander Verhagen



Re: Using maven profiles for easily parallel testing?

Kevin Burton
>
>
> I'm thinking of a clunky approach of including tests with names starting
> with [A-Ma-m] in one build configuration, and those with names starting
> with [^A-Ma-m] in another build configuration. These build configuration
> could have most of their actual configuration in a build template, and only
> specify the differences separately, which would be the regex.
>
>
Yeah.. I've thought about that too.

The big thing you have to worry about is if all your tests end with or
start with "Test" which is why I was thinking of the hashing approach.

But I guess this will work as long as you're careful.
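
To make the naming caveat concrete, here is a small sketch that applies the two character classes from the quoted suggestion to the first letter of some made-up simple class names. The class FirstLetterSplit and the name lists are illustrative assumptions, not anything surefire provides.

```java
import java.util.List;
import java.util.regex.Pattern;

final class FirstLetterSplit {
    // The two complementary character classes from the quoted suggestion,
    // applied to the first character of the simple class name.
    static final Pattern FIRST_HALF = Pattern.compile("^[A-Ma-m].*");
    static final Pattern SECOND_HALF = Pattern.compile("^[^A-Ma-m].*");

    /** Counts how many names fully match the given pattern. */
    static long count(List<String> simpleNames, Pattern p) {
        return simpleNames.stream().filter(n -> p.matcher(n).matches()).count();
    }

    public static void main(String[] args) {
        // Hypothetical test names using a "Test" prefix and a "Test" suffix.
        List<String> prefixed = List.of("TestReader", "TestWriter", "TestParser", "TestIndex");
        List<String> suffixed = List.of("ReaderTest", "WriterTest", "ParserTest", "IndexTest");

        System.out.println("prefixed: " + count(prefixed, FIRST_HALF)
                + " vs " + count(prefixed, SECOND_HALF));
        System.out.println("suffixed: " + count(suffixed, FIRST_HALF)
                + " vs " + count(suffixed, SECOND_HALF));
    }
}
```

With a "Test" prefix every name starts with T, so one configuration gets all of the tests and the other gets none. With a suffix the split depends on whatever the names happen to start with, which is the "as long as you're careful" part.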


> My organization is also using TeamCity and Maven, and is about to embark
> on using (again) build chains in TeamCity, which would allow to define
> these two (or whichever many you desire) build configurations, and pull
> them together somehow.
>
>
I think this is the hard part.  One would need to merge all the test
reports in TeamCity. Not sure if it supports this already.

How do you use chains to get them to work in parallel?  I thought that this
would only work if you do them sequentially - which won't yield much of a
speedup.

But I do think the 'dependent build configuration' thing might work.

Right now we first run an incremental build, and that then fires up a full
build.

Maybe one solution is to first compile all the code, and then have that
trigger N chained builds at the same time.

This way you quickly confirm that the code compiles, and then that can
trigger all the secondary builds.

Of course that might be redundant and just waste 1-2 minutes.  Might be
better to just have a quick 'return true' script that always works and then
have that kick off your two parallel builds.

Maybe aggregating the test output isn't strictly necessary since TeamCity
shows the results in the UI...

> We would be using this approach to split up, and parallelize unit tests vs.
> (module) integration tests vs. some end to end tests (cross-module
> integration tests). Some of these tests live in a separate Maven module
> already, which I'd call a best practice for cross-module integration tests.


Yeah.. we do the same thing.


> I'm curious to hear what you come up with and/or settle on. Please report
> back!
>

Will do.. actually I think talking this through made it seem a lot easier.




Re: Using maven profiles for easily parallel testing?

Karl Heinz Marbaise-3
Hi Kevin,

On 05/11/16 22:33, Kevin Burton wrote:
> We use TeamCity internally (which is great btw) for Maven testing.
>
> We have about 2000 tests which we continually integrate on every commit.
>
> The problem is that testing takes about 15 minutes from start to end.

Is this really the time for the tests only or is this the whole build
time including the test time? If it is only the time for running the
tests it means those tests are slow...

2000 tests / 15 min = 2.2 tests / second...

A usual value is about 10-100+ tests / second...

(My current project runs 5000 unit tests in 350 seconds, which is slow as
well, but that is still about 14 tests / second.)

If you have a log file of the Maven run, you can sum the run time of the
tests alone with something like this (on a Linux-like system):

cat logFile | egrep -a "^Tests run: [0-9]+, Failures: [0-9]+, Errors: [0-9]+, Skipped: [0-9]+, Time elapsed: " | cut -d":" -f 6 | cut -d" " -f2 | paste -sd+ | bc -l


This sums the time for the tests only...
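
For boxes without those shell tools, the same sum can be sketched in Java. This assumes the same thing the pipeline above does, namely that each per-class result line starts with "Tests run:" and carries a "Time elapsed:" value in seconds; the class name SurefireTimeSum is made up for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SurefireTimeSum {
    // Same line shape as the egrep expression; the group captures the seconds.
    private static final Pattern RESULT_LINE = Pattern.compile(
            "^Tests run: \\d+, Failures: \\d+, Errors: \\d+, Skipped: \\d+, "
                    + "Time elapsed: (\\d+(?:\\.\\d+)?)");

    /** Sums the "Time elapsed" values of all matching lines, in seconds. */
    static double totalSeconds(List<String> logLines) {
        double total = 0.0;
        for (String line : logLines) {
            Matcher m = RESULT_LINE.matcher(line);
            if (m.find()) {
                total += Double.parseDouble(m.group(1));
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SurefireTimeSum.java <logFile>");
            return;
        }
        // ISO-8859-1 never fails on odd bytes, similar in spirit to egrep -a.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        System.out.println(totalSeconds(lines));
    }
}
```

Lines without a "Time elapsed" field, such as the final summary line, are skipped, so they are not counted twice.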


>
> We use -T 16 on our tests and our boxes have 8 cores so this allows some
> tests to block on IO while others execute.


-T 16 tries to parallelize the module build; it does not parallelize the
tests...furthermore I have my doubts that using -T 16 is faster than -T
4 etc..

I don't know how many modules you have in your build. If you have a great
many modules (100+), it might be a good idea to consider using Smart
Builder[1].

If you would like to parallelize the unit tests themselves, you have to
configure maven-surefire-plugin accordingly...I recommend reading the doc
about that subject[2].
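
As a rough illustration of what such a configuration can look like, here is a pom.xml fragment using the forkCount, reuseForks, parallel and threadCount parameters described in [2]. The values are placeholders to tune, not recommendations, and no plugin version is pinned here.

```xml
<!-- Sketch only: goes under <build><plugins> in the module's pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- Run test classes in several JVMs; "1C" means one fork per CPU core. -->
    <forkCount>1C</forkCount>
    <reuseForks>true</reuseForks>
    <!-- Alternative for JUnit 4.7+: threads inside a single JVM.
    <parallel>classes</parallel>
    <threadCount>4</threadCount>
    -->
  </configuration>
</plugin>
```

Note that forks apply per module, so combined with a parallel module build (-T) the number of test JVMs running at once can be much higher than the fork count alone suggests.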


Kind regards
Karl Heinz Marbaise

[1]: http://takari.io/book/30-team-maven.html
[2]:
http://maven.apache.org/components/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html





Re: Using maven profiles for easily parallel testing?

Kevin Burton
>
>
> Is this really the time for the tests only or is this the whole build time
> including the test time? If it is only the time for running the tests it
> means those tests are slow...
>
>
Yeah.  I agree.  The slow ones are all essentially integration tests.

Some take 20-30 seconds as they involve things like stopping daemons like
ZooKeeper / Cassandra which just take a while.

At the 95th percentile all of our tests are <1ms


> Having a log file of the maven run available you can summarize the run
> time of the tests only by using something like this (on a linux like
> system):
>
> This will summarize the time for the tests only...
>
>
>
Yet another reason I love Teamcity... it does this automatically and has an
easy report :)


>
>> We use -T 16 on our tests and our boxes have 8 cores so this allows some
>> tests to block on IO while others execute.
>>
>
>
> -T 16 tries to parallelize the module build and not parallelize the
> tests...furthermore I have my doubts that using -T 16 is faster than -T 4
> etc..
>
>
Yeah.. but the modules are running in parallel, so you have as many parallel
tests as modules in that iteration.

Looking at our CPU we're pretty much at 100% across all our tests for the
entire time.


> I don't know how many modules you have in your build. If you have really
> many modules like 100+ it might be a good idea to consider using Smart
> Builder[1].
>
> If you like to parallelize the unit tests themselves you have to configure
> maven-surefire-plugin accordingly...I recommend reading the doc about that
> subject[2].
>

Ah.. interesting. Those changes should be incorporated back into maven
proper...

But we will try this out.  I think though in practice it won't make much of
a change for us.

We have like 200 modules and our build stays at nearly 100% CPU the entire time.

Kevin


Re: Using maven profiles for easily parallel testing?

ljnelson
On Sun, Nov 6, 2016 at 8:48 AM Kevin Burton <[hidden email]> wrote:

> > If you like to parallelize the unit tests themselves you have to configure
> > maven-surefire-plugin accordingly...I recommend reading the doc about
> > that subject[2].
>
> Ah.. interesting. Those changes should be incorporated back into maven
> proper...


Just a data point: I wrote a blog entry on this years ago:
https://lairdnelson.wordpress.com/2013/09/25/concurrent-testing/

I hope that is helpful.

Best,
Laird
--
http://about.me/lairdnelson