[Commented] (SUREFIRE-1302) Surefire does not wait long enough for the forked VM and assumes it to be dead

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SUREFIRE-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015668#comment-16015668 ]

Olivier Peyrusse commented on SUREFIRE-1302:

Hello Tibor,

Running Surefire 2.21-SNAPSHOT, I hit the VM shutdown once again. I have attached the Maven logs to the issue. My own logs seem to show that a small thread, which is supposed to print a message every 250 ms, was not scheduled for 12 s.
The GC log lines captured as an error in jvmRun1 show a pause of ~12 s, which confirms my diagnosis.
# Created on 2017-05-18T05:50:30.395
Corrupted stdin stream in forked JVM 1. Stream '[GC pause (G1 Evacuation Pause) (young)-- 888M->820M(1024M), 11.8820911 secs]'.
So the current adaptive ping does not seem to be enough. Let me know if you would like me to test anything else.
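
The diagnosis above relies on a periodic thread whose ticks reveal when the JVM was stalled. As a minimal sketch of that idea, assuming tick timestamps have already been collected (the class and method names here are hypothetical and are not part of Surefire), the longest stall can be computed like this:

```java
// Hypothetical helper for illustration; not part of Surefire.
class StallDetector
{
    /**
     * Returns how far the worst gap between consecutive ticks exceeded
     * the expected period, in milliseconds (0 if no tick was late).
     */
    static long longestStallMillis( long[] tickTimesMillis, long periodMillis )
    {
        long worst = 0;
        for ( int i = 1; i < tickTimesMillis.length; i++ )
        {
            long gap = tickTimesMillis[i] - tickTimesMillis[i - 1];
            worst = Math.max( worst, gap - periodMillis );
        }
        return worst;
    }

    public static void main( String[] args )
    {
        // Made-up timestamps: ticks every 250 ms, with one gap of 12 s.
        long[] ticks = { 0, 250, 500, 12500, 12750 };
        System.out.println( "longest stall (ms): " + longestStallMillis( ticks, 250 ) );
    }
}
```

A stall of this size that lines up with a GC pause in the logs points at the JVM being paused rather than at the forked process being dead.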

Best regards

> Surefire does not wait long enough for the forked VM and assumes it to be dead
> ------------------------------------------------------------------------------
>                 Key: SUREFIRE-1302
>                 URL: https://issues.apache.org/jira/browse/SUREFIRE-1302
>             Project: Maven Surefire
>          Issue Type: Request
>          Components: Maven Surefire Plugin
>    Affects Versions: 2.19.1
>            Reporter: Yuriy Zaplavnov
>            Assignee: Tibor Digana
>             Fix For: 2.20.1
>         Attachments: 2017-05-18T05-48-08_685-jvmRun1.dumpstream, surefire-logs, surefire-tests-terminated-master-aa9330316038f6b46316ce36ff40714ffc7cf299.zip, tests_log_01.txt, tests_log_02.txt
> This issue happens because Surefire kills the forked container if it times out waiting for the 'ping'.
> The org.apache.maven.surefire.booter.ForkedBooter class has a hardcoded constant, PING_TIMEOUT_IN_SECONDS = 20, which the following method uses to create the ScheduledFuture:
> {code}
> private static ScheduledFuture<?> listenToShutdownCommands( CommandReader reader )
>     {
>         reader.addShutdownListener( createExitHandler( reader ) );
>         AtomicBoolean pingDone = new AtomicBoolean( true );
>         reader.addNoopListener( createPingHandler( pingDone ) );
>         return JVM_TERMINATOR.scheduleAtFixedRate( createPingJob( pingDone, reader ),
>                                                    0, PING_TIMEOUT_IN_SECONDS, SECONDS );
>     }
> {code}
> In some cases the forked container might respond a bit later than expected, and Surefire kills it:
> {code}
> private static Runnable createPingJob( final AtomicBoolean pingDone, final CommandReader reader  )
>     {
>         return new Runnable()
>         {
>             public void run()
>             {
>                 boolean hasPing = pingDone.getAndSet( false );
>                 if ( !hasPing )
>                 {
>                     exit( 1, KILL, reader, true );
>                 }
>             }
>         };
>     }
> {code}
> Since the forked container has to be terminated anyway when it really is dead, it would be really helpful if PING_TIMEOUT_IN_SECONDS were made configurable, with the ability to specify its value from the maven-surefire-plugin configuration.
> That would allow the timeout to be tuned to the needs and characteristics of the project where Surefire runs.
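
The request in the quoted description can be sketched as follows. This is only an illustration under stated assumptions, not Surefire's implementation: the system property name is hypothetical, and the `exit( 1, KILL, reader, true )` call from the quoted code is replaced by a caller-supplied callback so the example stays self-contained.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only; names other than those quoted in the issue are assumptions.
class ConfigurablePing
{
    // Hypothetical property name; not an actual Surefire parameter.
    static final String PING_TIMEOUT_PROPERTY = "example.forkedBooter.pingTimeoutSeconds";

    // Same value as the hardcoded constant quoted in the issue.
    static final int DEFAULT_PING_TIMEOUT_IN_SECONDS = 20;

    /** Reads the timeout from a system property, falling back to the default. */
    static int resolvePingTimeoutSeconds()
    {
        // Integer.getInteger returns null if the property is missing or not numeric.
        Integer configured = Integer.getInteger( PING_TIMEOUT_PROPERTY );
        return configured != null && configured > 0 ? configured : DEFAULT_PING_TIMEOUT_IN_SECONDS;
    }

    /** Same check as the quoted createPingJob, with the kill action passed in. */
    static Runnable createPingJob( final AtomicBoolean pingDone, final Runnable onMissedPing )
    {
        return new Runnable()
        {
            public void run()
            {
                // A ping must have arrived since the previous run, otherwise react.
                boolean hasPing = pingDone.getAndSet( false );
                if ( !hasPing )
                {
                    onMissedPing.run();
                }
            }
        };
    }

    public static void main( String[] args )
    {
        System.out.println( "ping timeout (s): " + resolvePingTimeoutSeconds() );
    }
}
```

With something along these lines, the resolved value would take the place of PING_TIMEOUT_IN_SECONDS as the period passed to scheduleAtFixedRate, so a project that sees long GC pauses could raise it without patching the booter.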

This message was sent by Atlassian JIRA