Hi Henk,
I'm trying to get a handle on Vdbench return codes.
Once in a long while I run into an error where one of the slave JVMs doesn't shut down correctly.
localhost stdout:
09:14:53.694 java.lang.RuntimeException: Shutdown took more than 5 minutes; Run aborted
09:14:53.694 at Vdb.common.failure(common.java:335)
09:14:53.694 at Vdb.SlaveWorker.doRegularWorkload(SlaveWorker.java:269)
09:14:53.694 at Vdb.SlaveWorker.run(SlaveWorker.java:136)
logfile:
09:14:56.557 SocketException from aborting slave. That's OK.
09:14:56.557
09:14:56.557 Slave localhost-14 prematurely terminated.
09:14:56.557
java.lang.RuntimeException: Slave localhost-14 prematurely terminated.
at Vdb.common.failure(common.java:335)
at Vdb.SlaveStarter.startSlave(SlaveStarter.java:198)
at Vdb.SlaveStarter.run(SlaveStarter.java:47)
For example, one slave JVM (of 8) at a queue depth of 256 issues IOs during the very first interval and then never issues any more. It times out 5 minutes after the 5-minute workload ends, and Vdbench gives a return code of 157.
The achieved queue depth is 233 instead of 256 because this one JVM is waiting forever and I guess part of the queue relies on that JVM.
I've encountered this same error on two different machines on two different drives twice. It is a very rare occurrence, and I can't reliably reproduce it.
I'm running on Linux with Vdbench 5.04.06 with the SlaveJVM patch.
Is the return code 157 unique to this error?
---
On a related note, I'm seeing Vdbench give a return code of 1 on Windows sometimes (once every 100 or so runs) even though the workload completes successfully and no error messages are displayed.
Is occasionally returning 1 on Windows intended?
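
To help correlate return codes with what actually happened in a run, here is a minimal sketch of a wrapper that records the return code next to any failure text found in the output. It is not part of Vdbench: the names `run_and_classify` and `main` are hypothetical, and treating the two strings below (copied from the output quoted above) as failure markers is an assumption, not documented behavior.

```python
import subprocess
import sys

# Strings copied from the output quoted above. Using them as failure
# markers is an assumption, not documented Vdbench behavior.
FAILURE_MARKERS = (
    "Shutdown took more than 5 minutes",
    "prematurely terminated",
)


def run_and_classify(cmd):
    """Run cmd and return (return_code, failure markers seen in its output)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so one scan covers both
        text=True,
    )
    found = [marker for marker in FAILURE_MARKERS if marker in proc.stdout]
    return proc.returncode, found


def main(argv):
    """Run the command given in argv, report the result on stderr, return its code."""
    rc, found = run_and_classify(argv)
    print(f"return code={rc} markers={found}", file=sys.stderr)
    return rc

# To use as a script, pass the benchmark command line as arguments:
#     sys.exit(main(sys.argv[1:]))
```

Logging this pair for every run would show whether 157 only ever appears together with the shutdown failure, and whether the occasional 1 on Windows ever comes with any output that was missed.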