[PATCH] CTDB test improvements
Martin Schwenke
martin at meltin.net
Fri Feb 2 04:05:39 UTC 2018
On Fri, 2 Feb 2018 14:27:52 +1300, Gary Lockyer via samba-technical
<samba-technical at lists.samba.org> wrote:
> We're seeing consistent test timeouts in our cloud autobuilds, as below.
>
> ==========================================================================
> TEST TIMEOUT: tests/cunit/protocol_test_002.sh (status 124) (duration: 600s)
> ==========================================================================
>
> If needed I can stand up an instance on our cloud to assist with debugging.
Sorry... :-(
If I run that by hand on my laptop it takes only 1m44.363s!
Under valgrind it times out because it would take about 40 minutes to run! That's annoying... and is why, when I hand test under valgrind, I always interrupt this test and make it fail.
In the last autobuild I did on sn-devel it took this long:
==========================================================================
TEST PASSED: tests/cunit/protocol_test_002.sh (duration: 125s)
==========================================================================
We're doing 1000 iterations with random data to ensure confidence in
our protocol marshalling code.
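For a rough sense of scale, here is the arithmetic on the numbers above as a small sketch. It assumes the 125s sn-devel run is a fair baseline; the helper name slowdown_to_exceed is made up for illustration and is not part of the test suite.

```shell
#!/bin/sh
# Figures come from the message above: 1000 iterations, a 125s run on
# sn-devel, the current 600s timeout and a proposed 3600s timeout.
# slowdown_to_exceed is a hypothetical helper, not something in CTDB.
slowdown_to_exceed ()
{
	# $1 = timeout in seconds, $2 = baseline duration in seconds
	awk -v t="$1" -v b="$2" 'BEGIN { printf "%.1f\n", t / b }'
}

echo "per-iteration cost: $(awk 'BEGIN { printf "%.3f", 125 / 1000 }')s"
echo "slowdown needed to exceed 600s:  $(slowdown_to_exceed 600 125)x"
echo "slowdown needed to exceed 3600s: $(slowdown_to_exceed 3600 125)x"
```

So a machine has to be nearly 5 times slower than sn-devel to hit the current timeout, and nearly 29 times slower to hit a one hour timeout.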
Options:
* We can increase the timeout.
This timeout is meant to be a public service to avoid indefinitely
hanging autobuilds due to unexpected programming errors in tests.
It seems to have backfired. :-(
How long does it usually take to run in your cloud autobuilds?
If you don't have any old logs showing this then "git grep
TEST_TIMEOUT" will show you where the default of 600s is set. Please
try adding a patch on top to increase it and see how long it takes.
ctdb/tests/run_tests.sh ctdb/tests/cunit/protocol_test_002.sh will
let you run that test on its own.
We could increase this timeout to an hour if we need to.
* We could consider reducing the run-time of the test by doing fewer
iterations. However, that obviously makes the test less useful.
* We can insist that Samba autobuild is run with a realistic amount of
CPU power! ;-)
I'm semi-serious here. I wonder what you're doing that makes this
test run for more than 10 minutes. The test uses a single
process/thread so it just needs a single CPU thread.
I think I understand that you're running autobuilds in some sort of
constrained manner to make races more obvious, but there has to be a
lower bound on the resources needed to run autobuild.
We clearly need a solution... I'm happy with the first as long as you
can give me a number, so we're not playing whack-a-mole by continually
patching the timeout upwards. Will an hour do the trick?
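In case the "status 124" in the failure message is unclear: that is the exit status that the coreutils timeout(1) command returns when the time limit expires. A minimal sketch of the idea follows. It assumes a wrapper around timeout(1); the function run_with_timeout and its messages are illustrative only and are not the actual code in run_tests.sh.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit and report the
# result in a style loosely modelled on the messages quoted above.
# This is an illustration, not the real CTDB test harness.
run_with_timeout ()
{
	_limit="$1"
	shift

	_start=$(date +%s)
	# coreutils timeout(1) exits with status 124 if the limit expires
	timeout "$_limit" "$@"
	_status=$?
	_end=$(date +%s)
	_duration=$((_end - _start))

	if [ "$_status" -eq 124 ] ; then
		echo "TEST TIMEOUT: $* (status $_status) (duration: ${_duration}s)"
	elif [ "$_status" -eq 0 ] ; then
		echo "TEST PASSED: $* (duration: ${_duration}s)"
	else
		echo "TEST FAILED: $* (status $_status) (duration: ${_duration}s)"
	fi

	return "$_status"
}
```

For example, "run_with_timeout 1 sleep 3" reports a timeout with status 124, whereas "run_with_timeout 5 true" reports a pass. Raising the limit only changes when a stuck test gets killed, so a generous value costs nothing for tests that finish normally.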
Thanks...
peace & happiness,
martin