[PATCH] CTDB test improvements

Martin Schwenke martin at meltin.net
Fri Feb 2 04:05:39 UTC 2018


On Fri, 2 Feb 2018 14:27:52 +1300, Gary Lockyer via samba-technical
<samba-technical at lists.samba.org> wrote:

> We're seeing consistent test timeouts in our cloud autobuilds, as below.
> 
> ==========================================================================
> TEST TIMEOUT: tests/cunit/protocol_test_002.sh (status 124) (duration: 600s)
> ==========================================================================
> 
> If needed I can stand up an instance on our cloud to assist with debugging.

Sorry...  :-(

If I run that by hand on my laptop it takes only 1m44.363s!

Under valgrind it times out because it would take about 40 minutes to
run!  That's annoying... and is why, when I test by hand under
valgrind, I always interrupt this test and make it fail.

In the last autobuild I did on sn-devel it took this long:

==========================================================================
TEST PASSED: tests/cunit/protocol_test_002.sh (duration: 125s)
==========================================================================

We're doing 1000 iterations with random data to ensure confidence in
our protocol marshalling code.

Options:

* We can increase the timeout.

  This timeout is meant to be a public service to avoid indefinitely
  hanging autobuilds due to unexpected programming errors in tests.
  It seems to have backfired.  :-(

  How long does it usually take to run in your cloud autobuilds?

  If you don't have any old logs showing this then "git grep
  TEST_TIMEOUT" will show you where the default of 600s is set. Please
  try adding a patch on top to increase it and see how long it takes.
  ctdb/tests/run_tests.sh ctdb/tests/cunit/protocol_test_002.sh will
  let you run that test on its own.

  We could increase this timeout to an hour if we need to.

* We could consider reducing the run-time of the test by doing fewer
  iterations.  However, that obviously makes the test less useful.

* We can insist that Samba autobuild is run with a realistic amount of
  CPU power!  ;-)

  I'm semi-serious here.  I wonder what you're doing that makes this
  test run for more than 10 minutes.  The test uses a single
  process/thread so it just needs a single CPU thread.

  I think I understand that you're running autobuilds in some sort of
  constrained manner to make races more obvious, but there has to be a
  lower bound on the resources needed to run autobuild.
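
For anyone wondering where "status 124" comes from: that is the exit
code coreutils timeout(1) uses when it has to kill a command that
overran its limit.  Below is a minimal sketch of that kind of wrapper.
The run_with_timeout helper is hypothetical and is not the code in
ctdb/tests; it only mimics the report lines shown above, assuming the
harness does something along these lines.

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from ctdb/tests: run a command under a
# time limit and print a report line similar to the ones in this thread.
run_with_timeout() {
    limit="$1"      # time limit in seconds
    shift           # "$@" is now the command to run, "$1" its name
    start=$(date +%s)
    timeout "$limit" "$@"
    status=$?
    duration=$(( $(date +%s) - start ))
    if [ "$status" -eq 124 ]; then
        # coreutils timeout(1) exits with 124 when the limit expires
        echo "TEST TIMEOUT: $1 (status $status) (duration: ${duration}s)"
    elif [ "$status" -eq 0 ]; then
        echo "TEST PASSED: $1 (duration: ${duration}s)"
    else
        echo "TEST FAILED: $1 (status $status) (duration: ${duration}s)"
    fi
    return "$status"
}

# A 1-second limit on a 3-second sleep: the command is killed and the
# helper reports a timeout.
run_with_timeout 1 sleep 3 || true
```

A test that is merely slow and a test that is genuinely hung look the
same to a wrapper like this, which is why the limit needs to sit
comfortably above the slowest legitimate run.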

We clearly need a solution...  I'm happy with the first as long as you
can give me a number, so we're not playing whack-a-mole by continually
patching the timeout upwards.  Will an hour do the trick?
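
To put the candidate numbers in perspective, here is a rough sketch
that expresses each run time mentioned in this thread as a multiple of
the 125s sn-devel run.  The slowdown helper is just for illustration,
and 2400s is the "about 40 minutes" valgrind estimate, not a measured
value.

```shell
#!/bin/sh
# Run times quoted in this thread, in seconds.
baseline=125    # last sn-devel autobuild

slowdown() {
    # $1 = run time in seconds; prints it as a multiple of the baseline
    awk -v t="$1" -v b="$baseline" 'BEGIN { printf "%.1f\n", t / b }'
}

echo "600s timeout:  $(slowdown 600)x sn-devel"
echo "valgrind ~40m: $(slowdown 2400)x sn-devel"
echo "1 hour:        $(slowdown 3600)x sn-devel"
```

So the current limit is hit by anything more than about 4.8 times
slower than sn-devel, while an hour leaves room for a factor of nearly
29.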

Thanks...

peace & happiness,
martin