[PATCH] CTDB test improvements
Martin Schwenke
martin at meltin.net
Mon Feb 5 04:55:10 UTC 2018
On Mon, 05 Feb 2018 08:05:40 +1300, Andrew Bartlett
<abartlet at samba.org> wrote:
> On Fri, 2018-02-02 at 15:05 +1100, Martin Schwenke via samba-technical
> wrote:
> >
> > Sorry... :-(
> >
> > If I run that by hand on my laptop it takes only 1m44.363s!
> >
> > Under valgrind it times out because it would take about 40 minutes
> > to run! That's annoying... and is why, when I hand test under
> > valgrind, I always interrupt this test and make it fail.
> >
> > In the last autobuild I did on sn-devel it took this long:
> >
> > ==========================================================================
> > TEST PASSED: tests/cunit/protocol_test_002.sh (duration: 125s)
> > ==========================================================================
> >
> > We're doing 1000 iterations with random data to ensure confidence in
> > our protocol marshalling code.
> >
> > Options:
> >
> > * We can increase the timeout.
> >
> > This timeout is meant to be a public service to avoid indefinitely
> > hanging autobuilds due to unexpected programming errors in
> > tests. It seems to have backfired. :-(
> >
> > How long does it usually take to run in your cloud autobuilds?
> >
> > If you don't have any old logs showing this then "git grep
> > TEST_TIMEOUT" will show you where the default of 600s is set.
> > Please try adding a patch on top to increase it and see how long it
> > takes. ctdb/tests/run_tests.sh ctdb/tests/cunit/protocol_test_002.sh
> > will let you run that test on its own.
> >
> > We could increase this timeout to an hour if we need to.
>
> ==========================================================================
> TEST PASSED: tests/cunit/protocol_test_002.sh (duration: 953s)
> ==========================================================================
So, this can currently take more than 15 minutes. If you reduce memory
in future then there might be some slowdown (swapping?), so we need some
wiggle room here...
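
To put rough numbers on that wiggle room, here is a minimal Python sketch that compares the durations quoted in this thread against the current 600s default and a one-hour timeout. The names (OBSERVED, headroom, report) are made up for illustration and are not part of the CTDB test harness; the valgrind figure is only the "about 40 minutes" estimate from above.

```python
# Durations quoted in this thread, in seconds.
DEFAULT_TIMEOUT = 600     # current default TEST_TIMEOUT
PROPOSED_TIMEOUT = 3600   # one hour

OBSERVED = {
    "sn-devel autobuild": 125,
    "cloud autobuild": 953,
    "valgrind (estimate)": 40 * 60,  # "about 40 minutes", not measured
}


def headroom(timeout_s, duration_s):
    """Return how many times longer the timeout is than the run."""
    return timeout_s / duration_s


def report(timeout_s):
    """Build one summary line per observed run for the given timeout."""
    lines = []
    for name, duration in OBSERVED.items():
        status = "ok" if duration < timeout_s else "TIMES OUT"
        factor = headroom(timeout_s, duration)
        lines.append(f"{name}: {duration}s vs {timeout_s}s -> {status} ({factor:.1f}x)")
    return lines


if __name__ == "__main__":
    for timeout in (DEFAULT_TIMEOUT, PROPOSED_TIMEOUT):
        print("\n".join(report(timeout)))
```

With a one-hour timeout the 953s run is covered with a bit under 4x headroom, which leaves room for some slowdown on smaller VMs.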
> > * We could consider reducing the run-time of the test by doing fewer
> > iterations. However, that obviously makes the test less useful.
> >
> > * We can insist that Samba autobuild is run with a realistic amount of
> > CPU power! ;-)
> >
> > I'm semi-serious here. I wonder what you're doing that makes this
> > test run for more than 10 minutes. The test uses a single
> > process/thread so it just needs a single CPU thread.
> >
> > I think I understand that you're running autobuilds in some sort of
> > constrained manner to make races more obvious, but there has to be a
> > lower bound on the resources needed to run autobuild.
>
> The environment we are running autobuild.py on is currently a 2-CPU,
> 16 GB RAM VM on the Catalyst Cloud. We are trying to reduce this back to
> 8GB and hope to eventually use 4GB.
>
> The primary constraint is cost: as Samba tests flap just too often and
> overall take just too long, we regularly run them in parallel in order
> to get results we can use. (More CPUs cost more, naturally.)
>
> > We clearly need a solution... I'm happy with the first as long as you
> > can give me a number, so we're not playing whack-a-mole by continually
> > patching the timeout upwards. Will an hour do the trick?
>
> I'm more than happy to get you a cloud VM to play with.
No need. There's no mystery here - the slowest test is just running
slower in your environment - so no analysis to do. We just need to
increase the timeout. Patch attached! :-)

Please review and maybe push...

peace & happiness,
martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ctdb-tests-Set-test-timeout-to-an-hour.patch
Type: text/x-patch
Size: 1051 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20180205/cd288d35/0001-ctdb-tests-Set-test-timeout-to-an-hour.bin>