Performance testing framework

Fri Aug 26 04:51:16 UTC 2016

This is another performance testing case study. The story begins with
Andrew Bartlett coming up with a series of patches to improve ldb
unindexed search performance (see the thread "[PATCH] Improve ldb
unindexed search perf"):

https://lists.samba.org/archive/samba-technical/2016-August/115750.html

To prove how brilliant these patches are, we ran the performance
testing framework over them. The results are shown in the first two
graphs at

https://www.samba.org/~dbagnall/perf-tests/ldb.html

These runs use this branch:
http://git.catalyst.net.nz/gitweb?p=samba.git;a=shortlog;h=refs/heads/faster-ldb-less-talloc-on-perf-test-2
The two patches preceding Andrew's ldb work switch the performance
tests to search-heavy tasks: one that I wrote specially, and two that
are existing selftest tests.

The thing you'll notice is that the first and last patches of the
series don't actually help performance, and may even hurt it. So we
tried the series without them, which is what you see in the last two
graphs.

The benefit of this testing is that Samba avoided two patches that
appeared at first to be works of genius but in fact introduce nothing
but churn and slower adds on large databases. At the same time we know
we get approximately a 20% speed-up on indexed searches, even though
the patch series was introduced as "[PATCH] Improve ldb unindexed
search perf".

As before, these results were produced with the tools in
http://git.catalyst.net.nz/gitweb?p=samba-cloud-autobuild.git;a=tree;h=refs/heads/perf;hb=refs/heads/perf

I think I'd like to put them in the Samba repo when they are ready.

Douglas

On 19/08/16 13:32, Douglas Bagnall wrote:
> 
> I have been mangling selftest to work in a mode that outputs
> performance timings and nothing else, and I have managed to backport
> this as far as 4.0 (which seems a long time ago to *me*). I've made a
> set of tests that exercise some simple AD stuff, and without further
> ado, here are some graphs:
> 
> https://www.samba.org/~dbagnall/perf-tests/
> 
> If for any reason you cannot see them, the graphs show that 4.0 to 4.3 are
> mostly similar, 4.4 is slow, and 4.5 is fast. Many operations
> involving many objects with linked attributes are roughly twice as
> fast in 4.5. This of course represents tester's bias: we've been
> working on that and want to show it.
> 
> The tests are plain subunit tests, so any existing test could be
> tracked (add it to selftest/perf_tests.py), but beware of tests that
> change over time or that came about with actual fixes. I chose to test
> very simple things that should work in every version.
> 
> The times are formatted as JSON. I can hear you falling off your
> chairs. JSON?! Don't we have a perfectly good serialization format in
> ASN.1? Well, the aim is to generate automatic performance charts, akin
> to http://speed.pypy.org/ or https://arewefastyet.com/ or
> http://www.astropy.org/astropy-benchmarks/. JavaScript graphing
> libraries really like JSON, and it is actually quite simple to deal
> with in Python. I would like to run the tests more often -- possibly
> on every autobuild commit, or at least daily or weekly. We have people
> using Samba in large deployments, and we would prefer to discover
> scalability regressions before they grind some vast organisation to a
> halt.
> 
> The scripts I used to run the tests and make the graphs are not in the
> attached patch, and not in a Samba tree. Instead they're in a messy
> tools repository we keep at Catalyst. (e.g.:
> http://git.catalyst.net.nz/gitweb?p=samba-cloud-autobuild.git;a=blob;f=scripts/multi-perf-test;h=64f6d2d6e6;hb=refs/heads/perf).
> They could go in the Samba repository but it gets a bit meta -- the
> script spawns and patches copies of the Samba repository, which could
> get confusing if run inside a Samba repo.
> 
> And please, if you want to see how print spool performance has fared
> over time, add your test to selftest/perf_tests.py. The tests need to
> be reasonably slow (in the order of seconds) to avoid selftest
> overhead noise, and they need to be unchanging. It doesn't matter if
> they're old and unchanging or new and unchanging and cleanly apply in
> old trees, so long as they do the same thing in all the tested
> versions.
> 
> Douglas
>