selftest: mark the samba4.blackbox.dbcheck test as flapping.
jelmer at samba.org
Fri Mar 20 06:17:49 MDT 2015
On Fri, Mar 20, 2015 at 10:14:51PM +1300, Andrew Bartlett wrote:
> On Fri, 2015-03-20 at 08:38 +0100, Michael Adam wrote:
> > Hi Andrew,
> > On 2015-03-20 at 19:33 +1300, Andrew Bartlett wrote:
> > > On Thu, 2015-03-19 at 20:44 +0100, Michael Adam wrote:
> > > > The branch, master has been updated
> > > > via 77ba781 selftest: mark the samba4.blackbox.dbcheck test as flapping.
> > > > via 04d6ef8 selftest: mark the samba4.blackbox.samba_tool_demote test flakey.
> > > > ...
> > > > from 5f01bb1 build: Add talloc and samba-debug dep for gensec_external module
> > > >
> > > > ...
> > >
> > > My concern with this change is that now a change that permanently breaks
> > > this test won't ever be noticed, because we have no way to detect that
> > > this test has gone from flapping (once every 10 days, eg 40 autobuilds
> > > by your stats) to totally busted.
> > >
> > > While trying to chase the tombstone_reanimation test certainly I've seen
> > > a lot of flapping tests all over the code, which made looking for that
> > > failure very difficult. So, I'm not saying the current situation is
> > > ideal, but I am concerned.
> > >
> > > The problem is particularly difficult given our limited resources -
> > > limited resources to fix flapping tests, and limited resources to fix
> > > regressions introduced and hidden behind flapping entries.
> > Indeed. In this case, limited resources (to fix the AD-related
> > flapping tests) were the reason to mark these two (and the
> > preceding two) as flapping. This was discussed with Metze,
> > who has the suspicion that there is one common root cause
> > for this class of failing AD/dsdb related tests. So I did
> > consult with one of our resources capable of fixing such tests. :)
> > The main point was to not block all other people's work.
> > This _is_ a problem on sn-devel, maybe not so much on different
> > machines, and it kept autobuilds failing again and again in
> > the last couple of days. So much for the reasons.
> > > I would appreciate your further thoughts on how to address this further,
> > > and to find a way we can qualify getting tests back off the flapping
> > > list.
> > Some thoughts:
> > - In the long run we should separate the autobuild that
> > gates upstream from the intermittent autobuild failure check.
> > I.e. have a sane, defined environment for autobuild-to-push
> > jobs (not necessarily a very beefy system) and one ideally
> > beefy system (like our sn-devel) that runs the intermittent
> > failure test runs in an ideally highly contended environment.
> > This has to be thought through.
> > - What can be done right now is this:
> > We can have the intermittent failure run use a different flapping
> > file (maybe even an empty one) so that we can check here if
> > the tests marked as flapping are flapping still.
> > How does that sound as an idea for a start?
> > (This is in fact a first step in decoupling the push-gate from
> > the intermittent check.)
> We need some way to detect that a 'flapping' test isn't flapping, it's
> stuck one way or the other. That is, something that is recording those
> results, and noticing which tests always pass, which tests always fail,
> and ensuring that none of these are in the flapping list.
This could be done by grepping over the subunit logs that are
available in the build farm. This should not be too hard to script:
* find all log files later than a certain date
* grep for "testsuite-success: bla" and "testsuite-xfail: " in the output
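
A rough sketch of what such a script could look like, in Python. This is
only an illustration under several assumptions: that each result appears on
its own line as "testsuite-<outcome>: <name>", that a flapping test which
fails is reported as xfail, and that the flapping list is a set of regular
expressions matched against test names. The function names (recent_logs,
tally_results, stuck_tests) are made up for this example and are not part
of selftest or the build farm code.

```python
import os
import re
from collections import Counter, defaultdict

# Assumed line format, extrapolated from the two markers named above.
RESULT_RE = re.compile(r"^testsuite-(success|failure|error|xfail): *(.+?)\s*$")


def recent_logs(root, since_epoch):
    """Yield paths of files under root modified at or after since_epoch."""
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            path = os.path.join(dirpath, filename)
            if os.path.getmtime(path) >= since_epoch:
                yield path


def tally_results(lines, counts=None):
    """Count outcomes per testsuite name from an iterable of log lines."""
    if counts is None:
        counts = defaultdict(Counter)
    for line in lines:
        match = RESULT_RE.match(line)
        if match:
            outcome, name = match.groups()
            counts[name][outcome] += 1
    return counts


def stuck_tests(counts, flapping_patterns):
    """Report flapping-listed tests that only ever passed or only ever failed."""
    compiled = [re.compile(p) for p in flapping_patterns]
    stuck = {}
    for name, outcomes in counts.items():
        if not any(p.search(name) for p in compiled):
            continue  # not on the flapping list, nothing to check
        passed = outcomes["success"]
        # Assumption: xfail is how a failing flapping test shows up.
        failed = outcomes["failure"] + outcomes["error"] + outcomes["xfail"]
        if passed and not failed:
            stuck[name] = "always-pass"
        elif failed and not passed:
            stuck[name] = "always-fail"
    return stuck
```

To use it, one would loop over recent_logs(), feed each opened file to
tally_results() with the same counts dictionary, and then call
stuck_tests() with the patterns from the flapping file. Anything it
reports as "always-pass" is a candidate for coming off the list, and
anything "always-fail" has probably gone from flapping to broken.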