selftest: mark the samba4.blackbox.dbcheck test as flapping.

Michael Adam obnox at samba.org
Fri Mar 20 01:38:21 MDT 2015


Hi Andrew,

On 2015-03-20 at 19:33 +1300, Andrew Bartlett wrote:
> On Thu, 2015-03-19 at 20:44 +0100, Michael Adam wrote:
> > The branch, master has been updated
> >        via  77ba781 selftest: mark the samba4.blackbox.dbcheck test as flapping.
> >        via  04d6ef8 selftest: mark the samba4.blackbox.samba_tool_demote test flakey.
> >       ...
> >       from  5f01bb1 build: Add talloc and samba-debug dep for gensec_external module
> > 
> > ...

> 
> My concern with this change is that now a change that permanently breaks
> this test won't ever be noticed, because we have no way to detect that
> this test has gone from flapping (once every 10 days, eg 40 autobuilds
> by your stats) to totally busted.
> 
> While trying to chase the tombstone_reanimation test certainly I've seen
> a lot of flapping tests all over the code, which made looking for that
> failure very difficult.  So, I'm not saying the current situation is
> ideal, but I am concerned.  
> 
> The problem is particularly difficult given our limited resources -
> limited resources to fix flapping tests, and limited resources to fix
> regressions introduced and hidden behind flapping entries.

Indeed. In this case, limited resources (to fix the AD-related
flapping tests) were the reason to mark these two (and the
preceding two) as flapping. This was discussed with Metze,
who suspects that there is one common root cause for this
class of failing AD/dsdb-related tests. So I did consult
with one of our resources capable of fixing such tests. :)

The main point was to not block all other people's work.

This _is_ a problem on sn-devel, maybe not so much on different
machines, and it kept autobuilds failing again and again in
the last couple of days. So much for the reasons.

> I would appreciate your further thoughts on how to address this further,
> and to find a way we can qualify getting tests back of the flapping
> list. 

Some thoughts:

- In the long run we should separate the autobuild that
  gates upstream from the intermittent autobuild failure check.
  I.e. have a sane, defined environment for autobuild-to-push
  jobs (not necessarily a very beefy system) and one ideally
  beefy system (like our sn-devel) that runs the intermittent
  failure test runs in an ideally highly contended environment.
  This has to be thought through.

- What can be done right now is this:
  We can have the intermittent failure run use a different
  flapping file (maybe even an empty one) so that we can check
  there whether the tests marked as flapping are still flapping.

  How does that sound as an idea for a start?
  (This is in fact a first step in decoupling the push-gate from
  the intermittent check.)
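
To make the second point concrete, here is a minimal sketch of the
idea, not the actual selftest code. It assumes a flapping file is a
list of regular expressions, one per line, with "#" comments; the
function names and the unrelated test name are made up for
illustration.

```python
import re

def load_flapping(text):
    """Compile one regex per non-empty, non-comment line."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(re.compile(line))
    return patterns

def gating_failures(failed_tests, patterns):
    """Return the failures that are NOT excused by the flapping list."""
    return [t for t in failed_tests
            if not any(p.search(t) for p in patterns)]

# Hypothetical flapping file used by the push gate.
PUSH_GATE_FLAPPING = r"""
# AD/dsdb related tests that fail intermittently
^samba4\.blackbox\.dbcheck
^samba4\.blackbox\.samba_tool_demote
"""

# Hypothetical flapping file for the intermittent check: empty,
# so every failure is reported.
INTERMITTENT_FLAPPING = ""

# Failures from one imaginary test run.
failed = ["samba4.blackbox.dbcheck", "example.unrelated.test"]

gate = gating_failures(failed, load_flapping(PUSH_GATE_FLAPPING))
check = gating_failures(failed, load_flapping(INTERMITTENT_FLAPPING))

print("push gate sees:         ", gate)
print("intermittent check sees:", check)
```

With the same set of failures, the push gate ignores the dbcheck
failure while the intermittent check still reports it, so a test
that goes from flapping to permanently broken would show up there.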

 Cheers - Michael