[PATCH] Solve issues with flapping patches

Andrew Bartlett abartlet at samba.org
Fri Jun 30 20:59:30 UTC 2017


On Fri, 2017-06-30 at 09:12 -0700, Jeremy Allison wrote:
> On Fri, Jun 30, 2017 at 06:17:46AM +0200, Stefan Metzmacher via samba-technical wrote:
> > Am 30.06.2017 um 06:09 schrieb Martin Schwenke via samba-technical:
> > > 
> > > I have vaguely noticed the work but it isn't on my radar.  I need to
> > > get my code and the code it depends on into the tree.  I would like to
> > > do that without having to push 3 or 4 times, every time.
> 
> This is the critical thing. These tests are important, but if they're
> failing 75% of the time and preventing others from getting work
> upstream then it's time to mark flapping, and unmark them when
> the problem has been fixed.

I do appreciate your frustration.  The combination of increased load on
the queue and the time pressure to get things in for 4.7 really
increases the stress levels.  I too have only got one autobuild in this
week, and one last week.  That is why this has been my primary focus
since before SambaXP.

> Andrew, I would understand your position if Martin was proposing
> removing the tests, but that's not what is being requested here.

The blunt reality is that the tests listed in Martin's commit are not
the problem: they are neither at fault, nor the minimal set needed to
resolve the issue.

Indeed, now that we actually understand the issue, we realise that a
knownfail probably isn't enough: nothing short of skipping every drs,
rpc and ldap test will likely 'resolve' it.

The problem is that without read locking, any DRS replication can race
against a rename or delete, and once that race is lost, the next poor
test to rely on DRS replication gets it in the neck.
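
As a toy illustration only (this is not LDB or DRS code, and ToyStore,
replicate and the cn=... names are all made up for the example), the
shape of the race can be sketched like this: a reader enumerates
records, a rename slips in, and the reader then finds a record gone
unless a read lock holds the writer off until it has finished.

```python
class ToyStore:
    """Hypothetical in-memory store; not the real LDB API."""

    def __init__(self, records):
        self.records = dict(records)
        self.read_locked = False
        self.pending = []  # renames waiting for the read lock to drop

    def rename(self, old, new):
        if self.read_locked:
            # A writer has to wait while a reader holds the lock.
            self.pending.append((old, new))
            return
        self.records[new] = self.records.pop(old)

    def unlock(self):
        # Release the read lock, then let the waiting writers run.
        self.read_locked = False
        for old, new in self.pending:
            self.records[new] = self.records.pop(old)
        self.pending.clear()


def replicate(store, interleaved_write, use_read_lock):
    """Two-step read with a write forced in between the steps.

    Returns the names seen in step 1 that had vanished by step 2.
    """
    store.read_locked = use_read_lock
    try:
        names = sorted(store.records)   # step 1: enumerate
        interleaved_write(store)        # a concurrent rename lands here
        return [n for n in names if n not in store.records]  # step 2
    finally:
        if use_read_lock:
            store.unlock()


def do_rename(store):
    store.rename("cn=a", "cn=a-renamed")


unlocked = ToyStore({"cn=a": 1, "cn=b": 2})
lost_without_lock = replicate(unlocked, do_rename, use_read_lock=False)

locked = ToyStore({"cn=a": 1, "cn=b": 2})
lost_with_lock = replicate(locked, do_rename, use_read_lock=True)

print("lost without read lock:", lost_without_lock)
print("lost with read lock:", lost_with_lock)
```

In the real code the interleaving is down to timing rather than forced,
which is why the failure only shows up occasionally.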

It is a low-level race condition that only rarely breaks any given
test, but can break almost any of them.  That is why each test alone is
still entirely valid, yet due to the multiplier effect across the whole
build, something fails, somewhere, much too often.
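
To put rough numbers on the multiplier effect, here is a
back-of-the-envelope sketch.  It assumes every affected test fails
independently with the same small probability, which is a
simplification, and the count of 500 tests is purely hypothetical; only
the 75% figure comes from this thread.

```python
def build_failure_rate(per_test_rate, num_tests):
    """Chance that at least one of num_tests independent tests fails."""
    return 1.0 - (1.0 - per_test_rate) ** num_tests


def per_test_rate_for(build_rate, num_tests):
    """Per-test failure rate that would yield the given build failure rate."""
    return 1.0 - (1.0 - build_rate) ** (1.0 / num_tests)


NUM_TESTS = 500    # hypothetical number of tests exposed to the race
BUILD_RATE = 0.75  # the "failing 75% of the time" quoted above

p = per_test_rate_for(BUILD_RATE, NUM_TESTS)
print(f"per-test failure rate: {p:.4%}")
print(f"whole-build failure rate: {build_failure_rate(p, NUM_TESTS):.0%}")
```

Under those assumptions a per-test failure rate well below 1% is enough
to sink most builds, so no single test looks guilty on its own.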

> These tests are critical ! We must have them - but we have
> a known bug now that is being worked on so a little temporary
> relief is reasonable.
> 
> All these tests are doing right now is making it painful for
> anyone to get work into upstream.
> 
> This situation wouldn't be tolerated in commercial software
> development, and I don't think we should accept less for Samba.

Indeed it wouldn't be.  I would have expected that it would not fall to
just one engineer and a reviewer to shoulder the whole load.  But this
isn't commercial software, and nobody is here to direct anybody else.

I'm just glad metze is able to keep finding the time to review this!

In the meantime, could you review the patches metze posted, and do the
tests that he asked for?  It isn't complex, just fiddly installing new
LDB versions and building old Samba versions against them.  That is the
last blocker, and should let the patch set land.

I'm trying to keep away from Samba for the weekend and spend time with
the family, so I can be fresh for the final push next week.

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba



