autobuild failure due to python replication tests - why ?

Jeremy Allison jra at samba.org
Tue Aug 2 02:28:57 UTC 2016


On Tue, Aug 02, 2016 at 01:39:50PM +1200, Garming Sam wrote:
> Not to disagree, but I think there's a number of points to raise
> with the approach.

Well you *are* disagreeing, but that's ok :-).

> Turning off the tests, in this case particularly, probably means
> turning off over 50 tests, each of which may or may not actually trigger
> this particular failure. I don't think blanket bans on tests
> necessarily do anyone any good, especially since tests are not
> created equally. Some are definitely more important than others and
> some are definitely not as easy to individually knownfail (or rather
> move to flapping). In part, the test is to blame, but it's not
> necessarily easy to write such a targeted test.

Problem is this specific test just wasted 3+ hours.
I don't have lots of 3+ hours left to waste (I'm a
lot older than you :-).

And there was no error message that meant I could
do *anything* or address anything. Just resubmit.

Volker has just ended up using a cron job to automatically
resubmit autobuilds when they fail.

That's one approach, but I don't like that either.

The tests *MUST* just work if they're in the autobuild
pipeline, otherwise they're just worse than useless.

My approach is a way of categorizing which ones need
to get thrown out until they're fixed.

Note I didn't say do it immediately. Keep track of
which ones are failing and add up the times it happens.

Once it gets to an unbearable level of annoyance,
out they go. I think that's reasonable.

What *ISN'T* reasonable is tests that randomly fail.

> I think there is an importance in who actually switches off the test
> in the end. And that someone should be doing that manually. I think
> it's safe to say the flapping file, and the tests in it, are quite
> easily forgotten. You could send an email to the maintainer, and
> they might just miss it and not realize their test was moved. And it
> won't be fixed. Even if they're the one that turned it off, people
> move on to other things, they have new priorities. As much as I hate
> to see intermittent failures, they're still conscious reminders to
> actually fix something, or acknowledge that something needs to be
> fixed.
> 
> I think there are some quite serious bugs in the flapping file, and
> we have already found and fixed some of them. And if we're going to
> put more things into it, we need the list of these tests to be on a
> higher profile.

All of the above is true - none of it is any help at all
other than saying "just resubmit the autobuild".

I'm not going to live with that anymore.



More information about the samba-technical mailing list