The responsibility for flapping tests
Matthieu Patou
mat at matws.net
Sun May 31 19:23:30 MDT 2015
Is it the root cause of the dbcheck failure ?
Matthieu.
On 05/30/2015 04:12 PM, Kamen Mazdrashki wrote:
> Hi Andrew,
>
> Could you please take a look at attached patch - it disables
> tombstone_reanimation module.
> It is still compiled, but skips registration and I have also removed it
> from the list of dsdb modules.
> Running tombstone_reanimation test fails nicely :)
>
> Cheers,
> kamen
>
> On Thu, May 28, 2015 at 2:40 PM, Kamen Mazdrashki <kamenim at samba.org> wrote:
>
>>
>> On Thu, May 28, 2015 at 6:10 AM, Andrew Bartlett <abartlet at samba.org>
>> wrote:
>>
>>> On Thu, 2015-05-28 at 01:21 +0300, Kamen Mazdrashki wrote:
>>>> On Wed, May 27, 2015 at 5:56 AM, Andrew Bartlett <abartlet at samba.org>
>>> wrote:
>>>>> On Wed, 2015-05-27 at 05:43 +0300, Kamen Mazdrashki wrote:
>>>>>>
>>>>>> On Tue, May 26, 2015 at 4:53 PM, Andrew Bartlett <
>>> abartlet at samba.org>
>>>>>> wrote:
>>>>>> On Tue, 2015-05-26 at 05:46 -0700, Matthieu Patou wrote:
>>>>>> > On 05/25/2015 09:54 PM, Andrew Bartlett wrote:
>>>>>> > > On Mon, 2015-05-25 at 21:50 -0700, Matthieu Patou wrote:
>>>>>> > >> Hello All,
>>>>>> > >>
>>>>>> > >> Please find attached 2 patches that reduce the time of
>>>>>> dbcheck by 10%,
>>>>>> > >> they avoid calling getwellknowobjects when not needed
>>>>>> (ie. when dealing
>>>>>> > >> with the schema NC) and also cache the result of the
>>>>>> function as well
>>>>>> > >> known object didn't change.
>>>>>> > >>
>>>>>> > >> Pass make test.
>>>>>> > > As we got into the unfortunate situation where dbcheck
>>> was
>>>>>> marked as
>>>>>> > > flapping, sadly this means this code is untested in
>>> 'make
>>>>>> test'. :-(
>>>>>> > >
>>>>>> > > Andrew Bartlett
>>>>>> > >
>>>>>> > Ok but still it doesn't mean that those changes are wrong.
>>>>>> > One of those is to cache the wellknownobjects when found;
>>> as
>>>>>> the name
>>>>>> > imply the wellknownobjects doesn't change. Also I think I
>>>>>> did several
>>>>>> > rounds of make tests without getting an error so if let's
>>>>>> say 3 rounds
>>>>>> > of make test didn't show any regression we can treat this
>>> as
>>>>>> a reliable
>>>>>> > patch no ?
>>>>>>
>>>>>> No, because tests in the flapping file are essentially
>>> skipped
>>>>>> - we have
>>>>>> no idea if they succeeded or failed, or if they continue to.
>>>>>>
>>>>>> Before we get this in, we have to revert that unfortunate
>>>>>> addition, and
>>>>>> to do that we probably need to revert whatever made it flap
>>>>>> (presumably
>>>>>> the tombstone reanimation work).
>>>>>>
>>>>>> I realise this is totally unfair. Untested code is really
>>>>>> bad, but
>>>>>> changing untested code is even worse.
>>>>>>
>>>>>> The only way forward for this in the long term would be to
>>>>>> declare that
>>>>>> we will remove, after a timeout of two months, each entry
>>> from
>>>>>> flapping,
>>>>>> and that as part of this we will review those items to
>>> decide
>>>>>> that they
>>>>>> are either pointless (and so remove the pretence that they
>>>>>> have value),
>>>>>> or fix them not to flap. That, and better pre-commit tests
>>>>>> that make
>>>>>> flapping tests harder to get in, could be a better
>>> compromise
>>>>>> to release
>>>>>> the pressure of failed autobuilds in the short-term without
>>>>>> abandoning
>>>>>> tests for the long term.
>>>>>>
>>>>>> That is, the current process where tests get added to
>>> flapping
>>>>>> 'one
>>>>>> way', just leaves the costs of the underlying failure on our
>>>>>> users and
>>>>>> never on us as developers.
>>>>>>
>>>>>> Like portability, it will only change if work is invested.
>>>>>>
>>>>>> When I'm over jet-lag, I will propose to remove tombstone
>>>>>> reanimation as
>>>>>> a supported feature (by removing the module from the default
>>>>>> stack), and
>>>>>> knownfail or skip the associated tests.
>>>>>>
>>>>>> I am afraid that just removing the module won't remove all changes
>>>>>> introduced
>>>>>> for tombstone reanimation :). Funny thing is, as Matthieu mentioned
>>> he
>>>>>> did for his
>>>>>> optimization work, that I have run the test suite for tombstone
>>>>>> reanimation for so
>>>>>> many times without a failure. Which leads me to believe that there
>>> is
>>>>>> a subtle
>>>>>> interference between tests/environments. afaik, we don't run every
>>>>>> test in
>>>>>> a cleanly provisioned environment, do we?
>>>>> Correct, the tests run in a common environment, and the interaction
>>>>> between the various DCs that are running at the time can impact on the
>>>>> results. I would suggest that for tests like this one, that we should
>>>>> actually force that, by having it run against promoted_dc or
>>> vampire_dc,
>>>>> as well as a standalone server such as fl2008dc so that the
>>> replication
>>>>> behaviours and standalone behaviours are captured.
>>>>
>>>> This would be great - after all, tombstone reanimation has nothing to do
>>>> with replication test and we have different set of tests to show deleted
>>>> objects are replicated correctly.
>>>> How can I achieve that? Should I move Tombstone reanimation test
>>>> earlier in tests.py?
>>>>
>>>>
>>>>
>>>>
>>>>> But regarding this test and the functionality it was intended to
>>> prove,
>>>>> what would you suggest? The cleanest solution would be to revert the
>>>>> lot until we understand it, but that also seems rather drastic.
>>>>>
>>>>> It seems reasonable to at least remove the module and test, even if
>>> that
>>>>> won't be a full revert, and then put it back when the full make test
>>> is
>>>>> shown not to flap.
>>>>>
>>>> You are totally right. I thought you want to revert the whole patch set,
>>>> sorry!
>>>> Disabling just this module is *very* reasonable approach indeed.
>>> I've found myself a bit busy, can you knock up the patch?
>>>
>> I will out of town, getting back tomorrow and i will post a patch.
>> Hopefully I can play
>> with reordering the tests to be executed in a 'cleaner' environment (I
>> count on your
>> help heavily there thought :)
>>
>> Cheers,
>> Kamen
>>
>>
>>> Thanks,
>>>
>>> --
>>> Andrew Bartlett
>>> http://samba.org/~abartlet/
>>> Authentication Developer, Samba Team http://samba.org
>>> Samba Developer, Catalyst IT
>>> http://catalyst.net.nz/services/samba
>>>
>>>
>>>
>>>
>>>
More information about the samba-technical
mailing list