replication rename fixes in 4.1

Stefan (metze) Metzmacher metze at samba.org
Wed Apr 16 00:40:24 MDT 2014


On 14.04.2014 01:06, Andrew Bartlett wrote:
> On Fri, 2014-04-04 at 09:34 +1300, Andrew Bartlett wrote:
>> On Thu, 2014-04-03 at 16:09 +0200, Stefan (metze) Metzmacher wrote:
>>> On 26.03.2014 21:41, Andrew Bartlett wrote:
>>>> On Wed, 2014-03-26 at 20:48 +0100, Stefan (metze) Metzmacher wrote:
>>>>> Hi Andrew,
>>>>>
>>>>> a while ago you fixed some rename problems during incoming replication.
>>>>>
>>>>> I saw a domain with 4.0 where some corruption happened.
>>>>>
>>>>> An object was created on DC1 then replicated to DC2.
>>>>> Then the DCs lost their link and the object was
>>>>> deleted on DC2 and modified on DC1.
>>>>>
>>>>> When the link came back, DC2 replicated the modification
>>>>> from DC1 and renamed the object back to its original dn,
>>>>> as it always used the incoming dn,
>>>>> which means the "cn" and "name" attributes don't match the rdn
>>>>> in the dn anymore.
>>>>>
>>>>> Then the result is replicated back to DC1 and there we have the original dn
>>>>> and original "name" attribute while "cn" is the correct value with \nDEL.
>>>>>
>>>>> Is this the problem you intended to fix?
>>>>
>>>> Yes, I think this was the kind of issue I was trying to fix, but this
>>>> may also be an additional issue.  The main issue was around deleted
>>>> objects un-deleting themselves (moving out from under CN=Deleted
>>>> Objects), and instead gaining their original name from the other replica
>>>> again. 
>>>
>>> Yes, they get back their original dn, while the 'name' attribute
>>> still has the correct value. As the RDN value is replicated implicitly
>>> it's also wrong on all but one DC (the one that created the problem, it
>>> still has the RDN value == name value).
>>>
>>> Here are some patches for dbcheck and a simple bug fix I found during
>>> development.
>>>
>>> I've created https://bugzilla.samba.org/show_bug.cgi?id=10536 for it.
>>
>> If you captured a testing sam.ldb in that state, can we please have it
>> for our dbcheck tests?
>>
>> I'll shortly propose something similar for the missing-object-class
>> test, with the database artefacts from my testing there yesterday.
> 
> Patch #3 explains why we could not replicate in deleted objects
> correctly. Thanks!
> 
> Any chance you got the sam.ldb we can test on?  I'm happy to do the
> transformation into the testing commit if you can get me the private/
> and etc/ dirs.

No, I only saw it in a production sam.ldb.
But it shouldn't be too hard to reproduce this.
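
For anyone trying to reproduce it, the inconsistency described above
(the rdn value in the dn no longer matching the "name" attribute) could
be checked with something like the sketch below. This is not the dbcheck
code; first_rdn() and rdn_matches_name() are hypothetical helpers, the
dn values are made up, and the parser assumes single-valued rdns and
ASCII-only hex escapes.

```python
import string

HEX = set(string.hexdigits)

def first_rdn(dn):
    """Return (attribute, unescaped value) of the leftmost rdn of a dn string."""
    out = []
    i = 0
    while i < len(dn):
        c = dn[i]
        if c == '\\' and i + 1 < len(dn):
            pair = dn[i + 1:i + 3]
            if len(pair) == 2 and all(ch in HEX for ch in pair):
                # \XX hex escape, e.g. \0A for the newline in a deleted rdn
                out.append(chr(int(pair, 16)))
                i += 3
            else:
                # \<char> escape, e.g. an escaped comma
                out.append(dn[i + 1])
                i += 2
            continue
        if c == ',':
            break  # an unescaped comma ends the first rdn
        out.append(c)
        i += 1
    attr, _, value = ''.join(out).partition('=')
    return attr, value

def rdn_matches_name(dn, name):
    """True if the rdn value in the dn equals the "name" attribute value."""
    return first_rdn(dn)[1] == name

# Hypothetical values: "name" still carries the deleted form ...
name = "foo\nDEL:1234"
# ... which matches a dn under CN=Deleted Objects ...
good_dn = "CN=foo\\0ADEL:1234,CN=Deleted Objects,DC=example,DC=com"
# ... but not the original dn the object was renamed back to.
bad_dn = "CN=foo,CN=Users,DC=example,DC=com"

print(rdn_matches_name(good_dn, name))
print(rdn_matches_name(bad_dn, name))
```

The second case is the state I saw: the object is back at its original
dn while "name" still has the deleted value.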

I also noticed that we still have problems in that area,
see https://git.samba.org/autobuild.flakey/2014-04-12-1407/samba.stdout
I saw this from time to time...

metze

