replication rename fixes in 4.1

Stefan (metze) Metzmacher metze at samba.org
Wed Apr 16 02:51:31 MDT 2014


On 16.04.2014 09:52, Andrew Bartlett wrote:
> On Wed, 2014-04-16 at 08:40 +0200, Stefan (metze) Metzmacher wrote:
>> On 14.04.2014 01:06, Andrew Bartlett wrote:
>>> On Fri, 2014-04-04 at 09:34 +1300, Andrew Bartlett wrote:
>>>> On Thu, 2014-04-03 at 16:09 +0200, Stefan (metze) Metzmacher wrote:
>>>>> On 26.03.2014 21:41, Andrew Bartlett wrote:
>>>>>> On Wed, 2014-03-26 at 20:48 +0100, Stefan (metze) Metzmacher wrote:
>>>>>>> Hi Andrew,
>>>>>>>
>>>>>>> a while ago you fixed some rename problems during incoming replication.
>>>>>>>
>>>>>>> I saw a domain with 4.0 where some corruption happened.
>>>>>>>
>>>>>>> An object was created on DC1 then replicated to DC2.
>>>>>>> Then the DCs lost their link and the object was
>>>>>>> deleted on DC2 and modified on DC1.
>>>>>>>
>>>>>>> When the link came back, DC2 replicated the modification
>>>>>>> from DC1 and renamed the object to its original dn,
>>>>>>> as we always used the incoming dn,
>>>>>>> which means the "cn" and "name" attributes don't match the rdn
>>>>>>> in the dn anymore.
>>>>>>>
>>>>>>> Then the result is replicated back to DC1 and there we have the original dn
>>>>>>> and original "name" attribute while "cn" is the correct value with \nDEL.
>>>>>>>
>>>>>>> Is this the problem you intended to fix?
>>>>>>
>>>>>> Yes, I think this was the kind of issue I was trying to fix, but this
>>>>>> may also be an additional issue.  The main issue was around deleted
>>>>>> objects un-deleting themselves (moving out from under CN=Deleted
>>>>>> Objects), and instead gaining their original name from the other replica
>>>>>> again. 
>>>>>
>>>>> Yes, they get back their original dn, while the 'name' attribute
>>>>> still has the correct value. As the RDN value is replicated implicitly
>>>>> it's also wrong on all but one DC (the one that created the problem, it
>>>>> still has the RDN value == name value).
>>>>>
>>>>> Here are some patches for dbcheck and a simple bug fix I found during
>>>>> development.
>>>>>
>>>>> I've created https://bugzilla.samba.org/show_bug.cgi?id=10536 for it.
>>>>
>>>> If you captured a testing sam.ldb in that state, can we please have it
>>>> for our dbcheck tests?
>>>>
>>>> I'll shortly propose something similar for the missing-object-class
>>>> test, with the database artefacts from my testing there yesterday.
>>>
>>> Patch #3 explains why we could not replicate in deleted objects
>>> correctly. Thanks!
>>>
>>> Any chance you have a sam.ldb we can test on?  I'm happy to do the
>>> transformation into the testing commit if you can get me the private/
>>> and etc/ dirs.
>>
>> No, I only saw it in a production sam.ldb.
>> But it shouldn't be too hard to reproduce this.
> 
> What steps do you think are required?

From the commit message of
https://git.samba.org/abartlet/samba.git/?p=abartlet/samba.git/.git;a=commitdiff;h=4576e346ad3a114bde3d5200920aaccbfb34bb6c

With older Samba versions (4.0.x) the following could happen:

- An account was created on DC1
- It was replicated to DC2
- The connection between the DCs goes offline
- The account gets modified on DC2
- The account gets deleted on DC1
- The connection becomes online again
- DC1 replicates the modification from DC2,
  this resets the dn to the original value.
  'name' and 'cn' are correct (with '\nDEL${GUID}'),
  but 'dn' is wrong.
- DC2 replicates the deletion from DC1,
  this doesn't include a changed dn as DC1
  had a bug.
  'name' is correct (with '\nDEL${GUID}'),
  but 'cn' and 'dn' are wrong.
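
To make the two broken end states above concrete, here is a rough Python
sketch of the consistency check involved: the leftmost RDN value of the dn
should equal both 'name' and the RDN attribute ('cn' here). This is not the
dbcheck code; first_rdn() and rdn_mismatches() are made-up helper names, the
entries are plain dicts rather than ldb messages, and the '\nDEL:<GUID>'
suffix format and the example dn are assumptions for illustration only.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def first_rdn(dn):
    """Return (attribute, value) of the leftmost RDN of a DN string.

    Handles backslash escapes: "\\0A" style hex pairs and "\\," style
    single characters. Hex pairs are decoded one byte at a time, so this
    sketch is only correct for ASCII values.
    """
    out = []
    i = 0
    while i < len(dn):
        ch = dn[i]
        if ch == ",":
            break  # an unescaped comma ends the first RDN
        if ch == "\\" and i + 1 < len(dn):
            pair = dn[i + 1:i + 3]
            if len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
                out.append(chr(int(pair, 16)))
                i += 3
            else:
                out.append(dn[i + 1])
                i += 2
            continue
        out.append(ch)
        i += 1
    attr, _, value = "".join(out).partition("=")
    return attr, value


def rdn_mismatches(dn, attrs):
    """Return the attributes ('name', RDN attribute) that differ from the RDN value."""
    rdn_attr, rdn_value = first_rdn(dn)
    return [key for key in ("name", rdn_attr.lower())
            if attrs.get(key) != rdn_value]


# Hypothetical values, only to mirror the scenario in the list above.
GUID = "11111111-2222-3333-4444-555555555555"
DELETED_NAME = "user1\nDEL:" + GUID
ORIGINAL_DN = "CN=user1,CN=Users,DC=example,DC=com"
DELETED_DN = "CN=user1\\0ADEL:" + GUID + ",CN=Deleted Objects,DC=example,DC=com"

# DC1 end state: 'name' and 'cn' carry the deleted value, dn was reset.
dc1 = rdn_mismatches(ORIGINAL_DN, {"name": DELETED_NAME, "cn": DELETED_NAME})
# DC2 end state: only 'name' carries the deleted value.
dc2 = rdn_mismatches(ORIGINAL_DN, {"name": DELETED_NAME, "cn": "user1"})
# Correctly deleted object: dn, 'name' and 'cn' all agree.
ok = rdn_mismatches(DELETED_DN, {"name": DELETED_NAME, "cn": DELETED_NAME})
```

With these inputs the DC1 state reports both 'name' and 'cn' as not matching
the dn, the DC2 state reports only 'name', and the correctly deleted object
reports nothing.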

metze

