replication rename fixes in 4.1

Andrew Bartlett abartlet at samba.org
Wed Apr 16 01:52:32 MDT 2014


On Wed, 2014-04-16 at 08:40 +0200, Stefan (metze) Metzmacher wrote:
> Am 14.04.2014 01:06, schrieb Andrew Bartlett:
> > On Fri, 2014-04-04 at 09:34 +1300, Andrew Bartlett wrote:
> >> On Thu, 2014-04-03 at 16:09 +0200, Stefan (metze) Metzmacher wrote:
> >>> Am 26.03.2014 21:41, schrieb Andrew Bartlett:
> >>>> On Wed, 2014-03-26 at 20:48 +0100, Stefan (metze) Metzmacher wrote:
> >>>>> Hi Andrew,
> >>>>>
> >>>>> a while ago you fixed some rename problems during incoming replication.
> >>>>>
> >>>>> I saw a domain with 4.0 where some corruption happened.
> >>>>>
> >>>>> An object was created on DC1 then replicated to DC2.
> >>>>> Then the DCs lost their link and the object was
> >>>>> deleted on DC2 and modified on DC1.
> >>>>>
> >>>>> When the link came back, DC2 replicated the modification
> >>>>> from DC1 and renamed the object to its original dn,
> >>>>> as it always used the incoming dn,
> >>>>> which means the "cn" and "name" attributes don't match the rdn
> >>>>> in the dn anymore.
> >>>>>
> >>>>> Then the result is replicated back to DC1 and there we have the original dn
> >>>>> and original "name" attribute while "cn" is the correct value with \nDEL.
> >>>>>
> >>>>> Is this the problem you intended to fix?
> >>>>
> >>>> Yes, I think this was the kind of issue I was trying to fix, but this
> >>>> may also be an additional issue.  The main issue was around deleted
> >>>> objects un-deleting themselves (moving out from under CN=Deleted
> >>>> Objects), and instead gaining their original name from the other replica
> >>>> again. 
> >>>
> >>> Yes, they get back their original dn, while the 'name' attribute
> >>> still has the correct value. As the RDN value is replicated implicitly
> >>> it's also wrong on all but one DC (the one that created the problem;
> >>> it still has the RDN value == name value).
> >>>
> >>> Here are some patches for dbcheck and a simple bug fix I found during
> >>> development.
> >>>
> >>> I've created https://bugzilla.samba.org/show_bug.cgi?id=10536 for it.
> >>
> >> If you captured a testing sam.ldb in that state, can we please have it
> >> for our dbcheck tests?
> >>
> >> I'll shortly propose something similar for the missing-object-class
> >> test, with the database artefacts from my testing there yesterday.
> > 
> > Patch #3 explains why we could not replicate in deleted objects
> > correctly. Thanks!
> > 
> > Any chance you have a sam.ldb we can test on?  I'm happy to do the
> > transformation into the testing commit if you can get me the private/
> > and etc/ dirs.
> 
> No, I only saw it in a production sam.ldb.
> But it shouldn't be too hard to reproduce this.

What steps do you think are required?

Thanks,

> I also noticed that we still have problems in that area,
> see https://git.samba.org/autobuild.flakey/2014-04-12-1407/samba.stdout
> I saw this from time to time...

Indeed.  
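
For anyone following along, the inconsistency described earlier in the
thread (the RDN in the dn no longer matching "name") can be sketched as
a simple check.  This is only an illustration in plain Python, not the
dbcheck patches and not the ldb API: the function names are made up for
this example, the DN parsing is deliberately minimal (backslash escapes
and single-byte hex pairs only), and the sample DNs are invented.

```python
import string


def split_first_rdn(dn):
    """Return (attr, value) of the leftmost RDN, honouring backslash escapes.

    Hypothetical helper for illustration; real code would use a DN parser.
    """
    i = 0
    while i < len(dn):
        if dn[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if dn[i] == ",":
            break  # first unescaped comma ends the leftmost RDN
        i += 1
    attr, _, value = dn[:i].partition("=")
    return attr.strip(), value


def unescape(value):
    """Undo simple escapes: backslash + two hex digits, or backslash + char.

    Simplification: each hex pair is treated as one character, so
    multi-byte UTF-8 sequences are not handled.
    """
    out = []
    i = 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            pair = value[i + 1:i + 3]
            if len(pair) == 2 and all(c in string.hexdigits for c in pair):
                out.append(chr(int(pair, 16)))
                i += 3
                continue
            out.append(value[i + 1])
            i += 2
            continue
        out.append(value[i])
        i += 1
    return "".join(out)


def rdn_matches_name(dn, name):
    """True if the leftmost RDN value of dn equals the 'name' attribute."""
    _, value = split_first_rdn(dn)
    return unescape(value).lower() == name.lower()


if __name__ == "__main__":
    # Invented example: the dn went back to its original form,
    # but 'name' still carries the deleted-object value.
    dn = "CN=foo,CN=Users,DC=example,DC=com"
    name = "foo\nDEL:1234"
    print(rdn_matches_name(dn, name))
```

A record where this returns False is in the state described above: the
dn and the "name" attribute disagree about what the object is called.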

Andrew Bartlett

-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba



