[PATCH] Assert that the objectClass is always present

Andrew Bartlett abartlet at samba.org
Tue Mar 11 12:56:35 MDT 2014


On Tue, 2014-03-11 at 17:37 +0100, Arvid Requate wrote:
> Hello,
> 
> On Fri, 2014-02-28 at 17:51:43 +1300, Andrew Bartlett wrote:
> > Metze and Arvid,
> > 
> > What do you think of the attached, to try and ensure the objectClass
> > missing bug can't be propagated?
> 
> On Sat, 2014-03-01 at 07:49 +1300, Andrew Bartlett wrote:
> > If that is how this corruption happens, then while disturbing, this also
> > suggests that the fix is to force re-replication, not to delete the
> > object, as on at least one DC, the whole correct object exists.  Anyway,
> > the new assertions in the patches should help with detecting this, as we
> > won't accept the object without an objectclass any more.
> 
> 
> Ok, how will drepl behave after the new assertion has been triggered? Will 
> this block replication full stop or will it cause the object to be neglected 
> or will it try to re-replicate the object in the next replication run?
> 
> * If the patch brings replication to a grinding halt that would be a show 
> stopper.

Yes, it would do exactly that.  I agree it stops the show, and in the
case of corruption, I think that is the only safe action.
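
As a rough illustration of that behaviour (a toy Python model, not the
Samba code; apply_replicated_objects and MissingObjectClass are
hypothetical names), the check refuses the whole batch as soon as one
incoming object lacks an objectClass, so nothing from that batch is
committed:

```python
class MissingObjectClass(Exception):
    """Raised when an incoming object has no objectClass attribute."""


def apply_replicated_objects(local_db, incoming):
    """Apply a batch of replicated objects, all or nothing.

    local_db: dict mapping DN -> attribute dict (stand-in for the local DB).
    incoming: list of (dn, attrs) pairs received from the replication partner.
    """
    # Validate every object first, so one bad object blocks the whole batch
    # instead of being silently skipped or stored.
    for dn, attrs in incoming:
        if not attrs.get("objectClass"):
            raise MissingObjectClass(dn)

    # Only reached when every object carries an objectClass.
    for dn, attrs in incoming:
        local_db[dn] = dict(attrs)
```

In this model the caller sees the exception and replication stops until
someone intervenes, which is the trade-off being discussed.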

From here, what I would suggest is a new forced replication, either
overwriting local changes totally or applying the replication merge
logic, but asking the remote server for all objects by suggesting our
USN is actually 0. 
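
To make the "USN is actually 0" idea concrete, here is a minimal sketch
in Python. It assumes a simplified model in which the remote server
returns every object whose USN is above the high-water mark we report;
changes_since and forced_full_sync are hypothetical names, not Samba
functions:

```python
def changes_since(source_objects, highest_usn_seen):
    """Return the source's objects whose USN is above the caller's mark.

    source_objects: dict mapping DN -> (usn, attrs) on the remote server.
    highest_usn_seen: the high-water mark the requesting DC claims to have.
    """
    changed = [
        (usn, dn, attrs)
        for dn, (usn, attrs) in source_objects.items()
        if usn > highest_usn_seen
    ]
    # Deliver changes in USN order, oldest first.
    changed.sort(key=lambda item: item[0])
    return changed


def forced_full_sync(source_objects):
    """Ask for everything by claiming we have seen nothing (USN 0)."""
    return changes_since(source_objects, 0)
```

With a normal high-water mark only recent changes come back; with 0 the
whole object set is sent again, so a locally damaged object would be
replaced or merged from the good copy.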

> * Neglecting the entire object in replication might also cause additional 
> inconsistencies (think group membership or other backlinks), but maybe it's 
> better than forcing samba to swallow a broken object which it cannot digest 
> anyway. I cannot rate the impact such a broken object would cause in other 
> parts of samba. Segfaulting processes are definitely not good either, but the 
> impact is more isolated than a stopped replication. A segfault usually doesn't 
> go unnoticed; a stopped replication may go unnoticed for too long. And 
> the cause might be harder to identify.

This broken object cannot be allowed to persist; it will cause further
havoc.  Even if we fix the segfaults, we will just move on to other hard
errors elsewhere.

> * The third option, re-replication, would be ideal, obviously, if it doesn't 
> lead to an infinite circle.

I would suggest that this, run manually by the administrator, is the
only safe option.  It wouldn't lead to an infinite cycle because it
would need to be administrator-run.  

Naturally, this needs to be combined with actually finding and fixing
whatever can cause this in the first place - it should not be a natural
part of operating a Samba domain.

> Felix already commented on the other points. I may add that we have seen 
> missing objectclasses in four different environments.

Thanks.  The next step, I think, is to work out how the USN records become
inconsistent. 

Andrew Bartlett

-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba
