BUG# 12754 [WIP] Avoid replication lockup by using USN from the start of the DRS cycle
abartlet at samba.org
Fri Apr 21 21:24:33 UTC 2017
I wrote this up with Garming earlier this week based on his analysis of
our flapping tests over Easter.
If we use the USN of an object at the time we fetch the full object to
calculate the up-to-dateness vector, we risk ignoring objects that
should appear later in the replication cycle.
This can happen if objects A, B and C have USN:
but during replication of 3 pages of results, B is modified, getting
Then we send:
This is because the server sets an uptodateness vector of 400 at B, and
the client sends it back, causing the server to ignore C at 300, even when
the USN check (alone) would have sent it.
The patch instead only sends an uptodateness vector matching the USN
seen at the time the cycle starts. This means we re-send the object
later for the higher USN.
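
To make the failure mode concrete, here is a toy simulation of one paged
cycle. It is only a sketch, not the Samba code: run_cycle(), its parameters
and the single integer standing in for the uptodateness vector are all
invented for illustration. The 400 for the modified B and the 300 for C come
from the example above; the starting USNs of 100 for A and 200 for B are
assumed values. The "patched" branch models one reading of the fix, namely
reporting the USN each object had when the cycle started.

```python
def run_cycle(start_usns, updates, use_cycle_start_usn):
    """Toy model of one paged replication cycle, one object per page.

    start_usns: {name: USN} as seen when the cycle starts.
    updates: {page_index: (name, new_usn)} writes landing before that page.
    use_cycle_start_usn: False models the bug, True models the fix.
    Returns (names sent, final client vector, current USNs).
    """
    current = dict(start_usns)
    # The list of objects to send is fixed at cycle start, ordered by USN.
    order = sorted(start_usns, key=start_usns.get)
    client_utdv = 0  # hypothetical single-integer stand-in for the vector
    sent = []
    for page, name in enumerate(order):
        # A concurrent write may land between pages.
        if page in updates:
            target, new_usn = updates[page]
            current[target] = new_usn
        # Server-side filter: skip what the client claims to have already.
        if start_usns[name] <= client_utdv:
            continue
        sent.append(name)
        # Bug: report the USN found when the full object is fetched.
        # Fix: report the USN the object had when the cycle started.
        reported = start_usns[name] if use_cycle_start_usn else current[name]
        client_utdv = max(client_utdv, reported)
    return sent, client_utdv, current


start = {"A": 100, "B": 200, "C": 300}  # 100 and 200 are assumed values
writes = {1: ("B", 400)}                # B is modified before page 2 is built

buggy = run_cycle(start, writes, use_cycle_start_usn=False)
fixed = run_cycle(start, writes, use_cycle_start_usn=True)
print(buggy[0], buggy[1])  # ['A', 'B'] 400
print(fixed[0], fixed[1])  # ['A', 'B', 'C'] 300
```

In the fixed run the client ends the cycle at 300 while B now sits at 400,
so B is picked up again on the next cycle rather than C being lost.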
We need a test for this (shouldn't take long to write). We think this
causes the WRITE_FAULT errors during DRS tests. It would also be safer
with the ldb locking patches and without nested event loops for ldb
searches (which would make the search for GUIDs more atomic).
Andrew Bartlett http://samba.org/~abartlet/
Authentication Developer, Samba Team http://samba.org
Samba Developer, Catalyst IT http://catalyst.net.nz/services/samba