[PATCH] Avoid replication corruption on the RODC

Garming Sam garming at samba.org
Wed Feb 14 00:48:01 UTC 2018


Hi,

Attached are some patches to fix this particular issue in the RODC as
well as the tests. What has happened is that an error code has not been
set in the failure case on an RODC when a conflict record must be
created (which the RODC cannot do alone). This meant that the RODC
silently skips the object or update, but still updates the highwatermark
and up-to-dateness vector as if it had applied the update.

Please review and push.


Cheers,

Garming

On 12/02/18 10:56, Garming Sam wrote:
> Hi metze,
>
> We've learnt a little more about this error (as I can reproduce it
> reliably with our experimental LMDB backend). Seems to be an error which
> only occurs on the RODC where an object has skipped an add without
> returning any errors. I'm currently running some autobuilds to see if a
> possible fix is correct.
>
> (Also the issue seems to only occur in autobuild with this particular
> test, it presumably has to do with the ordering it creates)
>
>
> Cheers,
>
> Garming
>
> On 12/01/18 05:41, Stefan Metzmacher wrote:
>> Hi,
>>
>> has someone seen this before?
>> It's from my current autobuild.
>>
>> I didn't expect that to happen anymore or do we have some
>> strange magic to inject invalid stuff into the tests?
>>
>> metze
>>
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_spnupdate:
>> WARNING: The "server schannel" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_dnsupdate:
>> WARNING: The "server schannel" option is deprecated
>> replmd_check_singleval_la_conflict: Link conflict for managedBy
>> attribute on
>> OU=src-5372430,OU=test_link_conflict895631,DC=samba,DC=example,DC=com
>> replmd_check_singleval_la_conflict: Using existing target
>> OU=target2-5203693,OU=test_link_conflict895631,DC=samba,DC=example,DC=com,
>> over received value
>> OU=target1-236770,OU=test_link_conflict895631,DC=samba,DC=example,DC=com
>> ldb: No objectClass found in replPropertyMetaData for
>> CN=src-3467800\0ADEL:b0b7ed0d-4a90-4ab6-8b98-ee6e523898b1,CN=Deleted
>> Objects,DC=samba,DC=example,DC=com!
>>
>> Failed to apply records: replmd_replicated_apply_add: error during DRS
>> repl ADD: No objectClass found in replPropertyMetaData for
>> CN=src-3467800\0ADEL:b0b7ed0d-4a90-4ab6-8b98-ee6e523898b1,CN=Deleted
>> Objects,DC=samba,DC=example,DC=com!
>> : Object class violation
>> Failed to commit objects:
>> WERR_GEN_FAILURE/NT_STATUS_INVALID_NETWORK_RESPONSE
>> replmd_check_singleval_la_conflict: Link conflict for managedBy
>> attribute on
>> OU=src-6452532,OU=test_link_conflict895631,DC=samba,DC=example,DC=com
>> replmd_check_singleval_la_conflict: Using received value
>> OU=target2-7345816,OU=test_link_conflict895631,DC=samba,DC=example,DC=com,
>> over existing target
>> OU=target1-7523941,OU=test_link_conflict895631,DC=samba,DC=example,DC=com
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_kcc: WARNING:
>> The "server schannel" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_spnupdate:
>> WARNING: The "lsa over netlogon" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_spnupdate:
>> WARNING: The "server schannel" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_dnsupdate:
>> WARNING: The "lsa over netlogon" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_dnsupdate:
>> WARNING: The "server schannel" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_kcc: WARNING:
>> The "lsa over netlogon" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_kcc: WARNING:
>> The "server schannel" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_spnupdate:
>> WARNING: The "server schannel" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_dnsupdate:
>> WARNING: The "server schannel" option is deprecated
>> /memdisk/slow/a/b4165738/samba/source4/scripting/bin/samba_kcc: WARNING:
>> The "server schannel" option is deprecated
>> Missing target object when we didn't set the DRSUAPI_DRS_GET_TGT flag,
>> retrying
>> replmd_allow_missing_target:
>> CN=target-4296305\0ADEL:651d4ea5-3894-4d77-a25e-74a1597f7f63,CN=Deleted
>> Objects,DC=samba,DC=example,DC=com is Deleted but up to date. Ignoring
>> link from
>> CN=src-791470,OU=test_link_conflict4456064,DC=samba,DC=example,DC=com
>>
>> ==> samba.stdout <==
>> [2187(16020)/2234 at 3h37m8s]
>> samba4.drs.repl_schema.python(vampire_dc)(vampire_dc)
>> [2188(16029)/2234 at 3h37m51s]
>> samba4.drs.repl_schema.python(promoted_dc)(promoted_dc)
>>
>>
>

-------------- next part --------------
From 5c82cdb0540b40a80eb7e507beea6ef0e5208d63 Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Wed, 14 Feb 2018 13:26:35 +1300
Subject: [PATCH 1/6] tests/replica_sync: Add some additional replication in
 setUp

This should avoid some failures due to stale objects.

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 source4/torture/drs/python/replica_sync.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/source4/torture/drs/python/replica_sync.py b/source4/torture/drs/python/replica_sync.py
index 93407df..927a085 100644
--- a/source4/torture/drs/python/replica_sync.py
+++ b/source4/torture/drs/python/replica_sync.py
@@ -42,6 +42,8 @@ class DrsReplicaSyncTestCase(drs_base.DrsBaseTestCase):
 
     def setUp(self):
         super(DrsReplicaSyncTestCase, self).setUp()
+        self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1, forced=True)
+        self._net_drs_replicate(DC=self.dnsname_dc1, fromDC=self.dnsname_dc2, forced=True)
         self.ou1 = None
         self.ou2 = None
 
-- 
1.9.1


From 160035bc75f1649b47e870d6c3892a451bdf6686 Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Wed, 14 Feb 2018 13:27:59 +1300
Subject: [PATCH 2/6] tests/drs_base: Allow the net drs replicate to try with a
 single object

This eventually passes down the replicate single object exop.

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 source4/torture/drs/python/drs_base.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/source4/torture/drs/python/drs_base.py b/source4/torture/drs/python/drs_base.py
index 10f2e63..66a0d8d 100644
--- a/source4/torture/drs/python/drs_base.py
+++ b/source4/torture/drs/python/drs_base.py
@@ -120,7 +120,8 @@ class DrsBaseTestCase(SambaToolCmdTest):
         # bin/samba-tool drs <drs_command> <cmdline_auth>
         return ["drs", drs_command, cmdline_auth]
 
-    def _net_drs_replicate(self, DC, fromDC, nc_dn=None, forced=True, local=False, full_sync=False):
+    def _net_drs_replicate(self, DC, fromDC, nc_dn=None, forced=True,
+                           local=False, full_sync=False, single=False):
         if nc_dn is None:
             nc_dn = self.domain_dn
         # make base command line
@@ -134,6 +135,8 @@ class DrsBaseTestCase(SambaToolCmdTest):
             samba_tool_cmdline += ["--local"]
         if full_sync:
             samba_tool_cmdline += ["--full-sync"]
+        if single:
+            samba_tool_cmdline += ["--single-object"]
 
         (result, out, err) = self.runsubcmd(*samba_tool_cmdline)
         self.assertCmdSuccess(result, out, err)
-- 
1.9.1


From 3bd05a1975ef84b036b744b97f5d3961d7df6c2d Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Wed, 14 Feb 2018 13:27:27 +1300
Subject: [PATCH 3/6] selftest: Add RODC variables to list of those exported

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 selftest/selftest.pl | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/selftest/selftest.pl b/selftest/selftest.pl
index ff19d59..0e56e6a 100755
--- a/selftest/selftest.pl
+++ b/selftest/selftest.pl
@@ -835,6 +835,12 @@ my @exported_envvars = (
 	"VAMPIRE_DC_NETBIOSNAME",
 	"VAMPIRE_DC_NETBIOSALIAS",
 
+	# domain controller stuff for RODC
+	"RODC_DC_SERVER",
+	"RODC_DC_SERVER_IP",
+	"RODC_DC_SERVER_IPV6",
+	"RODC_DC_NETBIOSNAME",
+
 	# domain controller stuff for FL 2000 Vampired DC
 	"VAMPIRE_2000_DC_SERVER",
 	"VAMPIRE_2000_DC_SERVER_IP",
-- 
1.9.1


From abb1082a414b641c0c21e9deac33dc354949b1fd Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Wed, 14 Feb 2018 13:30:26 +1300
Subject: [PATCH 4/6] tests/replica_sync_rodc: Test conflict handling on an
 RODC

There are two cases we are interested in:

1) RODC receives two identical DNs which conflict
2) RODC receives a rename to a DN which already exists

Currently these issues are ignored, but the UDV and HWM are being
updated, leading to objects/updates being skipped.

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 selftest/knownfail.d/rodc_repl                  |   2 +
 source4/selftest/tests.py                       |   6 +
 source4/torture/drs/python/replica_sync_rodc.py | 156 ++++++++++++++++++++++++
 3 files changed, 164 insertions(+)
 create mode 100644 selftest/knownfail.d/rodc_repl
 create mode 100644 source4/torture/drs/python/replica_sync_rodc.py

diff --git a/selftest/knownfail.d/rodc_repl b/selftest/knownfail.d/rodc_repl
new file mode 100644
index 0000000..f6313c1
--- /dev/null
+++ b/selftest/knownfail.d/rodc_repl
@@ -0,0 +1,2 @@
+^samba4.drs.replica_sync_rodc.python.rodc..replica_sync_rodc.DrsReplicaSyncTestCase.test_ReplConflictsRODC.rodc
+^samba4.drs.replica_sync_rodc.python.rodc..replica_sync_rodc.DrsReplicaSyncTestCase.test_ReplConflictsRODCRename.rodc
diff --git a/source4/selftest/tests.py b/source4/selftest/tests.py
index f5ff906..621a613 100755
--- a/source4/selftest/tests.py
+++ b/source4/selftest/tests.py
@@ -769,6 +769,12 @@ plantestsuite_loadlist("samba4.ldap.rodc_rwdc.python(rodc)", "rodc:local",
                         '$SERVER', '$DC_SERVER', '-U"$USERNAME%$PASSWORD"',
                         '--workgroup=$DOMAIN', '$LOADLIST', '$LISTOPT'])
 
+planoldpythontestsuite("rodc:local", "replica_sync_rodc",
+                       extra_path=[os.path.join(samba4srcdir, 'torture/drs/python')],
+		       name="samba4.drs.replica_sync_rodc.python(rodc)",
+		       environ={'DC1': '$DC_SERVER', 'DC2': '$RODC_DC_SERVER'},
+		       extra_args=['-U$DOMAIN/$DC_USERNAME%$DC_PASSWORD'])
+
 for env in ["ad_dc_ntvfs", "fl2000dc", "fl2003dc", "fl2008r2dc"]:
     plantestsuite_loadlist("samba4.ldap_schema.python(%s)" % env, env, [python, os.path.join(samba4srcdir, "dsdb/tests/python/ldap_schema.py"), '$SERVER', '-U"$USERNAME%$PASSWORD"', '--workgroup=$DOMAIN', '$LOADLIST', '$LISTOPT'])
     plantestsuite("samba4.ldap.possibleInferiors.python(%s)" % env, env, [python, os.path.join(samba4srcdir, "dsdb/samdb/ldb_modules/tests/possibleinferiors.py"), "ldap://$SERVER", '-U"$USERNAME%$PASSWORD"', "-W$DOMAIN"])
diff --git a/source4/torture/drs/python/replica_sync_rodc.py b/source4/torture/drs/python/replica_sync_rodc.py
new file mode 100644
index 0000000..907cef4
--- /dev/null
+++ b/source4/torture/drs/python/replica_sync_rodc.py
@@ -0,0 +1,156 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Test conflict scenarios on the RODC
+#
+# Copyright (C) Kamen Mazdrashki <kamenim at samba.org> 2011
+# Copyright (C) Catalyst.NET Ltd 2018
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+#
+# Usage:
+#  export DC1=dc1_dns_name
+#  export DC2=dc2_dns_name (RODC)
+#  export SUBUNITRUN=$samba4srcdir/scripting/bin/subunitrun
+#  PYTHONPATH="$PYTHONPATH:$samba4srcdir/torture/drs/python" $SUBUNITRUN replica_sync_rodc -U"$DOMAIN/$DC_USERNAME"%"$DC_PASSWORD"
+#
+
+import drs_base
+import samba.tests
+import time
+import ldb
+
+from ldb import (
+    SCOPE_BASE, LdbError, ERR_NO_SUCH_OBJECT)
+
+class DrsReplicaSyncTestCase(drs_base.DrsBaseTestCase):
+    """Intended as a black box test case for DsReplicaSync
+       implementation. It should test the behavior of this
+       case in cases when inbound replication is disabled"""
+
+    def setUp(self):
+        super(DrsReplicaSyncTestCase, self).setUp()
+        self._disable_inbound_repl(self.dnsname_dc1)
+        self._disable_all_repl(self.dnsname_dc1)
+        self.ou1 = None
+        self.ou2 = None
+
+    def tearDown(self):
+        # re-enable replication
+        self._enable_inbound_repl(self.dnsname_dc1)
+        self._enable_all_repl(self.dnsname_dc1)
+
+        super(DrsReplicaSyncTestCase, self).tearDown()
+
+    def _create_ou(self, samdb, name):
+        ldif = """
+dn: %s,%s
+objectClass: organizationalUnit
+""" % (name, self.domain_dn)
+        samdb.add_ldif(ldif)
+        res = samdb.search(base="%s,%s" % (name, self.domain_dn),
+                           scope=SCOPE_BASE, attrs=["objectGUID"])
+        return self._GUID_string(res[0]["objectGUID"][0])
+
+    def _check_deleted(self, sam_ldb, guid):
+        # search the user by guid as it may be deleted
+        res = sam_ldb.search(base='<GUID=%s>' % guid,
+                             controls=["show_deleted:1"],
+                             attrs=["isDeleted", "objectCategory", "ou"])
+        self.assertEquals(len(res), 1)
+        ou_cur = res[0]
+        # Deleted Object base DN
+        dodn = self._deleted_objects_dn(sam_ldb)
+        # now check properties of the user
+        name_cur  = ou_cur["ou"][0]
+        self.assertEquals(ou_cur["isDeleted"][0],"TRUE")
+        self.assertTrue(not("objectCategory" in ou_cur))
+        self.assertTrue(dodn in str(ou_cur["dn"]),
+                        "OU %s is deleted but it is not located under %s!" % (name_cur, dodn))
+
+
+    def test_ReplConflictsRODC(self):
+        """Tests that objects created in conflict become conflict DNs"""
+        # Replicate all objects to RODC beforehand
+        self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1, forced=True)
+
+        # Create conflicting objects on DC1 and DC2, with DC1 object created first
+        name = "OU=Test RODC Conflict"
+        self.ou1 = self._create_ou(self.ldb_dc1, name)
+
+        # Replicate single object
+        self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1,
+                                nc_dn="%s,%s" % (name, self.domain_dn),
+                                local=True, single=True, forced=True)
+
+        # Delete the object, so another can be added
+        self.ldb_dc1.delete('<GUID=%s>' % self.ou1)
+
+        # Create a conflicting DN as it would appear to the RODC
+        self.ou2 = self._create_ou(self.ldb_dc1, name)
+
+        try:
+            self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1,
+                                    nc_dn="%s,%s" % (name, self.domain_dn),
+                                    local=True, single=True, forced=True)
+        except:
+            # Cleanup the object
+            self.ldb_dc1.delete('<GUID=%s>' % self.ou2)
+            return
+
+        # Replicate cannot succeed, HWM would be updated incorrectly.
+        self.fail("DRS replicate should have failed.")
+
+    def test_ReplConflictsRODCRename(self):
+        """Tests that objects created in conflict become conflict DNs"""
+        # Replicate all objects to RODC beforehand
+        self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1, forced=True)
+
+        # Create conflicting objects on DC1 and DC2, with DC1 object created first
+        name = "OU=Test RODC Rename Conflict"
+        self.ou1 = self._create_ou(self.ldb_dc1, name)
+
+        # Replicate single object
+        self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1,
+                                nc_dn="%s,%s" % (name, self.domain_dn),
+                                local=True, single=True, forced=True)
+
+        # Create a non-conflicting DN to rename as conflicting
+        free_name = "OU=Test RODC Rename No Conflict"
+        self.ou2 = self._create_ou(self.ldb_dc1, free_name)
+
+        self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1,
+                                nc_dn="%s,%s" % (free_name, self.domain_dn),
+                                local=True, single=True, forced=True)
+
+        # Delete the object, so we can rename freely
+        # DO NOT REPLICATE TO THE RODC
+        self.ldb_dc1.delete('<GUID=%s>' % self.ou1)
+
+        # Collide the name from the RODC perspective
+        self.ldb_dc1.rename("<GUID=%s>" % self.ou2, "%s,%s" % (name, self.domain_dn))
+
+        try:
+            self._net_drs_replicate(DC=self.dnsname_dc2, fromDC=self.dnsname_dc1,
+                                    nc_dn="%s,%s" % (name, self.domain_dn),
+                                    local=True, single=True, forced=True)
+        except:
+            # Cleanup the object
+            self.ldb_dc1.delete('<GUID=%s>' % self.ou2)
+            return
+
+        # Replicate cannot succeed, HWM would be updated incorrectly.
+        self.fail("DRS replicate should have failed.")
-- 
1.9.1


From 1447c48c5d0b7fd2957653de631f2a702ffd570c Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Wed, 14 Feb 2018 13:32:24 +1300
Subject: [PATCH 5/6] repl_metadata: Avoid silent skipping an object during DRS
 (due to RODC name collisions)

No error code was being set in this case, and so, we would commit the
HWM and UDV without actually having all the updates.

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 selftest/knownfail.d/rodc_repl                  | 1 -
 source4/dsdb/samdb/ldb_modules/repl_meta_data.c | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/selftest/knownfail.d/rodc_repl b/selftest/knownfail.d/rodc_repl
index f6313c1..bd5c04e 100644
--- a/selftest/knownfail.d/rodc_repl
+++ b/selftest/knownfail.d/rodc_repl
@@ -1,2 +1 @@
-^samba4.drs.replica_sync_rodc.python.rodc..replica_sync_rodc.DrsReplicaSyncTestCase.test_ReplConflictsRODC.rodc
 ^samba4.drs.replica_sync_rodc.python.rodc..replica_sync_rodc.DrsReplicaSyncTestCase.test_ReplConflictsRODCRename.rodc
diff --git a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
index 198ac84..259a1ae 100644
--- a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
+++ b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
@@ -4920,6 +4920,7 @@ static int replmd_op_possible_conflict_callback(struct ldb_request *req, struct
 				       "Conflict adding object '%s' from incoming replication as we are read only for the partition.  \n"
 				       " - We must fail the operation until a master for this partition resolves the conflict",
 				       ldb_dn_get_linearized(conflict_dn));
+		ret = LDB_ERR_OPERATIONS_ERROR;
 		goto failed;
 	}
 
-- 
1.9.1


From 83058d6002a3b7531e185fede1bc73303f9c7294 Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Wed, 14 Feb 2018 13:32:33 +1300
Subject: [PATCH 6/6] repl_metadata: Avoid silent skipping an object during DRS
 (due to RODC rename collisions)

No error code was being set in this case, and so, we would commit the
HWM and UDV without actually having all the updates.

Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 selftest/knownfail.d/rodc_repl                  | 1 -
 source4/dsdb/samdb/ldb_modules/repl_meta_data.c | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)
 delete mode 100644 selftest/knownfail.d/rodc_repl

diff --git a/selftest/knownfail.d/rodc_repl b/selftest/knownfail.d/rodc_repl
deleted file mode 100644
index bd5c04e..0000000
--- a/selftest/knownfail.d/rodc_repl
+++ /dev/null
@@ -1 +0,0 @@
-^samba4.drs.replica_sync_rodc.python.rodc..replica_sync_rodc.DrsReplicaSyncTestCase.test_ReplConflictsRODCRename.rodc
diff --git a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
index 259a1ae..0e4ca10 100644
--- a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
+++ b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
@@ -5559,6 +5559,7 @@ static int replmd_replicated_handle_rename(struct replmd_replicated_request *ar,
 				       "Conflict adding object '%s' from incoming replication but we are read only for the partition.  \n"
 				       " - We must fail the operation until a master for this partition resolves the conflict",
 				       ldb_dn_get_linearized(conflict_dn));
+		ret = LDB_ERR_OPERATIONS_ERROR;
 		goto failed;
 	}
 
-- 
1.9.1



More information about the samba-technical mailing list