[PATCH] Fix LDAP connection timeout during large join

Tim Beale timbeale at catalyst.net.nz
Thu Oct 18 02:48:59 UTC 2018


Attached is a fix for bug: https://bugzilla.samba.org/show_bug.cgi?id=13612

With a large DB (mostly one with lots of links), the replication and
commit can take so long that the LDAP connection to the remote DC times
out. The fix adds a sanity-check, once replication finishes, that the
connection is still alive, and reconnects if it is not.
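
As a rough illustration of the probe-then-reconnect idea, here is a minimal
standalone sketch. It is not the Samba code: FakeConnection,
ConnectionDisconnected and refresh_connection are made-up names standing in
for the LDB connection, the ldb.LdbError raised on a dropped connection, and
the new helper.

```python
class ConnectionDisconnected(Exception):
    """Hypothetical stand-in for an error reporting a dropped connection."""


class FakeConnection:
    """Hypothetical connection that can be marked dead to simulate a timeout."""

    def __init__(self):
        self.alive = True

    def probe(self):
        # A cheap request; raises if the server has dropped us.
        if not self.alive:
            raise ConnectionDisconnected("NT_STATUS_CONNECTION_DISCONNECTED")


def refresh_connection(conn, connect):
    """Return conn if it still answers the probe, else a new connection."""
    try:
        conn.probe()
        return conn
    except ConnectionDisconnected:
        # Only the "disconnected" case triggers a reconnect; any other
        # exception propagates to the caller unchanged.
        return connect()


conn = FakeConnection()
same = refresh_connection(conn, FakeConnection)   # still alive: reused
conn.alive = False                                # simulate the idle timeout
fresh = refresh_connection(conn, FakeConnection)  # dead: replaced
```

The probe only has to be cheap and to fail reliably when the connection is
gone; anything other than a disconnect is deliberately left to propagate.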

CI link: https://gitlab.com/catalyst-samba/samba/pipelines/33370455

Review appreciated. Thanks.

-------------- next part --------------
From ca2afd5143206a792a40fd655a05c000b991e64a Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Wed, 17 Oct 2018 14:41:12 +1300
Subject: [PATCH 1/2] join: LDAP connection to remote DC can timeout in large
 join

When joining a very large domain (e.g. 100K users), the replication can
take so long that the LDAP connection to the remote DC times out.

This patch avoids the problem by adding in a sanity-check after the
replication finishes that the LDB connection is still alive. If not,
then we reconnect.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13612

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 python/samba/join.py | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/python/samba/join.py b/python/samba/join.py
index 3869947..c726fbd 100644
--- a/python/samba/join.py
+++ b/python/samba/join.py
@@ -1020,6 +1020,27 @@ class DCJoinContext(object):
         else:
             ctx.local_samdb.transaction_commit()
 
+        # A large replication may have caused our LDB connection to the
+        # remote DC to time out, so check the connection is still alive
+        ctx.refresh_ldb_connection()
+
+    def refresh_ldb_connection(ctx):
+        try:
+            # query the rootDSE to check the connection
+            ctx.samdb.search(scope=ldb.SCOPE_BASE, attrs=["dn"])
+        except ldb.LdbError as e4:
+            (enum, estr) = e4.args
+
+            # if the connection was disconnected, then reconnect
+            if (enum == ldb.ERR_OPERATIONS_ERROR and
+                'NT_STATUS_CONNECTION_DISCONNECTED' in estr):
+                ctx.logger.warning("LDB connection disconnected. Reconnecting")
+                ctx.samdb = SamDB(url="ldap://%s" % ctx.server,
+                                  session_info=system_session(),
+                                  credentials=ctx.creds, lp=ctx.lp)
+            else:
+                raise DCJoinException(estr)
+
     def send_DsReplicaUpdateRefs(ctx, dn):
         r = drsuapi.DsReplicaUpdateRefsRequest1()
         r.naming_context = drsuapi.DsReplicaObjectIdentifier()
-- 
2.7.4


From 08fa3034ac548875855fe68833cdece9c604986a Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Thu, 18 Oct 2018 13:07:20 +1300
Subject: [PATCH 2/2] join: Sanity-check LDB connection before failed join
 cleanup

Joining a large DB can take so long that the LDAP connection times out.
The previous patch fixed the 'happy case' where the join succeeds.
However, if the commit or replication fails (throwing an exception),
then the cleanup code can also fail when it tries to delete objects from
the remote DC. This then gives you an error pointing to
cleanup_old_accounts() rather than what actually went wrong.

If the join fails, this patch adds a sanity-check that the LDB
connection to the remote DC is still alive before we start deleting
objects.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13612

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 python/samba/join.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/python/samba/join.py b/python/samba/join.py
index c726fbd..3642880 100644
--- a/python/samba/join.py
+++ b/python/samba/join.py
@@ -1443,6 +1443,10 @@ class DCJoinContext(object):
                 print("Join failed - cleaning up")
             except IOError:
                 pass
+
+            # cleanup the failed join (checking we still have a live LDB
+            # connection to the remote DC first)
+            ctx.refresh_ldb_connection()
             ctx.cleanup_old_join()
             raise
 
-- 
2.7.4


