[SCM] Samba Shared Repository - branch master updated

Amitay Isaacs amitay at samba.org
Fri Dec 4 11:26:04 UTC 2015


The branch, master has been updated
       via  fe91857 ctdb-ipalloc: Rename top level IP allocation algorithm functions
       via  821aa24 ctdb-ipalloc: Rename ctdb_takeover_run_core() to ipalloc()
       via  99abcc1 ctdb-ipalloc: Fold force_rebalance_candidates into IP allocation state
       via  13aa583 ctdb-ipalloc: Fold all IPs list into IP allocation state
       via  fb66232 ctdb-ipalloc: Tidy up some of the IP allocation functions
       via  5dcc1d7 ctdb-daemon: Don't delete connection information for released IP
       via  4261d6e ctdb-daemon: Move VNN lookup out of ctdb_remove_tcp_connection()
       via  473f1a7 ctdb-daemon: Do not process tickle updates for hosted IP addresses
       via  80c0511 ctdb-docs: Rewrite event script documentation
       via  d7424f9 ctdb-scripts: Add exportfs cache to NFS Linux kernel callout
       via  bd7c94d ctdb-recoverd: Drop function unban_all_nodes()
       via  ad66858 ctdb-daemon: Drop handling of ban control sent to unexpected node
      from  e153501 ldb torture: Test ldb unpacking and printing

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit fe918572cb7330b9413eb88035eb133476d58a99
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 6 16:27:17 2015 +1100

    ctdb-ipalloc: Rename top level IP allocation algorithm functions
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
    Autobuild-Date(master): Fri Dec  4 12:25:14 CET 2015 on sn-devel-104

commit 821aa24ffdda78eed13397767c3d76869238bf7e
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 6 16:25:32 2015 +1100

    ctdb-ipalloc: Rename ctdb_takeover_run_core() to ipalloc()
    
    It just does IP allocation...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 99abcc108c3c3379230684a81862f7da8e0a0131
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 6 16:20:53 2015 +1100

    ctdb-ipalloc: Fold force_rebalance_candidates into IP allocation state
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 13aa583ea46d7d2468ed4e9c58124259b7176fe0
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 6 15:55:07 2015 +1100

    ctdb-ipalloc: Fold all IPs list into IP allocation state
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit fb662321557ab0de4df60e8630e332b21b671edf
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 6 13:02:04 2015 +1100

    ctdb-ipalloc: Tidy up some of the IP allocation functions
    
    Shorter temporary variables for compactness/readability.  "tmp_ip" is
    5 characters longer than "t".  In each for statement it is used 4
    times, so it costs 20 extra characters.  Saving those characters
    helps future edits stay within 80 columns.
    
    Tweak whitespace for readability, rewrap some code.
    
    No functional changes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
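
For illustration, a minimal C sketch of the kind of change described in
this commit (the list type and loop body are hypothetical, not the
actual ctdb code):

    #include <stddef.h>

    struct ip_item {
            struct ip_item *next;
            int pnn;            /* node hosting this IP, or -1 if unassigned */
    };

    static int count_unassigned(struct ip_item *all_ips)
    {
            struct ip_item *t;  /* "tmp_ip" is 5 characters longer than "t" */
            int n = 0;

            /* "t" appears 4 times in the for statement: 20 columns saved */
            for (t = all_ips; t != NULL; t = t->next) {
                    if (t->pnn == -1) {
                            n++;
                    }
            }
            return n;
    }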

commit 5dcc1d7a69b0123fb8490a13a8852e7044e1ad88
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 22 00:13:48 2015 +1000

    ctdb-daemon: Don't delete connection information for released IP
    
    As per the comment:
    
      If the IP address is hosted on this node then remove the connection.
    
      Otherwise this function has been called because the server IP
      address has been released to another node and the client has exited.
      This means that we should not delete the connection information.
      The takeover node processes connections too.
    
    This doesn't matter at the moment, since the empty connection list for
    an IP address that has been released will never be pushed to another
    node.  However, it matters if the connection information is stored in
    a real replicated database.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
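
For illustration, a minimal C sketch of the rule this commit describes
(the types and field names are hypothetical, not the actual ctdb code):

    #include <stdbool.h>
    #include <stddef.h>

    struct vnn_state {
            bool hosted_locally;   /* is the public IP on this node now? */
            void *connections;     /* stand-in for the connection list */
    };

    static void on_client_exit(struct vnn_state *vnn)
    {
            if (!vnn->hosted_locally) {
                    /* The IP was released to another node; the takeover
                     * node tracks these connections now, so keep the
                     * local information instead of deleting it. */
                    return;
            }
            vnn->connections = NULL;  /* stand-in for the real removal */
    }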

commit 4261d6e70ad2f03d80ae263a04216e0728139520
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 21 22:42:13 2015 +1000

    ctdb-daemon: Move VNN lookup out of ctdb_remove_tcp_connection()
    
    In a subsequent commit ctdb_takeover_client_destructor_hook() needs to
    know the VNN.  So just have both callers of
    ctdb_remove_tcp_connection() do the lookup and pass in the VNN.
    
    This should cause no change in behaviour.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
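
For illustration, a minimal C sketch of the refactoring this commit
describes (all names here are hypothetical stand-ins, not the actual
ctdb code):

    #include <stddef.h>

    struct vnn;                         /* opaque VNN type */

    /* Hypothetical stand-in for the real VNN lookup. */
    static struct vnn *find_vnn_by_ip(const char *ip)
    {
            (void)ip;
            return NULL;
    }

    /* After the change the helper takes the VNN instead of looking
     * it up itself. */
    static void remove_tcp_connection(struct vnn *vnn)
    {
            (void)vnn;                  /* real code drops the entry */
    }

    /* Each caller now does the lookup once and can reuse the result. */
    static void client_destructor_hook(const char *ip)
    {
            struct vnn *vnn = find_vnn_by_ip(ip);

            remove_tcp_connection(vnn);
            /* the subsequent commit also uses vnn here */
    }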

commit 473f1a77e1791cbe3a079ba7b20dd85a1797ee61
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Mar 27 15:30:16 2015 +1100

    ctdb-daemon: Do not process tickle updates for hosted IP addresses
    
    Tickle list updates are broadcast to all connected nodes and are
    accepted even when received on the same node that sent them.  This
    could actually lead to lost connection information when information
    about new connections is received while an update is in flight.
    
    Instead, return early when the IP is hosted on the current node, since
    it is the only one that could have sent the update.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
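
For illustration, a minimal C sketch of the early return this commit
describes (the function and parameter names are hypothetical, not the
actual ctdb handler):

    #include <stdbool.h>

    static int tickle_update_handler(bool ip_hosted_on_this_node)
    {
            if (ip_hosted_on_this_node) {
                    /* This node hosts the IP, so it is the node that
                     * broadcast the update.  Applying our own update
                     * could discard connections recorded while it was
                     * in flight, so return early. */
                    return 0;
            }
            /* otherwise apply the received tickle list (omitted) */
            return 0;
    }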

commit 80c05114210ac78841a34a0e44765166f1501364
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Nov 26 19:30:20 2015 +1100

    ctdb-docs: Rewrite event script documentation
    
    Move information about TCP connection tracking and resetting into
    ctdb.7.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit d7424f97c91618a2077cd6754151ec6a61ee8801
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 2 09:38:04 2015 +1000

    ctdb-scripts: Add exportfs cache to NFS Linux kernel callout
    
    exportfs can hang when, for example, DNS is flaky.  Given that
    exports don't change much, it makes sense to cache them.
    
    Don't try to add error handling when exportfs fails but do print a
    warning.  Proper error handling can be added separately.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
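
For illustration, a C rendering of the cache-invalidation test that the
shell callout uses (an approximation for comparison only; the function
name is hypothetical and this code is not part of the patch):

    #include <stdbool.h>
    #include <sys/stat.h>

    /* Rebuild the cache when either file is missing/unreadable or the
     * exports file is newer than the cache, approximating the shell
     * tests "! -r" and "-nt" seen in the diff below. */
    static bool cache_is_stale(const char *exports_file,
                               const char *cache_file)
    {
            struct stat e, c;

            if (stat(exports_file, &e) != 0 || stat(cache_file, &c) != 0) {
                    return true;
            }
            return e.st_mtime > c.st_mtime;
    }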

commit bd7c94d5ac27174a2bff03e3409127d042b7f26d
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Nov 26 19:31:28 2015 +1100

    ctdb-recoverd: Drop function unban_all_nodes()
    
    It hasn't worked since commit cda5f02c7c3491917d831ee23b93278dfaa5c82b
    in 2009, which reworked the banning code.  Since then
    ctdb_control_modflags() has contained a comment saying:
    
      /* we don't let other nodes modify our BANNED status */
    
    Unbanning all nodes originally occurred here when the recovery master
    role moved to a new node.  The logic could have been meant for the
    case when the old recovery master was malfunctioning, so got banned.
    If any other nodes had been banned by this recovery master then they
    would be unbanned.  However, this would also unban the old recovery
    master, which is probably suboptimal.  The logic would also trigger if
    a node was banned for a good reason and then the recovery master was
    stopped.  So, apart from doing nothing, the logic is too simplistic
    and might as well be removed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit ad6685847b18f63acb27d1bd998f0e757a6302eb
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 29 19:34:23 2015 +1000

    ctdb-daemon: Drop handling of ban control sent to unexpected node
    
    The banning code caters for the case where the node specified in the
    bantime data is not the node receiving the control.  This never
    happens.  There are two places where ctdb_ctrl_set_ban() is called:
    the ctdb CLI tool and the recovery daemon.  Both set the node in the
    bantime data to the node they are sending the control to.  There are
    no plans
    to do anything more elaborate, so just delete the handling of this
    special case.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/config/events.d/README                     | 271 ++++++++++++-----------
 ctdb/config/nfs-linux-kernel-callout            |  32 ++-
 ctdb/doc/ctdb.7.xml                             |  24 +-
 ctdb/server/ctdb_banning.c                      |  25 +--
 ctdb/server/ctdb_recoverd.c                     |  33 ---
 ctdb/server/ctdb_takeover.c                     | 279 +++++++++++++-----------
 ctdb/tests/eventscripts/scripts/local.sh        |   3 +
 ctdb/tests/eventscripts/stubs/ctdb              |   2 +-
 ctdb/tests/src/ctdb_takeover_tests.c            |  74 +++----
 ctdb/tests/takeover/scripts/local.sh            |   2 +-
 ctdb/tests/takeover/simulation/ctdb_takeover.py |   4 +-
 11 files changed, 393 insertions(+), 356 deletions(-)


Changeset truncated at 500 lines:

diff --git a/ctdb/config/events.d/README b/ctdb/config/events.d/README
index c7c3dea..11da702 100644
--- a/ctdb/config/events.d/README
+++ b/ctdb/config/events.d/README
@@ -1,170 +1,191 @@
-This directory is where you should put any local or application
-specific event scripts for ctdb to call.
+The events.d/ directory contains event scripts used by CTDB.  Event
+scripts are triggered on certain events, such as startup, monitoring
+or public IP allocation.  Scripts may be specific to services,
+networking or internal CTDB operations.
+
+All event scripts start with the prefix 'NN.' where N is a digit.  The
+event scripts are run in sequence based on NN.  Thus 10.interface will
+be run before 60.nfs.  It is recommended to keep each NN unique.
+However, scripts with the same NN prefix will be executed in
+alphanumeric sort order.
+
+As a special case, any eventscript that ends with a '~' character will be
+ignored since this is a common postfix that some editors will append to
+older versions of a file.
 
-All event scripts start with the prefic 'NN.' where N is a digit.
-The event scripts are run in sequence based on NN.
-Thus 10.interfaces will be run before 60.nfs.
+Only executable event scripts are run by CTDB.  Any event script that
+does not have execute permission is ignored.
 
-Each NN must be unique and duplicates will cause undefined behaviour.
-I.e. having both 10.interfaces and 10.otherstuff is not allowed.
+The eventscripts are called with varying number of arguments.  The
+first argument is the event name and the rest of the arguments depend
+on the event name.
 
+Event scripts must return 0 for success and non-zero for failure.
 
-As a special case, any eventscript that ends with a '~' character will be 
-ignored since this is a common postfix that some editors will append to 
-older versions of a file.
+Output of event scripts is logged.  On failure the output of the
+failing event script is included in the output of "ctdb scriptstatus".
 
-Only event scripts with executable permissions are run from CTDB. Any event
-script that does not have executable permission is ignored.
+The following events are supported (with arguments shown):
 
-The eventscripts are called with varying number of arguments.
-The first argument is the "event" and the rest of the arguments depend
-on which event was triggered.
+init
 
-All of the events except the 'shutdown' and 'startrecovery' events will be
-called with the ctdb daemon in NORMAL mode (ie. not in recovery)
+	This event is triggered once when CTDB is starting up.  This
+	event is used to do some basic cleanup and initialisation.
 
-The events currently implemented are
-init
-	This event does not take any additional arguments.
-	This event is only invoked once, when ctdb is starting up.
-	This event is used to do some cleanup work from earlier runs
-	and prepare the basic setup.
-	At this stage 'ctdb' commands won't work.
+	During the "init" event CTDB is not listening on its Unix
+	domain socket, so the "ctdb" CLI will not work.
 
-	Example: 00.ctdb cleans up $CTDB_SCRIPT_VARDIR
+	Failure of this event will cause CTDB to terminate.
+
+	Example: 00.ctdb creates $CTDB_SCRIPT_VARDIR
 
 setup
-	This event does not take any additional arguments.
-	This event is only invoked once, after init event is completed.
-	This event is used to do setup any tunables defined in ctdb 
-        configuration file.
+
+	This event is triggered once, after the "init" event has
+	completed.
+
+	For this and any subsequent events the CTDB Unix domain socket
+	is available, so the "ctdb" CLI will work.
+
+	Failure of this event will cause CTDB to terminate.
+
+	Example: 00.ctdb processes tunables defined in the CTDB
+	configuration using CTDB_SET_<TunableName>=<TunableValue>.
 
 startup
-	This event does not take any additional arguments.
-	This event is only invoked once, when ctdb has finished
-	the initial recoveries. This event is used to wait for
-	the service to start and all resources for the service
-	becoming available.
 
-	This is used to prevent ctdb from starting up and advertize its
-	services until all dependent services have become available.
+	This event is triggered after the "setup" event has completed
+	and CTDB has finished its initial database recovery.
+
+	This event starts all services that are managed by CTDB.  Each
+	service that is managed by CTDB should implement this event
+	and use it to (re)start the service.
 
-	All services that are managed by ctdb should implement this
-	event and use it to start the service.
+	If the "startup" event fails then CTDB will retry it until it
+	succeeds.  There is no limit on the number of retries.
 
-	Example: 50.samba uses this event to start the samba daemon
-	and then wait until samba and all its associated services have
-	become available. It then also proceeds to wait until all
-	shares have become available.
+	Example: 50.samba uses this event to start the Samba daemon if
+	CTDB_MANAGES_SAMBA=yes.
 
 shutdown
-	This event is called when the ctdb service is shuting down.
-	
-	All services that are managed by ctdb should implement this event
-	and use it to perform a controlled shutdown of the service.
 
-	Example: 60.nfs uses this event to shut down nfs and all associated
-	services and stop exporting any shares when this event is invoked.
+	This event is triggered when CTDB is shutting down.
+
+	This event shuts down all services that are managed by CTDB.
+	Each service that is managed by CTDB should implement this
+	event and use it to stop the service.
+
+	Example: 50.samba uses this event to shut down the Samba
+	daemon if CTDB_MANAGES_SAMBA=yes.
 
 monitor
-	This event is invoked every X number of seconds.
-	The interval can be configured using the MonitorInterval tunable
-	but defaults to 15 seconds.
 
-	This event is triggered by ctdb to continuously monitor that all
-	managed services are healthy.
-	When invoked, the event script will check that the service is healthy
-	and return 0 if so. If the service is not healthy the event script
-	should return non zero.
+	This event is run periodically.  The interval between
+	successive "monitor" events is configured using the
+	MonitorInterval tunable, which defaults to 15 seconds.
 
-	If a service returns nonzero from this script this will cause ctdb
-	to consider the node status as UNHEALTHY and will cause the public
-	address and all associated services to be failed over to a different
-	node in the cluster.
+	This event is triggered by CTDB to continuously monitor that
+	all managed services are healthy.  If all event scripts
+	complete the "monitor" event successfully then the node is
+	marked HEALTHY.  If any event script fails then no subsequent
+	scripts will be run for that event and the node is marked
+	UNHEALTHY.
 
-	All managed services should implement this event.
+	Each service that is managed by CTDB should implement this
+	event and use it to monitor the service.
 
-	Example: 10.interfaces which checks that the public interface (if used)
-	is healthy, i.e. it has a physical link established.
+	Example: 10.interface checks that each configured interface
+	for public IP addresses has a physical link established.
 
-takeip
-	This event is triggered everytime the node takes over a public ip
-	address during recovery.
-	This event takes three additional arguments :
-	'interface' 'ipaddress' and 'netmask'
+startrecovery
 
-	Before this event there will always be a 'startrecovery' event.
+	This event is triggered every time a database recovery process
+	is started.
 
-	This event will always be followed by a 'recovered' event once
-	all ipaddresses have been reassigned to new nodes and the ctdb database
-	has been recovered.
-	If multiple ip addresses are reassigned during recovery it is
-	possible to get several 'takeip' events followed by a single 
-	'recovered' event.
+	This is rarely used.
 
-	Since there might involve substantial work for the service when an ip
-	address is taken over and since multiple ip addresses might be taken 
-	over in a single recovery it is often best to only mark which addresses
-	are being taken over in this event and defer the actual work to 
-	reconfigure or restart the services until the 'recovered' event.
+recovered
 
-	Example: 60.nfs which just records which ip addresses are being taken
-	over into a local state directory   and which defers the actual
-	restart of the services until the 'recovered' event.
+	This event is triggered every time a database recovery process
+	is completed.
 
+	This is rarely used.
 
-releaseip
-	This event is triggered everytime the node releases a public ip
-	address during recovery.
-	This event takes three additional arguments :
-	'interface' 'ipaddress' and 'netmask'
+takeip <interface> <ip-address> <netmask-bits>
 
-	In all other regards this event is analog to the 'takeip' event above.
+	This event is triggered for each public IP address taken by a
+	node during IP address (re)assignment.  Multiple "takeip"
+	events can be run in parallel if multiple IP addresses are
+	being assigned.
 
-	Example: 60.nfs
+	Example: In 10.interface the "ip" command (from the Linux
+	iproute2 package) is used to add the specified public IP
+	address to the specified interface.  The "ip" command can
+	safely be run concurrently.  However, the "iptables" command
+	cannot be run concurrently so a wrapper is used to serialise
+	runs using exclusive locking.
 
-updateip
-	This event is triggered everytime the node moves a public ip
-	address between interfaces
-	This event takes four additional arguments :
-	'old-interface' 'new-interface' 'ipaddress' and 'netmask'
+	If substantial work is required to reconfigure a service when
+	a public IP address is taken over it can be better to defer
+	service reconfiguration to the "ipreallocated" event, after
+	all IP addresses have been assigned.
 
-	Example: 10.interface
+	Example: 60.nfs uses ctdb_service_set_reconfigure() to flag
+	that public IP addresses have changed so that service
+	reconfiguration will occur in the "ipreallocated" event.
 
-startrecovery
-	This event is triggered everytime we start a recovery process
-	or before we start changing ip address allocations.
+releaseip <interface> <ip-address> <netmask-bits>
+
+	This event is triggered for each public IP address released by
+	a node during IP address (re)assignment.  Multiple "releaseip"
+	events can be run in parallel if multiple IP addresses are
+	being unassigned.
+
+	In all other regards, this event is analogous to the "takeip"
+	event above.
+
+updateip <old-interface> <new-interface> <ip-address> <netmask-bits>
+
+	This event is triggered for each public IP address moved
+	between interfaces on a node during IP address (re)assignment.
+	Multiple "updateip" events can be run in parallel if multiple
+	IP addresses are being moved.
+
+	This event is only used if multiple interfaces are capable of
+	hosting an IP address, as specified in the public addresses
+	configuration file.
+
+	This event is similar to the "takeip" event above.
 
-recovered
-	This event is triggered every time we have finished a full recovery
-	and also after we have finished reallocating the public ip addresses
-	across the cluster.
-
-	Example: 60.nfs which if the ip address configuration has changed
-	during the recovery (i.e. if addresses have been taken over or
-	released) will kill off any tcp connections that exist for that
-	service and also send out statd notifications to all registered 
-	clients.
-	
 ipreallocated
 
-	This event is triggered after releaseip and takeip events in a
-	takeover run.  It can be used to reconfigure services, update
-	routing and many other things.
+	This event is triggered after "releaseip", "takeip" and
+	"updateip" events during public IP address (re)assignment.
+
+	This event is used to reconfigure services.
+
+	This event runs even if public IP addresses on a node have not
+	been changed.  This allows reconfiguration to depend on the
+	states of other nodes rather than just IP addresses.
+
+	Example: 11.natgw recalculates the NAT gateway master and
+	updates the relevant network configuration on each node if the
+	NAT gateway master has changed.
 
-Additional note for takeip, releaseip, recovered:
+Additional notes for "takeip", "releaseip", "updateip",
+"ipreallocated":
 
-ALL services that depend on the ip address configuration of the node must 
-implement all three of these events.
+* Failure of any of these events causes IP allocation to be retried.
 
-ALL services that use TCP should also implement these events and at least
-kill off any tcp connections to the service if the ip address config has 
-changed in a similar fashion to how 60.nfs does it.
-The reason one must do this is that ESTABLISHED tcp connections may survive
-when an ip address is released and removed from the host until the ip address
-is re-takenover.
-Any tcp connections that survive a release/takeip sequence can potentially
-cause the client/server tcp connection to get out of sync with sequence and 
-ack numbers and cause a disruptive ack storm.
+* The "ipreallocated" event is run on all nodes.  It is even run if no
+  "takeip", "releaseip" or "updateip" events were triggered.
 
+* An event script can use ctdb_service_set_reconfigure() in "takeip"
+  or "releaseip" events to flag that its service needs to be
+  reconfigured.  The event script can then define a
+  service_reconfigure() function, which will be implicitly run before
+  the "ipreallocated" event.  This is a useful way of performing
+  reconfiguration that is conditional upon public IP address changes.
 
+  This means an explicit "ipreallocated" event handler is usually not
+  necessary.
diff --git a/ctdb/config/nfs-linux-kernel-callout b/ctdb/config/nfs-linux-kernel-callout
index 5e77031..9532906 100755
--- a/ctdb/config/nfs-linux-kernel-callout
+++ b/ctdb/config/nfs-linux-kernel-callout
@@ -3,6 +3,15 @@
 # Exit on 1st error
 set -e
 
+# NFS exports file.  Some code below keeps a cache of output derived
+# from exportfs(8).  When this file is updated the cache is invalid
+# and needs to be regenerated.
+#
+# To change the file, edit the default value below.  Do not set
+# CTDB_NFS_EXPORTS_FILE - it isn't a configuration variable, just a
+# hook for testing.
+nfs_exports_file="${CTDB_NFS_EXPORTS_FILE:-/var/lib/nfs/etab}"
+
 # Red Hat
 nfs_service="nfs"
 nfslock_service="nfslock"
@@ -170,10 +179,25 @@ nfs_check_thread_count ()
 
 nfs_monitor_list_shares ()
 {
-    exportfs -v |
-	grep '^/' |
-	sed -e 's@[[:space:]][[:space:]]*[^[:space:]()][^[:space:]()]*([^[:space:]()][^[:space:]()]*)$@@' |
-	sort -u
+    _cache_file="${CTDB_NFS_CALLOUT_STATE_DIR}/list_shares_cache"
+    if  [ ! -r "$nfs_exports_file" ] || [ ! -r "$_cache_file" ] || \
+	    [ "$nfs_exports_file" -nt "$_cache_file" ] ; then
+	mkdir -p "$CTDB_NFS_CALLOUT_STATE_DIR"
+	# We could just use the contents of $nfs_exports_file.
+	# However, let's regard that file as internal to NFS and use
+	# exportfs, which is the public API.
+	if ! _exports=$(exportfs -v) ; then
+	    echo "WARNING: failed to run exportfs to list NFS shares" >&2
+	    return
+	fi
+
+	echo "$_exports" |
+	    grep '^/' |
+	    sed -e 's@[[:space:]][[:space:]]*[^[:space:]()][^[:space:]()]*([^[:space:]()][^[:space:]()]*)$@@' |
+	    sort -u >"$_cache_file"
+    fi
+
+    cat "$_cache_file"
 }
 
 ##################################################
diff --git a/ctdb/doc/ctdb.7.xml b/ctdb/doc/ctdb.7.xml
index ffa51db..45d7c23 100644
--- a/ctdb/doc/ctdb.7.xml
+++ b/ctdb/doc/ctdb.7.xml
@@ -588,7 +588,29 @@ CTDB_LVS_PUBLIC_IP=10.1.1.237
 
     </refsect2>
   </refsect1>
-    
+
+  <refsect1>
+    <title>TRACKING AND RESETTING TCP CONNECTIONS</title>
+
+    <para>
+      CTDB tracks TCP connections from clients to public IP addresses,
+      on known ports.  When an IP address moves from one node to
+      another, all existing TCP connections to that IP address are
+      reset.  The node taking over this IP address will also send
+      gratuitous ARPs (for IPv4, or neighbour advertisement, for
+      IPv6).  This allows clients to reconnect quickly, rather than
+      waiting for TCP timeouts, which can be very long.
+    </para>
+
+    <para>
+      It is important that established TCP connections do not survive
+      a release and take of a public IP address on the same node.
+      Such connections can get out of sync with sequence and ACK
+      numbers, potentially causing a disruptive ACK storm.
+    </para>
+
+  </refsect1>
+
   <refsect1>
     <title>NAT GATEWAY</title>
 
diff --git a/ctdb/server/ctdb_banning.c b/ctdb/server/ctdb_banning.c
index 92bc510..56d3b29 100644
--- a/ctdb/server/ctdb_banning.c
+++ b/ctdb/server/ctdb_banning.c
@@ -87,27 +87,10 @@ int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata)
 	DEBUG(DEBUG_INFO,("SET BAN STATE\n"));
 
 	if (bantime->pnn != ctdb->pnn) {
-		if (bantime->pnn >= ctdb->num_nodes) {
-			DEBUG(DEBUG_ERR,(__location__ " ERROR: Invalid ban request. PNN:%d is invalid. Max nodes %d\n", bantime->pnn, ctdb->num_nodes));
-			return -1;
-		}
-		if (bantime->time == 0) {
-			DEBUG(DEBUG_NOTICE,("unbanning node %d\n", bantime->pnn));
-			ctdb->nodes[bantime->pnn]->flags &= ~NODE_FLAGS_BANNED;
-		} else {
-			DEBUG(DEBUG_NOTICE,("banning node %d\n", bantime->pnn));
-			if (ctdb->tunable.enable_bans == 0) {
-				/* FIXME: This is bogus. We really should be
-				 * taking decision based on the tunables on
-				 * the banned node and not local node.
-				 */
-				DEBUG(DEBUG_WARNING,("Bans are disabled - ignoring ban of node %u\n", bantime->pnn));
-				return 0;
-			}
-
-			ctdb->nodes[bantime->pnn]->flags |= NODE_FLAGS_BANNED;
-		}
-		return 0;
+		DEBUG(DEBUG_WARNING,
+		      ("SET_BAN_STATE control for PNN %d ignored\n",
+		       bantime->pnn));
+		return -1;
 	}
 
 	already_banned = false;
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index c9f19fa..1d63526 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -2345,37 +2345,6 @@ static int send_election_request(struct ctdb_recoverd *rec, uint32_t pnn)
 }
 
 /*
-  this function will unban all nodes in the cluster
-*/
-static void unban_all_nodes(struct ctdb_context *ctdb)
-{
-	int ret, i;
-	struct ctdb_node_map_old *nodemap;
-	TALLOC_CTX *tmp_ctx = talloc_new(ctdb);
-	
-	ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, tmp_ctx, &nodemap);
-	if (ret != 0) {
-		DEBUG(DEBUG_ERR,(__location__ " failed to get nodemap to unban all nodes\n"));
-		return;
-	}
-
-	for (i=0;i<nodemap->num;i++) {
-		if ( (!(nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED))
-		  && (nodemap->nodes[i].flags & NODE_FLAGS_BANNED) ) {
-			ret = ctdb_ctrl_modflags(ctdb, CONTROL_TIMEOUT(),
-						 nodemap->nodes[i].pnn, 0,
-						 NODE_FLAGS_BANNED);
-			if (ret != 0) {
-				DEBUG(DEBUG_ERR, (__location__ " failed to reset ban state\n"));
-			}
-		}
-	}
-
-	talloc_free(tmp_ctx);
-}
-
-
-/*
   we think we are winning the election - send a broadcast election request
  */
 static void election_send_request(struct tevent_context *ev,
@@ -2725,7 +2694,6 @@ static void election_handler(uint64_t srvid, TDB_DATA data, void *private_data)
 					timeval_current_ofs(0, 500000),
 					election_send_request, rec);
 		}
-		/*unban_all_nodes(ctdb);*/
 		return;
 	}
 
@@ -2735,7 +2703,6 @@ static void election_handler(uint64_t srvid, TDB_DATA data, void *private_data)
 	/* Release the recovery lock file */
 	if (ctdb_recovery_have_lock(ctdb)) {
 		ctdb_recovery_unlock(ctdb);
-		unban_all_nodes(ctdb);
 	}
 
 	clear_ip_assignment_tree(ctdb);


-- 
Samba Shared Repository


