[SCM] CTDB repository - branch master updated - ctdb-2.4-146-g30a6565

Wed Oct 30 00:31:34 MDT 2013

The branch, master has been updated
       via  30a6565a7b476516f3daed0669b5650e1be3cd18 (commit)
       via  a7a844e7600b59d876de94ec5bf7bd1647508cdf (commit)
       via  15b5c6c00c248bc1a8364a6da103296a55d7bfb6 (commit)
       via  ca5fc3431573c44d55d09d987c715fb53756fc1f (commit)
       via  afd9b51644af074752d74c412cb4e7ec2eba2c69 (commit)
       via  275ed9ebe287e39d891888c13810c70f347af8ac (commit)
       via  c8b542e059a54b8d524bd430cad9d82e5edd864d (commit)
      from  eb8ec5681bfccb26c8ffae72952d54bb0ba46249 (commit)

http://gitweb.samba.org/?p=ctdb.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 30a6565a7b476516f3daed0669b5650e1be3cd18
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 16 11:46:54 2013 +1100

    doc: Update NEWS
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit a7a844e7600b59d876de94ec5bf7bd1647508cdf
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 30 13:22:21 2013 +1100

    web: Add links to new manpages
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 15b5c6c00c248bc1a8364a6da103296a55d7bfb6
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:26:16 2013 +1000

    doc: Major updates to manual pages
    
    This includes new manpages for ctdb.7, ctdbd.conf.5 and ctdb-tunables.7.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit ca5fc3431573c44d55d09d987c715fb53756fc1f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 30 12:37:15 2013 +1100

    tunables: Remove obsolete tunables
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit afd9b51644af074752d74c412cb4e7ec2eba2c69
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 30 12:17:37 2013 +1100

    recoverd: Rebalancing should be done regardless tunable
    
    Rebalance target nodes should be set even if a deferred rebalance is
    not configured.  The user can explicitly cause a takeover run.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 275ed9ebe287e39d891888c13810c70f347af8ac
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 30 11:32:28 2013 +1100

    recoverd: Improve an error message in the election code
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c8b542e059a54b8d524bd430cad9d82e5edd864d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 29 16:38:42 2013 +1100

    Revert "if a new node enters the cluster, that node will already be frozen at start"
    
    This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94.
    Furthermore, if a node doesn't force an election but wins it then it
    can fail to record that it is the new recovery master.  This can lead
    to a reverse split brain where there is no recovery master.
    
    This reverts commit c5035657606283d2e35bea40992505e84ca8e7be.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    
    Conflicts:
    	server/ctdb_recoverd.c

-----------------------------------------------------------------------

Summary of changes:
 NEWS                    |  103 +++
 doc/Makefile            |   10 +-
 doc/ctdb-tunables.7.xml |  708 +++++++++++++++
 doc/ctdb.1.xml          | 1861 +++++++++++++++++++++------------------
 doc/ctdb.7.xml          | 1001 +++++++++++++++++++++
 doc/ctdbd.1.xml         | 2215 +++++++++++------------------------------------
 doc/ctdbd.conf.5.xml    | 1598 ++++++++++++++++++++++++++++++++++
 doc/ctdbd_wrapper.1.xml |  106 +++
 doc/ltdbtool.1.xml      |  248 ++++--
 doc/onnode.1.xml        |  331 ++++---
 doc/ping_pong.1.xml     |  169 +++--
 include/ctdb_private.h  |    3 -
 server/ctdb_recoverd.c  |   56 +-
 server/ctdb_tunables.c  |    3 -
 web/documentation.html  |   14 +-
 15 files changed, 5529 insertions(+), 2897 deletions(-)
 create mode 100644 doc/ctdb-tunables.7.xml
 create mode 100644 doc/ctdb.7.xml
 create mode 100644 doc/ctdbd.conf.5.xml
 create mode 100644 doc/ctdbd_wrapper.1.xml


Changeset truncated at 500 lines:

diff --git a/NEWS b/NEWS
index be8f9dc..ae4cff6 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,106 @@
+Changes in CTDB 2.5
+===================
+
+User-visible changes
+--------------------
+
+* The default location of the ctdbd socket is now:
+
+    /var/run/ctdb/ctdbd.socket
+
+  If you currently set CTDB_SOCKET in configuration then unsetting it
+  will probably do what you want.
+
+* The default location of CTDB TDB databases is now:
+
+    /var/lib/ctdb
+
+  If you only set CTDB_DBDIR (to the old default of /var/ctdb) then
+  you probably want to move your databases to /var/lib/ctdb, drop your
+  setting of CTDB_DBDIR and just use the default.
+
+  To maintain the database files in /var/ctdb you will need to set
+  CTDB_DBDIR, CTDB_DBDIR_PERSISTENT and CTDB_DBDIR_STATE, since all of
+  these have moved.
+
+* Use of CTDB_OPTIONS to set ctdbd command-line options is no longer
+  supported.  Please use individual configuration variables instead.
+
+* Obsolete tunables VacuumDefaultInterval, VacuumMinInterval and
+  VacuumMaxInterval have been removed.  Setting them had no effect but
+  if you now try to set them in a configuration file via CTDB_SET_X=Y
+  then CTDB will not start.
+
+* Much improved manual pages.  Added new manpages ctdb(7),
+  ctdbd.conf(5), ctdb-tunables(7).  Still some work to do.
+
+* Most CTDB-specific configuration can now be set in
+  /etc/ctdb/ctdbd.conf.
+
+  This avoids cluttering distribution-specific configuration files,
+  such as /etc/sysconfig/ctdb.  It also means that we can say: see
+  ctdbd.conf(5) for more details.  :-)
+
+* Configuration variable NFS_SERVER_MODE is deprecated and has been
+  replaced by CTDB_NFS_SERVER_MODE.  See ctdbd.conf(5) for more
+  details.
+
+* "ctdb reloadips" is much improved and should be used for reloading
+  the public IP configuration.
+
+  This command attempts to yield much more predictable IP allocations
+  than using sequences of delip and addip commands.  See ctdb(1) for
+  details.
+
+* Ability to pass a comma-separated string to ctdb(1) tool commands via
+  the -n option is now documented and works for most commands.  See
+  ctdb(1) for details.
+
+* "ctdb rebalancenode" is now a debugging command and should not be
+  used in normal operation.  See ctdb(1) for details.
+
+* "ctdb ban 0" is now invalid.
+
+  This was documented as causing a permanent ban.  However, this was
+  not implemented and caused an "unban" instead.  To avoid confusion,
+  0 is now an invalid ban duration.  To administratively "ban" a node
+  use "ctdb stop" instead.
+
+* The systemd configuration now puts the PID file in /run/ctdb (rather
+  than /run/ctdbd) for consistency with the initscript and other uses
+  of /var/run/ctdb.
+
+Important bug fixes
+-------------------
+
+* Traverse regression fixed.
+
+* The default recovery method for persistent databases has been
+  changed to use database sequence numbers instead of doing
+  record-by-record recovery (using record sequence numbers).  This
+  fixes issues including registry corruption.
+
+* Banned nodes are no longer told to run the "ipreallocated" event
+  during a takeover run, when in fallback mode with nodes that don't
+  support the IPREALLOCATED control.
+
+Important internal changes
+--------------------------
+
+* Persistent transactions are now compatible with Samba and work
+  reliably.
+
+* The recovery master role has been made more stable by resetting the
+  priority time each time a node becomes inactive.  This means that
+  nodes that are active for a long time are more likely to retain the
+  recovery master role.
+
+* The incomplete libctdb library has been removed.
+
+* Test suite now starts ctdbd with the --sloppy-start option to speed
+  up startup.  However, this should not be done in production.
+
+
 Changes in CTDB 2.4
 ===================
 
diff --git a/doc/Makefile b/doc/Makefile
index 2f7d41d..34303a5 100644
--- a/doc/Makefile
+++ b/doc/Makefile
@@ -1,15 +1,19 @@
 DOCS = ctdb.1 ctdb.1.html \
 	ctdbd.1 ctdbd.1.html \
+	ctdbd_wrapper.1 ctdbd_wrapper.1.html \
 	onnode.1 onnode.1.html \
 	ltdbtool.1 ltdbtool.1.html \
-	ping_pong.1 ping_pong.1.html
+	ping_pong.1 ping_pong.1.html \
+	ctdbd.conf.5 ctdbd.conf.5.html \
+	ctdb.7 ctdb.7.html \
+	ctdb-tunables.7 ctdb-tunables.7.html
 
 all: $(DOCS)
 
-%.1: %.1.xml
+%: %.xml
 	xsltproc -o $@ http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl $<
 
-%.1.html: %.1.xml
+%.html: %.xml
 	xsltproc -o $@ http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl $<
 
 distclean:
diff --git a/doc/ctdb-tunables.7.xml b/doc/ctdb-tunables.7.xml
new file mode 100644
index 0000000..456e856
--- /dev/null
+++ b/doc/ctdb-tunables.7.xml
@@ -0,0 +1,708 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE refentry
+	PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+	"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+
+<refentry id="ctdb-tunables.7">
+
+  <refmeta>
+    <refentrytitle>ctdb-tunables</refentrytitle>
+    <manvolnum>7</manvolnum>
+    <refmiscinfo class="source">ctdb</refmiscinfo>
+    <refmiscinfo class="manual">CTDB - clustered TDB database</refmiscinfo>
+  </refmeta>
+
+  <refnamediv>
+    <refname>ctdb-tunables</refname>
+    <refpurpose>CTDB tunable configuration variables</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>DESCRIPTION</title>
+
+    <para>
+      CTDB's behaviour can be configured by setting run-time tunable
+      variables.  This lists and describes all tunables.  See the
+      <citerefentry><refentrytitle>ctdb</refentrytitle>
+      <manvolnum>1</manvolnum></citerefentry>
+      <command>listvars</command>, <command>setvar</command> and
+      <command>getvar</command> commands for more details.
+    </para>
+
+    <refsect2>
+      <title>MaxRedirectCount</title>
+      <para>Default: 3</para>
+      <para>
+	If we are not the DMASTER and need to fetch a record across the network
+	we first send the request to the LMASTER, after which the record
+	is passed on to the current DMASTER. If the DMASTER changes before
+	the request has reached that node, the request will be passed on to the
+	"next" DMASTER. For very hot records that migrate rapidly across the
+	cluster this can cause a request to "chase" the record for many hops
+	before it catches up with the record.
+      </para>
+      <para>
+	When chasing a record, this is how many hops we will chase the record
+	for before going back to the LMASTER to ask for new directions.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>SeqnumInterval</title>
+      <para>Default: 1000</para>
+      <para>
+	Some databases have seqnum tracking enabled, so that Samba will be able
+	to detect asynchronously when there have been updates to the database.
+	Every time a database is updated its sequence number is increased.
+      </para>
+      <para>
+	This tunable specifies, in milliseconds, how frequently ctdb will
+	send out updates to remote nodes to inform them that the sequence
+	number has increased.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>ControlTimeout</title>
+      <para>Default: 60</para>
+      <para>
+	This is the default timeout used when sending a control message to
+	either the local or a remote ctdb daemon.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>TraverseTimeout</title>
+      <para>Default: 20</para>
+      <para>
+	This setting controls how long we allow a traverse process to run.
+	After this timeout triggers, the main ctdb daemon will abort the
+	traverse if it has not yet finished.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>KeepaliveInterval</title>
+      <para>Default: 5</para>
+      <para>
+	How often in seconds should the nodes send keepalives to each other.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>KeepaliveLimit</title>
+      <para>Default: 5</para>
+      <para>
+	After how many keepalive intervals without any traffic should a node
+	wait until marking the peer as DISCONNECTED.
+      </para>
+      <para>
+	If a node has hung, it can thus take KeepaliveInterval*(KeepaliveLimit+1)
+	seconds before we determine that the node is DISCONNECTED and that we
+	require a recovery. This limit should not be set too high since we want
+	a hung node to be detected, and expunged from the cluster well before
+	common CIFS timeouts (45-90 seconds) kick in.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoverTimeout</title>
+      <para>Default: 20</para>
+      <para>
+	This is the default setting for timeouts for controls when sent from the
+	recovery daemon. We allow longer control timeouts from the recovery daemon
+	than from normal use since the recovery daemon often uses controls that
+	can take a lot longer than normal controls.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoverInterval</title>
+      <para>Default: 1</para>
+      <para>
+	How frequently in seconds should the recovery daemon perform the
+	consistency checks that determine if we need to perform a recovery or not.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>ElectionTimeout</title>
+      <para>Default: 3</para>
+      <para>
+	When electing a new recovery master, this is how many seconds we allow
+	the election to take before we either deem the election finished
+	or we fail the election and start a new one.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>TakeoverTimeout</title>
+      <para>Default: 9</para>
+      <para>
+	This is how many seconds we allow controls to take for IP failover events.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>MonitorInterval</title>
+      <para>Default: 15</para>
+      <para>
+	How often should ctdb run the event scripts to check for a node's health.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>TickleUpdateInterval</title>
+      <para>Default: 20</para>
+      <para>
+	How often will ctdb record and store the "tickle" information used to
+	kickstart stalled tcp connections after a recovery.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>EventScriptTimeout</title>
+      <para>Default: 20</para>
+      <para>
+	How long should ctdb let an event script run before aborting it and
+	marking the node unhealthy.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>EventScriptTimeoutCount</title>
+      <para>Default: 1</para>
+      <para>
+	How many events in a row need to time out before we flag the node UNHEALTHY.
+	This setting is useful if your scripts can not be written so that they
+	do not hang for benign reasons.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>EventScriptUnhealthyOnTimeout</title>
+      <para>Default: 0</para>
+      <para>
+	This setting can be used to make ctdb never become UNHEALTHY if your
+	eventscripts keep hanging/timing out.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoveryGracePeriod</title>
+      <para>Default: 120</para>
+      <para>
+	During recoveries, if a node has not caused recovery failures during the
+	last grace period, any record of recovery failures that the node
+	previously caused will be forgiven. This resets the ban-counter back to
+	zero for that node.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RecoveryBanPeriod</title>
+      <para>Default: 300</para>
+      <para>
+	If a node causes repetitive recovery failures, it will eventually be
+	banned from the cluster.
+	This controls how long the culprit node will be banned from the cluster
+	before it is allowed to try to join the cluster again.
+	Don't set this too small. A node gets banned for a reason and it is
+	usually due to real problems with the node.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DatabaseHashSize</title>
+      <para>Default: 100001</para>
+      <para>
+	Size of the hash chains for the local store of the tdbs that ctdb manages.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DatabaseMaxDead</title>
+      <para>Default: 5</para>
+      <para>
+	How many dead records per hashchain in the TDB database do we allow before
+	the freelist needs to be processed.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>RerecoveryTimeout</title>
+      <para>Default: 10</para>
+      <para>
+	Once a recovery has completed, no additional recoveries are permitted
+	until this timeout has expired.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>EnableBans</title>
+      <para>Default: 1</para>
+      <para>
+	When set to 0, this disables BANNING completely in the cluster and thus
+	nodes can not get banned, even if they break. Don't set to 0 unless you
+	know what you are doing.  You should set this to the same value on
+	all nodes to avoid unexpected behaviour.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DeterministicIPs</title>
+      <para>Default: 0</para>
+      <para>
+	When enabled, this tunable makes ctdb try to keep public IP addresses
+	locked to specific nodes as far as possible. This makes it easier for
+	debugging since you can know that as long as all nodes are healthy
+	public IP X will always be hosted by node Y. 
+      </para>
+      <para>
+	The cost of using deterministic IP address assignment is that it
+	disables part of the logic where ctdb tries to reduce the number of
+	public IP assignment changes in the cluster. This tunable may increase
+	the number of IP failover/failbacks that are performed on the cluster
+	by a small margin.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>LCP2PublicIPs</title>
+      <para>Default: 1</para>
+      <para>
+	When enabled this switches ctdb to use the LCP2 IP allocation
+	algorithm.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>ReclockPingPeriod</title>
+      <para>Default: x</para>
+      <para>
+	Obsolete
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>NoIPFailback</title>
+      <para>Default: 0</para>
+      <para>
+	When set to 1, ctdb will not perform failback of IP addresses when a node
+	becomes healthy. Ctdb WILL perform failover of public IP addresses when a
+	node becomes UNHEALTHY, but when the node becomes HEALTHY again, ctdb
+	will not fail the addresses back.
+      </para>
+      <para>
+	Use with caution! Normally when a node becomes available to the cluster
+	ctdb will try to reassign public IP addresses onto the new node as a way
+	to distribute the workload evenly across the cluster nodes. Ctdb tries to
+	make sure that all running nodes host approximately the same number of
+	public addresses.
+      </para>
+      <para>
+	When you enable this tunable, CTDB will no longer attempt to rebalance
+	the cluster by failing IP addresses back to the new nodes. An unbalanced
+	cluster will therefore remain unbalanced until there is manual
+	intervention from the administrator. When this parameter is set, you can
+	manually fail public IP addresses over to the new node(s) using the
+	'ctdb moveip' command.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>DisableIPFailover</title>
+      <para>Default: 0</para>
+      <para>
+	When enabled, ctdb will not perform failover or failback. Even if a
+	node fails while holding public IPs, ctdb will not recover the IPs or
+	assign them to another node.
+      </para>
+      <para>
+	When you enable this tunable, CTDB will no longer attempt to recover
+	the cluster by failing IP addresses over to other nodes. This leads to
+	a service outage until the administrator has manually performed failover
+	to replacement nodes using the 'ctdb moveip' command.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>NoIPTakeover</title>
+      <para>Default: 0</para>
+      <para>
+	When set to 1, ctdb will not allow IP addresses to be failed over
+	onto this node. Any IP addresses that the node currently hosts
+	will remain on the node but no new IP addresses can be failed over
+	to the node.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>NoIPHostOnAllDisabled</title>
+      <para>Default: 0</para>
+      <para>
+	If no nodes are healthy then by default ctdb will happily host
+	public IPs on disabled (unhealthy or administratively disabled)
+	nodes.  This can cause problems, for example if the underlying
+	cluster filesystem is not mounted.  When set to 1 on a node and
+	that node is disabled, any IPs hosted by this node will be
+	released and the node will not take over any IPs until it is no
+	longer disabled.
+      </para>
+    </refsect2>

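Worked example for the KeepaliveLimit description in the diff above. The text gives KeepaliveInterval*(KeepaliveLimit+1) seconds as the time it can take to mark a hung node DISCONNECTED, and asks that this stay well below common CIFS timeouts (45-90 seconds). The sketch below only evaluates that formula. The function name and the 45-second threshold constant are illustrative assumptions and are not part of CTDB.

```python
# Evaluates the formula quoted in the KeepaliveLimit description:
#   KeepaliveInterval * (KeepaliveLimit + 1)
# The helper name and CIFS_TIMEOUT_MIN are hypothetical, chosen for illustration.

def disconnect_detection_seconds(keepalive_interval, keepalive_limit):
    """Upper bound, in seconds, before a silent peer is marked DISCONNECTED."""
    return keepalive_interval * (keepalive_limit + 1)

# Lower end of the "common CIFS timeouts (45-90 seconds)" range from the text.
CIFS_TIMEOUT_MIN = 45

if __name__ == "__main__":
    # (5, 5) are the documented defaults; the other pairs are made-up values.
    for interval, limit in [(5, 5), (5, 8), (10, 5)]:
        seconds = disconnect_detection_seconds(interval, limit)
        verdict = "ok" if seconds < CIFS_TIMEOUT_MIN else "too slow"
        print(f"KeepaliveInterval={interval} KeepaliveLimit={limit}: "
              f"{seconds}s ({verdict})")
```

With the documented defaults (5 and 5) the bound is 30 seconds, which is below the lower end of the CIFS range.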

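The NEWS entry in the diff above says that setting the removed tunables VacuumDefaultInterval, VacuumMinInterval or VacuumMaxInterval via CTDB_SET_X=Y will now prevent CTDB from starting. A minimal sketch of a pre-upgrade check follows. It is not a tool shipped with CTDB; the function name, the sample text and the case-insensitive matching are assumptions made for illustration.

```python
import re

# Tunables named as removed in the NEWS entry.
OBSOLETE_TUNABLES = {"VacuumDefaultInterval", "VacuumMinInterval", "VacuumMaxInterval"}
_OBSOLETE_LOWER = {name.lower() for name in OBSOLETE_TUNABLES}

# Matches active (uncommented) lines such as: CTDB_SET_VacuumMinInterval=10
_SET_LINE = re.compile(r"^\s*CTDB_SET_(\w+)\s*=")

def find_obsolete_settings(config_text):
    """Return (line number, tunable name) for each obsolete CTDB_SET_X line.

    Names are compared case-insensitively; that is an assumption of this
    sketch, made so that differently cased spellings are also reported.
    """
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = _SET_LINE.match(line)
        if match and match.group(1).lower() in _OBSOLETE_LOWER:
            hits.append((lineno, match.group(1)))
    return hits

if __name__ == "__main__":
    # Hypothetical configuration text, not taken from a real system.
    sample = (
        "CTDB_SET_VacuumMinInterval=10\n"
        "# CTDB_SET_VacuumMaxInterval=30\n"
        "CTDB_SET_MonitorInterval=15\n"
    )
    for lineno, name in find_obsolete_settings(sample):
        print(f"line {lineno}: remove CTDB_SET_{name}")
```

Commented-out lines are ignored because the pattern requires CTDB_SET_ to be the first non-blank text on the line.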
-- 
CTDB repository