[SCM] Samba Shared Repository - branch master updated

Mon Jan 17 11:17:02 UTC 2022

The branch, master has been updated
       via  da2e1047f1f WHATSNEW: Document CTDB leader and cluster lock changes
       via  f7de2132bb9 ctdb-doc: Remove documentation for recovery process
       via  a940ad93706 ctdb-doc: Update example configuration migration script
       via  01313ea243e ctdb-tests: Improve test coverage for leader role yield and elections
       via  5d317781498 ctdb-tests: Support commenting out local daemons configuration options
       via  34d2ca0ae64 ctdb-config: Add configuration option [cluster] leader timeout
       via  1dfb266038f ctdb-config: [legacy] recmaster capability -> [cluster] leader capability
       via  f5a39058f07 ctdb-config: [cluster] recovery lock -> [cluster] cluster lock
       via  d752a92e115 ctdb-doc: Update documentation for leader and cluster lock
       via  73555e8248a ctdb-recoverd: Use race for cluster lock as election when lock is enabled
       via  938d64c8ff3 ctdb-protocol: Mark {GET,SET}_RECMASTER controls obsolete
       via  03ae158cffc ctdb-protocol: Drop marshalling for {GET,SET}_RECMASTER controls
       via  a76374070d3 ctdb-daemon: Drop implementation of {GET,SET}_RECMASTER controls
       via  193b624d26a ctdb-protocol: Drop protocol client functions for recmaster controls
       via  cda673ff6dc ctdb-client: Drop unused recmaster functions
       via  16efbca0036 ctdb-daemon: Drop unused old client recmaster functions
       via  c68267b2a60 ctdb-recoverd: Drop calls to ctdb_ctrl_setrecmaster()
       via  58d7fcdf7c9 ctdb-recoverd: Drop recovery master verification
       via  f02e0974857 ctdb-tools: recovery master -> leader
       via  e60581d5b5e ctdb-tools: Use leader broadcast in get_leader()
       via  92fb68e9b8a ctdb-tools: Factor out get_leader()
       via  17ba15ccd88 ctdb-tools: Handle leader broadcasts in ctdb tool
       via  ec90f36cc61 ctdb-tools: Print "UNKNOWN" when leader PNN is unknown
       via  01a8d1a4a40 ctdb-client: Factor out function ctdb_client_wait_func_timeout()
       via  403db5b5288 ctdb-tests: Factor out getting leader and waiting for leader change
       via  4786982cc80 ctdb-tests: Add leader broadcasts to fake_ctdbd
       via  756dfdfed9f ctdb-tests: Implement srvid_handler for dispatching messages
       via  958746f947d ctdb-recoverd: Simplify some stopped/banned checks to inactive checks
       via  358c59f51ab ctdb-recoverd: No longer take cluster lock during recovery
       via  36ffaaa691c ctdb-recoverd: Add and use function cluster_lock_enabled()
       via  5ee664ee17f ctdb-recoverd: Terminology change: recovery lock -> cluster lock
       via  0f2250f4f9f ctdb-recoverd: Take cluster lock when election completes
       via  011e880002b ctdb-recoverd: Factor out function cluster_lock_take()
       via  037abf86206 ctdb-tests: Avoid a race
       via  ef7e3265f76 ctdb-tests: Setup cluster with expected arguments
       via  b029ca4d513 ctdb-recoverd: Drop leader validation
       via  7e53fab0a36 ctdb-recoverd: Drop special case for elected-before-connected
       via  ef4b8c13c07 ctdb-recoverd: Handle leader broadcast timeout
       via  5c7f6da0f0e ctdb-recoverd: Send leader broadcasts
       via  789a75abfa2 ctdb-recoverd: Process leader broadcasts
       via  3d3767a259b ctdb-protocol: Add CTDB_SRVID_LEADER
       via  c2cfd9c21aa ctdb-recoverd: Add an explicit flag for election in progress
       via  ac5a3ca063f ctdb-recoverd: Only start election if node can be leader
       via  7baadfe27ed ctdb-recoverd: Add and use function this_node_can_be_leader()
       via  94b546c268e ctdb-recoverd: Logging/comments: recovery master -> leader
       via  dd79e9bd14d ctdb-recoverd: Rename recmaster field to leader
       via  2ee6763c7d9 ctdb-recoverd: Use rec->pnn everywhere
       via  4af3b10a378 ctdb-recoverd: Change argument to srvid_disable_and_reply()
       via  b7c138ca99a ctdb-recoverd: Simplify arguments to ctdb_ban_node()
       via  a5e0ddac626 ctdb-recoverd: Simplify arguments to verify_local_ip_allocation()
       via  67b51916408 ctdb-recoverd: Simplify arguments to do_recovery()
       via  57882beb16a ctdb-recoverd: Simplify arguments to some election functions
       via  9dbe7cc85e4 ctdb-recoverd: Add PNN to recovery daemon context
       via  ff0140e4700 ctdb-recoverd: Use this_node_is_leader() in an extra context
       via  c8721d01c65 ctdb-recoverd: Factor out and use function this_node_is_leader()
      from  57a32cebdd8 ctdb-recoverd: Pass SIGHUP to running helper

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit da2e1047f1fc9f0ac98490c79c21c427b47274d5
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 14 13:39:34 2022 +1100

    WHATSNEW: Document CTDB leader and cluster lock changes
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Martin Schwenke <martins at samba.org>
    Autobuild-Date(master): Mon Jan 17 11:16:14 UTC 2022 on sn-devel-184

commit f7de2132bb999780331e5b005946ba5b494063c1
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 10 13:41:31 2022 +1100

    ctdb-doc: Remove documentation for recovery process
    
    This is many years out of date and recent changes make it worse.  It
    is unlikely that anyone has the time to fix this in the near future,
    so remove it because it is misleading.
    
    Database recovery steps are well documented in comments in the
    recovery helper.  Cluster monitoring documentation can be re-added
    when things stop changing.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit a940ad9370687c97d1ccb0f934842b69c1d44c76
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 17 09:16:17 2022 +1100

    ctdb-doc: Update example configuration migration script
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 01313ea243e4d52ea558ca4c53b6f4a1f07341e7
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 14 23:09:38 2022 +1100

    ctdb-tests: Improve test coverage for leader role yield and elections
    
    Rename test, clean up node selection.  Duplicate for banning and
    removing leader capability cases.  Repeat all 3 tests without cluster
    lock.
    
    All of the standard election triggers are now tested, with and without
    cluster lock.  Due to test cluster configuration limitations, the
    tests without cluster lock are skipped on a real cluster.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 5d317781498a69c94b47ce47b60438e6cb520f96
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 14 13:59:25 2022 +1100

    ctdb-tests: Support commenting out local daemons configuration options
    
    Can be used to disable default options, such as cluster lock.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 34d2ca0ae6471c8d742b22aa4c57012232a2a832
Author: Martin Schwenke <martin at meltin.net>
Date:   Sat Jan 15 13:02:02 2022 +1100

    ctdb-config: Add configuration option [cluster] leader timeout
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 1dfb266038f6fdf971bb0ffe0726f778b986371d
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 10 14:15:25 2022 +1100

    ctdb-config: [legacy] recmaster capability -> [cluster] leader capability
    
    Rename this configuration item and move it into the [cluster]
    configuration section.
    
    Update documentation to match.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
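
Note: the rename above also moves the option between sections. The sketch below shows one way an existing configuration could be rewritten. It is a minimal illustration using Python's configparser, assuming ctdb.conf is plain INI-style text. The function name is hypothetical and this is not the example migration script touched elsewhere in this push.

```python
import configparser
import io


def migrate_leader_capability(text):
    """Move '[legacy] recmaster capability' to '[cluster] leader capability'."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(text)

    if cfg.has_option("legacy", "recmaster capability"):
        value = cfg.get("legacy", "recmaster capability")
        cfg.remove_option("legacy", "recmaster capability")
        if not cfg.has_section("cluster"):
            cfg.add_section("cluster")
        cfg.set("cluster", "leader capability", value)
        # Drop [legacy] if the moved option was its only content.
        if not cfg.options("legacy"):
            cfg.remove_section("legacy")

    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


old_config = "[legacy]\nrecmaster capability = false\n"
print(migrate_leader_capability(old_config))
```

configparser discards comments when writing, so a real migration would probably want a line-oriented rewrite instead.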

commit f5a39058f0743f5607df91cb698a2b15618e1360
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 10 19:18:14 2022 +1100

    ctdb-config: [cluster] recovery lock -> [cluster] cluster lock
    
    Retain "recovery lock" and mark as deprecated for backward
    compatibility.
    
    Some documentation is still inconsistent.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
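
Note: keeping a deprecated name for backward compatibility usually means the reader of the configuration accepts both spellings. The sketch below illustrates that pattern in Python. It is not CTDB's loader: the function name, the lock path and the choice to prefer the new name when both are present are all assumptions made for illustration.

```python
import configparser
import warnings


def get_cluster_lock(cfg):
    """Return the configured cluster lock, or None if no lock is configured."""
    # Assumption: the new option name wins if both are set.
    if cfg.has_option("cluster", "cluster lock"):
        return cfg.get("cluster", "cluster lock")
    if cfg.has_option("cluster", "recovery lock"):
        warnings.warn(
            "'recovery lock' is deprecated; use 'cluster lock'",
            DeprecationWarning,
        )
        return cfg.get("cluster", "recovery lock")
    return None


cfg = configparser.ConfigParser(interpolation=None)
# Hypothetical path, for illustration only.
cfg.read_string("[cluster]\nrecovery lock = /shared/.ctdb-lock\n")
print(get_cluster_lock(cfg))
```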

commit d752a92e1153fa355b0cbaa1f482fdc0d88e42f5
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 10 14:18:32 2022 +1100

    ctdb-doc: Update documentation for leader and cluster lock
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 73555e8248aff683b6cb3a02262a66ab52f2c665
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Mar 18 15:14:39 2020 +1100

    ctdb-recoverd: Use race for cluster lock as election when lock is enabled
    
    If the cluster is partitioned then nodes in one partition can not take
    the lock anyway, so election is pointless.  It just introduces
    unnecessary corner cases.
    
    Instead just race for the lock.
    
    When a node notices a lack of leader and notifies other nodes of an
    election via an unknown leader broadcast, the cluster lock election is
    hooked into this broadcast.
    
    The test needs to be updated because losing the cluster lock can now
    result in a leadership change.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
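
Note: the idea of using the lock race as the election can be modelled in a few lines. The sketch below is a toy model in Python, not CTDB code. An in-process mutex stands in for the cluster lock, and ClusterLock and elect_by_lock_race are hypothetical names.

```python
import threading


class ClusterLock:
    """Stand-in for a cluster-wide mutex; at most one holder at a time."""

    def __init__(self):
        self._mutex = threading.Lock()
        self.holder = None

    def try_take(self, pnn):
        # Non-blocking attempt: losing the race is a normal outcome.
        if self._mutex.acquire(blocking=False):
            self.holder = pnn
            return True
        return False


def elect_by_lock_race(lock, candidate_pnns):
    """Every candidate tries the lock; whoever takes it is the leader."""
    winners = []

    def contend(pnn):
        if lock.try_take(pnn):
            winners.append(pnn)

    threads = [threading.Thread(target=contend, args=(p,)) for p in candidate_pnns]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners[0] if winners else None


lock = ClusterLock()
leader = elect_by_lock_race(lock, [0, 1, 2])
print("leader:", leader)
```

Candidates that cannot take the lock simply get no leader from the race, which mirrors the point above about nodes in a partition that cannot reach the lock.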

commit 938d64c8ff3d1776c2d5959714c4c11eba7278c4
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 6 00:19:38 2020 +1000

    ctdb-protocol: Mark {GET,SET}_RECMASTER controls obsolete
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 03ae158cffc3812f82365c65f8333768539f854d
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 6 00:10:22 2020 +1000

    ctdb-protocol: Drop marshalling for {GET,SET}_RECMASTER controls
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit a76374070d38e2dc86067ce413bb26b8e554c0b2
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 6 00:01:05 2020 +1000

    ctdb-daemon: Drop implementation of {GET,SET}_RECMASTER controls
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 193b624d26acffaa39a5fc393268f152b5809f99
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 5 23:58:38 2020 +1000

    ctdb-protocol: Drop protocol client functions for recmaster controls
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit cda673ff6dc6e33e947022305859f004197a803a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 5 23:56:10 2020 +1000

    ctdb-client: Drop unused recmaster functions
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 16efbca0036ee444aecfa0a992ff733bb182b2c7
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 5 23:52:05 2020 +1000

    ctdb-daemon: Drop unused old client recmaster functions
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit c68267b2a60559755835c4d56b5ba7c766155489
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 5 23:26:41 2020 +1000

    ctdb-recoverd: Drop calls to ctdb_ctrl_setrecmaster()
    
    Nothing fetches this value anymore.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 58d7fcdf7c9568a3a4b9d8e5db8b68f073409ab1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 5 23:25:34 2020 +1000

    ctdb-recoverd: Drop recovery master verification
    
    This doesn't make sense if leader broadcasts are used.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit f02e097485722badf27523c706adb99f21342f56
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 10 13:22:19 2022 +1100

    ctdb-tools: recovery master -> leader
    
    The following command names are changed:
    
      recmaster -> leader
      setrecmasterrole -> setleaderrole
    
    Command output changed for the following commands:
    
      status
      getcapabilities
    
    Documentation and tests are updated to reflect these changes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
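
Note: scripts that invoke the old command names need updating. The sketch below shows one way a wrapper might translate them, using only the two renames listed above. The function name is hypothetical, the argument shown for setrecmasterrole is illustrative, and the sketch assumes the command word is the first element of the list (any ctdb options already stripped).

```python
# Old command name -> new command name, as listed in the commit message.
RENAMED_COMMANDS = {
    "recmaster": "leader",
    "setrecmasterrole": "setleaderrole",
}


def translate_ctdb_command(args):
    """Rewrite the command word of a ctdb argument list if it was renamed."""
    if not args:
        return []
    return [RENAMED_COMMANDS.get(args[0], args[0]), *args[1:]]


print(translate_ctdb_command(["setrecmasterrole", "off"]))
```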

commit e60581d5b5ecbac2b4bae49fbf60e071372fc2d3
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Mar 19 17:14:10 2020 +1100

    ctdb-tools: Use leader broadcast in get_leader()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 92fb68e9b8a5481d9dd5c9033c98e204035509fe
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Mar 19 17:30:24 2020 +1100

    ctdb-tools: Factor out get_leader()
    
    This seems pointless but it localises a subsequent change and also
    starts a terminology change in the tool code.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 17ba15ccd88367dca82b0c4c8e4ff3f859896d87
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 4 17:56:22 2020 +1000

    ctdb-tools: Handle leader broadcasts in ctdb tool
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit ec90f36cc6185fc6ed13164fb13ec3630aff68ad
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Mar 19 10:46:25 2020 +1100

    ctdb-tools: Print "UNKNOWN" when leader PNN is unknown
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 01a8d1a4a400a3bacbe334ef0f379c03d64633d5
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 4 19:01:09 2020 +1000

    ctdb-client: Factor out function ctdb_client_wait_func_timeout()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 403db5b52882c91f35ae189bcf8f01f8180c7b50
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 14 21:47:52 2022 +1100

    ctdb-tests: Factor out getting leader and waiting for leader change
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 4786982cc80f4ec0c23673a144ac179fa60bde78
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 5 23:02:03 2020 +1000

    ctdb-tests: Add leader broadcasts to fake_ctdbd
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 756dfdfed9fe7d6acf2cf894d9918c8ac489571e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue May 5 16:53:39 2020 +1000

    ctdb-tests: Implement srvid_handler for dispatching messages
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Martin Schwenke <martin at meltin.net>

commit 958746f947dcd499b0fe9afee21e436912739284
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Mar 17 17:10:20 2020 +1100

    ctdb-recoverd: Simplify some stopped/banned checks to inactive checks
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 358c59f51ab39175ffe72afdfc4c2e0ed23b5929
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 4 17:45:51 2020 +1000

    ctdb-recoverd: No longer take cluster lock during recovery
    
    Confirm instead that it is already held.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 36ffaaa691c63896b7b92628b147b7a564421311
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 10 11:43:10 2021 +1100

    ctdb-recoverd: Add and use function cluster_lock_enabled()
    
    Now all references to ctdb->recovery_lock are encapsulated in the
    cluster lock code.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 5ee664ee17fa4d2fbdea2be3f4c0b1fd8f8971b1
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 10 11:29:06 2021 +1100

    ctdb-recoverd: Terminology change: recovery lock -> cluster lock
    
    No functional changes, just name changes for clarity.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 0f2250f4f9f4efbf73e887538969c395c57e57be
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Sep 20 14:13:58 2018 +1000

    ctdb-recoverd: Take cluster lock when election completes
    
    It is no longer just a recovery lock but is always held by the cluster
    leader.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 011e880002b8d2bc783f96e8ea5713322fcc2a93
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Sep 20 12:30:58 2018 +1000

    ctdb-recoverd: Factor out function cluster_lock_take()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 037abf862069694acd849760175be9943a6fcd3e
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Mar 17 17:58:02 2020 +1100

    ctdb-tests: Avoid a race
    
    See the comment in the code for details.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit ef7e3265f76fbfdacdd9f17f3ddfca79ce823b60
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 7 17:00:36 2021 +1100

    ctdb-tests: Setup cluster with expected arguments
    
    ctdb_test_init() doesn't actually pass arguments to local_daemons.sh.
    This needs to be done using ctdb_nodes_start_custom().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit b029ca4d513163c4b0146c2a303130ae2a2581b4
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 17 12:54:23 2021 +1100

    ctdb-recoverd: Drop leader validation
    
    The introduction of the leader broadcast timeout provides an
    alternative to the current leader validation.  Using the leader
    broadcast may not be as fast but it is more correct.
    
    When the leader node is stopped or banned, the only way of triggering
    an election is currently to fetch the leader's node map to check
    whether it is still active.  This is because the leader will no
    longer push the node map to other nodes.  However, having all nodes
    fetch the node map from an inactive leader may be unreliable.
    
    Most of the other cases are also handled more reliably by the leader
    broadcast timeout.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 7e53fab0a364426a03932974727c386e750716be
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 6 14:47:45 2022 +1100

    ctdb-recoverd: Drop special case for elected-before-connected
    
    This no longer occurs at startup due to the leader broadcast timeout.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit ef4b8c13c0762fc5072627ee0211b3bf506f2d73
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 17 14:42:47 2021 +1100

    ctdb-recoverd: Handle leader broadcast timeout
    
    If no leader broadcasts have been received from the leader for more
    than 5s then trigger an election.
    
    Apart from being sane behaviour, this avoids elected-before-connected
    bugs at startup, where a node elects itself leader before it is
    connected to other nodes.
    
    When a node processes a leader broadcast timeout it sends an unknown
    leader broadcast to all nodes.  That causes cancellation of the leader
    broadcast timeout across the cluster.  This is particularly important at
    startup, since nodes may be started in a staggered fashion.  Without
    this cluster-wide cancellation, a node might notice the lack of
    leader, win an election and complete a recovery before other nodes
    notice the lack of leader.  When the leader broadcast timeout finally
    occurs on the other nodes then they'll put the cluster back into an
    unnecessary recovery.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
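
Note: the cluster-wide cancellation described above can be illustrated with a small model. The sketch below is a toy in Python, not the recovery daemon's implementation. The class, its fields and the use of None as the "unknown leader" marker are assumptions. Only the 5 second threshold comes from the commit message.

```python
LEADER_BROADCAST_TIMEOUT = 5.0  # seconds, taken from the commit message
UNKNOWN_LEADER = None           # hypothetical marker for "no known leader"


class Node:
    """Toy model of one node's view of the leader (not CTDB code)."""

    def __init__(self, pnn, now):
        self.pnn = pnn
        self.leader = UNKNOWN_LEADER
        self.last_broadcast = now      # timeout is armed from startup
        self.timeout_armed = True
        self.election_in_progress = False

    def on_leader_broadcast(self, leader, now):
        if leader is UNKNOWN_LEADER:
            # Another node noticed the missing leader: cancel our own
            # timeout and treat an election as being under way.
            self.leader = UNKNOWN_LEADER
            self.timeout_armed = False
            self.election_in_progress = True
        else:
            self.leader = leader
            self.last_broadcast = now
            self.timeout_armed = True
            self.election_in_progress = False

    def check_timeout(self, now, cluster):
        """Return True if this node triggered an election."""
        if not self.timeout_armed:
            return False
        if now - self.last_broadcast <= LEADER_BROADCAST_TIMEOUT:
            return False
        # Tell every node (including this one) that the leader is unknown.
        for node in cluster:
            node.on_leader_broadcast(UNKNOWN_LEADER, now)
        return True


# Nodes started one second apart, as in a staggered startup.
cluster = [Node(0, now=0.0), Node(1, now=1.0), Node(2, now=2.0)]
print(cluster[0].check_timeout(5.5, cluster))  # first node notices
print(cluster[1].check_timeout(6.5, cluster))  # already cancelled
```

In this model the later nodes never fire their own timeouts, so they cannot force a second, unnecessary recovery.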

commit 5c7f6da0f0e6c92ae4cd338b92f475bb4a8e2cc9
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Mar 16 16:16:44 2020 +1100

    ctdb-recoverd: Send leader broadcasts
    
    These are triggered on a 1 second timer, but are only sent if the node
    is the current leader and there is no election underway.
    
    If this node can not be the leader then ensure it releases the
    recovery lock.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
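
Note: the conditions above amount to a small decision made on every timer tick. The sketch below models that decision in Python. It is not the recovery daemon's code: the RecState fields and the function name are hypothetical, and returning without sending when the node cannot be leader is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RecState:
    """Hypothetical subset of recovery daemon state used by this model."""
    pnn: int
    leader: Optional[int]
    election_in_progress: bool = False
    can_be_leader: bool = True
    holds_lock: bool = False


def leader_broadcast_tick(rec: RecState, send: Callable[[int], None]) -> bool:
    """Run once per timer tick; return True if a broadcast was sent."""
    if not rec.can_be_leader:
        # A node that can not lead must not keep holding the lock.
        rec.holds_lock = False
        return False
    if rec.leader != rec.pnn or rec.election_in_progress:
        return False
    send(rec.pnn)
    return True


sent = []
leader_broadcast_tick(RecState(pnn=0, leader=0), sent.append)
print(sent)
```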

commit 789a75abfa2af0af39616c69575882e5db2b6f07
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Mar 16 16:07:26 2020 +1100

    ctdb-recoverd: Process leader broadcasts
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 3d3767a259b29674882c102fe629cff1eb1a702c
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Mar 16 16:05:29 2020 +1100

    ctdb-protocol: Add CTDB_SRVID_LEADER
    
    CTDB_SRVID_LEADER will be regularly broadcast to all connected nodes
    by the leader.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit c2cfd9c21aae6045b4ebf3ba330cbf2b9631490e
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Mar 18 20:27:10 2020 +1100

    ctdb-recoverd: Add an explicit flag for election in progress
    
    An alternate election method will be added that doesn't use the
    election timeout, so this provides a common way for recognising when
    an election is in progress.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit ac5a3ca063fd7435557a65866fda5fa1e0012394
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 7 11:27:06 2022 +1100

    ctdb-recoverd: Only start election if node can be leader
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 7baadfe27eda40560753fb4a61e053ea357fd2d2
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 14 10:57:03 2021 +1100

    ctdb-recoverd: Add and use function this_node_can_be_leader()
    
    This makes the code self-documenting.
    
    In ctdb_election_data() there is a slight behaviour change.  An
    inactive node will now try to lose an election.  This case should
    not happen because:
    
    * An inactive node can't win an election round and then send a reply.
    
    * Any inactive node should never start an election.  There are
      currently places where this happens and they will be fixed later.
    
    There is an instance where this could be used in
    validate_recovery_master() but this involves a more serious logic
    change.  Overhaul this function later.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
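
Note: a predicate like the one added above typically combines node state with the configured capability. The sketch below is a Python model of such a check, not the C function itself. It assumes "inactive" means stopped or banned and that the node must also have the leader capability. The NodeState type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical view of the flags that matter for leadership."""
    stopped: bool = False
    banned: bool = False
    leader_capability: bool = True


def can_be_leader(node: NodeState) -> bool:
    """Model of a 'can this node be leader?' check (assumed semantics)."""
    inactive = node.stopped or node.banned
    return not inactive and node.leader_capability


print(can_be_leader(NodeState()))
print(can_be_leader(NodeState(banned=True)))
```

Keeping the check in one named function means callers such as election code do not have to repeat the flag logic.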

commit 94b546c268ee5fb4505c6febe4bce05f1d75e7cd
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Dec 8 11:07:25 2021 +1100

    ctdb-recoverd: Logging/comments: recovery master -> leader
    
    There are some remaining instances in this file but they will be
    removed in subsequent commits.
    
    Modernise debug macros as appropriate.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit dd79e9bd14dd61fc60dfaac5c9065d465336714c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 14 15:22:33 2020 +1000

    ctdb-recoverd: Rename recmaster field to leader
    
    Recovery master is being renamed to leader.  This follows clustering
    best practice (e.g. RAFT).
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 2ee6763c7d9a8e347c0a98f918ad39f62222df31
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Dec 8 20:25:46 2021 +1100

    ctdb-recoverd: Use rec->pnn everywhere
    
    The node's PNN is currently referenced in a number of inconsistent
    ways, including:
    
    * pnn
    * rec->ctdb->pnn
    * ctdb->pnn
    * ctdb_get_pnn(ctdb)
    * ctdb_get_pnn(rec->ctdb)
    
    The first of these always requires some thought about the context - is
    this the node PNN or some other PNN (e.g. argument to function)?
    
    rec->pnn is now always used when referring to the recovery daemon's
    PNN.
    
    Doing this also reduces reliance on struct ctdb_context internals.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 4af3b10a378ea614f926c23570ec91334e2c6408
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Dec 8 21:28:05 2021 +1100

    ctdb-recoverd: Change argument to srvid_disable_and_reply()
    
    Reduce dependency on struct ctdb_context internals, enable a
    subsequent change.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit b7c138ca99a4a839b9c30e59dff40fd2b95e13ec
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 10 10:31:56 2021 +1100

    ctdb-recoverd: Simplify arguments to ctdb_ban_node()
    
    ban_time argument is always ctdb->tunable.recovery_ban_period, so
    build this in and make the calling code more readable.
    
    ctdb_ban_node() already logs how long a node is banned for, so don't
    repeatedly log this.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit a5e0ddac626bc90c859949c977657cdf1fa110ac
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Dec 13 09:51:36 2021 +1100

    ctdb-recoverd: Simplify arguments to verify_local_ip_allocation()
    
    All other arguments are available via rec, so simplify.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 67b51916408831f13ca05a6c395f01824288fe8d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jan 16 16:20:05 2018 +1100

    ctdb-recoverd: Simplify arguments to do_recovery()
    
    pnn and nodemap are both available via the rec context, so simplify.
    vnnmap is unused.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 57882beb16a89d5e4081d0645549891a04ab5fb0
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Dec 8 19:27:01 2021 +1100

    ctdb-recoverd: Simplify arguments to some election functions
    
    The pnn and nodemap arguments to force_election() and
    send_election_request() are always effectively rec->pnn and
    rec->nodemap, so simplify.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 9dbe7cc85e41ce4f9163d8298ba9fb20052db894
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 9 10:33:17 2021 +1100

    ctdb-recoverd: Add PNN to recovery daemon context
    
    The node's PNN is currently referenced in a number of inconsistent
    ways, including:
    
    * pnn
    * rec->ctdb->pnn
    * ctdb->pnn
    * ctdb_get_pnn(ctdb)
    * ctdb_get_pnn(rec->ctdb)
    
    The first of these always requires some thought about the context - is
    this the node PNN or some other PNN (e.g. argument to function)?
    
    The intention is to always use rec->pnn when referring to the recovery
    daemon's PNN.
    
    Doing this also reduces reliance on struct ctdb_context internals.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit ff0140e470016a7a2b5365c06f4d912e7a7c8af8
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 9 11:47:54 2021 +1100

    ctdb-recoverd: Use this_node_is_leader() in an extra context
    
    This is arguably clearer.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit c8721d01c6547f33f51b8e26b3e1f4370ec1ecc6
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Dec 8 19:37:39 2021 +1100

    ctdb-recoverd: Factor out and use function this_node_is_leader()
    
    Make the code self-documenting.
    
    This preempts an upcoming change to terminology but doing it now saves
    a lot of churn.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 WHATSNEW.txt                                       |  58 ++
 ctdb/client/client.h                               |  22 +
 ctdb/client/client_connect.c                       |  30 +-
 ctdb/client/client_control_sync.c                  |  58 --
 ctdb/client/client_sync.h                          |  10 -
 ctdb/cluster/cluster_conf.c                        |  49 +-
 ctdb/cluster/cluster_conf.h                        |   3 +
 ctdb/config/ctdb.conf                              |  10 +-
 ctdb/doc/cluster_mutex_helper.txt                  |   6 +-
 ctdb/doc/ctdb-etcd.7.xml                           |   4 +-
 ctdb/doc/ctdb.1.xml                                |  44 +-
 ctdb/doc/ctdb.7.xml                                |  99 ++-
 ctdb/doc/ctdb.conf.5.xml                           |  69 +-
 ctdb/doc/ctdb_mutex_ceph_rados_helper.7.xml        |   6 +-
 ctdb/doc/examples/config_migrate.sh                |   4 +-
 ctdb/doc/examples/ctdb.conf                        |   2 +-
 ctdb/doc/recovery-process.txt                      | 436 ----------
 ctdb/include/ctdb_client.h                         |  22 -
 ctdb/include/ctdb_private.h                        |   3 -
 ctdb/protocol/protocol.h                           |   9 +-
 ctdb/protocol/protocol_api.h                       |   8 -
 ctdb/protocol/protocol_client.c                    |  46 -
 ctdb/protocol/protocol_control.c                   |  27 -
 ctdb/protocol/protocol_message.c                   |  12 +
 ctdb/server/ctdb_client.c                          |  64 --
 ctdb/server/ctdb_config.c                          |  16 +-
 ctdb/server/ctdb_config.h                          |   4 +-
 ctdb/server/ctdb_control.c                         |   7 +-
 ctdb/server/ctdb_recover.c                         |  25 -
 ctdb/server/ctdb_recoverd.c                        | 947 +++++++++++----------
 ctdb/server/ctdbd.c                                |  12 +-
 ctdb/server/legacy_conf.c                          |   5 -
 ctdb/server/legacy_conf.h                          |   1 -
 .../INTEGRATION/database/recovery.001.volatile.sh  |  62 +-
 .../INTEGRATION/database/recovery.002.large.sh     |   8 +-
 .../simple/cluster.001.stop_leader_yield.sh        |  26 +
 .../simple/cluster.002.ban_leader_yield.sh         |  26 +
 .../simple/cluster.002.recmaster_yield.sh          |  29 -
 .../simple/cluster.003.capability_leader_yield.sh  |  24 +
 .../cluster.006.stop_leader_yield_no_lock.sh       |  30 +
 .../simple/cluster.007.ban_leader_yield_no_lock.sh |  30 +
 .../cluster.008.capability_leader_yield_no_lock.sh |  28 +
 .../simple/cluster.015.reclock_remove_lock.sh      |  26 +-
 .../simple/cluster.016.reclock_move_lock_dir.sh    |  18 +-
 ctdb/tests/UNIT/cunit/config_test_001.sh           |   4 +-
 ctdb/tests/UNIT/cunit/config_test_004.sh           |  72 +-
 ctdb/tests/UNIT/cunit/config_test_006.sh           |   5 -
 ctdb/tests/UNIT/cunit/protocol_test_101.sh         |   1 +
 ctdb/tests/UNIT/tool/ctdb.getcapabilities.001.sh   |   2 +-
 ctdb/tests/UNIT/tool/ctdb.getcapabilities.002.sh   |   2 +-
 ctdb/tests/UNIT/tool/ctdb.getcapabilities.004.sh   |   6 +-
 .../{ctdb.recmaster.001.sh => ctdb.leader.001.sh}  |   0
 .../{ctdb.recmaster.002.sh => ctdb.leader.002.sh}  |   0
 ctdb/tests/UNIT/tool/ctdb.status.001.sh            |   2 +-
 ctdb/tests/UNIT/tool/ctdb.status.002.sh            |   2 +-
 ctdb/tests/local_daemons.sh                        |  46 +-
 ctdb/tests/scripts/integration.bash                |  44 +
 ctdb/tests/src/fake_ctdbd.c                        | 138 ++-
 ctdb/tests/src/protocol_common_ctdb.c              |  28 +-
 ctdb/tests/src/protocol_ctdb_compat_test.c         |   1 +
 ctdb/tests/src/protocol_ctdb_test.c                |   1 +
 ctdb/tools/ctdb.c                                  | 191 ++++-
 ctdb/utils/ceph/ctdb_mutex_ceph_rados_helper.c     |   2 +-
 ctdb/utils/ceph/test_ceph_rados_reclock.sh         |   4 +-
 ctdb/utils/etcd/ctdb_etcd_lock                     |   4 +-
 65 files changed, 1494 insertions(+), 1486 deletions(-)
 delete mode 100644 ctdb/doc/recovery-process.txt
 create mode 100755 ctdb/tests/INTEGRATION/simple/cluster.001.stop_leader_yield.sh
 create mode 100755 ctdb/tests/INTEGRATION/simple/cluster.002.ban_leader_yield.sh
 delete mode 100755 ctdb/tests/INTEGRATION/simple/cluster.002.recmaster_yield.sh
 create mode 100755 ctdb/tests/INTEGRATION/simple/cluster.003.capability_leader_yield.sh
 create mode 100755 ctdb/tests/INTEGRATION/simple/cluster.006.stop_leader_yield_no_lock.sh
 create mode 100755 ctdb/tests/INTEGRATION/simple/cluster.007.ban_leader_yield_no_lock.sh
 create mode 100755 ctdb/tests/INTEGRATION/simple/cluster.008.capability_leader_yield_no_lock.sh
 rename ctdb/tests/UNIT/tool/{ctdb.recmaster.001.sh => ctdb.leader.001.sh} (100%)
 rename ctdb/tests/UNIT/tool/{ctdb.recmaster.002.sh => ctdb.leader.002.sh} (100%)


Changeset truncated at 500 lines:

diff --git a/WHATSNEW.txt b/WHATSNEW.txt
index c82fa5079ce..a65439c43da 100644
--- a/WHATSNEW.txt
+++ b/WHATSNEW.txt
@@ -74,6 +74,64 @@ listen on port 53. Starting with this version it is possible to configure the
 port using host:port notation. See smb.conf for more details. Existing setups
 are not affected, as the default port is 53.
 
+CTDB changes
+------------
+
+* The "recovery master" role has been renamed "leader"
+
+  Documentation and logs now refer to "leader".
+
+  The following ctdb tool command names have changed:
+
+    recmaster -> leader
+    setrecmasterrole -> setleaderrole
+
+  Command output has changed for the following commands:
+
+    status
+    getcapabilities
+
+  The "[legacy] -> recmaster capability" configuration option has been
+  renamed and moved to the cluster section, so this is now:
+
+    [cluster] -> leader capability
+
+* The "recovery lock" has been renamed "cluster lock"
+
+  Documentation and logs now refer to "cluster lock".
+
+  The "[cluster] -> recovery lock" configuration option has been
+  deprecated and will be removed in a future version.  Please use
+  "[cluster] -> cluster lock" instead.
+
+  If the cluster lock is enabled then traditional elections are not
+  done and leader elections use a race for the cluster lock.  This
+  avoids various conditions where a node is elected leader but can not
+  take the cluster lock.  Such conditions included:
+
+  - At startup, a node elects itself leader of its own cluster before
+    connecting to other nodes
+
+  - Cluster filesystem failover is slow
+
+  The abbreviation "reclock" is still used in many places, because a
+  better abbreviation eludes us (i.e. "clock" is obvious bad) and
+  changing all instances would require a lot of churn.  If the
+  abbreviation "reclock" for "cluster lock" is confusing, please
+  consider mentally prefixing it with "really excellent".
+
+* CTDB now uses leader broadcasts and an associated timeout to
+  determine if an election is required
+
+  The leader broadcast timeout can be configured via new configuration
+  option
+
+    [cluster] -> leader timeout
+
+  This specifies the number of seconds without leader broadcasts
+  before a node calls an election.  The default is 5.
+
+
 REMOVED FEATURES
 ================
 
diff --git a/ctdb/client/client.h b/ctdb/client/client.h
index 88ee5768d76..5f174035e28 100644
--- a/ctdb/client/client.h
+++ b/ctdb/client/client.h
@@ -170,6 +170,28 @@ uint32_t ctdb_client_pnn(struct ctdb_client_context *client);
  */
 void ctdb_client_wait(struct tevent_context *ev, bool *done);
 
+/**
+ * @brief Client event loop waiting for function to return true with timeout
+ *
+ * This can be used to wait for asynchronous computations to complete.
+ * When this function is called, it will run tevent event loop and wait
+ * till the done function returns true or if the timeout occurs.
+ *
+ * This function will return when either
+ *  - done function returns true, or
+ *  - timeout has occurred.
+ *
+ * @param[in] ev Tevent context
+ * @param[in] done_func Function flag to indicate when to stop waiting
+ * @param[in] private_data Passed to done function
+ * @param[in] timeout How long to wait
+ * @return 0 on success, ETIMEDOUT on timeout, and errno on failure
+ */
+int ctdb_client_wait_func_timeout(struct tevent_context *ev,
+				  bool (*done_func)(void *private_data),
+				  void *private_data,
+				  struct timeval timeout);
+
 /**
  * @brief Client event loop waiting for a flag with timeout
  *
diff --git a/ctdb/client/client_connect.c b/ctdb/client/client_connect.c
index 0977d717608..a942871b1d2 100644
--- a/ctdb/client/client_connect.c
+++ b/ctdb/client/client_connect.c
@@ -336,8 +336,10 @@ static void ctdb_client_wait_timeout_handler(struct tevent_context *ev,
 	*timed_out = true;
 }
 
-int ctdb_client_wait_timeout(struct tevent_context *ev, bool *done,
-			     struct timeval timeout)
+int ctdb_client_wait_func_timeout(struct tevent_context *ev,
+				  bool (*done_func)(void *private_data),
+				  void *private_data,
+				  struct timeval timeout)
 {
 	TALLOC_CTX *mem_ctx;
 	struct tevent_timer *timer;
@@ -356,7 +358,7 @@ int ctdb_client_wait_timeout(struct tevent_context *ev, bool *done,
 		return ENOMEM;
 	}
 
-	while (! (*done) && ! timed_out) {
+	while (! (done_func(private_data)) && ! timed_out) {
 		tevent_loop_once(ev);
 	}
 
@@ -369,6 +371,28 @@ int ctdb_client_wait_timeout(struct tevent_context *ev, bool *done,
 	return 0;
 }
 
+static bool client_wait_done(void *private_data)
+{
+	bool *done = (bool *)private_data;
+
+	return *done;
+}
+
+int ctdb_client_wait_timeout(struct tevent_context *ev,
+			     bool *done,
+			     struct timeval timeout)
+
+{
+	int ret;
+
+	ret = ctdb_client_wait_func_timeout(ev,
+					    client_wait_done,
+					    done,
+					    timeout);
+
+	return ret;
+}
+
 struct ctdb_recovery_wait_state {
 	struct tevent_context *ev;
 	struct ctdb_client_context *client;
diff --git a/ctdb/client/client_control_sync.c b/ctdb/client/client_control_sync.c
index 1459dc09b46..c786fc7dbca 100644
--- a/ctdb/client/client_control_sync.c
+++ b/ctdb/client/client_control_sync.c
@@ -615,64 +615,6 @@ int ctdb_ctrl_get_pid(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
 	return 0;
 }
 
-int ctdb_ctrl_get_recmaster(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
-			    struct ctdb_client_context *client,
-			    int destnode, struct timeval timeout,
-			    uint32_t *recmaster)
-{
-	struct ctdb_req_control request;
-	struct ctdb_reply_control *reply;
-	int ret;
-
-	ctdb_req_control_get_recmaster(&request);
-	ret = ctdb_client_control(mem_ctx, ev, client, destnode, timeout,
-				  &request, &reply);
-	if (ret != 0) {
-		DEBUG(DEBUG_ERR,
-		      ("Control GET_RECMASTER failed to node %u, ret=%d\n",
-		       destnode, ret));
-		return ret;
-	}
-
-	ret = ctdb_reply_control_get_recmaster(reply, recmaster);
-	if (ret != 0) {
-		DEBUG(DEBUG_ERR,
-		      ("Control GET_RECMASTER failed, ret=%d\n", ret));
-		return ret;
-	}
-
-	return 0;
-}
-
-int ctdb_ctrl_set_recmaster(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
-			    struct ctdb_client_context *client,
-			    int destnode, struct timeval timeout,
-			    uint32_t recmaster)
-{
-	struct ctdb_req_control request;
-	struct ctdb_reply_control *reply;
-	int ret;
-
-	ctdb_req_control_set_recmaster(&request, recmaster);
-	ret = ctdb_client_control(mem_ctx, ev, client, destnode, timeout,
-				  &request, &reply);
-	if (ret != 0) {
-		DEBUG(DEBUG_ERR,
-		      ("Control SET_RECMASTER failed to node %u, ret=%d\n",
-		       destnode, ret));
-		return ret;
-	}
-
-	ret = ctdb_reply_control_set_recmaster(reply);
-	if (ret != 0) {
-		DEBUG(DEBUG_ERR,
-		      ("Control SET_RECMASTER failed, ret=%d\n", ret));
-		return ret;
-	}
-
-	return 0;
-}
-
 int ctdb_ctrl_freeze(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
 		     struct ctdb_client_context *client,
 		     int destnode, struct timeval timeout,
diff --git a/ctdb/client/client_sync.h b/ctdb/client/client_sync.h
index b8f5d905857..5b0ff42e95d 100644
--- a/ctdb/client/client_sync.h
+++ b/ctdb/client/client_sync.h
@@ -124,16 +124,6 @@ int ctdb_ctrl_get_pid(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
 		      int destnode, struct timeval timeout,
 		      pid_t *pid);
 
-int ctdb_ctrl_get_recmaster(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
-			    struct ctdb_client_context *client,
-			    int destnode, struct timeval timeout,
-			    uint32_t *recmaster);
-
-int ctdb_ctrl_set_recmaster(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
-			    struct ctdb_client_context *client,
-			    int destnode, struct timeval timeout,
-			    uint32_t recmaster);
-
 int ctdb_ctrl_freeze(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
 		     struct ctdb_client_context *client,
 		     int destnode, struct timeval timeout,
diff --git a/ctdb/cluster/cluster_conf.c b/ctdb/cluster/cluster_conf.c
index be79d5942a8..bdd64ba112f 100644
--- a/ctdb/cluster/cluster_conf.c
+++ b/ctdb/cluster/cluster_conf.c
@@ -113,6 +113,38 @@ good:
 					  mode);
 }
 
+static bool validate_recovery_lock(const char *key,
+				   const char *old_reclock,
+				   const char *new_reclock,
+				   enum conf_update_mode mode)
+{
+	bool status;
+
+	if (new_reclock != NULL) {
+		D_WARNING("Configuration option [%s] -> %s is deprecated\n",
+			  CLUSTER_CONF_SECTION,
+			  key);
+	}
+
+	status = check_static_string_change(key, old_reclock, new_reclock, mode);
+
+	return status;
+}
+
+static bool validate_leader_timeout(const char *key,
+				    int old_timeout,
+				    int new_timeout,
+				    enum conf_update_mode mode)
+{
+	if (new_timeout <= 0) {
+		D_ERR("Invalid value for [cluster] -> leader timeout = %d\n",
+		      new_timeout);
+		return false;
+	}
+
+	return true;
+}
+
 void cluster_conf_init(struct conf_context *conf)
 {
 	conf_define_section(conf, CLUSTER_CONF_SECTION, NULL);
@@ -129,7 +161,22 @@ void cluster_conf_init(struct conf_context *conf)
 			   validate_node_address);
 	conf_define_string(conf,
 			   CLUSTER_CONF_SECTION,
-			   CLUSTER_CONF_RECOVERY_LOCK,
+			   CLUSTER_CONF_CLUSTER_LOCK,
 			   NULL,
 			   check_static_string_change);
+	conf_define_string(conf,
+			   CLUSTER_CONF_SECTION,
+			   CLUSTER_CONF_RECOVERY_LOCK,
+			   NULL,
+			   validate_recovery_lock);
+	conf_define_integer(conf,
+			    CLUSTER_CONF_SECTION,
+			    CLUSTER_CONF_LEADER_TIMEOUT,
+			    5,
+			    validate_leader_timeout);
+	conf_define_boolean(conf,
+			    CLUSTER_CONF_SECTION,
+			    CLUSTER_CONF_LEADER_CAPABILITY,
+			    true,
+			    NULL);
 }
diff --git a/ctdb/cluster/cluster_conf.h b/ctdb/cluster/cluster_conf.h
index 6b797ef1085..38c378fd571 100644
--- a/ctdb/cluster/cluster_conf.h
+++ b/ctdb/cluster/cluster_conf.h
@@ -26,7 +26,10 @@
 
 #define CLUSTER_CONF_TRANSPORT       "transport"
 #define CLUSTER_CONF_NODE_ADDRESS    "node address"
+#define CLUSTER_CONF_CLUSTER_LOCK    "cluster lock"
 #define CLUSTER_CONF_RECOVERY_LOCK   "recovery lock"
+#define CLUSTER_CONF_LEADER_TIMEOUT  "leader timeout"
+#define CLUSTER_CONF_LEADER_CAPABILITY "leader capability"
 
 void cluster_conf_init(struct conf_context *conf);
 
diff --git a/ctdb/config/ctdb.conf b/ctdb/config/ctdb.conf
index 5440600a435..8e1b3760973 100644
--- a/ctdb/config/ctdb.conf
+++ b/ctdb/config/ctdb.conf
@@ -11,12 +11,12 @@
 	# log level = NOTICE
 
 [cluster]
-	# Shared recovery lock file to avoid split brain.  Daemon
-	# default is no recovery lock.  Do NOT run CTDB without a
-	# recovery lock file unless you know exactly what you are
+	# Shared cluster lock file to avoid split brain.  Daemon
+	# default is no cluster lock.  Do NOT run CTDB without a
+	# cluster lock file unless you know exactly what you are
 	# doing.
 	#
-	# Please see the RECOVERY LOCK section in ctdb(7) for more
+	# Please see the CLUSTER LOCK section in ctdb(7) for more
 	# details.
 	#
-	# recovery lock = !/bin/false RECOVERY LOCK NOT CONFIGURED
+	# cluster lock = !/bin/false CLUSTER LOCK NOT CONFIGURED
diff --git a/ctdb/doc/cluster_mutex_helper.txt b/ctdb/doc/cluster_mutex_helper.txt
index 20c8eb2b51d..4ee018ffc94 100644
--- a/ctdb/doc/cluster_mutex_helper.txt
+++ b/ctdb/doc/cluster_mutex_helper.txt
@@ -5,11 +5,11 @@ CTDB uses cluster-wide mutexes to protect against a "split brain",
 which could occur if the cluster becomes partitioned due to network
 failure or similar.
 
-CTDB uses a cluster-wide mutex for its "recovery lock", which is used
+CTDB uses a cluster-wide mutex for its "cluster lock", which is used
 to ensure that only one database recovery can happen at a time.  For
-an overview of recovery lock configuration see the RECOVERY LOCK
+an overview of cluster lock configuration see the CLUSTER LOCK
 section in ctdb(7).  CTDB tries to ensure correct operation of the
-recovery lock by attempting to take the recovery lock when CTDB knows
+cluster lock by attempting to take the cluster lock when CTDB knows
 that it should already be held.
 
 By default, CTDB uses a supplied mutex helper that uses a fcntl(2)
diff --git a/ctdb/doc/ctdb-etcd.7.xml b/ctdb/doc/ctdb-etcd.7.xml
index 5d7a0e05366..f84989f854f 100644
--- a/ctdb/doc/ctdb-etcd.7.xml
+++ b/ctdb/doc/ctdb-etcd.7.xml
@@ -60,7 +60,7 @@
     <para>
       ctdb_etcd_lock is intended to be run as a mutex helper for CTDB. It
       will try to connect to an existing etcd cluster and grab a lock in that
-      cluster to function as CTDB's recovery lock. Please see
+      cluster to function as CTDB's cluster lock. Please see
       <emphasis>ctdb/doc/cluster_mutex_helper.txt</emphasis> for details on
       the mutex helper API. To use this, include the following line in
       the <literal>[cluster]</literal> section of
@@ -68,7 +68,7 @@
       <manvolnum>5</manvolnum></citerefentry>:
     </para>
     <screen format="linespecific">
-recovery lock = !/usr/local/usr/libexec/ctdb/ctdb_etcd_lock
+cluster lock = !/usr/local/usr/libexec/ctdb/ctdb_etcd_lock
     </screen>
     <para>
       You can also pass "-v", "-vv", or "-vvv" to include verbose output in
diff --git a/ctdb/doc/ctdb.1.xml b/ctdb/doc/ctdb.1.xml
index e0e05d8e542..6f9a1764ee4 100644
--- a/ctdb/doc/ctdb.1.xml
+++ b/ctdb/doc/ctdb.1.xml
@@ -299,10 +299,10 @@
 	  RECOVERY - The cluster databases have all been frozen, pausing all services while the cluster awaits a recovery process to complete. A recovery process should finish within seconds. If a cluster is stuck in the RECOVERY state this would indicate a cluster malfunction which needs to be investigated.
 	</para>
 	<para>
-	  Once the recovery master detects an inconsistency, for example a node 
+	  Once the leader detects an inconsistency, for example a node 
 	  becomes disconnected/connected, the recovery daemon will trigger a 
 	  cluster recovery process, where all databases are remerged across the
-	  cluster. When this process starts, the recovery master will first
+	  cluster. When this process starts, the leader will first
 	  "freeze" all databases to prevent applications such as samba from 
 	  accessing the databases and it will also mark the recovery mode as
 	  RECOVERY.
@@ -316,13 +316,16 @@
 	</para>
       </refsect3>
       <refsect3>
-	<title>Recovery master</title>
+	<title>Leader</title>
 	<para>
-	  This is the cluster node that is currently designated as the recovery master. This node is responsible of monitoring the consistency of the cluster and to perform the actual recovery process when reqired.
+	  This is the cluster node that is currently designated as the
+	  leader. This node is responsible of monitoring the
+	  consistency of the cluster and to perform the actual
+	  recovery process when reqired.
 	</para>
 	<para>
-	  Only one node at a time can be the designated recovery master. Which
-	  node is designated the recovery master is decided by an election
+	  Only one node at a time can be the designated leader. Which
+	  node is designated the leader is decided by an election
 	  process in the recovery daemons running on each node.
 	</para>
       </refsect3>
@@ -343,7 +346,7 @@ hash:1 lmaster:1
 hash:2 lmaster:2
 hash:3 lmaster:3
 Recovery mode:NORMAL (0)
-Recovery master:0
+Leader:0
 	</screen>
       </refsect3>
     </refsect2>
@@ -397,9 +400,9 @@ pnn:1 10.0.0.31        OK
     </refsect2>
 
     <refsect2>
-      <title>recmaster</title>
+      <title>leader</title>
       <para>
-	This command shows the pnn of the node which is currently the recmaster.
+	This command shows the pnn of the node which is currently the leader.
       </para>
 
       <para>
@@ -939,7 +942,7 @@ pnn:3 10.0.0.14        OK
 	Example output:
       </para>
       <screen>
-RECMASTER: YES
+LEADER: YES
 LMASTER: YES
       </screen>
 
@@ -1217,13 +1220,20 @@ DB Statistics: locking.tdb
     </refsect2>
 
     <refsect2>
-      <title>setrecmasterrole on|off</title>
+      <title>setleaderrole on|off</title>
       <para>
-	This command is used to enable/disable the RECMASTER capability for a node at runtime. This capability determines whether or not a node can be used as an RECMASTER for the cluster. A node that does not have the RECMASTER capability can not win a recmaster election. A node that already is the recmaster for the cluster when the capability is stripped off the node will remain the recmaster until the next cluster election.
+	This command is used to enable/disable the LEADER capability
+	for a node at runtime. This capability determines whether or
+	not a node can be elected leader of the cluster. A node that
+	does not have the LEADER capability can not be elected
+	leader. If the current leader has this capability removed then
+	an election will occur.
       </para>
 
       <para>
-	Nodes will by default have this capability, but it can be stripped off nodes by the setting in the sysconfig file or by using this command.
+	Nodes have this capability enabled by default, but it can be
+	removed via the <command>cluster:leader capability</command>
+	configuration setting or by using this command.
       </para>
       <para>
 	See also "ctdb getcapabilities"
@@ -1740,7 +1750,13 @@ HEALTH: NO-HEALTHY-NODES - ERROR - Backup of corrupted TDB in '/usr/local/var/li
     <refsect2>
       <title>ipreallocate, sync</title>
       <para>
-	This command will force the recovery master to perform a full ip reallocation process and redistribute all ip addresses. This is useful to "reset" the allocations back to its default state if they have been changed using the "moveip" command. While a "recover" will also perform this reallocation, a recovery is much more hevyweight since it will also rebuild all the databases.
+	This command will force the leader to perform a full ip
+	reallocation process and redistribute all ip addresses. This
+	is useful to "reset" the allocations back to its default state
+	if they have been changed using the "moveip" command. While a
+	"recover" will also perform this reallocation, a recovery is
+	much more hevyweight since it will also rebuild all the
+	databases.
       </para>
     </refsect2>
 
diff --git a/ctdb/doc/ctdb.7.xml b/ctdb/doc/ctdb.7.xml
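
Note on the client_connect.c hunk above: ctdb_client_wait_timeout() becomes a thin wrapper around the new ctdb_client_wait_func_timeout(), which polls a caller-supplied predicate instead of a bool flag. The following is a minimal, self-contained sketch of that refactoring pattern only. It does not use tevent: struct fake_loop, fake_loop_once(), wait_func_timeout(), wait_flag_timeout() and the tick-based timeout are hypothetical stand-ins invented for illustration, not CTDB APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a tevent context: a fake clock that sets
 * *flag once "now" reaches flag_set_at.  Not part of CTDB.
 */
struct fake_loop {
	int now;
	int flag_set_at;
	bool *flag;
};

/* Hypothetical stand-in for tevent_loop_once(): advance one tick. */
static void fake_loop_once(struct fake_loop *loop)
{
	loop->now++;
	if (loop->flag != NULL && loop->now >= loop->flag_set_at) {
		*loop->flag = true;
	}
}

/*
 * Generic wait: run the loop until done_func() returns true or
 * timeout_ticks have elapsed.  Returns 0 if the predicate became
 * true, ETIMEDOUT otherwise.
 */
static int wait_func_timeout(struct fake_loop *loop,
			     bool (*done_func)(void *private_data),
			     void *private_data,
			     int timeout_ticks)
{
	int deadline = loop->now + timeout_ticks;

	assert(done_func != NULL);

	while (!done_func(private_data) && loop->now < deadline) {
		fake_loop_once(loop);
	}

	return done_func(private_data) ? 0 : ETIMEDOUT;
}

/* Adapter predicate: treat private_data as a pointer to a bool flag. */
static bool flag_is_set(void *private_data)
{
	bool *done = (bool *)private_data;

	return *done;
}

/* Flag-based wait kept for existing callers, built on the generic one. */
static int wait_flag_timeout(struct fake_loop *loop,
			     bool *done,
			     int timeout_ticks)
{
	return wait_func_timeout(loop, flag_is_set, done, timeout_ticks);
}
```

With this shape, existing flag-based callers stay unchanged, while new callers can wait on any condition that a function can compute from private_data. The ctdb/tools/ctdb.c changes listed in the diffstat might use the real function that way when waiting for a leader broadcast, but that hunk is not shown in this truncated changeset.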


-- 
Samba Shared Repository