[SCM] CTDB repository - branch 2.5 created - ctdb-2.5-4-g9381c33

Michael Adam obnox at samba.org
Thu Nov 14 03:37:41 MST 2013


The branch, 2.5 has been created
        at  9381c33dfd40192b7532d942059c2959dfae059d (commit)

- Log -----------------------------------------------------------------
commit 9381c33dfd40192b7532d942059c2959dfae059d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Nov 7 16:01:49 2013 +1100

    tests: Fix calling of ctdb tool from test
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 46615c8e0e63291605d76a6d35f1a93180718c36
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Nov 7 15:54:28 2013 +1100

    Revert "tests: If transaction_start fails, try again"
    
    This reverts commit ed7d999214ee009e480c26410a04fa105028cb8e.
    
    This is not necessary since ctdb_transaction_start() now will return NULL
    only when there is a failure and not when another transaction is currently
    active.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 59489019ad15a5ad6b0f295e742fc9832745a842
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Nov 7 15:54:20 2013 +1100

    client: Make g_lock_lock() wait till lock is obtained
    
    This makes the behaviour of g_lock_lock() similar to that implemented in
    Samba.  Now ctdb_transaction_start() will return NULL only when there are
    failures and not when another transaction is active.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 370022e1ff654db99d0c3ce0c49914c249e57289
Author: Srikrishan Malik <srimalik at in.ibm.com>
Date:   Thu Oct 31 11:54:58 2013 +0530

    eventscript: Fix link creation failure if the link already exists but the target path is missing
    
    Signed-off-by: Srikrishan Malik <srimalik at in.ibm.com>

commit 30a6565a7b476516f3daed0669b5650e1be3cd18
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 16 11:46:54 2013 +1100

    doc: Update NEWS
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit a7a844e7600b59d876de94ec5bf7bd1647508cdf
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 30 13:22:21 2013 +1100

    web: Add links to new manpages
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 15b5c6c00c248bc1a8364a6da103296a55d7bfb6
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:26:16 2013 +1000

    doc: Major updates to manual pages
    
    This includes new manpages for ctdb.7, ctdb.conf.5 and ctdb-tunables.7.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit ca5fc3431573c44d55d09d987c715fb53756fc1f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 30 12:37:15 2013 +1100

    tunables: Remove obsolete tunables
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit afd9b51644af074752d74c412cb4e7ec2eba2c69
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 30 12:17:37 2013 +1100

    recoverd: Rebalancing should be done regardless of tunable
    
    Rebalance target nodes should be set even if a deferred rebalance is
    not configured.  The user can explicitly cause a takeover run.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 275ed9ebe287e39d891888c13810c70f347af8ac
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 30 11:32:28 2013 +1100

    recoverd: Improve an error message in the election code
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c8b542e059a54b8d524bd430cad9d82e5edd864d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 29 16:38:42 2013 +1100

    Revert "if a new node enters the cluster, that node will already be frozen at start"
    
    This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94.
    Furthermore, if a node doesn't force an election but wins it then it
    can fail to record that it is the new recovery master.  This can lead
    to a reverse split brain where there is no recovery master.
    
    This reverts commit c5035657606283d2e35bea40992505e84ca8e7be.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    
    Conflicts:
    	server/ctdb_recoverd.c

commit eb8ec5681bfccb26c8ffae72952d54bb0ba46249
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 29 14:05:41 2013 +1100

    ctdbd: When a node is connected, log at DEBUG_NOTICE not DEBUG_INFO
    
    This is important enough that we should see it when the log level is
    DEBUG_NOTICE.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d1674aad224f8f0c9a03c3cd38a647318ba0f03e
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 28 16:20:44 2013 +1100

    tests/complex: Remove CTDB_NFS_SKIP_SHARE_CHECK test
    
    This is a needlessly complex way of testing the same thing as the
    eventscripts unit tests 60.nfs.monitor.161.sh and
    60.nfs.monitor.162.sh.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 81b94fbb7495ac3204f1a84c673c8babf04663bc
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 28 16:14:40 2013 +1100

    tests/complex: Remove CTDB_SAMBA_SKIP_SHARE_CHECK test
    
    This is adequately covered by eventscripts unit tests
    50.samba.monitor.105.sh and 50.samba.monitor.106.sh.
    
    This test is broken if CTDB_SAMBA_CHECK_PORTS is not specified in the
    CTDB configuration.  Fixing it is hard and involves adding a more
    complex stub for testparm.  We already have that in the eventscript
    unit tests above.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8c6f511254ecb0381a609b37e3a0ee6e5ec5d562
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 28 16:00:54 2013 +1100

    eventscripts: Rewrite the smb.conf cache file handling
    
    The background update is never guaranteed to complete before the cache
    is used, so don't bother trying it at the beginning.  Instead, put a
    timeout on a foreground update.
    
    If the foreground update fails:
    
    * If there's no available cache file then die.
    
    * If there is a previous cache file then use it and log a warning.
    
    * Do a background update at the end of the monitor event.
    
    Also remove commas in the "smb ports" list before use, since (newer?)
    testparm seems to insert commas into the default value.  Update the
    associated test to add a comma.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
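
A minimal sketch of the fallback logic described in the commit above,
written in C for illustration only; the eventscripts themselves are shell
scripts.  Every name below (sketch_plan_cache_use(), sketch_strip_commas()
and the types) is hypothetical, and the assumption that the background
update is scheduled only when an old cache is reused is one reading of the
list above, not something the commit states.

```c
/* What the monitor event should do with the smb.conf cache (hypothetical). */
enum sketch_cache_action {
    SKETCH_USE_FRESH_CACHE, /* foreground update finished within the timeout */
    SKETCH_USE_OLD_CACHE,   /* update failed, previous cache file is reused */
    SKETCH_DIE              /* update failed and there is no cache at all */
};

struct sketch_cache_plan {
    enum sketch_cache_action action;
    int log_warning;             /* non-zero: warn that the cache is stale */
    int background_update_later; /* non-zero: refresh at end of monitor event */
};

/*
 * Decide how to proceed after the timed foreground update.
 * update_ok:      non-zero if the foreground update succeeded.
 * have_old_cache: non-zero if a previous cache file is available.
 */
static struct sketch_cache_plan sketch_plan_cache_use(int update_ok,
                                                      int have_old_cache)
{
    struct sketch_cache_plan plan = { SKETCH_USE_FRESH_CACHE, 0, 0 };

    if (update_ok) {
        return plan;
    }
    if (!have_old_cache) {
        plan.action = SKETCH_DIE;
        return plan;
    }
    plan.action = SKETCH_USE_OLD_CACHE;
    plan.log_warning = 1;
    plan.background_update_later = 1; /* assumption, see note above */
    return plan;
}

/* Replace commas in an "smb ports" value with spaces, in place. */
static void sketch_strip_commas(char *ports)
{
    for (; *ports != '\0'; ports++) {
        if (*ports == ',') {
            *ports = ' ';
        }
    }
}
```

The decision is kept as a pure function so that each of the three outcomes
can be checked without running testparm.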

commit c072eb1f6488f94f83a6d3a81d88bf29ad866943
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 25 16:25:25 2013 +1100

    tools/ctdb: Fix documentation string for ban command
    
    Ban time of 0 is not supported.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3e41170c78fc7a2bf526129c9b7db3739b61c6bf
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 24 11:13:16 2013 +1100

    Revert "recoverd: Disable takeover runs on other nodes for 5 minutes"
    
    5 minutes is too long to leave the cluster in limbo if the recovery
    daemon dies during a takeover run, even though this is quite unlikely.
    We need a new recovery master to be able to do takeover runs fairly
    quickly.
    
    This reverts commit 71080676bb4acbd0d9b595a30cf7fe6dddbf426f.

commit 01a46205c3a3d6609dc0b0324319b89667dffa32
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 24 14:15:53 2013 +1100

    tools/onnode: Fix healthy/ok node handling
    
    This bit-rotted a long time ago when the "ThisNode" column was added
    to "ctdb -Y status" output.  The fake "ctdb -Y status" output in the
    test was never updated to reflect this change.
    
    Instead of making sure that all columns are "0", just check that
    they're not "1".  This implicitly ignores "Y" and "N" in this
    "ThisNode" column without having to do anything else clever.
    
    Also update associated tests.  The main "ctdb ok" test had a duplicate
    opening line for a here document, which was tickled by this change.
    
    This fixes samba bz#8122.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    
    onnode test fixup
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
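
The "not 1" check described in the commit above can be sketched as follows.
This is a C illustration only (onnode is a shell script), and the function
name sketch_node_is_ok() is hypothetical.  The sketch assumes that a
"ctdb -Y status" line is a colon-delimited record whose first two fields
are the node number and IP address, followed by flag columns and the
"ThisNode" column, for example ":0:10.0.0.1:0:0:0:0:0:0:0:Y:".

```c
#include <string.h>

/*
 * Return 1 if no flag column in the status line is exactly "1", else 0.
 * The first two fields (node number and IP address) are skipped, so a
 * node number of 1 is not mistaken for a set flag.  "Y" and "N" in the
 * "ThisNode" column are ignored implicitly because they are not "1".
 */
static int sketch_node_is_ok(const char *line)
{
    const char *p = line;
    int field = 0;

    if (*p == ':') {
        p++; /* skip the leading delimiter */
    }
    while (*p != '\0') {
        size_t len = strcspn(p, ":");

        if (field >= 2 && len == 1 && p[0] == '1') {
            return 0; /* some flag is set */
        }
        field++;
        p += len;
        if (*p == ':') {
            p++;
        }
    }
    return 1;
}
```

Checking for "1" rather than requiring "0" means that a further non-flag
column added to the output later would not break the check.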

commit 56486d1c01cc8ad0e4b8cee7a22429e72e50f03d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 28 18:49:51 2013 +1100

    daemon: Change the default recovery method for persistent databases
    
    Use sequence numbers to do recovery for persistent databases instead of
    RSNs.  This fixes the problem of registry corruption during recovery.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
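
A minimal sketch of what recovery by sequence number can look like,
assuming (this is not stated in the commit message) that the whole
persistent database is taken from the node reporting the highest sequence
number rather than merged record by record using RSNs.  The function name
and the array representation are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the node whose copy of a persistent database should be used as
 * the recovered database: the one with the highest sequence number.
 * seqnums[i] is the sequence number reported by node i.
 * Returns the node index, or -1 if there are no nodes.  On a tie the
 * lowest index wins (an arbitrary choice made for this sketch).
 */
static int sketch_pick_recovery_source(const uint64_t *seqnums,
                                       size_t num_nodes)
{
    size_t best = 0;

    if (num_nodes == 0) {
        return -1;
    }
    for (size_t i = 1; i < num_nodes; i++) {
        if (seqnums[i] > seqnums[best]) {
            best = i;
        }
    }
    return (int)best;
}
```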

commit c7450f9e22133333bf82c88a17ac25990ebc77ab
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 23 15:37:41 2013 +1100

    packaging: Create runtime directories for CTDB
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit b63f6fd2d295c8e18cbf3420ab05fce07b727f31
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 23 11:28:26 2013 +1100

    initscript: Update systemd configuration to put PID file in /run/ctdb
    
    Elsewhere we're moving the socket to /var/run/ctdb.  We might end up
    with PID files and sockets for other daemons later, so let's call the
    directory "ctdb" instead of "ctdbd".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit dc67a4e24af9d07aead2a1710eeaf5d6cc409201
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Oct 3 15:19:05 2013 +1000

    build: Move the default CTDB socket from /tmp to /var/run/ctdb
    
    Use /var/run/ctdb/ctdbd.socket because there might be other daemons
    that need sockets in the future.
    
    The local daemons test code to create a link for the default
    convenience socket has to be removed because the link can't be created
    as a regular user in the new location.  This should be OK since all
    calls to the ctdb tool in the test code should be wrapped in onnode.
    When debugging tests, a developer will have to set CTDB_SOCKET by
    hand.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-programmed-with: Martin Schwenke <martin at meltin.net>

commit 2c09aac71188f43cd592572b10ea30b7a2969678
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Oct 3 15:47:30 2013 +1000

    packaging: Move ctdb/ directory from /var to /var/lib
    
    Introduce CTDB_VARDIR variable that points to /var/lib/ctdb by default.
    This makes CTDB_VARDIR consistent across C code and scripts.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1fe82f3d7b610547ff4945887f15dd6c5798a49b
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:36:36 2013 +1100

    ctdbd: Simplify database directory setting logic
    
    No need to check if the options are set.  The options are always set
    via static defaults.
    
    No need to talloc_strdup() the values via wrapper functions.  The
    options aren't going away.  Remove now unused ctdb_set_tdb_dir() and
    similar functions.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit d73d84346488a2ed54e6a86f9d7ec641c8e33ace
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:36:36 2013 +1100

    ctdbd: Remove duplicate database directory setting logic
    
    Defaults for ctdb->db_directory and similar variables are currently
    set in 2 places.
    
    Change this to set them in only 1 place and make the directories at
    initialisation time instead of waiting until later.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 7b971df79b0b63f83555205eacf48d49ca3a273a
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:29:39 2013 +1100

    common: New function ctdb_mkdir_p_or_die()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit afe2145d91725daf1399f0a24f1cddcf65f0ec31
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:08:52 2013 +1100

    common: New function mkdir_p()
    
    Behaves like mkdir -p.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit b9b9f6738fba5c32e87cb9c36b358355b444fb9b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Oct 3 15:13:41 2013 +1000

    tcp: Create socket lock in /var/run/ctdb instead of /tmp
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-programmed-with: Martin Schwenke <martin at meltin.net>

commit 6a5469a63547029f4fc704a4d4075543e06c36d1
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Oct 24 14:26:12 2013 +1100

    doc/examples: Add CTDB configuration examples
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a0b965bb73777dde7a4abf80c5c4742581bce520
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Thu Aug 29 08:20:05 2013 +0200

    Add missing $remote_fs LSB dependency

commit cea81bdd503f6ef8b5bbd3582a8e0085bb02bc9f
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Thu Aug 29 07:42:12 2013 +0200

    Improved check_ctdb
    
    - increase verbosity with "-v"
    - concat error messages (if there are several)
    - handle 255 return code as warning (as it is the return code when any of the nodes is missing)
    - read /etc/ctdb/nodes remotely (check_ctdb can be run on a non-ctdb host)

commit 1f6cc8764e28058c56d0350147032b6e30cb355d
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Thu Aug 15 20:23:57 2013 +0200

    Add missing events.d/99.timeout

commit 58ca2c3e7e3a27023ad86660f01a2052e2a19635
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Oct 24 14:37:41 2013 +1100

    eventscripts: Instead of listing all tunables, query EventScriptTimeout
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1f327401f2e181780937aa3f6c479376ff787f3f
Author: Michael Adam <obnox at samba.org>
Date:   Wed Oct 23 00:46:34 2013 +0200

    ctdb_client.h: fix build on AIX by removing C++-style comments
    
    Reported by John P Janosik <jpjanosi at us.ibm.com>
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit a3d63a9db89d08bb284b3b3a6db773422f21b477
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:52:01 2013 +1100

    ctdbd: Pass the public address file location in ctdb context
    
    No need to pass it as an extra argument to ctdb_start_daemon.
    
    Also ensure options.public_address_list gets a nice static default.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c11803e3dcc905a45a08d743595e63f9ca445f0d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 1 15:13:29 2013 +1000

    ctdbd: Debug locks by default with override from environment variable
    
    Default is debug_locks.sh, relative to CTDB_BASE.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 21b4d1aba00902f1eee0cbf4f082b0794fd5b738
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 14:10:58 2013 +1100

    ctdbd: Default for event_script_dir should use CTDB_BASE
    
    Also get rid of ctdb_set_event_script_dir().  It creates an
    unnecessary copy of something that will be around for the lifetime of
    the process.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 20e705e63bd3b20837cc3ac92fdcf2a9650ccfc8
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:33:10 2013 +1100

    ctdbd: Add nodes_file member to struct ctdb_context
    
    This allows ctdb_load_nodes_file() to move to ctdb_server.c and
    ctdb_set_nlist() to become static.
    
    Setting ctdb->nodes_file needs to be done early, before the nodes file
    is loaded.  It is now set from CTDB_BASE instead of ETCDIR, so setting
    CTDB_BASE also needs to be done earlier.
    
    Unhack ctdbd_test.c - it no longer needs to define
    ctdb_load_nodes_file().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 2b6dc0d2799f3563b767622b6f9246450aa4036b
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:43:47 2013 +1100

    tools/ctdb: CTDB_BASE is the default location of configuration files
    
    Ensure that environment variable CTDB_BASE is set.
    
    Update defaults for nodes and natgw_nodes to use CTDB_BASE.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 30ca419aa1c78008f81839497921bbfba480e7fc
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 14:02:31 2013 +1100

    ctdbd: Don't check CTDB_BASE before setting it, just don't override
    
    That's what the 3rd argument to setenv(3) is for...  :-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
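
The point about the 3rd argument to setenv(3) can be illustrated with a
short sketch.  The function name and the default value "/etc/ctdb" are
assumptions made for this example; the commit does not show the actual
default.

```c
#include <stdlib.h>

/* Hypothetical default; the real one is derived at build time. */
#define SKETCH_DEFAULT_CTDB_BASE "/etc/ctdb"

/*
 * Give CTDB_BASE a default value without overriding an existing one.
 * With overwrite == 0, setenv() leaves a variable that is already set
 * untouched, so no prior getenv() check is needed.
 * Returns 0 on success, -1 on error (as setenv() does).
 */
static int sketch_set_default_ctdb_base(void)
{
    return setenv("CTDB_BASE", SKETCH_DEFAULT_CTDB_BASE, 0);
}
```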

commit 913f229508302378212678d98c22606a4954b09c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 22 15:36:30 2013 +1100

    tests/integration: Pass --valgrinding option when running under valgrind
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 1c0a627df1b510f49c65ffeb4474240c8856cdf2
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 21 19:42:32 2013 +1100

    ctdbd: Fix some errors in the popt configuration
    
    That 4th argument isn't a default or similar, so consistently make it 0.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 30d9b634b16c3cc740e5e453ea5c21012b1fde88
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 18 16:43:26 2013 +1100

    initscript: New configuration variable CTDB_DBDIR_STATE
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 516cdea0e73cf3f63b3303e22809834c8cbc64e4
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 18 13:24:03 2013 +1100

    scripts: Make detect_init_style() more readable
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 45e2bc66abf9fcfeadcc279a656ed7fd1838920a
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 16:44:24 2013 +1100

    eventscripts: Rework the iSCSI eventscript
    
    * It should run on "ipreallocated" instead of "recovered"
    * Variable name NODE -> ip since that's what it is
    * Simplify some logic
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 1152215fc69217e4292762e28d193b7ea0e06ee3
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 16:20:18 2013 +1100

    eventscripts: Don't update static routes on "recovered" event
    
    Routes only need to be updated when IPs have moved.  IP takeover runs
    will generate "ipreallocated", which is enough.  "recovered" always
    follows "ipreallocated" anyway, so avoid the redundancy.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 542c70d6281d636ecd51502fbbf219f418bfac66
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 16:17:26 2013 +1100

    eventscripts: NAT gateway script doesn't need to handle "recovered" event
    
    Any time a node changes flags in any significant way there will be a
    takeover run, which will generate an "ipreallocated" event.  The
    "recovered" event always happens straight after a takeover run so we
    update the NAT gateway twice.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 00736a21fc268c10b6a718731e56b3dbb7e60554
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 16:14:14 2013 +1100

    eventscripts: Delete placeholder "recovered" and "shutdown" events
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 2ea9d3acfe7e8665685f54294f5edc9b8ffc2f3f
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 16:13:21 2013 +1100

    eventscripts: Clean up comment at the top of 00.ctdb
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 41df1637c1d8a7b2f5a9974408db71b1f74cb2f2
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 16:00:39 2013 +1100

    eventscripts: Remove reconfigure check from samba and winbind eventscripts
    
    There is no reconfigure code for these scripts so no need to check for
    reconfiguration.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5b77fd95bda5f1960aca952e1b759231890b56f3
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 15:58:25 2013 +1100

    eventscripts: Remove reconfigure code from httpd eventscript
    
    Nothing ever (or has ever) set the "needs reconfigure" flag, so this
    code is unnecessary.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 044d302b41a2040642355401e3236fcecc3a620a
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 15:23:35 2013 +1100

    eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports()
    
    A generic framework is no longer needed now that the "ctdb" checker is
    the only one left.  Simplify the code.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 50e330d0679614bee2e7bab028436e929f74ca50
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 11:02:54 2013 +1100

    eventscripts: Remove TCP port checks other than the built-in CTDB one
    
    "ctdb checktcpport" is no longer experimental so the other checkers
    are no longer required.
    
    Remove tests related to the removed checkers.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit cfbff39e22e42f3997f637290748290833525714
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 10:52:00 2013 +1100

    scripts: Remove setting of PATH from functions file
    
    The current setting is inconsistent with settings on most systems,
    putting /bin before /sbin.  Use of /usr/local/bin, which may be
    required on some systems, is also overridden.  This can make it
    difficult to do interactive debugging of script problems.
    
    Rely on the system PATH instead.
    
    If system-specific changes need to be made then this can be done in a
    configuration file.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9437d4809bfbbb5c6a32a610665333d2f641881d
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 17 10:39:09 2013 +1100

    tests/eventscripts: Run scripts under sh by default
    
    Some scripts are disabled by default so are not executable.  Explicitly
    running them under sh allows them to be run without having to mess
    around and make them executable or similar.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 212d4b201c30804f69cffe4b7150d4b74bf2e54f
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 16:44:45 2013 +1100

    tests/eventscripts: New tests for 20.multipathd
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 49f077c475b078889ff0492fe7d567a64d6cb87c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 16:42:45 2013 +1100

    eventscripts: Clean up 20.multipathd
    
    Reduce the complexity, including the depth of background processes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e574b30257126679704b088c4334a8e7a53a9c3f
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 12:00:13 2013 +1100

    eventscripts: NAT gateway script should export CTDB_NATGW_NODES
    
    Otherwise calls to "ctdb natgwlist" will not behave as expected if a
    non-standard file is used, since that command will use the default
    file location.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 79e2029f9bc078126e865aa715100a3870c7604b
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 11:57:28 2013 +1100

    scripts: Simplify script_log() to just look at CTDB_SYSLOG variable
    
    The old logic was actually wrong.  If CTDB_LOGFILE is unset then a
    default is used, not syslog.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e55f3a1577eff0182802b0341d865d961aeae1c7
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 11:54:58 2013 +1100

    scripts: Remove support for CTDB_OPTIONS configuration variable
    
    Allowing people to put random options in CTDB_OPTIONS complicates some
    logic (particularly around use of syslog).  If we're going to have
    variables for options then let's make sure we have a variable for each
    option and make people use them.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit bda0da41aaf629a252cc361b73ebc5328f26ed04
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 11:31:12 2013 +1100

    scripts: Remove unused configuration variable CTDB_MANAGES_SCP
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f12658aff125996ae45eea23241d8c3d0567b893
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 15 11:29:23 2013 +1100

    eventscripts: Deprecate NFS_SERVER_MODE, use CTDB_NFS_SERVER_MODE instead
    
    All CTDB configuration variables should start with CTDB_.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4a5d5935f4410a93a3343d85a24dbcddae2c4c20
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 14 13:54:39 2013 +1100

    recoverd: Remove function reload_nodes_file()
    
    It is a 1 line wrapper around ctdb_load_nodes_file(), so use that
    instead.  We need less code...  :-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 25fd05505f61dc595c0ef25bb6e332274d5530e8
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 14 12:50:08 2013 +1100

    Revert "null out the pointer before we reload the nodes file"
    
    This reverts commit 4b0f32047e8bece0a052bdbe2209afe91b7e8ce3.
    
    This is not necessary.  It just causes a memory leak.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f3413fb8b90c4d9f0c2c2a69825c66d080117193
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 11 15:53:40 2013 +1100

    client: Fix a format string argument compiler warning
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 484c46eaae056480baf050fd91868f2fd0537985
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Sep 27 18:02:39 2013 +1000

    recoverd: Ignore failed flag updates on inactive nodes
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-programmed-with: Martin Schwenke <martin at meltin.net>

commit 7764cf67a61bbf1caad5aa8e2d75a262b9da654c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Sep 26 18:47:27 2013 +1000

    common/util: Use AIX specific code for setting high priority for CTDB daemon
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit b9af66032f3d96f2fe12b7a4fcc5e71d4a282365
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 11 15:09:11 2013 +1100

    git: Ignore generated documentation files
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 63924ff372b066cd878b79e71f06de4c24c814a2
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 11 15:05:54 2013 +1100

    tests: When running local tests with run_tests.sh, use fixed TEST_VAR_DIR
    
    Otherwise we end up with lots of useless temporary directories.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0a79ba2f1277a776347e2c3f04ce8419e0be62de
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Sep 26 20:58:50 2013 +1000

    eventscripts: Fix comment - CTDB_TCP_PORT_CHECKS -> CTDB_TCP_PORT_CHECKERS
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d0dec5b8e60316701fdd02150c4dd8f01aacbfda
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:24:46 2013 +1000

    tests/integration: Tweak ctdbd startup options
    
    * --public-interface is not needed
    
    * Add --sloppy-start to speed up restarts
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 588172bcb6bf267339e2bd09e23d2c4904a27a41
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Sep 26 13:11:04 2013 +1000

    recoverd: Fix the VNN lmaster consistency check
    
    It does cope with nodes that don't have the lmaster capability.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit ed7d999214ee009e480c26410a04fa105028cb8e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 1 11:54:35 2013 +1000

    tests: If transaction_start fails, try again
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit af4b6b8b3222d2a3c425fcc6833db579d0cd7ffa
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 1 11:53:57 2013 +1000

    tests: Make sure test exits with zero status on successful completion
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 929045335212e825deb645cc6c7f97b8a40fdbb3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Sep 27 11:26:27 2013 +1000

    tests: Re-enable transaction test code
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 14bfd22fad1a5fd27eede1be7fccbaed9466e13e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Sep 24 13:10:31 2013 +1000

    tools/ctdb: Remove setdbseqnum command
    
    This command was added to test persistent database recovery with sequence
    numbers.  With the new persistent transaction code, sequence numbers get
    updated automatically, so there is no need for this command.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 961dd5d0acbb971756944ea9f69992020ea7d9fc
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Sep 24 13:08:48 2013 +1000

    tests: No need to set sequence number when modifying persistent database
    
    With the new persistent transaction code, sequence numbers will be
    automatically updated whenever a record is updated.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 41bdbcfd72092cdd25da87e60689c087bca97933
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Sep 25 19:16:53 2013 +1000

    client: Remove old persistent transaction code
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4e0f1971792c9431d8d51dc57d54ecc9e4576dd5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Sep 23 18:30:04 2013 +1000

    client: Reimplement persistent transaction code using TRANS3_COMMIT
    
    Implementing persistent transaction code from Samba.
    
    Persistent transaction code was reimplemented in Samba using g_lock.tdb
    to hold transaction locks and using TRANS3_COMMIT control.
    
    Implementation details:
    
    1. When starting a transaction, create a record with "transaction-<dbid>"
       as key and store current server_id in the structure.
    
    2. If a record already exists, some other client has already started a
       transaction.  Verify that the process corresponding to server_id stored
       in the record really exists or it's a stale record and overwrite it.
    
    3. All modifications to the actual persistent database are stored in a
       marshal buffer.
    
    4. When transaction is committed, read the sequence number of the
       persistent database and increment it.  Sequence number record is also
       stored in the marshal buffer.
    
    5. Send the changed records (marshal buffer) in TRANS3_COMMIT control
       to all the active nodes.
    
    6. If all controls succeed, verify that the sequence number has been
       incremented.  Commit is successful.  If any of the controls fail,
       abort the transaction.
    
    7. In case sequence number has not yet been incremented, then database
       recovery has been triggered.  So repeat from step 5.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
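
The numbered steps in the commit message above can be modelled roughly as follows. This is a toy in-memory sketch in Python, not the C client code: the Cluster class, the function names, the SEQNUM_KEY constant and the max_attempts limit are all assumptions made for illustration. Returning None when a live process already holds the transaction record is also an assumption, since the message only says what happens to stale records.

```python
SEQNUM_KEY = "__db_sequence_number__"  # hypothetical name for the sequence number record


class Cluster:
    """Toy stand-in for the active nodes holding one persistent database."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]  # one dict per node's copy
        self.g_lock = {}               # stand-in for g_lock.tdb: key -> server_id
        self.live_server_ids = set()   # processes that still exist

    def seqnum(self):
        return self.nodes[0].get(SEQNUM_KEY, 0)

    def trans3_commit(self, marshal_buffer):
        """Step 5: apply the marshal buffer on every active node."""
        for node in self.nodes:
            node.update(marshal_buffer)
        return True


def transaction_start(cluster, db_id, server_id):
    """Steps 1 and 2: take the transaction record, overwriting a stale one."""
    key = "transaction-%s" % db_id
    holder = cluster.g_lock.get(key)
    if holder is not None and holder != server_id and holder in cluster.live_server_ids:
        return None  # assumed behaviour: another live client owns the transaction
    cluster.g_lock[key] = server_id
    return {"lock_key": key, "marshal_buffer": {}}


def transaction_store(txn, key, value):
    """Step 3: changes only go into the marshal buffer."""
    txn["marshal_buffer"][key] = value


def transaction_commit(cluster, txn, max_attempts=3):
    """Steps 4 to 7: bump the sequence number, push the buffer, verify."""
    expected = cluster.seqnum() + 1
    txn["marshal_buffer"][SEQNUM_KEY] = expected
    committed = False
    for _ in range(max_attempts):
        if not cluster.trans3_commit(txn["marshal_buffer"]):
            break  # a control failed: abort
        if cluster.seqnum() == expected:
            committed = True
            break
        # Sequence number unchanged: assume a recovery intervened, retry step 5.
    cluster.g_lock.pop(txn["lock_key"], None)  # release the transaction record
    return committed
```

Nothing reaches the node copies until transaction_commit() runs, which is the point of buffering the changes in step 3.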

commit 40589ae5259880431f358250c1f0d07bcaa21d1f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Oct 4 15:38:04 2013 +1000

    client: Add functions to parse g_lock.tdb records
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 55f91ea4373c54ddb5faad87fa2826d86a4b6172
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Oct 4 15:37:24 2013 +1000

    client: Add functions to handle server_id structure
    
    server_id records are stored in g_lock.tdb for persistent transactions.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 22a253b7ccf1ff854cddf0b67969dc84d7d6a654
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Sep 12 16:43:43 2013 +1000

    ctdbd: Remove transaction code related to TRANS2 commits
    
    This removes data types and structure elements related to TRANS2
    persistent transaction code.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 7d176352986317e63696d74252ff5d8eccb2fee5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Sep 12 16:27:39 2013 +1000

    ctdbd: Deprecate TRANS2 commit controls
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 3c892ea1b5aa42686adb82ce29b9fcfdf9d204a1
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Sep 12 16:36:09 2013 +1000

    ctdbd: Create a utility function to log error for "not implemented" controls
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2ce3a48cc969d563c26dd295723416c0d7b077a2
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Sep 12 16:35:17 2013 +1000

    include: Remove unused set_dmaster structure
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 6182bd0c19f215a997efe5272e633b1b1bd0c882
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 14:27:03 2013 +1000

    tests/tool: Remove references to libctdb in file and function names
    
    Main changes are:
    
      libctdb_test.c -> ctdb_test_stubs.c
      ctdb_tool_libctdb.c -> ctdb_functest.c
    
    ctdb_tool_stubby.c is gone, replaced with existing ctdb_test.c.
    
    Functions starting with "libctdb_test_" now start with
    "ctdb_test_stubs_".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 10aac42f30cc0d56dca42ece17d04ccbc321056d
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 14:01:00 2013 +1000

    tests/tool: Rework test programs so they no longer expect libctdb
    
    Instead, override controls using preprocessor magic.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 59bd4ede15a5958b87e0d253461eb9111885bd2f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 13:43:53 2013 +1000

    tests/tool: Fix some comment typos
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3296559c43e70f755fcf2c06677891e0319c8142
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 13:40:52 2013 +1000

    tools/ctdb: Stop return value from being clobbered in control_lvsmaster()
    
    ret is initialised too early and is clobbered by the call to
    ctdb_ctrl_getcapabilities().  Initialising it later means that the
    function returns -1 when no LVS master is found.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
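
The ordering problem described above can be shown with a small Python model. The structure of the real control_lvsmaster() is not given in the commit message, so the two-pass layout, the get_capabilities callable and the CAP_LVS bit value here are assumptions for illustration only.

```python
CAP_LVS = 0x4  # hypothetical capability bit for this sketch


def lvs_master(pnns, get_capabilities):
    """Return (ret, pnn): ret is 0 with the first LVS-capable node, else -1."""
    capabilities = {}
    for pnn in pnns:
        ret, caps = get_capabilities(pnn)  # every call overwrites ret
        if ret != 0:
            return ret, None
        capabilities[pnn] = caps

    # Initialise the "not found" result only after the last call that assigns
    # ret; setting it before the loop above would let a successful call
    # overwrite it with 0.
    ret, master = -1, None
    for pnn in pnns:
        if capabilities[pnn] & CAP_LVS:
            ret, master = 0, pnn
            break
    return ret, master
```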

commit 5619754343003016ede27014567dbb4701f97928
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 13:40:10 2013 +1000

    client: Fix some format string compiler warnings
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 299fa487549e36572b757852d21471f9e23f6e8f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 30 23:38:15 2013 +1000

    common: Fix setting of debug level in the client code
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c5a7f2b4ff011e1393c4ff34864f85e6b472ff07
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Sun Aug 25 21:44:59 2013 +1000

    libctdb: Remove incomplete libctdb
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1585a8e275b0143e5e46311b3d5e9785119f735f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Aug 27 14:46:08 2013 +1000

    tools/ctdb: Pass memory context for returning nodes in parse_nodestring
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit ae0d8f432ef98a72c85a6cd42c503b718bef0e4e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Sun Aug 25 21:43:29 2013 +1000

    tests: Do not use libctdb code in tests
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit cd66282c635cf53386d8970b89c895076ea21cbd
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Aug 29 17:22:38 2013 +1000

    tools/ctdb: Do not use libctdb for commandline tool
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 8cb1fbbfe88327c9c7ab68e8eded586dff611e57
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 23 16:52:24 2013 +1000

    client: Add ctdb_ctrl_getdbseqnum() function
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1e7fca5cdc1d7205cf084e35aace1a5dc46ea294
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 23 16:52:02 2013 +1000

    client: Add ctdb_ctrl_getdbstatistics() function
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c9a9d14c91f203ce964a426a8a1e2c1715af2098
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 23 16:51:26 2013 +1000

    client: Add ctdb_client_check_message_handlers() function
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 962eb63c6d500e29a03ae087757d81be449888c6
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 23 16:49:46 2013 +1000

    client: Remove extra whitespaces
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 873b9cadbcc363a9e5f450b0a1feb1cf2ce1e6c9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 23 17:21:24 2013 +1000

    tests: Remove unused test program ctdb_fetch_lock_once
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit d94a10f93a0925b17458d009e604966666b3d880
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Aug 29 16:58:47 2013 +1000

    tools/ctdb: When printing TDB data as a string, use correct length of the string
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 8b238852884004a56f76a1762199c338864d1249
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 23 16:57:40 2013 +1000

    tools/ctdb: Remove un-implemented ctdb vacuum command
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 713c9ecc791e3319a2d109838471833de5a158c8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Sep 25 19:10:13 2013 +1000

    tests: Add a simple test to test cluster wide database traverse
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 37e22fc3ac3eb64732f2e67058f5b7b06c093fbf
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Sep 9 12:46:26 2013 +1000

    traverse: Send traverse end record from traverse child process
    
    Traverse records are sent directly from traverse child process, but
    the last empty record signalling end of traverse is sent from ctdbd.
    This creates a race condition between ctdbd and traverse child.
    There are two fds from traverse child to ctdbd - a pipe to track status
    of the child process and unix socket connection for sending records.
    It's possible that last few records are sitting in unix socket buffer
    when ctdbd reads the status written from traverse child.  This will
    be interpreted as end of traverse and ctdbd will send the last empty
    record to originating node before it has processed the pending packets
    in unix socket connection.
    
    The race is avoided by sending the last empty record marking end of
    traverse from the child process.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
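
A minimal Python model of the race described above, assuming the originating node stops reading at the first empty record. The function names and the list-based "stream" are hypothetical; the real code uses a pipe and a unix socket between the traverse child and ctdbd.

```python
END = b""  # an empty record marks the end of a traverse


def deliver(stream):
    """Originating node: collect records until the end-of-traverse record."""
    received = []
    for record in stream:
        if record == END:
            break
        received.append(record)
    return received


def old_ordering(records, forwarded_before_status):
    """ctdbd emits END as soon as it reads the child's status, even though
    some records may still be waiting in the socket buffer."""
    return records[:forwarded_before_status] + [END] + records[forwarded_before_status:]


def new_ordering(records):
    """The child appends END to the same ordered stream as the records."""
    return records + [END]
```

With three records and only one forwarded before the status is read, deliver(old_ordering(...)) returns a single record, while deliver(new_ordering(...)) returns all three.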

commit 482ac708cb79cb6378d814a79c2cf13f88435bc4
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Sep 10 17:52:26 2013 +1000

    traverse: Wait till all data has been flushed from output queue
    
    To improve the traverse performance, records are directly sent from
    traverse child process to the originating node.  Make sure that all the
    data is sent via socket, before informing ctdbd that traverse is complete.
    
    Without waiting for all the packets to be flushed from the queue,
    child process can incorrectly signal ctdbd that traverse has ended.
    This will cause the pending records in the queue never to make it to
    the originating node and traverse information will not be complete.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 25e9cf86328252f96215b54b94551dd7bbdd2db4
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Sep 13 13:28:31 2013 +1000

    traverse: Use ctdb local variable for convenience
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit abd51a9f41ebb178c4ea4491bdedf9a9433e7232
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Sep 6 18:11:40 2013 +1000

    traverse: Check if local traverse failed or succeeded
    
    Passing the result of tdb_traverse_read() allows ctdbd to determine
    whether the local traverse succeeded.  In case of a problem with the
    local traverse, ctdbd can log an error.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit e4aba8598b00a810e721de64ac44dccc9af04ab6
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Sep 6 14:51:54 2013 +1000

    traverse: Log information when traverse starts and ends
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9e18f3c173863919587e25d704f66372624ed8ed
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:23:36 2013 +1000

    tool/ltdbtool: -h option does not require an argument
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8f660d0dd52013e5876806be908e8e603aa6e968
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:22:36 2013 +1000

    scripts: Add support for optional ctdbd.conf configuration file
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit c700dd0c7b6b43b61b3e231643b5d7cbe2f9592a
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:21:30 2013 +1000

    utils: Make debug level strings case-insensitive
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 49c87699fad151933a0aefebfee968fc850e6383
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:20:42 2013 +1000

    tools/ctdb: Fix help messages for ctdb commands
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c8a6e5ce579e2fe320c40268e7e9ddfe68b8cd30
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 23 16:19:52 2013 +1000

    tools/ctdb: Ban time of 0 is invalid
    
    Apparently it used to mean a permanent ban but it is unclear if this
    was ever supported.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ff41ce5ef202f8f6342e285d195bb5df61d848ce
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Sep 16 14:35:13 2013 +1000

    eventscripts: Load CTDB configuration settings in 70.iscsi
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 71080676bb4acbd0d9b595a30cf7fe6dddbf426f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 17:07:32 2013 +1000

    recoverd: Disable takeover runs on other nodes for 5 minutes
    
    60 seconds might not be long enough to kill all connections and
    release IPs.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b39aa2e401fbb581207d986bac93778e9c01acdc
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 17:06:16 2013 +1000

    recoverd: Improve logging for takeover runs
    
    Takeover runs are currently silent when they succeed.  However, they
    are important, so log something by default.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 6d44657a5e5b0df22bab2d487a503dd1c5ba79b4
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 18 16:35:18 2013 +1000

    tools/ctdb: Use the standard long timeout when disabling takeover runs
    
    This means that takeover runs will be disabled for about as long as the
    reloadips control can take to complete.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0846c00597adb66bba8c9dbf63443d0c2f91a7d1
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 6 13:20:26 2013 +1000

    tools/ctdb: Fix arguments/semantics of rebalance node
    
    There's no reason why specifying a node should be compulsory.  This is
    a cluster-wide operation because it is implemented by the recovery
    master so multiple nodes should not be specified using -n.  However,
    the command should be able to specify multiple nodes so let it have
    its own nodestring argument.
    
    This change should be backward compatible with the old requirement of
    specifying a single node via -n.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ac946ee4ad01b1e5cd1006930b9f8a190a0a58ba
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 6 13:19:09 2013 +1000

    tools/ctdb: Make rebalancenode more robust
    
    Use a broadcast instead of trying to win the race of determining the
    recovery master and then sending the message before the recovery
    master changes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d921b2756d5f1c4ad7a35fe120f6fda9f5bf5686
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 6 11:29:14 2013 +1000

    tests/simple: Fix the reloadips test to cope with changes to reloadips
    
    Specifying nodes to reload no longer uses -n.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e81589b7084c661adf617e166cc2c25b4939f841
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 6 11:23:07 2013 +1000

    recoverd: Be careful about freeing the list of IP rebalance target nodes
    
    It can change during a takeover run.  If it does then don't free it.
    
    There are potentially fancier solutions (e.g. check what PNNs are new
    to the list) to this issue but this is the simplest.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ceb30432a9a550778aed0b422a654fc5287b82a3
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 6 11:21:10 2013 +1000

    recoverd: reloadips should rebalance target nodes for new IPs
    
    Otherwise, if existing IPs are added to extra nodes (that have,
    perhaps, been disconnected) then those IPs will not be rebalanced
    across the extra nodes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 85a5b544ec032173e98c9cc3b5402a76b961aa3b
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Sep 5 15:56:51 2013 +1000

    ctdbd: Make ctdb_reloadips_child send controls asynchronously
    
    Deleting IPs can take a while because IPs are released and connections
    are killed, so do the deletes in parallel.  In fact, since the sets of
    IPs being added and deleted are disjoint, send all the adds/deletes at
    the same time and then wait.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
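
The "send everything, then wait" pattern can be sketched in Python as below. ctdbd itself sends asynchronous controls from its event loop; the thread pool here is a substitute technique chosen only to keep the example short, and reload_ips(), add_ip and del_ip are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def reload_ips(to_add, to_delete, add_ip, del_ip):
    """Issue every add and delete at once, then wait for all of them.

    add_ip and del_ip are caller-supplied callables returning 0 on success.
    """
    if set(to_add) & set(to_delete):
        raise ValueError("adds and deletes must be disjoint")
    jobs = [(add_ip, ip) for ip in to_add] + [(del_ip, ip) for ip in to_delete]
    if not jobs:
        return True
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(func, ip) for func, ip in jobs]
        results = [future.result() for future in futures]  # wait for all
    return all(result == 0 for result in results)
```

Because the two sets are disjoint, no add can race with a delete of the same address, which is what makes issuing them together safe.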

commit c51c1efe5fc7fa668597f2acd435dee16e410fc9
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 4 14:30:04 2013 +1000

    recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE
    
    The current implementation has a few flaws:
    
    * A takeover run is called unconditionally when the timer fires even if
      the recovery master role has moved.  This means a node other than
      the recovery master can incorrectly do a takeover run.
    
    * The rebalancing target nodes are cleared in the setup for a takeover
      run, regardless of whether the takeover run succeeds.
    
    * The timer to force a rebalance isn't cleared if another takeover run
      occurs before the deadline.  Any forced rebalancing will happen in
      the first takeover run and when the timer expires some time later
      then an unnecessary takeover run will occur.
    
    * If the recovery master role moves then the rebalancing data will
      stay on the original node and affect the next takeover run to occur
      if the recovery master role should come back to the original node.
    
    Instead, store an array of rebalance target nodes in the recovery
    master context.  This is passed as an extra argument to
    ctdb_takeover_run() each time it is called and is cleared when a
    takeover run succeeds.  The timer hangs off the array of rebalance
    target nodes, which is cleared if the node isn't the recovery master.
    
    This means that it is possible to lose rebalance data if the recovery
    master role moves.  However, that's a difficult problem to solve.  The
    best way of approaching it is probably to try to stop the recovery
    master role from jumping around unnecessarily when inactive nodes join
    the cluster.
    
    The long term solution is to avoid this nonsense completely.  The IP
    allocation algorithm needs to cache state between runs so that it
    knows which nodes have just become healthy.  This also needs recovery
    master stability.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
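
A rough Python model of the revised bookkeeping, assuming the rebalance targets and their timer live and die together. The class and method names are hypothetical, and the timer is reduced to a boolean.

```python
class RecoveryMasterContext:
    """Holds the rebalance target nodes between takeover runs."""

    def __init__(self):
        self.force_rebalance_nodes = []
        self.rebalance_timer_armed = False

    def rebalance_node(self, pnn):
        """Handle a rebalance request: remember the node, arm the timer."""
        if pnn not in self.force_rebalance_nodes:
            self.force_rebalance_nodes.append(pnn)
        self.rebalance_timer_armed = True

    def _clear(self):
        # The timer hangs off the array, so dropping one drops the other.
        self.force_rebalance_nodes = []
        self.rebalance_timer_armed = False

    def takeover_run(self, is_recmaster, run):
        """run(targets) performs the takeover run and returns True on success."""
        if not is_recmaster:
            self._clear()  # rebalance data is lost if the role has moved
            return False
        ok = run(list(self.force_rebalance_nodes))
        if ok:
            self._clear()  # cleared only when the run succeeds
        return ok
```

A failed run leaves the targets in place, so the next takeover run still sees them.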

commit 4cd727439a0824ebb8dbcf737d9888ffc3c41184
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 28 15:46:27 2013 +1000

    recoverd: Remove unused CTDB_SRVID_RELOAD_ALL_IPS and handler
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d66a072d9b120c78c47e726e9f29a3c1cfdd87ce
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 28 15:38:48 2013 +1000

    tools/ctdb: Reimplement reloadips
    
    This implementation disables takeover runs on all nodes before trying
    to reload IPs.  It also takes "all" or the list of PNNs as an argument
    to the command instead of to -n.  -n can still be specified with a
    single node indicating that node should be considered the current node
    - that might be confusing so could be removed.
    
    This implementation does not use CTDB_SRVID_RELOAD_ALL_IPS, so it can
    be removed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 428f800bcdf3dbfe19de8bb36099fbf01ebeaab4
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 28 11:50:23 2013 +1000

    recoverd: Defer ipreallocated requests when takeover runs are disabled
    
    The takeover run will fail anyway but deferring seems like a cleaner
    option.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 28 11:32:54 2013 +1000

    recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK
    
    Use disable_takeover_runs_handler() instead of maintaining duplicate
    logic.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 15:04:40 2013 +1000

    recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS
    
    This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK.  It stops
    the IP checks but also causes any attempted takeover runs to fail and
    be rescheduled.
    
    This is meant to completely stop IP movements.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
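
The "fail and be rescheduled" behaviour might look like the following Python sketch. The deadline-based disable, the explicit now argument and the class name are assumptions; only the need_takeover_run flag name is borrowed from elsewhere in this log.

```python
class TakeoverRunGate:
    """Refuses takeover runs until a disable deadline has passed."""

    def __init__(self):
        self.disabled_until = 0.0
        self.need_takeover_run = False

    def disable(self, now, timeout):
        """Disable takeover runs for timeout seconds starting at now."""
        self.disabled_until = now + timeout

    def try_takeover_run(self, now, run):
        if now < self.disabled_until:
            self.need_takeover_run = True  # fail now, retry later
            return False
        ok = run()
        self.need_takeover_run = not ok
        return ok
```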

commit 52050e1c75b21961dafe2bc410268b44240ab24e
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 16 18:47:51 2013 +1000

    tools/ctdb: Add a wait_for_all option to srvid_broadcast()
    
    This will be useful for other SRVIDs.
    
    The error checking in the handler depends on the SRVID responding with
    a uint32_t where <0 indicates an error and >=0 is a PNN that
    succeeded.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
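
Since the reply travels as a uint32_t, "<0" only makes sense once the value is read as a signed 32-bit number. A small Python sketch of that interpretation follows; the helper names and the (ok, pending) return shape are hypothetical.

```python
import struct


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return struct.unpack("=i", struct.pack("=I", value & 0xFFFFFFFF))[0]


def summarise_replies(replies, expected_pnns):
    """Return (ok, pending).

    ok is False if any reply decodes to a negative error code; pending is
    the set of PNNs that have not yet reported success.
    """
    pending = set(expected_pnns)
    ok = True
    for raw in replies:
        result = as_signed32(raw)
        if result < 0:
            ok = False
        else:
            pending.discard(result)
    return ok, pending
```

With wait_for_all semantics the caller would keep waiting while pending is non-empty and ok is still True.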

commit a566fb5e70282c4e9f76654b1be4dc80829dced0
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 16 17:06:23 2013 +1000

    tools/ctdb: Factor out SRVID broadcast code from ipreallocate()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c58ee0eddf7ae3283e3ca8bd25575e6e677e1b17
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 16 16:25:28 2013 +1000

    tools/ctdb: Change ipreallocate() to use a local done flag
    
    Instead of the current global variable.  This is in anticipation of
    abstracting the code.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e4eae6e3291baa299a1d0f733ab11b138ee699a3
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 16 20:02:34 2013 +1000

    recoverd: Factor out the SRVID handling code
    
    The code that handles IP reallocate requests can be reused.
    
    This also changes the result back to a SRVID caller to the PNN on
    success or a negative error code on failure.  None of the callers
    currently look at the result so this is harmless... but it will be
    useful later.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d9c22b04d5aa7938a3965bd3144568664eb772ce
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 16 20:10:10 2013 +1000

    recoverd: Make the SRVID request structure generic
    
    No need for a separate one for each SRVID.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 48b603fbf16311daa47b01e7a33d477ed51da56d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Sep 3 11:21:09 2013 +1000

    recoverd: Move disabling of IP checks into do_takeover_run()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8ed29c60c0a7dd29f2a6efdf694d38e94281e1c4
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Sep 3 11:20:01 2013 +1000

    recoverd: do_takeover_run() should mark when a takeover run is in progress
    
    Nested takeover runs should never happen so they should fail.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
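
One way to picture the in-progress marker is the Python sketch below. The class is a hypothetical stand-in for the recovery daemon state; only the do_takeover_run and need_takeover_run names come from this log.

```python
class Recoverd:
    def __init__(self):
        self.takeover_run_in_progress = False
        self.need_takeover_run = False

    def do_takeover_run(self, run):
        """run() performs the takeover run and returns True on success."""
        if self.takeover_run_in_progress:
            self.need_takeover_run = True
            return False  # nested takeover runs are refused
        self.takeover_run_in_progress = True
        try:
            ok = run()
        finally:
            self.takeover_run_in_progress = False  # always clear the marker
        self.need_takeover_run = not ok
        return ok
```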

commit e5f94c7857405bdeac233069003c3769b3dc3616
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 12:19:18 2013 +1000

    recoverd: takeover_fail_callback() doesn't need to set rec->need_takeover_run
    
    It is set on every failure anyway.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 53722430ad35f80935aabd12fa07654126443b8b
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 9 12:13:11 2013 +1000

    recoverd: Fail takeover run if "ipreallocated" fails
    
    Previously flagging a failure was probably avoided because of attempts
    to run "ipreallocated" events on stopped and banned nodes, which would
    fail because they are in recovery.  Given the change to a new control
    and that fallback only retries the old method on active nodes, this
    should never fail in reasonable circumstances.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 12:14:34 2013 +1000

    recoverd: New function do_takeover_run()
    
    Factor the calling sequence for ctdb_takeover_run() into a new
    function and call it instead.  This changes rec->need_takeover_run to
    false for each successful takeover run and that seems to be the right
    thing to do.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f0f48f22f45e4c82eba2582efae307e25385de81
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Sep 17 12:00:26 2013 +1000

    recoverd: Stabilise the recovery master role
    
    On rare occasions a node that has been inactive will trigger an
    election when it becomes active again.  If that node has been up
    for the longest then it will win the election and the recovery master
    role will spuriously move.
    
    While a node remains inactive we reset the priority time to discourage
    it from winning elections.  The priority time will now reflect roughly
    how long the node has been active rather than how long it has been up.
    That means the most stable node is more likely to win elections.
    
    Having a stable recovery master means that disabling takeover runs
    while reloading IPs is more likely to succeed.  It also improves the
    chances of being able to cache information in the recovery master -
    for example, between takeover runs.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
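
The effect of resetting the priority time can be seen in a reduced Python model. It assumes the election is decided by priority time alone, with the lowest PNN as tie-break; the real election may weigh other factors, and the dictionary layout and function names are hypothetical.

```python
def refresh_priority_time(node, now):
    """While a node is inactive, keep moving its priority time forward."""
    if not node["active"]:
        node["priority_time"] = now


def election_winner(nodes):
    """Earliest priority time wins; ties fall back to the lowest PNN (assumed)."""
    return min(nodes, key=lambda n: (n["priority_time"], n["pnn"]))["pnn"]
```

A node that booted first but sat inactive no longer beats a node that has been active for longer, because its priority time now dates from when it last stopped being inactive.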

commit 403938804caf1322f9773d63197e4303a7b2a788
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 4 13:54:23 2013 +1000

    recoverd: Banned nodes should not be told to run "ipreallocated" event
    
    They will reject it because they are in recovery.  This can result in
    extra banning credits being applied to banned nodes.
    
    This corresponds to commit 9132e6814ed927fa317f333f03dedb18f75d0e5b
    from the 1.2.40 branch.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c0bb147ca09e82019b05ec22995623cffc3184e2
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 9 16:16:24 2013 +1000

    common: Make parse_ip() valgrind-clean
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 36de63843de10a1f2a9ccdbbee24cc1d08542984
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 15:27:30 2013 +1000

    recoverd: Remove an orphaned comment
    
    This should have been removed with the associated code in commit
    14bd0b6961ef1294e9cba74ce875386b7dfbf446.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ea5576071b22e1877903ec0921d375626a23e13b
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 15:24:17 2013 +1000

    recoverd: Update a comment to use current terminology
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d8a76cf79f07dfb5a93c6c9a13f16e3268c7dd57
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 15:16:51 2013 +1000

    client: Remove unused function list_of_active_nodes_except_pnn()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d4e206fb818048b7fab4797c877b854bdbb1ab70
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 15:14:10 2013 +1000

    tools/ctdb: list_of_active_nodes_except_pnn() -> list_of_nodes()
    
    list_of_active_nodes_except_pnn() is only used here and can be removed
    if we remove this call.  Less is more...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8753a094b97340deb26dd44f6ea345ca0a642a95
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 28 15:36:27 2013 +1000

    tools/ctdb: Fix a memory leak in parse_nodestring()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4a388fc6bf54636b7e1f6da8e6aa451cddd574f7
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 6 16:37:52 2013 +1000

    tests/eventscripts: Tests for memory checking in 00.ctdb
    
    ... plus updates to test infrastructure to support.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 16fcff0d1993b7a0479341862ea44d10bd5c6d6d
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 6 12:13:31 2013 +1000

    eventscripts: Clean up monitoring of system memory in 00.ctdb
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 09940255011b119dc6af3304f5d3e9568e6006fd
Author: Michael Adam <obnox at samba.org>
Date:   Thu Aug 22 16:17:09 2013 +0200

    server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..
    
    This was the comment block I was touching and meant to adapt in
    commit 00d3bf092e2f72eda330978c75ec85f17e870553.
    My search was apparently not unique...
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit c446579fc442955ecc74f5566eaa0635c3171498
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 21 14:01:25 2013 +1000

    doc: Update NEWS
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit eb8575718400c45626cd1b2e0fd247bc3ebff655
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Aug 22 17:59:31 2013 +1000

    build: Fix build dependencies for ctdb_lock_tdb
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 618ea3660e36e7bd92b686e1ca8728cf63c3c068
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 22 14:04:59 2013 +1000

    tests/simple: Minimise the chance of a monitor event being cancelled
    
    A monitor event following a "ctdb delip" might reconfigure services.
    If the monitor event is cancelled then a service might be stopped but
    not yet restarted and this could result in the subsequent monitor
    events failing.
    
    This obviously needs to be fixed in CTDB itself.  This will happen by
    making "ctdb reloadips" the supported way of reconfiguring IPs.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3ffca990a18cbd31c8bd3ae01c6671d60da58f58
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 21 17:24:03 2013 +1000

    packaging: Remove pushd/popd from maketarball.sh, don't need bash
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f0d69a9079b7aecc68f1d2d8510702046b618b19
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 21 16:48:21 2013 +1000

    tools/ctdb_diagnostics: Add output of "ctdb getdbmap"
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 406e1cb1fdd17ddd239774d0228e3657b73ae68f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 21 16:38:17 2013 +1000

    tools/ctdb_diagnostics: Safer temporary file creation
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 81833052d7ee8f76b1e98376a0273448640cfa8e
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 21 14:34:49 2013 +1000

    eventscripts: Avoid using a temporary file in 62.cnfs
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4b914d7e217202f3d11a8e95f9f74bc17869475b
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 21 14:27:39 2013 +1000

    scripts: Remove gdb_backtrace
    
    This uses potentially insecure temporary files and is not referenced
    anywhere else.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b1d8732b5da18ae80aea1df0e66b0b5cdcd919bc
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 19 14:40:52 2013 +1000

    tools/ctdb: Make most non-auto-all commands abort if run with -n all
    
    Or if run with -n A,B,...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7b3f7eea2465efb099a2faf3e42174bc97b13a16
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 15 05:02:37 2013 +1000

    tools/ctdb: Remove more non-essential fetching of PNN from daemon
    
    The useful cases are either CTDB_CURRENT_NODE, in which case
    ctdb_get_pnn() does the job, or a PNN, which is... ummm... a PNN!  :-)
    
    This works because parse_nodestring() validates PNNs.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 123a4677528cb46bee1c6dad8a5162eba9880bc1
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 19 13:54:49 2013 +1000

    tools/ctdb: Improve auto-all settings for some commands
    
    * ipreallocate is cluster-wide so should not be auto-all
    
    * enablescript, disablescript, getreclock, setreclock, natgwlist can
      all be auto-all without issues
    
    * xpnn, ipiface are local-only so don't work with -n, so might as well
      not be auto-all
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit da22d5e60dc023009854025cc9e6bc4b0a84c60e
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 16 20:27:25 2013 +1000

    recoverd: Remove an unused temporary talloc context
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit db57261d7dc264e161659a8c547f44fbd9e88eeb
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 16 14:10:57 2013 +1000

    recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c
    
    This is an internal structure.  It was moved into ctdb_private.h a
    long time ago to allow unit testing.  Unit test compilation was
    changed shortly afterwards to make this unnecessary.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3ef93a1a3e60cdf5d8954e7a16a988ea6126916b
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 15 17:04:01 2013 +1000

    recoverd: Log more information when interfaces change
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 256b157232c60bc432c94e54b1fae9699f737557
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 11 16:00:30 2013 +1000

    traverse: Log when database traverse is started
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4ed2efb838d2ac97746666f614ebef5fdf3cdd5e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Aug 22 15:12:17 2013 +1000

    ctdbd: Finish eventscript callback processing before debugging hung script
    
    This ensures that the result of eventscripts is updated and the callback
    is processed before debugging a hung script.  So "ctdb scriptstatus"
    output will be useful when run from the debug hung script.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit 7677fb263f06a97398e2c546e32273fb96edca69
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 23 16:00:15 2013 +1000

    ctdbd: Make sure call data is freed if doing an early return
    
    This should avoid memory bloat when a request bounces between nodes.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 92939c1178d04116d842708bc2d6a9c2950e36cc
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Aug 21 14:42:06 2013 +1000

    common/io: Limit the queue buffer size for fair scheduling via tevent
    
    If we process all the data available in a socket buffer, CTDB can stay busy
    processing lots of packets via immediate event mechanism in tevent.  After
    processing an immediate event, tevent returns without epoll_wait.  So as long
    as there are immediate events, tevent will never poll other FDs.  CTDB will
    report this as "Event handling took xx seconds" warning.  This is misleading
    since CTDB is very busy processing packets, but never gets to the point of
    polling FDs.
    
    The improvement in socket handling made it worse when handling traverse
    control.  There were lots of packets filled in the socket buffer quickly and
    CTDB stayed busy processing those packets and not polling other FDs and timer
    events.  This can lead to controls timing out and, in the worst case, other
    nodes marking the busy node as disconnected.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
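
The starvation problem described in this commit can be sketched as follows. This is a minimal illustration, not the CTDB code: the commit bounds the queue buffer size, whereas the sketch bounds the number of packets handled per pass, which has the same effect of returning to the poll step regularly. The names drain_bounded, run_loop and max_per_pass are hypothetical.

```python
from collections import deque


def drain_bounded(queue, handle, max_per_pass):
    """Handle at most max_per_pass packets, then return.

    Returns True if packets are still queued, so the caller knows it
    must come back after giving other event sources a turn.
    """
    for _ in range(max_per_pass):
        if not queue:
            return False
        handle(queue.popleft())
    return bool(queue)


def run_loop(queue, handle, poll_other_fds, max_per_pass=4):
    """Alternate between bounded packet handling and polling other FDs.

    Returns the number of passes taken to empty the queue.
    """
    passes = 0
    more = True
    while more:
        more = drain_bounded(queue, handle, max_per_pass)
        poll_other_fds()  # other FDs and timers get serviced every pass
        passes += 1
    return passes
```

With an unbounded drain, poll_other_fds() would run only once, after every queued packet had been handled.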

commit d8b094e804efc53fae9f44c6ef961b7b5797d290
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Aug 20 14:20:09 2013 +1000

    Revert "common/io: Keep queue buffer size multiple of 4K"
    
    This reverts commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9.
    
    This is not the best approach.  Allowing the queue buffer size to grow
    indefinitely causes a large number of CTDB packets to be queued up very
    quickly which, when processed via immediate events, will block CTDB from
    processing events from other FDs.  If there are immediate events queued
    up, tevent will never process any of the FDs till all immediate events
    are processed.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit ac417b0003f0116f116834ad2ac51482d25cfa0d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Aug 19 15:04:46 2013 +1000

    Revert "LACOUNT:  Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node"
    
    This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.
    
    This is a premature optimization.  Record can bounce between nodes
    very quickly if it is a contended record.  There is no need to hold a
    record on a node unnecessarily.  In case record contention becomes bad,
    enabling sticky records on a database is a better idea.
    
    Conflicts:
    	include/ctdb_private.h
    	server/ctdb_tunables.c
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 48f40985f4592c28402303ccbb458756f4914f75
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 15 15:39:47 2013 +1000

    ctdbd: Print a log message when a key becomes hot
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit df83ae7a047dab4803e0d94b1c11df48ae17ca96
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 9 17:22:55 2013 +1000

    ctdbd: For volatile databases, write an empty record with rsn=0 only on dmaster
    
    An empty record with rsn=0 should not be written on any node other than
    dmaster.  This is however not true for persistent databases.  So currently
    apply the check only for volatile databases.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 5cdad2b8ebd71a5e458c301d00eac00a211feeb3
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 17:00:10 2013 +1000

    tools/ctdb: Fix message in showban when node is banned
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0fe79662e20e347d9e1cb12a42cd356e33572402
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 16:58:42 2013 +1000

    tools/ctdb: Reimplement ban/unban using update_flags_wait_and_ipreallocate()
    
    This has the side effect of making these commands more resilient to
    control timeouts.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 444521c852749558f39dc6131acce9e47eefd489
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 16:34:59 2013 +1000

    tools/ctdb: Factor out common pattern used in disable/enable/stop/continue
    
    Now we will only have one set of bugs.  :-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 4bf0b1c9d21986eecb7682f935bd6154c65533cc
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 15:41:37 2013 +1000

    tools/ctdb: Factor, simplify and improve robustness of ipreallocate code
    
    Having other functions call control_ipreallocate() suggests that it
    might look at the argc/argv arguments that are passed.  This is not
    the case.  Change the callers so they call the new ipreallocate()
    function instead.
    
    Broadcast CTDB_SRVID_TAKEOVER_RUN to all connected nodes.  Inactive
    nodes will ignore it.  This is safe since we only want 1 reply.  If we
    didn't get a response, we don't actually care if there's no active
    recovery master - just fire, wait, retry, ...
    
    Ignore some failures on the basis that they might be transient, so it
    is probably worth retrying.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
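
The "fire, wait, retry" behaviour described above might look roughly like the sketch below. It is an illustration only: broadcast and wait_for_reply stand in for sending CTDB_SRVID_TAKEOVER_RUN and waiting for the single reply, and the retry limit is a hypothetical parameter, not a value taken from the tool.

```python
def ipreallocate(broadcast, wait_for_reply, retries=5):
    """Broadcast a takeover-run request until one reply arrives.

    broadcast() sends the request to all connected nodes.
    wait_for_reply() returns True if a reply arrived before its timeout.
    Returns the attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, retries + 1):
        broadcast()
        if wait_for_reply():
            return attempt
        # No reply: possibly no active recovery master yet, or a
        # transient failure.  Either way, simply try again.
    return None
```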

commit d8eb2e7fdd7645719370dad4f2faa5c3fffa8249
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 15 04:38:02 2013 +1000

    tools/ctdb: Use ctdb_get_pnn() to get PNN of the current node
    
    This has already been stored at connect time and can't fail.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f9556a6f1fe0046308c8b363e6dcaf3f7ce6f2b7
Author: Michael Adam <obnox at samba.org>
Date:   Mon Aug 19 16:54:06 2013 +0200

    util: In passing the code, fix a space vs. tab in set_close_on_exec().
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 00d3bf092e2f72eda330978c75ec85f17e870553
Author: Michael Adam <obnox at samba.org>
Date:   Mon Aug 19 17:07:19 2013 +0200

    server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit cb3a1c5af3b796dba30cae07118670d3c9e57df7
Author: Michael Adam <obnox at samba.org>
Date:   Tue Aug 13 10:17:45 2013 +0200

    server: fix wording and punctuation in comment block for ctdb_reply_dmaster().
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 7b7aa7b599536cd60ebb84d363607bb4e953248a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Aug 14 11:44:12 2013 +1000

    recoverd: Improve log message when nodes disagree on recmaster
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1c9025fdd08d1cea342af7487d0123015e08831b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 2 11:05:08 2013 +1000

    common: Null terminate process name string so valgrind doesn't complain
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f0853013655ac3bedf1b793de128fb679c6db6c6
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Aug 12 15:50:30 2013 +1000

    vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)
    
    This is caused by corruption of a record header such that the records
    on two nodes point to each other as dmaster.  This makes a request for
    that record bounce between nodes endlessly.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a610bc351f0754c84c78c27d02f9a695e60c5b0f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Aug 12 15:51:00 2013 +1000

    vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)
    
    This is caused by corruption of a record header such that the records
    on two nodes point to each other as dmaster.  This makes a request for
    that record bounce between nodes endlessly.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 60cb40d090e45ff6134c098a238fac7ad854f134
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Aug 6 14:37:13 2013 +1000

    db_wrap: Make sure tdb messages are logged correctly
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit e9ef93f7b6dad59eabaa32124df81f3e74c651ef
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 12 11:36:25 2013 +1000

    eventscripts: Become unhealthy faster on nfsd failure
    
    Anecdotal evidence suggests that most nfsd RPC check failures are due
    to cluster filesystem or storage problems.  Apparently these are rarely
    helped by attempting to restart the NFS service because the restart
    tends to hang.
    
    Fail after 2 nfsd RPC check failures, instead of waiting for 6
    failures.  Restart on every 10th failure to try to bring the node back
    to good health.
    
    Update unit tests to match.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
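
The failure policy above can be written as a small decision function. This is a sketch, not the eventscript itself (which is shell): it assumes "fail after 2" means the node is unhealthy from the second consecutive failure onward, and the function and parameter names are hypothetical.

```python
def nfsd_actions(consecutive_failures, unhealthy_after=2, restart_every=10):
    """Return (unhealthy, restart) for a count of consecutive failures.

    unhealthy: the count has reached the unhealthy threshold.
    restart:   the count is a positive multiple of restart_every, the
               kind of check a modulo operator on a counter allows.
    """
    unhealthy = consecutive_failures >= unhealthy_after
    restart = (consecutive_failures > 0
               and consecutive_failures % restart_every == 0)
    return unhealthy, restart
```

Under these assumptions the node goes unhealthy at failure 2, and a restart is attempted at failures 10, 20, 30 and so on.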

commit b49c4f39666d5b1596213bf41bcdc47ed3c327ae
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 11:56:29 2013 +1000

    tools/ctdb: Increase default control timeout to 10 seconds
    
    The current 3 second timeout is arbitrary and users trip over it
    sometimes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ff5f0d1e29af2b293e30cdc54bed03a644be7038
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 8 16:02:44 2013 +1000

    eventscripts: Improve message logged when a counter hits a limit
    
    It should print the actual number of consecutive failures rather than
    the limit.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 11fbf4789d783dd0bac22754b374dd9ea4b03bad
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 6 12:42:13 2013 +1000

    eventscripts: Print a message when waiting for TCP connections to be killed
    
    This makes the gaps in the logs more obvious.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 1d61988af9e4fa3621a3e2d06a859bcb53df2d67
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 5 15:12:14 2013 +1000

    eventscripts: New configuration variable $CTDB_RPCINFO_LOCALHOST
    
    Passing "localhost" to the rpcinfo command causes overheads, like
    reading /etc/services multiple times.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit f4ef83a256f59eeb00b9a5bc10c28347e1ad1031
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 2 15:18:47 2013 +1000

    eventscripts: Add modulo (%) operator to ctdb_check_counter()
    
    Also add it to the corresponding eventscript unit test infrastructure.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e8b531405665885196c95fe1608db33a255bf761
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 2 16:05:46 2013 +1000

    eventscripts: Separate out RPC service restart code
    
    While doing this:
    
    * Explicitly assign RPC program and version information in
      _nfs_check_rpc_common().  This is more lines of code but is easier
      to read.
    
    * Don't print the options when starting a service.  Trying to print it
      makes the code messy for little benefit.
    
      Update the eventscript unit testing code and a Ganesha test to
      reflect this.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3ba933d806106d12bc48b83b22d0f314d9d1e5e5
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 2 16:03:42 2013 +1000

    tests/eventscripts: Override background_with_logging(), just prepend "&"
    
    That is, output that goes through background_with_logging() just gets
    "&" prepended to each line.  This is cleaner than having the tests
    grovel through logs.
    
    Update some 49.winbind/50.samba tests to deal with this.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 1a1be43f8466d46913dcdfe6dcedb94316cd28ad
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 30 16:24:24 2013 +1000

    eventscripts: Remove support for RPC service 'q' and 's' restart flags
    
    They're hard to maintain and provide very little benefit.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c7332526b1b488abefeb4be78a7cd3f2f9abc451
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 30 16:21:36 2013 +1000

    eventscripts: When restarting the nfslock service only show output of start
    
    That is, /dev/null the "stop" output.  This is consistent with the way
    CTDB generally deals with the output when stopping a service.
    
    It also makes updating the eventscript unit tests easier.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 63be516673c5d9c0d543617bf1bb8bca919956a8
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 15:27:24 2013 +1000

    tests/simple: Unreachable node test should wait for recovery to complete
    
    This should minimise the chances of a control timing out.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 4e3bd06916bd3adac213fb18c7c2a24854b02d45
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 15:09:23 2013 +1000

    tests/simple: Fix the missing IP test
    
    Update the missing IP test to wait until restarts are complete.
    Otherwise a service restart can collide with the following monitor
    event and cause chaos.
    
    Also, do not disable 10.interface until it matters.  Disabling it too
    early can cause even more chaos if something goes wrong with the
    monitor step.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 2fc6b6403707a292d134140fc0b9145b454992c5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Aug 13 14:02:46 2013 +1000

    recoverd: Use TDB_INCOMPATIBLE_HASH when creating volatile databases
    
    When creating missing databases either locally or remotely, recovery
    master calls ctdb_ctrl_createdb().  Recovery master always passes 0
    for tdb_flags.  For volatile databases, if TDB_INCOMPATIBLE_HASH is not
    specified, then they will be attached without using jenkins hash causing
    database corruption.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit ca61eb776ab862bd269e45ee0f9f96e7e1e0e001
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Aug 13 13:55:47 2013 +1000

    Revert "recoverd: Use correct tdb flags when creating missing databases"
    
    This reverts commit 10a057d8e15c8c18e540598a940d3548c731b0b4.
    
    This approach would not work when creating local databases since currently
    there is no control to receive TDB flags for remote databases.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Aug 5 17:28:47 2013 +1000

    common/io: Keep queue buffer size multiple of 4K
    
    Currently queue buffer size is realloc'd every time we need to extend the
    buffer.  Small increments can cause memory fragmentation.  Instead always
    extend buffer in multiples of 4K.  This should reduce multiple talloc_realloc
    calls when there are lots of packets in the socket buffer.
    
    Also, if queue buffer has grown larger than 64K, throw away the buffer once
    all the requests in the queue have been processed.  That way queue does not
    hold on to large buffers.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
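
The sizing arithmetic in this commit (which d8b094e804efc53fae9f44c6ef961b7b5797d290 reverts) can be illustrated as below. This is a sketch of the rounding and release rules as described in the message, not the C implementation; the names are hypothetical.

```python
QUEUE_BUFFER_STEP = 4096        # grow in multiples of 4K
QUEUE_BUFFER_LIMIT = 64 * 1024  # release buffers larger than 64K when idle


def round_up(nbytes, step=QUEUE_BUFFER_STEP):
    """Round nbytes up to the next multiple of step."""
    return ((nbytes + step - 1) // step) * step


def should_release(buffer_size, pending_requests, limit=QUEUE_BUFFER_LIMIT):
    """Release the buffer only when the queue is drained and oversized."""
    return pending_requests == 0 and buffer_size > limit
```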

commit 867afb247bd8cc86c8d738f051a44cc534cafacf
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 26 13:57:03 2013 +1000

    packaging: Allow setting custom release number in RPM spec file
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-Programmed-With: Amitay Isaacs <amitay at gmail.com>

commit 44a64d1c388bfe3c3388b191edfaedecfb7bb831
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jul 31 15:59:11 2013 +1000

    ctdbd: When a record is made sticky, log only once
    
    Instead of logging from ctdb_request_call(), log the message from
    ctdb_make_record_sticky().  That way if the record is already sticky, the
    message is not repeated unnecessarily.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9cde47e1a5bf1b9ca3b4da8c2db94caac2b1aa5e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 15 17:34:31 2013 +1000

    ctdbd: Improve high hopcount log messages when request is redirected
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 81d7ce03b28d592a1337639e14d9ea141e20bfff
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 6 16:11:40 2013 +1000

    scripts: Do not run ctdb tool commands when debugging hung "init" event
    
    CTDB daemon is not ready to accept clients in INIT runstate (init event).
    CTDB daemon will start accepting connections in SETUP runstate (setup event)
    and later.
    
    Also, minor log formatting changes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d7f6bc3fed2dc61e6e587b4c0ec0ac27d533bbbe
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Aug 5 17:38:42 2013 +1000

    ctdbd: Avoid leaking file descriptor if talloc fails
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9e99e0eb072e2b845914ee3896acbc66b96138d7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Aug 5 14:08:28 2013 +1000

    eventscript: Wait for debug hung script to finish or timeout before continuing
    
    Currently if the debug hung script takes a long time to finish, the subsequent
    monitor event can collide with the previous event which is not yet finished.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 44eb86e6042adb6efe75d2a5528b82a0f21d496d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 2 15:49:06 2013 +1000

    eventscripts: Use configured RECLOCK file instead of asking CTDB
    
    On a cluster where the recovery lock file is not being used, asking the CTDB
    daemon is unnecessary overhead.  And if CTDB is using a recovery lock file,
    then changing configuration without restarting is *stupid*.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit ebecc3a18f1cb397a78b56eaf8f752dd5495bcc9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 2 10:54:38 2013 +1000

    locking: Do not create multiple lock processes for the same key
    
    If there are multiple lock helper processes waiting for the same record, then
    it will cause a thundering herd when that record has been unlocked.  So avoid
    scheduling lock contexts for the same record.  This will also mean that
    multiple requests will get queued up behind the same lock context and can be
    processed quickly once the lock has been obtained.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
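
The coalescing idea in this commit can be sketched with a dictionary keyed by record. This is an illustration under assumptions: the class and method names are hypothetical and a counter stands in for actually spawning a lock helper process.

```python
class LockScheduler:
    """Queue lock requests so each key has at most one lock helper."""

    def __init__(self):
        self.waiting = {}         # key -> callbacks queued behind one helper
        self.helpers_started = 0  # stands in for spawned helper processes

    def request(self, key, callback):
        """Queue a request; return True only if a new helper was needed."""
        if key in self.waiting:
            # A helper is already waiting on this key: join its queue.
            self.waiting[key].append(callback)
            return False
        self.waiting[key] = [callback]
        self.helpers_started += 1
        return True

    def lock_obtained(self, key):
        """Run every request queued for key; return how many were run."""
        callbacks = self.waiting.pop(key, [])
        for callback in callbacks:
            callback(key)
        return len(callbacks)
```

Three requests for the same key then start one helper rather than three, and all three are served as soon as that helper obtains the lock.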

commit 68af5405acc123b5a90decd2123e2a02961a8fcf
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 2 10:51:45 2013 +1000

    locking: Move function find_lock_context() before ctdb_lock_schedule()
    
    So that ctdb_lock_schedule() can call this function without requiring extra
    prototype declaration.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 824dcec35ec461d78e22b2ea109473b32bfe3972
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 30 14:17:55 2013 +1000

    ctdbd: Print set db sticky message after it's set
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f6b066a23610fb0092298861c21a9b354b91e2f1
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Dec 4 18:27:10 2012 +1100

    tests: Add a test program to hold a lock on a database
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 10a057d8e15c8c18e540598a940d3548c731b0b4
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 30 12:45:01 2013 +1000

    recoverd: Use correct tdb flags when creating missing databases
    
    When creating missing databases either locally or remotely, make sure
    to use the correct tdb flags from other nodes.  Without this, volatile
    databases can get attached without TDB_INCOMPATIBLE_HASH flag.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 7e7e59c4047c78159387089eca65d90037bcf722
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Aug 1 11:07:59 2013 +1000

    client: Always use jenkins hash when attaching volatile databases
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 32c83e209823e9a4d6306bb7fd63d4500f3e2668
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 29 13:50:44 2013 +1000

    recoverd: Make sure to use jenkins hash for recovery databases
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fcf77dec5af973a0e32f3999bc012053a6f47a96
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 22 17:26:28 2013 +1000

    recoverd: Assemble up-to-date node flags information from remote nodes
    
    Currently nodemap used by recovery master is the one obtained from the local
    node.  This information may have been updated while processing main loop.
    Before comparing node flags on all the nodes, create up-to-date node flags
    information based on the information received from all the nodes.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 049d9beb3783482490e6273a434ccbad23f85f0a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 15 16:35:30 2013 +1000

    tools/ctdb: Only print the hot records with non-zero hopcount
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit ab35773518ad15588013f4d859f7bee790437450
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 15 16:32:40 2013 +1000

    ctdbd: Don't consider a hot record if the hopcount is zero
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fde4b4db5a57f75c5efa5647c309f33e0d5a68f3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jul 12 17:33:13 2013 +1000

    ctdbd: Fix updating of hot keys in database statistics
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit e73b2e12adc9db1dedb48d32bba3a8406a80f4cd
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 15 15:24:11 2013 +1000

    ctdbd: Remove incomplete ctdb_db_statistics_wire structure
    
    Instead of maintaining another structure, add an element as place holder for
    marshall buffer of hot keys.  This avoids duplication of the structure.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 023ca2e84f5ed064a288526b9c2bc7e06674dd81
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 15 14:52:07 2013 +1000

    Revert "ctdbd: Remove incomplete ctdb_db_statistics_wire structure"
    
    The structure cannot be removed without adding support for marshalling keys
    for hot records.
    
    This reverts commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 57aa2dffea60abd73a95233f8b761cc676adebb6
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 26 15:09:24 2013 +1000

    doc: Update XML files to use standard DocBook DTD
    
    This simplifies building since we don't use any of the Samba
    extensions.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 37ccc7c6cc43a80aaa92291aea7a438f4225488a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 26 11:20:47 2013 +1000

    initscript: The wrapper script should export CTDB_SOCKET
    
    This ensures that any invocation of the ctdb tool (within the wrapper)
    gets the desired value.  This at least ensures that ctdbd will be
    started.
    
    If a non-standard value is set for CTDB_SOCKET then command-line users
    will still need the variable in their environment.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 782814288bb560099ee44b607bf35f3eddf37f82
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 25 16:17:07 2013 +1000

    ctdbd: Kill client process without checking for tracked child
    
    Commit f73a4b1495830bcdd094a93732a89dd53b3c2f78 added a safety check
    to ensure that CTDB never kills unrelated processes.  However, client
    processes are unrelated.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a20d94717d2e4ab866d8a002cdf39c0669b74c6a
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 25 13:40:43 2013 +1000

    eventscripts: kill_tcp_connections() should send connections to stdin
    
    This avoids issuing multiple "ctdb killtcp" commands to terminate tcp
    connections, one per connection.  This will considerably reduce the
    time when there is a large number of tcp connections.  This also makes
    it possible to avoid calling "ctdb killtcp" when there are no connections.
    
    Add a couple of unit tests for killtcp and update eventscript unit
    test infrastructure to support.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
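
The batching described here can be sketched as a helper that builds a single standard-input payload and signals when there is nothing to do. This is an illustration only. The one-connection-per-line "SRC DST" layout is an assumption and should be checked against the ctdb tool documentation; the function name is hypothetical.

```python
def killtcp_input(connections):
    """Build one stdin payload for a single "ctdb killtcp" invocation.

    connections is an iterable of (src, dst) pairs, each an
    "ADDRESS:PORT" string.  Returns None when there are no connections,
    so the caller can skip running the command altogether.
    """
    lines = ["%s %s" % (src, dst) for src, dst in connections]
    if not lines:
        return None
    return "\n".join(lines) + "\n"
```

A caller would run the tool once with this text on standard input, instead of once per connection, and not at all when the result is None.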

commit af5aa369c266430fe912df0c26116b68bac3572e
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 25 13:28:26 2013 +1000

    tools/ctdb: Allow killtcp to read connections from standard input
    
    This allows eventscripts to send information about multiple tcp
    connections to a single "ctdb killtcp" command, saving the overhead of
    setting up a client connection per tcp connection.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit a69e03a5e4671e998d45b4fef8611a421bbdb3e1
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 22 20:11:58 2013 +1000

    tests: Always tally the number of passed/failed tests
    
    Regardless of whether a summary is being printed!
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit bf4a7c1ad87e0e848296d15d63eb8cd901ca5335
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 22 16:39:46 2013 +1000

    recoverd: Call takeover fail callback only once per node
    
    Currently the fail callback is called once per (takeip/releaseip) control
    failure.  This is overkill and can get a node banned much too quickly.
    
    Instead, keep track of control failures per node and only call fail
    callback once per failed node.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
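
The change from per-control to per-node failure reporting can be sketched as follows. This is an illustration, not the recoverd code: control results are modelled as (pnn, ok) pairs and the function name is hypothetical.

```python
def report_failures(control_results, fail_callback):
    """Call fail_callback once per node that had any failed control.

    control_results is an iterable of (pnn, ok) pairs, one pair per
    takeip/releaseip control.  Returns the sorted list of failed PNNs.
    """
    failed = set()
    for pnn, ok in control_results:
        if not ok:
            failed.add(pnn)  # repeated failures on one node count once
    for pnn in sorted(failed):
        fail_callback(pnn)
    return sorted(failed)
```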

commit 1b016b2dfc5d7d3f2a42ce4dfe569608e90eb714
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 22 15:08:32 2013 +1000

    scripts: Run scriptstatus for hung event
    
    The timeout information printed by ctdbd is less than useful because
    it refers to the cumulative time taken by the eventscripts run so far.
    Adding scriptstatus output indicates where time was actually spent.
    
    Since there is now quite a bit of output, serialise the calls to this
    script using flock.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit e0f3fa1020e13b84bdd672538168d148f1847d57
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 22 15:06:52 2013 +1000

    ctdbd: Pass event name to hung script debugger
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 29e98017221326bdc9b1c4f7c05b3b495c1de29b
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 22 14:32:13 2013 +1000

    tests/complex: Fix NFS tests to work with root_squash
    
    Refactor the NFS test setup/cleanup code into new common functions.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 9d6e1c147bd036d832b98c155f405ee2a5d6f57f
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 19 19:59:43 2013 +1000

    tests: Fix exit status of run_tests when a single test is run with -H
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ae3c03d80264e997b7da9f3279d7810e18b8a1df
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 19 15:33:38 2013 +1000

    tests/simple: Add -p in onnode test to help show groups of connections
    
    Change the command from "true" to "hostname" since the former won't
    produce any output when used in combination with "onnode -p".  This
    could just be changed to "echo" but the hostname might actually be
    useful.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 90d792cf28d6a823141e4c417b6978f02a9cf596
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 17 11:14:37 2013 +1000

    ctdbd: Sleep at exit to allow time for log messages to flush
    
    Register print_exit_message() earlier so that it covers most of the
    early exits.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 3dd5b925dcf0e9a5b877638e471c5ecf36b46c58
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 19 15:36:29 2013 +1000

    ctdbd: Exit if something is already listening on CTDB socket
    
    Don't blindly remove the socket.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
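
One way to avoid blindly removing a Unix domain socket is to try connecting to it first, as sketched below. This shows the general technique and is an assumption about how such a check could be done, not a copy of what ctdbd does; the function names are hypothetical.

```python
import os
import socket


def socket_in_use(path):
    """Return True if something accepts connections on the socket at path."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
        return True
    except OSError:
        # Missing file or connection refused: nobody is listening.
        return False
    finally:
        probe.close()


def prepare_socket_path(path):
    """Return False if the daemon should exit, True if path is free to bind.

    A stale socket file with no listener behind it is removed.
    """
    if socket_in_use(path):
        return False
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
    return True
```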

commit 53e4eca74429f76adc81d98e3d11d1bd61194d71
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 16 19:57:18 2013 +1000

    tests/eventscripts: Add tests for monitoring of missing interfaces
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 501f19b16fd6d67fbb754248868c38ee5bcf79ef
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 12 12:48:34 2013 +1000

    eventscripts: A missing interface should cause monitoring to fail
    
    A missing interface is at least as bad as an interface with a link
    that is down so should have a similar effect.
    
    This couldn't be done previously because orphaned interfaces used to
    be listed for monitoring.  This was worked around in 10.interface in
    commit 49b2d1bd9554461ed8edbfc21e777c0eca9e1443 and fixed in ctdbd in
    commit cc1a3ae911d3fee8b87fda5de5ab6d9499d7510a.
    
    If $CTDB_PARTIALLY_ONLINE_INTERFACES="yes" then monitoring won't
    actually fail but the interface is still marked as down.
    
    While we're touching this code, use "ip link" instead of "ip addr".
    It is marginally cheaper but not enough for a separate patch.  ;-)
    
    This effectively reverts d67955b42f7627be9dae995230c8fcbb8a948ec2.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
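
    The decision described above can be sketched in Python as follows.
    This is an illustration under stated assumptions, not the
    10.interface shell code: monitor_interfaces() and its arguments are
    hypothetical, and link_state stands in for whatever "ip link"
    reports.

```python
def monitor_interfaces(configured, link_state, partially_online=False):
    """Return (monitor_ok, down) for the configured interfaces.

    link_state maps interface name -> True (link up) / False (link down).
    An interface absent from link_state is treated as missing.
    """
    down = []
    for iface in configured:
        # A missing interface is handled the same way as a link that is down.
        if not link_state.get(iface, False):
            down.append(iface)
    # With partially_online set, interfaces are still marked down but the
    # monitor result itself does not fail, as the commit message describes.
    monitor_ok = partially_online or not down
    return monitor_ok, down
```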

commit c6ab0f9405d5fa5b0b1693bc92e59da0d555a9d7
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 12 12:33:36 2013 +1000

    eventscripts: Get list of configured interfaces using "ctdb ifaces"
    
    This was previously changed because ctdbd didn't garbage collect
    orphaned interfaces.  This was fixed in commit
    cc1a3ae911d3fee8b87fda5de5ab6d9499d7510a.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 57ef5d3827ea3417a32703e259a53ce6fd10ac45
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jun 24 15:49:48 2013 +1000

    ctdbd: Allow extra recovery to repair persistent DBs during first recovery
    
    Commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28 introduced a potential
    regression because a node may not have completed the "recovered" event
    (so might still be in CTDB_RUNSTATE_FIRST_RECOVERY) when another node
    becomes healthy.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5740155cc5de1a223412e8529aa1a383a5412514
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 16 12:53:16 2013 +1000

    packaging: Bundle debug_locks.sh script in RPM
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 67c227a5d30cb8487b20b19b20bdfa4613906609
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 16 12:52:00 2013 +1000

    packaging: No need to check for existence of scripts, they always exist
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 412bc0e20bef694d4e911dc9c984fd7716231f1f
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 11 14:26:38 2013 +1000

    scripts: ctdbd_wrapper logs a message to syslog if syslog is not being used
    
    It can be very disconcerting when logging to syslog is expected but
    nothing is being logged there.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a4afe7af9c9391048d6f80135bbd5e15367770c7
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Fri Jun 7 19:01:06 2013 +0200

    Update Nagios check to work with ctdb versions past 30 Aug 2011
    
    Because of commit a779d83a6213e2ba
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 40f2825d6e818dc8c745b6385a545969dfb45fbc
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 11 13:01:13 2013 +1000

    recoverd: Really fix bogus info in message about changed flags
    
    Commit 9119a568c2b4601318f7751f537dca2f92a7230b attempted to fix this.
    However, this was wrong because old_flags and new_flags were confused.
    The latter has since been fixed in commit
    7eb2f89979360b6cc98ca9b17c48310277fa89fc so this can now be fixed
    properly.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 76703514040b804b880cab909f6ff52576f80f89
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 10 14:44:56 2013 +1000

    doc: Update NEWS
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0930a3b806977555509c3228726e2250aef1f971
Author: Sumit Bose <sbose at redhat.com>
Date:   Mon Nov 19 18:45:37 2012 +0100

    Print deleted nodes as well
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a81edf7eb908659a379f0cb55fd5d04551dc2c37
Author: Sumit Bose <sbose at redhat.com>
Date:   Thu Sep 1 15:18:46 2011 +0200

    IPv6 neighbor solicit cleanup
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit da87395d29f5d11ecfedaf36b53fa060a9140bfd
Author: Sumit Bose <sbose at redhat.com>
Date:   Mon Nov 19 11:13:03 2012 +0100

    Fix memory leak in ctdb_send_message()
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 05bfdbbd0d4abdfbcf28e3930086723508b35952
Author: Sumit Bose <sbose at redhat.com>
Date:   Wed Aug 10 17:53:56 2011 +0200

    Fixes for various issues found by Coverity
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 5cdcc3d45d358ddbcd7e864898eed9cbd9935429
Author: Sumit Bose <sbose at redhat.com>
Date:   Mon Nov 19 11:20:31 2012 +0100

    Check return value of tdb_delete()
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit ed9ba1d3dcfcb51aa69bf4d7a74b95063743d8d9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 11 13:46:18 2013 +1000

    web: Update webpages
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9ffcd6a91287d86bae7b0c73aa129c81126e08e7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 11 11:34:46 2013 +1000

    Tests: Correct the arguments to memset
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 14141b02b61d2783b750ee5b30f9520253e88f09
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jul 10 14:44:56 2013 +1000

    doc: Update NEWS
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-programmed-with: Martin Schwenke <martin at meltin.net>

commit e43a4b7b69a21c4cec2453dcac436b64bf5d7f06
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 10 17:19:55 2013 +1000

    packaging: Add systemd support
    
    Based on an original patch by Sumit Bose <sbose at redhat.com>.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 30a0040fbb7c4d97d107f0e55c600295c2603a68
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 10 16:35:53 2013 +1000

    build: Turn off all deprecation warnings
    
    The "‘tevent_loop_allow_nesting’ is deprecated" warnings will be
    around for a while and are annoying.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b6bbfb4c464c39e322830cbbebcc51c225508584
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 10 16:30:29 2013 +1000

    build: Remove -DTEVENT_DEPRECATED_QUIET=1 from CFLAGS
    
    This reverts the last part of 788cdbddbc902a5b076d23473450065b551d274d
    - the rest of this has been implicitly reverted via tevent syncs.
    This is just leftover noise.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e3abc7eebab5cceddc4ce7817890dd5db9be3450
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 9 15:22:07 2013 +1000

    initscript: Simplify initscript and control CTDB via new ctdbd_wrapper
    
    Currently the initscript is very complex.  This makes it hard to read
    and hard to add support for new init systems, such as systemd.
    
    Create a wrapper called ctdbd_wrapper to be installed alongside ctdbd.
    This is called by the initscript to start and stop ctdbd.  It
    constructs the ctdbd options and waits until ctdbd is properly
    initialised before it exits.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit c6fded59fa4da67f738a90fdacb51900e41801f9
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 8 12:45:31 2013 +1000

    recoverd: Recovery daemon should use ctdb_get_pnn, which can't fail
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 846109169ee5e3d03135156e45c8dac93aa2e95b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jul 10 12:23:30 2013 +1000

    ctdbd: Print tdb flags when logging attached to database message
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2493f57ce268d6fe7e4c40a87852c347fd60d29e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 9 12:32:53 2013 +1000

    ctdbd: Set process names for child processes
    
    This helps distinguish processes in process list in top, perf, etc.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fc3689c977f48d7988eed0654fb8e5ce4b8bfc8b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 9 12:24:59 2013 +1000

    common/system: Add ctdb_set_process_name() function
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit dc834d5e78c3fb97ae15cddf1139b3c4a4051a7c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 6 16:29:04 2013 +1000

    traverse: Remove unused start_time field
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1a74192aa7d51ed99553e7292860027f06b6ef37
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 6 16:26:25 2013 +1000

    traverse: Send records directly from traverse child to srcnode
    
    Currently CTDB daemon reads records from a child process and then sends them to
    srcnode via TRAVERSE_DATA control.  This ties up main CTDB daemon and also
    requires an extra copy of the record in the CTDB daemon.  Instead send records
    directly from traverse child process.
    
    The control from child process still goes via local CTDB daemon as there
    is no infrastructure currently to open a TCP socket to the srcnode.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit faabce1b99fb3de9ff03bf54d303e7656538fee3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 6 16:12:07 2013 +1000

    traverse: Pass reqid and srcnode information to local database traverse
    
    So that traverse child process can directly send the TRAVERSE_DATA control to
    the srcnode without first sending it to local node.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 8225b3e77e140db34b52571a95d553d1e59e3f1e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 8 16:14:59 2013 +1000

    packaging: When building with system libraries, add dependency for them
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2211cd94bea266547d3e6f167d3160a6b23bec88
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 8 15:49:58 2013 +1000

    ctdbd: No need for DeadlockTimeout tunable
    
    The code for deadlock detection and killing smbd process causing deadlock
    has been removed and replaced with external debug script.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a415a1986900135f889efc25ecaf2761b1dae81a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 8 15:57:22 2013 +1000

    initscript: Export CTDB_DEBUG_LOCKS variable
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c711ff4702c5f95b75e4bf030665fc2afffc2f9e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 8 15:56:30 2013 +1000

    scripts: Add an example debug_locks.sh script to debug locking issue
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2bfb8499366d530f16515b08928056bbda40f781
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 8 15:46:53 2013 +1000

    locking: Use external script to debug locking issues
    
    Use an external script to parse /proc/locks and log useful debugging
    information about locks rather than doing that in C code.
    
    To use this feature, add configuration variable to /etc/sysconfig/ctdb:
    
      CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
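
    For illustration only, here is a small Python sketch of the kind of
    parsing such a script has to do.  It is not the debug_locks.sh
    script.  It assumes the usual Linux /proc/locks layout (id, lock
    type, ADVISORY/MANDATORY, READ/WRITE, pid, major:minor:inode, start,
    end, with "->" marking a blocked waiter); parse_proc_locks() and
    waiters_by_inode() are hypothetical names.

```python
def parse_proc_locks(text):
    """Parse /proc/locks content into a list of dicts, one per lock line."""
    locks = []
    for line in text.splitlines():
        fields = line.split()
        # A waiter blocked on the preceding lock is marked with "->".
        blocked = len(fields) > 1 and fields[1] == "->"
        if blocked:
            del fields[1]
        if len(fields) != 8:
            continue  # skip lines that do not match the expected layout
        _id, kind, _mode, access, pid, dev_inode, start, end = fields
        locks.append({
            "blocked": blocked,
            "type": kind,
            "access": access,
            "pid": int(pid),
            "inode": int(dev_inode.rsplit(":", 1)[1]),
            "start": start,
            "end": end,
        })
    return locks


def waiters_by_inode(locks):
    """Group the PIDs of blocked waiters by the inode they are waiting on."""
    waiting = {}
    for lock in locks:
        if lock["blocked"]:
            waiting.setdefault(lock["inode"], []).append(lock["pid"])
    return waiting
```

    Matching the inode of a blocked waiter against the inode of a
    database file is what turns raw lock lines into a useful report.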

commit 6fc36a7036933237d09151a0baf4d8ccd2bc2c99
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jul 3 11:01:21 2013 +1000

    locking: Update locking bucket intervals
    
     0   < 1 ms
     1   < 10 ms
     2   < 100 ms
     3   < 1 s
     4   < 2 s
     5   < 4 s
     6   < 8 s
     7   < 16 s
     8   < 32 s
     9   < 64 s
    10   >= 64 s
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
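
    The table above maps a lock latency to a bucket index.  A minimal
    Python sketch of that mapping, assuming latencies are given in
    seconds; BUCKET_LIMITS and latency_bucket() are hypothetical names,
    not CTDB identifiers:

```python
import bisect

# Upper bounds (exclusive) of buckets 0..9, in seconds; bucket 10 is ">= 64 s".
BUCKET_LIMITS = [0.001, 0.01, 0.1, 1, 2, 4, 8, 16, 32, 64]


def latency_bucket(seconds):
    """Return the bucket index (0..10) for a lock latency given in seconds."""
    # bisect_right counts the limits that are <= seconds, which is the index
    # of the first bucket whose upper bound is still above the latency.
    return bisect.bisect_right(BUCKET_LIMITS, seconds)
```

    From 1 s upwards each bucket doubles the previous limit, so long
    latencies stay distinguishable without adding many buckets.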

commit dcc42a75b4638b3aa40c44ed9e0aaae26483e2b0
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jul 3 11:46:53 2013 +1000

    locking: Update locks latency in CTDB statistics only for RECORD or DB locks
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 594c421f90ce132c75fbd985872114e4967f92b5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jun 25 15:36:13 2013 +1000

    tools/ctdb: Fix the format of DB statistics output
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jun 25 15:25:16 2013 +1000

    ctdbd: Remove incomplete ctdb_db_statistics_wire structure
    
    Send the ctdb_db_statistics directly instead of first copying it to
    duplicate ctdb_db_statistics_wire structure.  This simplifies the
    implementation of the control to get database statistics.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 545a46437dfb2b755bb2fddb11dea8c4ccce3ed7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 09:04:49 2013 +1000

    ctdbd: Update debug messages for setting readonly property on database
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 41182623891d74a7e9e9c453183411a161201e67
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jul 5 14:04:20 2013 +1000

    recoverd: Fix buffer overflow error in reloadips
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit e1cf1f728236d808bb41265e74bc65f54bf1c133
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 4 20:02:29 2013 +1000

    tests/eventscripts: Add some rudimentary tests for 60.ganesha
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f606df4f2db754592e6d1a16c26e155cacb2beef
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 4 16:05:01 2013 +1000

    eventscripts: New configuration variable $CTDB_SKIP_GANESHA_NFSD_CHECK
    
    This allows 60.ganesha to be unit tested, except for the core Ganesha
    monitoring code.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ceb5b2d37f7ab4894908ec26f3812b3bed991525
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 4 16:00:33 2013 +1000

    eventscript: Move Ganesha nfsd monitoring to a function
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 520914e7ee1b879c1080e5857fda18ed5b973fd6
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 4 15:11:54 2013 +1000

    eventscripts: Drop RPC service version from nfs_check_rpc_service() calls
    
    Support for this was removed in commit
    77302dbfd85754e02559eccb2dd6c090db0b6b9f and I overlooked its use in
    60.ganesha.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 4d0f26b306fc465d551d340b0e7dce4412eae3fd
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 2 14:43:17 2013 +1000

    ctdbd: Log something when releasing all IPs
    
    At the moment this is silent and it can be confusing to see IPs just
    disappear.
    
    Also, this message:
    
      Been in recovery mode for too long. Dropping all IPS
    
    can cause anxiety when all IPs should already have been dropped.
    Adding a comforting message saying that 0 IPs were dropped relieves
    such anxiety.  :-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0a292fa8939a1343e44cadaa8ed9f3c0f18ca82f
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 19:00:36 2013 +1000

    recoverd: Minor style improvements for ctdb_reload_remote_public_ips()
    
    * Add a variable to the loop to make the code more readable and have
      it generally fit into 80 columns.
    
    * Improve comments.
    
    * Improve log messages.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f0942fa01cd422133fc9398f56b4855397d7bc86
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 18:45:46 2013 +1000

    recoverd: Clean up log messages in remote IP verification
    
    The log messages in verify_remote_ip_allocation() are confusing
    because they don't include the PNN of the problem node, because it is
    not known in this function.
    
    Add the PNN of the node being verified as a function argument and then
    shuffle the log messages around to make them clearer.
    
    Also fold 3 nested if statements into just one.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 298c4d2c3b4ea3d900c91f5a0a5aca2952a13d61
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 17:57:33 2013 +1000

    recoverd: Fix an unclear log message - "Restart recovery process"
    
    When the recovery master notices a node in recovery mode it starts the
    recovery process, it doesn't restart it.
    
    Update documentation to match.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9f6cd8b0bea619991c9f3bf35188c5950dabf8f4
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 17:53:37 2013 +1000

    recoverd: Fix an incorrect comment
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 035bf3eecf99337c84d4ad16cdbf297b1fa037db
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 17:48:01 2013 +1000

    ctdbd: Use ctdb_die() on "setup" event failure
    
    This is slightly easier to read because it all fits on 1 line.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3af2d833b63af9931792106db71797f3692669a8
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 17:43:52 2013 +1000

    ctdbd: Avoid a core dump when "init" event fails
    
    The "init" event only really fails in the scripts, which should log
    something useful on failure.  Therefore, a core dump isn't terribly
    useful and sometimes attracts unwanted attention.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c0a9456692c88a7a5542cd893d8f326524d3f94e
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 17:42:11 2013 +1000

    util: New function ctdb_die()
    
    This is like ctdb_fatal() but exits cleanly without dumping core or
    generating a backtrace.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
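
    The difference between the two exit paths can be illustrated with a
    short Python sketch.  This is an analogy, not the C implementation:
    die() and fatal() are hypothetical stand-ins, and the exit status of
    1 is an assumption.

```python
import os
import sys


def die(msg, log=sys.stderr):
    """Log the message and exit cleanly: no core dump, no backtrace."""
    print(msg, file=log)
    sys.exit(1)  # assumed exit status


def fatal(msg, log=sys.stderr):
    """Log the message and abort, which normally produces a core dump."""
    print(msg, file=log)
    os.abort()
```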

commit ce04f1c107b4392ca955d9f29b93aaaae62439ce
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jun 24 19:03:26 2013 +1000

    eventscripts: When replaying monitor status, don't log empty output
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c5797f2942e83da24df548ea07196fbbac0eab20
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jun 24 16:05:03 2013 +1000

    ctdbd: Release IP callback should fail if the IP is still hosted
    
    At the moment there are (at least) 2 bugs that cause rogue IPs:
    
    * A race where release_ip_callback() runs after a "subsequent" take IP
      has completed.  The IP is back on an interface but we unset
      vnn->iface in the callback.
    
    * A "releaseip" eventscript times out.  We ignore the timeout and call
      it success, deleting the VNN even if the IP is still hosted.
    
      We could decide not to ignore the timeout and ban the node, but
      killing TCP connections can take a long time and that might result
      in a lot of banning.  We probably won't reinstate banning on
      "releaseip" until killing TCP connections has been optimised.
    
    In both cases, a rogue IP can be avoided by leaving vnn->iface set and
    simply failing the control.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
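
    A minimal Python sketch of the rule in the last paragraph, not the
    release_ip_callback() C code.  The dict-based vnn, the
    ip_still_hosted flag and the 0/-1 return values are assumptions made
    for illustration.

```python
def release_ip_result(vnn, ip_still_hosted):
    """Decide the outcome of a release-IP callback.

    vnn is a dict with an "iface" key.  Returns 0 on success, -1 on failure.
    """
    if ip_still_hosted:
        # Leave vnn["iface"] set and fail the control, so that the IP is not
        # forgotten while it is still configured on an interface.
        return -1
    vnn["iface"] = None
    return 0
```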

commit f1f1b0c24b9b6cd24b83a4e4da16e179287ec6ac
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jun 24 15:49:48 2013 +1000

    ctdbd: Log warnings in release IP when unexpected interface is encountered
    
    Previous code changes work around potential problems but do not
    provide useful information when a problem occurs.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 16afe36de52561a62372c14b567683dc898369d5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 17:37:05 2013 +1000

    ping_pong: Validate num_locks argument > 0
    
    This fixes the floating point error if num_locks = 0.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
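
    A hedged Python sketch of the validation, not the ping_pong C code.
    It assumes the lock index advances modulo num_locks, so a value of 0
    would divide by zero; parse_num_locks() and next_lock() are
    hypothetical names.

```python
def parse_num_locks(arg):
    """Parse the num_locks argument and reject values that are not > 0."""
    num_locks = int(arg)
    if num_locks <= 0:
        raise ValueError("num_locks should be > 0")
    return num_locks


def next_lock(i, num_locks):
    """Advance to the next lock index, wrapping around at num_locks."""
    # With num_locks == 0 this modulo is the operation that would fail.
    return (i + 1) % num_locks
```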

commit d48eecd748830598f4f080952f2bf05d6f92738c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 17:27:00 2013 +1000

    tests: If connection to ctdb daemon fails, exit
    
    This fixes the segmentation error if any of the test code fails to
    connect to CTDB daemon.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 5408c5c4050539e5aa06a5e82ceb63a6cb5cef0c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 17:00:23 2013 +1000

    build: Fix compiler warnings for uninitialized variables
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9aa13bcedd83d463c871e3cf1f3a65da3cd83992
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 15:36:29 2013 +1000

    recoverd: Send the result from child process only once
    
    The result has already been sent before the child starts waiting for
    the parent ctdbd process.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9b529189f8456fad7868fc154ae27a6fd87e93b3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 15:31:52 2013 +1000

    packaging: Enable compiler optimizations
    
    This reverts d09570c70551aa40390ce9ceffe7bc234e1afafe.
    
    ... hoping the segv has been found in last 6 years. :-)
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit bb54f3924ff19cd089b0a166fe8368db162ad709
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 15:14:10 2013 +1000

    packaging: Allow building RPMs with system tdb/talloc/tevent
    
    To build CTDB RPMs with system installed libraries, use the following command:
    
      ./packaging/RPM/makerpms.sh \
        --with system_talloc \
        --with system_tdb \
        --with system_tevent
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1b0faae9c939a2f8da3cacba715ca62a5830d190
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 14:29:09 2013 +1000

    packaging: Do not mark /etc/ctdb/functions as configuration file
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 53d34eb2f9e5434dea4e7182b6af566a3a96a368
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 13:19:56 2013 +1000

    packaging: Install README.notify.d using %doc directive
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit 6fe584d05543eebd24abd19bab502dc4da04e921
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 12:45:32 2013 +1000

    packaging: Install docs using %doc directive
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit 7e53fbf92b6dd5211d918ea0e23126b7dfa50c42
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 4 11:33:38 2013 +1000

    packaging: Remove ctdb_transaction from docdir
    
    It's bundled in ctdb-tests package.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 145b1966c1b34f1667a175235e1df2741294391c
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 17:23:08 2013 +1000

    doc: Add a disclaimer for the EnableBans tunable
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b4c06e8ec8b227c1e6c01444038c3b15b5f9e606
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 30 17:22:06 2013 +1000

    doc: Add banning bug fixes to NEWS
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ef1c4e99ca66e7a990bc557f34abb624c315e6ba
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 2 12:40:37 2013 +1000

    ctdbd: Don't ban self if init or shutdown event fails
    
    There is no point in banning the node if init or shutdown event times
    out since it's going to quit anyway.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fcd5e1f04c5fe6c98399429b8f0918b8779acba6
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 17:46:43 2013 +1000

    doc: The second half of monitoring is only for recovery master
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 932360992b08a5483d90c0590218ba0fd756119e
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 26 09:23:22 2013 +0200

    recoverd: when the recmaster is banned, use that information when forcing an election
    
    When we trigger an election because the recmaster considers itself inactive,
    update our local nodemap with the recmaster's flags before calling
    force_election(). This way, we don't send the inactive node freeze commands
    (e.g.) that may fail and then lead to ourselves getting banned.
    
    The theory is that this should help avoiding banning loops.
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 741944f118e98f178b860194eecb215180949d18
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 26 07:11:51 2013 +0200

    recoverd: fix a comment typo
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit ac06c46e4a80c635f6094b5ac6f0bf3e3a02db95
Author: Michael Adam <obnox at samba.org>
Date:   Fri Jun 21 17:57:37 2013 +0200

    recoverd: fix a comment in main_loop
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit df30c0a05ed908fc2a997c56ff5484736b23b70f
Author: Michael Adam <obnox at samba.org>
Date:   Fri Jun 21 14:06:22 2013 +0200

    recoverd: eliminate some trailing spaces from ctdb_election_win()
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 28 16:31:07 2013 +1000

    recoverd: Don't continue if the current node gets banned
    
    A banned node cannot continue with recovery or cluster monitoring.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:31:02 2013 +1000

    recoverd: Refactor code to ban misbehaving nodes
    
    Since we have nodemap information, there is no need to hardcode the
    limit of 20.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit ae1693905036ecdbc4594fde1f12500faae4a554
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 16:01:16 2013 +1000

    recoverd: Move code to ban other nodes after we get local node flags
    
    If a node gets banned first, then it should not ban other nodes.
    
    This code was moved up in main_loop to avoid waiting for nodemap
    from other nodes (commit 83b0261f2cb453195b86f547d360400103a8b795).
    
    To prevent a banned node from banning other nodes, we need to first get
    nodemap information from local node, so trying to ban other nodes can
    fail if we are already banned.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 593a17678fbd3109e118154b034d43b852659518
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:44:27 2013 +1000

    recoverd: Delay the initial election if node is started in stopped state
    
    Since there is an early exit if a node is stopped or banned, we can wait till
    the node becomes active to start initial election.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 93bcb6617e1024f810533e12390a572f51703ca0
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:33:49 2013 +1000

    recoverd: Update capabilities only if the current node is active
    
    Since we do an early return if a node is stopped or banned, move update
    capabilities code below the early return and just before we check the
    capabilities of current recovery master.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 815ddd3341b7e9db39e05a3a3fcd9a1420f053bc
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:46:04 2013 +1000

    recoverd: No need to check if node is recovery master when inactive
    
    If a node is stopped or banned, it will cause early return from the
    main_loop, so this check is redundant.  The election will be called by an
    active node.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2396981c4bcf30530aeb7f4395093cc202105b50
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:39:15 2013 +1000

    recoverd: Always do an early exit from main_loop if node is stopped or banned
    
    A stopped or banned node cannot do anything useful.  So do not participate
    in any cluster activity and do not cause any unnecessary network traffic.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
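
    The early exit can be sketched in Python as below.  This is an
    illustration, not the recovery daemon's main_loop(): the flag values
    are illustrative bit masks, and main_loop_iteration() with its
    do_cluster_work callback is hypothetical.

```python
# Illustrative bit values for the node flags (assumption).
NODE_FLAGS_BANNED = 0x08
NODE_FLAGS_STOPPED = 0x20
NODE_FLAGS_INACTIVE = NODE_FLAGS_BANNED | NODE_FLAGS_STOPPED


def main_loop_iteration(local_flags, do_cluster_work):
    """Run one loop iteration; return False if the node bailed out early."""
    if local_flags & NODE_FLAGS_INACTIVE:
        # A stopped or banned node does no cluster work and sends no traffic.
        return False
    do_cluster_work()
    return True
```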

commit 38304f88e0c634e97d4687c25adef975f71537b8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:10:47 2013 +1000

    recoverd: Do not set banning credits on a node if current node is inactive
    
    If the current node is banned or stopped, then it should not assign banning
    credits to other nodes since the current node will not have up-to-date flags
    of other nodes.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a60f228f8380f222f838eb619d2ab55f96f11ac2
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 17:40:36 2013 +1000

    banning: Do not come out of ban if databases are not frozen
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 297d93cecc3c0655e72ecac38508e113bdbeab9c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jun 24 14:33:32 2013 +1000

    banning: No need to check if banned pnn is for local node
    
    If the banned pnn is not the local node, the function returns early.
    So no need for additional check.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit bb178338658b4ae32382a1f62f7c21cee1d4878f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:04:18 2013 +1000

    banning: Make ctdb_local_node_got_banned() a void function
    
    When this function is called, we are already committed to banning
    and there is no point in failing this function.  If freezing of
    databases fails, it will be fixed by the recovery daemon.

commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:02:44 2013 +1000

    recoverd: Also check if current node is in recovery when it is banned
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 8d622660a14c929e365d306147b378ea6ab92175
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:09:35 2013 +1000

    recoverd: Set node_flags information as soon as we get nodemap
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 34af2cdf686d5d77854cbaa7bbcd8f878e9171c7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jun 26 16:02:23 2013 +1000

    recoverd: Remove old comment as the code corresponding to that has gone away
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c6f8407648abb37f2ed781afa5171dad8c9f59e9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jun 24 14:31:50 2013 +1000

    banning: Log ban state changes for other nodes at higher debug level
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 46efe7a886f8c4c56f19536adc98a73c22db906a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 16:28:04 2013 +1000

    freeze: Make ctdb_start_freeze() a void function
    
    If this function fails due to memory errors, there is no way to recover.
    The best course of action is to abort.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 87716e8f504d659515d3dbcf93badbf106873bc8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 16:21:00 2013 +1000

    freeze: If priority is invalid here, it's time to abort
    
    ctdb_start_freeze() is called from ctdb_control_freeze() which fixes the
    priority if it's 0 and returns an error if it's invalid.  Other callers of
    ctdb_start_freeze() are internal to CTDB.  So if priority is invalid in
    ctdb_start_freeze(), definitely something is seriously wrong.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
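
The split described in this commit, where the control entry point sanitises its input and the internal function treats a bad priority as fatal, can be sketched as below. This is an illustrative Python model, not CTDB's C code: the function names, the valid range 1..3 and the default priority of 1 are assumptions made for the example, and the exception stands in for aborting the daemon.

```python
# Hypothetical model; the valid range and default are assumptions.
VALID_PRIORITIES = range(1, 4)
DEFAULT_PRIORITY = 1


def start_freeze(priority):
    """Internal entry point: an invalid priority means a serious bug."""
    if priority not in VALID_PRIORITIES:
        # Stand-in for aborting the daemon.
        raise RuntimeError(f"invalid db priority {priority}")
    return f"freezing priority {priority}"


def control_freeze(priority):
    """External entry point: fix up or reject input before freezing."""
    if priority == 0:
        priority = DEFAULT_PRIORITY
    if priority not in VALID_PRIORITIES:
        return -1  # report an error to the caller instead of aborting
    start_freeze(priority)
    return 0
```

With this shape, bad input from outside only yields an error return, while a bad value arriving from an internal caller is loud and immediate.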

commit 478e24bceda3fedfba54ccb48faa115df726b819
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 13:26:33 2013 +1000

    freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()
    
    This ensures that whenever databases are frozen either via sending
    control or by calling ctdb_start_freeze(), the action is logged.
    Since ctdb_control_freeze() calls ctdb_start_freeze(), move logging of
    message in early return condition if databases are already frozen.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4be8dff3a4451192f838497b4747273685959bed
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jun 24 14:18:58 2013 +1000

    recoverd: Print banning message only after verifying pnn
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jun 26 15:22:46 2013 +1000

    recoverd: When updating flags on nodes, send updated flags and not old flags
    
    This was broken by commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa.
    Instead of a SRVID_SET_NODE_FLAGS message to recovery daemon, a control
    was sent to the local daemon which in turn informed the recovery daemon.
    And while doing this change old flags were sent via CONTROL_MODIFY_FLAGS.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4f87925a287f612a6ab3b5da1a387a31c7bea28f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jun 26 14:34:47 2013 +1000

    tools/ctdb: Add "force" option to "recover" command
    
    At the moment there is no easy way to force a recovery when attempting
    to reproduce certain classes of bugs.  This option is added without
    documentation because it is dangerous until the bugs are fixed!  :-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 733fc909425860f6a02c205c2d8f34a731853922
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jun 24 17:37:15 2013 +1000

    client: Exit with non-zero status when unix socket is closed
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit abeb65ef02d018a7c14d4f8cea71e15c6cf9e357
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 21 14:49:20 2013 +1000

    doc: Fix ctdb ping entry in manpage
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5d0215be5aefe492258a92c7bff2d41960379580
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 21 14:47:20 2013 +1000

    doc: Fix documentation for NoIPTakeover in ctdbd manpage
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4ba7c73eeab98296c9168e0b0fed1f6bb9f32733
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 21 14:33:12 2013 +1000

    doc: Update notification script section in ctdbd manpage
    
    The example notification script is now much more useful.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4369c8e6ead9062ef7855ada375df74262acf925
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 21 14:32:50 2013 +1000

    doc: Add nodestatus command to the ctdb manpage
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit cd6227aa38d3bb4e5043faeffe436004e27b6d06
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 21 10:52:05 2013 +1000

    doc: Update NEWS
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b7aaa28b3a6a2de923417f3d143f8d516447711e
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 20 16:43:10 2013 +1000

    tests: Integration tests use "ctdb nodestatus" for healthy cluster check
    
    Also check that we're not in recovery mode.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b953524185632d7f96a76d8f3bbed7ac1d143d40
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 20 16:42:30 2013 +1000

    tests: Integration test infrastructure should do only a single recovery
    
    No need for 2 recoveries after a restart.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f1b7ca8dc3f34a59c7b3e55748f974ac9ed8f458
Author: Martin Schwenke <martin at meltin.net>
Date:   Sat Jun 22 15:44:28 2013 +1000

    ctdbd: Fix panic on overlapping shutdowns
    
    The runstate can't be set to SHUTDOWN twice, so the current naive code
    causes a panic on the 2nd shutdown.  This regression was introduced in
    commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
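
A minimal sketch of the failure mode and a guard against it, in Python rather than CTDB's C. The class, the method names and the "runstate may only move forward" rule are assumptions for illustration; the actual fix may be structured differently.

```python
from enum import IntEnum


class Runstate(IntEnum):
    # Ordering is an assumption made for this sketch.
    INIT = 1
    SETUP = 2
    FIRST_RECOVERY = 3
    STARTUP = 4
    RUNNING = 5
    SHUTDOWN = 6


class Daemon:
    def __init__(self):
        self.runstate = Runstate.RUNNING
        self.shutdowns_performed = 0

    def set_runstate(self, new):
        # Assumed rule: setting the same or an earlier state is fatal.
        if new <= self.runstate:
            raise RuntimeError("runstate must always increase")
        self.runstate = new

    def shutdown(self):
        # Guard: a second, overlapping shutdown request is ignored.
        if self.runstate == Runstate.SHUTDOWN:
            return False
        self.set_runstate(Runstate.SHUTDOWN)
        self.shutdowns_performed += 1
        return True
```

Without the check at the top of `shutdown()`, the second call would reach `set_runstate()` with an unchanged state and raise.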

commit b32fd04bfbf33062d45365b37a7247e272a76ceb
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jun 19 10:58:14 2013 +1000

    ctdbd: Refactor shutdown sequence
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9ea57af557028b1d2e5c560e7bcf4d014b9a8b1e
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 16 21:01:43 2013 +1000

    eventscripts: "setup" event doesn't need to wait for SETUP runstate
    
    The "setup" event isn't called until ctdbd is in CTDB_RUNSTATE_SETUP
    anyway...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit aabf0bf41cb8ec344f06b69492fb6c2a27f9e900
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 18 15:07:26 2013 +1000

    tests/eventscripts: New tests for 00.ctdb "init" event
    
    These test dropping of IPs and TDB checking.
    
    New stubs for date, tdbdump, tdbtool.
    
    Enhance ip stub to handle "ip addr show to ..."
    
    Tweak some infrastructure.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 3b11b27f3e22e99947bc2d6c49c4427bd7a0e332
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 18 15:02:05 2013 +1000

    eventscripts: 13.per_ip_routing should not try hard to find public_addresses
    
    This essentially reverts d4621277240721e6d130a930b0100506b64467ea.
    This was added for testing but the test code was actually broken.
    CTDB itself will only process public IPs if $CTDB_PUBLIC_ADDRESSES is
    set, so no code should try to be more flexible than that!
    
    The test code has been fixed instead.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c3e7a6e10d486ba0dbafdf110db540675b2317bc
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 18 15:05:39 2013 +1000

    tests/eventscripts: setup_ctdb() should always set $CTDB_PUBLIC_ADDRESSES
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit f3dd2eec200d6eeada2ea19cd7e76f1edfad6167
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jun 17 15:14:53 2013 +1000

    logging: Notify parent when logging daemon is up
    
    Messages are lost until it is really up because syslogd_is_started is
    set too early.  Adding a pipe to do the notification allows the parent
    to wait and only set syslogd_is_started when the logging daemon is
    actually ready.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
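
The pipe-based notification can be sketched as follows. This is a generic POSIX pattern written in Python, not the CTDB implementation: the function name is hypothetical, and the child exits straight after signalling, whereas a real logging daemon would keep running.

```python
import os


def start_child_and_wait_until_ready():
    """Fork a child and block until it reports that it is ready."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: finish initialisation, then notify the parent.
        os.close(read_fd)
        os.write(write_fd, b"1")
        os.close(write_fd)
        os._exit(0)

    # Parent: close the unused end, then block until the child writes
    # (or exits without writing, in which case read() returns b"").
    os.close(write_fd)
    ready = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    return ready
```

Only once this returns true would the parent set a flag such as syslogd_is_started, so nothing is sent before the child can receive it.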

commit 3bc93f312b8464fbfa2b2c44fffedc591fe5a3e0
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jun 17 10:14:24 2013 +1000

    scripts: Move TDB checking from initscript to "init" event
    
    It makes sense to do this in the "init" event and make the initscript
    less complicated.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0b77cceb49a30a181063adc7868d42d2851318e8
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 16 20:29:33 2013 +1000

    scripts: Move dropping of all IPs from initscript to "init" event
    
    It makes sense to do this in the "init" event and make the initscript
    less complicated.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5ffce65a1ad659b198ddf647622b899bdde45c72
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 18 14:53:17 2013 +1000

    scripts: drop_ip() should use delete_ip_from_iface()
    
    Otherwise secondary addresses that aren't owned by CTDB could be
    dropped.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0b67397ef5419c781a35916575151da7b7e7cc27
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 16 20:24:10 2013 +1000

    scripts: drop_all_public_ips() now prints messages to stdout, not log
    
    Change all callers to maintain current behaviour.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0a0c8543f167e11b75a622513367b083e42cbd3f
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jun 16 19:49:02 2013 +1000

    ctdbd: "init" event should run earlier in daemon initialisation
    
    It should run before:
    
    * the transport is started;
    * databases are attached; and
    * processing configuration files (e.g. nodes, public_addresses).
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit c48583fd238496a81ddc46a21892f0b49559036a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jun 18 14:27:34 2013 +1000

    tools/ctdb: Do not exit prematurely on control timeout if retrying in a loop
    
    This avoids premature exits from "ctdb stop" and "ctdb continue" due to
    intermittent control (e.g. getpnn, getnodemap) timeouts.
    
    This needs a proper fix to distinguish between timeout and failure
    conditions and take appropriate action.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 5f8547b1531bba4950b3d873a997585c3a16d31e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 13 12:55:29 2013 +1000

    packaging: Update the minimum required library versions
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 02c63c591cc273122b3a547bb301b92f0e4bd217
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 7 11:24:17 2013 +1000

    build: Enable VERBOSE option to display build command line
    
    make V=1 or make VERBOSE=1 will display build commands.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f8bf99de3a5f56be67aaa67ed836458b1cf73e86
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Thu Jun 6 21:58:02 2013 +0200

    build: Fix tdb.h path to enable building with system TDB library

commit 14a79c0f3967c88f8ffc8200d122f6c5ffdb63a8
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Thu Jun 6 21:43:08 2013 +0200

    libctdb: Include config.h in libctdb/ctdb.c
    
    Bug-Debian: http://bugs.debian.org/703551

commit edb2a3556d03e248b42f63dd2c62382b723bc98f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 6 16:42:02 2013 +1000

    ctdbd: Make sure we don't kill init process by mistake
    
    If getpgrp() fails, it will return -1 and that will send KILL signal to init
    process (PID 1).  This does not happen on RHEL, but does on AIX.
    
    Reported-by: Chris Cowan <cc at us.ibm.com>
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
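
The arithmetic behind this bug: a process group is signalled by passing the negated group ID to kill(), so an error value of -1 negates to 1, which is the PID of init. A hedged Python sketch of a guard follows; the helper name and the choice to reject every value of 1 or less are assumptions, not necessarily what the commit does.

```python
def kill_argument_for_group(pgrp):
    """Return the pid argument for signalling a whole process group.

    Returns None when the group ID cannot be used safely.
    """
    if pgrp <= 1:
        # -1 means getpgrp() failed; negating it would target PID 1.
        # A group ID of 1 would negate to -1, which kill() treats as
        # a broadcast, so that is rejected too.
        return None
    return -pgrp
```

A caller would skip the kill entirely when the helper returns None.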

commit cd4358b01c6c3d413b431f5760029d2b163b9c03
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 13 16:32:06 2013 +1000

    tests/eventscripts: Unit tests for $CTDB_NFS_DUMP_STUCK_THREADS
    
    Includes minor test infrastructure updates.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0e2b5a8f89440a53f996482ac0c98b31a4f2cad3
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 13 16:30:45 2013 +1000

    tests/eventscripts: Fix -X tracing in iterate_test()
    
    ... and delete a bogus comment.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ce2ef2be8aa22c0baf868daac8d4cf27246baa14
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 13 15:50:44 2013 +1000

    tests/eventscripts: Add unit tests for $CTDB_MONITOR_NFS_THREAD_COUNT
    
    Includes minor test infrastructure updates.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 2503245db10d567af708a04edd3a3b488c24f401
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 13 11:56:25 2013 +1000

    eventscripts: New configuration variable $CTDB_NFS_DUMP_STUCK_THREADS
    
    If some nfsd threads are still alive after a shutdown during a restart
    then this indicates the maximum number of threads for which a stack
    trace should be dumped.  This can be useful for trying to determine
    why nfsd is stuck.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 99b0d8b8ecc36dfc493775b9ebced54539c182d2
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 13 10:17:20 2013 +1000

    eventscripts: Add new option $CTDB_MONITOR_NFS_THREAD_COUNT
    
    Consider the following example:
    
    1. There are 256 nfsd threads configured.
    2. 200 threads are "stuck" in system calls, perhaps waiting for the
       underlying filesystem when an attempt is made to restart NFS.
    3. 56 threads exit when NFS is stopped.
    4. 56 new threads are started when NFS is started.
    5. 200 "stuck" threads exit leaving only 56 threads running.
    
    Setting this option to "yes" makes the 60.nfs monitor event look for
    this situation and try to correct it.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
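
The numbered example above can be worked through in a few lines. This Python model is illustrative only: the function names are hypothetical, and how 60.nfs actually detects and corrects the shortfall is not stated in this message.

```python
def threads_after_restart(configured, stuck):
    """Model steps 2 to 5 of the example above."""
    stopped = configured - stuck  # threads that exit when NFS is stopped
    started = stopped             # only that many new threads are started
    return started                # the stuck threads exit later


def thread_count_needs_correction(configured, running):
    """The condition the monitor event would look for."""
    return running < configured
```

For the figures in the example, `threads_after_restart(256, 200)` gives 56, which is below the configured 256 and so would be flagged.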

commit c429394afbabaee09f9216dc743419adddf523ea
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 31 14:55:07 2013 +1000

    recoverd: Log node that causes takeover run to fail
    
    Extend takeover_fail_callback() to just log (and not do any ban
    processing) when the callback data is NULL.  Always call
    ctdb_takeover_run() with the callback so that useful errors are always
    logged.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit ac0892d3a57adb0587a37de0f94fa686bed8970f
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 24 15:38:54 2013 +1000

    doc: Add release notes for 2.2
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 78cff9d54f241fb6a2943e50346f9c2ad9decc78
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 15:14:42 2013 +1000

    build: Fix extra whitespaces
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 82d61f77c01df0fbb42743593937b175ce22a445
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 14:12:14 2013 +1000

    tevent: Sync to tevent 0.9.18 from upstream

commit 506b27c944b4031e8a325816bd12abddd442a0bb
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 14:44:03 2013 +1000

    replace: Sync to latest replace from upstream
    
    The latest commits affecting lib/replace remove autoconf build from
    Samba tree.  So use the following commit as a sync point.
    
      commit 9ddfd7d8784e6f546628f48990b69ee2850be52d
      Author: Andrew Bartlett <abartlet at samba.org>
      Date:   Wed May 22 17:23:30 2013 +1000
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit bb3a32ec055432afc7225c9fd7504fb187694bda
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 14:05:50 2013 +1000

    tdb: Sync to tdb 1.2.11 from upstream

commit 3bffca8c17e441364525df115ee2ac16b5969e24
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 13:53:38 2013 +1000

    talloc: Sync to talloc 2.0.8 from upstream
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit db31dc48bd3135e9242af08bb79b67a17a2b1668
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 12:11:49 2013 +1000

    ctdbd: Log node state transitions at higher debug level
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit ca7ba26362eabfbcc329c66919d9c4da79c3b799
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 14:17:59 2013 +1000

    git: Ignore generated ctdb.spec file
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 641f539ffc7dd9542e669a3ec20c004f8bbcbf1e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 29 14:17:00 2013 +1000

    git: Ignore ctdb_version.h file
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fa757b49374e44c2380d4457e9b0eb3582981fac
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 24 15:25:52 2013 +1000

    build: Use REPLACE_OBJ and CTDB_EXTERNAL_OBJ to simplify build rules
    
    This fixes the build on AIX where libreplace is required to build
    ctdb_lock_helper, ctdb_fetch_lock_once, ctdb_fetch_readonly_once.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2cf95741fdab2ee5f724950a0b1ef257d6aeade7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 24 15:14:20 2013 +1000

    build: Support for building on AIX xlc compiler
    
    xlc does not support -fPIC, -Wno-format-zero-length
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1734562a7b3512853b9e0232880c42d50c1c2e4c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 23 23:44:45 2013 -0500

    tests: Do not use err() to support AIX
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 0320bb4f8ca8171812ec7f41556aed847c74bfb4
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 24 14:52:09 2013 +1000

    tests: Include system/time.h to support building on AIX
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2c19fa78ce0b25c3615b23664df32233bdbdea42
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 24 14:51:46 2013 +1000

    libctdb: Do not include sys/time.h to support build on AIX
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit b091f09ea01482823bd850d1d4e2329e0a19c959
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 23 23:42:23 2013 -0500

    util: Do not stop build if backtracing is not supported
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1b5968f6be084590667f4f15ff3bef13ed9a2973
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 28 12:01:57 2013 +1000

    eventscripts: Fix statd-callout update handling
    
    60.nfs and 60.ganesha touch $statd_update_trigger every time they're
    run.  This stops the statd-callout updates from ever being called.
    
    Make this logic self-contained and move it to new function
    nfs_statd_update() in the functions file.  Call this in 60.nfs and
    60.ganesha with the appropriate update period as the only argument.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reported-by: Poornima Gupte <poornima.gupte at in.ibm.com>
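
Why touching the trigger on every run suppresses the update can be seen in a small simulation. This is a Python model with a numeric timestamp standing in for the trigger file's modification time; the class and parameter names are hypothetical, and the real nfs_statd_update() is a shell function.

```python
class StatdUpdater:
    def __init__(self, period, now, touch_every_run):
        self.period = period
        self.last_touched = now
        self.touch_every_run = touch_every_run  # True models the bug
        self.updates = 0

    def monitor(self, now):
        if now - self.last_touched >= self.period:
            self.updates += 1
            self.last_touched = now  # refresh only when an update runs
        elif self.touch_every_run:
            self.last_touched = now  # bug: the age never reaches period


def count_updates(touch_every_run):
    updater = StatdUpdater(period=60, now=0, touch_every_run=touch_every_run)
    for now in range(10, 101, 10):  # a monitor event every 10 seconds
        updater.monitor(now)
    return updater.updates
```

Over 100 simulated seconds, the buggy variant never runs an update, while the fixed one runs it once the period has elapsed.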

commit 25a6fd784cde96f3d20a79f70b5589b5c4aca675
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 28 11:26:17 2013 +1000

    tests/integration: Improve debug output for unhealthy cluster after restart
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 80b3cf2c652c6098390cdd0dbb3edc648f7df487
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 27 15:16:28 2013 +1000

    tests/scripts: Delete unused $rows and $ww variables from run_tests
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 85e11b9b13b3add88c1b8957be51793cc1db4f2d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 28 14:19:32 2013 +1000

    packaging: Create separate package for pcp pmda
    
    To build ctdb-pcp-pmda package, run packaging/RPM/makerpms.sh script with
    "--with pmda" option.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 194f7a0dec26d693a5f3e6734b1c82f61f8e4d19
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 28 14:16:02 2013 +1000

    build: Separate autoconf macros for pmda
    
    The pmda stuff is no longer built by default even if the headers are
    available.  To build, run "configure --enable-pmda".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 11af486754bb04899e3dc544157bf70530e66cd1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 28 14:16:25 2013 +1000

    build: Fix install paths for pcp pmda
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit f2ef3510407fbad29908195c58e4160d5a81e8a4
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 27 14:43:03 2013 +1000

    packaging: makerpms.sh can take multiple arguments for rpmbuild
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0ca7a98ffef50cbd06849cfbf65fb4a3d668b7bd
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 27 12:56:41 2013 +1000

    eventscripts: Stop NAT gateway's delete_all() from polluting the log
    
    Every time a node that wasn't the NAT gateway master gets reconfigured
    something like this appears in the log:
    
      ctdbd: 11.natgw: Failed to del 10.0.1.139 on dev eth1
    
    Since this usually fails it is better to mute the error than to have
    it pollute the log.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b2654853ce9b7c18c5874b080bc94d3118078a5d
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 27 11:29:42 2013 +1000

    recoverd: Backward compatibility for nodes without IPREALLOCATED control
    
    Consider the case of upgrading a cluster node by node, where some
    nodes are still running older versions of CTDB without the
    IPREALLOCATED control.  If a "new" node takes over as recovery master
    and a failover occurs, then it will attempt to send IPREALLOCATED
    controls to all nodes.  The "old" nodes will fail in a fairly
    nondescript way (result == -1).
    
    To try to handle this situation, fall back to the EVENTSCRIPT control
    to handle "ipreallocated".  Only do this on the failed nodes.
    However, do not do this on nodes that timed out (they've probably
    implemented the control and we should call the regular fail_callback
    to get those nodes banned) or for stopped nodes (since they can't
    actually run the "ipreallocated" event via the EVENTSCRIPT control).
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
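
The per-node decision described above can be written as a small function. This is a Python sketch with hypothetical names and return labels. It assumes timeouts are reported as -ETIME, as introduced in the async_callback() commit further down this log, and it simply skips the fallback for stopped nodes because the message does not say what else happens to them.

```python
import errno


def ipreallocated_followup(result, node_stopped):
    """Decide what to do after sending IPREALLOCATED to one node."""
    if result == 0:
        return "ok"
    if result == -errno.ETIME:
        # The node probably implements the control: ban as usual.
        return "fail_callback"
    if node_stopped:
        # A stopped node cannot run the event via EVENTSCRIPT.
        return "no_fallback"
    # Any other failure: assume an old node without the control.
    return "fallback_eventscript"
```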

commit b2b572e9049c7138bd223226475bef8fe3e01f10
Author: Martin Schwenke <martin at meltin.net>
Date:   Sat May 25 19:57:24 2013 +1000

    scripts: Provide mktemp function for platforms without mktemp command
    
    This is needed for AIX and possibly others.
    
    Also provide the cheaper mktemp function that is needed in the
    run_tests script.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c9e36f596c63c9af7f80d7cb8d7a5c6dcca4860a
Author: Martin Schwenke <martin at meltin.net>
Date:   Sat May 25 19:08:49 2013 +1000

    tests: Fix integration tests to use real private IPs
    
    192.0.2.x was a typo.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e5a5ab53173d9aa4190ddf68c4ae316d4473eb56
Author: David Disseldorp <ddiss at samba.org>
Date:   Fri May 24 16:11:12 2013 +0200

    pmda: handle new ctdb_statistics format
    
    The ctdb_statistics structure was recently changed. Update the PMDA to
    dereference the new structure member names.
    
    Signed-off-by: David Disseldorp <ddiss at samba.org>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 75a620c516e384f042b5d675183b3a1b48fd6115
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Apr 5 20:47:47 2013 +1100

    tests/takeover: New test with 900 IPs

commit cfd1371d3a1f78a0ed86485d83bd4d311727c3d4
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Apr 5 20:45:08 2013 +1100

    tests/takeover: Takeover tests can use up to 1024 and checks limits
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ef35c8889d90220929e48e66eb62da9ea2025ede
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 8 14:37:44 2013 +1000

    tests/takeover: LCP2 tests for weird, unbalanced corner-cases
    
    2 tests to show a bad result and a 3rd test for the fix.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 954ae6f84cb06a8dcbc12456d4752280072be5bf
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 8 14:37:08 2013 +1000

    tests/takeover: Allow takeover runs with differing IP allocations per node
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 32723c9efdad1c6ca4aa53f308ccd9bef1aadfff
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 24 18:07:39 2013 +1000

    vacuum: Reduce the priority of non-critical error
    
    Since the complete database is not locked when the receive_records
    control is received, it's possible that we may not be able to obtain
    lock on a chain.  We will try again to store this record.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit b697625b184227dad1be31a41b7a3fd9bd312e29
Author: Michael Adam <obnox at samba.org>
Date:   Fri May 17 11:05:44 2013 +0200

    ctdbd: fix comment explaining CTDB_REQ_CALL redirection.
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit d9e24782a90d9ce29c0e6584b75d2b186142174d
Author: Michael Adam <obnox at samba.org>
Date:   Fri May 17 11:01:31 2013 +0200

    ctdbd: remove a nonempty blank line
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 9a21d417c51fb9cad8f2e87e00ca54d379aef860
Author: Michael Adam <obnox at samba.org>
Date:   Fri May 17 11:00:32 2013 +0200

    ctdbd: update comment describing ctdb_call_send_redirect()
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit c57430998a3bdedc8a904eb3a9cdfde1421aff50
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 6 20:31:08 2013 +1000

    tests/takeover: New tests to check runstate handling
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f15dd562fd8c08cafd957ce9509102db7eb49668
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 6 15:36:29 2013 +1000

    recoverd: Nodes can only take over IPs if they are in runstate RUNNING
    
    Currently the order of the first IP allocation, including the first
    "ipreallocated" event, and the "startup" event is undefined.  Both of
    these events can (re)start services.
    
    This stops IPs being hosted before the "startup" event has completed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
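
In effect, the set of nodes considered for IP allocation is filtered by runstate. A minimal Python sketch, with a hypothetical function name and runstates represented as plain strings:

```python
def nodes_eligible_for_ips(runstates):
    """Return the PNNs of nodes allowed to host public IPs.

    runstates maps a node's PNN to its current runstate name.
    """
    return [pnn for pnn, state in sorted(runstates.items())
            if state == "RUNNING"]
```

A node still in STARTUP is left out, so it hosts no IPs until its "startup" event has completed.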

commit c0c27762ea728ed86405b29c642ba9e43200f4ae
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 23 19:03:11 2013 +1000

    recoverd: Handle errors carefully when fetching tunables
    
    If a tunable is not implemented on a remote node then this should not
    be fatal.  In this case the takeover run can continue using benign
    defaults for the tunables.
    
    However, timeouts and any unexpected errors should be fatal.  These
    should abort the takeover run because they can lead to unexpected IP
    movements.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
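
The error policy above can be sketched as follows. This Python model uses a hypothetical function name. It assumes an unknown tunable is reported as -EINVAL and a timeout as -ETIME, as in the get_tunable and async_callback() commits further down this log, and the exception stands in for aborting the takeover run.

```python
import errno


def resolve_tunable(result, value, default):
    """Pick the tunable value to use for one remote node."""
    if result == 0:
        return value
    if result == -errno.EINVAL:
        # Tunable not implemented on that node: use a benign default.
        return default
    # Timeouts and unexpected errors abort the takeover run.
    raise RuntimeError(f"fatal error {result} fetching tunable")
```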

commit 1190bb0d9c14dc5889c2df56f6c8986db23d81a1
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 23 19:01:01 2013 +1000

    recoverd: Set explicit default value when getting tunable from nodes
    
    Both of the current defaults are implicitly 0.  It is better to make
    the defaults obvious.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 40e34773b8063196457746ffe7a048eb87d96d61
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 23 16:09:38 2013 +1000

    client: async_callback() sets result to -ETIME if a control times out
    
    Otherwise there is no way of treating a timeout differently to a
    general failure.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 03fd90d41f9cd9b8c42dc6b8b8d46ae19101a544
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 21 15:41:56 2013 +1000

    ctdbd: Update the get_tunable code to return -EINVAL for unknown tunable
    
    Otherwise callers can't tell the difference between some other failure
    (e.g. memory allocation failure) and an unknown tunable.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 473cfcb019f0cb4a094bf10397f7414f7923ee57
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 22 17:19:34 2013 +1000

    recoverd: Whitespace improvements
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f6792f478197774d2f3b2258c969b67c83e017ab
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 22 20:56:03 2013 +1000

    recoverd: Use talloc_array_length() for simpler code
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c50eca6fbf49a6c7bf50905334704f8d2d3237d7
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 18:02:51 2013 +1100

    ctdbd: When the "setup" event fails log an error and exit, don't abort
    
    The "setup" event can fail when one of the eventscripts fails to run
    its "setup" event.  If this occurs then the eventscript should log an
    error.  The stack trace and core file generated when we abort provides
    no useful information.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 39a43feae7c7de07ddaf2d6cb962f923d47d0c19
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 16:02:31 2013 +1100

    eventscripts: 11.natgw should not call ctdb tool in "init" event
    
    The current code calls "ctdb setnatgwstate ..." on every event.
    However, calling the ctdb tool in the "init" event is not permitted.
    
    Instead, update the capability when it is needed and at regular
    intervals via the "monitor" event.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Apr 18 20:30:14 2013 +1000

    ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY
    
    This adds more serialisation to the startup, ensuring that the
    "startup" event runs after everything to do with the first recovery
    (including the "recovered" event).
    
    Given that it now takes longer to get to the "startup" state, the
    initscript needs to wait until ctdbd gets to "first_recovery".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 4a2effcc455be67ff4a779a59ca81ba584312cd6
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 14:09:14 2013 +1100

    tools/ctdb: "ctdb runstate" now accepts optional expected run state arguments
    
    If one or more run states are specified then "ctdb runstate" succeeds
    only if ctdbd is in one of those run states.
    
    At the moment, if the "setup" event fails then the initscript succeeds
    but ctdbd exits almost immediately.  This behaviour isn't very
    friendly.
    
    The initscript now waits until ctdbd is in "startup" or "running" run
    state via the use of "ctdb runstate startup running", meaning that ctdbd
    has successfully passed the "setup" event.
    
    The "setup" event code in 00.ctdb now waits until ctdbd is in the
    "setup" run state before proceeding via the use of "ctdb runstate setup".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
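
    The waiting behaviour described above can be sketched roughly as
    follows.  This is a minimal illustration, not the initscript code:
    wait_for_runstate() and the overridable $CTDB variable are
    hypothetical, and the only behaviour assumed of the tool is what this
    commit message states, namely that "ctdb runstate <states>" succeeds
    only if ctdbd is in one of the listed run states.

```shell
# Hypothetical helper, not taken from the CTDB initscript.
# Usage: wait_for_runstate <timeout-seconds> <runstate>...
# CTDB is an assumed override hook; it defaults to the real tool name.
CTDB="${CTDB:-ctdb}"

wait_for_runstate ()
{
    _timeout="$1" ; shift
    _count=0
    # Poll until ctdbd reports one of the requested run states
    while ! $CTDB runstate "$@" >/dev/null 2>&1 ; do
        # Check for the timeout before sleeping
        if [ "$_count" -ge "$_timeout" ] ; then
            echo "Timed out waiting for run state: $*" >&2
            return 1
        fi
        sleep 1
        _count=$((_count + 1))
    done
    return 0
}
```

    An initscript could then call something like
    "wait_for_runstate 10 startup running" and treat a non-zero return
    as a failed start.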

commit bf20c3ab090f75f59097b36186347cedb1c445d4
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 14:07:12 2013 +1100

    tools/ctdb: New command runstate to print current runstate
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit dc4220e6f618cc688b3ca8e52bcb3eec6cb55bb1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 21 16:18:28 2013 +1000

    ctdbd: New control CTDB_CONTROL_GET_RUNSTATE
    
    Also new client function ctdb_ctrl_get_runstate().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f43fe3a560d5915c1a9893256f4e7bfe3d7e290a
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 10 16:48:39 2013 +1100

    ctdbd: Start logging process earlier
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit c31feb27dcdb748b5333321c85fe54852dfa1bcf
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 10 16:33:36 2013 +1100

    ctdbd: Only start recovery daemon and timed events after setup event
    
    This deconstructs ctdb_start_transport(), which did much more than
    starting the transport.
    
    This removes a very unlikely race and adds some clarity.  The setup
    event is supposed to set the tunables before the first recovery.
    However, there was nothing stopping the first recovery from starting
    before the setup event had completed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 10 16:06:25 2013 +1100

    ctdbd: Replace ctdb->done_startup with ctdb->runstate
    
    This allows states, including startup and shutdown states, to be
    clearly tracked.  This doesn't include regular runtime "states", which
    are handled by node flags.
    
    Introduce new functions ctdb_set_runstate(), runstate_to_string() and
    runstate_from_string().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 9e7b7cd04adc5e66e2ffa4edf463a682aaea379b
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 23 16:06:47 2013 +1000

    tools/ctdb: Remove duplicate command definition for "sync"
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit dbb7c550133c92292a7212bdcaaa79f399b0919b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 8 23:29:55 2013 +1000

    logging: Make sure ringbuffer messages are terminated with a newline
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 29911fa44a480c17c701528ef46919b2a962a366
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 8 16:25:30 2013 +1000

    tests: Fix output of run_tests usage

commit 80fbe9364350d42658f7f8af250ac87eb1afbc21
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 8 13:45:55 2013 +1000

    locking: Set lock helper path once
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c660f33c3eaa1b4a2c4e951c1982979e57374ed4
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 8 10:42:08 2013 +1000

    locking: Remove functions that are not used anymore
    
    These functions were used by the locking child process to do the
    locking.  With the locking helper, they are no longer required.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 6ea3212a7b177c6c06b1484cf9e8b2f4036653d9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 15:13:44 2013 +1000

    locking: Remove functions that are not used anymore
    
    These functions were used by the locking child process to do the
    locking.  With the locking helper, they are no longer required.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 7cde53a6cbe74b1e46f7e1bca298df82c08de866
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 15:07:49 2013 +1000

    locking: Use separate locking helper binary for locking
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f665e3d540c90579952e590caa5828acb581ae61
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:32:46 2013 +1000

    locking: Create commandline arguments for locking helper
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a08b6ac19506160f3fb5925ea025027dce07781d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Apr 22 15:36:27 2013 +1000

    locking: Add a standalone helper to lock record/db
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 7630ca4116b476636c27407748088ea335f1a06c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:14:16 2013 +1000

    locking: Use database iterator for unmarking databases
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit adc113055de98fae276f9b501aff5c03cd25ddc8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:16:07 2013 +1000

    locking: Add handler function for unmarking a database
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit e8ea65b2713417db4a618a9f4633991cfaa93fe6
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:12:40 2013 +1000

    locking: Use database iterator for marking databases
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f120e40533780e02ff1cdc41cc6d3af1c4c83258
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:07:11 2013 +1000

    locking: Add handler function for marking a database
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 187ed83f9701c7fa8d3cc476d47c5d2a87d5c308
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:10:06 2013 +1000

    locking: Use database iterator for unlocking databases
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 725239535f40ca2cca445bb5bf2e181351b330e9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:06:46 2013 +1000

    locking: Add handler function for unlocking a database
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit d2634d72d9ca0ceeb72cbb1adc95017a234480fd
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:08:51 2013 +1000

    locking: Use database iterator for locking databases
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2a1c933ef7c78ee071e2a640ea10941f1c12e32a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 14:06:27 2013 +1000

    locking: Add handler function for locking a database
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a3275854812aca86032704134fdf6a129069c86a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 30 13:23:59 2013 +1000

    locking: Refactor code to iterate over databases based on priority
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit d98a861716d5f8c1f4387d21666396d3164551b3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 1 12:55:22 2013 +1000

    locking: Add newline to debug logs
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 0577ce3c68e4febf49a1ef5093e918db9d5ec636
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 23 13:04:06 2013 +1000

    tools/ctdb: Fix racy ipreallocate code
    
    This code tried to find the recovery master and send an ipreallocate
    request to that node.  When a node is stopped, this code asked the
    stopped node for the recovery master.  A stopped node does not have
    up-to-date information on the current recovery master, so ipreallocate
    requests were sent to the wrong node and ignored by that node, which
    is not the recovery master.
    
    Send the ipreallocate request to all active nodes.  That way we
    guarantee that the current recovery master will see it and respond to it.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit 9d4524d13cbba21bfaf61bd35667984359b379b3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 22 15:37:46 2013 +1000

    ctdbd: Print version string in the daemon startup
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit d18fcfff674e876abde8d51afec92d9c4a090d2f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 22 14:23:17 2013 +1000

    build: Rename version.h to ctdb_version.h
    
    This avoids clash with version.h from Samba tree.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 939d12b96a0cbebbe6269fa2b14f584058dd6174
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 9 15:43:10 2013 +1000

    logging: Fix a bug in ringbuffer
    
    When the ringbuffer is full, it does not return any entries.  Simplify
    the ringbuffer logic by keeping track of the number of log entries
    rather than the last entry.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit 14bd0b6961ef1294e9cba74ce875386b7dfbf446
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 13 15:27:04 2013 +1000

    recoverd: takeover_run_core() should not use modified node flags
    
    Modifying the node flags with IP-allocation-only flags is not
    necessary.  It causes breakage if the flags are not cleared after use.
    ctdb_takeover_run() no longer needs the general node flags - it only
    needs the IP flags.
    
    Instead of modifying the node flags in nodemap, construct a custom IP
    flags list and have takeover_run_core() use that instead of node
    flags.  As well as being safer, this makes the IP allocation code more
    self contained and a little bit clearer.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a8605f7e06076e7edf84e0cc160fd3d9ab5c4b64
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 20 10:47:07 2013 +1000

    ctdbd: Update confusing log message
    
    Inactive can also mean stopped.  To add information, just print the
    flags instead.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3105f9e291d0792199ac9e689f6d0e0a47ee4b0d
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 17 16:46:41 2013 +1000

    Packaging: maketarball.sh should be a bash script due to pushd use
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d29e9a420b133088bf23a847c8d1dbce56c25eb0
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 17 16:42:25 2013 +1000

    scripts: Rework notify.sh to use notify.d/ directory
    
    This makes it easier to add notification handlers.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
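
    One way a notify.d/ style dispatcher can work is sketched below.  It
    is an illustration under assumptions, not the notify.sh shipped with
    CTDB: run_notify_handlers() is a hypothetical name, and passing the
    event name through as the handlers' arguments is assumed rather than
    taken from this commit.

```shell
# Hypothetical sketch of a directory-based notification dispatcher.
# Usage: run_notify_handlers <directory> <event> [args...]
run_notify_handlers ()
{
    _dir="$1" ; shift
    _status=0
    for _handler in "$_dir"/* ; do
        # Skip anything that is not an executable regular file
        # (this also covers an unmatched glob in an empty directory)
        [ -f "$_handler" ] && [ -x "$_handler" ] || continue
        # Run every handler; remember if any of them failed
        "$_handler" "$@" || _status=1
    done
    return $_status
}
```

    With this layout, adding a notification handler only requires
    dropping an executable file into the directory.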

commit 1f96ea08f9a39dfe537c9b957ac512c84dc76f91
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 14 16:20:32 2013 +1000

    ctdbd: Log a message when recovery master changes
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-Programmed-With: Amitay Isaacs <amitay at gmail.com>

commit 3c3df1d6afec7e3e721f9bcd4e8b8e008fd6e50b
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 14 15:38:08 2013 +1000

    ctdbd: Log add and delete of IPs
    
    At the moment, when someone deletes all the IPs on a node, all we see
    are the release IP messages and we have to guess why.
    
    Some would argue that add/delete are more significant than
    take/release so they should be logged.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4a8d90d0812a3242f58a2a0e2aa0f528f60f7013
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 14 15:30:53 2013 +1000

    ctdbd: Removed bogus comment in ctdb_find_iface()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f1619a36c1beba11533052dc5728fa3adaa08870
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 14 14:56:26 2013 +1000

    eventscripts: Fix regression in _loadconfig()
    
    fff88940f71058e4eefd65f50a6701389c005c17 introduced a regression.
    Without $service_name set by default, the CTDB configuration is no
    longer loaded when loadconfig() is called without any arguments.
    That's bad.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e6b6b793f61556c21e8daf34abf89ee7b388ecfb
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 9 20:44:11 2013 +1000

    initscript: If CTDB doesn't become ready, print a message before killing
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0c0752515b66661ffae24be5f138bd2fab4dec5c
Author: Christian Ambach <ambi at samba.org>
Date:   Wed May 8 08:45:09 2013 +0200

    build: Create sudoers.d dir during make install
    
    Otherwise make install into a non-standard prefix will fail.
    
    Signed-off-by: Christian Ambach <ambi at samba.org>

commit b0cae7d5a00ef3764bae187affc8e9a252f4b329
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue May 14 23:18:32 2013 +1000

    eventscripts: Do not use bashism for string comparison
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit e143abd16ccde2e0edfe103673d31a5fb06b6aef
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 9 12:53:48 2013 +1000

    recoverd: Move IP flags into ctdb_takeover.c
    
    These should never be seen outside the IP allocation code.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 45c776958017ea7001f061842c9e0f60e4a25f23
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 9 12:51:57 2013 +1000

    recoverd: Clear IP flags after IP allocation algorithm has run
    
    If these flags are left set they will confuse other recovery daemon
    code.
    
    Factor the clearing code into new function clear_ipflags().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit d0a3822573db296e73cc897835f783c8abc084b3
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 20:46:15 2013 +1000

    recoverd: Remove unused mask argument and initial mask calculation
    
    This has been replaced by set_ipflags() and associated functionality.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 65e0ea6c2c0629e19349ba4b9affa221fde2b070
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 20:41:32 2013 +1000

    recoverd: When calculating rebalance candidates don't consider flags
    
    This is really a check to see if a node is already hosting IPs.  If
    so, we assume it was previously healthy so it isn't considered as a
    rebalance candidate.  There's no need to limit this to healthy nodes,
    since this is checked elsewhere.
    
    Due to this the variable newly_healthy is renamed everywhere to
    rebalance_candidates.
    
    The mask argument is now completely unused.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 107e656bbe24f9d21fbaf886a3e9417da4effe5a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 20:13:40 2013 +1000

    recoverd: Remove unused mask argument from IP allocation functions
    
    This is a no-op and is in a separate commit to make the previous
    commit less cumbersome.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7cf63722873a6a7baafd77aa3d8a1989b221dee9
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 15:57:21 2013 +1000

    tests/takeover: Add takeover tests, mostly for NoIPHostOnAllDisabled
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 1308a51f73f2e29ba4dbebb6111d9309a89732cc
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 16:59:20 2013 +1000

    recoverd: Fix tunable NoIPTakeoverOnDisabled, rename to NoIPHostOnAllDisabled
    
    This really needs to be per-node.  The rename is because nodes with
    this tunable switched on should drop IPs if they become unhealthy (or
    disabled in some other way).
    
    * Add new flag NODE_FLAGS_NOIPHOST, only used in recovery daemon.
    
    * Enhance set_ipflags_internal() and set_ipflags() to set up
      NODE_FLAGS_NOIPHOST depending on setting of NoIPHostOnAllDisabled
      and/or whether nodes are disabled/inactive.
    
    * Replace can_node_servce_ip() with functions can_node_host_ip() and
      can_node_takeover_ip().  These functions are the only ones that need
      to look at NODE_FLAGS_NOIPTAKEOVER and NODE_FLAGS_NOIPHOST.  They
      can make the decision without looking at any other flags due to
      previous setup.
    
    * Remove explicit flag checking in IP allocation functions (including
      unassign_unsuitable_ips()) and just call can_node_host_ip() and
      can_node_takeover_ip() as appropriate.
    
    * Update test code to handle CTDB_SET_NoIPHostOnAllDisabled.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 12aef10e9889760d98f58c8d916f19d069fa381a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 16:56:24 2013 +1000

    recoverd: Factor out new function all_nodes_are_disabled()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a1addd89fd9c0390912604097acd028cc24d3483
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 15:55:01 2013 +1000

    tests/takeover: Allow per-node tunable settings
    
    Implemented for CTDB_SET_NoIPTakeover.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 1fb5352d2b6918fcc6f630db49275d25a3eebe8d
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 16:21:16 2013 +1000

    recoverd: Refactor code to get NoIPTakeover tunable from all nodes
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 9721aae001b3023e9c8b4af2b143c0db3442d623
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 15:53:13 2013 +1000

    tests: Unit test diff output should use filtered output
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 91405282ba4abad4ad8e8c5f7ee4c83c75f38280
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 3 15:41:26 2013 +1000

    recoverd: Add debug message when dropping IPs in IP allocation
    
    Update tests accordingly.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0eb351ff4c7ee096de7c5e0a59561067091fa32e
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 12:30:33 2013 +1000

    eventscripts: NFS RPC checks no longer support "knfsd"
    
    No longer used, support removed from test infrastructure.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7e792d6768d9ca420ce3713cb122e63afd594b15
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 12:17:31 2013 +1000

    eventscripts: 60.nfs uses nfs_check_rpc_services() to check NFS RPC services
    
    * New directory nfs-rpc-checks.d/ replaces hardcoded rules in 60.nfs
    
    * Installation and packaging additions to handle nfs-rpc-checks.d/
    
    * Unit test updates, including deleting 1 test that sanity checked
      test infrastructure
    
    * Test infrastructure changes to use nfs-rpc-checks.d/
    
    Note that this removes support for $CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK in
    60.nfs.  To get the equivalent behaviour, edit 20.nfsd.check and
    remove/comment all lines.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d9775fcbd6e30eef8382bea68e2f9bad2309f2c1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 11:14:48 2013 +1000

    eventscripts: NFS RPC checks allow "nfsd" in addition to "knfsd"
    
    Want nfs_check_rpc_services() to support filenames without the 'k'.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9bc8fbee6550ed2814fb35c70d57fab21ef1b8fd
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 06:42:54 2013 +1000

    eventscripts: New function nfs_check_rpc_services()
    
    This is intended to replace nfs_check_rpc_service(), which builds
    configuration into eventscripts.
    
    nfs_check_rpc_services() uses a directory of configuration checks that
    can be edited by an administrator.  The files have one limit check and
    a set of actions per line.  The program name is extracted from the
    file name.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
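
    The per-file layout described above could be read along the
    following lines.  This is a sketch only: both function names are
    hypothetical, and the assumed rule format (a two-word limit check
    such as "-ge 6" followed by the actions) is a guess for illustration,
    not the format defined by nfs_check_rpc_services().

```shell
# Hypothetical sketch; neither function is taken from the CTDB sources.

# Derive the program name from a check file name such as "20.nfsd.check".
prog_from_check_file ()
{
    _f="${1##*/}"            # drop any leading directory
    _f="${_f#[0-9][0-9].}"   # drop the "NN." ordering prefix
    echo "${_f%.check}"      # drop the ".check" suffix
}

# Print "<program> <limit check> -> <actions>" for each rule line.
# Assumed line format: the first two words are the limit check, the
# rest of the line lists the actions.
list_rpc_checks ()
{
    _prog=$(prog_from_check_file "$1")
    while read -r _op _limit _actions ; do
        # Skip blank lines and comments
        case "$_op" in
            ""|\#*) continue ;;
        esac
        echo "$_prog $_op $_limit -> $_actions"
    done <"$1"
}
```

    Keeping the program name in the file name means an administrator can
    add or disable checks for a service without editing the eventscript.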

commit 5a717fd495ba5a2bfd481d69f38b68fa4576716f
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 06:28:27 2013 +1000

    eventscripts: nfs_check_rpc_action() should be _nfs_check_rpc_action()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit cc3bb42e48bbdabd19187c231846b98589b4f4f3
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 06:27:02 2013 +1000

    eventscripts: Factor out common code from nfs_check_rpc_service()
    
    This creates new function _nfs_check_rpc_common().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 887733dd7be53158bfe07b30ef31b611d0f8122f
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 06:17:15 2013 +1000

    eventscripts: Remove ganesha support from nfs_check_rpc_service()
    
    This is unused so doesn't need to be maintained.  An attempt to use it
    now will explicitly fail rather than implicitly fail via bitrot.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 77302dbfd85754e02559eccb2dd6c090db0b6b9f
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 06:14:43 2013 +1000

    Revert "Eventscript functions: add optional version to nfs_check_rpc_service()"
    
    This reverts commit 92f74fd589467b46c758e116e97417edfe8773d7.
    
    This change is unused and is just complicating the function.
    
    Conflicts:
    	config/functions

commit 15b0f78cbf8d6ba481b7eba9e4fe3f4270214c72
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 05:54:12 2013 +1000

    eventscripts: Move rpc.statd existence check into nfs_check_rpc_service()
    
    The code in 60.nfs is going to be genericised, so make all the checks
    look the same.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4b4e7d8f0e8dcbab987e374d06ffaa21c06da0d3
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 22 15:45:13 2013 +1000

    eventscripts: Factor NFS RPC check action code into nfs_check_rpc_action()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a8ef00608e48a551a334aded206146807aeb4c5a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 15:33:12 2013 +1000

    eventscripts: Remove unused function ctdb_check_counter_limit()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit bb2cdff77e8ec79e7d319159b9c9848ecfaaa0f1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 15:23:20 2013 +1000

    eventscripts: Use ctdb_check_counter() instead of ctdb_check_counter_limit()
    
    ctdb_check_counter_limit() can soon be removed...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ef2cf75e95ff382c65524a4d77eb00ab8411d2fc
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 15:19:52 2013 +1000

    eventscripts: Might as well try to stat the reclock file first
    
    It is in the background but it still might cause the counter to be
    reset before it is checked.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 136abd4604dc68f7c696704bac708bae53cf1940
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 15:16:44 2013 +1000

    eventscripts: Make the early exit in 01.reclock earlier
    
    That way we don't even check the counter...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 25ef4f655f1efc833deb5e244f9fff461e92f439
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon May 6 16:23:25 2013 +1000

    eventscripts: Minor cleanups for killtcp/tickle functions
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 60a08eb96e1d97aab31e9bd4af01683c650541c2
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 11:39:46 2013 +1000

    eventscripts: Tweak the timeout check in kill_tcp_connections()
    
    This has 2 advantages:
    
    1. It uses get_tcp_connections_for_ip() to check for leftover
       connections, instead of custom code.
    
    2. It checks for the timeout condition before sleeping.  The current
       code sleeps and then checks, so wastes a second.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 319c1b68d5aa78f82a68febcad233a7c78afc887
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 06:31:30 2013 +1000

    eventscripts: In killtcp/tickle functions, $_failed should be boolean
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8514ca56830b30e7f0eb5018632640daaf8ff65d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 06:27:58 2013 +1000

    eventscripts: Remove unused $_killcount from tickle_tcp_connections()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a621622903c7ef17764b15293d6ea8df5a53c7e1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 06:25:26 2013 +1000

    eventscripts: Refactor connection listing in killtcp and tickle functions
    
    Uses new function get_tcp_connections_for_ip().  This avoids using a
    temporary file and running netstat twice.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 10e4db8f796d1e3259733180494db3b4bbad291a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 06:19:18 2013 +1000

    eventscripts: Reimplement kill_tcp_connections_local_only()
    
    ... using kill_tcp_connections()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 23c0f5f48e3e5a0c1a3254c582299f7893cf0d33
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 06:14:01 2013 +1000

    eventscripts: Change handling of one-way kills in kill_tcp_connections()
    
    This change is a no-op.  However, in a subsequent commit we'll merge
    kill_tcp_connections_local_only() with this function.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3eae161472e6352f7f656851c73dc056f95113eb
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 06:05:52 2013 +1000

    eventscripts: Remove unnecessary variables from killtcp/tickle functions
    
    Setting these variables spawns lots of unnecessary processes, which
    would surely slow down these functions on a busy system.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9e25fb261447a196de05937052779b36e75e7215
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 03:54:17 2013 +1000

    eventscripts: Clean up ctdb_check_command()
    
    * Command is now multiple arguments, preserving quoting
    * $service_name no longer printed, no longer an argument
    * Debug output from failed command
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
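
    The first and third points above can be illustrated with a small
    sketch.  check_command() here is a hypothetical stand-in, not the
    ctdb_check_command() from the functions file, and the error text is
    invented for the example.

```shell
# Hypothetical sketch, not the function from the CTDB functions file.
# Usage: check_command <command> [args...]
check_command ()
{
    # Run the command exactly as given; "$@" keeps each argument
    # intact, so arguments containing spaces are not re-split.
    _out=$("$@" 2>&1) && return 0

    echo "ERROR: $* returned error"
    # Show the output of the failed command to help debugging
    [ -z "$_out" ] || echo "$_out"
    return 1
}
```

    Passing the command as separate arguments, rather than as a single
    string to be re-parsed, is what lets a caller write something like
    check_command test -d "/path with spaces" safely.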

commit d9e6cb945c5edac9ca6405c9228bf647fab814f5
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 03:48:51 2013 +1000

    eventscripts: Clean up ctdb_check_directories()
    
    Fix the documentation comments, which are wrong, and remove the
    optional $service_name argument.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3d0a7d83ddc824961d876fc9afba829c90aef3e7
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 03:45:21 2013 +1000

    eventscripts: Assert that $service_name is set in a few key places
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit fff88940f71058e4eefd65f50a6701389c005c17
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 15:31:27 2013 +1000

    eventscripts: counters default to $script_name if $service_name not set
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 27aab8783898a50da8c4bc887b512d8f0c0d842c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 03:32:29 2013 +1000

    eventscripts: Simplify handling of $service_name in "managed" functions
    
    Complicated argument handling was introduced to deal with multiple
    services per eventscript.  This was a failure and we split 50.samba.
    
    This simplifies several functions to use global $service_name
    unconditionally instead of having an optional argument.
    
    $service_name is no longer automatically set in the functions file.
    This means it needs to be explicitly set in 13.per_ip_routing because
    this script uses ctdb_service_check_reconfigure().
    
    Eventscript unit test infrastructure needs to set $service_name during
    fake service setup, and policy routing tests need to be updated
    accordingly.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b5802c4735e1c719a5cf9ce69489d5947bd5e8c5
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 03:18:01 2013 +1000

    eventscripts: Simplify handling of $service_name in start/stop functions
    
    Complicated argument handling was introduced to deal with multiple
    services per eventscript.  This was a failure and we split 50.samba.
    
    This simplifies several functions to use global $service_name
    unconditionally instead of having an optional argument.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e24baac0d2952e86d5ff31235901f06e2f2b2449
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 03:13:36 2013 +1000

    eventscripts: Simplify handling of $service_name in service_management
    
    Complicated argument handling was introduced to deal with multiple
    services per eventscript.  This was a failure and we split 50.samba.
    
    This simplifies several functions to use global $service_name
    unconditionally instead of having an optional argument.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c2ea72ff565222f9edab408638bd45dbba6e8ff7
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 02:59:41 2013 +1000

    eventscripts: Simplify handling of $service_name in reconfigure functions
    
    Complicated argument handling was introduced to deal with multiple
    services per eventscript.  This was a failure and we split 50.samba.
    
    This simplifies several functions to use global $service_name
    unconditionally instead of having an optional argument.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit fd536a26b310b5bf9628da62cca0b425f4a54030
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Apr 24 17:14:32 2013 +1000

    eventscripts: Remove unused function ctdb_check_counter_equal()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9dee4c84273633b9ad82e94dabbf0e6f86edbcef
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 13:56:15 2013 +1000

    scripts: Fix script_log() regression
    
    5940a2494e9e43a83f2bca098bd04dfc1a8f2e93 makes script_log() always
    pass a message to logger, so script_log() can no longer log stdin.
    
    Put all the tag fu in the actual tag so the message argument is empty
    if no message was passed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
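
    The regression and the fix can be illustrated with a short sketch.
    It is not the script_log() from the CTDB scripts: the $LOGGER
    override is a hypothetical hook for experimentation, and the sketch
    relies on the behaviour of common logger implementations (for
    example util-linux), which read the message from stdin when no
    message operands are given.

```shell
# Hypothetical sketch, not the script_log() from the CTDB scripts.
# LOGGER is an assumed override hook; it defaults to logger(1).
LOGGER="${LOGGER:-logger}"

# Usage: script_log <tag> [message...]
script_log ()
{
    _tag="$1" ; shift
    # "$@" expands to zero words when no message was passed, so the
    # logger command then falls back to reading stdin.  Always adding
    # a message word here would stop stdin from being logged.
    $LOGGER -t "$_tag" "$@"
}
```

    Keeping any extra decoration in the tag, and leaving the message
    arguments untouched, preserves both calling styles: an explicit
    message, or output piped in from another command.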

commit c74cc0442eb90d859eae270b59456d28605817c4
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 23 13:49:28 2013 +1000

    initscript: Look for tdbtool/tdbdump using which, not in fixed locations
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit cd87ba85fc6c375758c7d3dfa8dbd4d8a02074b0
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 22 14:55:33 2013 +1000

    ctdbd: Log CTDB startup before creating the PID file
    
    Otherwise the messages are in a stupid order...  :-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reported-by: Amitay Isaacs <amitay at gmail.com>

commit c2bb8596a8af6406ef50e53953884df9d6246a96
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Feb 21 14:28:13 2013 +1100

    ctdbd: Remove the "stopped" event
    
    It isn't used; it has been superseded by "ipreallocated".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 978d4a0d6d8c9877b23f72e3a7b78c1245d16908
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Feb 21 14:17:09 2013 +1100

    eventscripts: Remove use of "stopped" event
    
    Use "ipreallocated" instead.  The "stopped" event pre-dates the
    "ipreallocated" event.  The only way of stopping a node is via the
    ctdb tool, which explicitly causes a takeover run to occur after the
    node is stopped.  The takeover run will generate an "ipreallocated"
    event.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 83b61f7414b1f7a3424497ac987ca0724fba9eaa
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Feb 21 13:13:09 2013 +1100

    recoverd: ctdb_takeover_run() uses CTDB_CONTROL_IPREALLOCATED
    
    This means "ipreallocated" is now run on stopped nodes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 27a44685f0d7a88804b61a1542bb42adc8f88cb1
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Apr 19 13:05:02 2013 +1000

    ctdbd: New control CTDB_CONTROL_IPREALLOCATED
    
    This is an alternative to using ctdb_run_eventscripts() that can be
    used when in recovery.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 05f785b51cfd8b22b3ae35bf034127fbc07005be
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 17:22:23 2013 +1000

    ctdbd: Avoid freeing non-monitor event callback when monitoring is disabled
    
    When running a non-monitor event, a check is made for any active
    monitor events.  If there is an active monitor event, then it is
    cancelled.  This is done by freeing state->callback, which is
    allocated from monitor_context.
    
    When CTDB is stopped or shut down, monitoring is disabled by freeing
    monitor_context, which frees the callback, and then the stopped or
    shutdown event is run.  This creates a new callback structure, which
    is allocated at the exact same memory location as the monitor callback
    that was just freed.  So the check for active monitor events frees the
    new callback, which belongs to the non-monitor event.  Since the
    callback function is what flags successful completion of that event,
    the event is never marked complete and CTDB is stuck in a loop waiting
    for completion.
    
    Move the monitor cancellation to the top of the function so that this
    can't happen.
    
    The following log snippet highlights the problem.
    
    2013/04/30 16:54:10.673807 [21505]: Received SHUTDOWN command. Stopping CTDB daemon.
    2013/04/30 16:54:10.673814 [21505]: Shutting down recovery daemon
    2013/04/30 16:54:10.673852 [21505]: server/eventscript.c:696 in remove_callback 0x1c6d5c0
    2013/04/30 16:54:10.673858 [21505]: Monitoring has been stopped
    2013/04/30 16:54:10.673899 [21505]: server/eventscript.c:594 Sending SIGTERM to child pid:23847
    2013/04/30 16:54:10.673913 [21505]: server/eventscript.c:629 searching for callback 0x1c6d5c0
    2013/04/30 16:54:10.673932 [21505]: server/eventscript.c:641 running callback
    2013/04/30 16:54:10.673939 [21505]: server/eventscript.c:866 in event_script_callback
    2013/04/30 16:54:10.673946 [21505]: server/eventscript.c:696 in remove_callback 0x1c6d5c0
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Feb 21 10:43:35 2013 +1100

    recoverd: Interface reference count changes should not cause takeover runs
    
    At the moment a naive comparison of all the interface data is done.
    So, if any IPs move then the reference counts for the relevant
    interfaces change, the interfaces appear to have changed and another
    takeover run is initiated by each node that took/released IPs.
    
    This change stops the spurious takeover runs by changing the interface
    comparison to ignore the reference counts.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
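
    The comparison described above can be illustrated with a small model.
    This is only a sketch, not the recoverd code: the Iface type and its
    field names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iface:
    """Hypothetical view of one interface as reported by a node."""
    name: str
    link_up: bool
    references: int  # number of public IPs currently on the interface


def ifaces_changed(old, new):
    """Return True if the interfaces differ in anything but reference counts."""
    def strip(ifaces):
        # Keep only the fields that should trigger a takeover run.
        return [(i.name, i.link_up) for i in ifaces]
    return strip(old) != strip(new)


before = [Iface("eth0", True, 2), Iface("eth1", True, 1)]
after_ip_move = [Iface("eth0", True, 1), Iface("eth1", True, 2)]
after_link_down = [Iface("eth0", False, 2), Iface("eth1", True, 1)]

print(ifaces_changed(before, after_ip_move))    # only reference counts moved
print(ifaces_changed(before, after_link_down))  # link state changed
```

    With this comparison an IP move on its own no longer looks like an
    interface change, while a link state change still does.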

commit b5a8791268e938d7e017056e0e2bd2cbec1fa690
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 19 16:24:32 2013 +0200

    recover: use CTDB_REC_RO_FLAGS where appropriate
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit c7eab97c7a939710b73aae2d75b404b235a998f5
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 19 16:23:16 2013 +0200

    ctdb_daemon: use CTDB_REC_RO_FLAGS where appropriate
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit f99eb2f56d8ca27110a45ae0e1c4bff40ac7a60e
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 19 16:22:49 2013 +0200

    ctdb_call: use CTDB_REC_RO_FLAGS where appropriate
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit a62775334aa20d1d850d2df705eb70303b04ac5c
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 19 16:09:34 2013 +0200

    vacuum: use CTDB_REC_RO_FLAGS in the vacuuming code
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 61f17e53576197def46bc61fdf0cdb5282333a3e
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 19 15:55:38 2013 +0200

    ltdb_server: use CTDB_REC_RO_FLAGS where appropriate
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit c7924ce6404bb18641b00d5fbd2fe9da9aaf7959
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 19 16:01:45 2013 +0200

    include: define CTDB_REC_RO_FLAGS - all read-only related record flags
    
    This is used for some checks
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 61264debba58355b9716ac1637fdedef5ed249c8
Author: Michael Adam <obnox at samba.org>
Date:   Fri Feb 22 16:12:17 2013 +0100

    vacuum: Update (C)
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 06de786c786f1cab4c6721adf47c2cb1e8a72adb
Author: Michael Adam <obnox at samba.org>
Date:   Sat Dec 29 17:23:27 2012 +0100

    vacuum: extend the header comment for ctdb_process_delete_list()
    
    Describe the (new) process more precisely.
    Also mention that it is the last step of the vacuuming process
    that is performed on the lmaster.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit eee23d44b6427be8ab49bbfcee3abb62f37dfcc7
Author: Michael Adam <obnox at samba.org>
Date:   Sat Jan 5 01:20:18 2013 +0100

    vacuum: turn the vacuuming on lmaster into a three-phase process.
    
    More precisely, before locally deleting an empty record that has been
    migrated with data and that we are dmaster and lmaster for, we now perform
    the deletion on the other nodes in two steps instead of a single step.
    
    - First send out the list of records to be deleted to all
      other nodes with the new RECEIVE_RECORDS control to store
      the lmaster's current empty copy.
    - Then send those records that could be deleted on all nodes
      to all nodes again with the TRY_DELETE_RECORDS control
      as before for deletion.
    - Finally delete those records locally that were successfully
      deleted remotely in the previous step.
    
    This fixes an old race where a recovery that hits the vacuum process
    square between the eyes can create gaps in the record's history and
    hence let the records resurrect. In the case of the locking.tdb,
    that could mean that a file that was already closed, was recorded as
    being open and locked again, so samba clients were locked out of that
    file until samba was restarted.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>
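
    The three steps listed above can be sketched as a toy model.  This is
    not the vacuuming code: the Node class, its refuse_* parameters and
    the function name are hypothetical, and the two methods merely stand
    in for the RECEIVE_RECORDS and TRY_DELETE_RECORDS controls.

```python
class Node:
    """Hypothetical remote node that may refuse some keys in either step."""

    def __init__(self, refuse_store=(), refuse_delete=()):
        self.db = {}
        self.refuse_store = set(refuse_store)
        self.refuse_delete = set(refuse_delete)

    def receive_records(self, keys):
        """Stand-in for RECEIVE_RECORDS: store the lmaster's empty copy."""
        stored = set(keys) - self.refuse_store
        for key in stored:
            self.db[key] = b""
        return stored

    def try_delete_records(self, keys):
        """Stand-in for TRY_DELETE_RECORDS: delete where possible."""
        deleted = set(keys) - self.refuse_delete
        for key in deleted:
            self.db.pop(key, None)
        return deleted


def vacuum_on_lmaster(local_db, candidates, other_nodes):
    keys = set(candidates)
    # Step 1: every other node stores the empty copy; keep keys stored everywhere.
    stored = [node.receive_records(keys) for node in other_nodes]
    keys = keys.intersection(*stored)
    # Step 2: ask every other node to delete; keep keys deleted everywhere.
    deleted = [node.try_delete_records(keys) for node in other_nodes]
    keys = keys.intersection(*deleted)
    # Step 3: delete locally only what was deleted on all other nodes.
    for key in keys:
        local_db.pop(key, None)
    return keys


local = {"a": b"", "b": b"", "c": b""}
nodes = [Node(), Node(refuse_store={"b"}), Node(refuse_delete={"c"})]
vacuumed = vacuum_on_lmaster(local, ["a", "b", "c"], nodes)
print(sorted(vacuumed), sorted(local))
```

    In this model a record survives locally as soon as any other node
    fails either step, so the lmaster never drops its copy first.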

commit e397702e271af38204fd99733bbeba7c1db3a999
Author: Michael Adam <obnox at samba.org>
Date:   Fri Dec 21 00:24:47 2012 +0100

    vacuum: introduce the RECEIVE_RECORDS control
    
    This is in preparation for turning the vacuuming on the lmaster
    into a two-phase process:
    
    - First the node sends the list of records to be vacuumed
      to all other nodes with this new RECEIVE_RECORDS control.
      The remote nodes should store the lmaster's empty current copy.
    - Only those records that could be stored on all other nodes
      are processed further. They are sent to all other nodes with
      the TRY_DELETE_RECORDS control as before for deletion.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit e3740899c1af6962f93c85ad7d1cb71bddce45c6
Author: Michael Adam <obnox at samba.org>
Date:   Sat Dec 29 18:32:39 2012 +0100

    vacuum: reorder some of ctdb_process_delete_list() more intuitively
    
    Now that the nodemap and its talloc children don't hang off of the
    delete_records_list talloc context, we can build the nodemap
    earlier, and move the construction of the delete_records_list
    to where it is more obvious what it is used for.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit b7c3b8cdf92c597e621e3dae28b110d321de5ea8
Author: Michael Adam <obnox at samba.org>
Date:   Sat Dec 29 17:16:33 2012 +0100

    vacuum: add explicit temporary memory context to ctdb_process_delete_list()
    
    This removes the implicit artificial talloc hierarchy and makes the
    code easier to understand.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 59a887e12469266e514ad7d4e34810e7ea888ba3
Author: Michael Adam <obnox at samba.org>
Date:   Sat Jan 5 01:19:06 2013 +0100

    vacuum: fix indentation in ctdb_process_delete_list()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 11d728465a9c635e1829abaae17e2f7720433b69
Author: Michael Adam <obnox at samba.org>
Date:   Mon Dec 17 17:31:55 2012 +0100

    vacuum: free temporary allocated memory correctly in ctdb_process_delete_list().
    
    Add a common exit point for cleanup.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 3710dd0f313f551f1b302b4961e0203243e3d661
Author: Michael Adam <obnox at samba.org>
Date:   Mon Dec 17 17:26:22 2012 +0100

    vacuum: move variable into scope of use in ctdb_process_delete_list()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 4640979b526b6dac69a6a0555bfce75fe0206dac
Author: Michael Adam <obnox at samba.org>
Date:   Mon Dec 17 13:07:21 2012 +0100

    vacuum: move variable into scope of use in ctdb_process_delete_list()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit f3e6e7f8ef22bd70dd2f101d818e2e5ab5ed3cd8
Author: Michael Adam <obnox at samba.org>
Date:   Mon Dec 17 13:03:42 2012 +0100

    vacuum: simplify ctdb_process_delete_list(): reduce indentation
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 817c77a3d0a3546bf46389cec5f6b54778dd1693
Author: Michael Adam <obnox at samba.org>
Date:   Wed Apr 3 14:12:27 2013 +0200

    vacuum: add DEBUG to skip conditions in delete_record_traverse()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit 3f7e35ff0db740cdcb6d27c43a59bb6ca6066efb
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 5 17:14:43 2013 +0200

    vacuum: break line for RO-flags check in delete_record_traverse() for readability
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit e72a5e11845fe445baaee4730bb0bea8588ee9e3
Author: Michael Adam <obnox at samba.org>
Date:   Mon Apr 22 10:21:02 2013 -0400

    client: fix ctdb_control() to be able to cope with CTDB_CTRL_FLAG_NOREPLY
    
    This was apparently not used before in this context, and the bug hence
    not detected. It becomes necessary when ctdb_local_schedule_for_deletion()
    is called from a client ctdbd (the vacuuming child), hence needs to send
    the SCHEDULE_FOR_DELETION control to its parent.
    
    Pair-Programmed-With: Stefan Metzmacher <metze at samba.org>
    
    Signed-off-by: Stefan Metzmacher <metze at samba.org>
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit dc4ca816630ed44b419108da53421331243fb8c7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Apr 19 13:29:04 2013 +1000

    ctdbd: Set num_clients statistic from ctdb->num_clients
    
    This fixes the problem of "ctdb statisticsreset" clearing the number of
    clients even when there are active clients.
    
    Values returned in statistics for frozen, recovering, memory_used are based on
    the current state of CTDB and are not maintained as statistics.  This should
    include num_clients as well.
    
    Currently ctdb->num_clients is unused. So use that to track the number of
    clients and fill in statistics field only when requested.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
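
    The idea of reporting current state separately from resettable
    counters can be shown with a short model.  It is a sketch only: the
    Daemon class and the "total_calls" counter are assumptions for the
    example, not ctdbd structures.

```python
class Daemon:
    """Toy daemon: num_clients is live state, not a resettable counter."""

    def __init__(self):
        self.num_clients = 0
        self.statistics = {"total_calls": 0}  # hypothetical counter

    def client_connected(self):
        self.num_clients += 1

    def client_disconnected(self):
        self.num_clients -= 1

    def statistics_reset(self):
        # Only the accumulated counters are cleared.
        for key in self.statistics:
            self.statistics[key] = 0

    def get_statistics(self):
        # Fill in the state-derived field only when statistics are requested.
        snapshot = dict(self.statistics)
        snapshot["num_clients"] = self.num_clients
        return snapshot


daemon = Daemon()
daemon.client_connected()
daemon.client_connected()
daemon.statistics["total_calls"] = 7
daemon.statistics_reset()
stats = daemon.get_statistics()
print(stats)
```

    After the reset the counter is back at zero, but the two connected
    clients are still reported.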

commit bfed6a8d1771db3401d12b819204736c33acb312
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 22 13:52:04 2013 +1000

    ctdbd: Log PID file creation and removal at NOTICE level
    
    Unexpected removal of this file can have serious consequences, so it
    is best if this is logged at the default level.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5940a2494e9e43a83f2bca098bd04dfc1a8f2e93
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 22 13:48:06 2013 +1000

    scripts: Ensure even external scripts get tagged in logs as "ctdbd"
    
    Our practice is to search logs for "ctdbd:".  We want to make sure we
    find everything.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0076cfc4666e5a96eb2c8affb59585b090840e00
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 22 06:52:49 2013 +1000

    eventscripts: Ensure directories are created
    
    Previous commits stopped the top level of the script from creating
    certain directories but some functions assume that required
    directories exist.
    
    Create those directories instead.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 700cf95a1f29b4b88460a00a55d57a9e397011e0
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Apr 17 13:26:04 2013 +1000

    scripts: Clean up update_tickles() and handling of associated directory
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 85efa446c7f5c5af1c3a960001aa777775ae562f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Apr 17 13:12:32 2013 +1000

    scripts: Use $CTDB_SCRIPT_DEBUGLEVEL instead of something more complex
    
    The current logic is horrible and creates an unnecessary file.  Let's
    make the script debug level independent of ctdbd's debug level.
    
    * Have debug() use $CTDB_SCRIPT_DEBUGLEVEL directly
    
    * Remove ctdb_set_current_debuglevel()
    
    * Remove the "getdebug" command from ctdb stub in eventscript unit
      tests
    
    * Update relevant eventscript unit tests to use
      $CTDB_SCRIPT_DEBUGLEVEL
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d254d03f69cbdc3e473202b759af6e1392cbb59c
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Apr 19 13:10:27 2013 +1000

    scripts: Ensure service command is in $PATH in ctdb-crash-cleanup.sh
    
    Move the use of the service command below inclusion of functions file,
    which sets $PATH.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e7a4b7e35a1e4b826846e2494a3803abb57065ee
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 15 19:15:22 2013 +1000

    initscript: Remove duplicate setting of $ctdbd
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 1e989894764e4cd1d551c44784d91cb295cd790d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 16 11:40:55 2013 +1000

    util: Removed unused declaration of ctdbd_start()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit abb64f62efaa70df4b87c030b96300eafd98e6a3
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 15 13:31:42 2013 +1000

    include: Move ctdb_start_daemon() from ctdb_client.h to ctdb_private.h
    
    It really is internal.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 90cb337e5ccf397b69a64298559a428ff508f196
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 15 15:42:55 2013 +1000

    scripts: ctdb-crash-cleanup.sh uses initscript to see if ctdbd is running
    
    "ctdb ping" can time out.  How many times should we try?
    
    Instead, depend on the initscript to implement something sane.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 687e2eace4f48400cf5029914f62b6ddabb85378
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 15 15:18:12 2013 +1000

    initscript: Use a PID file to implement the "status" option
    
    Using "ctdb ping" and "ctdb status" is fraught with danger.  These
    commands can time out when ctdbd is running, leading callers to believe
    that ctdbd is not running.  Timeouts could be increased but we would
    still have to handle potential timeouts.
    
    Everything else in the world implements the "status" option by
    checking if the relevant process is running.  This change makes CTDB
    do the same thing and uses standard distro functions.
    
    This change is backward compatible in the sense that a missing
    /var/run/ctdb/ directory means that we don't do a PID file check but
    just depend on the distro's checking method.  Therefore, if CTDB was
    started with an older version of this script then "service ctdb
    status" will still work.
    
    This script does not support changing the value of CTDB_VALGRIND
    between calls.  If you start with CTDB_VALGRIND=yes then you need to
    check status with the same setting.  CTDB_VALGRIND is a debug
    variable, so this is acceptable.
    
    This also adds sourcing of /lib/lsb/init-functions to make the Debian
    function status_of_proc() available.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>
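
    A PID-file based status check, as opposed to querying the daemon, can
    be sketched as follows.  The initscript relies on the distro's own
    functions; this example instead uses the generic "signal 0" probe, and
    the function and file names are hypothetical.

```python
import os
import tempfile


def pidfile_running(pidfile):
    """Return True if the PID recorded in pidfile belongs to a live process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False          # no PID file, or unreadable content
    if pid <= 0:
        return False          # never signal a process group by accident
    try:
        os.kill(pid, 0)       # signal 0: existence check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True           # process exists but belongs to another user
    return True


with tempfile.TemporaryDirectory() as tmp:
    pidfile = os.path.join(tmp, "ctdbd.pid")  # hypothetical file name
    with open(pidfile, "w") as f:
        f.write(str(os.getpid()))
    running = pidfile_running(pidfile)
    missing = pidfile_running(os.path.join(tmp, "absent.pid"))

print(running, missing)
```

    Unlike "ctdb ping", a check of this kind cannot time out while the
    daemon is busy, which is the motivation given above.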

commit 996e74d3db0c50f91b320af8ab7c43ea6b1136af
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 15 13:32:57 2013 +1000

    ctdbd: Add --pidfile option
    
    Default is not to create a pid file.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit ba8866d40125bab06391a17d48ff06a4a9f9da89
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 15 16:14:40 2013 +1000

    util: ctdb_fork() should call ctdb_set_child_info()
    
    For now we pass NULL as the child name.  Later we'll give ctdb_fork()
    and friends an extra argument and pass that through.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 16 11:11:11 2013 +1000

    util: New functions ctdb_set_child_info() and ctdb_is_child_process()
    
    Must be called by all child processes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 06ac62f890299021220214327f1b611c3cf00145
Author: Michael Adam <obnox at samba.org>
Date:   Wed Apr 17 13:08:49 2013 +0200

    tests: add a comment to recovery db corruption test
    
    The comment explains that we use "ctdb stop" and "ctdb continue"
    but we should use "ctdb setrecmasterrole off".
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit b1577a11d548479ff1a05702d106af9465921ad4
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Apr 11 16:59:36 2013 +1000

    tests: Add a test for subsequent recoveries corrupting databases
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 2438f3a4944f7adbcae4cc1b9d5452714244afe7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Apr 11 16:58:34 2013 +1000

    tests: Support waiting for "recovered" state in tests
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit cad3107b12e8392f786f9a758ee38cf3a3d58538
Author: Michael Adam <obnox at samba.org>
Date:   Wed Apr 3 12:02:59 2013 +0200

    ctdb_call: don't bump the rsn in ctdb_become_dmaster() any more
    
    This is now done in ctdb_ltdb_store_server(), so this
    extra bump can be spared.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>

commit feb1d40b21a160737aead22e398f3c34ff3be8de
Author: Michael Adam <obnox at samba.org>
Date:   Wed Apr 3 11:40:25 2013 +0200

    Fix a severe recovery bug that can lead to data corruption for SMB clients.
    
    Problem:
    Recovery can under certain circumstances lead to old record copies
    resurrecting: Recovery selects the newest record copy purely by RSN. At
    the end of the recovery, the recovery master is the dmaster for all
    records in all (non-persistent) databases. And the other nodes locally
    hold the complete copy of the databases. The bug is that the recovery
    process does not increment the RSN on the recovery master at the end of
    the recovery. Now clients acting directly on the recovery master will
    directly change a record's content on the recmaster without migration
    and hence without RSN bump.  So a subsequent recovery can not tell that
    the recmaster's copy is newer than the copies on the other nodes, since
    their RSN is the same. Hence, if the recmaster is not node 0 (or more
    precisely not the active node with the lowest node number), the recovery
    will choose copies from nodes with lower number and stick to these.
    
    Here is how to reproduce:
    
    - assume we have a cluster with at least 2 nodes
    - ensure that the recmaster is not node 0
      (maybe ensure with "onnode 0 ctdb setrecmasterrole off")
      say recmaster is node 1
    - choose a new database name, say "test1.tdb"
      (make sure it is not yet attached as persistent)
    - choose a key name, say "key1"
    - all cluster nodes should be OK and no recovery should be running
    - now do the following on node 1:
    
    1. dbwrap_tool test1.tdb store key1 uint32 1
    2. dbwrap_tool test1.tdb fetch key1 uint32
       ==> 1
    3. ctdb recover
    4. dbwrap_tool test1.tdb store key1 uint32 2
    5. dbwrap_tool test1.tdb fetch key1 uint32
       ==> 2
    6. ctdb recover
    7. dbwrap_tool test1.tdb fetch key1 uint32
       ==> 1
       ==> BUG
    
    This is a very severe bug, since when applied to Samba's locking.tdb
    database, it means that for SMB clients on clustered Samba there is
    the potential for locking out oneself from previously opened files
    or even worse, data corruption:
    
    Case 1: locking out
    
    - client on recmaster opens file
    - recovery propagates open file handle (entry in locking.tdb) to
      other nodes
    - client closes file
    - client opens the same file
    - recovery resurrects old copy of open file record in locking.tdb
      from lower node
    - client closes file but fails to delete entry in locking.tdb
    - client tries to open same file again but fails, since
      the old record locks it out (since the client is still connected)
    
    Case 2: data corruption
    
    - client1 on recmaster opens file
    - recovery propagates open file info to other nodes
    - client1 closes the file and disconnects
    - client2 opens the same file
    - recovery resurrects old copy of locking.tdb record,
      where client2 has no entry, but client1 has.
    - but client2 believes it still has a handle
    - client3 opens the file and succeeds without
      conflicting with client2
      (the detached entry for client1 is discarded because
       the server does not exist any more).
    => both client2 and client3 believe they have exclusive
      access to the file and writing creates data corruption
    
    Fix:
    
    When storing a record on the dmaster, bump its RSN.
    
    The ctdb_ltdb_store_server() is the central function for storing
    a record to a local tdb from the ctdbd server context.
    So this is also the place where the RSN of the record to be stored
    should be incremented, when storing on the dmaster.
    
    For the case of the record migration, this is currently done in
    ctdb_become_dmaster() in ctdb_call.c, but there are other places
    such as in recovery, where we should bump the RSN, but currently
    don't do it.
    
    So moving the RSN incrementation into ctdb_ltdb_store_server fixes
    the recovery-record-resurrection bug.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-By: Amitay Isaacs <amitay at gmail.com>
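
    The problem and the fix can be reproduced in a toy model.  This is a
    sketch under the assumptions stated in the message above (recovery
    picks the highest RSN and, on a tie, the lowest node number); the
    function names are hypothetical and this is not CTDB code.

```python
def recover(copies):
    """copies: {node_number: (rsn, value)}. Return the winning (rsn, value).

    The highest RSN wins; on a tie the lowest node number wins.
    """
    winner = min(copies, key=lambda node: (-copies[node][0], node))
    return copies[winner]


def store_on_dmaster(record, value, bump_rsn):
    """Local store on the dmaster; bump_rsn=True models the fix."""
    rsn, _ = record
    return (rsn + 1 if bump_rsn else rsn, value)


def scenario(bump_rsn):
    recmaster = 1
    # State after the first recovery: every node holds the same copy.
    copies = {0: (1, 1), 1: (1, 1)}
    # A client on the recmaster stores a new value without migration.
    copies[recmaster] = store_on_dmaster(copies[recmaster], 2, bump_rsn)
    # The second recovery selects a copy.
    return recover(copies)[1]


print(scenario(bump_rsn=False))  # 1: the old value from node 0 is resurrected
print(scenario(bump_rsn=True))   # 2: the recmaster's newer copy wins
```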

commit 4c0cbfbe8b19f2e6fe17093b52c734bec63dd8b7
Author: Michael Adam <obnox at samba.org>
Date:   Mon Apr 15 12:50:42 2013 +0200

    logging: fix comment typo
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 2e92deef5221ee651028ef87138b3113f1fece91
Author: Michael Adam <obnox at samba.org>
Date:   Wed Apr 3 14:03:32 2013 +0200

    ctdbd: unimplement the unused SET_DMASTER control
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 9f01b8db72780acf2f88f1392bc0a796dd4c6176
Author: Michael Adam <obnox at samba.org>
Date:   Fri Mar 22 17:48:00 2013 +0100

    recoverd: remove bogus comment "qqq" from "add prototype new banning code"
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit e96acf19b4d1e0f951ab92b88869a01ff06398be
Author: Michael Adam <obnox at samba.org>
Date:   Fri Apr 5 16:55:18 2013 +0200

    build: silence building of porting_test
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 5808f0778b39b79ab7a5c7f53ad27947131386ec
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Apr 11 13:20:09 2013 +1000

    traverse: Ensure backward compatibility for CTDB_CONTROL_TRAVERSE_ALL
    
    This makes sure that CTDB_CONTROL_TRAVERSE_ALL is compatible with older versions
    of CTDB (i.e. 1.2.39 and 1.2.40 branches).
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit e691df43d20871468142c8fb83f7c7303c4ec307
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Apr 11 13:18:36 2013 +1000

    traverse: Add CTDB_CONTROL_TRAVERSE_ALL_EXT to support withemptyrecords
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 043e18a8324ccb2c8ddd7b323ebedb5b0de1298d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Apr 11 16:58:59 2013 +1000

    tests: Fix typo in variable name
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 35264e42ade4676468cf7713fa339c784e932953
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Mar 27 12:32:43 2013 +1100

    tools/ltdbtool: Fix handling of -e option
    
    Also, include description of -e option in usage.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1c7adbccc69ac276d2b957ad16c3802fdb8868ca
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Apr 5 13:34:06 2013 +1100

    recoverd/takeover: Use IP->node mapping info from nodes hosting that IP
    
    When collating IP information for IP layout, only trust the nodes that are
    hosting an IP, to have correct information about that IP.  Ignore what all the
    other nodes think.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
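
    The collation rule described above can be sketched as a filter over
    per-node reports.  The data layout and the function name are
    assumptions for illustration and do not mirror the takeover code.

```python
def collate_ip_assignments(reports):
    """reports: {reporting_node: {ip: node_believed_to_host_ip}}.

    Keep only the claims in which a node says it hosts the IP itself;
    what other nodes believe about that IP is ignored.
    """
    assignments = {}
    for node, ips in reports.items():
        for ip, host in ips.items():
            if host == node:
                assignments[ip] = node
    return assignments


reports = {
    0: {"10.0.0.1": 0, "10.0.0.2": 2},  # node 0 has a stale view of 10.0.0.2
    1: {"10.0.0.1": 0, "10.0.0.2": 1},  # node 1 actually hosts 10.0.0.2
}
layout = collate_ip_assignments(reports)
print(layout)
```

    Here node 0's stale belief about 10.0.0.2 is dropped because node 0
    does not host that address itself.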

commit fe8c4880b371492a38554868d4ca10918c54e412
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Apr 3 14:44:08 2013 +1100

    statd-callout: Make sure statd callout script always runs as root
    
    In RHEL 6+, rpc.statd runs as "rpcuser" instead of root as on RHEL 5. This
    prevents CTDB tool commands from talking to the daemon, since "rpcuser"
    cannot access the CTDB socket.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>

commit 524ec206e6a5e8b11723f4d8d1251ed5d84063b0
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Mar 18 13:45:08 2013 +1100

    client: Set the socket non-blocking only after connect succeeds
    
    If the socket is set non-blocking before connect, then we should catch
    EAGAIN errors and retry. Instead of adding a random number of retries,
    better to wait for connect to succeed and then set the socket to
    non-blocking.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
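
    The ordering described above (connect first, switch to non-blocking
    afterwards) is sketched below with a local Unix domain socket.  The
    socket path and the function name are made up for the example; this
    is not the client library code.

```python
import os
import socket
import tempfile


def connect_then_unblock(path):
    """Connect in blocking mode, then switch the socket to non-blocking."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)    # blocks until the connection succeeds or fails
    except OSError:
        sock.close()
        raise
    sock.setblocking(False)   # from here on, I/O may raise BlockingIOError
    return sock


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.socket")  # hypothetical socket path
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    client = connect_then_unblock(path)
    is_nonblocking = not client.getblocking()
    client.close()
    server.close()

print(is_nonblocking)
```

    Because connect() runs in blocking mode, there is no EAGAIN case to
    retry at that point, which is the simplification argued for above.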

commit 74acc2c568300ef42740cf11299a1b2507047f60
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Apr 5 13:19:34 2013 +1100

    Revert "client: handle transient connection errors"
    
    This reverts commit dc0c58547cd4b20a8e2cd21f3c8363f34fd03e75.
    
    There is a simpler solution than retrying a random number of times. Do not set
    the socket non-blocking until connect succeeds.

commit f7f8bde2376f8180a0dca6d7b8d7d2a4a12f4bd8
Author: Volker Lendecke <vl at samba.org>
Date:   Wed Apr 3 14:59:21 2013 +0200

    common/messaging: Use the jenkins hash in ctdb_message
    
    This gives a better hash distribution.

commit c137531fae8f7f6392746ce1b9ac6f219775fc29
Author: Volker Lendecke <vl at samba.org>
Date:   Fri Apr 5 13:11:31 2013 +1100

    common/messaging: use tdb_parse_record in message_list_db_fetch
    
    This avoids malloc/free in a hot code path.

commit bf7296ce9b98563bcb8426cd035dbeab6d884f59
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Apr 3 15:08:14 2013 +1100

    common/messaging: Abstract db related operations inside db functions
    
    This simplifies the use of message indexdb API and abstracts tdb related code
    inside the API.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 20be1f991dd75c2333c9ec9db226432a819f57ba
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 2 16:57:51 2013 +1100

    common/messaging: Don't forget to free the result returned by tdb_fetch()
    
    This fixes a memory leak in the messaging code.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4e1ec7412866f2d31c41de1bec0fbf788c03051b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Apr 2 12:08:39 2013 +1100

    common/messaging: Free message list header if all message handlers are freed
    
    This makes sure that even if the srvids are not deregistered, the header
    structure is freed when the last message handler has been freed as a result of
    client going away.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 85b777196289646ca37e06ebbf1f7a684d0aabc5
Author: Sumit Bose <sbose at redhat.com>
Date:   Mon Mar 25 12:28:31 2013 +0100

    build: Fix for tevent autoconf check
    
    The list of include files is the 4th argument of AC_CHECK_DECLS.

commit 307416afda707b687f5e89e8438e45c154a4c806
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Mar 13 22:57:44 2013 +1100

    util: Add hex_decode_talloc() to decode hex string into a binary blob
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
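
    A hex decoder of this kind might look like the sketch below.  It uses
    plain malloc() instead of talloc so that it is self-contained, and the
    name hex_decode_example() and its signature are hypothetical, not the
    signature of hex_decode_talloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a NUL-terminated hex string into a malloc'd binary blob.
 * Returns NULL on odd length, invalid digit or allocation failure.
 * On success *out_len holds the blob size; the caller frees the blob. */
uint8_t *hex_decode_example(const char *hex, size_t *out_len)
{
    size_t len, i;
    uint8_t *buf;

    assert(hex != NULL && out_len != NULL);

    len = strlen(hex);
    if (len % 2 != 0) {
        return NULL;
    }

    /* +1 so that an empty input still yields a valid allocation. */
    buf = malloc(len / 2 + 1);
    if (buf == NULL) {
        return NULL;
    }

    for (i = 0; i < len / 2; i++) {
        int hi = hex_nibble(hex[2 * i]);
        int lo = hex_nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) {
            free(buf);
            return NULL;
        }
        buf[i] = (uint8_t)((hi << 4) | lo);
    }

    *out_len = len / 2;
    return buf;
}
```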

commit 08c53ee609b80f87450a7a1d7dd24fbcdf5ab7bc
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Mar 13 11:46:18 2013 +1100

    logging: Do not ignore stdout/stderr from the exec'd children
    
    To log debugging information from child processes that are started
    with vfork and exec, do not set close_on_exec on STDOUT and STDERR for
    that process.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
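
    Clearing the close-on-exec flag is done with fcntl().  The helper names
    below are hypothetical; the sketch only shows the mechanism, not the
    CTDB code that applies it.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Set (enable != 0) or clear (enable == 0) FD_CLOEXEC on a descriptor.
 * Returns 0 on success, -1 on failure. */
int set_close_on_exec(int fd, int enable)
{
    int flags;

    assert(fd >= 0);

    flags = fcntl(fd, F_GETFD, 0);
    if (flags == -1) {
        return -1;
    }
    if (enable) {
        flags |= FD_CLOEXEC;
    } else {
        flags &= ~FD_CLOEXEC;
    }
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}

/* Make sure stdout and stderr survive exec(), so that output from the
 * exec'd child still reaches whatever is collecting it. */
int keep_stdio_for_exec(void)
{
    if (set_close_on_exec(STDOUT_FILENO, 0) == -1) {
        return -1;
    }
    return set_close_on_exec(STDERR_FILENO, 0);
}
```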

commit 87c89b7c2a14e2ee79a3efc7e8125842bc04bf23
Author: Michael Adam <obnox at samba.org>
Date:   Fri Feb 22 12:42:10 2013 +0100

    server:persistent: fix a debug message (copy'n'paste error)
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 98abd344342a011a8599411deae79f94abc09541
Author: Volker Lendecke <vl at samba.org>
Date:   Tue Mar 12 13:53:58 2013 +0100

    fix a typo
    
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 11734be353a1e246163eda631d35dfe55d1d6fb1
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Feb 22 12:59:39 2013 +1100

    common/io: For scheduling immediate events use tevent_schedule_immediate
    
    tevent_schedule_immediate() is much more efficient at handling events that need
    to be processed immediately rather than creating timed events with
    timeval_zero().
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 3e09f25d419635f6dd679b48fa65370f7860be7d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Feb 21 13:16:15 2013 +1100

    ctdbd: Add an index db for message list for faster searches
    
    When CTDB is busy with lots of smbd processes, it was spending too much
    time in daemon_check_srvids(), which searches a list of srvids in the
    registered message handlers.  Using a hash-based index makes these
    lookups significantly faster than searching the linked list.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
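
    The idea of indexing a linked list by hash can be sketched with a small
    in-memory table.  This is an assumption-laden illustration: the commit
    uses an index database, whereas the sketch uses a fixed array of bucket
    chains, and all type and function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define INDEX_BUCKETS 1024

struct handler {
    uint64_t srvid;
    struct handler *next;        /* registration list, linear to search */
    struct handler *bucket_next; /* chain within one hash bucket */
};

struct handler_table {
    struct handler *list;
    struct handler *buckets[INDEX_BUCKETS];
};

/* Map a srvid to a bucket; fold the high bits in before reducing. */
static size_t bucket_of(uint64_t srvid)
{
    srvid ^= srvid >> 33;
    return (size_t)(srvid % INDEX_BUCKETS);
}

/* Add a handler to both the list and the index.  Returns 0 or -1. */
int handler_register(struct handler_table *t, uint64_t srvid)
{
    struct handler *h;
    size_t b;

    assert(t != NULL);

    h = calloc(1, sizeof(*h));
    if (h == NULL) {
        return -1;
    }
    h->srvid = srvid;
    h->next = t->list;
    t->list = h;

    b = bucket_of(srvid);
    h->bucket_next = t->buckets[b];
    t->buckets[b] = h;
    return 0;
}

/* Look up a srvid by walking one bucket chain instead of the whole list. */
int handler_exists(const struct handler_table *t, uint64_t srvid)
{
    const struct handler *h;

    assert(t != NULL);

    for (h = t->buckets[bucket_of(srvid)]; h != NULL; h = h->bucket_next) {
        if (h->srvid == srvid) {
            return 1;
        }
    }
    return 0;
}
```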

commit 5402f85dde045576cbaf64e01c68e28ed52204e8
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Feb 27 16:01:55 2013 +1100

    tools/ctdb: delip no longer fails if IP can not be moved
    
    Moving the IP is an optimisation so should not cause failure.
    
    Refactor and simplify the retry-move-IP into new function
    try_moveip().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 6455ce5e4980a63d56ed30f7059869c8356c12ea
Author: Michael Adam <obnox at samba.org>
Date:   Fri Feb 22 11:36:00 2013 +0100

    server:persistent: fix a comment typo.
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 4f71dca8df19a63f198e2d6d59e605b49ec5e803
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Feb 18 16:39:00 2013 +1100

    recoverd: update_capabilities() should use connected nodes
    
    ... as the comment says... not just active nodes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit f505020a5720faa4ecc6414e0bfaa6b3c0e47291
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 19 14:30:50 2013 +1100

    client: Refactor node listing functions to use list_of_nodes()
    
    This reduces repetition.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit a73bb56991b8c07ed0e9517ffcf0dc264be30487
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 19 14:29:06 2013 +1100

    client: New generic node listing function list_of_nodes()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit d788bc8f7212b7dc1587ae592242dc8c876f4053
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jan 18 10:42:14 2013 +1100

    common/io: Rewrite socket handling code to read all available data
    
    This improves the processing of packets considerably.  It has been
    observed that there can be as many as 10 packets in the socket buffer, so
    the current code, which reads a single packet from a socket at a time, is
    far from optimal.  This change reads all the bytes from the socket buffer
    and then parses them to extract multiple packets.  If there are multiple
    packets, set up a timed event to process the next packet.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
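
    Extracting several packets from one read can be sketched as below.  The
    sketch assumes that each packet starts with a 32-bit length in host byte
    order that covers the whole packet, and it dispatches packets in a loop
    rather than through timed events.  The names parse_packets() and
    count_packet() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef void (*packet_cb)(const uint8_t *pkt, uint32_t len, void *private_data);

/* Example callback: count the packets seen. */
void count_packet(const uint8_t *pkt, uint32_t len, void *private_data)
{
    (void)pkt;
    (void)len;
    (*(int *)private_data)++;
}

/* Walk a buffer holding zero or more complete packets, possibly followed
 * by a partial one.  Calls cb for every complete packet and returns the
 * number of bytes consumed; the caller keeps the remainder for the next
 * read.  Returns (size_t)-1 if a length field is impossibly small. */
size_t parse_packets(const uint8_t *buf, size_t buflen,
                     packet_cb cb, void *private_data)
{
    size_t off = 0;

    assert(buf != NULL && cb != NULL);

    while (buflen - off >= sizeof(uint32_t)) {
        uint32_t len;

        /* memcpy avoids an unaligned read of the length field. */
        memcpy(&len, buf + off, sizeof(len));
        if (len < sizeof(uint32_t)) {
            return (size_t)-1;
        }
        if (len > buflen - off) {
            break; /* incomplete packet, wait for more data */
        }
        cb(buf + off, len, private_data);
        off += len;
    }
    return off;
}
```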

commit 855ab348901edb3ec1327499a43f509d279b8182
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Feb 15 11:18:45 2013 +1100

    doc: Fix typo in ctdbd manpage
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e204fac03412520e877ab04363b3ece02667c55b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Feb 11 13:23:47 2013 +1100

    ctdbd: Fix the PullDBPreallocation size to 10MB as intended
    
    In 1f262deaad0818f159f9c68330f7fec121679023, Ronnie changed recovery code
    to allocate chunks of 10MB in traverse_pulldb() and traverse_recdb().
    However, the tunable PullDBPreallocation was set to 100MB instead.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 053b89c6dbce47001505524606889334559d2ec4
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Feb 11 11:25:49 2013 +1100

    eventscripts: Remove calls to "smbstatus -np" for samba cleanup
    
    This is an artifact from older versions of Samba. In the newer versions of
    Samba, "smbstatus -np" command does not do anything useful, but causes a
    traverse in CTDB which is expensive and causes CPU utilization to shoot up.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 00db5fa00474f8a83f1aa3b603fd756cc9b49ff4
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Feb 6 14:15:11 2013 +1100

    Logging: Fix breakage when freeing the log ringbuffer
    
    Commit a82d3ec12f0fda16d6bfa8442a07595de897c10e broke fetching from
    the log ringbuffer.  The solution there is still generally good: there
    is no need to keep the ringbuffer in children created by
    ctdb_fork()... except for those special children that are created to
    fetch data from the ringbuffer!
    
    Introduce a new function ctdb_fork_no_free_ringbuffer() that does
    everything ctdb_fork() needs to do except free the ringbuffer (i.e. it
    is the old ctdb_fork() function).  The new ctdb_fork() function just
    calls that function and then frees the ringbuffer in the child.
    
    This means all callers of ctdb_fork() have the convenience of having
    the ringbuffer freed.  There are 3 special cases:
    
    * Forking the recovery daemon.  We want to be able to fetch from the
      ringbuffer there.
    
    * The ringbuffer fetching code.  Change the 2 calls in this code (main
      daemon, recovery daemon) to call ctdb_fork_no_free_ringbuffer()
      instead.
    
    While we're here, clear the log ringbuffer when the recovery daemon is
    forked, since it will contain a copy of the messages from the main
    daemon.
    
    Note to self: always test... even the most obvious patches...  ;-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b940e3a24daa73ca9b2896b7a449240136442b53
Author: Volker Lendecke <vl at samba.org>
Date:   Wed Feb 6 10:28:37 2013 +0100

    Fix a comment typo
    
    Signed-off-by: Volker Lendecke <vl at samba.org>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit a0ef73e197dc9147f7718e0813fe803ff0b3d54d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 13:16:46 2013 +1100

    initscript: export CTDB_EXTERNAL_TRACE
    
    This means it can be set like any other configuration option in the
    configuration file, without needing to export it there.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9b0d56b16775aa16f33bdfdf831256e085fa3339
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 14:36:29 2013 +1100

    ctdbd: Don't use a fixed length buffer for the hung script command
    
    The amount of data to write into the buffer wasn't constrained
    anywhere...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
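
    One common way to avoid a fixed-length buffer is to measure the formatted
    string first and allocate exactly that much.  The sketch below uses
    snprintf() and malloc() for self-containment; the command layout
    "<script> <pid>" and the name build_debug_command() are assumptions made
    for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<script> <pid>" in a buffer sized to fit the result.
 * Returns a malloc'd string that the caller frees, or NULL on failure. */
char *build_debug_command(const char *script, int pid)
{
    int len;
    char *cmd;

    assert(script != NULL && strlen(script) > 0);

    /* First pass: ask snprintf how many characters are needed. */
    len = snprintf(NULL, 0, "%s %d", script, pid);
    if (len < 0) {
        return NULL;
    }

    cmd = malloc((size_t)len + 1);
    if (cmd == NULL) {
        return NULL;
    }

    /* Second pass: format into the correctly sized buffer. */
    snprintf(cmd, (size_t)len + 1, "%s %d", script, pid);
    return cmd;
}
```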

commit 3400b2ed34b6eb9496eb55f1aab6f89d2952060d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 14:25:01 2013 +1100

    ctdbd: Complain loudly if CTDB_DEBUG_HUNG_SCRIPT script isn't executable
    
    This is quite easy to misconfigure by failing to set the execute bit
    on the script.  Better to complain loudly.
    
    This is a debugging facility rather than core CTDB functionality, so it
    doesn't need a subtle mechanism to disable it at run-time.  To disable
    the designated script at run-time either edit it to put an "exit 0" at
    the top or move it aside and symlink to /bin/true.
    
    This is implemented by actually removing the code that checks that the
    file exists and is executable.  The output from the shell when the
    system() function fails is just as useful.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0581f9a84e58764d194f4e04064c2c5b393c348b
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 15:49:52 2013 +1100

    ctdbd: Remove command-line option --debug-hung-script
    
    Use an environment variable instead.  This just means that the
    initscript exports CTDB_DEBUG_HUNG_SCRIPT and the code checks for the
    environment variable.
    
    The justification for this simplification is that more debug options
    will be arriving soon and we want to handle them consistently without
    needing to add a command-line option for each.  So, the convention
    will be to use an environment variable for each debug option.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 501461cc3e132d4adee9e91b5d4513a26bae2846
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 13:08:55 2013 +1100

    ctdbd: Remove debug_hung_script_ctx
    
    The only allocation against this context is by
    ctdb_fork_with_logging().  This memory is freed by ctdb_log_handler()
    anyway.  There should be no memory leak.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f1ffe1112b7e342d7f1228ca816a8e5918f893cf
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 10 14:39:09 2013 +1100

    ctdbd: Message logged at exit should be different for different processes
    
    Some subprocesses print "CTDB daemon shutting down" when they exit and
    this can be confusing.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 35da9a7c2a0f5e54e61588c3c3455f06ebc66822
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jan 22 13:27:20 2013 +1100

    daemon: Make sure all the traverse children are terminated if traverse times out
    
    When a traverse times out, the callback function is called with key and data
    set to tdb_null.  This is also the way to signal the end of a traverse.  So if
    the traverse times out, the callback function treats it as the end of the
    traverse and frees state without calling the destructor.
    
    Keep track of whether the traverse timed out, so the callback function can
    take appropriate action for traverse timeout and traverse end.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
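
    The ambiguity and its fix can be modelled in a few lines.  This is a toy
    model under stated assumptions, not the daemon code: a NULL key stands in
    for tdb_null, a counter stands in for terminating the traverse children,
    and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct traverse_state {
    bool timed_out;      /* set by the timeout path before the final call */
    int records_seen;
    int children_killed; /* stand-in for terminating traverse children */
    bool finished;
};

/* Called once per record, and once more with key == NULL when no more
 * records will arrive.  The NULL key alone cannot tell a normal end from
 * a timeout, so the timed_out flag is consulted. */
void traverse_callback(struct traverse_state *state, const char *key)
{
    assert(state != NULL);

    if (key != NULL) {
        state->records_seen++;
        return;
    }
    if (state->timed_out) {
        /* Timeout: children may still be running and must be cleaned up. */
        state->children_killed++;
    }
    state->finished = true;
}

/* Timeout path: record the reason, then deliver the end-of-traverse call. */
void traverse_timeout(struct traverse_state *state)
{
    assert(state != NULL);
    state->timed_out = true;
    traverse_callback(state, NULL);
}
```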

commit a82d3ec12f0fda16d6bfa8442a07595de897c10e
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 12:09:36 2013 +1100

    Logging: Free the ringbuffer in child processes created with ctdb_fork()
    
    At the moment the log ringbuffer is duplicated in every child process.
    Although it is copy-on-write we want to see if it is contributing to
    out-of-memory situations when there are a lot of children.
    
    The ringbuffer isn't accessible from any of the children anyway...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a4f622e85168f59417c11705f1734e0352e1d44a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 12:08:11 2013 +1100

    Logging: New function ctdb_log_ringbuffer_free()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 25a20409fb39a94b64c13990c0eba4f75d482ecd
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 5 12:13:57 2013 +1100

    build: Fix a Makefile.in typo
    
    Objects are named *.o  ;-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d1ec06d30148e6fd344625a2fbf1c22391bd908a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 12:39:37 2013 +1100

    tools/ctdb: Fix a compiler warning
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 124e2a471aeda9c900fd898178a30522d7d74221
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jan 23 14:35:47 2013 +1100

    recoverd: Fix printing of node flags from local information
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit b054193d1d19a8eef998fa690899501f79badb8a
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Mon Jan 14 17:48:01 2013 +0100

    common: Don't lie on unimplemented gratuitous arp
    
    Signed-off-by: Mathieu Parent <math.parent at gmail.com>

commit 109f428aa34f8f4cc0329880d2f4a5593a6cc6f3
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Mon Jan 14 17:21:01 2013 +0100

    tests: Test portability
    
    Curiously test_ctdb_sys_check_iface_exists fails on Linux
    
    Signed-off-by: Mathieu Parent <math.parent at gmail.com>

commit 258092aaf6b7a9bdc14f0fb35e8bd7f7dc742b3f
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Mon Jan 14 12:13:24 2013 +0100

    common: FreeBSD+kFreeBSD: Implement get_process_name (same as in Linux)
    
    Signed-off-by: Mathieu Parent <math.parent at gmail.com>

commit d202b2fdd4fd70172e5e44583627b57a1b7ad2ed
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Mon Jan 14 11:23:46 2013 +0100

    common: Detailed platform-specific FIXME
    
    Signed-off-by: Mathieu Parent <math.parent at gmail.com>

commit 3c6a9b73364c9543366fa033c778145dc7a152a9
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Sun Jan 13 14:15:20 2013 +0100

    build: Update config.guess to 2012-12-30 and config.sub to 2013-01-11
    
    Signed-off-by: Mathieu Parent <math.parent at gmail.com>

commit 95fc493a7d4145f976cb3fe928d9e92faec4dd71
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Sat Jan 12 16:43:03 2013 +0100

    doc: allows to -> allows one to
    
    Signed-off-by: Mathieu Parent <math.parent at gmail.com>

commit 506ecd186759675a1cf50a0a05a285fee03fc51e
Author: Mathieu Parent <math.parent at gmail.com>
Date:   Sat Jan 12 15:14:48 2013 +0100

    build: Add missing LDFLAGS
    
    Original Author: Simon Ruderich <simon at ruderich.org>
    
    Signed-off-by: Mathieu Parent <math.parent at gmail.com>

commit 0e651e9da0f1f3c836b4474612ab13d0ccd272d9
Author: Srikrishan Malik <srimalik at in.ibm.com>
Date:   Wed Jan 9 16:11:39 2013 +0530

    Changes for unobtrusive recovery and new method for health check.
    
    Unobtrusive recovery: Ganesha will not be restarted on failovers.
    
    Ganesha health: Use the counters in /var/lib/nfs/ganesha_local to track progress
    instead of the null call which can time out if the server is too busy.
    
    Signed-off-by: Srikrishan Malik <srimalik at in.ibm.com>
    Signed-off-by: Lance Russell <lancerus at us.ibm.com>

commit 7393e2b290f9879ff72d5c5a9ce933034129f0e8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jan 9 16:22:39 2013 +1100

    recoverd: Create recoverd monitoring timed events off recoverd context
    
    This ensures that when shutting down CTDB, all the timed events
    associated with monitoring recoverd are destroyed and recoverd
    is not restarted.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 7d8546ee4353851f0543d0ca2c4c67cb0cc75aea
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 29 14:56:10 2012 +1100

    daemon: Protect against double free of callback state while shutting down
    
    When CTDB is shut down and monitoring has been stopped, monitor_context
    gets freed and all the callback states hanging off it.  This includes
    callback state for current_monitor, if the current monitor event has
    not yet finished.  As a result, when the shutdown event is called,
    current_monitor->callback state is not NULL, but it's actually freed
    and it's a dangling reference.
    
    So before executing callback function and freeing callback state check
    if ctdb->monitor->monitor_context is not NULL.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 746168df2e691058e601016110fae818c6a265c3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Dec 4 15:05:44 2012 +1100

    daemon: On shutdown, destroy timed events that check if recoverd is active
    
    When CTDB is shutting down, recovery daemon is stopped, but the
    event that checks if recovery daemon is still alive is not destroyed.
    So recovery master is restarted during shutdown if CTDB daemon takes
    longer to shut down.
    
    There are two processes that check if recovery daemon is working.
    
    1. ctdb_check_recd() - which checks every 30 seconds if the recovery
       daemon process exists.
    
    2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon
       fails to ping CTDB daemon.
    
    Both the events are periodic and need to be destroyed when shutting down.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 45d439a1ab093b420c27b1502ef109021833c7af
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Dec 18 12:52:39 2012 +1100

    tests: Add a test for recovery of persistent databases
    
    Ensure that RSN based recovery and __db_sequence_number__ based recovery
    methods for persistent databases work correctly.  They should not cause
    corruption of the database.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit efaac27a9ed52ed0f436c7e194013fd06e8b02b3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Dec 19 15:14:42 2012 +1100

    tools/ctdb: Add setdbseqnum command to set __db_sequence_number__
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit ca6e7eccc90f2869c220231666bf284798342bce
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Dec 19 14:43:26 2012 +1100

    tools/ctdb: Re-factor code to check if db exists given name or id
    
    Most of the commands related to database operations can now use the
    common code (db_exists()) to refer to database with either name or id.
    
    In addition to returning the db_id for db_name, the function returns all the
    flags set for the database.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit d23adec89b69e7c6f96c8e1417ef4ca4c9edc57e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Dec 17 14:46:14 2012 +1100

    tools/ctdb: Add pdelete command to delete a record from persistent database
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9a70a4d23d00f6cb996c061ba3dfb7c47b4f6a4f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Dec 4 14:58:30 2012 +1100

    daemon: Update the comment and remove redundant check in ctdb_start_transport()
    
    ctdb_start_transport() is called just before "setup" event, when CTDB
    is ready to process the requests. "startup" event happens much later
    after a successful recovery.
    
    Transport method ctdb->methods is successfully initialized before
    ctdb_start_transport() is called.  No need to check again.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 735ec99b99c7bb579851ce8293011aaf1dcc552a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jan 8 16:49:56 2013 +1100

    eventscripts: Fail the setup event if CTDB does not become ready
    
    Currently it silently continues without attempting to set tunables.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 50abf597cefe6f8ea2a2ff7694bf84641344a9b1
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 4 13:52:01 2013 +1100

    scripts: Make script_log() use supplied message, stop logger from hanging
    
    When using syslog any provided message arguments are ignored and not
    passed to logger.  This means that logger blocks waiting on stdin.
    That's bad.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e2aaa64925cca359c71520e01a18fc9461b0da4d
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 4 11:41:03 2013 +1100

    scripts: Rework ctdb-crash-cleanup.sh so that it uses existing functions
    
    This improves maintainability.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 03356fd5ae7a3ac35fde0289cbea7c71ecf07367
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 4 11:23:29 2013 +1100

    scripts: Make drop_all_public_ips() more robust
    
    Incorporate some of the logic from ctdb-crash-cleanup.sh that ensures
    IPs are deleted even if they have the wrong netmask or are on the
    wrong interface.
    
    Factoring out some of the code will allow it to be used elsewhere.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 13e5e609b262847b607e7af7e0685f44e7cb8e36
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 3 16:02:52 2013 +1100

    ctdbd: Default value for debug_hung_script should use ETCDIR
    
    That is, it should use whatever was specified in ./configure and
    should not hardcode /etc.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8507303b525d20c74e8ec4e7c4f5f275945cd3b6
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 3 15:33:57 2013 +1100

    scripts: debug-hung-script.sh doesn't need functions/loadconfig
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 376015ba5ad6b7703ae9949a1d40a0c72dfaba0c
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 3 15:33:10 2013 +1100

    scripts: statd-callout should calculate CTDB_BASE if it is not set
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 740ea8ea5084149c8b552a01ee1c98c558b12384
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 3 15:26:12 2013 +1100

    eventscripts: Each script should set CTDB_BASE if it is not set
    
    This makes it easier to run the scripts externally.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b23c30253cc9eb274b895cac0f8c65245ba0a200
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 3 15:07:07 2013 +1100

    scripts: Move drop_all_public_ips() to the functions file
    
    ... so it can be improved and used elsewhere.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 13a5944f8a27d43006acfffba76958693cae7702
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 12 16:12:38 2012 +1100

    tests/simple: Add test to check recovery daemon IP verification
    
    Also update ips_are_on_nodeglob() to handle negation.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3cc596d2b459d834f9785b3a98027e46431ff2b9
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jan 8 10:21:49 2013 +1100

    tests/eventscripts: Ratchet down debug level for ctdb_takeover_tests
    
    The default IP allocation algorithm used by ctdb_takeover_tests
    changed from "non-deterministic IPs" to "LCP2".  The latter generates
    a lot more debug output.  ctdb_takeover_tests is used by the ctdb tool
    stub to calculate IP address changes for failovers.  This resulted in
    unexpected debug output that caused tests to fail.  Since eventscript
    tests don't care how IP allocations are arrived at, the best solution
    is to turn down the debug level.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 6a1d88a17321f7e1dc84b4823d5e7588516a6904
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 14 17:12:01 2012 +1100

    recoverd: Separate each IP allocation algorithm into its own function
    
    This makes the code much more readable and maintainable.
    
    As a side effect, fix a memory leak in LCP2.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8adb255e62dbe60d1e983047acd7b9c941231d11
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 13 13:23:32 2012 +1100

    recoverd: New function unassign_unsuitable_ips()
    
    Move the code into a new function so it can be called from a number of
    places.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f6ce18d011dd9043b04256690d826deb2640cd89
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 13 12:15:32 2012 +1100

    recoverd: Move failback retry loop into basic_failback() and lcp2_failback()
    
    The retry loop is currently in ctdb_takeover_run_core().  Pushing it
    into each function will make it possible to put each algorithm into a
    separate top-level function.  This will make the code much clearer and
    more maintainable.
    
    Also keep associated test code compatible.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c09aeaecad7d3232b1c07bab826b96818756f5e0
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 11 15:49:17 2012 +1100

    recoverd: Trying to failback more IPs no longer allocates unassigned IPs
    
    Neither basic_failback() nor lcp2_failback() unassign IPs anymore, so
    there's no point looping back that far.
    
    Also fix a unit test that now fails because looping back to handle
    unassigned IPs is no longer logged.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4dc08e37dec464c8785a2ddae15c7c69d3c81ac3
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 11 15:43:36 2012 +1100

    recoverd: basic_failback() can call find_takeover_node() directly
    
    Instead of unassigning, looping back and depending on
    basic_allocate_unassigned.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 4c87e7cb3fa2cf2e034fa8454364e0a7fe0c8f81
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 11 15:01:12 2012 +1100

    recoverd: Don't do failback at all when deterministic IPs are in use
    
    This seems to be the right thing to do instead of calling into the
    failback code and continually skipping the release of an IP.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e06476e07197b7327b8bdac9c0b2e7281798ffec
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 14 17:10:41 2012 +1100

    recoverd: Move the test for both 'DeterministicIPs' and 'NoIPFailback' set
    
    If this is done earlier then some other logic can be improved.  Also,
    this should be a warning since no error condition is set.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit bcd5f587aff3ba536cb0b5ef00d2d802352bae25
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 14 17:10:05 2012 +1100

    recoverd: Fix a memory leak in IP allocation
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit edda58a45915494027785608126b5da7c98fee85
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 20 16:27:27 2012 +1100

    tests/takeover: Add some LCP2 tests for case when no nodes are healthy
    
    3 tests should assign IPs to all nodes.
    
    3 tests set NoIPTakeoverOnDisabled=1 and should drop all IPs.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5c820b2398a42af0e94bc524854a1ad144a63f7b
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 20 16:26:42 2012 +1100

    tests/takeover: Initial tests for deterministic IPs
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 98bd58a98d34ecca89c9042417d7527a18a5ecf9
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 20 16:25:53 2012 +1100

    tests/takeover: Do output filtering for deterministic IPs algorithm too
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d357d52dbd533444a4af6151d04ba119a1533068
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 20 16:24:58 2012 +1100

    tests/takeover: Support testing of NoIPTakeoverOnDisabled
    
    Via $CTDB_SET_NoIPTakeoverOnDisabled.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 20631f5f29859920844dd8f410e24917aabd3dfd
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 20 14:52:05 2012 +1100

    tests/takeover: IP allocation now selected via $CTDB_IP_ALGORITHM
    
    Default to LCP2, like ctdbd.  Also support "det" for deterministic
    IPs.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 06ad6b8a19f830472b0ed65cb52e7c3ea74ed1dc
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 13 20:29:22 2012 +1100

    tests/takeover: Support valgrinding the takeover code
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 1a5410e8349cdb96fdc51aa5ecd4f5734f6798a5
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 30 16:38:08 2012 +1100

    tests: new simple integration test for delip interface garbage collection
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8164d9b29bf9080ccc76b1305fb6c07f1ed61d55
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 30 16:37:28 2012 +1100

    tests: new function ip2ipmask() for integration testing
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit cc1a3ae911d3fee8b87fda5de5ab6d9499d7510a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 23 20:09:07 2012 +1100

    ctdbd: Clean up orphaned interfaces when an IP is deleted
    
    Add a new function ctdb_remove_orphaned_ifaces() and call it in
    ctdb_control_del_public_address().
    
    ctdb_remove_orphaned_ifaces() uses a naive implementation that does
    things in a very obvious way.  There are many ways to improve the
    performance - some are mentioned in a comment in the code.  However, I
    doubt that this will be a bottleneck even with a large number of
    public IPs.  Running the eventscript is likely to outweigh the cost of
    this cleanup.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
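
    A naive cleanup of the kind described might be sketched as follows: for
    every interface, scan all remaining public IPs and keep the interface
    only if some IP still refers to it.  The sketch assumes one interface
    per IP for brevity, and the struct and function names are hypothetical
    rather than those used by ctdb_remove_orphaned_ifaces().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct public_ip {
    const char *addr;
    const char *iface; /* simplification: a single interface per IP */
};

/* Linear scan: is this interface referenced by any remaining public IP? */
static bool iface_in_use(const char *iface,
                         const struct public_ip *ips, size_t num_ips)
{
    size_t i;

    for (i = 0; i < num_ips; i++) {
        if (strcmp(ips[i].iface, iface) == 0) {
            return true;
        }
    }
    return false;
}

/* Compact ifaces[] in place so that only interfaces still in use remain.
 * Returns the new number of interfaces.  Cost is O(interfaces * IPs). */
size_t remove_orphaned_ifaces(const char **ifaces, size_t num_ifaces,
                              const struct public_ip *ips, size_t num_ips)
{
    size_t i, kept = 0;

    assert(ifaces != NULL);

    for (i = 0; i < num_ifaces; i++) {
        if (iface_in_use(ifaces[i], ips, num_ips)) {
            ifaces[kept++] = ifaces[i];
        }
    }
    return kept;
}
```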

commit b849fb4923d6a34141fe19006a974de81508ceda
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jan 7 12:00:34 2013 +1100

    tests/complex: Add NFS test when CTDB is killed on one of the nodes
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c75b5e5b4d000f5c7dab403df8238ceed390c1c0
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 4 15:00:44 2012 +1100

    Eventscripts: Change the default reconfigure action to do nothing
    
    A default action of restarting the service doesn't obey the principle
    of least surprise.  It caused an NFS service restart to be implicitly
    reintroduced.
    
    This allows no-op functions to be removed from some eventscripts and
    service restart functions to be added to others.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 2629de72e1f37b5e46772c2ef8d8d0012fc4ed37
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 4 14:52:25 2012 +1100

    Eventscripts: Do not restart NFS on reconfigure
    
    It looks like this restart was accidentally reintroduced in commit
    fc0678d351187cfa4c71123f97c0f493aacd5d16 when $service_reconfigure
    became unset so the default action of restarting the service would
    occur.  From there cleanups have explicitly reintroduced it and
    carried it through the code.
    
    Also update the unit tests affected by this change.
    
    The restart was originally removed in commit
    bc481c3f1a44c50648488c4f8a7f15ec395d446f.
    
    The default reconfigure action of restarting a service is clearly
    suboptimal and will be addressed in a separate patch.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 2bbee8ac23ad5b7adf7122d8c91d5f0d54582507
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 4 14:28:06 2012 +1100

    ctdbd: Initialise the node flags in just one place
    
    Currently flags are initialised in 2 places.  One of them is in
    ctdb_tcp_listen_automatic(), which just seems wrong.  This makes the
    code easier to follow by just doing it in ctdb_start_daemon().
    
    This means that the flags are now initialised later than previously.
    However, it is still done before the transport is started and before
    clients can connect.
    
    In future it might make sense to do a similar thing with setting the
    PNN.  However, the current optimisation is reasonably obvious...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 496387a585b2c5778c808cf02b8e1435abde4c3e
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Dec 3 15:44:12 2012 +1100

    ctdbd: Remove debug option --node-ip, use --listen instead
    
    This effectively reverts d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 3221fce9ee2f6fdd3bb17a5e1629ad52a32f90d6
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Dec 3 15:32:49 2012 +1100

    tests: Local daemons should use --listen instead of --node-ip
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 776590bf84d221092298346a28d7fc0552a67c9d
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 30 12:59:35 2012 +1100

    Initscript: when checking status, print output of "ctdb ping" if it fails
    
    At the moment the caller has no idea why it thinks CTDB isn't running
    and we can't debug failures...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5067392d2e06795559f25828b65c129608b65c0b
Author: Michael Adam <obnox at samba.org>
Date:   Tue Nov 20 11:20:34 2012 +0100

    ctdb:recover: fix a comment typo
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 81788cfabe960497b050c5ee4e4e487ee061012a
Author: Michael Adam <obnox at samba.org>
Date:   Fri Dec 21 11:52:57 2012 -0500

    events/50.samba: fix testparm background update
    
    Creating the smb.conf cache with "-v" results in a cache file
    that fails to load with "testparm -s ..." later on because
    "copy = " cannot be processed (copying the empty service name fails).
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 4a9e96ad3d8fc46da1cd44cd82309c1b54301eb7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jan 4 14:32:55 2013 +1100

    daemon: Add a tunable to enable automatic database priority setting
    
    Samba versions 3.6.x and older do not set the database priority.
    This can cause deadlock between Samba and CTDB since the locking order
    of databases will be different. A hack was added for automatic promotion
    of priority for specific databases to avoid deadlock.  This code should
    not be invoked with Samba version 4.x which correctly specifies the
    priority for each database.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit f81e9add466b1d9b2796c09c6ba63b77296ea149
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Nov 30 12:21:30 2012 +1100

    daemon: Check if log_latency_ms is set before using it
    
    This fixes a bug where the wrong variable was checked.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 905cd1293aa97dc7839a59b4f68eca02981f0891
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 23 12:51:47 2012 +1100

    Git should ignore generated include/version.h file
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9a02f61547ddf74629aca21639d8fb61c1df7cbb
Author: Volker Lendecke <vl at samba.org>
Date:   Thu Nov 22 15:27:51 2012 +0100

    vacuum: Avoid some tallocs in ctdb recovery
    
    In a heavily loaded and volatile database a lot of SCHEDULE_FOR_DELETION
    requests can come in between fast vacuuming runs. This can lead to
    significant ctdb cpu load due to the cost of doing talloc_free. This
    reduces the number of objects a bit by coalescing the two objects
    of delete_record_data into one. It will also avoid having to allocate
    another talloc header for a SCHEDULE_FOR_DELETION key. Not the full fix
    for this problem, but it might contribute a bit.

commit d05faf294e58e22ae3fbc76162258f1ae8178129
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Nov 21 17:03:37 2012 +1100

    doc: Update ping_pong documentation to add -c option
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4f42d17b74ce891691eee1cead498959cc8e4837
Author: Michael Adam <obnox at samba.org>
Date:   Tue Nov 6 01:26:05 2012 +0100

    utils:ping_pong: add a -c switch to check the lock before reading/writing
    
    This is to verify that the fcntl F_GETLK call reports F_UNLCK if called
    from a process already holding a lock. This is for example used by samba's
    strict locking code in combination with "posix locking = true".
    
    Signed-off-by: Michael Adam <obnox at samba.org>
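
The behaviour that the -c switch verifies can be reproduced in a few lines of Python. This is a sketch, not the ping_pong code. The struct flock layouts below are assumptions covering common Linux and macOS builds only, since the layout is platform-specific.

```python
import fcntl
import os
import struct
import sys
import tempfile

# struct flock layouts are platform-specific. The two formats below are
# assumptions that cover common Linux and macOS builds only.
if sys.platform == "darwin":
    FLOCK_FMT = "qqihh"   # l_start, l_len, l_pid, l_type, l_whence
    TYPE_INDEX = 3

    def pack_query(l_type):
        return struct.pack(FLOCK_FMT, 0, 1, 0, l_type, os.SEEK_SET)
else:
    FLOCK_FMT = "hhqqi"   # l_type, l_whence, l_start, l_len, l_pid
    TYPE_INDEX = 0

    def pack_query(l_type):
        return struct.pack(FLOCK_FMT, l_type, os.SEEK_SET, 0, 1, 0)


def getlk_while_holding_lock(fd):
    """Lock byte 0 of fd, then ask F_GETLK whether a write lock would conflict."""
    fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)
    reply = fcntl.fcntl(fd, fcntl.F_GETLK, pack_query(fcntl.F_WRLCK))
    return struct.unpack(FLOCK_FMT, reply)[TYPE_INDEX]


with tempfile.TemporaryFile() as tmp:
    reported = getlk_while_holding_lock(tmp.fileno())
```

On a file system with correct POSIX lock semantics, `reported` equals `fcntl.F_UNLCK`, because a process's own lock never conflicts with its own query.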

commit 6860c79aea416f56cfd7a6af790bbdf495dbc54e
Author: Michael Adam <obnox at samba.org>
Date:   Mon Nov 19 17:28:03 2012 +0100

    recovery: data corruption of persistent DBs after recoveries: don't delete empty records
    
    The record-by-record mode of recovery deletes empty records.
    For persistent databases, this can lead to data corruption
    by deleting records that should be there:
    
    - Assume the cluster has been running for a while.
    
    - A record R in a persistent database has been created and
      deleted a couple of times, the last operation being deletion,
      leaving an empty record with a high RSN, say 10.
    
    - Now a node N is turned off.
    
    - This leaves the local copy of the database on N with the empty
      copy of R and RSN 10. On all other nodes, the recovery has deleted
      the copy of record R.
    
    - Now the record is created again while node N is turned off.
      This creates R with RSN = 1 on all nodes except for N.
    
    - Now node N is turned on again. The following recovery will choose
      the older empty copy of R due to RSN 10 > RSN 1.
    
    ==> Hence the record is gone after the recovery.
    
    On databases like Samba's registry, this can damage the higher-level
    data structures built from the various tdb-level records.
    
    This patch fixes that problem by not deleting empty records in recoveries
    for persistent databases.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
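
The scenario in the commit above can be replayed with a small Python model. It is a simplified sketch, not the CTDB recovery code: the node names "A" and "B", the single-record state and the RSN bump on store are assumptions made for illustration.

```python
def recover(nodes, active, delete_empty):
    """Record-by-record recovery for a single record R (simplified).

    nodes:  dict node name -> (rsn, data), or None if the node has no copy
    active: node names taking part in this recovery
    """
    copies = [nodes[n] for n in active if nodes[n] is not None]
    winner = max(copies, key=lambda c: c[0]) if copies else None
    if delete_empty and winner is not None and winner[1] == b"":
        winner = None          # old behaviour: drop empty records
    for n in active:
        nodes[n] = winner


def store(nodes, active, data):
    """Write R on the active nodes, bumping the RSN of any existing copy."""
    old = nodes[active[0]]
    rsn = 1 if old is None else old[0] + 1
    for n in active:
        nodes[n] = (rsn, data)


def scenario(delete_empty):
    # R was created and deleted several times: empty, RSN 10 everywhere.
    nodes = {"A": (10, b""), "B": (10, b""), "N": (10, b"")}
    others = ["A", "B"]
    recover(nodes, others, delete_empty)             # N is turned off
    store(nodes, others, b"value")                   # R is created again
    recover(nodes, ["A", "B", "N"], delete_empty)    # N is turned on again
    return nodes["A"]


lost = scenario(delete_empty=True)
kept = scenario(delete_empty=False)
```

With deletion enabled the stale empty copy on N wins with RSN 10 and the record disappears. With deletion disabled the empty record survives the first recovery, so the re-created record gets RSN 11 and wins.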

commit 909269a4a3690e1245117ca1af935401455785e6
Author: Michael Adam <obnox at samba.org>
Date:   Mon Nov 19 17:20:11 2012 +0100

    recoverd: fix a comment typo
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit bab744e3c49efef2e05dc09e8ea9bd3e3fa58716
Author: Michael Adam <obnox at samba.org>
Date:   Fri Nov 16 14:33:41 2012 +0100

    vacuum: fix a comment typo
    
    Pair-Programmed-With: Volker Lendecke <vl at samba.org>
    Signed-off-by: Michael Adam <obnox at samba.org>

commit d8f010355b715e49709836e057a5d0f110919897
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 16 20:21:15 2012 +1100

    Eventscripts: 10.interface should list configured interfaces
    
    The current code lists available interfaces.  If IPs are configured in
    some other way than the public addresses file (e.g. ctdb addip) and their
    interfaces default to being marked down then, since down interfaces are
    not available, these interfaces can never be marked up.
    
    The configured interfaces should be listed instead.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9275a69a414482f1053ae14528d5972575b9214e
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Nov 16 19:43:14 2012 +1100

    ctdbd: Make the link status of new interfaces more flexible
    
    Neither up nor down is a good default value for the link status of a
    new interface.  Up means that IPs can be assigned to interfaces before
    the true state is known and they can move away quickly if the interface
    is actually down.  Down means that IPs can't be assigned to an interface
    for a variable amount of time - until a monitor cycle occurs - and this
    can result in imbalanced IPs.
    
    This is a neat compromise.  Before the startup event completes, IPs
    can't be assigned to interfaces because all interfaces begin in a down
    state.  As soon as the startup event completes, IPs can be allocated
    to any interface that has been marked up by the eventscript.  Later,
    during normal operation, newly added IPs can be assigned to new
    interfaces immediately.  The IPs will still move away if an interface
    is noticed to be down in the next monitor cycle, but that is the
    exception rather than the rule.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
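
The compromise described above amounts to choosing the initial link state from whether the startup event has completed. A minimal Python sketch follows. The class and method names are hypothetical and do not come from the CTDB source.

```python
class InterfaceTable:
    """Tracks link state; new interfaces start down until startup completes."""

    def __init__(self):
        self.startup_done = False
        self.link_up = {}

    def add(self, name):
        # Before startup completes new interfaces begin down, afterwards up.
        self.link_up[name] = self.startup_done

    def set_link(self, name, up):
        # Result reported by the eventscript or a monitor cycle.
        self.link_up[name] = up

    def assignable(self):
        return sorted(n for n, up in self.link_up.items() if up)


table = InterfaceTable()
table.add("eth0")
table.add("eth1")
before_startup = table.assignable()      # nothing can host IPs yet

table.set_link("eth0", True)             # startup eventscript marks eth0 up
table.startup_done = True
after_startup = table.assignable()

table.add("eth2")                        # added during normal operation
after_add = table.assignable()

table.set_link("eth2", False)            # next monitor cycle finds it down
after_monitor = table.assignable()
```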

commit 54e24a151d2163954e5a2a1c0f41a2b5c19ae44b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Nov 14 15:51:59 2012 +1100

    locking: Do not use RECLOCK for tracking DB locks and latencies
    
    RECLOCK is for recovery lock in CTDB. Do not override the meaning for
    tracking locks on databases.  Database lock latency has nothing to do
    with recovery lock latency.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 718233c445cd6627ab3962b6565c2655f1f8efd0
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Nov 6 17:06:54 2012 +1100

    tools/ctdb: Do not use function return value as pnn
    
    This fixes the wrong code where the same variable 'ret' is used to track
    both the pnn and the return value of a function call.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a5c6bb1fffb8dc3960af113957a1fd080cc7c245
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 23 16:23:12 2012 +1100

    recoverd: Track the nodes that fail takeover run and set culprit count
    
    If any of the nodes fail a takeover run started from the main loop
    (either due to a timeout or a failure to complete within the
    takeover_timeout interval), the recovery master gives up on the
    takeover run with the following message:
    
      "Unable to setup public takeover addresses. Try again later"
    
    As a side effect, monitoring is disabled on all the nodes. Before
    ctdb_takeover_run() is called from the main loop, monitoring gets
    disabled via the startrecovery event. Since ctdb_takeover_run() fails,
    the recovered event never runs and monitoring does not get re-enabled.
    
    In main_loop, ctdb_takeover_run() is called with a takeover_fail_callback.
    This callback will get called if any of the nodes fail in handling
    takeip/releaseip/ipreallocated events in ctdb_takeover_run().
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
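
The callback mechanism can be pictured with the following Python sketch. It is not the recoverd code: the per-node result dict and the use of a Counter for culprit counts are assumptions, and only the idea of a fail callback invoked per failing node comes from the commit.

```python
from collections import Counter

culprit_count = Counter()


def takeover_fail_callback(pnn):
    """Record that node pnn failed to handle a takeover event."""
    culprit_count[pnn] += 1


def takeover_run(results, fail_callback):
    """results: dict pnn -> True on success, False on failure or timeout."""
    ok = True
    for pnn, success in sorted(results.items()):
        if not success:
            fail_callback(pnn)
            ok = False
    return ok


first = takeover_run({0: True, 1: False, 2: True}, takeover_fail_callback)
second = takeover_run({0: True, 1: False, 2: True}, takeover_fail_callback)
```

Repeated failures accumulate against the same node, which gives the recovery master something to act on instead of silently giving up.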

commit f243a916ee71013f7402b9c396c2ead88eb3aab0
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Nov 14 10:37:15 2012 +1100

    Eventscripts: 10.interface startup event should only process interfaces once
    
    Provided that monitor_interfaces() sets the state of each interface,
    there's no need to mark all interfaces as up before running
    monitor_interfaces() in the startup event.  monitor_interfaces() will
    set the true status of each interface anyway.  The duplication is
    unnecessary and may cause extra action in the recovery daemon because
    the state of some interfaces is changed an extra time.
    
    Instead, add a comment at the top of the loop in monitor_interfaces()
    to warn against early loop exits.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5f58c811127a89f162b6a41ddcd6e944801740a5
Author: Volker Lendecke <vl at samba.org>
Date:   Tue Nov 6 16:17:22 2012 +0100

    build: Fix the build with old system-installed tevent
    
    We depend on the tracing callback mechanism in ctdb.

commit cd64035d71ddff6aebe6c15a49e09527283425d2
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 31 12:33:25 2012 +1100

    ctdbd: Fix compilation warning in locking code
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ceac026713a7ee30ea865ed4a9422900ed76fdf6
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 31 12:17:27 2012 +1100

    web: Update instructions for building from tarball
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit aad1584da8a8425bc6f5163c95810e9d2390dc91
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 31 12:10:22 2012 +1100

    tests: Do not check release suffix in ctdb version test
    
    The release suffix added by RPM tracks packaging changes. The core CTDB
    version does not include the release suffix.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 16a91c2a4d03b46743611e2fe844bb2cef95e46a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 30 11:54:52 2012 +1100

    packaging: Use maketarball.sh script to create tarball for RPM
    
    This removes the duplicate code for building tarball and reuses existing
    script.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 3d4838db51dd8199b9c29aebb6e7bfbd2a27b8bb
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 30 11:52:19 2012 +1100

    packaging: Use optional argument as targetdir when creating tarball
    
    In addition, do not modify CTDB version string with extra suffix.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f8af7d8de76e68e5c4bde15f832a31ce9107e8c7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 30 11:49:28 2012 +1100

    tool/ctdb: Always support ctdb version command, don't make it optional
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 8df7ea6b20417833792932487a082b3c71bb6837
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 30 11:48:23 2012 +1100

    build: Add rules to create include/version.h when building from git tree
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit b151f9b62299ec5b887c62cef780547a39c0ba9d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Oct 30 11:47:24 2012 +1100

    packaging: Create include/version.h to define CTDB_VERSION_STRING
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 9be3b23adbfc844b71bf1d4ddf0fbc3b269f15fa
Author: Volker Lendecke <vl at samba.org>
Date:   Tue Oct 23 21:49:34 2012 +0200

    Add a \n to an error message

commit e2213db479129ce9c2b2fb88ec8c53cbd33d54b3
Author: Volker Lendecke <vl at samba.org>
Date:   Tue Oct 23 13:45:42 2012 +0200

    Avoid a bashism in 60.ganesha
    
    This file is #!/bin/sh. On sn-devel at least, with this /bin/sh the
    shell does not like == for string equality.

commit e94070de52232d6cefae0c6276df88b8fc380a4e
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 24 12:58:57 2012 +1100

    web: Update broken links to manpages
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 6871415f6cb50c4f9753067359f0e264d3f93871
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 22 18:04:09 2012 +1100

    packaging: Bundle README, COPYING and html version of manpages
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f3888712298f1de7cc7eb51f50c22080fa64e3c0
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 22 17:43:32 2012 +1100

    doc: Do not keep the built version of manpages in version control
    
    Generated docs will be bundled with release tarballs. No need to keep
    them in git. This avoids the need to commit the generated doc version
    if source xml is modified.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 0019291371af1e63ee132ed173ba7f52a0291a44
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 22 15:12:50 2012 +1100

    packaging: Use common code to generate VERSION string
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 19fb26346567d2249b1237f92d871022db2ba8cd
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 22 15:08:41 2012 +1100

    packaging: Factor out the code to generate VERSION string
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 69f0473b72aadab5bd5791ccff2facd0cd469d43
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 22 15:55:33 2012 +1100

    packaging: Build docs and include them in tarball
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 3274cffe2052953b34141a82de6053b747532a88
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 17 10:09:26 2012 +1100

    build: Extract building of manpages into a separate Makefile
    
    This can then be used to build manpages/html when creating tarball.
    Do not build docs during a regular build, but only for install.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit db987eeb3c6e10552a1c1334bf263eb66fcad9ab
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 22 10:52:06 2012 +1100

    doc: README - add information about CTDB, license and website

commit b3eac871895cc586bcc671835e882b136e466b98
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 17 11:27:32 2012 +1100

    web: Add posix locking information to prerequisites
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 12e4a3e2953842b4c3842bf920fe2086df4fe46c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 17 11:26:52 2012 +1100

    web: Add the links to ftp/http ctdb download area
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4250c7ebe369e73cf29ff910bb9bfc929735408c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 17 11:25:46 2012 +1100

    web: Remove reference to non-existent config files
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c18ec8ec234cb71da6cc77b1aadc398f57187947
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Oct 22 12:19:07 2012 +1100

    doc: getlog and clearlog changes for recovery daemon logs
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7547e011005f0dd5bd38e67572280126cf16e229
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 18 14:15:09 2012 +1100

    tests: Local daemons should use the logging ringbuffer
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7197e600f46f2d1638f6c45c0149f109ea25a47c
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 18 14:13:30 2012 +1100

    tools/ctdb: Merge recoverd log handling into getlog/clearlog
    
    We don't need extra commands for these.
    
    Also, allow a default value of NOTICE for the getlog level.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ef55e06192819d840c09b65741bab737223ac34c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 16 20:57:31 2012 +1100

    tools/ctdb: Add log ringbuffer handling for recoverd
    
    This adds commands rdgetlog and rdclearlog
    
    These are analogous to getlog and clearlog but operate on the logs for
    the recovery daemon.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 16 20:54:39 2012 +1100

    recoverd: Add CTDB_SRVID_GETLOG and CTDB_SRVID_CLEARLOG
    
    These support getting and clearing logs from the ring-buffer in the
    recovery daemon.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a9511cf5ecd5bc39b0070f0afa8ac4d4926c6cab
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Oct 22 09:01:27 2012 +1100

    build: Set CTDB_PATH to /tmp/ctdb.socket if SOCKPATH is not defined
    
    When building samba with CTDB, if samba configure/waf does not support
    setting of SOCKPATH, fall back to /tmp/ctdb.socket.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit f92b9c83a2f39fba9a141417a88de96fc8c592ff
Author: David Disseldorp <ddiss at samba.org>
Date:   Thu Oct 18 16:55:19 2012 +0200

    Build: Set the default ctdb socket path at configure time
    
    The ctdb socket path currently defaults to /tmp/ctdb.socket and can be
    modified at runtime using the --socket=filename option, common to both
    ctdb and ctdbd binaries.
    
    This change allows the default path to be set at configure time using
    the --with-socketpath=FILE argument. When not specified, the default
    path remains /tmp/ctdb.socket, so the documentation remains unchanged.
    
    Signed-off-by: David Disseldorp <ddiss at samba.org>
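
The resulting precedence (runtime option, then configure-time default, then /tmp/ctdb.socket) can be sketched in Python. The function names and the example path /var/run/ctdb/ctdbd.socket are hypothetical. Only the --socket option and the fallback path come from the commit.

```python
import argparse

FALLBACK_SOCKET = "/tmp/ctdb.socket"


def default_socket_path(configured=None):
    """Configure-time value if one was given, else the historical default."""
    return configured if configured else FALLBACK_SOCKET


def socket_path(argv, configured=None):
    """Resolve the socket path; a runtime --socket option always wins."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--socket", default=default_socket_path(configured))
    args, _unknown = parser.parse_known_args(argv)
    return args.socket


plain = socket_path([])
built = socket_path([], configured="/var/run/ctdb/ctdbd.socket")
runtime = socket_path(["--socket=/run/test.socket"],
                      configured="/var/run/ctdb/ctdbd.socket")
```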

commit 7d025281ee70c91ebcd4d9a908de1045a689786b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Sep 25 17:29:50 2012 +1000

    locking: Do not use ctdb_kill() to kill smbd processes
    
    ctdb_kill() is used to terminate processes spawned by CTDB.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit edbc8a6669b594d3c413d603e1c9fada9244c2ee
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jul 11 15:15:41 2012 +1000

    locking: Add database priority handling for older versions of samba
    
    In samba versions 3.6.x and older, database priorities are not set.
    later_db() function implements higher database priority (locking order)
    for these databases -
       brlock, g_lock, notify_onelevel, serverid, xattr_tdb
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
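
One way to picture the effect of later_db() is as a predicate that pushes the listed databases to the end of the locking order. The Python sketch below assumes substring matching on the database file name and a simple two-level ordering. Neither detail is confirmed by the commit message.

```python
# Database names taken from the commit message.
LATER_DBS = ("brlock", "g_lock", "notify_onelevel", "serverid", "xattr_tdb")


def later_db(name):
    """True if the database should be locked after the ordinary ones."""
    return any(db in name for db in LATER_DBS)


def lock_order(names):
    # Stable sort: ordinary databases first, "later" databases last.
    return sorted(names, key=later_db)


order = lock_order(["brlock.tdb", "locking.tdb", "serverid.tdb", "secrets.tdb"])
```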

commit c8eb4a3170ab8524e638047053831ba547e9cce8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 9 17:37:35 2012 +1000

    locking: Schedule a new lock request every time a lock is released
    
    Since the number of active lock requests is limited to
    MAX_LOCK_PROCESSES_PER_DB (= 100), new requests beyond that limit won't
    get scheduled when they are created. So schedule a pending request once
    the current active request is done.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
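
The scheduling rule can be sketched as a bounded set of active requests plus a pending queue. This Python model is an illustration under assumptions (class and method names are hypothetical). Only the limit MAX_LOCK_PROCESSES_PER_DB = 100 comes from the commit.

```python
from collections import deque

MAX_LOCK_PROCESSES_PER_DB = 100


class LockScheduler:
    def __init__(self, limit=MAX_LOCK_PROCESSES_PER_DB):
        self.limit = limit
        self.active = set()
        self.pending = deque()

    def request(self, req):
        self.pending.append(req)
        self.schedule()

    def schedule(self):
        # Start pending requests while there is room under the limit.
        while self.pending and len(self.active) < self.limit:
            self.active.add(self.pending.popleft())

    def release(self, req):
        self.active.discard(req)
        # Without this call a request queued at the limit would never start.
        self.schedule()


sched = LockScheduler(limit=2)
for name in ("a", "b", "c"):
    sched.request(name)
active_at_limit = set(sched.active)

sched.release("a")
active_after_release = set(sched.active)
```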

commit 2126795153dacb255e441abcb36ee05107b6282a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 14 16:12:48 2012 +1000

    ctdbd: Replace lockwait with locking API and remove ctdb_lockwait.c
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4456a01d8f54ca6c771d7488048de5f638477d21
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 9 15:17:21 2012 +1000

    ctdb_recover: Replace static locking functions with locking API
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 01ee86d2aafbcda658ef6acc2bba6d6781ae4047
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 9 15:09:51 2012 +1000

    ctdb_freeze: Replace locking functions with locking API
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit caff197edf6f928494028ac6c993901954aaa36f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 9 15:10:20 2012 +1000

    ctdbd_test: Include ctdb_lock.c code for test stubs
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1ee55c511b99e9f8a6fa4e34207267e953f09bae
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 17 15:25:46 2012 +1000

    tests: Fix statistics test for new output lines from locking API
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit e24b5bf283736624b387b0364d7200212bb3054b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 9 12:58:19 2012 +1000

    tools/ctdb: Display the locking statistics
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 1af99cf0de9919dd89af1feab6d1bd18b95d82ff
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Oct 11 11:29:29 2012 +1100

    ctdbd: locking: Provide non-blocking API for locking of TDB record/db/alldb
    
    This introduces a consistent API for handling locks on a single record, a
    complete db or all dbs. The locks are taken out in a child process. On
    timeout, find the processes that currently hold the lock and log them.
    
    Callback functions for locking requests take a locked boolean to indicate
    whether the lock was successfully obtained or not.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit be4051326b0c6a0fd301561af10fd15a0e90023b
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jun 6 11:50:25 2012 +1000

    common: Add routines to get process and lock information
    
    Currently these functions are implemented only for Linux.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a0cdfae7438092f5c605f0608daa536be860b7fe
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed May 9 12:56:53 2012 +1000

    header: Added DB statistics update macros
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 5ee242c949a98bb7397e0f7368b20d44c06fe772
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 16 17:04:48 2012 +1100

    scripts: Refactor logging code in initscript and functions file
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 2d75a04ba9a2e87a0dcb9bf778c58e335af1871c
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 11 16:21:02 2012 +1100

    tools/ctdb_diagnostics: Add "ctdb listvars" output
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 59a47c0674bacfebc17a1b44f0244727bf2fa7a4
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 11 16:18:26 2012 +1100

    initscript: Check that rc.ctdb is executable before running it
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 440892d75ef73c0aca22f47c0c01712be00cf5b7
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 11 16:10:19 2012 +1100

    ctdbd: Remove references to forcing running of eventscripts from log messages
    
    Running of eventscripts can be initiated from many places, including
    the recovery daemon.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 14589bf7c16ba017fe00d4e8bea8cc501546c60f
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 11 15:59:00 2012 +1100

    recoverd: Clarify some misleading log messages
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 59520c9785d113ad5063eb5fbe42a9efc7e30076
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 11 15:49:13 2012 +1100

    tools/ctdb: Remove extra header from natgwlist -Y output
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3cc878bc97fdac764a60ed805f64d649eaab06e8
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 11 15:17:54 2012 +1100

    recoverd: Verifying local IPs should only check for unhosted available IPs
    
    Currently it checks for unhosted IPs among the known IPs rather than
    available IPs.  This means that a takeover run can be flagged even
    when that takeover run will be unable to assign a known, unhosted IP.
    
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Signed-off-by: Martin Schwenke <martin at meltin.net>
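
The difference between the two checks can be shown with a small Python sketch. The data layout (a dict from IP to hosting node, with -1 meaning unhosted, plus a set of IPs this node could host) is an assumption for illustration and not the recoverd data structure.

```python
UNHOSTED = -1


def flag_takeover_known(known_ips):
    """Old check: any known IP that is unhosted triggers a takeover run."""
    return any(pnn == UNHOSTED for pnn in known_ips.values())


def flag_takeover_available(known_ips, available_ips):
    """New check: only unhosted IPs that could actually be assigned count."""
    return any(pnn == UNHOSTED and ip in available_ips
               for ip, pnn in known_ips.items())


known = {"10.0.0.1": 0, "10.0.0.2": UNHOSTED}
available = {"10.0.0.1"}          # 10.0.0.2 cannot be hosted right now

old_result = flag_takeover_known(known)
new_result = flag_takeover_available(known, available)
```

Here the old check flags a takeover run that could not assign 10.0.0.2 anyway, while the new check stays quiet.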

commit 16aba4eb620844626a1c71c58b51658caf44dea6
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Oct 11 14:34:37 2012 +1100

    Revert "Eventscripts - add facility to 10.interface to delete unmanaged IPs"
    
    This reverts commit 88f88d86b0d08240f749fb721b8c401c2eeb1099.
    
    This is dangerous and, on reflection, I can't see it being useful.
    There are often permanent IPs on interfaces that CTDB shares with its
    public IPs.

commit eaa7c165f58abd7e259c37d76b7dd37c91e13d9f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Sep 26 14:37:49 2012 +1000

    Eventscripts: "recovered" event should not fail on NATGW failure
    
    The recovery process has no protection against the "recovered" event
    failing, so this can cause a recovery loop.
    
    Instead of failing the "recovered" event, add a "monitor" event and
    fail that instead.  In this case the failure semantics are well
    defined.
    
    A separate patch should ban nodes if the "recovered" event fails for
    an unknown reason.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0e56e2dad1861892aa8ba59494ad244f2498314e
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Sep 28 09:39:12 2012 +1000

    Logging: Map TEVENT_DEBUG_FATAL to DEBUG_CRIT
    
    This is currently mapped to DEBUG_EMERG.  CTDB really has no business
    logging anything at EMERG level since the whole system is not about to
    abort or catch fire.  EMERG causes the message to appear on the
    console and on every terminal.  That's a bit overzealous!
    
    There would be very few situations where logs are being filtered at
    a level below ERROR, so CRIT should certainly suffice.
    
    The trigger for this was curious messages saying "No event for <n>
    seconds!" logged in a user's terminal.
    
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7895bc003f087ab2f3181df3c464386f59bfcc39
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Sep 6 20:22:38 2012 +1000

    common: Debug ctdb_addr_to_str() using new function ctdb_external_trace()
    
    We've seen this function report "Unknown family, 0" and then CTDB
    disappeared without a trace.  If we can reproduce it then this might
    help us to debug it.
    
    The idea is that you do something like the following in /etc/sysconfig/ctdb:
    
      export CTDB_EXTERNAL_TRACE="/etc/ctdb/config/gcore_trace.sh"
    
    When we hit this error then we call out to gcore to get a core file so
    we can do forensics.  This might block CTDB for a few seconds.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
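
The hook can be pictured as "run the program named in CTDB_EXTERNAL_TRACE, if any, and wait for it". The Python sketch below is not the ctdb_external_trace() implementation. In particular, passing the caller's PID as the only argument is an assumption.

```python
import os
import subprocess


def external_trace(env=None):
    """Run the configured trace helper, blocking until it exits.

    Returns the helper's exit status, or None if no helper is configured.
    """
    env = os.environ if env is None else env
    script = env.get("CTDB_EXTERNAL_TRACE")
    if not script:
        return None
    # Assumption: the helper receives the PID of the process to inspect.
    return subprocess.call([script, str(os.getpid())])


# With the variable unset nothing is executed.
result_unset = external_trace(env={})
```

Because the call waits for the helper, the caller is blocked for as long as the helper runs, which matches the commit's warning about blocking CTDB for a few seconds.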

commit af540ef728303b4a0a188b17c695e9aefab34489
Author: Michael Adam <obnox at samba.org>
Date:   Wed Oct 17 14:21:33 2012 +0200

    config/functions: fix a comment
    
    ctdb_check_counter_limits does not fail but succeeds if count >= limit
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 25d886060b138bc5e78fe93d7bebe3990264f29d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 17 11:38:37 2012 +1100

    doc: Add info about execute permissions on event scripts
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 36d25e96a2f8ae1461c5a708a2922f0475a39900
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 17 11:38:59 2012 +1100

    doc: Fix documentation for setup event
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 632c1b9c1cc2e242376358ce49fd2022b3f27aa2
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Sep 3 12:39:36 2012 +1000

    scripts: Remove duplicate code from init script to set tunables
    
    The tunable variables defined in CTDB configuration file are currently
    set up from init script as well as part of "setup" event in 00.ctdb
    eventscript.  Remove the duplication of this code and set tunable
    variables only from setup event.  During the "setup" event, it's possible
    that ctdb tool commands can time out if CTDB daemon is not ready.  To guard
    against such eventuality, wait till "ctdb ping" command succeeds before
    executing any other ctdb tool commands.
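    
    The wait described above could be sketched as below.  wait_until()
    is a hypothetical helper written for this illustration and the
    30 second limit is an assumption, not a value taken from the
    eventscript.
    
    ```shell
    # wait_until <timeout_secs> <command> [args...]
    # Hypothetical helper: retry once per second until the command succeeds.
    wait_until ()
    {
        _timeout="$1" ; shift
        _t=0
        while ! "$@" >/dev/null 2>&1 ; do
            _t=$(( _t + 1 ))
            if [ "$_t" -ge "$_timeout" ] ; then
                return 1
            fi
            sleep 1
        done
        return 0
    }
    
    # In a "setup" event this might be used as:
    #   wait_until 30 ctdb ping || exit 1
    #   ctdb setvar <tunable> <value>
    ```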
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 08dbd9c7958f9a0ee3de314d49523d32e4be135c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Oct 17 11:24:57 2012 +1100

    doc: Fix the hyperlink for "Testing CTDB" page
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit bd4ff176387372b1c233373c0bc8ced523fc9670
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 10 15:03:06 2012 +1100

    tests/eventscripts: add unit tests for policy routing reconfigure
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7d4b8cce96f33fff647a0c9d259c121dfc8403e9
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Oct 10 14:48:59 2012 +1100

    tests/eventscripts: add extra infrastructure for policy routing tests
    
    Less copying and pasting is a good thing...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c185ffd2822fcee26d07398464c59b66c61f53fa
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 3 10:54:30 2012 +1000

    Eventscripts: Add support for "reconfigure" pseudo-event for policy routing
    
    This rebuilds all policy routes and can be used if the configuration
    changes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9550c497e6d6ef5ee44826c4bd9ed5ad65174263
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 24 14:32:04 2012 +1000

    recoverd: Track failure of "recovered" event, banning culprits
    
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 56fcee3c7730cb12fa666072d5400949af6e5f7c
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 31 09:34:17 2012 +1000

    recoverd: When starting a takeover run disable IP verification
    
    Disable for TakeoverTimeout seconds.
    
    Otherwise the recovery daemon can get overzealous and start trying
    to add/delete addresses that it thinks are missing but where the
    eventscript just hasn't finished.  This didn't use to matter so much
    but it is more important now that concurrent takeip/releaseip/updateip
    events generate errors - we want to avoid spamming the log.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 11 14:46:07 2012 +1000

    ctdbd: Stop takeovers and releases from colliding in mid-air
    
    There's a race here where release and takeover events for an IP can
    run at the same time.  For example, a "ctdb deleteip" and a takeover
    initiated by the recovery daemon.  The timeline is as follows:
    
    1. The release code registers a callback to update the VNN.  The
       callback is executed *after* the eventscripts run the releaseip
       event.
    
    2. The release code calls the eventscripts for the releaseip event,
       removing IP from its interface.
    
       The takeover code "updates" the VNN saying that IP is on some
       iface.... even if/though the address is already there.
    
    3. The release callback runs, removing the iface associated with IP in
       the VNN.
    
       The takeover code calls the eventscripts for the takeip event,
       adding IP to an interface.
    
    As a result, CTDB doesn't think it should be hosting IP but IP is on
    an interface.  The recovery daemon fixes this later... but it
    shouldn't happen.
    
    This patch can cause some additional noise in the logs:
    
      Release of IP 10.0.2.133/24 on interface eth2  node:2
      recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it.
      Release of IP 10.0.2.133/24 rejected update for this IP already in flight
      recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed
      recoverd:Failed to release local ip address
    
    In this case the node has started releasing an IP when the recovery
    daemon notices the address is still hosted and initiates another
    release.  This noise is harmless but annoying.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a555940fb5c914b7581667a05153256ad7d17774
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 28 15:17:29 2012 +1000

    ctdbd: New tunable NoIPTakeoverOnDisabled
    
    Stops the behaviour where unhealthy nodes can host IPs when there are
    no healthy nodes.  Set this to 1 when an immediate complete outage is
    preferred when all nodes are unhealthy.  The alternative
    (i.e. default) can lead to undefined behaviour when the shared
    filesystem is unavailable.
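    
    As an illustration, the tunable might be enabled as shown below.
    This assumes the CTDB_SET_<Tunable> convention for tunables in the
    CTDB configuration file; the file location is an assumption too.
    
    ```shell
    # Assumed to go in /etc/sysconfig/ctdb, using the CTDB_SET_<Tunable>
    # convention for tunables that are applied in the "setup" event.
    CTDB_SET_NoIPTakeoverOnDisabled=1
    
    # On a running node the tunable could instead be set with:
    #   ctdb setvar NoIPTakeoverOnDisabled 1
    ```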
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit be4ad110ede9981b181ac28f31ffd855a879d5df
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 21 15:52:03 2012 +1000

    Eventscripts: Add service-start and service-stop pseudo-events
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7054e4ded59c6b8f254dcfefaef64da05f25aecd
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 15 15:28:14 2012 +1000

    ctdbd: Avoid unnecessary updateip event
    
    The existing code makes one fatally bad assumption:
    vnn->iface->references can never be -1 (or max-uint32_t in this case).
    Right now the reference counting is broken so a reference count of -1
    is possible and causes a spurious updateip when vnn->iface is the same
    as best_iface.  This can occur frequently because we get a lot of
    redundant takeovers, especially when each IP can only be hosted on one
    interface.
    
    This makes the code much more defensive by noting that when best_iface
    is the same as vnn->iface there is never a need for an updateip event.
    This effectively neuters the updateip code path when IPs can only be
    hosted by a single interface.
    
    This should obsolete 6a74515f0a1e24d97cee3ba05d89133aac7ad2b7.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c4f5a58471b206e2287c7958c7f29c1f1c0626ac
Author: Volker Lendecke <vl at samba.org>
Date:   Tue Oct 9 11:39:58 2012 +0200

    Correct include for ctdb_protocol.h
    
    With an old ctdb_protocol.h installed under /usr/local, ctdb will
    not compile because the <> form of include will find the header
    under /usr/local

commit 06dfd13604d08910e07cbf927c338d7b9fce9a2f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Sep 20 17:10:34 2012 +1000

    Revert "when creating/adding a public ip, set the initial interface to be the first interface specified"
    
    This reverts commit 4308935ba48ac7a29e7523315acf580019715f0f.
    
    This fixes 16_ctdb_config_add_ip.sh test when run against local daemons. When
    running against local daemons, if the interface is assigned as soon as an IP is
    added, then takeover would never assign this IP address.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 212298279557a2833ef0f81809b4a5cdac72ca02
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Oct 2 11:51:24 2012 +1000

    util: ctdb_fork() closes all sockets opened by the main daemon
    
    Do some other housekeeping including stopping tevent.
    
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3a3dae4cb5ec8b4b8381a4013adda25b87641f3a
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Sep 3 15:37:01 2012 +1000

    eventscripts: Auto-start/stop services in background
    
    If $CTDB_SERVICE_AUTOSTARTSTOP="yes" then service start/stop is done
    in the background with logging.
    
    Fix some unit tests for samba and winbind.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 34535ae64420926b9a3bf7d453fed4e6f4c90115
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 16 14:41:11 2012 +1000

    Eventscripts: split 50.samba into 49.winbind and 50.samba
    
    winbind and samba can be separately managed.  This makes the service
    starting and stopping code way too complicated, and even adds a small
    amount of complexity to the monitoring code.  The sensible option is
    to split this eventscript in two.
    
    There are two potentially backward incompatible changes here:
    
    * Functionality has been removed that allowed 50.samba to manage
      winbind when CTDB_MANAGES_WINBIND was unset but the smb.conf
      "security" parameter was set to "ADS" or "DOMAIN".
    
      Maintaining this functionality would have required moving the
      testparm-related code to the functions file, deciding where the
      cache file should go, and then calling it from both 49.winbind and
      50.samba.  This feature wasn't of great value and asking
      administrators to set an extra variable in exchange for code
      simplicity seems like a reasonable deal.
    
    * External code will need to be changed if it calls 50.samba directly
      with winbind-related expectations.  This is fairly obvious!
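    
    For the first point, a configuration that previously relied on the
    "security" parameter would now need something like the fragment
    below.  The file location is an assumption.
    
    ```shell
    # Hypothetical fragment of /etc/sysconfig/ctdb
    CTDB_MANAGES_SAMBA=yes
    # Must now be set explicitly, even with "security = ADS" or "DOMAIN"
    CTDB_MANAGES_WINBIND=yes
    ```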
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 043ef77086797a703aec436a26a05c56a1bcbf2b
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 21 14:28:37 2012 +1000

    Initscript: Kill any existing ctdbd processes if the ping succeeds
    
    Initialising a new ctdbd will destroy the Unix domain socket so
    existing processes will be useless anyway.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit dc2a8c638bd74b9f1dd75339cd2ae2f32ffa18a8
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 20 15:02:24 2012 +1000

    tools/ctdb: Free the event context
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b89e959904d7d1b0e5525abd7789f5101537a46a
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 20 14:30:35 2012 +1000

    libctdb: Add comments to effect that some controls return result in status
    
    These controls include:
    
      CTDB_CONTROL_GET_RECMODE
      CTDB_CONTROL_GET_RECMASTER
      CTDB_CONTROL_GET_PID
      CTDB_CONTROL_GET_PNN
      CTDB_CONTROL_PING
      CTDB_CONTROL_GET_DB_PRIORITY
    
    In these cases the data field is empty.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 6bd4feff7039138d435428eeded51975c44e567c
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 18 17:05:03 2012 +1000

    tests/tool: New tests for natgwlist, getcapabilities, lvs, lvsmaster
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0f0aef21a1bb2d88a8c184ef70c718e0c91acdc3
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 18 17:02:38 2012 +1000

    tests/tool: New function setup_natgw() to setup $CTDB_NATGW_NODES
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a56ec75edd1705b0539513d396d311f0e80a3bf5
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 18 16:59:19 2012 +1000

    tools/ctdb: Clean up control_natgw()
    
    * Factor out repeated code into new function find_natgw()
    * Support both machine and human readable output
    * Use libctdb
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c30ec02615183ecf9b412ad415bf1abd859aec45
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 18 16:57:01 2012 +1000

    tools/ctdb: Convert some commands over to libctdb
    
    control_getcapabilities(), control_lvs(), control_lvsmaster() updated
    to use ctdb_getcapabilities(), ctdb_getnodemap() as appropriate.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 81af67c6959fdbe0566e3f1a00e2be58dd268dc6
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 18 15:57:13 2012 +1000

    tests: libctdb stubs initial ctdb_getcapabilities() implementation
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a3f15d2828325bbfba5bc5c0a30429e2ce572a44
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 18 15:53:39 2012 +1000

    tests: libctdb stubs must copy pointers rather than just returning them
    
    Some code (e.g. NAT gateway code) modifies the returned result so was
    modifying the original.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 140fafef23050d40d66f5b5558c7efcb78f80cd2
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 18 14:24:08 2012 +1000

    libctdb: add ctdb_getcapabilities()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7b75a3bb722dc86139b1a07a0100d08c34620b91
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 21:25:27 2012 +1000

    tools/ctdb: Remove redundant filtering loop in control_natgwlist()
    
    This used to catch trailing blank lines.  However, these are caught
    just as effectively by the whitespace filtering in the loop below.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b29d5bbaa7048291c4b3a39bf12e04f0436f67da
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 21:15:57 2012 +1000

    tools/ctdb: natgwlist output is either human readable or machine readable
    
    The first line is currently human readable and the rest is machine
    readable.  This doesn't make sense.  Do one or the other...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 12a0a7a208d1c8fa8991894200d1dc133f3a2d1a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 21:09:46 2012 +1000

    tools/ctdb: Factor out printing of the machine readable status header
    
    It is already in 2 places and we might use it in another.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 2da7730dc06153173778ab14e228960e72ff8a86
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 16 14:24:39 2012 +1000

    tools/ctdb: NAT gateway code should use CTDB_NATGW_NODES
    
    ... not NATGW_NODES.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 93c97c3ba3ff714dfa0d056a91ff45010a6e2d66
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 20:46:58 2012 +1000

    tests/eventscripts: New policy routing test with invalid table ID
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit acdaa04079a9827885f32a7bc078d3365c89b474
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 20:45:23 2012 +1000

    tests/eventscripts: Modify ip stub to simulate invalid table ID
    
    This involves refactoring ip_route_check_table() into a new function
    ip_check_table() which takes the operation type (i.e. rule/route) as
    an argument.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5c3be8f26dcde0b1b3d86928953e74d4a8b35958
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 20:19:37 2012 +1000

    Eventscripts: Indent error when a route delete fails in 11.per_ip_routing
    
    This puts it under the umbrella of the previous warning that should
    also have been printed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 6d41208074f0e9b56c585bca7eb39aaed653c4ca
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 19 17:20:18 2012 +1000

    tests/eventscript: unit test for 13.per_ip_routing bogus route removal
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d0d0a6f19960f233224970b8d5d19b0e37222616
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 15 17:22:02 2012 +1000

    eventscripts: 13.per_ip_routing should remove bogus routes on ipreallocated
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0ce5b079f327aba55b62800ccb22d79976fac665
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jun 13 13:53:18 2012 +1000

    tests/eventscripts: Add a policy routing unit test for "ip rule del" failure
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 30d69defa7e97ab5e3ba0492a27868dde2616494
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jun 13 13:49:49 2012 +1000

    eventscripts: Print a warning on failure to delete a routing rule
    
    del_routing_for_ip() currently fails silently, which could hide real
    errors.
    
    In add_routing_for_ip() we don't want to see any error when calling
    del_routing_for_ip(), since we don't expect the rule to be there.
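    
    The pattern can be sketched as follows.  This is not the eventscript
    code: the argument lists, the warning text and the $IP_CMD override
    (which lets a stub stand in for ip(8)) are assumptions made for the
    illustration.
    
    ```shell
    # Sketch only: arguments and messages are assumptions, not the real
    # eventscript code.  ${IP_CMD:-ip} lets a stub replace ip(8) in tests.
    del_routing_for_ip ()
    {
        _ip="$1" ; _pref="$2" ; _table="$3"
        if ! ${IP_CMD:-ip} rule del from "$_ip" pref "$_pref" table "$_table" ; then
            echo "WARNING: failed to delete routing rule for $_ip" >&2
        fi
    }
    
    add_routing_for_ip ()
    {
        _ip="$1" ; _pref="$2" ; _table="$3"
        # The rule is not expected to exist yet, so hide any complaint
        del_routing_for_ip "$_ip" "$_pref" "$_table" 2>/dev/null
        ${IP_CMD:-ip} rule add from "$_ip" pref "$_pref" table "$_table"
    }
    ```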
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 49dd755fcd077c84eaf3d2fe5dd7757f5588d49c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Aug 17 13:06:12 2012 +1000

    doc: Fix path string of /etc/sysconfig/ctdb file
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fc18188b7b63eb0dafbc47e3abf80e306e1dfc31
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 6 20:43:46 2012 +1000

    recoverd: All inactive nodes should yield recovery master role
    
    Not just stopped nodes.  In reality, this means that banned nodes will
    also yield, since nodes in the other inactive states won't be running
    a daemon.
    
    This seems sensible since if another node notices that an inactive
    node is the recovery master then it will force an election anyway.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e7dc10da3ced54ea9d719ad167ee42dcca8dce75
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 6 20:36:48 2012 +1000

    recoverd: An inactive node should not force recovery master elections
    
    An inactive node can't become the recovery master.  So if an inactive
    node notices that the recovery master is inactive, it shouldn't force
    an election for recovery master and nominate itself as a candidate.
    This can cause the recovery master to flip-flop between nodes when all
    nodes are inactive.
    
    If there is actually an active node then it will trigger the election.
    
    This is fairly cosmetic but is a step along the way towards ironing
    out weirdness when all nodes are stopped.
    
    Also, fix a related comment.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit a0c30c820fd47d4f8620dc060c825be10754f5d1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 3 10:30:29 2012 +1000

    recoverd: main_loop() should not verify local IPs if node is stopped
    
    Doing these checks is pointless and potentially causes unnecessary log
    messages.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f586e8a2911fc6e7f6698f516653145d8fd45dad
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 3 10:15:25 2012 +1000

    recoverd: verify_local_ip_allocation() should dup ifaces before early return
    
    If CTDB starts in STOPPED state then it thinks it is in the middle of
    a recovery.  rec->ifaces is also NULL and an early exit further down
    (that checks to see if a recovery is in process) means that it stays
    that way.
    
    However, each time this function is entered the need for a takeover
    run is re-flagged.  The takeover run never happens due to the
    early exit, causing a couple of unneeded messages to be logged each
    time.
    
    This is avoided by moving the code that sets rec->ifaces so that it is
    executed earlier and, in this case, in the middle of a recovery.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit cc9d96f4248e45ea99c5f00db1526426ac26fbc2
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 2 17:26:04 2012 +1000

    recoverd: Update a log message that has bit-rotted
    
    This message used to be correct because the ipreallocated event only
    handled updating the NAT gateway.  However, that has changed so the
    message needs to be updated.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 9119a568c2b4601318f7751f537dca2f92a7230b
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 22 14:01:02 2012 +1000

    recoverd: Fix bogus info in message about changed flags
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c29a943f9bbcfecb861e71d007c7698a53dc8773
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 30 12:51:43 2012 +1000

    tests/eventscripts: Extra cases for policy routing missing config test
    
    Test the startup and monitor events too.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c64c6c77c3f6aa2898e5a575547b587bea868c76
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 30 12:51:12 2012 +1000

    Eventscripts: 13.per_ip_routing should always fail if config is missing
    
    Currently, if the configuration file is specified by
    $CTDB_PER_IP_ROUTING_CONF but is missing, takeip fails but (the
    absent) monitor event "succeeds", so the state of a node will
    flip-flop.
    
    Instead of this, if the configuration file is missing then fail early
    on for all events.
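    
    A sketch of such an early check follows.  Only
    $CTDB_PER_IP_ROUTING_CONF comes from the description above; the
    function name and the message are hypothetical.
    
    ```shell
    # Sketch: $CTDB_PER_IP_ROUTING_CONF is from the text above, the function
    # name and the message are hypothetical.
    check_per_ip_routing_conf ()
    {
        # Feature not configured: nothing to check
        [ -n "$CTDB_PER_IP_ROUTING_CONF" ] || return 0
    
        if [ ! -f "$CTDB_PER_IP_ROUTING_CONF" ] ; then
            echo "error: $CTDB_PER_IP_ROUTING_CONF is missing" >&2
            return 1
        fi
    }
    
    # Called before the per-event case statement, so that every event
    # (startup, monitor, takeip, ...) fails in the same way:
    #   check_per_ip_routing_conf || exit 1
    ```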
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5277d749c9111716fd723647d5421907476422bf
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 30 11:50:53 2012 +1000

    Revert "Eventscripts - make 13.per_ip_routing fail gracefully if config is missing"
    
    When the configuration file is missing this causes the node to
    flip-flop between unhealthy (when takeip fails) and healthy (no monitor
    event here).
    
    Will reimplement this properly.
    
    This reverts commit 351ca413eec460330571ca8b01ad269728fe15df.

commit 076282622fcb2663d378e0c90ed0d9c19f73c005
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 6 20:35:23 2012 +1000

    ctdb tool: recmaster command might as well be auto-all
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit fa0f3cba5adaa38bed37dd8b121ad53e962a010d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 16:52:04 2012 +1000

    doc: Document the new onnode -P option
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit aed9b98ddbbf3e81de4f7257a10676565f7d7507
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 16:45:55 2012 +1000

    tools/onnode: Add -P option to push files to given nodes
    
    A list of files is given rather than a command.  These files are
    pushed to the specified nodes.
    
    Quoting is fragile/broken so filenames with spaces won't work - you
    win some, you lose some.  :-)
    
    All of the other onnode options should work together with this option.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 96fdda124f5511fb76190e7c7a7f0b98e6b01a31
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 20:13:45 2012 +1000

    Eventscripts: Clean up 11.routing
    
    The loops can all be done without cat or grep.
    
    The pair of loops in updateip is combined into a single loop.
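    
    A loop of this kind, reading a file without cat or grep, might look
    like the sketch below.  It assumes a routes file with
    "IFACE DEST GW" on each line; list_routes_for_iface() is a
    hypothetical name and not a function from 11.routing.
    
    ```shell
    # Sketch: assumes a routes file with "IFACE DEST GW" on each line.
    # list_routes_for_iface() is a hypothetical name.
    list_routes_for_iface ()
    {
        _want="$1" ; _file="$2"
        while read _iface _dest _gw ; do
            case "$_iface" in
            "#"*|"") continue ;;    # skip comments and blank lines
            esac
            if [ "$_iface" = "$_want" ] ; then
                echo "$_dest via $_gw"
            fi
        done <"$_file"
    }
    ```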
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 553455b386aa7848a516a921dfc14eb87c8a3fc1
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 4 07:21:01 2012 +1000

    ctdbd: Log a meaningful message if the nodes file/list is empty
    
    Right now the message says it can't bind to any of the
    addresses... even when there aren't any!
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3880589db4d563e438126cf5080261fa06b9e242
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 2 17:15:42 2012 +1000

    ctdbd: Remove the word "Forced" from message about running eventscripts
    
    The eventscripts are run after a takeover run and in this case they're
    not forced.  The message seems to imply that someone has run "ctdb
    eventscript" when that is not necessarily the case.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 38e8651b955afdbaf0ae87c24c55c052f8209290
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 2 14:09:32 2012 +1000

    ctdbd: Fix ctdb_control_release_ip() on local daemons
    
    When running on local daemons no IPs are actually assigned to
    interfaces.  Commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e broke
    ctdb_control_release_ip() for local daemons because it asks the system
    which interface the given IP is on, instead of the old behaviour of
    trusting CTDB's internal records.
    
    For local daemons (i.e. !ctdb->do_checkpublicip) revert to the old
    behaviour of looking up the interface internally.  This is good
    enough, given that the tests don't tend to misconfigure the addresses.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5b2725d1ae052e848c2487cb10c5393a877d118c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 15:45:45 2012 +1000

    Initscript: clean up drop_all_public_ips()
    
    This makes the case implicit where $CTDB_PUBLIC_ADDRESSES is unset.
    This is OK because that's not an interesting code path.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 6616a5712b5d4db2b9ba6a88cec79378696c2184
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 20 17:00:12 2012 +1000

    tests/tool: Run ctdb_tool_* under $VALGRIND
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7ef9916bd95ff2472359a412eac5489f1aad2dce
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 4 07:29:18 2012 +1000

    tests/eventscripts: Rewrite the testparm stub
    
    It currently needs the real testparm command installed even though it
    only uses limited features.  It is easy enough to fake up the
    functionality that 50.samba uses.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 3f268805c14c51f23024267916eae161bada8a0e
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 3 13:05:58 2012 +1000

    tests/complex: Fix broken ctdb_test_check_real_cluster()
    
    It doesn't set $h at all...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8d17dacee415dd0b4268805a366a86f83e33f27c
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 2 14:18:51 2012 +1000

    tests/simple: ctdb stop/continue tests weren't actually checking IPs
    
    The correct variable is $test_node_ips, not $ips.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 2fd0157382b42aa5c5212b8e743c6f589edc6662
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 2 14:06:35 2012 +1000

    tests: select_test_node_and_ips() should try to avoid failing
    
    Sometimes "ctdb sync" doesn't do its job, so we end up with unassigned
    IPs.
    
    If $test_node isn't set then this is bad.  However, try a few times to
    ensure it is set.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 47180dc75d15f3d61470705603565b718491c9f8
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 2 14:05:21 2012 +1000

    tests: simple tests against local daemons should check $TEST_LOCAL_DAEMONS
    
    Not the old $CTDB_TEST_REAL_CLUSTER - it doesn't exist anymore...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 619af3e857c2ced3840abfd86135cc954796da97
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jun 20 15:57:48 2012 +1000

    tests: run_tests should exit with $status with -e option
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 6e7bd9685406ae024d413a5d9d8c6e0d89b15567
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 14 19:37:39 2012 +1000

    tests/simple: ctdb reloadips test should use $test_ip
    
    There's no point recalculating this value.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f02e501342112aab67aee95f253e29a670b29273
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 14 19:36:04 2012 +1000

    tests: select_test_node_and_ips() should never select non-node -1
    
    Instead of selecting the 1st pnn found, select the 1st one that isn't -1.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 21a5cbf9518fafc610939f14874371a52b1dc8b3
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jul 26 22:01:50 2012 +1000

    util: Do not lock down memory when running with local daemons
    
    Thanks to Ronnie for highlighting the issue of memory lockdown on AIX.
    Fix typo, use getuid and not getpid.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 25d45e69f4ffc2b26061ac13038d52a353e79e61
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 5 16:27:54 2012 +1000

    statd-callout: Fix a bug in the calculations of $STATE
    
    It is just meant to be even, so divided *and* multiplied by 2.  Use
    $(( )) to make it more readable.
    
    While touching this code, make the related calculation a bit more
    readable too.
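    
    The arithmetic can be sketched as below.  even_state() is a
    hypothetical helper, and how statd-callout derives its input value
    is not shown here.
    
    ```shell
    # Hypothetical helper: round a counter down to the nearest even number
    # using only shell arithmetic (integer division, then multiplication).
    even_state ()
    {
        echo $(( $1 / 2 * 2 ))
    }
    ```
    
    With this sketch, "even_state 7" prints 6 and "even_state 8" prints 8.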
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 624f4677e99ed1710a0ace76201150349b1a0335
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 24 11:23:09 2012 +1000

    Eventscripts: Default route on NAT gateway should have a metric of 10
    
    At the moment routes from 11.routing can fail to be added because they
    conflict with the default route added by 11.natgw.
    
    NAT gateway is meant to be a last resort, so routes from 11.routing
    should override it.
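    
    A sketch of adding such a route follows.  The function name and the
    $IP_CMD override (which lets a stub stand in for ip(8)) are
    assumptions made for the illustration and are not taken from
    11.natgw.
    
    ```shell
    # Sketch: add a last-resort default route via the NAT gateway.
    # add_natgw_default_route() and $IP_CMD are hypothetical names.
    add_natgw_default_route ()
    {
        _gw="$1"
        # A metric of 10 lets routes with a lower metric take precedence
        ${IP_CMD:-ip} route add 0.0.0.0/0 via "$_gw" metric 10
    }
    ```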
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 5d713d5e5be67f5914a661694c15d938bd67dea3
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 20:10:11 2012 +1000

    Eventscripts: Update/remove stale comments in 11.natgw
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 630cfe6451ba23d959fa4907fbba42702337ed3b
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 15:39:50 2012 +1000

    Eventscripts: Retrieve and build NAT gateway details better in 11.natgw
    
    * "ctdb natgw" is run twice when it doesn't need to be.
    
    * Tweak the parsing of "ctdb natgw" output so that it is done by the
      shell instead of a bunch of external processes.
    
    * Make default NAT gateway be -1, even on error.  If the process
      failed entirely then it could previously be empty.
    
    * Streamline the error handling using die() for when there is no NAT
      gateway.
    
    * Downcase script-local variable names.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 34f58a0773618c4508a55ad75fc4602dad5a5f4c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 15:37:14 2012 +1000

    Eventscripts: Optimise building the host address in 11.natgw
    
    It can be built without forking unnecessary processes.
    
    Also downcase variable name because it is local to script.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f6e421e8bf935cae790a6dc2b861eb9c7f8610b4
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 15:32:38 2012 +1000

    Eventscripts: Clean up startup sanity check in 11.natgw
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 07149edaecb3caa672163e5a3b89715557d5205a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 15:26:16 2012 +1000

    Eventscripts: remove redundant firewall rules from 11.natgw
    
    aeb70c7e7822854eb87873a5c7783e27e6e72318 said it moved these but it
    redundantly duplicated them instead.  That commit also fixed the
    problem because it moved the rules after delete_all() not out of the
    startup event as claimed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e20fdb974158061f4627d6f360c168d764690e6f
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 17 15:21:10 2012 +1000

    Eventscripts: 11.natgw $CTDB_NATGW_PUBLIC_IP splitting optimisation
    
    $CTDB_NATGW_PUBLIC_IP can be split into $_ip and $_maskbits without
    forking lots of processes.
    
    Also "local" isn't supported by POSIX.
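
    One way to do such a split without forking is POSIX parameter
    expansion. This is a sketch only: the commit does not show the code,
    and the sample address is made up.

```shell
# Split an "address/maskbits" value using only POSIX parameter expansion.
# The sample value is made up for illustration.
CTDB_NATGW_PUBLIC_IP="10.0.0.227/24"

_ip="${CTDB_NATGW_PUBLIC_IP%/*}"        # strip the shortest suffix matching "/*"
_maskbits="${CTDB_NATGW_PUBLIC_IP#*/}"  # strip the shortest prefix matching "*/"

echo "ip=$_ip maskbits=$_maskbits"
```

    Both expansions are evaluated by the shell itself, so no external
    process is started.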
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit b3e798f357606648f04d8a67ffee775b34fdede7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 24 17:27:22 2012 +1000

    web: Add my name to the developer list.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 538c68d0e83e14f0000981ee06408b8f0035be37
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 15 11:05:00 2012 +1000

    Remove tevent_loop_allow_nesting()
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 3de2830ae68241ee95bcc14dc1bb896ff18d86ce
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jun 6 16:19:10 2012 +1000

    ctdbd: Return explicit boolean values for function returning bool
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 25f84797a64a683c303b04057aa8113e9fc47c49
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jun 6 16:16:15 2012 +1000

    util: Do not try to lockdown memory when running in local daemons mode
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit d29e1880c8ce7219e065d31b47b0e8ad9e83146d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 15 15:07:04 2012 +1000

    Fix compiler warnings.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a0a0f5588445aeabe07b0e4d65087db454dc09da
Author: Michael Adam <obnox at samba.org>
Date:   Tue Jul 3 11:50:05 2012 +0200

    run_tests: improve spacing

commit 0e515115b3c21cb179fd7a6356164ac1b5d423e0
Author: Michael Adam <obnox at samba.org>
Date:   Tue Jul 3 11:46:26 2012 +0200

    run_tests.sh: fix a comment

commit 85a367005bd669309bb7e532b60d27621110180d
Author: Michael Adam <obnox at samba.org>
Date:   Tue Jul 3 14:28:36 2012 +0200

    ctdb: use correct "persistent" state for ctdb_attach in "ctdb cattdb"
    
    Originally, "ctdb cattdb" attached explicitly as non-persistent, which
    is now forbidden for persistent databases by the server.
    
    Pair-Programmed-With: Gregor Beck <gbeck at sernet.de>

commit 1ebbaa620b3cfb9ff373828e4aaa84246cf3ec25
Author: Gregor Beck <gbeck at sernet.de>
Date:   Thu Jun 21 10:26:03 2012 +0200

    ctdbd: refuse attaching with "persistent" to a non-persistent db and v.v.
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed Jun 20 15:10:05 2012 +1000

    When we find an ip we shouldn't host, just release it
    
    Don't call a full-blown clusterwide ipreallocation, just release it locally

commit c6bf22ba5c01001b7febed73dd16a03bd3fd2bed
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed Jun 20 10:08:11 2012 +1000

    When we release an ip, get the interface name from the kernel
    
    instead of using the interface where ctdb thinks the ip is hosted.
    The difference is that this now allows us to handle cases where we want to release an ip but ctdbd does not know which interface the ip is assigned to
    (user has used 'ip addr add...' and manually assigned an ip to the wrong interface)

commit f07376309e70f5ccdb7de8453caacc71b451ab48
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed Jun 20 13:32:02 2012 +1000

    Add new command to find which interface an ip is located on

commit 8307c70ed98996b430c470e9641a09fdeeb81bd8
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed Jun 13 16:17:18 2012 +1000

    STATISTICS: Add tracking of the 10 hottest keys per database measured in hopcount
    
    and add mechanisms to dump it using the ctdb dbstatistics command

commit 98e1b46adba11b9549b5c5976e1f561fe732fa6e
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 7 15:08:15 2012 +1000

    Reimplement logging of long running events
    
    Reimplement 5aba53e6adcfcd7edbdac9e30aa5fcba176aca00 using tevent
    trace points.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 0dc204988eadff214dd149a756d756ab6e96e410
Author: Stefan Metzmacher <metze at samba.org>
Date:   Fri Jun 8 12:50:21 2012 +0200

    tevent: change version to 0.9.16
    
    This adds tevent_*_trace_*() and tevent_context_init_ops()
    
    metze
    
    Autobuild-User(master): Stefan Metzmacher <metze at samba.org>
    Autobuild-Date(master): Fri Jun  8 20:47:41 CEST 2012 on sn-devel-104

commit 7ebc00dc6a89043a971a720e7c21baf5f2a0233d
Author: Stefan Metzmacher <metze at samba.org>
Date:   Fri May 11 15:19:55 2012 +0200

    tevent: expose tevent_context_init_ops
    
    This can be used to implement wrapper backends,
    while passing a private pointer to the backend's init function
    via ev->additional_data.
    
    metze

commit cb2bbe93628c1ab932c2e1ad6e2ec199a98f74c6
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 5 16:00:07 2012 +1000

    lib/tevent: Add trace point callback
    
    Set/get a single callback function to be invoked at various trace
    points.  Define "before wait" and "after wait" trace points - more
    trace points can be added later if required.
    
    CTDB wants this to log long waits and events.
    
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Signed-off-by: Stefan Metzmacher <metze at samba.org>

commit 88040778aace229d724de1ba7556aded12e22f86
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 7 14:20:13 2012 +1000

    Revert "TEVENT: Add back tracking of long runnig  events to the local copy of tevent library"
    
    This reverts commit 5aba53e6adcfcd7edbdac9e30aa5fcba176aca00.
    
    Do this using new tevent trace point callback.

commit e0c9200c05b1f7a04e002f505ebb5ba9340c0ca1
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jun 7 12:26:02 2012 +1000

    lib/tevent: In poll_event_context, add a pointer back to the tevent_context
    
    This makes it consistent with the other backends.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Signed-off-by: Stefan Metzmacher <metze at samba.org>

commit 6559106b8b853920f325f2dba532f4008e931fa3
Author: Stefan Metzmacher <metze at samba.org>
Date:   Mon May 14 11:48:00 2012 +0200

    lib/tevent/testsuite: no longer use 'compat' symbols
    
    metze

commit 1a6a011c772f7d302d114d7c8a151fa7820ec85f
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed May 30 11:50:13 2012 +1000

    Run the shutdown eventscript before we tear down the transport
    
    This allows eventscripts to still be able to call and use ctdb during the shutdown phase.

commit ac89da4eea98fa686408c5671a6c44c0fd1d7a58
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 25 15:57:14 2012 +1000

    tests: Increment RSN always in ctdb_update_record_persistent test
    
    If the record does not exist in persistent DB, RSN for that record is
    considered 0. To write a record, RSN for that record should be set to 1,
    otherwise the RSN check would fail.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 0be452958db95c8253c362a1c08a1966e53a1f99
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 25 11:40:38 2012 +1000

    tests: Fix ctdb_fetch test (parse extra lines of output)
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit bc55e09fdac9f743d6428bfe0be77840ad0fd1ba
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 24 16:46:07 2012 +1000

    tests: Fix flakey behavior of ctdb_fetch test
    
    There were two issues with this test:
    
    1. Since the messages are sent from one node to the next, if a node
       does not register for messages before CTDB on that node receives
       the message, it will never be seen by ctdb_fetch and it would
       block on receive and would not send any messages to next node.
       The crude solution is to sleep just before the messages are sent,
       so that ctdb_fetch on all nodes have registered for the messages.
    
    2. If ctdb_fetch stops sending messages after timelimit expiry, the
       next node will keep waiting to receive messages in event_loop_once().
       The default timeout is 30 seconds for event_loop_once(). Adding a
       timed event will always set the timeout value to the time remaining
       for the timed event to expire.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 6e5cbe8fff71985e5a2fc16b7e9f2b868011ff5d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 17 16:08:37 2012 +1000

    server: Replace BOOL datatype with bool, True/False with true/false
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fd3b73d7e634f16cbb99d7d5a548e12f00d1aadb
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri May 25 11:44:56 2012 +1000

    tests/eventscripts: Tweak expected output for lockd:b restart
    
    Commit 13acd58c41fba1a33894fbd654fed69ea0eac322 made this test fail,
    since lockd:b and lockd:bs were incorrectly producing the same output.

commit 14012781c3751a514055df29ea70adfb12ecb2d9
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 23 15:36:01 2012 +1000

    tests: Complex tests must not be run from a cluster node
    
    Tickle tests fail if run from a node involved in the test.
    
    The condition is actually weaker than this: the test can't be run from
    a CTDB node that is hosting public addresses that may be used by the
    test.
    
    Rework ctdb_test_check_real_cluster() to support checking this.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 7640352c6697f9d4e0d13afbc8523afc64e7d462
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 23 14:24:40 2012 +1000

    Eventscripts: Fix deprecated iptables ! usage
    
    This currently causes warnings in the logs.
    
    This change is not SLES10-compatible but we already have some other
    non-SLES10-compatible changes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c5e3e4bccbde349739b90d8761e1aa19637887a8
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 22 11:24:05 2012 +1000

    tests: test_wrap needs to set TEST_BIN_DIR when installed
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d0b539c4d2d4dc8c9eb95801bff09c3bcbeebca5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri May 18 12:59:41 2012 +1000

    packaging: make ctdb-tests package depend on nc
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 61df417821762d87ed01a7b5e64c35079940344d
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu May 10 16:59:39 2012 +1000

    tests: Use per node log files when running tests with local daemons
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 03fa2a517247eb2adfba67248e2466f17ea14418
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri May 25 12:31:11 2012 +1000

    RECOVERY: Increase the time we allow before timing out recovery related tasks.
    
    If the system is temporarily taking unusually long to perform these tasks it is better to wait a lot longer and allow the tasks to complete than timing out repeatedly and then becoming banned.

commit 1f262deaad0818f159f9c68330f7fec121679023
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri May 25 12:27:59 2012 +1000

    RECOVER: When we pull databases during recovery, we used to reallocate the databuffer for each entry added. This would normally not be an issue, but for cases where memory is fragmented, this could start to cost significant cpu if we need to reallocate and move to a different region.
    
    Change this to instead preallocate, by default, 10MByte chunks to the data buffer.
    This significantly reduces the number of potential reallocate and move operations that may be required.
    
    Create a tunable to override/change how much preallocation should be used.

commit 6cf6a9b071bd8dd730717ca033337ff73bf247bb
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Mon May 21 14:01:04 2012 +1000

    DOCS: Document the new tunables to produce warnings if databases grow unexpectedly big.

commit 9ed58fef4991725f75509433496f4d5ffae0ae87
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Mon May 21 13:11:38 2012 +1000

    DEBUG: Add checks for and print debug messages when 1) a database contains very many records, 2) when a database is very big, 3) when a single record is very big.
    
    Add tunables to control when to log these instances and allow it to be completely turned off by setting the threshold to 0

commit 5aba53e6adcfcd7edbdac9e30aa5fcba176aca00
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Mon May 21 09:17:05 2012 +1000

    TEVENT: Add back tracking of long runnig  events to the local copy of tevent library

commit f59b40b3f8ea3da8ffb8601bc025e83c237072d5
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Thu May 17 11:16:57 2012 +1000

    GANESHA: make the ganesha script executable by default

commit f23b5a160184db8c92f8c69307dc4a64adae839d
Merge: 6e68797af67bee36f2bad045f94806e7e98f27e9 637cab6304dae66b85668506028c76ea1ee88980
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Thu May 17 11:48:07 2012 +1000

    Merge remote branch 'martins/ganesha'

commit 6e68797af67bee36f2bad045f94806e7e98f27e9
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Thu May 17 10:17:51 2012 +1000

    Debug: When scripts hang, we may need to collect additional data in order to debug why the script hung.
    
    Break this debug and data collection out into an external script to make it easier to modify what data we need to collect.
    For now we only collect a pstree so we can see what part of the script we hung in.
    
    S1037271

commit 637cab6304dae66b85668506028c76ea1ee88980
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 16 17:24:21 2012 +1000

    Eventscripts: Modernise 60.ganesha to match 60.nfs
    
    Originally from Srikrishan Malik <srikrishan.malik at in.ibm.com> with
    some style changes by me.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 13acd58c41fba1a33894fbd654fed69ea0eac322
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 16 13:29:58 2012 +1000

    Eventscripts: restart lockd in the background when going unhealthy
    
    Sometimes the restart can hang when there are I/O problems.  Then the
    eventscript times out and gets killed so the node is never marked as
    unhealthy.
    
    Restarting in the background avoids this.
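
    The idea can be sketched as follows. This is not the code from the
    eventscript: restart_lockd() and go_unhealthy() are hypothetical
    names, and the sleep merely simulates a restart that blocks on I/O.

```shell
# Hypothetical stand-in for a restart that may block on I/O.
restart_lockd ()
{
    sleep 1
}

go_unhealthy ()
{
    # Detach the restart and its output so the caller is not held up
    # and cannot be killed by a timeout while waiting for it.
    restart_lockd >/dev/null 2>&1 &
    echo "lockd restart started in background"
    return 1    # non-zero status reports the failure to the caller
}
```

    Because the function returns straight away, the failure status is
    reported even if the restart itself never completes.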
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 92f74fd589467b46c758e116e97417edfe8773d7
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 8 14:53:58 2012 +1000

    Eventscript functions: add optional version to nfs_check_rpc_service()
    
    This can be optional because the 1st item of each action-triple is a
    test comparison that starts with '-'.
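
    A sketch of how such an optional argument can be detected. It is not
    the real nfs_check_rpc_service(): parse_service_args() is a
    hypothetical helper and the arguments in the example are made up.

```shell
# Hypothetical parser, not the real nfs_check_rpc_service().
parse_service_args ()
{
    _name="$1" ; shift
    _version=""
    case "$1" in
    ""|-*) : ;;                     # no version: next item is a comparison
    *)     _version="$1" ; shift ;; # an explicit version was supplied
    esac
    echo "name=$_name version=${_version:-default} rest=$*"
}

parse_service_args mountd -ge 3 verbose     # version omitted
parse_service_args mountd 3 -ge 3 verbose   # version given
```

    Since a version never starts with '-', a single look at the next
    argument is enough to tell the two call forms apart.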
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

-----------------------------------------------------------------------


-- 
CTDB repository

