[SCM] Samba Shared Repository - branch master updated
Amitay Isaacs
amitay at samba.org
Mon Oct 8 03:37:02 UTC 2018
The branch, master has been updated
via 80f3f7c ctdb-tests: Improve counting of database records
via 52dcecb ctdb-tests: Add extra debug to large database recovery test
via d67d8ed ctdb-tests: Shut down transaction_loop clients more cleanly
via 2aa006a ctdb-tools: Have onnode pass -n option even when regular ssh not in use
via 6ac5124 ctdb-tests: Support closing of stdin in local daemons ssh stub
via 0dfb3c8 ctdb-tests: Be more careful when building public IP addresses
via 36eb738 ctdb-tests: Be more careful when building node addresses
via 03dddc3 ctdb-tests: Don't format IPv4 octets as hex digits
via 0eabac5 ctdb-tests: Be more efficient about starting/stopping local daemons
via a9ac330 ctdb-tests: Do not use ctdbd_wrapper in local daemon tests
via 8bde6fa ctdb-tests: Don't remove non-existent test database directory
via f2e4a5e ctdb-tests: Drop unused function maybe_stop_ctdb()
via 2cd6a00 ctdb-tests: Explicitly check for local daemons when shutting down
via 90f6b0a ctdb-tests: Drop functions daemons_start(), daemons_stop()
via f1ede41 ctdb-tests: Don't used daemons_start()/daemons_stop() directly in tests
via 4642a34 ctdb-tests: Rename _ctdb_start_all() -> ctdb_start_all()
via f57e5bb ctdb-tests: Rename ctdb_start_all() -> ctdb_init()
via a66a969 ctdb-tests: Drop ps_ctdbd()
via 83b3c56 ctdb-tests: Drop code for RECEIVE_RECORDS control
via 2f89bd9 ctdb-protocol: Drop marshalling code for RECEIVE_RECORDS control
via 81dae71 ctdb-protocol: Mark RECEIVE_RECORDS control obsolete
via d18385e ctdb-daemon: Drop implementation of RECEIVE_RECORDS control
via e15cdc6 ctdb-vacuum: Remove unnecessary check for zero records in delete list
via ef05239 ctdb-vacuum: Fix the incorrect counting of remote errors
via 202b902 ctdb-vacuum: Simplify the deletion of vacuumed records
via dcc9935 ctdb-tests: Add recovery record resurrection test for volatile databases
via c4ec99b ctdb-daemon: Invalidate records if a node becomes INACTIVE
via 040401c ctdb-daemon: Don't pull any records if records are invalidated
via 71896fd ctdb-daemon: Add invalid_records flag to ctdb_db_context
from 6784ff2 ctdbd_conn: Generalise inaccurate error message
https://git.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit 80f3f7c1889d225dcc1e7841e28e9a3f7918c99c
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Oct 5 10:34:29 2018 +1000
ctdb-tests: Improve counting of database records
Record counts are sometimes incomplete for large databases when
relevant tests are run on a real cluster.
This probably has something to do with ssh, pipes and buffering, so
move the filtering and counting to the remote end. This means that
only the count comes across the pipe, instead of all the record data.
Instead of explicitly excluding the key for persistent database
sequence numbers, just exclude any key starting with '_'. Such keys
are not used in tests.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
Autobuild-Date(master): Mon Oct 8 05:36:11 CEST 2018 on sn-devel-144
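
The filtering rule described in this commit can be illustrated with a small C sketch. This is only an illustration under assumptions: the real change lives in the shell test scripts, and the helper name count_test_records() and the sample keys used with it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the keys that belong to test data.  Any key starting with '_'
 * is skipped, which also covers the internal key that holds a
 * persistent database's sequence number.
 * Hypothetical helper, not part of the Samba tree.
 */
static size_t count_test_records(const char *const keys[], size_t num_keys)
{
	size_t count = 0;
	size_t i;

	assert(keys != NULL || num_keys == 0);

	for (i = 0; i < num_keys; i++) {
		if (keys[i][0] == '_') {
			continue;	/* internal key, not test data */
		}
		count++;
	}

	return count;
}
```

Doing this on the node that holds the database means only a single number has to travel back over the pipe.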
commit 52dcecbc923ec16e85f01f822b1450ab7b91900d
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Oct 4 16:30:47 2018 +1000
ctdb-tests: Add extra debug to large database recovery test
This test sometimes fails, probably because the test is flakey.
Either the records aren't being added correctly or the counting of
records loses records. Try to debug both possibilities.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit d67d8ed44ac9beba7fdec0ceda56136f781fe19b
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Oct 3 16:39:16 2018 +1000
ctdb-tests: Shut down transaction_loop clients more cleanly
A transaction_loop client can exit with a transaction active when its
time limit expires. This causes a recovery and causes problems with
the test cleanup, which detects unwanted recoveries and fails.
Set a flag when the time limit expires and exit cleanly before the
next transaction is started.
Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
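
The "set a flag, exit before the next transaction" idea from this commit can be sketched in plain C. This is not the code from transaction_loop.c: it assumes a POSIX SIGALRM handler as the source of the time limit, and the names run_transaction_loop(), loop_stats and expire_during_third() are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set from the signal handler, read only between transactions. */
static volatile sig_atomic_t time_limit_expired = 0;

static void on_time_limit(int signum)
{
	(void)signum;
	time_limit_expired = 1;
}

struct loop_stats {
	int started;
	int committed;
};

/* Stand-in for the work done inside one transaction. */
typedef void (*transaction_body_fn)(int transaction_number);

/*
 * The flag is checked only between transactions, so a transaction that
 * is in progress when the limit expires still runs to completion.
 */
static void run_transaction_loop(struct loop_stats *stats,
				 transaction_body_fn body)
{
	assert(stats != NULL);

	while (!time_limit_expired) {
		stats->started++;		/* transaction start */
		if (body != NULL) {
			body(stats->started);
		}
		stats->committed++;		/* transaction commit */
	}
}

/* Demo body: simulate the time limit expiring during transaction 3. */
static void expire_during_third(int transaction_number)
{
	if (transaction_number == 3) {
		raise(SIGALRM);
	}
}
```

A caller would install the handler with signal(SIGALRM, on_time_limit) and arm the limit with alarm(). With the demo body above, the loop ends with three transactions started and three committed, so no transaction is left active at exit.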
commit 2aa006a31117769dd440661cf54394590c4b8f11
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Oct 3 19:13:57 2018 +1000
ctdb-tools: Have onnode pass -n option even when regular ssh not in use
ONNODE_SSH is really a test hook, so it doesn't need to support
completely random values.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 6ac5124b0117b7b5f3c5402934c93633ce186896
Author: Martin Schwenke <martin at meltin.net>
Date: Sat Apr 14 21:27:20 2018 +1000
ctdb-tests: Support closing of stdin in local daemons ssh stub
Not sure this is needed but this makes it behave the same as ssh.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 0dfb3c87b50745012c6c8bab5e0af262ce3f5f87
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 20 15:26:08 2018 +1000
ctdb-tests: Be more careful when building public IP addresses
The goal is to allow more local daemons by expanding the address range
rather than generating invalid addresses.
For IPv6, use a separate address space instead of an offset for the
2nd address.
For IPv4, use the last 2 octets with addresses starting at
192.168.100.1 and 192.168.200.1. Avoid addresses with 0 and 255 in
the last octet by using a maximum of 100 addresses per "subnet"
starting at .1.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 36eb7388775f7e931d102d71b867c4985830df17
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 20 15:24:43 2018 +1000
ctdb-tests: Be more careful when building node addresses
The goal is to allow more local daemons by expanding the address range
rather than generating invalid addresses.
For IPv6, use all 4 trailing hex digits.
For IPv4, use the last 2 octets. Although 127.0.0.0 is a /8 network,
avoid unexpected issues due to 0 and 255 in the last octet. Use a
maximum of 100 addresses per "subnet" starting at .1. Keep the first
group of addresses in 127.0.0.0/24 to continue to allow a reasonable
number of nodes to be tested with socket-wrapper.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
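
One way to read the IPv4 scheme described above is sketched below in C. The real code is in the shell test scripts; the zero-based index, the exact mapping from index to octets, and the function names node_ipv4_octets() and node_ipv4_format() are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define ADDRS_PER_SUBNET 100

/*
 * Map a zero-based node index to the last two octets of 127.0.c.d.
 * d runs from 1 to 100, so .0 and .255 are never generated.
 * Returns 0 on success, -1 if the index does not fit.
 */
static int node_ipv4_octets(unsigned int index,
			    unsigned int *c, unsigned int *d)
{
	assert(c != NULL && d != NULL);

	if (index / ADDRS_PER_SUBNET > 255) {
		return -1;
	}

	*c = index / ADDRS_PER_SUBNET;		/* "subnet" */
	*d = (index % ADDRS_PER_SUBNET) + 1;	/* host, starting at .1 */
	return 0;
}

/* Format the address as a string; returns 0 on success, -1 on error. */
static int node_ipv4_format(unsigned int index, char *buf, size_t buflen)
{
	unsigned int c, d;
	int n;

	assert(buf != NULL);

	if (node_ipv4_octets(index, &c, &d) != 0) {
		return -1;
	}

	n = snprintf(buf, buflen, "127.0.%u.%u", c, d);
	if (n < 0 || (size_t)n >= buflen) {
		return -1;
	}
	return 0;
}
```

Under these assumptions the first 100 indexes stay inside 127.0.0.0/24, and index 100 moves on to 127.0.1.1.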
commit 03dddc37b5f0f7e9a56fbe5299816b31053c2480
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 20 14:30:54 2018 +1000
ctdb-tests: Don't format IPv4 octets as hex digits
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 0eabac52955a191ef931ee739eff3ae7eafbf7b9
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 20 14:42:42 2018 +1000
ctdb-tests: Be more efficient about starting/stopping local daemons
Don't loop, just use onnode all.
For shutting down, use onnode -p all. This results in a significant
time saving for stopping many daemons because "ctdb shutdown" is now
synchronous.
onnode -p all can be used to start daemons directly because they
daemonize. However, this does not work under valgrind because the
valgrind process does not exit, so onnode will wait forever for it.
In this case, use onnode without the -p option.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit a9ac33015b64b7cf4f8d9e69576f8237a9b62684
Author: Martin Schwenke <martin at meltin.net>
Date: Tue Jul 10 15:57:19 2018 +1000
ctdb-tests: Do not use ctdbd_wrapper in local daemon tests
Run the daemon directly and shut it down using ctdb shutdown.
The wrapper waits for ctdbd to reach >=FIRST_RECOVERY runstate within
a timeout period and shuts ctdbd down if that doesn't happen. This is
only really used to ensure that ctdbd doesn't exit early after an
apparently successful start. There are no known cases where ctdbd
will continue running but fail to reach >=FIRST_RECOVERY runstate.
When ctdbd is started in tests, the test code will wait until ctdbd is
in a healthy state on all nodes before proceeding, so there is
effectively no change in behaviour.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 8bde6fa09c93b601cf463c0e25691e8396445cca
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 27 15:29:43 2018 +1000
ctdb-tests: Don't remove non-existent test database directory
This directory is no longer used. Lack of removal doesn't seem to
cause a problem.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit f2e4a5e9fae071b4ea5fad3015c84ba05fe6edcf
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Sep 28 20:41:45 2018 +1000
ctdb-tests: Drop unused function maybe_stop_ctdb()
There are too many functions to start/stop daemons. Simplify this.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 2cd6a00399d4303f19aa2eafe8560185a4248b35
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Sep 28 20:39:18 2018 +1000
ctdb-tests: Explicitly check for local daemons when shutting down
This is clearer if the logic is explicit... and...
There are too many functions to start/stop daemons. Simplify this.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 90f6b0a1ede1480d3aac911817f12825b87cb947
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 27 16:05:38 2018 +1000
ctdb-tests: Drop functions daemons_start(), daemons_stop()
There are too many functions to start/stop daemons. Simplify this.
Inline the functionality into ctdb_start_all() and ctdb_stop_all().
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit f1ede41adff57624120f8f8a09358bd516bcebea
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 27 16:02:54 2018 +1000
ctdb-tests: Don't used daemons_start()/daemons_stop() directly in tests
There are too many functions to start/stop daemons. Simplify this.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 4642a347d0b7bf90241c82b12454edc3d4afd257
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 27 16:00:35 2018 +1000
ctdb-tests: Rename _ctdb_start_all() -> ctdb_start_all()
There are too many functions to start/stop daemons. Simplify this.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit f57e5bbde7c6436d32e648bf25a66caf67155f6f
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 27 15:54:17 2018 +1000
ctdb-tests: Rename ctdb_start_all() -> ctdb_init()
There are too many functions to start/stop daemons. Simplify this.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit a66a96934a5ea778e72554a143aa23fd0b1ac746
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Sep 27 16:23:07 2018 +1000
ctdb-tests: Drop ps_ctdbd()
This was used for debugging tests by ensuring that the arguments to
ctdbd were as expected. It no longer outputs anything useful because
ctdbd is now started without arguments.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 83b3c5670d85c607c1cf1ab8cfc2c967d4d16721
Author: Amitay Isaacs <amitay at gmail.com>
Date: Thu Feb 15 12:28:36 2018 +1100
ctdb-tests: Drop code for RECEIVE_RECORDS control
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit 2f89bd96fb6c5e50cfc09604ceb6b96a94cb4f56
Author: Amitay Isaacs <amitay at gmail.com>
Date: Thu Feb 15 12:21:57 2018 +1100
ctdb-protocol: Drop marshalling code for RECEIVE_RECORDS control
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit 81dae71fa74bfd83a5701e4841b5a0a13cbe87a1
Author: Amitay Isaacs <amitay at gmail.com>
Date: Thu Feb 15 13:52:10 2018 +1100
ctdb-protocol: Mark RECEIVE_RECORDS control obsolete
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit d18385ea2aa93770996214d056a384a0244e7d73
Author: Amitay Isaacs <amitay at gmail.com>
Date: Thu Feb 15 12:04:32 2018 +1100
ctdb-daemon: Drop implementation of RECEIVE_RECORDS control
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit e15cdc652d76b37c58cd114215f00500991bc6b4
Author: Amitay Isaacs <amitay at gmail.com>
Date: Wed Feb 14 15:23:07 2018 +1100
ctdb-vacuum: Remove unnecessary check for zero records in delete list
Since no records are deleted from RB tree during step 1, there is no
need for the check. Run step 2 unconditionally.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit ef052397173522ac2dd0d0bd9660a18a13a3e4fc
Author: Amitay Isaacs <amitay at gmail.com>
Date: Wed Feb 14 15:18:17 2018 +1100
ctdb-vacuum: Fix the incorrect counting of remote errors
If a node fails to delete a record in TRY_DELETE_RECORDS control during
vacuuming, then it's possible that other nodes also may fail to delete a
record. So instead of deleting the record from RB tree on first failure,
keep track of the remote failures.
Update delete_list.remote_error and delete_list.left statistics only
once per record during the delete_record_traverse.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
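
The counting fix can be pictured with a simplified C model. This is a sketch only: the real code keeps records in an RB tree, the toy_* names are hypothetical stand-ins, and treating "left" as a counter that is decremented for each failed record is an assumption based on the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct delete_record_data. */
struct toy_delete_record {
	uint32_t remote_fail_count;
};

/* Simplified stand-in for the delete_list statistics. */
struct toy_delete_stats {
	uint32_t remote_error;
	uint32_t left;
};

/* Called once for every node that fails to delete the record. */
static void toy_note_remote_failure(struct toy_delete_record *rec)
{
	assert(rec != NULL);
	rec->remote_fail_count++;
}

/*
 * Walk the delete list once.  A record that failed on any number of
 * nodes contributes exactly one remote error, however many nodes
 * reported a failure for it.
 */
static void toy_delete_record_traverse(const struct toy_delete_record *recs,
				       size_t num_recs,
				       struct toy_delete_stats *stats)
{
	size_t i;

	assert(recs != NULL && stats != NULL);

	for (i = 0; i < num_recs; i++) {
		if (recs[i].remote_fail_count > 0) {
			assert(stats->left > 0);
			stats->remote_error++;
			stats->left--;
		}
	}
}
```

With three records, where one fails on two nodes and another fails on one node, the traversal reports two remote errors rather than three.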
commit 202b9027ba44eee33c2fde2332126be10f719423
Author: Amitay Isaacs <amitay at gmail.com>
Date: Wed Feb 14 14:50:40 2018 +1100
ctdb-vacuum: Simplify the deletion of vacuumed records
The 3-phase deletion of vacuumed records was introduced to overcome
the problem of record resurrection during recovery. This problem
is now handled by excluding records from recently INACTIVE nodes from
the recovery process.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit dcc9935995a5a7b40df64653a605d1af89075bd1
Author: Martin Schwenke <martin at meltin.net>
Date: Mon Sep 24 16:17:19 2018 +1000
ctdb-tests: Add recovery record resurrection test for volatile databases
Ensure that deleted records and vacuumed records are not resurrected
from recently inactive nodes.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit c4ec99b1d3f1c5bff83bf66e3fd64d45a8be7441
Author: Amitay Isaacs <amitay at gmail.com>
Date: Wed Feb 14 14:19:44 2018 +1100
ctdb-daemon: Invalidate records if a node becomes INACTIVE
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit 040401ca3abfa266261130f6c5ae4e9718f19cd7
Author: Amitay Isaacs <amitay at gmail.com>
Date: Wed Feb 14 14:27:32 2018 +1100
ctdb-daemon: Don't pull any records if records are invalidated
This avoids unnecessary work during recovery to pull records from nodes
that were INACTIVE just before the recovery.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
commit 71896fddf10a92237d332779ccbb26c059caa649
Author: Amitay Isaacs <amitay at gmail.com>
Date: Wed Feb 14 14:29:18 2018 +1100
ctdb-daemon: Add invalid_records flag to ctdb_db_context
If a node becomes INACTIVE, then all the records in volatile databases
are invalidated. This avoids the need to include records from such
nodes during subsequent recovery after the node comes out of INACTIVE state.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641
Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Martin Schwenke <martin at meltin.net>
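
The life cycle of the invalid_records flag across these three ctdb-daemon commits can be modelled in a few lines of C. This is a toy model, not the daemon code: struct toy_db and the toy_* functions are hypothetical, and the model ignores locking and the replacement of records by the recovery itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct ctdb_db_context. */
struct toy_db {
	bool is_volatile;
	bool invalid_records;
	size_t num_records;
};

/* Node becomes INACTIVE: mark records of volatile databases invalid. */
static void toy_node_becomes_inactive(struct toy_db *dbs, size_t num_dbs)
{
	size_t i;

	assert(dbs != NULL || num_dbs == 0);

	for (i = 0; i < num_dbs; i++) {
		if (dbs[i].is_volatile) {
			dbs[i].invalid_records = true;
		}
	}
}

/* Recovery pulls records: an invalidated database offers none. */
static size_t toy_pull_db(const struct toy_db *db)
{
	assert(db != NULL);

	if (db->invalid_records) {
		return 0;
	}
	return db->num_records;
}

/* Freeze handle is released after recovery: clear the flag again. */
static void toy_release_freeze(struct toy_db *db)
{
	assert(db != NULL);
	db->invalid_records = false;
}
```

In this model a persistent database is unaffected, while a volatile database contributes no records to a recovery that follows a period of inactivity.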
-----------------------------------------------------------------------
Summary of changes:
ctdb/include/ctdb_private.h | 3 +-
ctdb/protocol/protocol.h | 2 +-
ctdb/protocol/protocol_api.h | 6 -
ctdb/protocol/protocol_client.c | 29 ---
ctdb/protocol/protocol_control.c | 26 --
ctdb/server/ctdb_control.c | 2 +-
ctdb/server/ctdb_freeze.c | 24 +-
ctdb/server/ctdb_recover.c | 211 +---------------
ctdb/server/ctdb_vacuum.c | 280 ++-------------------
ctdb/tests/complex/00_ctdb_init.sh | 2 +-
ctdb/tests/scripts/integration.bash | 28 +--
ctdb/tests/simple/00_ctdb_init.sh | 2 +-
ctdb/tests/simple/19_ip_takeover_noop.sh | 10 +-
ctdb/tests/simple/28_zero_eventscripts.sh | 8 +-
ctdb/tests/simple/69_recovery_resurrect_deleted.sh | 84 +++++++
ctdb/tests/simple/78_ctdb_large_db_recovery.sh | 32 ++-
ctdb/tests/simple/99_daemons_shutdown.sh | 6 +-
ctdb/tests/simple/scripts/local_daemons.bash | 80 ++----
ctdb/tests/simple/scripts/ssh_local_daemons.sh | 11 +-
ctdb/tests/src/protocol_common_ctdb.c | 20 --
ctdb/tests/src/transaction_loop.c | 28 ++-
ctdb/tools/onnode | 9 +-
22 files changed, 247 insertions(+), 656 deletions(-)
create mode 100755 ctdb/tests/simple/69_recovery_resurrect_deleted.sh
Changeset truncated at 500 lines:
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index b3d2e14..ea00bb1 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -386,6 +386,7 @@ struct ctdb_db_context {
uint32_t freeze_transaction_id;
uint32_t generation;
+ bool invalid_records;
bool push_started;
void *push_state;
@@ -820,8 +821,6 @@ int32_t ctdb_control_start_recovery(struct ctdb_context *ctdb,
int32_t ctdb_control_try_delete_records(struct ctdb_context *ctdb,
TDB_DATA indata, TDB_DATA *outdata);
-int32_t ctdb_control_receive_records(struct ctdb_context *ctdb,
- TDB_DATA indata, TDB_DATA *outdata);
int32_t ctdb_control_get_capabilities(struct ctdb_context *ctdb,
TDB_DATA *outdata);
diff --git a/ctdb/protocol/protocol.h b/ctdb/protocol/protocol.h
index 6abd015..b868553 100644
--- a/ctdb/protocol/protocol.h
+++ b/ctdb/protocol/protocol.h
@@ -355,7 +355,7 @@ enum ctdb_controls {CTDB_CONTROL_PROCESS_EXISTS = 0,
CTDB_CONTROL_SET_DB_STICKY = 133,
CTDB_CONTROL_RELOAD_PUBLIC_IPS = 134,
CTDB_CONTROL_TRAVERSE_ALL_EXT = 135,
- CTDB_CONTROL_RECEIVE_RECORDS = 136,
+ CTDB_CONTROL_RECEIVE_RECORDS = 136, /* obsolete */
CTDB_CONTROL_IPREALLOCATED = 137,
CTDB_CONTROL_GET_RUNSTATE = 138,
CTDB_CONTROL_DB_DETACH = 139,
diff --git a/ctdb/protocol/protocol_api.h b/ctdb/protocol/protocol_api.h
index 1cd5d7d..6104c10 100644
--- a/ctdb/protocol/protocol_api.h
+++ b/ctdb/protocol/protocol_api.h
@@ -530,12 +530,6 @@ int ctdb_reply_control_set_db_sticky(struct ctdb_reply_control *reply);
void ctdb_req_control_reload_public_ips(struct ctdb_req_control *request);
int ctdb_reply_control_reload_public_ips(struct ctdb_reply_control *reply);
-void ctdb_req_control_receive_records(struct ctdb_req_control *request,
- struct ctdb_rec_buffer *recbuf);
-int ctdb_reply_control_receive_records(struct ctdb_reply_control *reply,
- TALLOC_CTX *mem_ctx,
- struct ctdb_rec_buffer **recbuf);
-
void ctdb_req_control_ipreallocated(struct ctdb_req_control *request);
int ctdb_reply_control_ipreallocated(struct ctdb_reply_control *reply);
diff --git a/ctdb/protocol/protocol_client.c b/ctdb/protocol/protocol_client.c
index a18af08..9aa32a9 100644
--- a/ctdb/protocol/protocol_client.c
+++ b/ctdb/protocol/protocol_client.c
@@ -1948,35 +1948,6 @@ int ctdb_reply_control_reload_public_ips(struct ctdb_reply_control *reply)
/* CTDB_CONTROL_TRAVERSE_ALL_EXT */
-/* CTDB_CONTROL_RECEIVE_RECORDS */
-
-void ctdb_req_control_receive_records(struct ctdb_req_control *request,
- struct ctdb_rec_buffer *recbuf)
-{
- request->opcode = CTDB_CONTROL_RECEIVE_RECORDS;
- request->pad = 0;
- request->srvid = 0;
- request->client_id = 0;
- request->flags = 0;
-
- request->rdata.opcode = CTDB_CONTROL_RECEIVE_RECORDS;
- request->rdata.data.recbuf = recbuf;
-}
-
-int ctdb_reply_control_receive_records(struct ctdb_reply_control *reply,
- TALLOC_CTX *mem_ctx,
- struct ctdb_rec_buffer **recbuf)
-{
- if (reply->rdata.opcode != CTDB_CONTROL_RECEIVE_RECORDS) {
- return EPROTO;
- }
-
- if (reply->status == 0) {
- *recbuf = talloc_steal(mem_ctx, reply->rdata.data.recbuf);
- }
- return reply->status;
-}
-
/* CTDB_CONTROL_IPREALLOCATED */
void ctdb_req_control_ipreallocated(struct ctdb_req_control *request)
diff --git a/ctdb/protocol/protocol_control.c b/ctdb/protocol/protocol_control.c
index 12a78e1..0b88b5c 100644
--- a/ctdb/protocol/protocol_control.c
+++ b/ctdb/protocol/protocol_control.c
@@ -360,10 +360,6 @@ static size_t ctdb_req_control_data_len(struct ctdb_req_control_data *cd)
len = ctdb_traverse_all_ext_len(cd->data.traverse_all_ext);
break;
- case CTDB_CONTROL_RECEIVE_RECORDS:
- len = ctdb_rec_buffer_len(cd->data.recbuf);
- break;
-
case CTDB_CONTROL_IPREALLOCATED:
break;
@@ -660,10 +656,6 @@ static void ctdb_req_control_data_push(struct ctdb_req_control_data *cd,
&np);
break;
- case CTDB_CONTROL_RECEIVE_RECORDS:
- ctdb_rec_buffer_push(cd->data.recbuf, buf, &np);
- break;
-
case CTDB_CONTROL_DB_DETACH:
ctdb_uint32_push(&cd->data.db_id, buf, &np);
break;
@@ -988,11 +980,6 @@ static int ctdb_req_control_data_pull(uint8_t *buf, size_t buflen,
&np);
break;
- case CTDB_CONTROL_RECEIVE_RECORDS:
- ret = ctdb_rec_buffer_pull(buf, buflen, mem_ctx,
- &cd->data.recbuf, &np);
- break;
-
case CTDB_CONTROL_DB_DETACH:
ret = ctdb_uint32_pull(buf, buflen, &cd->data.db_id, &np);
break;
@@ -1368,10 +1355,6 @@ static size_t ctdb_reply_control_data_len(struct ctdb_reply_control_data *cd)
case CTDB_CONTROL_TRAVERSE_ALL_EXT:
break;
- case CTDB_CONTROL_RECEIVE_RECORDS:
- len = ctdb_rec_buffer_len(cd->data.recbuf);
- break;
-
case CTDB_CONTROL_IPREALLOCATED:
break;
@@ -1562,10 +1545,6 @@ static void ctdb_reply_control_data_push(struct ctdb_reply_control_data *cd,
ctdb_db_statistics_push(cd->data.dbstats, buf, &np);
break;
- case CTDB_CONTROL_RECEIVE_RECORDS:
- ctdb_rec_buffer_push(cd->data.recbuf, buf, &np);
- break;
-
case CTDB_CONTROL_GET_RUNSTATE:
ctdb_uint32_push(&cd->data.runstate, buf, &np);
break;
@@ -1753,11 +1732,6 @@ static int ctdb_reply_control_data_pull(uint8_t *buf, size_t buflen,
&cd->data.dbstats, &np);
break;
- case CTDB_CONTROL_RECEIVE_RECORDS:
- ret = ctdb_rec_buffer_pull(buf, buflen, mem_ctx,
- &cd->data.recbuf, &np);
- break;
-
case CTDB_CONTROL_GET_RUNSTATE:
ret = ctdb_uint32_pull(buf, buflen, &cd->data.runstate, &np);
break;
diff --git a/ctdb/server/ctdb_control.c b/ctdb/server/ctdb_control.c
index 848010e..c260b92 100644
--- a/ctdb/server/ctdb_control.c
+++ b/ctdb/server/ctdb_control.c
@@ -650,7 +650,7 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
return ctdb_control_reload_public_ips(ctdb, c, async_reply);
case CTDB_CONTROL_RECEIVE_RECORDS:
- return ctdb_control_receive_records(ctdb, indata, outdata);
+ return control_not_implemented("RECEIVE_RECORDS", NULL);
case CTDB_CONTROL_DB_DETACH:
return ctdb_control_db_detach(ctdb, indata, client_id);
diff --git a/ctdb/server/ctdb_freeze.c b/ctdb/server/ctdb_freeze.c
index c41fc7d..10841ef 100644
--- a/ctdb/server/ctdb_freeze.c
+++ b/ctdb/server/ctdb_freeze.c
@@ -140,6 +140,9 @@ static int ctdb_db_freeze_handle_destructor(struct ctdb_db_freeze_handle *h)
ctdb_db->freeze_mode = CTDB_FREEZE_NONE;
ctdb_db->freeze_handle = NULL;
+ /* Clear invalid records flag */
+ ctdb_db->invalid_records = false;
+
talloc_free(h->lreq);
return 0;
}
@@ -394,6 +397,19 @@ static int db_freeze_waiter_destructor(struct ctdb_db_freeze_waiter *w)
}
/**
+ * Invalidate the records in the database.
+ * This only applies to volatile databases.
+ */
+static int db_invalidate(struct ctdb_db_context *ctdb_db, void *private_data)
+{
+ if (ctdb_db_volatile(ctdb_db)) {
+ ctdb_db->invalid_records = true;
+ }
+
+ return 0;
+}
+
+/**
* Count the number of databases
*/
static int db_count(struct ctdb_db_context *ctdb_db, void *private_data)
@@ -436,13 +452,17 @@ static int db_freeze(struct ctdb_db_context *ctdb_db, void *private_data)
}
/*
- start the freeze process for a certain priority
+ start the freeze process for all databases
+ This is only called from ctdb_control_freeze(), which is called
+ only on node becoming INACTIVE. So mark the records invalid.
*/
static void ctdb_start_freeze(struct ctdb_context *ctdb)
{
struct ctdb_freeze_handle *h;
int ret;
+ ctdb_db_iterator(ctdb, db_invalidate, NULL);
+
if (ctdb->freeze_mode == CTDB_FREEZE_FROZEN) {
int count = 0;
@@ -534,6 +554,8 @@ static int ctdb_freeze_waiter_destructor(struct ctdb_freeze_waiter *w)
/*
freeze all the databases
+ This control is only used when freezing database on node becoming INACTIVE.
+ So mark the records invalid in ctdb_start_freeze().
*/
int32_t ctdb_control_freeze(struct ctdb_context *ctdb,
struct ctdb_req_control_old *c, bool *async_reply)
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index fc64037b..f05052e 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c
@@ -279,6 +279,11 @@ int32_t ctdb_control_pull_db(struct ctdb_context *ctdb, TDB_DATA indata, TDB_DAT
ctdb_db->db_name, ctdb_db->unhealthy_reason));
}
+ /* If the records are invalid, we are done */
+ if (ctdb_db->invalid_records) {
+ goto done;
+ }
+
if (ctdb_lockdb_mark(ctdb_db) != 0) {
DEBUG(DEBUG_ERR,(__location__ " Failed to get lock on entire db - failing\n"));
return -1;
@@ -293,6 +298,7 @@ int32_t ctdb_control_pull_db(struct ctdb_context *ctdb, TDB_DATA indata, TDB_DAT
ctdb_lockdb_unmark(ctdb_db);
+done:
outdata->dptr = (uint8_t *)params.pulldata;
outdata->dsize = params.len;
@@ -388,6 +394,11 @@ int32_t ctdb_control_db_pull(struct ctdb_context *ctdb,
state.srvid = pulldb_ext->srvid;
state.num_records = 0;
+ /* If the records are invalid, we are done */
+ if (ctdb_db->invalid_records) {
+ goto done;
+ }
+
if (ctdb_lockdb_mark(ctdb_db) != 0) {
DEBUG(DEBUG_ERR,
(__location__ " Failed to get lock on entire db - failing\n"));
@@ -422,6 +433,7 @@ int32_t ctdb_control_db_pull(struct ctdb_context *ctdb,
ctdb_lockdb_unmark(ctdb_db);
+done:
outdata->dptr = talloc_size(outdata, sizeof(uint32_t));
if (outdata->dptr == NULL) {
DEBUG(DEBUG_ERR, (__location__ " Memory allocation error\n"));
@@ -1318,205 +1330,6 @@ int32_t ctdb_control_try_delete_records(struct ctdb_context *ctdb, TDB_DATA inda
return 0;
}
-/**
- * Store a record as part of the vacuum process:
- * This is called from the RECEIVE_RECORD control which
- * the lmaster uses to send the current empty copy
- * to all nodes for storing, before it lets the other
- * nodes delete the records in the second phase with
- * the TRY_DELETE_RECORDS control.
- *
- * Only store if we are not lmaster or dmaster, and our
- * rsn is <= the provided rsn. Use non-blocking locks.
- *
- * return 0 if the record was successfully stored.
- * return !0 if the record still exists in the tdb after returning.
- */
-static int store_tdb_record(struct ctdb_context *ctdb,
- struct ctdb_db_context *ctdb_db,
- struct ctdb_rec_data_old *rec)
-{
- TDB_DATA key, data, data2;
- struct ctdb_ltdb_header *hdr, *hdr2;
- int ret;
-
- key.dsize = rec->keylen;
- key.dptr = &rec->data[0];
- data.dsize = rec->datalen;
- data.dptr = &rec->data[rec->keylen];
-
- if (ctdb_lmaster(ctdb, &key) == ctdb->pnn) {
- DEBUG(DEBUG_INFO, (__location__ " Called store_tdb_record "
- "where we are lmaster\n"));
- return -1;
- }
-
- if (data.dsize != sizeof(struct ctdb_ltdb_header)) {
- DEBUG(DEBUG_ERR, (__location__ " Bad record size\n"));
- return -1;
- }
-
- hdr = (struct ctdb_ltdb_header *)data.dptr;
-
- /* use a non-blocking lock */
- if (tdb_chainlock_nonblock(ctdb_db->ltdb->tdb, key) != 0) {
- DEBUG(DEBUG_INFO, (__location__ " Failed to lock chain in non-blocking mode\n"));
- return -1;
- }
-
- data2 = tdb_fetch(ctdb_db->ltdb->tdb, key);
- if (data2.dptr == NULL || data2.dsize < sizeof(struct ctdb_ltdb_header)) {
- if (tdb_store(ctdb_db->ltdb->tdb, key, data, 0) == -1) {
- DEBUG(DEBUG_ERR, (__location__ "Failed to store record\n"));
- ret = -1;
- goto done;
- }
- DEBUG(DEBUG_INFO, (__location__ " Stored record\n"));
- ret = 0;
- goto done;
- }
-
- hdr2 = (struct ctdb_ltdb_header *)data2.dptr;
-
- if (hdr2->rsn > hdr->rsn) {
- DEBUG(DEBUG_INFO, (__location__ " Skipping record with "
- "rsn=%llu - called with rsn=%llu\n",
- (unsigned long long)hdr2->rsn,
- (unsigned long long)hdr->rsn));
- ret = -1;
- goto done;
- }
-
- /* do not allow vacuuming of records that have readonly flags set. */
- if (hdr->flags & CTDB_REC_RO_FLAGS) {
- DEBUG(DEBUG_INFO,(__location__ " Skipping record with readonly "
- "flags set\n"));
- ret = -1;
- goto done;
- }
- if (hdr2->flags & CTDB_REC_RO_FLAGS) {
- DEBUG(DEBUG_INFO,(__location__ " Skipping record with readonly "
- "flags set\n"));
- ret = -1;
- goto done;
- }
-
- if (hdr2->dmaster == ctdb->pnn) {
- DEBUG(DEBUG_INFO, (__location__ " Attempted to store record "
- "where we are the dmaster\n"));
- ret = -1;
- goto done;
- }
-
- if (tdb_store(ctdb_db->ltdb->tdb, key, data, 0) != 0) {
- DEBUG(DEBUG_INFO,(__location__ " Failed to store record\n"));
- ret = -1;
- goto done;
- }
-
- ret = 0;
-
-done:
- tdb_chainunlock(ctdb_db->ltdb->tdb, key);
- free(data2.dptr);
- return ret;
-}
-
-
-
-/**
- * Try to store all these records as part of the vacuuming process
- * and return the records we failed to store.
- */
-int32_t ctdb_control_receive_records(struct ctdb_context *ctdb,
- TDB_DATA indata, TDB_DATA *outdata)
-{
- struct ctdb_marshall_buffer *reply = (struct ctdb_marshall_buffer *)indata.dptr;
- struct ctdb_db_context *ctdb_db;
- int i;
- struct ctdb_rec_data_old *rec;
- struct ctdb_marshall_buffer *records;
-
- if (indata.dsize < offsetof(struct ctdb_marshall_buffer, data)) {
- DEBUG(DEBUG_ERR,
- (__location__ " invalid data in receive_records\n"));
- return -1;
- }
-
- ctdb_db = find_ctdb_db(ctdb, reply->db_id);
- if (!ctdb_db) {
- DEBUG(DEBUG_ERR, (__location__ " Unknown db 0x%08x\n",
- reply->db_id));
- return -1;
- }
-
- DEBUG(DEBUG_DEBUG, ("starting receive_records of %u records for "
- "dbid 0x%x\n", reply->count, reply->db_id));
-
- /* create a blob to send back the records we could not store */
- records = (struct ctdb_marshall_buffer *)
- talloc_zero_size(outdata,
- offsetof(struct ctdb_marshall_buffer, data));
- if (records == NULL) {
- DEBUG(DEBUG_ERR, (__location__ " Out of memory\n"));
- return -1;
- }
- records->db_id = ctdb_db->db_id;
-
- rec = (struct ctdb_rec_data_old *)&reply->data[0];
- for (i=0; i<reply->count; i++) {
- TDB_DATA key, data;
-
- key.dptr = &rec->data[0];
- key.dsize = rec->keylen;
- data.dptr = &rec->data[key.dsize];
- data.dsize = rec->datalen;
-
- if (data.dsize < sizeof(struct ctdb_ltdb_header)) {
- DEBUG(DEBUG_CRIT, (__location__ " bad ltdb record "
- "in indata\n"));
- talloc_free(records);
- return -1;
- }
-
- /*
- * If we can not store the record we must add it to the reply
- * so the lmaster knows it may not purge this record.
- */
- if (store_tdb_record(ctdb, ctdb_db, rec) != 0) {
- size_t old_size;
- struct ctdb_ltdb_header *hdr;
-
- hdr = (struct ctdb_ltdb_header *)data.dptr;
- data.dptr += sizeof(*hdr);
- data.dsize -= sizeof(*hdr);
-
- DEBUG(DEBUG_INFO, (__location__ " Failed to store "
- "record with hash 0x%08x in vacuum "
- "via RECEIVE_RECORDS\n",
- ctdb_hash(&key)));
-
- old_size = talloc_get_size(records);
- records = talloc_realloc_size(outdata, records,
- old_size + rec->length);
- if (records == NULL) {
- DEBUG(DEBUG_ERR, (__location__ " Failed to "
- "expand\n"));
- return -1;
- }
- records->count++;
- memcpy(old_size+(uint8_t *)records, rec, rec->length);
- }
-
- rec = (struct ctdb_rec_data_old *)(rec->length + (uint8_t *)rec);
- }
-
- *outdata = ctdb_marshall_finish(records);
-
- return 0;
-}
-
-
/*
report capabilities
*/
diff --git a/ctdb/server/ctdb_vacuum.c b/ctdb/server/ctdb_vacuum.c
index e749116..2194b7f 100644
--- a/ctdb/server/ctdb_vacuum.c
+++ b/ctdb/server/ctdb_vacuum.c
@@ -107,6 +107,7 @@ struct delete_record_data {
struct ctdb_context *ctdb;
struct ctdb_db_context *ctdb_db;
struct ctdb_ltdb_header hdr;
+ uint32_t remote_fail_count;
TDB_DATA key;
uint8_t keydata[1];
};
--
Samba Shared Repository