[SCM] CTDB repository - branch master updated - ctdb-1.10-170-ga16dc65
Michael Adam
obnox at samba.org
Mon Mar 14 06:43:22 MDT 2011
The branch, master has been updated
via a16dc65b4602da5ce2c16578bec2e7882aff240d (commit)
via 52193b6692091e341ed7a81dbd9a61ae49a8aac5 (commit)
via be4b63ee18933524f780df5c313447e5ef0786d1 (commit)
via f28e636cc4a04ef982672d5f569ad6b6b963db1f (commit)
via f898ff21fa338358179e79381215b13a6bc77c53 (commit)
via 69d34983a37b0324ff7610b8dfdcd8d13bf81c54 (commit)
via 46381a3cb58ccc11422af8f7798c80ea8d72294f (commit)
via ab2711701999a5ecc23a36b3d9ba8e94f92e4c87 (commit)
via 2559b2a45eb11834da3b0e0963e24351c8b7477f (commit)
via e58c8f51f27e468897af5210b80e5f5f45c3c4bb (commit)
via c9b65f3602f51bcbf0e6d82c12076c31e4aebe38 (commit)
via 3cca0d4b48325d86de2cb0b44bb7811a30701352 (commit)
via df49ec44de80affa5ccc637dec12a20a26e8706e (commit)
via 23631ffc152486aed9ce5b69a391e52bc4947833 (commit)
via 3da1e2e30bf34622f08e6ecd5b8fe55684e5007a (commit)
via 30aa55b3efc6fbd4078f93da386b6aeb337c1a0c (commit)
via cf57efd440ccc3db381386f4749bfcbf8ac5ecae (commit)
via b70bc141d84f7355d2c6c901961b7366db566980 (commit)
via 680223074e992b32ccf6f42cb80c3fa93074fee7 (commit)
via 4cebfa33db3c7effa087f753530c52b2dd8550e6 (commit)
via 2038e745db33cc5c3b4e2db8a00a57ede03906a2 (commit)
via b9bdef46fedfbc543263b67cfee3e896773cd8e8 (commit)
via 3addd28aa73883b3b05888e309d19db0eb67eab9 (commit)
via 7bbb12695c24da25671f1c39a411295d35870d2c (commit)
via 4f0ace982dbb5b4f9c035dbf4cb0ae74cd18d81b (commit)
via 571683e7c48aeed8ce41c584d016ced7ff0d2e2d (commit)
via 23b8c8c5fc8604ee0bd6da1f4b5152277eb5f1c0 (commit)
via 91e6d36a190b1c9e4c8b18f7833e51c5c9a67574 (commit)
via c0668bfe0bb4e69988ae34d875568d08539e6fb9 (commit)
via 53a39d0cc5ea251c2189ec8178ccb769fa046c43 (commit)
via 0d997ec7e61a7bee2cb05456f9c7d5e6f7a44797 (commit)
via 04c335f9195a5fd83c91a57d06b1e4eaa511844e (commit)
via 5eee05c4d256c08f4ee60a1a69efda6844e39729 (commit)
via 4d32908fdcec120426536a761e1d0be60f076198 (commit)
via 4407e5a7fb045ce56b6d902f7116de663ea648cb (commit)
via e99834c1a2eea60f7f974c0689ae0a65cfe178ff (commit)
via d4ab790c1f679e833eb97816762fcfcee15ccb10 (commit)
via 6c603f85726d2efac9710af7c4875ded2ca7230e (commit)
via 731a6011ce4a1301f86eacb039955745f2b5d866 (commit)
via f19fe5b45748a6998c6950a5b1db7ec2c4468c1c (commit)
via 0aff1b61dd1b683c6739478008a5b014b933df50 (commit)
via 9bbedf786b26bb074f668b31f29a9032af958673 (commit)
via c11ca778ee90444c44dee0a629cd2eefa3a1f75e (commit)
via 4079b8bf7a57a27a45d29784a1b0a414c778e552 (commit)
via 945187d64cfc7bd30a0c3b0d548cbe582d95dde3 (commit)
via fb5d832104970320359b3e474eb291ca3d629380 (commit)
via dd2449c422f323f9b5485e45107a9cc5acc09e08 (commit)
via 86c844fb08a7fd33e94f56b8d5e43278120e1162 (commit)
via 455cc6616e10b7f09589f9b87cb60f591bb502b0 (commit)
via 101be642e492a3a54231e2e3e6553a59380fe702 (commit)
via f5fb232117886186066ab3430fdd2307cba94960 (commit)
via 3930c7796b72bbf275bbca8aaeceec3e705a964b (commit)
via bc4990e600c53433a924a0d70e3488a5a6bdc1ff (commit)
via 49247df4a47a8a107fa7dd7b187e69e243e6bdbe (commit)
via 136508e3f4dd0acc210dde938ad59ef38b63d3a1 (commit)
from 6a4df8242ee4d095ff03229a168b83bcd84c8a7a (commit)
http://gitweb.samba.org/?p=ctdb.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit a16dc65b4602da5ce2c16578bec2e7882aff240d
Author: Michael Adam <obnox at samba.org>
Date: Fri Mar 11 16:05:44 2011 +0100
vacuum: fix a comment typo
commit 52193b6692091e341ed7a81dbd9a61ae49a8aac5
Author: Michael Adam <obnox at samba.org>
Date: Fri Mar 11 15:57:45 2011 +0100
vacuum: use insert_record_into_delete_queue in ctdb_local_schedule_for_deletion.
This is to take advantage of the hash collision handling and logging
also in ctdb_local_schedule_for_deletion.
commit be4b63ee18933524f780df5c313447e5ef0786d1
Author: Michael Adam <obnox at samba.org>
Date: Fri Mar 11 15:55:52 2011 +0100
vacuum: refactor insert_record_into_delete_queue out of ctdb_control_schedule_for_deletion
commit f28e636cc4a04ef982672d5f569ad6b6b963db1f
Author: Michael Adam <obnox at samba.org>
Date: Fri Mar 11 14:57:15 2011 +0100
vacuum: raise a debug level from INFO to DEBUG
when overwriting an existing entry in the delete_queue.
commit f898ff21fa338358179e79381215b13a6bc77c53
Author: Michael Adam <obnox at samba.org>
Date: Thu Feb 3 16:32:23 2011 +0100
ctdb_ltdb_store_server: honour the AUTOMATIC record flag
Do not delete empty records that carry this flag but store
them and schedule them for deletion. Do not store the flag
in the ltdb though, since this is internal only and should not
be visible to the client.
commit 69d34983a37b0324ff7610b8dfdcd8d13bf81c54
Author: Michael Adam <obnox at samba.org>
Date: Thu Feb 3 16:30:52 2011 +0100
ltdb: add the CTDB_REC_FLAG_AUTOMATIC to the initial header in ctdb_ltdb_fetch()
Signals that this record was not created by a client level store.
commit 46381a3cb58ccc11422af8f7798c80ea8d72294f
Author: Michael Adam <obnox at samba.org>
Date: Thu Feb 3 16:27:42 2011 +0100
ctdb_private.h: add record flag CTDB_REC_FLAG_AUTOMATIC
This is a flag that shall signal that a record has been automatically generated by ctdb
and not by an explicit client store operation. This will be used in the ctdb_ltdb_fetch
operation which stores an empty record with default initial header before trying to
migrate the record from the dmaster when the record does not exist in the local tdb.
commit ab2711701999a5ecc23a36b3d9ba8e94f92e4c87
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 28 13:19:22 2010 +0100
ctdb_ltdb_store_server: add ability to send SCHEDULE_FOR_DELETION control to ctdb_ltdb_store.
commit 2559b2a45eb11834da3b0e0963e24351c8b7477f
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 21 18:08:11 2010 +0100
ctdb_ltdb_store_server: Improve debug message in ctdb_ltdb_store when store or delete fails.
commit e58c8f51f27e468897af5210b80e5f5f45c3c4bb
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 21 17:50:52 2010 +0100
ctdb_ltdb_store_server: always store the data when ctdb_ltdb_store() is called from the client
This also fixes a segfault since ctdb_lmaster uses the vnn_map.
commit c9b65f3602f51bcbf0e6d82c12076c31e4aebe38
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 14:13:50 2010 +0100
ctdb_ltdb_store_server: implement fastpath vacuuming deletion based on VACUUM_MIGRATED flag.
When the record has been obtained by the lmaster as part of the vacuuming-fetch
handler and it is empty and has never been migrated with data, then such records
are deleted instead of being stored. These records have automatically been
deleted when leaving the former dmaster, so that they vanish for good when
hitting the lmaster in this way. This will reduce the load on traditional
vacuuming.
Pair-Programmed-With: Stefan Metzmacher <metze at samba.org>
commit 3cca0d4b48325d86de2cb0b44bb7811a30701352
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 3 15:29:21 2010 +0100
ctdb_ltdb_store_server: delete an empty record that is safe to delete instead of storing locally.
When storing a record that is being migrated off to another node
and has never been migrated with data, then we can safely delete it
from the local tdb instead of storing the record with empty data.
Note: This record is not deleted if we are its lmaster or dmaster.
Pair-Programmed-With: Stefan Metzmacher <metze at samba.org>
commit df49ec44de80affa5ccc637dec12a20a26e8706e
Author: Michael Adam <obnox at samba.org>
Date: Thu Dec 30 18:19:32 2010 +0100
server: Use the ctdb_ltdb_store_server() in the ctdb daemon for non-persistent dbs
This is realized by adding a ctdb_ltdb_store_fn function pointer to the db
context and filling it in the attach procedure for non-persistent dbs.
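The dispatch described above can be sketched as follows. This is a minimal illustration, not the ctdb code: struct demo_db, demo_attach(), demo_store() and the two path constants are hypothetical names, and the real store functions take a key, header and data rather than a bare key string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_db;
typedef int (*demo_store_fn)(struct demo_db *db, const char *key);

/* Hypothetical stand-in for the db context. */
struct demo_db {
    bool persistent;
    demo_store_fn store_fn; /* NULL means: use the generic store path */
};

enum { DEMO_GENERIC_PATH = 1, DEMO_SERVER_PATH = 2 };

/* Server variant; here it only reports which path was taken. */
static int demo_store_server(struct demo_db *db, const char *key)
{
    (void)db;
    (void)key;
    return DEMO_SERVER_PATH;
}

/* Attach: only non-persistent databases get the server variant. */
void demo_attach(struct demo_db *db, bool persistent)
{
    db->persistent = persistent;
    db->store_fn = persistent ? NULL : demo_store_server;
}

/* Generic entry point: defer to the per-db function pointer if set. */
int demo_store(struct demo_db *db, const char *key)
{
    if (db->store_fn != NULL) {
        return db->store_fn(db, key);
    }
    (void)key;
    return DEMO_GENERIC_PATH;
}
```

Callers keep using the single generic entry point; which variant runs is decided once, at attach time.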
commit 23631ffc152486aed9ce5b69a391e52bc4947833
Author: Michael Adam <obnox at samba.org>
Date: Thu Dec 30 17:44:51 2010 +0100
server: create a server variant ctdb_ltdb_store_server() of ctdb_ltdb_store().
This is supposed to contain logic for deleting records that are safe
to delete and scheduling records for deletion. It will be called in
server context for non-persistent databases instead of the standard
ctdb_ltdb_store() function.
commit 3da1e2e30bf34622f08e6ecd5b8fe55684e5007a
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 28 13:14:23 2010 +0100
daemon: fill ctdb->ctdbd_pid early
commit 30aa55b3efc6fbd4078f93da386b6aeb337c1a0c
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 21 15:29:46 2010 +0100
test: send SCHEDULE_FOR_DELETION control from randrec test.
commit cf57efd440ccc3db381386f4749bfcbf8ac5ecae
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 21 15:29:23 2010 +0100
client: add accessor function ctdb_header_from_record_handle().
commit b70bc141d84f7355d2c6c901961b7366db566980
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 28 13:13:34 2010 +0100
vacuum: add ctdb_local_schedule_for_deletion()
commit 680223074e992b32ccf6f42cb80c3fa93074fee7
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 21 14:25:48 2010 +0100
server: implement a new control SCHEDULE_FOR_DELETION to fill the delete_queue.
commit 4cebfa33db3c7effa087f753530c52b2dd8550e6
Author: Michael Adam <obnox at samba.org>
Date: Wed Mar 9 00:57:55 2011 +0100
control: add a new control opcode CTDB_CONTROL_SCHEDULE_FOR_DELETION
commit 2038e745db33cc5c3b4e2db8a00a57ede03906a2
Author: Michael Adam <obnox at samba.org>
Date: Wed Mar 9 00:56:25 2011 +0100
control: add macro CHECK_CONTROL_MIN_DATA_SIZE.
This is for the control dispatcher to check whether the input data has
a required minimum size.
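A minimum-size check is useful for payloads that end in a variable-length field: the fixed part has to be validated before the length field inside it can be trusted. The sketch below shows that two-step validation under stated assumptions; struct demo_payload and demo_validate_payload() are hypothetical and do not reproduce the macro itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical control payload with a trailing variable-length key. */
struct demo_payload {
    uint32_t db_id;
    uint32_t keylen;
    uint8_t  key[1]; /* really keylen bytes */
};

/* Returns 0 if buf holds exactly one well-formed payload, -1 otherwise. */
int demo_validate_payload(const uint8_t *buf, size_t buflen)
{
    size_t fixed = offsetof(struct demo_payload, key);
    uint32_t keylen;

    /* Step 1: minimum size, so that reading keylen is safe. */
    if (buflen < fixed) {
        return -1;
    }

    /* Copy keylen out instead of casting, to avoid alignment assumptions. */
    memcpy(&keylen, buf + offsetof(struct demo_payload, keylen),
           sizeof(keylen));

    /* Step 2: exact size, using the length the payload itself claims. */
    if (buflen - fixed != keylen) {
        return -1;
    }
    return 0;
}
```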
commit b9bdef46fedfbc543263b67cfee3e896773cd8e8
Author: Michael Adam <obnox at samba.org>
Date: Thu Dec 23 11:54:09 2010 +0100
vacuum: lower level of hash collision debug message to INFO
commit 3addd28aa73883b3b05888e309d19db0eb67eab9
Author: Michael Adam <obnox at samba.org>
Date: Thu Dec 23 00:27:27 2010 +0100
vacuum: add statistics output to the fast and full traverse runs.
commit 7bbb12695c24da25671f1c39a411295d35870d2c
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 21 14:19:00 2010 +0100
vacuum: refactor insert_delete_record_data_into_tree() out of add_record_to_delete_tree()
for reuse in filling the delete_queue.
commit 4f0ace982dbb5b4f9c035dbf4cb0ae74cd18d81b
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 21:43:41 2010 +0100
vacuum: change all Vacuum*Interval tunables to default to 10
So, by default we have a fastpath vacuuming every 10 seconds and
full blown db-traverse vacuuming once every 10 minutes.
commit 571683e7c48aeed8ce41c584d016ced7ff0d2e2d
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 21:30:39 2010 +0100
vacuum: disable full db-traverse vacuuming runs when VacuumFastPathCount == 0
commit 23b8c8c5fc8604ee0bd6da1f4b5152277eb5f1c0
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 18:03:38 2010 +0100
vacuum: Only run full vacuuming (db traverse) every VacuumFastPathCount times.
commit 91e6d36a190b1c9e4c8b18f7833e51c5c9a67574
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 17:54:04 2010 +0100
vacuum: reset the fast path count in the event handle if it exceeds the limit.
commit c0668bfe0bb4e69988ae34d875568d08539e6fb9
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 17:49:29 2010 +0100
vacuum: bump the number of fast-path runs in the vacuum child destructor
commit 53a39d0cc5ea251c2189ec8178ccb769fa046c43
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 17:44:02 2010 +0100
vacuum: add a fast_path_count to the vacuum_handle.
commit 0d997ec7e61a7bee2cb05456f9c7d5e6f7a44797
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 17:42:25 2010 +0100
Add a tunable VacuumFastPathCount.
This will control how many fast-path vacuuming runs will have to
be done, before a full vacuuming will be triggered, i.e. one with
a db-traversal.
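The interplay between fast-path runs and full runs can be sketched as a simple counter. This is an illustration only: struct demo_vacuum_state and demo_vacuum_tick() are hypothetical, and the exact comparison and the place where the real code bumps or resets the counter are assumptions. The limit parameter plays the role of VacuumFastPathCount, with 0 meaning that full runs are disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-database vacuuming state. */
struct demo_vacuum_state {
    uint32_t fast_path_count; /* fast-path runs since the last full run */
};

/*
 * Called once per vacuuming interval. Returns true if this run should
 * be a full db traverse, false if it should be a fast-path run.
 */
bool demo_vacuum_tick(struct demo_vacuum_state *st, uint32_t limit)
{
    if (limit != 0 && st->fast_path_count >= limit) {
        /* Enough fast-path runs have happened: do a full run, start over. */
        st->fast_path_count = 0;
        return true;
    }
    /* Unsigned wrap-around is harmless when limit == 0. */
    st->fast_path_count++;
    return false;
}
```

With a limit of 3, the sequence of runs in this sketch is three fast-path runs followed by one full run, repeated.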
commit 04c335f9195a5fd83c91a57d06b1e4eaa511844e
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 17:25:35 2010 +0100
vacuum: traverse the delete_queue before traversing the database.
commit 5eee05c4d256c08f4ee60a1a69efda6844e39729
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 17:24:32 2010 +0100
vacuum: add delete_queue_traverse() for traversal of the delete_queue.
commit 4d32908fdcec120426536a761e1d0be60f076198
Author: Michael Adam <obnox at samba.org>
Date: Tue Dec 21 11:22:50 2010 +0100
vacuum: reduce indentation in add_record_to_delete_tree()
This simplifies the logical structure a bit by using early return.
commit 4407e5a7fb045ce56b6d902f7116de663ea648cb
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 17:11:27 2010 +0100
vacuum: refactor new add_record_to_delete_tree() out of vacuum_traverse().
This will be reused by the traversal of the delete_queue list.
commit e99834c1a2eea60f7f974c0689ae0a65cfe178ff
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 16:41:13 2010 +0100
vacuum: skip adding records to list of records to send to lmaster on lmaster
This list is skipped afterwards when the lists are processed.
commit d4ab790c1f679e833eb97816762fcfcee15ccb10
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 16:31:27 2010 +0100
vacuum: refactor new add_record_to_vacuum_fetch_list() out of vacuum_traverse().
This is the function that fills the list of records to send to each lmaster
with the VACUUM_FETCH message.
This function will be reused in the traverse function for the delete_queue.
commit 6c603f85726d2efac9710af7c4875ded2ca7230e
Author: Michael Adam <obnox at samba.org>
Date: Mon Dec 20 10:55:53 2010 +0100
server: rename ctdb_repack_db() to ctdb_vacuum_and_repack_db()
commit 731a6011ce4a1301f86eacb039955745f2b5d866
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 17 02:22:02 2010 +0100
When wiping a database, clear the delete_queue.
commit f19fe5b45748a6998c6950a5b1db7ec2c4468c1c
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 17 01:53:25 2010 +0100
vacuum: clear the fast-path vacuuming delete_queue after creating the vacuuming child.
Maybe we should keep a copy for the case that the vacuuming fails?
commit 0aff1b61dd1b683c6739478008a5b014b933df50
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 17 01:38:09 2010 +0100
When attaching to a non-persistent DB, initialize the delete_queue.
commit 9bbedf786b26bb074f668b31f29a9032af958673
Author: Michael Adam <obnox at samba.org>
Date: Wed Dec 22 14:50:53 2010 +0100
Add a delete_queue to the ctdb database context struct.
This list will be filled by the client using a new
delete control. The list will then be used to implement
a fast-path vacuuming that will traverse this list instead
of traversing the database.
commit c11ca778ee90444c44dee0a629cd2eefa3a1f75e
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 14:11:38 2010 +0100
call: becoming dmaster in VACUUM_MIGRATION, set the VACUUM_MIGRATED record flag
This temporary flag is used for the local record storage function to
decide whether to delete an empty record which has never been migrated
with data as part of the fast-path vacuuming process, or to store
the record.
commit 4079b8bf7a57a27a45d29784a1b0a414c778e552
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 14:07:21 2010 +0100
call: hand the submitted record_flags to local record storage function.
commit 945187d64cfc7bd30a0c3b0d548cbe582d95dde3
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 14:02:33 2010 +0100
call: transfer the record flags in the ctdb call packets.
This way, the MIGRATED_WITH_DATA information can be transported
along with the records. This is important for vacuuming to function
properly.
The record flags are appended to the data section of the ctdb_req_dmaster
and ctdb_reply_dmaster structs.
Pair-Programmed-With: Stefan Metzmacher <metze at samba.org>
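Appending the flags after the key and data lets a receiver tell from the packet length whether the trailer is present, so packets from senders that do not append it still parse. A minimal sketch of that idea follows; struct demo_packet, demo_pack() and demo_unpack_flags() are hypothetical and much simpler than the real ctdb_req_dmaster and ctdb_reply_dmaster structs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire layout: fixed fields, then key, data, optional flags. */
struct demo_packet {
    uint32_t length;  /* total packet length in bytes */
    uint32_t keylen;
    uint32_t datalen;
    uint8_t  data[1]; /* key, then data, then an optional uint32_t */
};

#define DEMO_FIXED offsetof(struct demo_packet, data)

/*
 * Serialize a packet into out. Pass flags == NULL to emulate a sender
 * that does not append the trailer. Returns the number of bytes
 * written, or 0 if out is too small.
 */
size_t demo_pack(uint8_t *out, size_t outsize,
                 const void *key, uint32_t keylen,
                 const void *data, uint32_t datalen,
                 const uint32_t *flags)
{
    size_t len = DEMO_FIXED + (size_t)keylen + datalen;
    uint32_t len32;

    if (flags != NULL) {
        len += sizeof(uint32_t);
    }
    if (len > outsize) {
        return 0;
    }
    len32 = (uint32_t)len;
    memcpy(out + offsetof(struct demo_packet, length), &len32, sizeof(len32));
    memcpy(out + offsetof(struct demo_packet, keylen), &keylen, sizeof(keylen));
    memcpy(out + offsetof(struct demo_packet, datalen), &datalen, sizeof(datalen));
    memcpy(out + DEMO_FIXED, key, keylen);
    memcpy(out + DEMO_FIXED + keylen, data, datalen);
    if (flags != NULL) {
        memcpy(out + DEMO_FIXED + keylen + datalen, flags, sizeof(*flags));
    }
    return len;
}

/* Read the trailing flags, or 0 if the packet is too short to carry them. */
uint32_t demo_unpack_flags(const uint8_t *pkt)
{
    uint32_t length, keylen, datalen, flags = 0;
    size_t need;

    memcpy(&length, pkt + offsetof(struct demo_packet, length), sizeof(length));
    memcpy(&keylen, pkt + offsetof(struct demo_packet, keylen), sizeof(keylen));
    memcpy(&datalen, pkt + offsetof(struct demo_packet, datalen), sizeof(datalen));

    need = DEMO_FIXED + (size_t)keylen + datalen + sizeof(uint32_t);
    if (need <= length) {
        memcpy(&flags, pkt + DEMO_FIXED + keylen + datalen, sizeof(flags));
    }
    return flags;
}
```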
commit fb5d832104970320359b3e474eb291ca3d629380
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 13:59:37 2010 +0100
server: in the VACUUM_FETCH handler, add the VACUUM_MIGRATION flag to the call flags
This way, the records coming in via this handler can be treated appropriately.
Namely, they can be deleted instead of being stored when they meet the fast-path
vacuuming criteria (empty, never migrated with data...)
commit dd2449c422f323f9b5485e45107a9cc5acc09e08
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 13:57:01 2010 +0100
add a new record flag CTDB_REC_FLAG_VACUUM_MIGRATED.
This is to be used internally. The purpose is to flag a record
as having been migrated by a VACUUM_MIGRATION, which is triggered by
a VACUUM_FETCH message as part of the vacuuming. The local store
routine will base its decision whether to delete or to store
the record (among other things) upon the value of this flag.
This flag should never be stored in the local database copies.
commit 86c844fb08a7fd33e94f56b8d5e43278120e1162
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 14:22:55 2010 +0100
call: Move definition of call flags down to the definition of the flags field.
commit 455cc6616e10b7f09589f9b87cb60f591bb502b0
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 10 14:24:40 2010 +0100
call: add new call flag CTDB_CALL_FLAG_VACUUM_MIGRATION
This is to be used when the CTDB_SRVID_VACUUM_FETCH message
triggers the migration of deleted records to the lmaster.
The lmaster can then delete records that have not been
migrated with data instead of storing them.
commit 101be642e492a3a54231e2e3e6553a59380fe702
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 3 15:24:06 2010 +0100
recoverd: in a recovery, set the MIGRATED_WITH_DATA flag on all records
Those records that are kept after recovery, are non-empty, and
stored identically on all nodes. So this is as if they had been
migrated with data.
Pair-Programmed-With: Stefan Metzmacher <metze at samba.org>
commit f5fb232117886186066ab3430fdd2307cba94960
Author: Michael Adam <obnox at samba.org>
Date: Fri Dec 3 15:21:51 2010 +0100
server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag
commit 3930c7796b72bbf275bbca8aaeceec3e705a964b
Author: Michael Adam <obnox at samba.org>
Date: Thu Feb 3 12:15:41 2011 +0100
vacuum: check lmaster against num_nodes instead of vnn_map->size
When lmaster is bigger than the biggest recorded node number,
then exit the traverse with error.
commit bc4990e600c53433a924a0d70e3488a5a6bdc1ff
Author: Michael Adam <obnox at samba.org>
Date: Thu Feb 3 17:47:36 2011 +0100
vacuum: reduce indentation of the loop sending VACUUM_FETCH controls
This slightly improves the code structure in that loop.
commit 49247df4a47a8a107fa7dd7b187e69e243e6bdbe
Author: Michael Adam <obnox at samba.org>
Date: Thu Feb 3 12:26:45 2011 +0100
vacuum: correctly send TRY_DELETE_RECORDS ctrl to all active nodes
Originally, the control was sent to all nodes in the vnn_map, but
there was something still missing here:
When a node can not become lmaster (via CTDB_CAPABILITY_LMASTER=no)
then it will not be part of the vnn_map. So such a node would
be active but never receive the TRY_DELETE_RECORDS control from a
vacuuming run.
This is fixed in this change by correctly building the list of
active nodes first in the same way that the recovery process does it.
commit 136508e3f4dd0acc210dde938ad59ef38b63d3a1
Author: Michael Adam <obnox at samba.org>
Date: Thu Feb 3 12:18:58 2011 +0100
vacuum: in ctdb_vacuum_db, fix the length of the array of vacuum fetch lists
This patch fixes segfaults in the vacuum child when at least one
node has been stopped or removed from the cluster:
The size of the vnn_map is only the number of active nodes
(that can be lmaster). But the node numbers that are referenced
by the vnn_map spread over all configured nodes.
Since the array of vacuum fetch lists is referenced by the
key's lmaster's node number later on, the array needs to
be of size num_nodes instead of vnn_map->size.
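The sizing problem can be illustrated with a small sketch. It assumes that the lmaster is found by hashing the key into the vnn_map and reading the node number stored there; struct demo_vnn_map and demo_count_per_lmaster() are hypothetical names. With nodes 0, 2 and 3 in the map, the map has three entries but node number 3 needs an array of four slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical map of active, lmaster-capable node numbers. */
struct demo_vnn_map {
    uint32_t size;       /* number of entries in map */
    const uint32_t *map; /* node numbers, not necessarily contiguous */
};

/*
 * Count records per lmaster node number. The result array is indexed by
 * node number, so it is sized by num_nodes, not by vnn->size. Returns
 * NULL on allocation failure, an empty map, or an out-of-range lmaster.
 * The caller frees the result.
 */
uint32_t *demo_count_per_lmaster(const struct demo_vnn_map *vnn,
                                 uint32_t num_nodes,
                                 const uint32_t *hashes, size_t n)
{
    uint32_t *counts;
    size_t i;

    if (vnn->size == 0) {
        return NULL;
    }
    counts = calloc(num_nodes, sizeof(*counts));
    if (counts == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        uint32_t lmaster = vnn->map[hashes[i] % vnn->size];

        /* Guard against node numbers beyond the configured nodes. */
        if (lmaster >= num_nodes) {
            free(counts);
            return NULL;
        }
        counts[lmaster]++;
    }
    return counts;
}
```

Sizing the array by the map size (3 in the example) would make the access for node 3 run past the end of the array, which matches the kind of crash the commit describes.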
-----------------------------------------------------------------------
Summary of changes:
client/ctdb_client.c | 9 +
common/ctdb_ltdb.c | 5 +
include/ctdb_private.h | 35 +++
include/ctdb_protocol.h | 6 +-
server/ctdb_call.c | 52 +++-
server/ctdb_control.c | 9 +
server/ctdb_daemon.c | 1 +
server/ctdb_freeze.c | 11 +
server/ctdb_ltdb_server.c | 204 ++++++++++++++-
server/ctdb_recoverd.c | 2 +
server/ctdb_tunables.c | 7 +-
server/ctdb_vacuum.c | 667 ++++++++++++++++++++++++++++++++++++++-------
tests/src/ctdb_randrec.c | 43 +++-
13 files changed, 945 insertions(+), 106 deletions(-)
Changeset truncated at 500 lines:
diff --git a/client/ctdb_client.c b/client/ctdb_client.c
index 99ff72d..a43710f 100644
--- a/client/ctdb_client.c
+++ b/client/ctdb_client.c
@@ -4234,3 +4234,12 @@ int ctdb_ctrl_getstathistory(struct ctdb_context *ctdb, struct timeval timeout,
return 0;
}
+
+struct ctdb_ltdb_header *ctdb_header_from_record_handle(struct ctdb_record_handle *h)
+{
+ if (h == NULL) {
+ return NULL;
+ }
+
+ return &h->header;
+}
diff --git a/common/ctdb_ltdb.c b/common/ctdb_ltdb.c
index 200cca4..3ee7fe8 100644
--- a/common/ctdb_ltdb.c
+++ b/common/ctdb_ltdb.c
@@ -65,6 +65,7 @@ static void ltdb_initial_header(struct ctdb_db_context *ctdb_db,
ZERO_STRUCTP(header);
/* initial dmaster is the lmaster */
header->dmaster = ctdb_lmaster(ctdb_db->ctdb, &key);
+ header->flags = CTDB_REC_FLAG_AUTOMATIC;
}
@@ -129,6 +130,10 @@ int ctdb_ltdb_store(struct ctdb_db_context *ctdb_db, TDB_DATA key,
int ret;
bool seqnum_suppressed = false;
+ if (ctdb_db->ctdb_ltdb_store_fn) {
+ return ctdb_db->ctdb_ltdb_store_fn(ctdb_db, key, header, data);
+ }
+
if (ctdb->flags & CTDB_FLAG_TORTURE) {
struct ctdb_ltdb_header *h2;
rec = tdb_fetch(ctdb_db->ltdb->tdb, key);
diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index 68877ec..396427b 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -119,6 +119,7 @@ struct ctdb_tunable {
uint32_t allow_unhealthy_db_read;
uint32_t stat_history_interval;
uint32_t deferred_attach_timeout;
+ uint32_t vacuum_fast_path_count;
};
/*
@@ -514,6 +515,12 @@ struct ctdb_db_context {
struct lockwait_handle *lockwait_active;
struct lockwait_handle *lockwait_overflow;
struct ctdb_persistent_state *persistent_state;
+ struct trbt_tree *delete_queue;
+ int (*ctdb_ltdb_store_fn)(struct ctdb_db_context *ctdb_db,
+ TDB_DATA key,
+ struct ctdb_ltdb_header *header,
+ TDB_DATA data);
+
};
@@ -840,6 +847,14 @@ ctdb_control_send(struct ctdb_context *ctdb,
} \
} while (0)
+#define CHECK_CONTROL_MIN_DATA_SIZE(size) do { \
+ if (indata.dsize < size) { \
+ DEBUG(0,(__location__ " Invalid data size in opcode %u. Got %u expected >= %u\n", \
+ opcode, (unsigned)indata.dsize, (unsigned)size)); \
+ return -1; \
+ } \
+ } while (0)
+
int ctdb_control_getvnnmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata);
int ctdb_control_setvnnmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata);
int ctdb_control_getdbmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata);
@@ -1375,4 +1390,24 @@ int ctdb_deferred_drop_all_ips(struct ctdb_context *ctdb);
int ctdb_process_deferred_attach(struct ctdb_context *ctdb);
+/**
+ * structure to pass to a schedule_for_deletion_control
+ */
+struct ctdb_control_schedule_for_deletion {
+ uint32_t db_id;
+ struct ctdb_ltdb_header hdr;
+ uint32_t keylen;
+ uint8_t key[1]; /* key[] */
+};
+
+int32_t ctdb_control_schedule_for_deletion(struct ctdb_context *ctdb,
+ TDB_DATA indata);
+
+
+int32_t ctdb_local_schedule_for_deletion(struct ctdb_db_context *ctdb_db,
+ const struct ctdb_ltdb_header *hdr,
+ TDB_DATA key);
+
+struct ctdb_ltdb_header *ctdb_header_from_record_handle(struct ctdb_record_handle *h);
+
#endif
diff --git a/include/ctdb_protocol.h b/include/ctdb_protocol.h
index b6b753c..0422afe 100644
--- a/include/ctdb_protocol.h
+++ b/include/ctdb_protocol.h
@@ -34,13 +34,14 @@
#define CTDB_FETCH_FUNC 0xFF000002
-#define CTDB_IMMEDIATE_MIGRATION 0x00000001
struct ctdb_call {
int call_id;
TDB_DATA key;
TDB_DATA call_data;
TDB_DATA reply_data;
uint32_t status;
+#define CTDB_IMMEDIATE_MIGRATION 0x00000001
+#define CTDB_CALL_FLAG_VACUUM_MIGRATION 0x00000002
uint32_t flags;
};
@@ -361,6 +362,7 @@ enum ctdb_controls {CTDB_CONTROL_PROCESS_EXISTS = 0,
CTDB_CONTROL_SET_IFACE_LINK_STATE = 125,
CTDB_CONTROL_TCP_ADD_DELAYED_UPDATE = 126,
CTDB_CONTROL_GET_STAT_HISTORY = 127,
+ CTDB_CONTROL_SCHEDULE_FOR_DELETION = 128,
};
/*
@@ -482,6 +484,8 @@ struct ctdb_ltdb_header {
uint32_t reserved1;
#define CTDB_REC_FLAG_DEFAULT 0x00000000
#define CTDB_REC_FLAG_MIGRATED_WITH_DATA 0x00010000
+#define CTDB_REC_FLAG_VACUUM_MIGRATED 0x00020000
+#define CTDB_REC_FLAG_AUTOMATIC 0x00040000
uint32_t flags;
};
diff --git a/server/ctdb_call.c b/server/ctdb_call.c
index e188fcf..73072c3 100644
--- a/server/ctdb_call.c
+++ b/server/ctdb_call.c
@@ -181,7 +181,7 @@ static void ctdb_send_dmaster_reply(struct ctdb_db_context *ctdb_db,
tmp_ctx = talloc_new(ctdb);
/* send the CTDB_REPLY_DMASTER */
- len = offsetof(struct ctdb_reply_dmaster, data) + key.dsize + data.dsize;
+ len = offsetof(struct ctdb_reply_dmaster, data) + key.dsize + data.dsize + sizeof(uint32_t);
r = ctdb_transport_allocate(ctdb, tmp_ctx, CTDB_REPLY_DMASTER, len,
struct ctdb_reply_dmaster);
CTDB_NO_MEMORY_FATAL(ctdb, r);
@@ -194,6 +194,7 @@ static void ctdb_send_dmaster_reply(struct ctdb_db_context *ctdb_db,
r->db_id = ctdb_db->db_id;
memcpy(&r->data[0], key.dptr, key.dsize);
memcpy(&r->data[key.dsize], data.dptr, data.dsize);
+ memcpy(&r->data[key.dsize+data.dsize], &header->flags, sizeof(uint32_t));
ctdb_queue_packet(ctdb, &r->hdr);
@@ -222,13 +223,18 @@ static void ctdb_call_send_dmaster(struct ctdb_db_context *ctdb_db,
return;
}
+ if (data->dsize != 0) {
+ header->flags |= CTDB_REC_FLAG_MIGRATED_WITH_DATA;
+ }
+
if (lmaster == ctdb->pnn) {
ctdb_send_dmaster_reply(ctdb_db, header, *key, *data,
c->hdr.srcnode, c->hdr.reqid);
return;
}
- len = offsetof(struct ctdb_req_dmaster, data) + key->dsize + data->dsize;
+ len = offsetof(struct ctdb_req_dmaster, data) + key->dsize + data->dsize
+ + sizeof(uint32_t);
r = ctdb_transport_allocate(ctdb, ctdb, CTDB_REQ_DMASTER, len,
struct ctdb_req_dmaster);
CTDB_NO_MEMORY_FATAL(ctdb, r);
@@ -241,6 +247,7 @@ static void ctdb_call_send_dmaster(struct ctdb_db_context *ctdb_db,
r->datalen = data->dsize;
memcpy(&r->data[0], key->dptr, key->dsize);
memcpy(&r->data[key->dsize], data->dptr, data->dsize);
+ memcpy(&r->data[key->dsize + data->dsize], &header->flags, sizeof(uint32_t));
header->dmaster = c->hdr.srcnode;
if (ctdb_ltdb_store(ctdb_db, *key, header, *data) != 0) {
@@ -258,10 +265,10 @@ static void ctdb_call_send_dmaster(struct ctdb_db_context *ctdb_db,
must be called with the chainlock held. This function releases the chainlock
*/
-static void ctdb_become_dmaster(struct ctdb_db_context *ctdb_db,
+static void ctdb_become_dmaster(struct ctdb_db_context *ctdb_db,
struct ctdb_req_header *hdr,
TDB_DATA key, TDB_DATA data,
- uint64_t rsn)
+ uint64_t rsn, uint32_t record_flags)
{
struct ctdb_call_state *state;
struct ctdb_context *ctdb = ctdb_db->ctdb;
@@ -273,6 +280,21 @@ static void ctdb_become_dmaster(struct ctdb_db_context *ctdb_db,
ZERO_STRUCT(header);
header.rsn = rsn + 1;
header.dmaster = ctdb->pnn;
+ header.flags = record_flags;
+
+ state = ctdb_reqid_find(ctdb, hdr->reqid, struct ctdb_call_state);
+
+ if (state) {
+ if (state->call->flags & CTDB_CALL_FLAG_VACUUM_MIGRATION) {
+ /*
+ * We temporarily add the VACUUM_MIGRATED flag to
+ * the record flags, so that ctdb_ltdb_store can
+ * decide whether the record should be stored or
+ * deleted.
+ */
+ header.flags |= CTDB_REC_FLAG_VACUUM_MIGRATED;
+ }
+ }
if (ctdb_ltdb_store(ctdb_db, key, &header, data) != 0) {
ctdb_fatal(ctdb, "ctdb_reply_dmaster store failed\n");
@@ -284,7 +306,6 @@ static void ctdb_become_dmaster(struct ctdb_db_context *ctdb_db,
return;
}
- state = ctdb_reqid_find(ctdb, hdr->reqid, struct ctdb_call_state);
if (state == NULL) {
DEBUG(DEBUG_ERR,("pnn %u Invalid reqid %u in ctdb_become_dmaster from node %u\n",
@@ -345,12 +366,19 @@ void ctdb_request_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr
TDB_DATA key, data, data2;
struct ctdb_ltdb_header header;
struct ctdb_db_context *ctdb_db;
+ uint32_t record_flags = 0;
+ size_t len;
int ret;
key.dptr = c->data;
key.dsize = c->keylen;
data.dptr = c->data + c->keylen;
data.dsize = c->datalen;
+ len = offsetof(struct ctdb_req_dmaster, data) + key.dsize + data.dsize
+ + sizeof(uint32_t);
+ if (len <= c->hdr.length) {
+ record_flags = *(uint32_t *)&c->data[c->keylen + c->datalen];
+ }
ctdb_db = find_ctdb_db(ctdb, c->db_id);
if (!ctdb_db) {
@@ -407,10 +435,13 @@ void ctdb_request_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr
/* use the rsn from the sending node */
header.rsn = c->rsn;
+ /* store the record flags from the sending node */
+ header.flags = record_flags;
+
/* check if the new dmaster is the lmaster, in which case we
skip the dmaster reply */
if (c->dmaster == ctdb->pnn) {
- ctdb_become_dmaster(ctdb_db, hdr, key, data, c->rsn);
+ ctdb_become_dmaster(ctdb_db, hdr, key, data, c->rsn, record_flags);
} else {
ctdb_send_dmaster_reply(ctdb_db, &header, key, data, c->dmaster, hdr->reqid);
@@ -583,6 +614,8 @@ void ctdb_reply_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr)
struct ctdb_reply_dmaster *c = (struct ctdb_reply_dmaster *)hdr;
struct ctdb_db_context *ctdb_db;
TDB_DATA key, data;
+ uint32_t record_flags = 0;
+ size_t len;
int ret;
ctdb_db = find_ctdb_db(ctdb, c->db_id);
@@ -595,6 +628,11 @@ void ctdb_reply_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr)
key.dsize = c->keylen;
data.dptr = &c->data[key.dsize];
data.dsize = c->datalen;
+ len = offsetof(struct ctdb_reply_dmaster, data) + key.dsize + data.dsize
+ + sizeof(uint32_t);
+ if (len <= c->hdr.length) {
+ record_flags = *(uint32_t *)&c->data[c->keylen + c->datalen];
+ }
ret = ctdb_ltdb_lock_requeue(ctdb_db, key, hdr,
ctdb_call_input_pkt, ctdb, False);
@@ -606,7 +644,7 @@ void ctdb_reply_dmaster(struct ctdb_context *ctdb, struct ctdb_req_header *hdr)
return;
}
- ctdb_become_dmaster(ctdb_db, hdr, key, data, c->rsn);
+ ctdb_become_dmaster(ctdb_db, hdr, key, data, c->rsn, record_flags);
}
diff --git a/server/ctdb_control.c b/server/ctdb_control.c
index 69724e3..748907f 100644
--- a/server/ctdb_control.c
+++ b/server/ctdb_control.c
@@ -604,6 +604,15 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
CHECK_CONTROL_DATA_SIZE(0);
return ctdb_control_get_stat_history(ctdb, c, outdata);
+ case CTDB_CONTROL_SCHEDULE_FOR_DELETION: {
+ struct ctdb_control_schedule_for_deletion *d;
+ size_t size = offsetof(struct ctdb_control_schedule_for_deletion, key);
+ CHECK_CONTROL_MIN_DATA_SIZE(size);
+ d = (struct ctdb_control_schedule_for_deletion *)indata.dptr;
+ size += d->keylen;
+ CHECK_CONTROL_DATA_SIZE(size);
+ return ctdb_control_schedule_for_deletion(ctdb, indata);
+ }
default:
DEBUG(DEBUG_CRIT,(__location__ " Unknown CTDB control opcode %u\n", opcode));
return -1;
diff --git a/server/ctdb_daemon.c b/server/ctdb_daemon.c
index 9c650a0..75344ad 100644
--- a/server/ctdb_daemon.c
+++ b/server/ctdb_daemon.c
@@ -777,6 +777,7 @@ int ctdb_start_daemon(struct ctdb_context *ctdb, bool do_fork, bool use_syslog,
block_signal(SIGPIPE);
ctdbd_pid = getpid();
+ ctdb->ctdbd_pid = ctdbd_pid;
DEBUG(DEBUG_ERR, ("Starting CTDBD as pid : %u\n", ctdbd_pid));
diff --git a/server/ctdb_freeze.c b/server/ctdb_freeze.c
index 86cb5ed..0f70fd3 100644
--- a/server/ctdb_freeze.c
+++ b/server/ctdb_freeze.c
@@ -25,6 +25,7 @@
#include "../include/ctdb_private.h"
#include "lib/util/dlinklist.h"
#include "db_wrap.h"
+#include "../common/rb_tree.h"
static bool later_db(const char *name)
{
@@ -605,5 +606,15 @@ int32_t ctdb_control_wipe_database(struct ctdb_context *ctdb, TDB_DATA indata)
return -1;
}
+ if (!ctdb_db->persistent) {
+ talloc_free(ctdb_db->delete_queue);
+ ctdb_db->delete_queue = trbt_create(ctdb_db, 0);
+ if (ctdb_db->delete_queue == NULL) {
+ DEBUG(DEBUG_ERR, (__location__ " Failed to re-create "
+ "the vacuum tree.\n"));
+ return -1;
+ }
+ }
+
return 0;
}
diff --git a/server/ctdb_ltdb_server.c b/server/ctdb_ltdb_server.c
index 19a68ec..92fb0f6 100644
--- a/server/ctdb_ltdb_server.c
+++ b/server/ctdb_ltdb_server.c
@@ -25,6 +25,7 @@
#include "system/dir.h"
#include "system/time.h"
#include "../include/ctdb_private.h"
+#include "../common/rb_tree.h"
#include "db_wrap.h"
#include "lib/util/dlinklist.h"
#include <ctype.h>
@@ -49,6 +50,199 @@ static int ctdb_fetch_func(struct ctdb_call_info *call)
}
+/**
+ * write a record to a normal database
+ *
+ * This is the server-variant of the ctdb_ltdb_store function.
+ * It contains logic to determine whether a record should be
+ * stored or deleted. It also sends SCHEDULE_FOR_DELETION
+ * controls to the local ctdb daemon if appropriate.
+ */
+static int ctdb_ltdb_store_server(struct ctdb_db_context *ctdb_db,
+ TDB_DATA key,
+ struct ctdb_ltdb_header *header,
+ TDB_DATA data)
+{
+ struct ctdb_context *ctdb = ctdb_db->ctdb;
+ TDB_DATA rec;
+ int ret;
+ bool seqnum_suppressed = false;
+ bool keep = false;
+ bool schedule_for_deletion = false;
+ uint32_t lmaster;
+
+ if (ctdb->flags & CTDB_FLAG_TORTURE) {
+ struct ctdb_ltdb_header *h2;
+ rec = tdb_fetch(ctdb_db->ltdb->tdb, key);
+ h2 = (struct ctdb_ltdb_header *)rec.dptr;
+ if (rec.dptr && rec.dsize >= sizeof(*h2) && h2->rsn > header->rsn) {
+ DEBUG(DEBUG_CRIT,("RSN regression! %llu %llu\n",
+ (unsigned long long)h2->rsn, (unsigned long long)header->rsn));
+ }
+ if (rec.dptr) free(rec.dptr);
+ }
+
+ if (ctdb->vnn_map == NULL) {
+ /*
+ * Called from a client: always store the record
+ * Also don't call ctdb_lmaster since it uses the vnn_map!
+ */
+ keep = true;
+ goto store;
+ }
+
+ lmaster = ctdb_lmaster(ctdb_db->ctdb, &key);
+
+ /*
+ * If we migrate an empty record off to another node
+ * and the record has not been migrated with data,
+ * delete the record instead of storing the empty record.
+ */
+ if (data.dsize != 0) {
+ keep = true;
+ } else if (ctdb_db->persistent) {
+ keep = true;
+ } else if (header->flags & CTDB_REC_FLAG_AUTOMATIC) {
+ /*
+ * The record is not created by the client but
+ * automatically by the ctdb_ltdb_fetch logic that
+ * creates a record with an initial header in the
+ * ltdb before trying to migrate the record from
+ * the current lmaster. Keep it instead of trying
+ * to delete the non-existing record...
+ */
+ keep = true;
+ schedule_for_deletion = true;
+ } else if (header->flags & CTDB_REC_FLAG_MIGRATED_WITH_DATA) {
+ keep = true;
+ } else if (ctdb_db->ctdb->pnn == lmaster) {
+ /*
+ * If we are lmaster, then we usually keep the record.
+ * But if we retrieve the dmaster role by a VACUUM_MIGRATE
+ * and the record is empty and has never been migrated
+ * with data, then we should delete it instead of storing it.
+ * This is part of the vacuuming process.
+ *
+ * The reason that we usually need to store even empty records
+ * on the lmaster is that a client operating directly on the
+ * lmaster (== dmaster) expects the local copy of the record to
+ * exist after a successful ctdb migrate call. If the record does
+ * not exist, the client goes into a migrate loop and eventually
+ * fails. So storing the empty record makes sure that we do not
+ * need to change the client code.
+ */
+ if (!(header->flags & CTDB_REC_FLAG_VACUUM_MIGRATED)) {
+ keep = true;
+ } else if (ctdb_db->ctdb->pnn != header->dmaster) {
+ keep = true;
+ }
+ } else if (ctdb_db->ctdb->pnn == header->dmaster) {
+ keep = true;
+ }
+
+ if (keep &&
+ (data.dsize == 0) &&
+ !ctdb_db->persistent &&
+ (ctdb_db->ctdb->pnn == header->dmaster))
+ {
+ schedule_for_deletion = true;
+ }
+
+store:
+ /*
+ * The VACUUM_MIGRATED flag is only set temporarily for
+ * the above logic when the record was retrieved by a
+ * VACUUM_MIGRATE call and should not be stored in the
+ * database.
+ *
+ * The VACUUM_MIGRATE call is triggered by a vacuum fetch,
+ * and there are two cases in which the corresponding record
+ * is stored in the local database:
+ * 1. The record has been migrated with data in the past
+ * (the MIGRATED_WITH_DATA record flag is set).
+ * 2. The record has been filled with data again since it
+ * had been submitted in the VACUUM_FETCH message to the
+ * lmaster.
+ * For such records it is important to not store the
+ * VACUUM_MIGRATED flag in the database.
+ */
+ header->flags &= ~CTDB_REC_FLAG_VACUUM_MIGRATED;
+
+ /*
+ * Similarly, clear the AUTOMATIC flag which should not enter
+ * the local database copy since this would require client
+ * modifications to clear the flag when the client stores
+ * the record.
+ */
+ header->flags &= ~CTDB_REC_FLAG_AUTOMATIC;
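
The keep/delete decision in ctdb_ltdb_store_server above can be read in isolation as a pure function of a few inputs. The following is a minimal standalone sketch of that branch chain, not the code from the commit: the struct names, the function decide_store and the REC_FLAG_* values are hypothetical and chosen only for illustration (the real flag values live in the ctdb headers). It also leaves out the early "ctdb->vnn_map == NULL" path, which always stores the record and skips the scheduling check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only; not the ctdb ones. */
#define REC_FLAG_MIGRATED_WITH_DATA 0x1u
#define REC_FLAG_VACUUM_MIGRATED    0x2u
#define REC_FLAG_AUTOMATIC          0x4u

/* Hypothetical bundle of the inputs the decision depends on. */
struct store_input {
	size_t   data_size;   /* data.dsize */
	bool     persistent;  /* ctdb_db->persistent */
	uint32_t flags;       /* header->flags */
	uint32_t pnn;         /* this node's number */
	uint32_t lmaster;     /* location master for the key */
	uint32_t dmaster;     /* header->dmaster */
};

struct store_decision {
	bool keep;                  /* store the record instead of deleting it */
	bool schedule_for_deletion; /* queue the record for vacuuming */
};

static struct store_decision decide_store(const struct store_input *in)
{
	struct store_decision d = { false, false };

	if (in->data_size != 0) {
		d.keep = true;
	} else if (in->persistent) {
		d.keep = true;
	} else if (in->flags & REC_FLAG_AUTOMATIC) {
		/* record auto-created by the fetch logic: keep, but queue it */
		d.keep = true;
		d.schedule_for_deletion = true;
	} else if (in->flags & REC_FLAG_MIGRATED_WITH_DATA) {
		d.keep = true;
	} else if (in->pnn == in->lmaster) {
		/* lmaster drops an empty record only after a vacuum migration
		 * that also made this node the dmaster */
		if (!(in->flags & REC_FLAG_VACUUM_MIGRATED)) {
			d.keep = true;
		} else if (in->pnn != in->dmaster) {
			d.keep = true;
		}
	} else if (in->pnn == in->dmaster) {
		d.keep = true;
	}

	/* empty, non-persistent records kept on the dmaster get queued */
	if (d.keep && in->data_size == 0 && !in->persistent &&
	    in->pnn == in->dmaster) {
		d.schedule_for_deletion = true;
	}

	return d;
}
```

Under these assumptions, an empty non-persistent record that arrives on the lmaster with only the VACUUM_MIGRATED flag set, and with this node as dmaster, is the one case on the lmaster that is dropped rather than stored. The same record without that flag is kept and scheduled for deletion.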
--
CTDB repository