[PATCH] ctdb-locking: Back-off from logging every 10 seconds
Michael Adam
obnox at samba.org
Thu Mar 5 01:14:35 MST 2015
Looks good to me.
Doing a private build/test, then pushing.
Michael
On 2015-03-05 at 16:27 +1100, Amitay Isaacs wrote:
> Hi,
>
> This patch prevents flooding of debug logs by locking code when a lock
> helper is unable to obtain a lock for a long time. Instead of logging
> every 10 seconds, increase the interval to 100 seconds and 1000 seconds
> when the elapsed time reaches 100 seconds and 1000 seconds respectively.
>
> Please review and push if ok.
>
> Amitay.
> From a8d10da180dea7e4c202176fc447f370662bb6f5 Mon Sep 17 00:00:00 2001
> From: Amitay Isaacs <amitay at gmail.com>
> Date: Wed, 4 Mar 2015 15:36:05 +1100
> Subject: [PATCH] ctdb-locking: Back-off from logging every 10 seconds
>
> If ctdb_lock_helper cannot get a lock within 10 seconds, ctdb daemon
> logs a message and invokes an external debug script. This is repeated
> every 10 seconds.
>
> In case of a contention or on a loaded system, there can be multiple
> ctdb_lock_helper processes waiting to get lock on record(s). For each
> lock request taking longer, ctdb daemon will flood the log every
> 10 seconds. Instead of logging aggressively every 10 seconds, relax
> logging to every 100s and 1000s if the elapsed time has exceeded 100s
> and 1000s respectively.
>
> Signed-off-by: Amitay Isaacs <amitay at gmail.com>
> ---
> ctdb/server/ctdb_lock.c | 20 ++++++++++++++++----
> 1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/ctdb/server/ctdb_lock.c b/ctdb/server/ctdb_lock.c
> index 7959d40..c5a2b98 100644
> --- a/ctdb/server/ctdb_lock.c
> +++ b/ctdb/server/ctdb_lock.c
> @@ -486,6 +486,8 @@ static void ctdb_lock_timeout_handler(struct tevent_context *ev,
> struct lock_context *lock_ctx;
> struct ctdb_context *ctdb;
> pid_t pid;
> + double elapsed_time;
> + int new_timer;
>
> lock_ctx = talloc_get_type_abort(private_data, struct lock_context);
> ctdb = lock_ctx->ctdb;
> @@ -495,16 +497,17 @@ static void ctdb_lock_timeout_handler(struct tevent_context *ev,
> lock_ctx->ttimer = NULL;
> return;
> }
> +
> + elapsed_time = timeval_elapsed(&lock_ctx->start_time);
> if (lock_ctx->ctdb_db) {
> DEBUG(DEBUG_WARNING,
> ("Unable to get %s lock on database %s for %.0lf seconds\n",
> (lock_ctx->type == LOCK_RECORD ? "RECORD" : "DB"),
> - lock_ctx->ctdb_db->db_name,
> - timeval_elapsed(&lock_ctx->start_time)));
> + lock_ctx->ctdb_db->db_name, elapsed_time));
> } else {
> DEBUG(DEBUG_WARNING,
> ("Unable to get ALLDB locks for %.0lf seconds\n",
> - timeval_elapsed(&lock_ctx->start_time)));
> + elapsed_time));
> }
>
> /* Fire a child process to find the blocking process. */
> @@ -529,11 +532,20 @@ static void ctdb_lock_timeout_handler(struct tevent_context *ev,
> " Unable to setup lock debugging - no memory?\n"));
> }
>
> + /* Back-off logging if lock is not obtained for a long time */
> + if (elapsed_time < 100.0) {
> + new_timer = 10;
> + } else if (elapsed_time < 1000.0) {
> + new_timer = 100;
> + } else {
> + new_timer = 1000;
> + }
> +
> /* reset the timeout timer */
> // talloc_free(lock_ctx->ttimer);
> lock_ctx->ttimer = tevent_add_timer(ctdb->ev,
> lock_ctx,
> - timeval_current_ofs(10, 0),
> + timeval_current_ofs(new_timer, 0),
> ctdb_lock_timeout_handler,
> (void *)lock_ctx);
> }
> --
> 1.9.3
>