[PATCH 0/6] CTDB monitoring with Performance Co-Pilot

David Disseldorp ddiss at suse.de
Sun Sep 4 12:27:32 MDT 2011


ctdbd already captures a number of useful runtime metrics that indicate
the load and health of a running node. Call counters, request
processing latencies and operation timeouts are just a few examples.

This patch series adds a CTDB Performance Metrics Domain Agent (PMDA),
which acts as an underlying data source for the Performance Co-Pilot
(PCP) suite of monitoring tools. This allows real-time and retrospective
analysis of distributed CTDB cluster nodes from a single pane with PCP
GUI utilities such as pmchart[1].

The PMDA runs as a separate process, periodically fetching metric data
from the locally running CTDB daemon. Feedback welcome.

Cheers, David

[1] pmchart screenshot - monitoring two CTDB nodes:

The following changes since commit cdbc800a776f213cfd0ed543cee85b0d1714a186:

  tests:ctdb_fetch_lock_once   we must link with @POPT_OBJ@ in case -lpopt is not available (2011-09-02 13:31:41 +1000)

are available in the git repository at:
  git://oss.sgi.com/ddiss/ctdb mstr_pmda

David Disseldorp (6):
      pmda: Initial ctdb pmda check-in
      pmda: Attempt reconnects while ctdbd is unavailable
      pmda: Pull ctdb statistics once per fetch
      pmda: Use CTDB_PATH macro for default socket path
      pmda: document in README how to add a new metric
      pmda: handle struct latency_counter and add num_recoveries
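
The fetch behaviour named in the shortlog (reconnect while ctdbd is
unavailable, pull statistics once per fetch) can be sketched roughly as
follows. This is an illustration in Python, not the code in this series.
StatsCache, connect, get_statistics() and the retry interval are all
hypothetical names and assumptions; only the num_recoveries metric name
comes from the shortlog above.

```python
import time


class StatsCache:
    """Illustrative only: one statistics pull per fetch, lazy reconnects."""

    def __init__(self, connect, retry_interval=5.0, clock=time.monotonic):
        # connect is a hypothetical callable returning an object with a
        # get_statistics() method; it raises OSError when the daemon is down.
        self._connect = connect
        # Assumption: wait this many seconds between reconnect attempts.
        self._retry_interval = retry_interval
        self._clock = clock
        self._conn = None
        self._next_attempt = 0.0
        self._snapshot = None

    def begin_fetch(self):
        """Pull statistics once. Return True if a snapshot is available."""
        self._snapshot = None
        if self._conn is None:
            now = self._clock()
            if now < self._next_attempt:
                return False  # still waiting before the next reconnect
            try:
                self._conn = self._connect()
            except OSError:
                self._next_attempt = now + self._retry_interval
                return False
        try:
            self._snapshot = self._conn.get_statistics()
        except OSError:
            # Connection lost: drop it and retry on a later fetch.
            self._conn = None
            self._next_attempt = self._clock() + self._retry_interval
            return False
        return True

    def value(self, name):
        """Serve one metric from the cached snapshot, or None if absent."""
        if self._snapshot is None:
            return None
        return self._snapshot.get(name)
```

In this sketch every metric requested during a fetch is answered from the
same snapshot, so the values are mutually consistent and the daemon is
queried only once, however many metrics are requested.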

