[CTDB] strange fork bomb bug
David Disseldorp
ddiss at suse.de
Sun Mar 23 14:23:59 MDT 2014
Hi Mathieu,
On Fri, 7 Mar 2014 12:23:42 +0100
Mathieu Parent <math.parent at gmail.com> wrote:
> We saw strange ctdb behavior recently on an 8-node cluster: each
> node had 192 ctdbd processes (instead of the usual 2), using 1024 or
> so file descriptors each (which is the default Linux limit)! Mostly
> :pipe and :socket. It was hard to connect via SSH then, and even the
> process table looked corrupted. The solution was to stop, then kill, ctdbd.
>
> It seems that when the ctdbd child is blocked, the parent creates a new
> one without cleaning up the old one, until hitting resource limits.
Are the processes all waiting on record locks? I'd suggest looking at
/proc/locks, and also checking the lockwait metrics displayed with
"ctdb statistics". We've run into similar issues under record lock
contention.
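
As a rough illustration of the /proc/locks check, here is a minimal
Python sketch that counts blocked lock requests per PID and per file.
It assumes the usual /proc/locks layout (ordinal, optional "->" marking
a blocked request, lock type, ADVISORY/MANDATORY, READ/WRITE, PID,
major:minor:inode, start, end). The function names are hypothetical and
not part of ctdb.

```python
from collections import Counter


def parse_lock_line(line):
    """Parse one /proc/locks line into a dict, or return None if malformed."""
    fields = line.split()
    # A "->" right after the ordinal marks a request blocked on that lock.
    blocked = len(fields) > 1 and fields[1] == "->"
    if blocked:
        del fields[1]
    if len(fields) < 8:
        return None
    try:
        pid = int(fields[4])
    except ValueError:
        return None
    return {
        "id": fields[0].rstrip(":"),
        "blocked": blocked,
        "kind": fields[1],    # e.g. POSIX or FLOCK
        "access": fields[3],  # READ or WRITE
        "pid": pid,
        "file": fields[5],    # major:minor:inode
    }


def parse_proc_locks(text):
    """Parse the full contents of /proc/locks, skipping malformed lines."""
    entries = []
    for line in text.splitlines():
        entry = parse_lock_line(line)
        if entry is not None:
            entries.append(entry)
    return entries


def waiters_by_pid(text):
    """Count blocked lock requests per PID."""
    return Counter(e["pid"] for e in parse_proc_locks(text) if e["blocked"])


def waiters_by_file(text):
    """Count blocked lock requests per major:minor:inode."""
    return Counter(e["file"] for e in parse_proc_locks(text) if e["blocked"])
```

On a live node, the text would come from reading /proc/locks, and the
PIDs with waiters could then be compared against the ctdbd PIDs. Many
waiters on a single inode would point at contention on one database
file.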
> This was on an old version (Debian 1.12+git20120201-4), I wonder if
> that has been fixed since.
If the processes are all lockwait forks, then consider merging
"ctdb_lockwait: create overflow queue" and "LockWait congestion" if
you don't have them already.
Cheers, David