[Samba] ctdb vacuum timeouts and record locks
Martin Schwenke
martin at meltin.net
Mon Nov 6 01:15:56 UTC 2017
On Thu, 2 Nov 2017 11:17:27 -0700, Computerisms Corporation via samba
<samba at lists.samba.org> wrote:
> This occurred again this morning. When the user reported the problem, I
> found in the ctdb logs that vacuuming had been going on since last
> night. The need to fix it was urgent (when isn't it?) so I didn't have
> time to poke around for clues, but immediately restarted the lxc
> container. But this time it wouldn't restart, which I had time to trace
> to a hung smbd process, and between that and a run of the debug_locks.sh
> script, I traced it to the user reporting the problem. Given that the
> user was primarily having problems with files in a given folder, I am
> thinking this is because of some kind of lock on a file within that
> folder.
>
> Ended up rebooting both physical machines; problem solved, for now.
>
> So, not sure how to determine if this is a gluster problem, an lxc
> problem, or a ctdb/smbd problem. Thoughts/suggestions are welcome...
You need a stack trace of the stuck smbd process. If it is wedged in a
system call on the cluster filesystem then you can blame the cluster
filesystem. debug_locks.sh is meant to be able to get you the relevant
stack trace via gstack. In fact, even before you get the stack trace
you could check a process listing to see if the process is stuck in D
state.
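As a rough sketch of that process-listing check (assuming Linux with
procps ps; the function names filter_d_state and list_d_state are made
up for illustration and are not part of ctdb or debug_locks.sh),
something like this lists processes in D state along with the kernel
function they are sleeping in:

```shell
#!/bin/sh

# Keep only lines whose second field (the ps STAT column) starts with "D",
# i.e. processes in uninterruptible sleep.
# Expects lines of the form "PID STAT WCHAN COMMAND" on stdin.
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# List all processes currently in D state. The empty "=" headers suppress
# the header line; each column is given in its own -o option so the "="
# is not taken as part of a header string.
list_d_state() {
    ps -e -o pid= -o stat= -o wchan= -o comm= | filter_d_state
}
```

Calling list_d_state on the affected node prints nothing when no process
is in D state. If the stuck smbd does show up, the WCHAN column can hint
at where in the kernel it is waiting.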
gstack basically does:

  gdb -batch -ex "thread apply all bt" -p <pid>

For a single-threaded process it leaves out "thread apply all".
However, with recent GDB I'm not sure it makes a difference: running it
with "thread apply all" against a single-threaded process seems to work
for me on Linux.
Note that gstack/gdb will hang when run against a process in D state.
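Putting the two points together, here is a minimal sketch of a wrapper
that checks the process state first and only attaches gdb when the
process is not in D state. The names is_d_state and safe_stack are
hypothetical. Falling back to /proc/<pid>/stack is an assumption here,
not something debug_locks.sh is known to do; that file is
Linux-specific, usually readable only by root, and only present when the
kernel is built with stack trace support.

```shell
#!/bin/sh

# Succeed if a ps STAT value (e.g. "D", "D+", "Ds") denotes
# uninterruptible sleep.
is_d_state() {
    case "$1" in
        D*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Print a stack trace for the given pid without hanging on D state.
safe_stack() {
    pid=$1
    # Empty header ("stat=") suppresses the header line; strip whitespace.
    state=$(ps -o stat= -p "$pid" 2>/dev/null | tr -d '[:space:]')
    if [ -z "$state" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    if is_d_state "$state"; then
        # Attaching gdb would block, so show the kernel-side stack instead
        # (assumption: running as root on a kernel that provides this file).
        echo "pid $pid is in D state, not attaching gdb" >&2
        cat "/proc/$pid/stack"
    else
        # Same gdb invocation as described for gstack above.
        gdb -batch -ex "thread apply all bt" -p "$pid"
    fi
}
```

For example, safe_stack 12345 would run gdb against pid 12345 only if ps
does not report it in D state. There is still a small window in which
the process can enter D state between the check and the attach.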
peace & happiness,
martin