[Samba] ctdb vacuum timeouts and record locks

Computerisms Corporation bob at computerisms.ca
Wed Nov 15 06:48:57 UTC 2017


Hi Martin,

Well, it has been over a week since my last hung process, but I got 
another one today...

>> So, not sure how to determine if this is a gluster problem, an lxc
>> problem, or a ctdb/smbd problem.  Thoughts/suggestions are welcome...
> 
> You need a stack trace of the stuck smbd process.  If it is wedged in a
> system call on the cluster filesystem then you can blame the cluster
> filesystem.  debug_locks.sh is meant to be able to get you the relevant
> stack trace via gstack.  In fact, even before you get the stack trace
> you could check a process listing to see if the process is stuck in D
> state.

So, yes, I do have a process stuck in the D state.  It is an smbd 
process.  Matching up the times in the logs, I see that the 
"Vacuuming child process timed out for db locking.tdb" error in ctdb 
lines up with the user who owns the smbd process accessing a file 
that has been problematic before.  It is an xlsx file.
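
For anyone who wants to script the D-state check rather than eyeball a 
process listing, here is a minimal sketch, assuming Linux procfs.  The 
function names parse_stat and d_state_processes are made up for this 
example; the only thing it relies on is the /proc/<pid>/stat layout 
(pid, command name in parentheses, then the state letter).

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")
    pid = int(stat_text[:lpar].strip())
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan, or unreadable entry
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Run on the affected node, this prints one line per process in 
uninterruptible sleep; a stuck smbd would show up with its pid.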

> gstack basically does:
> 
>    gdb -batch -ex "thread apply all bt" -p <pid>
> 
> For a single-threaded process it leaves out "thread apply all".
> However, in recent GDB I'm not sure it makes a difference... seems to
> work for me on Linux.
> 
> Note that gstack/gdb will hang when run against a process in D state.

Indeed, gdb, pstack, and strace all either hang or output no information.

I have been trying to find a way to get the actual gdb output, but all I 
can seem to find is the contents of /proc/<pid>/stack:

[<ffffffffc05ed856>] request_wait_answer+0x166/0x1f0 [fuse]
[<ffffffffa04b8d50>] prepare_to_wait_event+0xf0/0xf0
[<ffffffffc05ed958>] __fuse_request_send+0x78/0x80 [fuse]
[<ffffffffc05f0bdd>] fuse_simple_request+0xbd/0x190 [fuse]
[<ffffffffc05f6c37>] fuse_setlk+0x177/0x190 [fuse]
[<ffffffffa0659467>] SyS_flock+0x117/0x190
[<ffffffffa0403b1c>] do_syscall_64+0x7c/0xf0
[<ffffffffa0a0632f>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff

I am still not too sure how to interpret this, but I think this is 
pointing me to the gluster file system, so I will see what I can find 
chasing that down...
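
As a rough aid to reading a trace like the one above, here is a small 
sketch that pulls out which kernel modules appear in the frames and 
which syscall entry point the process came in through.  It assumes the 
/proc/<pid>/stack line format shown above (reading that file normally 
needs root); summarize_stack and FRAME_RE are names invented for this 
example, and the syscall prefixes it looks for are an assumption that 
may not cover every kernel version.

```python
import re

# One frame looks like:
#   [<ffffffffc05f6c37>] fuse_setlk+0x177/0x190 [fuse]
# The offset and the trailing [module] are both optional.
FRAME_RE = re.compile(
    r"^\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>\S+?)"
    r"(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?"
    r"(?:\s+\[(?P<mod>[^\]]+)\])?\s*$"
)


def summarize_stack(stack_text):
    """Return (modules, syscall) for the text of /proc/<pid>/stack.

    modules: module names in order of first appearance (innermost first).
    syscall: the first symbol that looks like a syscall entry, or None.
    """
    modules = []
    syscall = None
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a stack frame
        mod = match.group("mod")
        if mod and mod not in modules:
            modules.append(mod)
        sym = match.group("sym")
        # Assumed prefixes: "SyS_"/"sys_" on older kernels,
        # "__x64_sys_" on newer x86-64 ones.
        if syscall is None and sym.lower().startswith(("sys_", "__x64_sys_")):
            syscall = sym
    return modules, syscall


if __name__ == "__main__":
    import sys
    with open("/proc/%s/stack" % sys.argv[1]) as fh:
        print(summarize_stack(fh.read()))
```

Fed the trace above, this reports the fuse module and SyS_flock, i.e. 
a flock() call waiting on an answer from the FUSE filesystem, which is 
consistent with looking at the gluster side next.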


> 
> peace & happiness,
> martin
> 


