[Samba] ctdb vacuum timeouts and record locks

Computerisms Corporation bob at computerisms.ca
Thu Nov 2 18:17:27 UTC 2017


Hi,

This occurred again this morning.  When the user reported the problem, I 
found in the ctdb logs that vacuuming had been going on since last 
night.  The need to fix it was urgent (when isn't it?), so I didn't have 
time to poke around for clues and immediately restarted the lxc 
container.  But this time it wouldn't restart, which I had time to trace 
to a hung smbd process, and between that and a run of the debug_locks.sh 
script, I traced it back to the user who reported the problem.  Given 
that the user was primarily having problems with files in one particular 
folder, I suspect this is because of some kind of lock on a file within 
that folder.
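For anyone else chasing one of these: CTDB's record locks are ordinary 
fcntl byte-range locks on the tdb file, so you can ask the kernel 
directly which PID holds a conflicting lock with F_GETLK.  A rough 
Python sketch (the locking.tdb path is just a guess for my layout, and 
the hand-packed struct flock assumes Linux):

```python
import fcntl
import os
import struct

def lock_holder(path, start=0, length=1):
    """Return the PID holding a write lock on the given byte range of
    `path`, or None if nothing conflicts.

    Packs a Linux struct flock by hand:
    short l_type, short l_whence, off_t l_start, off_t l_len, pid_t l_pid
    """
    with open(path, "rb") as f:
        probe = struct.pack("hhqqi", fcntl.F_WRLCK, os.SEEK_SET,
                            start, length, 0)
        result = fcntl.fcntl(f, fcntl.F_GETLK, probe)
        l_type, _, _, _, l_pid = struct.unpack("hhqqi", result)
        return None if l_type == fcntl.F_UNLCK else l_pid

if __name__ == "__main__":
    # Path is a guess for my install; adjust to wherever the volatile
    # tdbs actually live on your node.
    path = "/usr/local/samba/var/ctdb/volatile/locking.tdb.0"
    if os.path.exists(path):
        print("locking.tdb byte 0 held by pid:", lock_holder(path))
```

This only probes one byte range (a record's chain lock lives at an 
offset derived from its hash), so it's a starting point, not a 
replacement for debug_locks.sh.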

Ended up rebooting both physical machines; problem solved.  For now.

So, not sure how to determine whether this is a gluster problem, an lxc 
problem, or a ctdb/smbd problem.  Thoughts/suggestions are welcome...
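On the gluster angle, since the files that go inaccessible live on the 
FUSE mount: before blaming ctdb I want to check that fcntl locks are 
actually enforced across processes there at all.  A crude self-contained 
probe (mount point and file name are arbitrary, and it assumes a 
POSIX/Linux host):

```python
import fcntl
import os
import sys
import time

def locks_enforced(directory):
    """Fork a child that takes an exclusive fcntl lock on a scratch file
    under `directory`, then check that this process is refused the same
    lock.  Returns True if the filesystem enforces the lock."""
    path = os.path.join(directory, ".locktest")
    with open(path, "wb") as f:
        f.write(b"\0")
    pid = os.fork()
    if pid == 0:
        child_f = open(path, "rb+")
        fcntl.lockf(child_f, fcntl.LOCK_EX)  # hold until killed
        time.sleep(10)
        os._exit(0)
    time.sleep(1)  # give the child time to take the lock
    try:
        with open(path, "rb+") as f:
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return False  # got the lock while the child held it: broken
    except OSError:
        return True       # refused, as it should be
    finally:
        os.kill(pid, 9)
        os.waitpid(pid, 0)
        os.unlink(path)

if __name__ == "__main__":
    # e.g. run as: python3 locktest.py /CTFN
    print(locks_enforced(sys.argv[1] if len(sys.argv) > 1 else "/tmp"))
```

If this prints True on the ext4 root but False on the gluster mount, the 
FUSE layer would be the prime suspect.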

On 2017-10-27 10:09 AM, Computerisms Corporation via samba wrote:
> Hi Martin,
> 
> Thanks for reading and taking the time to reply
> 
>>> ctdbd[89]: Unable to get RECORD lock on database locking.tdb for 20 
>>> seconds
>>> /usr/local/samba/etc/ctdb/debug_locks.sh: 142:
>>> /usr/local/samba/etc/ctdb/debug_locks.sh: cannot create : Directory
>>> nonexistent
>>> sh: echo: I/O error
>>> sh: echo: I/O error
>>
>> That's weird.  The only file really created by that script is the lock
>> file that is used to make sure we don't debug locks too many times.
>> That should be in:
>>
>>    "${CTDB_SCRIPT_VARDIR}/debug_locks.lock"
> 
> Next time it happens I will check this.
> 
>> The other possibility is the use of the script_log() function to try to
>> get the output logged.  script_log() isn't my greatest moment.  When
>> debugging you could just replace it with the logger command to get the
>> output out to syslog.
> 
> Okay, that sounds useful, will see what I can do next time I see the 
> problem...
> 
>>> My setup is two servers, the OS is debian and is running samba AD on
>>> dedicated SSDs, and each server has a RAID array of HDDs for storage,
>>> with a mirrored GlusterFS running on top of them.  Each OS has an LXC
>>> container running the clustered member servers with the GlusterFS
>>> mounted to the containers.  The tdb files are in the containers, not on
>>> the shared storage.  I do not use ctdb to start smbd/nmbd.  I can't
>>> think what else is relevant about my setup as it pertains to this 
>>> issue...
>>
>> Are the TDB files really on a FUSE filesystem?  Is that an artifact of
>> the LXC containers?  If so, could it be that locking isn't reliable on
>> the FUSE filesystem?
> 
> No.  The TDB files are in the container, and the container is on the SSD 
> with the OS.  Running mount from within the container shows:
> 
> /dev/sda1 on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
> 
> However, the gluster native client is a fuse-based system, so the data 
> is stored on a fuse system which is mounted in the container:
> 
> masterchieflian:ctfngluster on /CTFN type fuse.glusterfs 
> (rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)
> 
> Since this is where the files that become inaccessible are, perhaps this 
> is really where the problem is, and not with the locking.tdb file?  I 
> will investigate about file locks on the gluster system...
> 
>> Is it possible to try this without the containers?  That would
>> certainly tell you if the problem is related to the container
>> infrastructure...
> 
> I like to think everything is possible, but it's not really feasible in 
> this case.  Since there are only two physical servers, and they need to 
> be running AD, the only way to separate the containers now is with 
> additional machines to act as member servers.  And because everything 
> tested fine and actually was fine for at least two weeks, these servers 
> are in production now and have been for a few months.  If I have to go 
> this way, it will certainly be a last resort...
> 
> Thanks again for your reply, will get back to you with what I find...
> 
>>
>> peace & happiness,
>> martin
>>
> 


