Very frequent : Vacuuming child process timed out for db locking.tdb. Bottleneck?

Nicolas Ecarnot nicolas at ecarnot.net
Mon May 26 13:24:21 MDT 2014


On 26/05/2014 10:03, Amitay Isaacs wrote:
> Hi Nicolas,
>
> What version of CTDB and Samba are you using?

samba-3.6.9-168.el6_5.x86_64
ctdb-1.0.114.5-3.el6.x86_64

on Oracle Linux Server release 6.5, with kernel
3.8.13-26.2.4.el6uek.x86_64

>
>
> On Mon, May 19, 2014 at 7:51 PM, Nicolas Ecarnot
> <nicolas at ecarnot.net> wrote:
>
>     Hi,
>
>     In our two-node CTDB setup, with an iSCSI qdisk LUN, the CTDB log
>     files show frequent messages such as:
>
>     Vacuuming child process timed out for db locking.tdb
>
>     This message appears roughly every 3 minutes.
>
>
> This usually means that there is metadata-intensive activity happening
> in Samba.  For example, if lots of files are opened and closed from
> Samba, lots of locking records are created and deleted.  These
> records are removed cluster-wide via vacuuming.  If vacuuming times out,
> it means that the vacuuming process did not finish processing the empty
> records, and it will process them in the next vacuuming cycle.
>
>     I read that it may be caused by too many locks to "balance/sync"
>     between the nodes (did I read that right?), which takes too long.
>     This seems odd to me because we have around 300 users doing basic
>     office work, with no particularly intensive activity. This seems
>     like a typical workload to me.
>
>
> This issue may not be related to contention at all, but may be caused by
> a metadata-intensive workload.
>
>
>     Our iSCSI network is dedicated and lightly loaded.
>
>     My two questions are:
>     - Could these error messages mean this CTDB setup is LOSING some
>     locks, so that two users could open the same file read+write (and
>     then corrupt it)?
>
>
> No. Problems in vacuuming will not cause Samba to corrupt files.
>
> Vacuuming is required to remove the deleted records from the cluster. It
> does not affect the proper working of Samba.  Only once Samba has
> released the locks are the locking records empty, and only then does
> CTDB have to vacuum them.  If a vacuuming run fails, it usually does not
> matter.  Vacuuming is triggered every 10 seconds for every database, so
> if one run fails, subsequent runs should continue working.  If vacuuming
> consistently fails every time, then the database sizes will grow very
> large, and that can become a concern.
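
One way to tell an occasional failed run from consistent failure is to
measure how far apart the timeout messages are in the log. The sketch
below assumes log lines of the form
"YYYY/MM/DD HH:MM:SS.ffffff [pid]: message"; that layout, and the names
timeout_times and mean_gap_seconds, are assumptions for illustration, so
adjust the pattern to whatever your ctdb log really contains.

```python
import re
from datetime import datetime

# Assumed line layout: "YYYY/MM/DD HH:MM:SS.ffffff [pid]: message".
PATTERN = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\S*.*"
    r"Vacuuming child process timed out for db (\S+)"
)


def timeout_times(lines):
    """Map each database name to the timestamps of its vacuuming timeouts."""
    found = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            found.setdefault(match.group(2), []).append(when)
    return found


def mean_gap_seconds(times):
    """Average gap between consecutive timestamps, or None if fewer than two."""
    if len(times) < 2:
        return None
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps)
```

Feeding it the lines of the log file (for example
timeout_times(open(path)) with path pointing at your log) gives one list
of timestamps per database. A mean gap near 180 seconds, against a run
every 10 seconds, would mean roughly one run in 18 times out rather than
every run.
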
>
>     - What do you advise me to look at, or to benchmark?
>
>
> In the latest version of CTDB, there have been significant changes to
> improve vacuuming performance.  So if possible, I would recommend using
> the latest CTDB.
>
> Amitay.


-- 
Nicolas Ecarnot

