Very frequent : Vacuuming child process timed out for db locking.tdb. Bottleneck?
nicolas at ecarnot.net
Mon May 26 13:24:21 MDT 2014
On 26/05/2014 10:03, Amitay Isaacs wrote:
> Hi Nicolas,
> What version of CTDB and Samba are you using?
Oracle Linux Server release 6.5
> On Mon, May 19, 2014 at 7:51 PM, Nicolas Ecarnot <nicolas at ecarnot.net> wrote:
> In our two-node CTDB setup, with an iSCSI qdisk LUN, the CTDB log
> files show frequent messages such as:
> Vacuuming child process timed out for db locking.tdb
> This occurs roughly every 3 minutes.
> This usually means that there is meta-data intensive activity happening
> in Samba. For example if lots of files are opened and closed from
> Samba, there will be lots of locking records created and deleted. These
> records are removed cluster-wide via vacuuming. If vacuuming times out,
> it means that the vacuuming process did not finish processing empty
> records and it will process them in the next vacuuming cycle.
> I read that it may be due to too many locks to "balance/sync" between
> the nodes (did I read that right?), which takes too much time.
> This seems odd to me because we have around 300 users doing basic
> office work, with no particularly intensive activity. This seems
> like a typical workload to me.
> This issue may not be related to contention at all, but may be caused by
> meta-data intensive workload.
> Our iSCSI network is dedicated and lightly loaded.
> My two questions are:
> - Could those error messages mean this CTDB setup is LOSING some
> locks, so that two users could open the same file read+write (and
> then corrupt it)?
> No. Problems in vacuuming will not cause Samba to corrupt files.
> Vacuuming is required to remove the deleted records from the cluster. It
> does not affect the proper working of Samba. Only once Samba has
> released the locks do the locking records become empty, and only then
> does CTDB have to vacuum them. If vacuuming fails, it usually does not matter.
> Vacuuming is triggered every 10 seconds for every database. So if one
> run fails, subsequent runs should continue working. If vacuuming
> consistently fails every time, then it will cause the database sizes to
> grow very large and that can become a concern.
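As a rough check on the figures quoted in this thread (vacuuming triggered every 10 seconds, a timeout message about every 3 minutes), the sketch below estimates what fraction of vacuuming runs time out. It assumes runs start at a fixed interval and ignores the time a timed-out run itself takes, so it is only an approximation; the function name `timeout_fraction` is hypothetical and not part of CTDB.

```python
def timeout_fraction(vacuum_interval_s: float, seconds_between_timeouts: float) -> float:
    """Rough fraction of vacuuming runs that time out for one database.

    Assumption: one run starts every vacuum_interval_s seconds, and one
    timeout message is logged every seconds_between_timeouts seconds.
    """
    if vacuum_interval_s <= 0 or seconds_between_timeouts <= 0:
        raise ValueError("both durations must be positive")
    runs_between_timeouts = seconds_between_timeouts / vacuum_interval_s
    # No more than every run can time out, so cap the result at 1.0.
    return min(1.0, 1.0 / runs_between_timeouts)


# Figures from this thread: 10 s between runs, about 3 minutes between timeouts.
fraction = timeout_fraction(10, 3 * 60)
print(f"about 1 run in {round(1 / fraction)} times out ({fraction:.1%})")
```

Under these assumptions most runs complete, which fits the point above that occasional failures are picked up by subsequent runs; a value close to 1.0 would instead correspond to the "consistently fails every time" case.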
> - What do you advise me to look at, or what should I benchmark?
> In the latest version of CTDB, there have been significant changes to
> improve vacuuming performance. So if possible, I would recommend using
> the latest CTDB.