[Samba] Problems with TDBs on CTDB-managed Samba instance
Jeremy Allison
jra at samba.org
Fri Oct 16 23:53:13 UTC 2015
On Fri, Oct 16, 2015 at 02:44:36PM +0000, Howard, Stewart Jameson wrote:
> Hi All,
>
>
> My site has two separate clustered Samba instances (managed by two independent CTDB instances) running over GPFS. In the last couple of weeks, we have seen a recurring issue that causes the `smbd` process in *one* of these instances to become unresponsive (as seen by CTDB), which results in flapping of CTDB and multiple IP takeover runs.
>
>
> The symptoms that we observe are:
>
>
> 1) Samba becomes unresponsive
>
>
> 2) The output of `smbstatus` starts to show "-1" for each connection where it should be showing user/group information.
>
>
> 3) Samba starts terminating connected sessions and CTDB kills its IP address
>
>
> 4) After some thrashing (Samba restarts, presumably), CTDB is able to recover and start serving again
>
>
> We have noticed that the following messages have started appearing in syslog, as well as in the winbind log on the afflicted cluster:
>
>
> """
>
> [2015/10/16 10:25:30.892468, 0] ../source3/lib/util_tdb.c:313(tdb_log)
> tdb(<PATH OMITTED>/gencache_notrans.tdb): tdb_rec_read bad magic 0xd9fee666 at offset=517632
>
>
> [2015/10/16 10:25:37.827964, 0] ../source3/lib/util_tdb.c:313(tdb_log)
> tdb(<PATH OMITTED>/gencache_notrans.tdb): tdb_expand overflow detected current map_size[4294967295] size[124]!

"tdb_rec_read bad magic" means the tdb database is corrupted.
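
For reference, the two numbers in the quoted log can be decoded with a few lines of Python. This is only a sketch: the constant values are what I recall from the tdb sources (TDB_MAGIC, TDB_FREE_MAGIC, TDB_DEAD_MAGIC) and should be checked against the tdb version actually in use, and classify_magic is a hypothetical helper, not part of tdb.

```python
# Record magic values as recalled from the tdb sources (assumption:
# verify against the tdb headers shipped with your Samba version).
TDB_MAGIC = 0x26011999                    # header of a live record
TDB_FREE_MAGIC = ~TDB_MAGIC & 0xFFFFFFFF  # header of a record on the free list
TDB_DEAD_MAGIC = 0xFEE1DEAD               # header of a dead record


def classify_magic(value):
    """Hypothetical helper: name the record type a 32-bit magic stands for."""
    names = {
        TDB_MAGIC: "live record",
        TDB_FREE_MAGIC: "free record",
        TDB_DEAD_MAGIC: "dead record",
    }
    return names.get(value, "unknown")


if __name__ == "__main__":
    # Magic value reported by tdb_rec_read in the log.
    print(hex(0xD9FEE666), "->", classify_magic(0xD9FEE666))
    # map_size reported by tdb_expand in the log, shown in hex.
    print(hex(4294967295))
```

If those constants are right, 0xd9fee666 is the free-record magic, so tdb_rec_read hit a free-list record header where it expected a live one, and the map_size of 4294967295 is 0xffffffff, the largest value a 32-bit size can hold.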
Can you shut down, remove these tdbs and restart?
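
A minimal sketch of the "remove" step, assuming Samba and CTDB are already stopped on the node. It renames the files instead of deleting them, which is a more conservative choice than the plain removal suggested above, so the corrupt copies stay available for later inspection. The function name quarantine_tdbs, the suffix format and the directory in the usage comment are all hypothetical; substitute the real path that was omitted from the log.

```python
import os
import time


def quarantine_tdbs(directory, names, suffix=None):
    """Rename the listed tdb files out of the way and return the new paths.

    Only call this while smbd, winbindd and ctdbd are stopped on this node.
    """
    suffix = suffix or time.strftime("corrupt-%Y%m%d-%H%M%S")
    moved = []
    for name in names:
        src = os.path.join(directory, name)
        if not os.path.isfile(src):
            continue  # nothing to move for this name
        dst = f"{src}.{suffix}"
        os.rename(src, dst)  # same directory, so no data is copied
        moved.append(dst)
    return moved


# Example call (hypothetical directory):
# quarantine_tdbs("/path/to/samba/lock", ["gencache_notrans.tdb"])
```

After the restart, a fresh copy of the file named in the log should be created, and the renamed copy can be deleted once it is no longer needed.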