[Samba] Problems with TDBs on CTDB-managed Samba instance

Jeremy Allison jra at samba.org
Fri Oct 16 23:53:13 UTC 2015


On Fri, Oct 16, 2015 at 02:44:36PM +0000, Howard, Stewart Jameson wrote:
> Hi All,
> 
> 
> My site has two separate clustered Samba instances (managed by two independent CTDB instances) running over GPFS.  In the last couple of weeks, we have seen a recurring issue that causes the `smbd` process in *one* of these instances to become unresponsive (as seen by CTDB), which results in flapping of CTDB and multiple IP takeover runs.
> 
> 
> The symptoms that we observe are:
> 
> 
> 1)  Samba becomes unresponsive
> 
> 
> 2)  The output of `smbstatus` starts to show "-1" for each connection where it should be showing user/group information.
> 
> 
> 3)  Samba starts terminating connected sessions and CTDB kills its IP address
> 
> 
> 4)  After some thrashing (Samba restarts, presumably), CTDB is able to recover and start serving again
> 
> 
> We have noticed that the following messages have started appearing in syslog, as well as in the winbind log on the afflicted cluster:
> 
> 
> """
> 
> [2015/10/16 10:25:30.892468,  0] ../source3/lib/util_tdb.c:313(tdb_log)
>   tdb(<PATH OMITTED>/gencache_notrans.tdb): tdb_rec_read bad magic 0xd9fee666 at offset=517632
> 
> 
> [2015/10/16 10:25:37.827964,  0] ../source3/lib/util_tdb.c:313(tdb_log)
>   tdb(<PATH OMITTED>/gencache_notrans.tdb): tdb_expand overflow detected current map_size[4294967295] size[124]!

tdb_rec_read bad magic - this means gencache_notrans.tdb is a
corrupted tdb database.
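
If it helps to see which tdb files are affected on each node, here is a
minimal sketch that scans log text for lines shaped like the two quoted
above. The function name find_suspect_tdbs and the choice of substrings
to match are my own assumptions for illustration, not part of Samba or
tdb, and a match only marks a file as suspect.

```python
import re

# Matches tdb_log lines of the form "tdb(<path>): <message>", as in the
# excerpts quoted above. The pattern is an assumption based on those two lines.
TDB_LINE = re.compile(r"tdb\((?P<path>[^)]*)\):\s*(?P<msg>.*)")

# Substrings taken from the quoted messages; treating them as signs of
# corruption is a heuristic, not an exhaustive list.
SUSPECT_MARKERS = ("bad magic", "overflow detected")


def find_suspect_tdbs(log_text):
    """Return {tdb_path: [messages]} for lines that match a suspect marker."""
    suspects = {}
    for line in log_text.splitlines():
        match = TDB_LINE.search(line)
        if match is None:
            continue
        msg = match.group("msg")
        if any(marker in msg for marker in SUSPECT_MARKERS):
            suspects.setdefault(match.group("path"), []).append(msg)
    return suspects
```

Feeding it the contents of the winbind log or syslog should give one
entry per tdb file that logged either message.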

Can you shut down, remove these tdbs and restart?
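
A minimal sketch of the "remove" step, assuming Samba and CTDB are
already stopped on the node. It renames the files instead of deleting
them, which is my variation so the corrupted copies stay available for
later inspection. The function name move_tdbs_aside and the suffix
format are hypothetical, and the paths have to be supplied by you.

```python
import os
import time


def move_tdbs_aside(paths, suffix=None):
    """Rename each existing file to <path>.<suffix>; return the new names.

    Assumes nothing has the files open, i.e. the services are stopped.
    Paths that do not exist are skipped silently.
    """
    if suffix is None:
        # Hypothetical naming scheme: corrupt-YYYYMMDDHHMMSS
        suffix = time.strftime("corrupt-%Y%m%d%H%M%S")
    moved = []
    for path in paths:
        if os.path.isfile(path):
            target = f"{path}.{suffix}"
            os.rename(path, target)
            moved.append(target)
    return moved
```

After the restart, the renamed copies can be deleted once the node is
confirmed healthy.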


