TDB lock contention during "startup" event caused winbind crash
Martin Schwenke
martin at meltin.net
Fri May 27 21:24:15 UTC 2016
Hi Kenny,
On Fri, 27 May 2016 12:59:09 -0700, Kenny Dinh <kdinh at peaxy.net> wrote:
> I ran into a situation where contention for tdb lock caused "winbind" to
> crash. Below is the scenario
>
> -- ctdb process              smbd process                          winbindd process
>
> -- "service start winbind"                                         winbindd started (pid 14461)
> -- "service start smb"       smbd started (pid 14602)
> --                           smbd acquires lock g_lock.tdb - hung
> -- invoke hung script        smbd child pid (14602) is still hung
> --                           smbd still lock g_lock.tdb
> -- CTDB restart all services
> -- kill existing winbindd                                          winbindd (pid 14461 term)
> -- service start winbind                                           winbindd started (pid 14733)
> --                                                                 try to lock g_lock.tdb but failed
> --                                                                 PANIC
> -- kill existing smbd        Kill pid 14602
> -- service start smb         smbd started
>
> Attached are log files from ctdb, winbind, and smb, and a winbind core
> backtrace.
>
> My proposed patch is to make sure all winbindd, smbd, and nmbd services are
> terminated at the beginning of the "startup" event.
Once again, this looks like the Samba/CTDB deadlock that is fixed in
CTDB in Samba 4.4.
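For anyone following along, the contention pattern in the timeline above
(one process sits on a lock while a freshly started one fails to take it)
can be sketched roughly as below. This is only an illustration under
stated assumptions: it uses flock() on a temporary file as a stand-in so
that both "processes" can live in one script, whereas TDB takes fcntl
byte-range locks on the database file. The names try_lock(), demo(),
holder and waiter are made up for the example and are not Samba or CTDB
code.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Try to take an exclusive lock without blocking; True on success."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Someone else already holds the lock.
        return False


def demo():
    """Return (holder_got_lock, waiter_got_lock, waiter_got_lock_after_release)."""
    with tempfile.NamedTemporaryFile() as tmp:
        # Two independent opens of the same file stand in for two processes
        # (hypothetical roles: "holder" = hung smbd, "waiter" = new winbindd).
        holder = os.open(tmp.name, os.O_RDWR)
        waiter = os.open(tmp.name, os.O_RDWR)
        try:
            first = try_lock(holder)    # holder takes the lock and keeps it
            second = try_lock(waiter)   # waiter cannot get it while held
            fcntl.flock(holder, fcntl.LOCK_UN)
            third = try_lock(waiter)    # once released, waiter succeeds
            return first, second, third
        finally:
            os.close(holder)
            os.close(waiter)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is just that the second locker can only proceed
once the first one lets go. How the waiter reacts to the failure
(retrying, timing out, or panicking) is up to the caller.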
00.ctdb really shouldn't know anything about Samba-related processes.
I guess there's some possibility that the use of
update_config_from_tdb() in the 00.ctdb startup event triggers the
bug. We dropped update_config_from_tdb() in CTDB in Samba 4.3.
I think you will be much happier if you can try to test with 4.4.
There are a lot of improvements and bug fixes... :-)
peace & happiness,
martin