On winbind shutdown prior to the removal of gencache_stabilize we could crash due to races

Jeremy Allison jra at samba.org
Mon Mar 11 17:11:28 UTC 2019


On Mon, Mar 11, 2019 at 09:47:16AM -0700, Richard Sharpe via samba-technical wrote:
> Hi folks,
> 
> We are seeing this on winbind shutdown:
> 
> --------------------------------------------------
> 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.153272,  0]
> ../source3/winbindd/winbindd.c:281(winbindd_sig_term_handler)
> 2019-01-11 01:16:19 systemd[1]: Starting Hammerspace Maintenance Target.
> 2019-01-11 01:16:19 winbindd[17540]:   Got sig[15] terminate (is_parent=0)
> 2019-01-11 01:16:19 winbindd[17497]: [2019/01/11 01:16:19.153546,  0]
> ../source3/winbindd/winbindd.c:281(winbindd_sig_term_handler)
> 2019-01-11 01:16:19 winbindd[17497]:   Got sig[15] terminate (is_parent=1)
> 2019-01-11 01:16:19 winbindd[17507]: [2019/01/11 01:16:19.153413,  0]
> ../source3/winbindd/winbindd.c:281(winbindd_sig_term_handler)
> 2019-01-11 01:16:19 winbindd[17507]:   Got sig[15] terminate (is_parent=0)
> 2019-01-11 01:16:19 systemd[1]: Stopped System Security Services Daemon.
> 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162163,  0]
> ../lib/util/fault.c:78(fault_report)
> 2019-01-11 01:16:19 winbindd[17540]:
> ===============================================================
> 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162202,  0]
> ../lib/util/fault.c:79(fault_report)
> 2019-01-11 01:16:19 winbindd[17540]:   INTERNAL ERROR: Signal 7 in pid
> 17540 (4.7.1-GIT-c0bd705-Hammerspace)
> 2019-01-11 01:16:19 winbindd[17540]:   Please read the
> Trouble-Shooting section of the Samba HOWTO
> 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162220,  0]
> ../lib/util/fault.c:81(fault_report)
> 2019-01-11 01:16:19 winbindd[17540]:
> ===============================================================
> 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162232,  0]
> ../source3/lib/util.c:804(smb_panic_s3)
> 2019-01-11 01:16:19 winbindd[17540]:   PANIC (pid 17540): internal error
> 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162550,  0]
> ../source3/lib/util.c:915(log_stack_trace)
> 2019-01-11 01:16:19 winbindd[17540]:   BACKTRACE: 25 stack frames:
> --------------------------------------------------------------
> 
> This is with a 4.7.1ish version of Samba.
> 
> It seems to be due to a race between the parent and child with both of
> them calling gencache_stabilize and with the right phase of the moon,
> one seems to have closed the tdb (and thus unmapped the mutexes
> memory) while the other is iterating the mutexes.
> 
> I see that the whole gencache_stabilize stuff was removed around December 2018.
> 
> 1. Is it worth filing a bug in case the change needs back porting?

Nope. 4.7 is out of maintenance (except for security fixes), so even if you log a bug,
the patch you'd attach would be a courtesy and would not go into a release.

> 2. Would it be better in my case to remove the calls to
> gencache_stabilize from winbind's shutdown or should I take the whole
> change in 1386200be5c583c680c3894a11688a0e0a3d2285?

I'd try to move forward if possible, as the current code
is maintained.
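
For anyone reading along, the failure mode Richard describes (one code path
closes the tdb and unmaps the mutex area while another is still walking it)
can be illustrated with a small analogy. This is only a sketch, not Samba
code: the function name iterate_after_close and its parameters are made up
for illustration, and an anonymous mmap stands in for the mutex region. In
Python the stale access raises ValueError; in C the same pattern touches
unmapped memory and the process takes a fatal signal instead.

```python
import mmap


def iterate_after_close(num_slots: int = 4, close_at: int = 2) -> str:
    """Walk a mapped region slot by slot while it is closed partway through.

    Hypothetical stand-in for the race: the loop plays the process iterating
    the mutexes, and region.close() plays the other side closing the tdb.
    """
    # Anonymous mapping standing in for the shared mutex area (assumption).
    region = mmap.mmap(-1, num_slots)
    visited = 0
    try:
        for i in range(num_slots):
            if i == close_at:
                # The "other" code path tears the mapping down mid-iteration.
                region.close()
            # Reading a closed mmap raises ValueError in Python.
            _ = region[i]
            visited += 1
    except ValueError:
        return f"fault after {visited} slots"
    finally:
        # Release the mapping if the loop finished without closing it.
        if not region.closed:
            region.close()
    return f"visited all {visited} slots"
```

With the defaults the walk stops at the slot where the mapping was closed;
with close_at outside the range the walk completes. Dropping the shutdown
cleanup, or moving to code that no longer has it, removes the window.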


