On winbind shutdown prior to the removal of gencache_stabilize we could crash due to races

Rowland Penny rpenny at samba.org
Mon Mar 11 18:55:24 UTC 2019


On Mon, 11 Mar 2019 11:31:15 -0700
Jeremy Allison via samba-technical <samba-technical at lists.samba.org>
wrote:

> On Mon, Mar 11, 2019 at 11:20:49AM -0700, Richard Sharpe wrote:
> > On Mon, Mar 11, 2019 at 10:11 AM Jeremy Allison <jra at samba.org>
> > wrote:
> > >
> > > On Mon, Mar 11, 2019 at 09:47:16AM -0700, Richard Sharpe via
> > > samba-technical wrote:
> > > > Hi folks,
> > > >
> > > > We are seeing this on winbind shutdown:
> > > >
> > > > --------------------------------------------------
> > > > 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.153272,  0] ../source3/winbindd/winbindd.c:281(winbindd_sig_term_handler)
> > > > 2019-01-11 01:16:19 systemd[1]: Starting Hammerspace Maintenance Target.
> > > > 2019-01-11 01:16:19 winbindd[17540]:   Got sig[15] terminate (is_parent=0)
> > > > 2019-01-11 01:16:19 winbindd[17497]: [2019/01/11 01:16:19.153546,  0] ../source3/winbindd/winbindd.c:281(winbindd_sig_term_handler)
> > > > 2019-01-11 01:16:19 winbindd[17497]:   Got sig[15] terminate (is_parent=1)
> > > > 2019-01-11 01:16:19 winbindd[17507]: [2019/01/11 01:16:19.153413,  0] ../source3/winbindd/winbindd.c:281(winbindd_sig_term_handler)
> > > > 2019-01-11 01:16:19 winbindd[17507]:   Got sig[15] terminate (is_parent=0)
> > > > 2019-01-11 01:16:19 systemd[1]: Stopped System Security Services Daemon.
> > > > 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162163,  0] ../lib/util/fault.c:78(fault_report)
> > > > 2019-01-11 01:16:19 winbindd[17540]:   ===============================================================
> > > > 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162202,  0] ../lib/util/fault.c:79(fault_report)
> > > > 2019-01-11 01:16:19 winbindd[17540]:   INTERNAL ERROR: Signal 7 in pid 17540 (4.7.1-GIT-c0bd705-Hammerspace)
> > > > 2019-01-11 01:16:19 winbindd[17540]:   Please read the Trouble-Shooting section of the Samba HOWTO
> > > > 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162220,  0] ../lib/util/fault.c:81(fault_report)
> > > > 2019-01-11 01:16:19 winbindd[17540]:   ===============================================================
> > > > 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162232,  0] ../source3/lib/util.c:804(smb_panic_s3)
> > > > 2019-01-11 01:16:19 winbindd[17540]:   PANIC (pid 17540): internal error
> > > > 2019-01-11 01:16:19 winbindd[17540]: [2019/01/11 01:16:19.162550,  0] ../source3/lib/util.c:915(log_stack_trace)
> > > > 2019-01-11 01:16:19 winbindd[17540]:   BACKTRACE: 25 stack frames:
> > > > --------------------------------------------------------------
> > > >
> > > > This is with a 4.7.1ish version of Samba.
> > > >
> > > > It seems to be due to a race between the parent and child with
> > > > both of them calling gencache_stabilize and with the right
> > > > phase of the moon, one seems to have closed the tdb (and thus
> > > > unmapped the mutexes memory) while the other is iterating the
> > > > mutexes.
> > > >
> > > > I see that the whole gencache_stabilize stuff was removed
> > > > around December 2018.
> > > >
> > > > 1. Is it worth filing a bug in case the change needs
> > > > backporting?
> > >
> > > Nope. 4.7 is out of maintenance (except for security), so even
> > > if you log a bug the patch you'd attach would be a courtesy, but
> > > not go into a release.
> > 
> > The bug likely still exists in 4.8 and maybe 4.9 :-)
> 
> OK, I was confused, sorry. So you mean the gencache_stabilize()
> stuff is inherently racy and still exists in supported releases?
> 
> If so, yeah logging a bug is the right thing to do.
> 

Hi Jeremy, Richard is reporting a problem with 4.7.x and apparently
thinks it might still exist in 4.8 and 4.9. So you were quite correct
in suggesting an upgrade to a later version. If the problem is still
there, then a bug report is warranted. If it isn't, then it probably
will never get fixed because, if all goes well, 4.7 will go EOL next
Tuesday with the release of 4.10.0.

Rowland
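
For anyone decoding the quoted log: "sig[15]" is SIGTERM, and on Linux
x86 and ARM "Signal 7" is SIGBUS, which fits a process touching mapped
memory that is no longer backed. A minimal sketch for looking the
numbers up, assuming a POSIX libc; describe_signal() is a hypothetical
helper written for illustration and is not part of Samba:

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a Samba function): map a raw signal number,
 * as printed in the winbindd log, to the libc description string.
 * Signal numbering is platform-specific; SIGBUS is 7 on Linux x86/ARM
 * but differs on some other architectures and operating systems.
 */
static const char *describe_signal(int signo)
{
	const char *desc = strsignal(signo);

	return desc != NULL ? desc : "unknown signal";
}
```

Calling describe_signal(7) and describe_signal(15) on the machine that
produced the log gives the local meaning of both numbers. Comparing
against the SIGBUS and SIGTERM macros from <signal.h> confirms the
mapping for that platform.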


