[patch] Re: smbmount problem
Michael H. Warfield
mhw at wittsend.com
Mon Feb 1 14:13:15 GMT 1999
Paul enscribed thusly:
> > Warning! If you use this patch and you do not implement the
> > fix for the deadly embrace, you will cause smbmount processes to hang
> > in a deadly embrace. You have to reboot in order to clear the mount
> > points.
> > If you didn't fix the setsid problem you could not have properly
> > tested it against autofs. I've already compared notes with HPA (author
> > of autofs) and determined the problem. There is a deterministic deadly
> > embrace. It doesn't work with autofs unless you fix this problem! If
> > you don't fix this problem, a hung smbmnt process will lock up the mount
> > point until you reboot.
> All I can say for myself is that I never experienced such a thing in my
> short testing. I did notice the race, and the ugly error messages printed
> by the kernel smbfs code whilst the connection wasn't completed. Only the
> problems that I noticed were fixed by my patch. <shrug> I used Mike
> Warfield's smbmount syntax script between autofs and smbmount, and I did
> notice some comments in there about a workaround for such a thing. If you
> found the problem, great!
Well here is the interesting part... If you never experienced the
deadly embrace, then you never fixed the race condition. :-) It turns
out that the two are interrelated. The deadly embrace has always been
in the code. The reason it never showed up before was that the smbmount
parent process would exit prematurely and allow autofs to run, which would
release smbmnt and break the deadly embrace. This probably made the timing
problem worse than it should have been, since it guaranteed that the parent
process would ALWAYS exit prematurely and the smbmnt process had at least
two system calls to "bump through" before it was finally done.
Any and all other fixes that attempted to fix the timing problem
but failed to address the setsid problem either had to have a timeout
which would allow smbmount to exit prematurely after the timeout or they
had a bug and were not doing what you thought they were doing, or they
would hang smbmnt and automount. The only way out of the deadly embrace
with autofs is if the timing window existed, with whatever delays people
may have inserted... Bizarre...
That's why I refused to check in any further changes until I got
to the bottom of the deadly embrace. There was obviously something going
on in that code that we just didn't understand that was causing the hanging
condition when we fixed the timing window. I didn't want to fix one thing
and cause something worse or make things worse by fixing the wrong thing
(which has been done before). I am SO incredibly glad that HPA got back
to me on this. I would have never figured out that it was the call to
setsid() that was causing this if he hadn't made a remark about maybe the
process was no longer in the same process group...
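To illustrate that remark about process groups, here is a minimal sketch. It is not taken from the smbmount source, and the function name session_ids_after_setsid is made up for illustration. It only shows the general POSIX behaviour: a forked child that calls setsid() ends up leading a new session and a new process group, so it no longer shares either with its parent.

```python
import os


def session_ids_after_setsid():
    """Fork a child that calls setsid() and report the resulting IDs.

    Returns (parent_ids, child_ids, child_pid), where each *_ids value
    is a (process_group_id, session_id) tuple.
    """
    parent_ids = (os.getpgrp(), os.getsid(0))
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: a freshly forked child is never a process group leader,
        # so setsid() is permitted here.
        status = 1
        try:
            os.close(read_fd)
            os.setsid()
            report = "%d %d" % (os.getpgrp(), os.getsid(0))
            os.write(write_fd, report.encode())
            status = 0
        finally:
            # Leave without running the parent's cleanup handlers.
            os._exit(status)
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    pgid, sid = (int(field) for field in data.split())
    return parent_ids, (pgid, sid), pid


if __name__ == "__main__":
    parent_ids, child_ids, child_pid = session_ids_after_setsid()
    print("parent (pgid, sid):", parent_ids)
    print("child  (pgid, sid):", child_ids, "pid:", child_pid)
```

After setsid() the child's process group ID and session ID both equal its own PID, which is why anything that identifies the requester by process group stops recognising it as belonging to the original group.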
> There are still some small problems with Samba and glibc 2.0.10x, which
> according to Ulrich Drepper should be released as a stable version some
> time in the near future, unless something really bad pops up. I'm not
> talking about the uid thing. I put up a post a few days ago detailing the
> problems I've seen so far.
I saw some remarks about cache and some possible buffer release
problems, but those are all kernel problems. I'm waiting to hear back
from Bill Hawes, who was the last party known to be supporting the smbfs
kernel modules. Hopefully he knows what's going on with the glibc stuff
as well. I did just hear back from Alan Cox, and the fix for the smb_fs.h
problem that I hacked around in smbumount.c is in the queue to make it
into the kernel sources. So another one bites the dust...
> Good work (esp. Mike), keep it up.
> Paul Laufer
Michael H. Warfield | (770) 985-6132 | mhw at WittsEnd.com
(The Mad Wizard) | (770) 925-8248 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0xDF1DD471 | possible worlds. A pessimist is sure of it!