Another showstopper in 2.2.5
ohrn at chl.chalmers.se
Tue Aug 20 22:41:02 GMT 2002
On Tue, 20 Aug 2002 jra at dp.samba.org wrote:
> > >
> > > One thing to note is that the client is alive and has had no (obvious)
> > > reason to drop the connection or otherwise stop responding. I have tried
> > > this over and over and it's always in send_keepalive that the SIGPIPE of
> > > death happens.
> The problem is this shouldn't happen - we block all SIGPIPE
> signals in smbd startup.....
> If we're getting them there is a problem somewhere. Remember,
> running under gdb may reset the process signal mask - when
> not running under gdb we will never see SIGPIPE.
Well, I'd say it does get a SIGPIPE even without the help of gdb.
Consider that my original problem was that smbd processes just disappeared
without cleaning up after themselves.
After I added the BlockSignals call to send_keepalive this doesn't happen
anymore. Instead, two new messages have started to appear in my logfiles:
[2002/08/21 08:47:55, 0, pid=4710] lib/util_sock.c:write_socket_data(501)
write_socket_data: write failure. Error = Broken pipe
[2002/08/21 08:47:55, 0, pid=4710] smbd/process.c:timeout_processing(1147)
password server keepalive failed.
These messages have never appeared before. No gdb was involved here.
So the conclusion is that before the patch samba died with SIGPIPE when
write()ing the keepalive message; now it properly gets an EPIPE error
instead.
Back to the platform again, this happens on Linux boxes. I have a Sparc
Solaris 8 box serving files too, and it doesn't have this problem at all!
Let's assume that if the password server drops the connection to the Linux
boxes, then it should drop the connection to the Solaris box for the same
reason.
Well, it doesn't.
If the signal blocking on Solaris is broken, then I should have the same
problem with disappearing smbd processes, but I don't.
If signal blocking works on Solaris I should get the two messages above in
the logfile on Solaris too, but I don't.
I hope I made sense. In conclusion, on Linux the connection to the
password server breaks and samba dies by SIGPIPE. But the connection
shouldn't break in the first place and even if it did the SIGPIPE should
have been blocked.
On Solaris it just works...
"It is easy to be blinded to the essential uselessness of computers by
the sense of accomplishment you get from getting them to work at all."
- Douglas Adams
Fredrik Öhrn Chalmers University of Technology
ohrn at chl.chalmers.se Sweden