Another showstopper in 2.2.5

Fredrik Ohrn ohrn at chl.chalmers.se
Tue Aug 20 22:41:02 GMT 2002


On Tue, 20 Aug 2002 jra at dp.samba.org wrote:

> > > 
> > > Things to note are that the client is alive and has had no (obvious)
> > > reason to drop the connection or otherwise stop responding. I have tried
> > > this over and over and it's always in send_keepalive that the SIGPIPE of
> > > death happens.
> 
> The problem is this shouldn't happen - we block all SIGPIPE
> signals in smbd startup.....
> 
> If we're getting them there is a problem somewhere. Remember,
> running under gdb may reset the process signal mask - when 
> not running under gdb we will never see SIGPIPE.
> 


Well, I'd say it does get a SIGPIPE even without the help of gdb.


Consider that my original problem was that smbd processes just disappeared 
without cleaning up after themselves.

After I added the BlockSignals call to send_keepalive, this doesn't happen 
anymore. Instead, two new messages have started to appear in my logfiles:


[2002/08/21 08:47:55, 0, pid=4710] lib/util_sock.c:write_socket_data(501)
  write_socket_data: write failure. Error = Broken pipe
[2002/08/21 08:47:55, 0, pid=4710] smbd/process.c:timeout_processing(1147)
  password server keepalive failed.


These messages had never appeared before. No gdb was involved here.

So the conclusion is that before the patch, Samba died with SIGPIPE when 
write()ing the keepalive message; now write() properly fails with an EPIPE 
error instead.
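
To illustrate the difference, here is a minimal sketch in Python (not Samba code; the function name is made up for the illustration, and a plain pipe stands in for the socket to the password server). A child process writes to a pipe whose read end is already closed, once with SIGPIPE unblocked and once with it blocked, roughly what I assume BlockSignals achieves in smbd:

```python
import os
import signal


def write_to_closed_pipe(block_sigpipe: bool) -> str:
    """Fork a child that writes to a pipe with no reader and report its fate."""
    pid = os.fork()
    if pid == 0:  # child process
        code = 2  # reported if anything unexpected goes wrong
        try:
            # Python ignores SIGPIPE at startup; restore the default
            # disposition so the child behaves like a plain C program.
            signal.signal(signal.SIGPIPE, signal.SIG_DFL)
            how = signal.SIG_BLOCK if block_sigpipe else signal.SIG_UNBLOCK
            signal.pthread_sigmask(how, {signal.SIGPIPE})
            read_end, write_end = os.pipe()
            os.close(read_end)  # no reader left, like a dropped connection
            os.write(write_end, b"keepalive")
            code = 0  # the write unexpectedly succeeded
        except BrokenPipeError:
            code = 1  # write() failed with EPIPE
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return "killed by " + signal.Signals(os.WTERMSIG(status)).name
    return {0: "no error", 1: "EPIPE"}.get(os.WEXITSTATUS(status), "other error")


if __name__ == "__main__":
    print("unblocked:", write_to_closed_pipe(block_sigpipe=False))
    print("blocked:  ", write_to_closed_pipe(block_sigpipe=True))
```

With the signal unblocked the child is killed outright, which matches the silently vanishing smbd processes; with it blocked the write fails with EPIPE and the caller gets a chance to log the failure, as in the two messages above.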



Back to the platform again: this happens on Linux boxes. I have a Sparc 
Solaris 8 box serving files too, and it doesn't have this problem at all!

Let's assume that if the password server drops the connection to the Linux 
boxes, then it should drop the connection to the Solaris box for the same 
(unknown) reason.

Well, it doesn't.

If the signal blocking on Solaris is broken, then I should have the same 
problem with disappearing smbd processes, but I don't.

If signal blocking works on Solaris, I should get the two messages above in 
the logfile on Solaris too, but I don't.


I hope I made sense. In conclusion, on Linux the connection to the 
password server breaks and Samba dies by SIGPIPE. But the connection 
shouldn't break in the first place, and even if it did, the SIGPIPE should 
have been blocked.
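
One way to check, without gdb, whether SIGPIPE really is blocked in a running smbd on Linux is to read the SigBlk and SigIgn masks from /proc/<pid>/status. A rough Python sketch follows; the function name is hypothetical, and it assumes the usual Linux layout where those fields are hexadecimal bitmasks with signal N stored in bit N-1:

```python
import signal


def sigpipe_state(status_text: str) -> dict:
    """Report whether SIGPIPE is blocked or ignored, given /proc/<pid>/status text."""
    bit = 1 << (signal.SIGPIPE - 1)  # signal N is stored in bit N-1 of the mask
    state = {}
    for line in status_text.splitlines():
        field, _, value = line.partition(":")
        if field in ("SigBlk", "SigIgn"):
            state[field] = bool(int(value.strip(), 16) & bit)
    return state
```

For a live process this could be called as sigpipe_state(open("/proc/4710/status").read()), using the pid from the log lines. If both SigBlk and SigIgn come back False, a write to a broken connection will kill that process.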

On Solaris it just works...


/Fredrik

-- 
   "It is easy to be blinded to the essential uselessness of computers by
   the sense of accomplishment you get from getting them to work at all."
                                                   - Douglas Adams

Fredrik Öhrn                               Chalmers University of Technology
ohrn at chl.chalmers.se                                                  Sweden




