FreeBSD + samba 2.2.2 problems; semi-solution

Jeremy Allison jra at samba.org
Wed Jan 16 23:22:02 GMT 2002


On Wed, Jan 16, 2002 at 11:32:56PM +0000, Mike Silbersack wrote:
> 
> As many of you may have noticed, reports on the various samba
> and freebsd lists describe serious problems with oplocks.  Specifically,
> oplock error messages begin to appear in host-specific logfiles, similar
> to the following:
> 
> [2002/01/02 17:19:09, 0] smbd/oplock.c:request_oplock_break(981)
>   request_oplock_break: no response received to oplock break request to
> pid 51973 on port 4633 for dev = 27400, inode = 957708, file_id = 18
> 
> One then finds that the smbd process (pid 51973 in this case) must be
> manually killed for the problem to go away.
> 
> I've spent many hours poking at this problem, and I've come to the
> conclusion that oplocks are not (directly) to blame in this situation.
> 
> We seem to have two problems occurring:
> 
> 1.  Samba blocks indefinitely in some read() calls when a client "goes
> quiet."
> 
> 2.  There is some win98se <-> samba 2.2.2 interaction causing clients to
> "go quiet."
> 
> 1.  Samba issues some blocking calls with no timeout, causing the smbd
> process to indefinitely hang if a client suddenly "goes quiet."  The
> guilty path in the case I can recreate is this:
> 
> #0  0x281d3863 in read () from /usr/lib/libc.so.5
> #1  0x811118b in read_socket_data (fd=12, buffer=0x8227005 "<FF>SMB\013", N=57904) at lib/util_sock.c:465
> #2  0x8111699 in receive_smb (fd=12, buffer=0x8227001 "", timeout=0) at lib/util_sock.c:669
> #3  0x807ef20 in receive_message_or_smb (buffer=0x8227001 "", buffer_len=65600, timeout=60000) at smbd/process.c:246
> #4  0x80800c2 in smbd_process () at smbd/process.c:1252
> #5  0x804c34d in main (argc=4, argv=0xbfbffbe4) at smbd/server.c:827
> #6  0x804ae19 in _start ()
> 
> I've worked around this with the patch to util_sock.c which is attached;
> it changes the above call from read_socket_data to
> read_socket_with_timeout, specifying a 10 second timeout.  As a result,
> smbd processes in this state will detect that the client suddenly went
> quiet, and exit after 10 seconds, dropping all held oplocks.  This is only
> a temporary workaround, albeit a very effective one.  A better fix would
> be to have real timeouts passed into receive_smb, then to have these
> timeouts propagated down.  Additionally, select()ion on the oplock socket
> could be added to these inner calls.  (That change might take a large
> rearchitecting, however.)
> 
> In short, I think that it would be very wise to provide _some_ timeout
> whenever reading, just so that sysadmins do not have to go in and manually
> kill processes in cases like this.
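
The "some timeout whenever reading" idea can be sketched with select()
in front of each read().  This is only an illustration, not the attached
patch and not Samba's read_socket_with_timeout (whose signature is not
shown in this message); the helper name read_exact_with_timeout and its
return convention are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Samba): read exactly n bytes from fd,
 * waiting at most timeout_ms milliseconds for each chunk of data.
 *
 * Returns n on success, 0 if the peer closed the connection before n
 * bytes arrived, and -1 on error.  A timeout is reported as -1 with
 * errno set to ETIMEDOUT.
 */
static ssize_t read_exact_with_timeout(int fd, char *buf, size_t n,
                                       int timeout_ms)
{
    size_t got = 0;

    while (got < n) {
        fd_set rfds;
        struct timeval tv;
        int ready;
        ssize_t r;

        /* select() may modify both arguments, so rebuild them each pass. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        if (ready == 0) {
            errno = ETIMEDOUT;      /* the client "went quiet" */
            return -1;
        }

        r = read(fd, buf + got, n - got);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)
            return 0;               /* EOF before the full message */
        got += (size_t)r;
    }
    return (ssize_t)got;
}
```

With a 10 second timeout (timeout_ms = 10000), a caller that gets -1 with
errno == ETIMEDOUT could exit and so release its oplocks, which is the
behaviour described for the workaround above.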
> 
> 2.  There is some win98se <-> samba 2.2.2 interaction causing clients to
> "go quiet."
> 
> That this problem is occurring I can attest to; when copying the game
> "Serious Sam" to a samba share on my FreeBSD box, I can cause this
> condition to occur > 90% of the time.  This is not a simple problem with
> high load; I can copy a directory many times larger full of mpg / mov
> files without problem.  Hence, I suspect that there is some data dependent
> situation occurring.
> 
> I've tried comparing network parameters to linux boxes and changing my
> settings to match with mixed results.  By changing send / receive socket
> sizes, I am able to change the file in which the problem will occur, but
> it still occurs.  (Note that at the time of the hang, both send and
> receive socket buffers are empty; this is not a problem of data simply not
> being read.)

Can you reproduce this problem on any other system
than FreeBSD? In particular, can you get this to occur
on a Linux box?

I'm wondering if there's a TCP problem between Win98 and
FreeBSD when transporting SMB (which would be somewhat ironic
as they took their TCP stack from your source code in the first
place :-).

Jeremy.



