Samba, NT, and transient network failures

Wed Jan 27 13:19:30 GMT 1999

Hi Nick,

In reverse order of interest:

> We did think to set the SO_KEEPALIVE option in the smb.conf thinking
> that it might speed up the process of getting old smbd processes to
> elicit a RST response to TCP keepalives, but, going by W. Richard
> Stevens' TCP/IP and Unix books, it seems that the default timeout
> values associated with SO_KEEPALIVE are too large to be helpful with
> this problem.

You are right on all counts, but there is also a protocol-level
keepalive in the smb.conf, 'keepalive = time_in_sec'.  I believe the
default value is 5 minutes.  At one time, when the SIGPIPE code was
broken, using this would kill the process, so we called it the
'keepdead' option.  Now it seems to do what it should.
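
For reference, here is a rough sketch of what the TCP-level option
looks like from a program's point of view, and why its defaults don't
help: SO_KEEPALIVE only switches probing on, and the idle time before
the first probe is a system setting (typically two hours).  Some
platforms also allow shorter per-socket timers, for example
TCP_KEEPIDLE on Linux.  This is Python, not Samba code; the function
name enable_keepalive and the timer values are made up for
illustration, and the per-socket options are skipped where the
platform does not provide them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=3):
    """Enable TCP keepalive on sock and shorten its timers where possible.

    Hypothetical helper for illustration only. Returns a dict of the
    per-socket timer options that were actually applied.
    """
    # Portable part: ask the kernel to send keepalive probes at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Non-portable part: per-socket timers. Without these, the system-wide
    # default idle time applies, which is usually far too long to help.
    wanted = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before giving up
    )
    applied = {}
    for name, value in wanted:
        opt = getattr(socket, name, None)
        if opt is None:
            continue  # this platform does not expose the option
        sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        applied[name] = value
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print(enable_keepalive(s))
    finally:
        s.close()
```

The smb.conf 'keepalive' setting works at the SMB protocol level
instead, so it does not depend on any of these TCP timers.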

> Thanks for the info about AMD. We're still using the standard Solaris
> automounter, so we're not affected. The Solaris automounter is
> [mostly] multi-threaded, so it doesn't hang just because a mount
> request is hanging... And only some of us geeks re-export NFS-mounted
> partitions from our workstations :)

The hang I was describing is not in AMD itself; it's in any process
(like smbd) that does a getcwd() in an AMD-mounted directory.  Since
this covered all of the user home dirs that were exported by Samba, you
can imagine the fun and confusion when they would all hang at once
because somebody who was exporting a home dir from their desktop
powered it off and went home.

Good luck,
Frank V