[Samba] Odd Samba 4 ("4.2.0pre1-GIT-b505111"; actually only using client) behaviour #2 - "accept: Software caused connection abort".

Tris Mabbs TM-Samba201302 at Firstgrade.Co.UK
Wed Dec 11 13:00:17 MST 2013


So I *finally* got around to looking into this in a bit more detail.

For brevity I've chopped off the original message; basically, I'm
getting hundreds (and hundreds, and ...) of "accept: Software caused
connection abort" messages logged.  I questioned whether perhaps this should
only be logged at higher debug levels.  There was also the suggestion of
putting a "sleep (x);" in the error path, but the error does not repeat on
the same socket (the next "pollsys(...)" will typically not error, so there
aren't consecutive errors from the same socket), so that might not be
appropriate.

I said I'd look into it a bit more to see what was common about the errors
(e.g., always on the same interface? ...).
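
To make the "log it only at a higher debug level, and don't sleep" idea
concrete, here is a minimal sketch.  It is Python rather than the actual smbd
C code, and the names "accept_once" and "TRANSIENT_ACCEPT_ERRORS" are made up
for illustration; it only assumes that ECONNABORTED from accept() means one
pending connection went away, not that the listening socket is broken.

```python
import errno
import logging

log = logging.getLogger("accept_sketch")  # hypothetical logger name

# Errors taken to mean "this one pending connection vanished", not
# "the listening socket is unusable" (an assumption of this sketch).
TRANSIENT_ACCEPT_ERRORS = {errno.ECONNABORTED}


def accept_once(listener):
    """Call accept() once; return the new socket, or None if the
    pending connection was aborted before we could accept it."""
    try:
        conn, _addr = listener.accept()
        return conn
    except OSError as exc:
        if exc.errno in TRANSIENT_ACCEPT_ERRORS:
            # Debug level only, and no sleep: the next poll on this
            # descriptor is expected to behave normally.
            log.debug("accept: %s", exc.strerror)
            return None
        raise  # anything else is still reported to the caller
```

Any other accept() failure (running out of descriptors, say) still
propagates, so a sleep or a louder log message could be kept for those cases.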

OK.

So I've now looked into it a bit further, as I had an "smbd" process (PID
224) doing this basically all day.  So (summarised a bit):

# truss -f -vall -p 224 >& /var/tmp/smbd.truss &
[1] 12345
# (wait quite a while ...)
# kill %1
Killed: truss ...
# grep ECONNAB /var/tmp/smbd.truss | wc -l
     141
# (checked on which descriptors this was happening; always 33, 35 or 37)
# grep ECONNAB /var/tmp/smbd.truss | grep -v '(3[357],' |wc -l
    0
# pfiles 224
...
  33: S_IFSOCK mode:0666 dev:557,0 ino:56426 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
        SOCK_STREAM
        SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(32768),SO_RCVBUF(33232)
        sockname: AF_INET X.X.X.X  port: 445
        congestion control: newreno
...
  35: S_IFSOCK mode:0666 dev:557,0 ino:58053 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
        SOCK_STREAM
        SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(32768),SO_RCVBUF(33232)
        sockname: AF_INET X.X.X.Y  port: 445
        congestion control: newreno
...
  37: S_IFSOCK mode:0666 dev:557,0 ino:34369 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
        SOCK_STREAM
        SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(32768),SO_RCVBUF(33232)
        sockname: AF_INET X.X.X.Z  port: 445
        congestion control: newreno
...
# (dig one of the ECONNABORTED messages out; they're all of the same form)
...
224:    pollsys(0x08088D48, 8, 0xFEFFDF58, 0x00000000) (sleeping...)
224:            fd=39 ev=POLLIN|POLLHUP rev=0
224:            fd=38 ev=POLLIN|POLLHUP rev=0
224:            fd=34 ev=POLLIN|POLLHUP rev=0
224:            fd=36 ev=POLLIN|POLLHUP rev=0
224:            fd=37 ev=POLLIN|POLLHUP rev=0
224:            fd=35 ev=POLLIN|POLLHUP rev=0
224:            fd=33 ev=POLLIN|POLLHUP rev=0
224:            fd=6  ev=POLLIN|POLLHUP rev=0
224:            timeout: 49.964000000 sec
224:    pollsys(0x08088D48, 8, 0xFEFFDF58, 0x00000000)  = 1
224:            fd=39 ev=POLLIN|POLLHUP rev=0
224:            fd=38 ev=POLLIN|POLLHUP rev=0
224:            fd=34 ev=POLLIN|POLLHUP rev=0
224:            fd=36 ev=POLLIN|POLLHUP rev=0
224:            fd=37 ev=POLLIN|POLLHUP rev=POLLIN
224:            fd=35 ev=POLLIN|POLLHUP rev=0
224:            fd=33 ev=POLLIN|POLLHUP rev=0
224:            fd=6  ev=POLLIN|POLLHUP rev=0
224:            timeout: 49.964000000 sec
224:    accept(37, 0xFEFFDE0C, 0xFEFFDDF8, SOV_DEFAULT) Err#130 ECONNABORTED
...
#
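
The "which descriptors is this happening on" check above was done by hand
with grep.  A rough sketch of the same tally as a script follows; the function
name "count_aborts_by_fd" is made up, and the regular expression assumes the
truss lines look like the accept() line quoted above.

```python
import re
from collections import Counter

# Matches truss lines such as:
#   224:    accept(37, 0x..., 0x..., SOV_DEFAULT) Err#130 ECONNABORTED
# and captures the descriptor number passed to accept().
ACCEPT_ABORT = re.compile(r"accept\((\d+),.*\bECONNABORTED\b")


def count_aborts_by_fd(lines):
    """Return a Counter mapping descriptor number to the number of
    accept() calls on it that failed with ECONNABORTED."""
    counts = Counter()
    for line in lines:
        match = ACCEPT_ABORT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Feeding it the open truss file (e.g. "count_aborts_by_fd(open(path))") gives
one count per descriptor, which can then be matched against the "pfiles"
output to see which address and port each one is listening on.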

So it's *always* happening on a socket listening on port 445, never on 139
(or any other port), and it's happening on *every* socket listening on port
445 (3 interfaces, 3 sockets listening on that port).  That makes a resource
issue (at the client or the server end) unlikely, since a resource problem
would hardly be specific to one port; so perhaps some protocol corner-case?
Or, of course, just Solaris being moronic (wouldn't be unheard of ...)?

I don't know whether that helps anyone at all?

Cheers, and season's greetings (and all that ...),

Tris.


