[Samba] Need help troubleshooting TCP thrashing, possible kernel bug?
Paul Klapperich
paul.klapperich at packetdigital.com
Wed Feb 8 21:59:16 UTC 2017
I have a FreeNAS 9.3 server running Samba Version 4.3.6 and a bunch of
Windows and Linux clients. Everything's been running fine for a while and
nothing changed on the server.
Recently (Jan 27th) some of the Archlinux clients updated from a 4.8.x
kernel to a 4.9.x kernel. Again, things ran fine. Then on Jan 30th around
2am the Archlinux clients using 4.9.x kernels and utilizing mount.cifs to
access samba shares began thrashing on TCP port 445, causing high CPU load
on the server. These machines now cause thrashing after 15-20 minutes
whenever a share is mounted using mount.cifs.
When it's thrashing, I see thousands of open connections from a single client:
# sockstat -4 | grep 10.0.1.87 | wc
10013 70091 740962
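To see at a glance which clients hold those connections, the sockstat output can be grouped by foreign address. This is only a minimal sketch: it assumes the usual FreeBSD `sockstat -4` layout where the foreign address is the seventh field in ip:port form, and the function name conns_per_client is made up for illustration.

```shell
# Hypothetical helper: reads `sockstat -4` output on stdin and prints
# "<count> <client-ip>" for each foreign address, busiest client first.
# Assumption: field 7 is the foreign address, formatted as ip:port.
conns_per_client() {
  awk '$7 ~ /^[0-9.]+:[0-9]+$/ { split($7, a, ":"); c[a[1]]++ }
       END { for (ip in c) print c[ip], ip }' | sort -rn
}

# Usage on the server:
#   sockstat -4 | conns_per_client | head
```

The header line and listening sockets (foreign address `*:*`) do not match the pattern, so they are skipped.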
And on the client, the local (source) port is constantly changing:
$ netstat -net | grep 10.0.0.8
tcp 0 0 10.0.1.87:53122 10.0.0.8:445 ESTABLISHED 0 1253359
$ netstat -net | grep 10.0.0.8
tcp 0 0 10.0.1.87:53700 10.0.0.8:445 ESTABLISHED 0 1253439
$ netstat -net | grep 10.0.0.8
tcp 0 0 10.0.1.87:53926 10.0.0.8:445 ESTABLISHED 0 1254557
$ netstat -net | grep 10.0.0.8
tcp 0 0 10.0.1.87:54148 10.0.0.8:445 ESTABLISHED 0 1253578
$ netstat -net | grep 10.0.0.8
tcp 0 0 10.0.1.87:54352 10.0.0.8:445 ESTABLISHED 0 1253604
$ netstat -net | grep 10.0.0.8
tcp 0 0 10.0.1.87:54518 10.0.0.8:445 ESTABLISHED 0 1254685
$ netstat -net | grep 10.0.0.8
tcp 0 0 10.0.1.87:54698 10.0.0.8:445 ESTABLISHED 0 1252177
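The churn can be put into a number by sampling netstat repeatedly and counting the distinct local ports seen toward the server. This is a rough sketch, assuming the `netstat -net` column order shown above (local address in field 4, foreign address in field 5); the function name distinct_local_ports is hypothetical.

```shell
# Hypothetical helper: reads one or more `netstat -net` samples on stdin
# and prints how many distinct local ports were seen toward the given
# peer (ip:port). Assumption: field 4 is local, field 5 is foreign.
distinct_local_ports() {
  awk -v peer="$1" '
    $5 == peer { n = split($4, a, ":"); seen[a[n]] = 1 }
    END { c = 0; for (p in seen) c++; print c }
  '
}

# Usage on the client, sampling once a second for 30 seconds:
#   for i in $(seq 1 30); do netstat -net; sleep 1; done \
#     | distinct_local_ports 10.0.0.8:445
```

A healthy mount should report 1 or a very small number; a count close to the number of samples means the client is reconnecting on nearly every sample.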
As a workaround, I can downgrade these client machines to any 4.8.x kernel
and the issue goes away. My suspicion is something is weird in my smb.conf
and a change in the 4.9.x kernels exposes that weirdness. Or maybe there's
a bug that was introduced in 4.9 and our setup exposes it.
I've built 4.10rc kernels from Linus's git repo and they also have the
problem. The 4.9 kernel I built from Linus's git has the problem, but the
4.8 kernel I built does not, so I don't think it's related to any patching
done by Archlinux. I don't understand why the issue didn't happen
immediately after upgrading kernels on the 27th, but now it very
consistently acts up after less than 20 minutes.
Attached is the smb.conf used on one of my FreeNAS servers. I was able to
copy that config to an Archlinux system running Samba version 4.5.3
(commenting out lines 24, 25, 55, and 79 and adjusting the "interfaces =" line)
and the problem persists, so it doesn't appear to be specific to FreeNAS or
Samba 4.3.6.
--
Paul Klapperich