[Samba] extremely low performance on Samba 4.2.14-Debian
Rowland Penny
rpenny at samba.org
Thu Aug 10 17:46:12 UTC 2017
On Thu, 10 Aug 2017 19:21:53 +0200
Emmanuel Florac via samba <samba at lists.samba.org> wrote:
>
> Hi everyone,
>
> here's my problem: I have a fast server (dual Xeon E5-2620, 64 GB RAM)
> with a fast RAID array (24 disks, RAID-6, more than 2GB/s read/write
> local performance, XFS filesystem) and fast network : dual 10GigE
> (myri10g) and 40GigE (i40e).
>
> It's running Debian 8.11, tried various kernel versions (currently
> 4.4.x, but 4.9 isn't any better).
>
> It's as slow as dead snails in melted molasses when using Samba.
> Everything else is fine:
>
> * from a windows PC with a 10GigE card, using ftp.exe and vsftpd, I
> transfer files at 500/600 MB/s easily.
> * using Chrome and downloading files through HTTP, I've got 250 MB/s.
>
> * using Samba, it reaches 105/110 MB/s, tops. Awful.
>
> The Windows client and the Linux server are both connected to the same
> 10GigE/40GigE switch. Transferring from a windows machine to another
> works fine (700 MB/s and more). Therefore the windows machines are NOT
> at fault.
>
> Looking at what's happening on the server, I noticed that smbd uses
> gobs of CPU. It uses about 1% of a core (from 'top') for each MB/s
> transferred, so when it reaches ~100 MB/s the core it's running on is
> maxed out! That's definitely NOT normal: on a very similar setup
> (same motherboard, CPU, RAM, RAID controller, OS, etc.), when an smbd
> process is writing at 500/600 MB/s its CPU consumption tops out at
> 47%!
>
> I don't know what's wrong and why smbd is burning CPU cycles like
> this.
>
> Here is a quick comparison I've done. First, the "bad" machine:
>
> root at storiq-111:~# pidstat -p 11694 2 20
> Linux 4.4.78-storiq64-opteron (storiq-111)  10/08/2017  _x86_64_  (32 CPU)
>
> 16:30:12     UID    PID   %usr  %system  %guest   %CPU  CPU  Command
> 16:30:14       0  11694   0,00     0,00    0,00   0,00    8  smbd
> 16:30:16   10500  11694  48,00     8,00    0,00  56,00    8  smbd
> 16:30:18   10500  11694  54,00    13,00    0,00  67,00    8  smbd
> 16:30:20   10500  11694  54,00    12,00    0,00  66,00    8  smbd
> 16:30:22   10500  11694  61,50    11,50    0,00  73,00    8  smbd
> 16:30:24   10500  11694  61,50    10,00    0,00  71,50    8  smbd
> 16:30:26   10500  11694  64,00    10,00    0,00  74,00    8  smbd
> 16:30:28   10500  11694  63,50    10,00    0,00  73,50    8  smbd
> 16:30:30   10500  11694  67,50    11,50    0,00  79,00    8  smbd
>
> root at storiq-111:~# numastat -p 11694
> Per-node process memory usage (in MBs) for PID 11694 (smbd)
>                          Node 0          Node 1           Total
>                 --------------- --------------- ---------------
> Huge                       0.00            0.00            0.00
> Heap                       0.28            0.28            0.56
> Stack                      0.02            0.02            0.04
> Private                   11.51           14.45           25.96
>                 --------------- --------------- ---------------
> Total                     11.80           14.76           26.56
>
> Notice that it burns tons of CPU in "user". By contrast, here's
> another (different and much slower) machine:
>
> root at storiq-313:~# pidstat -p 19654 2 20
> Linux 4.4.79-storiq64-opteron (storiq-313)  10/08/2017  _x86_64_  (16 CPU)
>
> 18:29:30    UID    PID  %usr  %system  %guest    %CPU  CPU  Command
> 18:29:32   1000  19654  5,50    75,50    0,00   81,00    2  smbd
> 18:29:34   1000  19654  6,50    82,50    0,00   89,00    0  smbd
> 18:29:36   1000  19654  6,50    89,00    0,00   95,50    0  smbd
> 18:29:38   1000  19654  5,50    92,00    0,00   97,50    4  smbd
> 18:29:40   1000  19654  6,50    90,50    0,00   97,00   10  smbd
> 18:29:42   1000  19654  6,00    94,00    0,00  100,00    0  smbd
> 18:29:44   1000  19654  7,00    90,50    0,00   97,50    0  smbd
> 18:29:46   1000  19654  7,50    87,00    0,00   94,50    0  smbd
> 18:29:48   1000  19654  6,00    92,00    0,00   98,00    0  smbd
> 18:29:50   1000  19654  7,00    91,00    0,00   98,00    0  smbd
> 18:29:52   1000  19654  6,00    89,00    0,00   95,00    0  smbd
>
>
> Per-node process memory usage (in MBs) for PID 19654 (smbd)
>                          Node 0          Node 2           Total
>                 --------------- --------------- ---------------
> Huge                       0.00            0.00            0.00
> Heap                       0.14            0.00            0.64
> Stack                      0.02            0.00            0.04
> Private                    4.61            0.00            8.57
>                 --------------- --------------- ---------------
> Total                      4.78            0.00            9.25
>
> The theoretically slower machine is actually 5x faster! That's not
> amusing... Also for some reason it uses much more memory, on both
> nodes, on the "bad" machine (but there are many more clients).
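The "5x faster" claim can be sanity-checked from the pidstat figures above. A minimal sketch, using the approximate numbers quoted in this thread (the throughput values are the rounded figures from the mail, not measurements):

```python
# Per-MB/s CPU cost, from the approximate figures quoted in this thread.
def cpu_per_mbps(cpu_percent: float, mb_per_s: float) -> float:
    """Percent of one core consumed per MB/s of transfer."""
    return cpu_percent / mb_per_s

bad = cpu_per_mbps(73.0, 100.0)   # "bad" machine: ~73% CPU at ~100 MB/s
good = cpu_per_mbps(47.0, 500.0)  # healthy setup: ~47% CPU at ~500 MB/s
print(f"bad: {bad:.2f} %/MBps, good: {good:.3f} %/MBps, "
      f"ratio: {bad / good:.1f}x")
# → bad: 0.73 %/MBps, good: 0.094 %/MBps, ratio: 7.8x
```

So the per-byte CPU cost on the "bad" box is roughly an order of magnitude higher, consistent with a per-packet computation (rather than I/O) dominating.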
>
> I tried running strace on the working smbd process, but I don't see
> anything remarkable in its output. No hardware errors in mcelog
> either. I'm out of ideas... What's going on?
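Since strace shows nothing unusual and the time is going to user space, a CPU profile of the busy smbd would reveal which functions are hot. A sketch, assuming `perf` is installed and reusing the PID from the pidstat run above (the commands need a live transfer in progress to show anything):

```shell
# Sketch: profile where the busy smbd burns its user CPU.
PID=11694

# Live view of the hottest functions in that process:
perf top -p "$PID"

# Or record ~10 seconds with call graphs, then inspect the report:
perf record -g -p "$PID" -- sleep 10
perf report --stdio | head -n 40
```

If the top entries are hashing or crypto routines rather than read/write paths, that points at per-packet signing or encryption as the bottleneck.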
>
> smb.conf in case anyone spots something fishy (it's actually split in
> 3):
>
> /etc/samba/smb.conf:
>
> netbios name = storiq-111
> server string = %h server (Samba, Debian)
> include = /etc/samba/smb-common.ad.conf
> include = /etc/samba/smb-shares.conf
>
> /etc/samba/smb-common.ad.conf:
>
> security = ADS
> workgroup = TEST
> realm = AD.TEST.COM
>
>
> winbind sealed pipes = false
> require strong key = false
> winbind sealed pipes:TEST = true
> require strong key:TEST = true
> winbind refresh tickets = yes
> winbind trusted domains only = no
> winbind use default domain = yes
> winbind enum users = yes
> winbind enum groups = yes
> winbind cache time = 7200
> winbind offline logon = yes
>
>
> idmap config *:backend = tdb
> idmap config *:range = 2000-9999
> idmap config TEST:backend = rid
> idmap config TEST:range = 10000-50000000
>
> winbind nss info = template
> template shell = /bin/bash
> template homedir = /mnt/raid/%u
>
> client use spnego = yes
> client ntlmv2 auth = yes
> encrypt passwords = yes
> restrict anonymous = 2
> server signing = mandatory
> ntlm auth = yes
>
> log level = 0
> log file = /var/log/samba/smbd.log
> max log size = 50
>
> vfs objects = acl_xattr
> map acl inherit = yes
> store dos attributes = yes
>
> /etc/samba/smb-shares.conf:
>
> [test_tr]
> comment = test_tr users
> valid users = @prod
> force group = prod
> force create mode = 775
> read only = no
> path = /mnt/raid/test_tr
> guest ok = no
> ; vfs objects = acl_xattr streams_xattr
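One line in smb-common.ad.conf worth a second look performance-wise is `server signing = mandatory`: with signing forced on, smbd computes a MAC over every SMB packet in user space, which shows up exactly as high %usr on a single maxed-out core. A sketch of the relaxation, assuming signing is not required by site policy (this is a suggestion, not the poster's configuration):

```
# /etc/samba/smb-common.ad.conf (sketch) -- only if signing is not a
# policy requirement; "auto" lets clients negotiate signing instead of
# forcing it on every connection.
server signing = auto
```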
>
>
>
4.2.x is EOL as far as Samba is concerned; there have been a lot of
changes since 4.2.x came out.
Can I suggest you go here: http://apt.van-belle.nl/
You can get a much more recent version there: 4.6.7.
Rowland