[Samba] extremely low performance on Samba 4.2.14-Debian
Rowland Penny
rpenny at samba.org
Thu Aug 10 17:46:12 UTC 2017
On Thu, 10 Aug 2017 19:21:53 +0200
Emmanuel Florac via samba <samba at lists.samba.org> wrote:
>
> Hi everyone,
>
> here's my problem: I have a fast server (dual Xeon E5-2620, 64 GB RAM)
> with a fast RAID array (24 disks, RAID-6, more than 2GB/s read/write
> local performance, XFS filesystem) and fast network : dual 10GigE
> (myri10g) and 40GigE (i40e).
>
> It's running Debian 8.11, tried various kernel versions (currently
> 4.4.x, but 4.9 isn't any better).
>
> It's as slow as dead snails in melted molasses when using Samba.
> Everything else is fine:
>
> * from a windows PC with a 10GigE card, using ftp.exe and vsftpd, I
> transfer files at 500/600 MB/s easily.
> * using Chrome and downloading files through HTTP, I've got 250 MB/s.
>
> * using Samba, it reaches 105/110 MB/s, tops. Awful.
>
> The Windows client and the Linux server are both connected to the same
> 10GigE/40GigE switch. Transferring from a windows machine to another
> works fine (700 MB/s and more). Therefore the windows machines are NOT
> at fault.
>
> Looking at what's happening on the server, I noticed that smbd uses
> gobs of CPU. It uses about 1% of a core (from 'top') for each MB/s
> transferred, so when it reaches ~100 MB/s the core it's running on is
> maxed out! That's definitely NOT normal: on a very similar setup
> (same motherboard, CPU, RAM, RAID controller, OS, etc.), when an smbd
> process is writing at 500/600 MB/s its CPU consumption tops out at
> 47%!
>
> I don't know what's wrong and why smbd is burning CPU cycles like
> this.
>
> Here is a quick comparison I've done. First, the "bad" machine:
>
> root at storiq-111:~# pidstat -p 11694 2 20
> Linux 4.4.78-storiq64-opteron (storiq-111)  10/08/2017  _x86_64_  (32 CPU)
>
> 16:30:12     UID    PID   %usr  %system  %guest   %CPU  CPU  Command
> 16:30:14       0  11694   0,00     0,00    0,00   0,00    8  smbd
> 16:30:16   10500  11694  48,00     8,00    0,00  56,00    8  smbd
> 16:30:18   10500  11694  54,00    13,00    0,00  67,00    8  smbd
> 16:30:20   10500  11694  54,00    12,00    0,00  66,00    8  smbd
> 16:30:22   10500  11694  61,50    11,50    0,00  73,00    8  smbd
> 16:30:24   10500  11694  61,50    10,00    0,00  71,50    8  smbd
> 16:30:26   10500  11694  64,00    10,00    0,00  74,00    8  smbd
> 16:30:28   10500  11694  63,50    10,00    0,00  73,50    8  smbd
> 16:30:30   10500  11694  67,50    11,50    0,00  79,00    8  smbd
>
> root at storiq-111:~# numastat -p 11694
> Per-node process memory usage (in MBs) for PID 11694 (smbd)
>                          Node 0          Node 1           Total
>                 --------------- --------------- ---------------
> Huge                       0.00            0.00            0.00
> Heap                       0.28            0.28            0.56
> Stack                      0.02            0.02            0.04
> Private                   11.51           14.45           25.96
>                 --------------- --------------- ---------------
> Total                     11.80           14.76           26.56
>
> Notice that it burns tons of CPU in "user". By contrast, here's
> another (different and much slower) machine:
>
> root at storiq-313:~# pidstat -p 19654 2 20
> Linux 4.4.79-storiq64-opteron (storiq-313)  10/08/2017  _x86_64_  (16 CPU)
>
> 18:29:30    UID    PID  %usr  %system  %guest    %CPU  CPU  Command
> 18:29:32   1000  19654  5,50    75,50    0,00   81,00    2  smbd
> 18:29:34   1000  19654  6,50    82,50    0,00   89,00    0  smbd
> 18:29:36   1000  19654  6,50    89,00    0,00   95,50    0  smbd
> 18:29:38   1000  19654  5,50    92,00    0,00   97,50    4  smbd
> 18:29:40   1000  19654  6,50    90,50    0,00   97,00   10  smbd
> 18:29:42   1000  19654  6,00    94,00    0,00  100,00    0  smbd
> 18:29:44   1000  19654  7,00    90,50    0,00   97,50    0  smbd
> 18:29:46   1000  19654  7,50    87,00    0,00   94,50    0  smbd
> 18:29:48   1000  19654  6,00    92,00    0,00   98,00    0  smbd
> 18:29:50   1000  19654  7,00    91,00    0,00   98,00    0  smbd
> 18:29:52   1000  19654  6,00    89,00    0,00   95,00    0  smbd
>
>
> Per-node process memory usage (in MBs) for PID 19654 (smbd)
>                          Node 0          Node 2           Total
>                 --------------- --------------- ---------------
> Huge                       0.00            0.00            0.00
> Heap                       0.14            0.00            0.64
> Stack                      0.02            0.00            0.04
> Private                    4.61            0.00            8.57
>                 --------------- --------------- ---------------
> Total                      4.78            0.00            9.25
>
> The theoretically slower machine is actually 5x faster! That's not
> amusing... Also for some reason it uses much more memory, on both
> nodes, on the "bad" machine (but there are many more clients).
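The "5x faster" claim can be sanity-checked from the pidstat figures above. A minimal sketch, using the approximate numbers quoted in this thread (the throughput values are the rounded figures from the mail, not measurements):

```python
# Per-MB/s CPU cost, from the approximate figures quoted in this thread.
def cpu_per_mbps(cpu_percent: float, mb_per_s: float) -> float:
    """Percent of one core consumed per MB/s of transfer."""
    return cpu_percent / mb_per_s

bad = cpu_per_mbps(73.0, 100.0)   # "bad" machine: ~73% CPU at ~100 MB/s
good = cpu_per_mbps(47.0, 500.0)  # healthy setup: ~47% CPU at ~500 MB/s
print(f"bad: {bad:.2f} %/MBps, good: {good:.3f} %/MBps, "
      f"ratio: {bad / good:.1f}x")
# → bad: 0.73 %/MBps, good: 0.094 %/MBps, ratio: 7.8x
```

So the per-byte CPU cost on the "bad" box is roughly an order of magnitude higher, consistent with a per-packet computation (rather than I/O) dominating.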
>
> I tried running strace on the working smbd process, but I don't see
> anything remarkable in its output. No hardware errors in mcelog
> either. I'm out of ideas... What's going on?
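Since strace shows nothing unusual and the time is going to user space, a CPU profile of the busy smbd would reveal which functions are hot. A sketch, assuming `perf` is installed and reusing the PID from the pidstat run above (the commands need a live transfer in progress to show anything):

```shell
# Sketch: profile where the busy smbd burns its user CPU.
PID=11694

# Live view of the hottest functions in that process:
perf top -p "$PID"

# Or record ~10 seconds with call graphs, then inspect the report:
perf record -g -p "$PID" -- sleep 10
perf report --stdio | head -n 40
```

If the top entries are hashing or crypto routines rather than read/write paths, that points at per-packet signing or encryption as the bottleneck.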
>
> smb.conf in case anyone spots something fishy (it's actually split in
> 3):
>
> /etc/samba/smb.conf:
>
> netbios name = storiq-111
> server string = %h server (Samba, Debian)
> include = /etc/samba/smb-common.ad.conf
> include = /etc/samba/smb-shares.conf
>
> /etc/samba/smb-common.ad.conf:
>
> security = ADS
> workgroup = TEST
> realm = AD.TEST.COM
>
>
> winbind sealed pipes = false
> require strong key = false
> winbind sealed pipes:TEST = true
> require strong key:TEST = true
> winbind refresh tickets = yes
> winbind trusted domains only = no
> winbind use default domain = yes
> winbind enum users = yes
> winbind enum groups = yes
> winbind cache time = 7200
> winbind offline logon = yes
>
>
> idmap config *:backend = tdb
> idmap config *:range = 2000-9999
> idmap config TEST:backend = rid
> idmap config TEST:range = 10000-50000000
>
> winbind nss info = template
> template shell = /bin/bash
> template homedir = /mnt/raid/%u
>
> client use spnego = yes
> client ntlmv2 auth = yes
> encrypt passwords = yes
> restrict anonymous = 2
> server signing = mandatory
> ntlm auth = yes
>
> log level = 0
> log file = /var/log/samba/smbd.log
> max log size = 50
>
> vfs objects = acl_xattr
> map acl inherit = yes
> store dos attributes = yes
>
> /etc/samba/smb-shares.conf:
>
> [test_tr]
> comment = test_tr users
> valid users = @prod
> force group = prod
> force create mode = 775
> read only = no
> path = /mnt/raid/test_tr
> guest ok = no
> ; vfs objects = acl_xattr streams_xattr
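One line in smb-common.ad.conf worth a second look performance-wise is `server signing = mandatory`: with signing forced on, smbd computes a MAC over every SMB packet in user space, which shows up exactly as high %usr on a single maxed-out core. A sketch of the relaxation, assuming signing is not required by site policy (this is a suggestion, not the poster's configuration):

```
# /etc/samba/smb-common.ad.conf (sketch) -- only if signing is not a
# policy requirement; "auto" lets clients negotiate signing instead of
# forcing it on every connection.
server signing = auto
```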
>
>
>
4.2.x is EOL as far as Samba is concerned; there have been a lot of
changes since 4.2.x came out.
Can I suggest you go here: http://apt.van-belle.nl/
You can get a much more recent version there: 4.6.7.
Rowland