[Samba] Third Try: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf
awl1
awl1 at mnet-online.de
Fri Jul 14 15:37:11 UTC 2017
Hello again, Jeremy, hello again, Samba experts/developers,
as "all good things come in threes" and "third time is a charm",
following kind advice from Björn Jacke, I am indeed trying again on this
list to arouse your interest one more time, giving an even shorter
summary of the issue - and having tested a number of older Samba
versions between 3.5.x and 4.6.x to pinpoint exactly when the issue
started...
As I am 99.99% confident that this is not a configuration issue on my
side, I would really appreciate it if somebody from the Samba team would be
interested in tracking down why - for the specific scenario with a huge
number of small files - performance is (so) much worse with Samba
4.x/SMB2 than it used to be with Samba 3.x/SMB1.
(Please note that, for a small number of larger or even huge files, as
expected, I can also confirm from my observations that Samba 4.x/SMB2 is
typically faster than Samba 3.x/SMB1, sometimes even considerably, so
the issue is NOT with Samba 4.x/SMB2 in general, but seems to be specific
to the scenario of a huge number of small files.)
Summary:
* Win10 client using TotalCommander 9.0a to copy files
* Copying files from/to a Samba share running on my Home Office
  Thecus NAS
* Thecus N4200pro NAS (Intel(R) Atom(TM) CPU D525, 2 cores/4 HT
  threads @ 1.80GHz, Linux kernel 2.6.33, 3 GB RAM) and either
  Thecus original Samba 3.5.16 or several self-compiled
  (using gcc-5.2) Samba versions:
    - Samba 4.6.5, SMB2 dialect 3.1.1
    - Samba 4.2.14, SMB2 dialect 3.0
    - Samba 4.0.26, SMB2 dialect 3.0
    - Samba 3.6.25, SMB2 dialect 2.0.2 (single line
      "min protocol = SMB2" added to smb.conf)
    - Samba 3.6.25, SMB1 dialect 1.5
    - Thecus original Samba 3.5.16, SMB1 dialect 1.5
* Exact same hardware, network, complete software stack for all
  cases (except varying Samba version on Thecus NAS)
* Exact same smb.conf for all versions (see attached), apart from the
  single added line noted above for Samba 3.6.25 with SMB2
* Definitely no other load on/access to the NAS during my testing
* Recorded Wireshark captures in pcapng format for both Write/Read
  scenarios in all above Samba versions
* Comparing performance below by the grand total of the Wireshark
  "Service Response Time Statistics" (SRT), in seconds, for each
  capture
A) "Write" Scenario:
Write ~ 1000 Small Files (between <1kB and ~ 20kB) to Samba share on
Thecus NAS, copying from a directory of ~ 5000 files stored on Win10
local NTFS
Samba version   SMB/SMB2 dialect   Total SRT (sec)
3.5.16          1.5                 25
3.6.25          1.5                 21
3.6.25          2.0.2              341 (!!!)
4.0.26          3.0                387 (!!!)
4.2.14          3.0                355 (!!!)
4.6.5           3.1.1              346 (!!!)
B) "Read" Scenario:
Read ~ 2000 Small Files (between <1kB and ~ 20kB) from a directory of ~
5000 files from Samba share on Thecus NAS, copy to local NTFS on Win10
Samba version   SMB/SMB2 dialect   Total SRT (sec)
3.5.16          1.5                101
3.6.25          1.5                100
3.6.25          2.0.2              139 (!)
4.0.26          3.0                152 (!)
4.2.14          3.0                140 (!)
4.6.5           3.1.1              144 (!)
(Note that the read scenario spends most of the time - even in 3.x/SMB
1.5 - determining the whole number of ~ 5000 files in this directory,
before Total Commander even starts copying the ~ 2000 files.)
Summary of findings:
* For both the Write and the Read scenario with a huge number of small
  files, performance with SMB2 (dialects 2.0.2/3.0/3.1.1) in all Samba
  versions from 3.6 up to the most recent 4.6 is (much) worse than
  performance with SMB1 (dialect 1.5) in Samba 3.6 and before.
* In the Read scenario, performance is "only" about 40% worse (which
  might at least partly be explained by additional complexity in
  SMB2). In the Write scenario, however, the total SRT is about
  *fourteen times* (1400%) that of SMB1, a finding which definitely
  cannot be explained as "working as designed".
* While SMB/1.5 performance is still fine in the latest 3.6.25, *all
  SMB2-capable releases of Samba from the very first SMB2/2.x
  implementation in Samba 3.6 onwards* seem to be affected by the
  performance regression.
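
For anyone who wants to re-check the factors quoted in the findings, here is a minimal Python sketch that divides each total SRT by the Thecus 3.5.16/SMB1 baseline. The numbers are copied from the two tables in this post; the names WRITE_SRT, READ_SRT, BASELINE and slowdown are only illustrative and do not come from any Samba or Wireshark tooling.

```python
# Total SRT in seconds, keyed by (Samba version, dialect); values are
# copied from the Write and Read tables in this post.
WRITE_SRT = {
    ("3.5.16", "1.5"): 25,
    ("3.6.25", "1.5"): 21,
    ("3.6.25", "2.0.2"): 341,
    ("4.0.26", "3.0"): 387,
    ("4.2.14", "3.0"): 355,
    ("4.6.5", "3.1.1"): 346,
}
READ_SRT = {
    ("3.5.16", "1.5"): 101,
    ("3.6.25", "1.5"): 100,
    ("3.6.25", "2.0.2"): 139,
    ("4.0.26", "3.0"): 152,
    ("4.2.14", "3.0"): 140,
    ("4.6.5", "3.1.1"): 144,
}

# Baseline chosen for illustration: Thecus original Samba 3.5.16 with SMB1.
BASELINE = ("3.5.16", "1.5")


def slowdown(results, baseline=BASELINE):
    """Return each configuration's total SRT divided by the baseline's."""
    base = results[baseline]
    return {key: srt / base for key, srt in results.items()}


if __name__ == "__main__":
    for name, results in (("Write", WRITE_SRT), ("Read", READ_SRT)):
        print(name)
        for (version, dialect), factor in slowdown(results).items():
            print(f"  {version:<8} {dialect:<6} {factor:5.1f}x")
```

With these numbers, the 4.6.5 Write case comes out at roughly 13.8 times the 3.5.16 baseline and the 4.6.5 Read case at roughly 1.4 times.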
As it seems prohibited to attach Excel or PDF documents when posting to
this list, I am providing my (anonymized) smb.conf (global section and
particular share definition) as well as an Excel sheet and a PDF with
the detailed Wireshark Service Response Time Statistics for Write and
Read scenario over here:
http://home.mnet-online.de/awl1/smb.conf
http://home.mnet-online.de/awl1/Performance%20Regression.xls
http://home.mnet-online.de/awl1/Performance%20Regression.pdf
On 13.06.2017 at 18:36, Jeremy Allison wrote:
> Can you get comparative wireshark traces for the two cases ?
> That would help discover what the bottleneck is.
As requested by Jeremy, the Wireshark "pcapng" packet traces/recordings
are available for all Samba versions mentioned above in both Read and
Write scenario. Unfortunately, these recordings do indeed contain
confidential data both from my machine and the share, so please get back
to me directly and request access: I will then send you a download link
and password to the capture files ZIP via private mail.
I also hereby promise that I will do everything I can in order to
support your analysis, including running follow-up tests on my
platform/scenario, digging deeper into packet traces or even doing source
code investigations based on your instructions.
I truly hope we will be able to improve general Samba 4.x / SMB2
performance for the "huge number of small files" scenario as a result of
this exercise...
Many thanks one more time for your kind help with this!
Best regards,
Andreas