[Samba] Friendly Reminder: Would you please comment on my findings?

awl1 awl1 at mnet-online.de
Fri Aug 18 20:54:29 UTC 2017


Hello Andrew,

many thanks for joining this discussion! :-)

On 18.08.2017 at 21:46, Andrew Bartlett wrote:
> I do realise you are in between a rock and a hard place.  You have
> identified an interesting issue, triggered by a massive protocol change
> (so not able to be bisected down to a regression) that requires
> significant work to understand and may or may not be possible to resolve.
Note that I have tracked down the issue to what I believe to be the root 
cause, and the root cause is NOT an issue in Samba, but an issue with 
Microsoft's SMB2/SMB3 client that uses completely inefficient 
SMB2_FIND_ID_BOTH_DIRECTORY_INFO requests in SMB2/3 as opposed to 
efficient FIND_FIRST2 requests in SMB1:

The main parts of my analysis of the issue are contained here:

https://lists.samba.org/archive/samba/2017-July/209749.html
https://lists.samba.org/archive/samba/2017-July/209750.html
https://lists.samba.org/archive/samba/2017-July/209751.html

Just citing key findings for your reference:

In SMB1, the Windows client executes one FIND_FIRST2 Request for each 
file to be copied (i.e. in my scenario, ~ 1000 requests) returning 
STATUS_NO_SUCH_FILE every time before actually creating/writing to the file.

Looking at the same file with SMB2 (dialect 3.1.1), the Windows client
issues a different Find operation (SMB2_FIND_ID_BOTH_DIRECTORY_INFO with
Pattern "*"). It does not look for the particular file name that is
about to be written, but apparently tries to list the whole content of
the current directory. Note that, looping through the 2000 files to be
written in my scenario, the length of Samba's Find Response increases
with every file successfully copied: when copying file number 1000, the
Find Response sends back a list of all 999 files that have already been
copied to this directory. This list of 999 file names serves no
meaningful purpose, as the only goal is to check whether file number
1000 already exists among those 999 files (which of course it never
does!). The last such SMB2_FIND_ID_BOTH_DIRECTORY_INFO call contained in
the traces has a response length of about 64kB (file names that have
already been written to the target directory but are not needed/helpful
in any way) and interestingly does not return "STATUS_NO_MORE_FILES",
but "STATUS_INFO_LENGTH_MISMATCH", maybe because the buffer size for the
result of the pattern lookup is only 64kB!?
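
To illustrate why the wildcard lookup scales so badly, here is a rough
model in Python. It is only a sketch under simplifying assumptions, not
something taken from the traces: it assumes exactly one Find request per
file, ignores the 64kB response limit mentioned above, and the function
names are made up for this illustration.

```python
def name_lookup_entries(num_files: int) -> int:
    """SMB1-style model: one lookup per file for a name that does not
    exist yet, so every response carries zero directory entries."""
    return sum(0 for _ in range(num_files))


def wildcard_listing_entries(num_files: int) -> int:
    """SMB2-style model: before file k is written, a "*" listing returns
    the k - 1 files that are already in the target directory."""
    return sum(k - 1 for k in range(1, num_files + 1))


if __name__ == "__main__":
    n = 1000  # roughly the number of files in the scenario
    print("entries returned, per-name lookups:", name_lookup_entries(n))
    print("entries returned, wildcard listings:", wildcard_listing_entries(n))
```

In this model the per-name lookups never return any entries, while the
wildcard listings return n * (n - 1) / 2 entries in total, i.e. the cost
grows quadratically with the number of files copied.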

Looking at the exact same scenario from a Linux mount.cifs
vers=3.0 client reveals only four (!!!) SMB2 Find requests for the whole
scenario, where Windows Explorer sends no less than 2140 SMB2 Find 
requests to copy ~ 1000 files to the share (1036 times 
"SMB2_FIND_ID_BOTH_DIRECTORY_INFO, Pattern: *" plus 1104 times 
SMB2_FIND_NAME_INFO Pattern: <file name>), and Windows command line 
"xcopy" is even worse (3741 find requests in order to copy ~ 1000 files).
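
The request counts above can be put side by side with a few lines of
Python. This is just a tally of the numbers quoted in this mail; the
variable names are mine, and the file count is the approximate "~ 1000"
from the scenario, not an exact figure.

```python
# Find request counts as quoted above (from the network traces).
explorer_finds = {
    "SMB2_FIND_ID_BOTH_DIRECTORY_INFO, Pattern: *": 1036,
    "SMB2_FIND_NAME_INFO, Pattern: <file name>": 1104,
}
xcopy_finds = 3741
linux_cifs_finds = 4
files_copied = 1000  # assumption: approximate value ("~ 1000 files")

explorer_total = sum(explorer_finds.values())

print("Explorer Find requests in total:", explorer_total)
print("Explorer Find requests per file:", explorer_total / files_copied)
print("xcopy Find requests per file:", xcopy_finds / files_copied)
print("Explorer vs. Linux client:", explorer_total / linux_cifs_finds)
print("xcopy vs. Linux client:", xcopy_finds / linux_cifs_finds)
```

So Windows Explorer sends a bit more than two Find requests per file,
and xcopy close to four, while the Linux client needs four Find requests
for the entire copy.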

While even the Linux SMB2 client is still slower than the Windows SMB1 
client, I tend to think that the remaining difference from 25 seconds 
with SMB 1.5 in Win10 to 36 seconds with SMB2 3.0 in Linux (44%) might 
be tolerable...
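
For reference, the 44% figure is simply the relative slowdown between
the two timings quoted above; a minimal check in Python (variable names
are mine):

```python
smb1_win10_seconds = 25   # Windows 10 client, SMB1
smb3_linux_seconds = 36   # Linux mount.cifs client, vers=3.0

# Relative slowdown of the Linux SMB3 run compared to the Windows SMB1 run.
slowdown_percent = (smb3_linux_seconds - smb1_win10_seconds) / smb1_win10_seconds * 100
print(f"slowdown: {slowdown_percent:.0f}%")
```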

So IMHO I have already uncovered that it is the implementation of the 
Windows SMB2/SMB3 client that is faulty, and what I'd ask the Samba team 
is to

a) verify that my assessment is correct and
b) help raise this huge performance regression with Microsoft
(because this will definitely end up nowhere if I try to raise
it with MS as a private individual customer based on a single Win10
Pro license)...

> Have you tried to engage with Thecus on this?  I know it seems odd, and
> getting to speak with an engineer who actually understands what you are
> trying to warn them might be very difficult, but they will be upgrading
> at some point and then it is a regression to them, and they may have
> the incentive to look into it.  It seems like a long shot, but similar
> long shots include getting the attention of another NAS Vendor already
Thecus will reject any request on this, as my NAS (a 2008
N4200PRO) is an EOL product. I have compiled my own version of Samba
4.6.5 and deployed it onto my NAS as an installable module, replacing
the default Samba 3.5.16.

> using Samba 4.x, like NETGEAR, or as an enterprise linux customer?
As the performance regression bug is in the Windows client, even using a 
very recent NAS with Samba 4.x will most definitely show the exact same 
behaviour.

> Does this just happen on your NAS, or can you reproduce on stock Samba
> locally on a PC?  Are you sure it always happened with SMB2?  If you
> can find any SMB2-supporting release (early support was in 3.5 I think,
> and 3.6 had it off by default) that is not slow then bisect your way
> between that and master, it might uncover a regression (for example,
> due to our symlinks security fix).
As I have tracked it down to be a Windows SMB2 client-side issue, this 
will most definitely show with every Windows SMB2 client and any Samba 
server that speaks SMB2 or higher (i.e. versions 3.6 onwards).

I had already tried to "bisect" this very early in the process and 
analyze other Samba versions, as laid out here:

https://lists.samba.org/archive/samba/2017-July/209731.html
http://home.mnet-online.de/awl1/Performance%20Regression.xls
http://home.mnet-online.de/awl1/Performance%20Regression.pdf

The results for different Samba server versions were consistent, but
only afterwards (i.e. after my bisect attempts) did it become apparent
that the Windows client is to blame, and only if the protocol is SMB2...

> I hope this helps,
It would be most helpful if somebody from the Samba team (whether
Jeremy, you or somebody else) could spend some time trying to
understand/reproduce/assess my analysis. If you agree with my findings,
the "real" work afterwards will be to raise the issue with Microsoft
and make them aware of the detrimental effects that their client-side
implementation of SMB2 has with regard to performance, compared to
the exact same scenario in SMB1.

As stated before, I am convinced that I am not the only one
affected by this issue. The sad truth rather seems to be that everybody
else besides me has silently accepted the poor performance in
a "huge number of small files" scenario, even though it only became this
poor with SMB2 and was perfectly fine before with SMB1... :-(:-(:-(

Did I succeed in making myself clear enough? (Not that easy for a 
non-native English speaker, as the issue is rather complex...)

Many thanks for considering my request for help with this & best regards
Andreas
