[Samba] Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

awl1 awl1 at mnet-online.de
Fri Jul 14 20:47:17 UTC 2017

Hello again, Jeremy,

OK, I am trying to make some progress with Wireshark and packet 
inspection/filtering ;-):


Using the most recent smb.conf with Samba 4.6.5 / SMB2, for 1020 files 
written, we have:

Procedure  Index  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)
Find          14   1588     0.001458     0.275006     0.052488

There are 2140 packets with smb2.cmd == 14 (Find): 1070 find requests 
((smb2.cmd == 14) && (smb2.flags.response == 0)) and 1070 find 
responses ((smb2.cmd == 14) && (smb2.flags.response == 1)).
(NOTE: why does this differ from the 1588 Find calls shown in the 
service response time statistics???)

518 of these 1070 Find requests have the following type:
Find Request File: RFP_files SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: 
*;Find Request File: RFP_files SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: *

and 552 are of type (note the value for "pattern" of course does change 
with each call):

Find Request File: RFP_files SMB2_FIND_NAME_INFO Pattern: RFP2372.htm
Info Level: SMB2_FIND_NAME_INFO (12)
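
One possible reading of the 1588 figure, sketched in Python below. It rests 
on an assumption that is not confirmed in this thread: that each of the 518 
wildcard packets is a compound packet carrying two Find requests, as the 
doubled "Find Request ...;Find Request ..." Info string suggests. The 
variable names are only for illustration.

```python
# Counts reported above for the Samba 4.6.5 / SMB2 trace.
find_request_packets = 1070
wildcard_packets = 518       # Info column lists two Find requests per packet
name_info_packets = 552      # one SMB2_FIND_NAME_INFO request per packet

# The two request types together account for every Find request packet.
assert wildcard_packets + name_info_packets == find_request_packets

# ASSUMPTION: each wildcard packet is a compound with two Find requests.
requests_per_wildcard_packet = 2

total_find_calls = (
    name_info_packets + wildcard_packets * requests_per_wildcard_packet
)
print(total_find_calls)
```

Under that assumption the total comes out at 1588, which is the number of 
Find calls in the service response time statistics.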

With the original smb.conf in the original Samba 3.5.16 write scenario, 
for 1024 files written, we have:

Transaction2 Sub-Commands
Procedure    Index  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
FIND_FIRST2      1   2042     0.001599     0.017518     0.003191     6.515678

There are 4084 packets with smb.trans2.cmd == 0x0001 (FIND_FIRST2): 
2042 find requests ((smb.trans2.cmd == 0x0001) && 
(smb.flags.response == 0)) and 2042 find responses 
((smb.trans2.cmd == 0x0001) && (smb.flags.response == 1)).
(NOTE: Here the packet count matches the 2042 Find calls shown in the 
service response time statistics...!?)

2039 of the 2042 FIND_FIRST2 requests are of type (value for "pattern" 
changes with each call)
Trans2 Request, FIND_FIRST2, Pattern: \napp\Header_RFP_files\RFP.bmp
Level of Interest: Find File Names Info (259)

only 3 of the 2042 FIND_FIRST2 requests are of type
Trans2 Request, FIND_FIRST2, Pattern: \*
Level of Interest: Find File Both Directory Info (260)
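
To put the two traces side by side, here is a small Python sketch that 
normalizes the counts per file written and compares the average service 
response times. All inputs are the figures quoted in this message; the 
dictionary keys are made up for illustration.

```python
# Figures copied from the two traces described above.
smb1 = {  # Samba 3.5.16, FIND_FIRST2
    "files": 1024, "find_calls": 2042,
    "name_lookups": 2039, "wildcard": 3, "avg_srt": 0.003191,
}
smb2 = {  # Samba 4.6.5, SMB2 Find
    "files": 1020, "find_calls": 1588,
    "name_lookups": 552, "wildcard": 518,  # 518 = wildcard request packets
    "avg_srt": 0.052488,
}

for label, t in (("SMB1", smb1), ("SMB2", smb2)):
    per_file_name = t["name_lookups"] / t["files"]
    per_file_wild = t["wildcard"] / t["files"]
    print(f"{label}: {per_file_name:.2f} name lookups per file, "
          f"{per_file_wild:.3f} wildcard finds per file")

# How much slower the average Find call is in the SMB2 trace.
srt_ratio = smb2["avg_srt"] / smb1["avg_srt"]
print(f"Average Find SRT ratio (SMB2 / SMB1): {srt_ratio:.1f}")
```

This only restates the numbers in a comparable form; it does not by itself 
show which request type is responsible for the slowdown.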

(3) Bottom line:

It looks like the SMB2 logic issues far too many 
FIND_ID_BOTH_DIRECTORY_INFO calls with the "*" search pattern. I would 
expect it to issue more FIND_NAME_INFO calls instead (probably at least 
one for each of the 1020 files written).
Do you agree with this assessment? Does this account for the difference 
in performance?

Many thanks one more time & best regards

On 14.07.2017 at 19:49, Jeremy Allison via samba wrote:
> On Fri, Jul 14, 2017 at 07:44:38PM +0200, awl1 wrote:
>> Hello Jeremy,
>> many thanks for getting back to me! :-)
>> On 14.07.2017 at 19:33, Jeremy Allison wrote:
>>> It would be quicker for you to help I'm afraid. As you have nicely
>>> identified the SMB2_QUERY_DIRECTORY as cause of the regression,
>>> can you look into the wireshark traces and tell me what info level
>>> the SMB1 client is asking for and what info level the SMB2 client
>>> is asking for ? If the SMB2 client is also asking for security
>>> descriptors, this may be part of it.
>> I will try to do my best in helping you, but I will need more
>> information, as I am not yet clear what exactly you want me to look
>> for in Wireshark:
>> How did you derive SMB2_QUERY_DIRECTORY from the information I
>> listed? For me, the main "Write" bottleneck pointed to SMB2 "Find",
>> how do I get to SMB2_QUERY_DIRECTORY from there?
>> Procedure  Index  Calls  Min SRT (s)  Max SRT (s)  Avg SRT (s)  Sum SRT (s)
>> Find          14   1607     0.001383     0.746684     0.193413   310.814294
>> So in order to check I should open both traces and compare exactly
>> what information? It would be great if you can describe what I have
>> to do in the Wireshark application in as much detail as possible...
> First try fixing the smb.conf in the way I reviewed. Then
> let's look.
