[Samba] Friendly Reminder: Huge number of small files performance regression from 3.5.16 to 4.6.5 with identical smb.conf

awl1 awl1 at mnet-online.de
Sat Jul 15 17:13:12 UTC 2017


Hello Jeremy,

Am 14.07.2017 um 23:33 schrieb Jeremy Allison via samba:
> Look into it some more (I'm a bit busy with other stuff right now).
no worries, the issue really is not that time critical - I would only 
like to see it resolved "sooner or later" in order to be able to update 
my NAS to a recent Samba version (e.g. to be safe against SambaCry) 
without the performance setback for the huge number of files scenario - 
which right now still makes me return to Samba 3.5 when I need the "huge 
number of small files" to be written with proper speed...


I have tried to determine some more information in the meantime.

While I think I might have stumbled over some issue that might cause at 
least part of the performance regression, I am not completely sure, as I 
am missing any information about the design concepts: How should the 
process of writing a file to the Samba share take place properly - i.e. 
which sequence of commands/message types (Find with Find subtypes, 
Create, Write, Close, ...) would be expected vs. which sequence is 
indeed seen.

 From what I am able to determine, the command sequence in SMB 1.5 for 
the Write scenario with good performance looks like the following:

93    6.460072    192.168.1.121    192.168.1.221    SMB    208 Trans2 
Request, FIND_FIRST2, Pattern: \napp\Header_Post_files\arches.png
94    6.466082    192.168.1.221    192.168.1.121    SMB    93 Trans2 
Response, FIND_FIRST2, Error: STATUS_NO_SUCH_FILE
109    6.474144    192.168.1.121    192.168.1.221    SMB    212 NT 
Create AndX Request, FID: 0xb6ba, Path: \napp\Header_Post_files\arches.png
110    6.482842    192.168.1.221    192.168.1.121    SMB    193 NT 
Create AndX Response, FID: 0xb6ba
111    6.482978    192.168.1.121    192.168.1.221    SMB    130 Trans2 
Request, QUERY_FILE_INFO, FID: 0xb6ba, Query File Internal Info
112    6.483318    192.168.1.221    192.168.1.121    SMB    126 Trans2 
Response, FID: 0xb6ba, QUERY_FILE_INFO
115    6.484004    192.168.1.121    192.168.1.221    SMB    130 Trans2 
Request, QUERY_FILE_INFO, FID: 0xb6ba, Query File Basic Info
116    6.484290    192.168.1.221    192.168.1.121    SMB    158 Trans2 
Response, FID: 0xb6ba, QUERY_FILE_INFO
117    6.484473    192.168.1.121    192.168.1.221    SMB    142 Trans2 
Request, SET_FILE_INFO, FID: 0xb6ba
118    6.486985    192.168.1.221    192.168.1.121    SMB    116 Trans2 
Response, FID: 0xb6ba, SET_FILE_INFO
125    6.487335    192.168.1.121    192.168.1.221    SMB    69 Write 
AndX Request, FID: 0xb6ba, 8707 bytes at offset 0
129    6.489836    192.168.1.221    192.168.1.121    SMB    105 Write 
AndX Response, FID: 0xb6ba, 8707 bytes
129    6.489836    192.168.1.221    192.168.1.121    SMB    105 Write 
AndX Response, FID: 0xb6ba, 8707 bytes
130    6.489999    192.168.1.121    192.168.1.221    SMB    174 Trans2 
Request, SET_FILE_INFO, FID: 0xb6ba
131    6.491616    192.168.1.221    192.168.1.121    SMB    116 Trans2 
Response, FID: 0xb6ba, SET_FILE_INFO
and about ten seconds later
13356    17.184267    192.168.1.121    192.168.1.221    SMB    99 Close 
Request, FID: 0xb6ba
13359    17.188793    192.168.1.221    192.168.1.121    SMB    93 Close 
Response, FID: 0xb6ba

So it seems to be "normal" that Samba executes a Find First Request 
returning STATUS_NO_SUCH_FILE every time before actually 
creating/writing to the file even in the "good" case.

Now when looking at the same file in the SMB2 3.1.1 Write scenario, we have

37    0.113403    192.168.1.121    192.168.1.221    SMB2    260 Find 
Request File: napp\Header_Post_files SMB2_FIND_ID_BOTH_DIRECTORY_INFO 
Pattern: *;Find Request File: napp\Header_Post_files 
SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: *
38    0.115086    192.168.1.221    192.168.1.121    SMB2    434 Find 
Response;Find Response, Error: STATUS_NO_MORE_FILES
49    0.125778    192.168.1.121    192.168.1.221    SMB2    486 Create 
Request File: nappPS RFP_files\Header_Post_files\arches.png
50    0.139180    192.168.1.221    192.168.1.121    SMB2    374 Create 
Response File: nappPS RFP_files\Header_Post_files\arches.png
51    0.139685    192.168.1.121    192.168.1.221    SMB2    275 GetInfo 
Request FS_INFO/FileFsVolumeInformation File: nappPS 
RFP_files\Header_Post_files\arches.png;GetInfo Request 
FS_INFO/FileFsAttributeInformation File: nappPS 
RFP_files\Header_Post_files\arches.png
52    0.141013    192.168.1.221    192.168.1.121    SMB2    258 GetInfo 
Response;GetInfo Response
53    0.141292    192.168.1.121    192.168.1.221    SMB2    162 SetInfo 
Request FILE_INFO/SMB2_FILE_ENDOFFILE_INFO File: nappPS 
RFP_files\Header_Post_files\arches.png
54    0.142555    192.168.1.221    192.168.1.121    SMB2    124 SetInfo 
Response
61    0.142938    192.168.1.121    192.168.1.221    SMB2    117 Write 
Request Len:8707 Off:0 File: nappPS RFP_files\Header_Post_files\arches.png
65    0.144311    192.168.1.221    192.168.1.121    SMB2    138 Write 
Response
66    0.144500    192.168.1.121    192.168.1.221    SMB2    194 SetInfo 
Request FILE_INFO/SMB2_FILE_BASIC_INFO File: nappPS 
RFP_files\Header_Post_files\arches.png
67    0.145521    192.168.1.221    192.168.1.121    SMB2    124 SetInfo 
Response
and about ten seconds later
5653    10.808959    192.168.1.121    192.168.1.221    SMB2 146    Close 
Request File: nappPS RFP_files\Header_Post_files\arches.png
5676    10.833568    192.168.1.221    192.168.1.121    SMB2 182    Close 
Response

In this case, the Find operation does not look for the particular file 
name that is about to be written, but seems to try and list the whole 
current directory's content with a pattern of "*". Note that, looping 
through the 2000 files to be written, the length of the Find Response 
above increases (significantly!) with every file successfully copied: 
When copying file number 1000, the Find Response sends back a list of 
all 999 files that have been successfully copied to this directory 
before, and this list of 999 file names is not needed for any meaningful 
purpose, as the goal seems to be to check whether file number 1000 
already exists in this list of 999 files (which it of course never 
does!) or not.

The last call to SMB2_FIND_ID_BOTH_DIRECTORY_INFO that is contained in 
the traces has a response length of about 64kB (containing filenames 
that have already been written to the target directory but are not 
needed/helpful in any way) and interestingly does not return 
"STATUS_NO_MORE_FILES", but "STATUS_INFO_LENGTH_MISMATCH", maybe because 
the buffer size for the result of the pattern lookup is only 64kB!?

Base line: I assume that this sequence must be optimized in terms of 
always only calling the find operation subtype, like in SMB1 above, that 
must not use a wildcard "*", but looks for the the exact name of the 
file that is about to be written.

So far, so good - but there also is - for unknown reasons - also a more 
lengthy scenario for some (552 of the 1020) files. I have not been able 
to determine why for these files, there also is an additional call to 
SMB2_FIND_NAME_INFO with the expected return code STATUS_NO_SUCH_FILE as 
we have seen in the SMB 1.5 scenario above. Anyways, before this call, 
we here also still have the unnecessary call to 
SMB2_FIND_ID_BOTH_DIRECTORY_INFO with Pattern "*" as above.

The following sample is for file "napp1408.htm" some 10 seconds later - 
which is the first file to show this new pattern. Before that time, 
there is no single call to SMB2_FIND_NAME_INFO (why does this new 
behavior only start at/after 10 seconds?):

5703    10.858710    192.168.1.121    192.168.1.221    SMB2 260    Find 
Request File: nappPS RFP_files SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: 
*;Find Request File: nappPS RFP_files SMB2_FIND_ID_BOTH_DIRECTORY_INFO 
Pattern: *
5725    10.917195    192.168.1.221    192.168.1.121    SMB2 1034    Find 
Response;Find Response, Error: STATUS_NO_MORE_FILES [!!! This response 
grows in size with every iteration!!!]
5727    10.917420    192.168.1.121    192.168.1.221    SMB2 178    Find 
Request File: nappPS RFP_files SMB2_FIND_NAME_INFO Pattern: napp1408.htm 
[!!! this seems to be more or less equivalent than what is done in SMB 
1.5 !!!]
5730    10.921782    192.168.1.221    192.168.1.121    SMB2 131    Find 
Response, Error: STATUS_NO_SUCH_FILE
5735    10.923888    192.168.1.121    192.168.1.221    SMB2 422    
Create Request File: nappPS RFP_files\napp1408.htm
5738    10.934808    192.168.1.221    192.168.1.121    SMB2 374    
Create Response File: nappPS RFP_files\napp1408.htm
5739    10.935151    192.168.1.121    192.168.1.221    SMB2 162    
SetInfo Request FILE_INFO/SMB2_FILE_ENDOFFILE_INFO File: nappPS 
RFP_files\napp1408.htm
5742    10.937324    192.168.1.221    192.168.1.121    SMB2 124    
SetInfo Response
5744    10.937512    192.168.1.121    192.168.1.221    SMB2 1475    
Write Request Len:2765 Off:0 File: nappPS RFP_files\napp1408.htm
5750    10.940218    192.168.1.221    192.168.1.121    SMB2 138    Write 
Response
5751    10.940343    192.168.1.121    192.168.1.221    SMB2 194    
SetInfo Request FILE_INFO/SMB2_FILE_BASIC_INFO File: nappPS 
RFP_files\napp1408.htm
5754    10.942392    192.168.1.221    192.168.1.121    SMB2 124    
SetInfo Response
10873    22.132423    192.168.1.121    192.168.1.221    SMB2 146    
Close Request File: nappPS RFP_files\napp1408.htm
10876    22.137120    192.168.1.221    192.168.1.121    SMB2 182    
Close Response


So what do you think? Looking forward to your assessment... ;-)

Thanks again & best regards
Andreas




More information about the samba mailing list