[Samba] io_uring cause data corruption

A L mail at lechevalier.se
Fri May 1 12:59:34 UTC 2020


On 2020-04-30 22:56, Jeremy Allison via samba wrote:
> On Thu, Apr 30, 2020 at 10:25:49AM +0200, A L wrote:
>
>> So I did some more tests. smbclient mget does not copy in the same way
>> Windows Explorer does. When copying in Windows Explorer, many concurrent
>> threads are used to transfer the files. With smbclient mget there are no
>> corruptions, both locally and over the network from another Linux
>> machine.
>>
>> I analysed the difference between a correct file and a corrupt file.
>> At position 0x7A0000 the corrupt file started to contain only binary zero.
>> At position 0x800000 the zeroes ended and correct data continued. To me it
>> sounds like some wrong memory is copied somehow.
>>
>> These two files show the difference as seen in a hex editor.
>> https://paste.tnonline.net/files/MO1FJvDOG6E8_smb_1
>> https://paste.tnonline.net/files/Rglite4KWmU8_smb_2
> Is it always the same area in the file that is corrupt ?
> The fact that it's on a 4K page-aligned boundary is
> interesting. If you can correlate I'd love to see
> the SMB2 traffic on the wire that corresponds to the
> corrupted data write/read.
>
Hi again,

It is not always the same area. The "blank" areas seem to be early in
the files, but not at the same offsets.
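
For anyone who wants to locate such blank areas without a hex editor, here is a minimal sketch of how a known-good file could be compared with a corrupt copy. It is not the tool I used; the function name diff_ranges and the 4 KiB block size are my own assumptions, and the reported offsets are rounded to block boundaries.

```python
import sys


def diff_ranges(good_path, bad_path, block=4096):
    """Yield (start, end, zero_filled) for each run of differing blocks.

    Offsets are byte positions rounded to `block`. zero_filled is True only
    if every differing block of the run is entirely 0x00 in the bad file.
    """
    start = None   # start offset of the run currently being extended
    zero = True    # whether the current run is zero-filled so far
    offset = 0     # file offset of the block being compared
    with open(good_path, "rb") as g, open(bad_path, "rb") as b:
        while True:
            gb, bb = g.read(block), b.read(block)
            if not gb and not bb:
                break
            if gb != bb:
                if start is None:
                    start, zero = offset, True
                # A short or non-zero block in the bad file is not "blank".
                if bb != bytes(len(gb)):
                    zero = False
            elif start is not None:
                yield start, offset, zero
                start = None
            offset += max(len(gb), len(bb))
    if start is not None:
        yield start, offset, zero


if __name__ == "__main__" and len(sys.argv) == 3:
    for s, e, z in diff_ranges(sys.argv[1], sys.argv[2]):
        kind = "zero-filled" if z else "other data"
        print(f"0x{s:X}-0x{e:X} ({e - s} bytes, {kind})")
```

Run with the correct file as the first argument and the corrupt copy as the second; each output line is one damaged range.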

I set up the following two shares:

###### smb.conf ######
[global]
     log level = 1
     workgroup = WORKGROUP
     netbios name = SAMBA
     server string = Samba Server
     server role = standalone server
     hosts allow = 192.168.0. 127.
     interfaces = lan
     max protocol = SMB3_11
#   max protocol = SMB2 # Windows 10 clients do not want to connect to SMB2

     log file = /var/log/samba/%I.log
     max log size = 10240

     security = user
     passdb backend = tdbsam
#   username map = /etc/samba/users.map
     wins support = yes
     dns proxy = yes

[share_io_uring]
     comment = USB Backup - Media files
     path = /media/usb-backup
     writeable = no
     browseable = yes
     read only = yes
     create mask = 0664
     directory mask = 0775
     guest only = Yes
     guest ok = Yes
     force user = nasuser
     force group = nas
     inherit owner = Yes
     vfs objects = btrfs, io_uring

[share_no_io_uring]
     comment = USB Backup - Media files
     path = /media/usb-backup
     writeable = no
     browseable = yes
     read only = yes
     create mask = 0664
     directory mask = 0775
     guest only = Yes
     guest ok = Yes
     force user = nasuser
     force group = nas
     inherit owner = Yes
     vfs objects = btrfs
############

The test files are in a read-only folder, /media/usb-backup/test2-ro.
The same folder is shared twice: once with io_uring enabled and once
with it disabled (as per smb.conf above):

\\SAMBA\share_io_uring\test2-ro
\\SAMBA\share_no_io_uring\test2-ro

I copied the test directory from each share to a local drive on two
separate Windows 10 clients using Windows File Explorer over a 1 Gbit
network.

All files copied from \\SAMBA\share_no_io_uring\test2-ro were 100%
correct according to sha256sum.
More or less all files copied from \\SAMBA\share_io_uring\test2-ro were
corrupted. The corruptions start at varying offsets and seem to have
varying lengths.
https://paste.tnonline.net/files/HbaYcVSePiK7_data_compare_io_uring_copy1.png
https://paste.tnonline.net/files/xJ2ClkRAGEH0_data_compare_io_uring_copy2.png

https://paste.tnonline.net/files/FWun1DBgri4i_data_compare_io_uring_copy_3.png
https://paste.tnonline.net/files/MQsBwavh6Bkb_data_compare_io_uring_copy_4.png
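
The checksum comparison can also be scripted. The following is only a sketch of the idea, not what I actually ran (I used sha256sum); the helper names sha256_file and compare_trees are made up for illustration, and it assumes the original directory and the copy are both reachable as local paths.

```python
import hashlib
import os


def sha256_file(path, chunk=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for data in iter(lambda: f.read(chunk), b""):
            h.update(data)
    return h.hexdigest()


def compare_trees(reference, copy):
    """Return sorted relative paths that are missing or differ in `copy`."""
    bad = []
    for root, _dirs, files in os.walk(reference):
        for name in files:
            ref = os.path.join(root, name)
            rel = os.path.relpath(ref, reference)
            other = os.path.join(copy, rel)
            if not os.path.isfile(other) or sha256_file(ref) != sha256_file(other):
                bad.append(rel)
    return sorted(bad)
```

Calling compare_trees with the source folder and one of the copied folders returns the list of files that did not survive the copy intact; an empty list means every file matched.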

Not saying there are no issues with Windows, but the problem seems to
be very repeatable in my home setup. Removing io_uring from "vfs objects"
makes everything 100% OK in all tests.


A note regarding threading. I do not know how Windows is requesting the 
files, but looking in iotop, it seems at least several smb threads are 
used serving the requests:

This is "iotop -o" when copying from the share_no_io_uring share:
https://paste.tnonline.net/files/G1impe8EHhUU_no_io_uring_copy.png

###### NO IO_URING ######
# iotop -o
Total DISK READ :      86.32 M/s | Total DISK WRITE : 29.76 K/s
Actual DISK READ:      88.13 M/s | Actual DISK WRITE: 22.32 K/s
   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN IO>    COMMAND
29898 be/4 nasuser    10.06 M/s    0.00 B/s  0.00 % 25.41 % smbd -D
29894 be/4 nasuser     9.53 M/s    0.00 B/s  0.00 % 25.12 % smbd -D
29893 be/4 nasuser    10.58 M/s    0.00 B/s  0.00 % 22.86 % smbd -D
29892 be/4 nasuser    13.40 M/s    0.00 B/s  0.00 % 19.01 % smbd -D
29891 be/4 nasuser    10.73 M/s    0.00 B/s  0.00 % 18.57 % smbd -D
29895 be/4 nasuser     9.44 M/s    0.00 B/s  0.00 % 18.36 % smbd -D
29897 be/4 nasuser    11.11 M/s    0.00 B/s  0.00 % 17.86 % smbd -D
29896 be/4 nasuser    11.48 M/s    0.00 B/s  0.00 % 17.08 % smbd -D
29816 be/4 root        0.00 B/s   29.76 K/s  0.00 %  0.00 % [kworker/u64:8-btrfs-endio]
############


This is "iotop -o" when copying from the share_io_uring share:
https://paste.tnonline.net/files/HCv5TZtNDd7h_with_io_uring_copy.png

###### WITH IO_URING ######
# iotop -o
Total DISK READ :      77.47 M/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ:      77.61 M/s | Actual DISK WRITE: 0.00 B/s
   TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN IO>    COMMAND
29984 be/4 root        8.36 M/s    0.00 B/s  0.00 % 20.21 % [io_wqe_worker-1]
29986 be/4 root        9.55 M/s    0.00 B/s  0.00 % 19.44 % [io_wqe_worker-1]
29983 be/4 root        4.37 M/s    0.00 B/s  0.00 % 18.47 % [io_wqe_worker-0]
29981 be/4 root        9.26 M/s    0.00 B/s  0.00 % 17.94 % [io_wqe_worker-0]
29980 be/4 root        5.53 M/s    0.00 B/s  0.00 % 17.85 % [io_wqe_worker-0]
29982 be/4 root        5.12 M/s    0.00 B/s  0.00 % 17.46 % [io_wqe_worker-1]
29987 be/4 root       11.75 M/s    0.00 B/s  0.00 % 16.38 % [io_wqe_worker-0]
29985 be/4 root        6.53 M/s    0.00 B/s  0.00 % 15.81 % [io_wqe_worker-0]
29820 be/4 nasuser    17.00 M/s    0.00 B/s  0.00 %  4.18 % smbd -D
############

I am not sure how to capture a dump of the SMB (SMB3) traffic. I could use
tcpdump on linux or Wireshark on Windows. Could you provide a recipe for 
me to use? I'd be happy to help out more.

I'll upload all test files for you to check and analyze the differences 
between faulty and correct files.
https://mirrors.tnonline.net/samba/

Regards,
Anders


