[Samba] io_uring cause data corruption
A L
mail at lechevalier.se
Mon Apr 27 21:21:35 UTC 2020
On 2020-04-27 18:45, Jeremy Allison via samba wrote:
> On Mon, Apr 27, 2020 at 10:27:17AM +0200, A L via samba wrote:
>> On 2020-04-26 19:46, Jeremy Allison via samba wrote:
>>> On Sun, Apr 26, 2020 at 11:51:42AM +0200, A L via samba wrote:
>>>> * Connected from a Windows 10 computer over 1G ethernet.
>>>> * Copy data using Windows Explorer and FastCopy(1) from the Samba
>>>>   share to a local disk.
>>>> * Verify the sha-256 sum on the files.
>>>>
>>>> From what I can see there is data corruption on many of the files.
>>>> Sha-256 does not match. I copied the same files many times and the
>>>> data corruption occurs within minutes. The total data set is about
>>>> 800GB.
>>> Can you do checksums on file fragments so we can discover at what
>>> offset (if non-zero) the corruption occurs?
>> Yes, I will check this. I saw a patch on the kernel mailing list
>> about possible corruption during re-scheduling. I wonder if this
>> is the problem I am hitting. I'll run some more tests with this
>> patch. https://www.spinics.net/lists/io-uring/msg01706.html
> Oh, that might explain it. I won't do further work until you can
> confirm the Samba corruptions happen with this kernel patch also.
Hello again,
I set up the following test case:
* Linux 5.7-rc3 (with the patch from previous mail)
* samba-4.12.1
* gcc-9.3.0
* liburing-0.6
* glibc-2.30-r8
=================================
Test 1)
Copy 10 10GB files.
1) ddrescue -s 10G -v -f /dev/urandom 0.bin
2) for((i=1;i<=10;i+=1)); do cp --reflink=always 0.bin $i.bin; done
3) sha256sum *.bin > sha256sum.txt
4) Windows 10, file explorer, copy the 10 files to a local disk D:\test\
5) Verify local files in D:\test with sha256sum
6) sha256sum was correct.
7) Repeated steps 4 and 5. This time the sha256sum was wrong, but all 10
files had the same (wrong) csum!
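Steps 3 and 5 amount to checking every copied file against the manifest written on the server. A minimal Python sketch of that check, assuming the manifest is in the usual sha256sum text format (digest, whitespace, optional `*`, file name). The helper names `sha256_of` and `verify_manifest` are made up for illustration and are not part of any tool used in this test:

```python
import hashlib
from pathlib import Path


def sha256_of(path, bufsize=1 << 20):
    """Return the hex SHA-256 of a file, reading it in 1 MiB pieces."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(bufsize)
            if not block:
                break
            h.update(block)
    return h.hexdigest()


def verify_manifest(manifest, directory):
    """Compare files in `directory` against a sha256sum-style manifest.

    Returns the list of file names whose digest does not match.
    """
    bad = []
    for line in Path(manifest).read_text().splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # '*' marks binary mode in sha256sum output
        if sha256_of(Path(directory) / name) != digest.lower():
            bad.append(name)
    return bad
```

With sha256sum.txt copied next to the files, `verify_manifest("sha256sum.txt", r"D:\test")` would list the files that came across damaged. File names with escaped characters are not handled in this sketch.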
=================================
Test 2)
Copy 1000 10MB files.
1) ddrescue -s 10M -v -f /dev/urandom 0.bin
2) for((i=1;i<=1000;i+=1)); do cp --reflink=always 0.bin $i.bin; done
3) sha256sum *.bin > sha256sum.txt
4) Windows 10, file explorer, copy all 1000 files to a local disk D:\test\
5) Verify local files in D:\test with sha256sum
The results are very surprising!
Correct sha256sum is:
c5ce0d7596c26b18a11eb0609abcd1ba5a4fc12cedcf5ce011a4bf1e227347ae
This is how the files verified:
=======================
D:\TEST\sha256sum.exe *.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *0.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *1.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *10.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *100.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *1000.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *101.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *102.bin
...
...
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *153.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *154.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *155.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *156.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *157.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *158.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *159.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *16.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *160.bin
...
The csum changed here and continued for roughly 200 files until
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *308.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *309.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *31.bin
0c627526f677704d7beec0b56dedb89a1118b78e481d3f012fbc01f923211838 *310.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *311.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *312.bin
0d3d00122af0d486b2a9e1231239c0a77c034564957f069e310186ef8a7ba4aa *313.bin
de84b1fae759a25b0679f73da68747fd8183635d1a6d39d1b28b35d306837fd2 *314.bin
e681cdbc8bf557047967edfe2de71d62753af58c8eba422dfb1a7c6220b58f7b *315.bin
fd517535b5d7115ef7f76480b7f121f957eaba07baee4e58c47f0c2dd3c8614c *316.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *317.bin
c5ce0d7596c26b18a11eb0609abcd1ba5a4fc12cedcf5ce011a4bf1e227347ae *318.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *319.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *32.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *320.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *321.bin
0c627526f677704d7beec0b56dedb89a1118b78e481d3f012fbc01f923211838 *322.bin
c5ce0d7596c26b18a11eb0609abcd1ba5a4fc12cedcf5ce011a4bf1e227347ae *323.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *324.bin
ae2601d8dcd1ef592a92907843a673703e6161173164b1094a52508ff65ab60a *325.bin
c5ce0d7596c26b18a11eb0609abcd1ba5a4fc12cedcf5ce011a4bf1e227347ae *326.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *327.bin
6720f9dc964c0fa2125366338f921a6772b2b0751ef194b12779989ef25a9be8 *328.bin
ec8c312be8d7c20bb39e20007d53e6ac49022df56877ec00f9dc757f74deaa7d *329.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *33.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *330.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *331.bin
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *332.bin
...
The listing continued with the same (wrong) csums until the end.
RESULT: 23 out of 1001 files had the correct csum.
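Counting by eye over 1001 lines is error-prone, so here is a rough Python sketch of how such a tally could be produced from saved sha256sum output. The function name `tally_checksums` is hypothetical, and the sample is only a few lines taken from the listing above, not the full result:

```python
from collections import Counter


def tally_checksums(listing, expected):
    """Count digests in sha256sum-style output.

    Returns (number of lines matching `expected`, Counter of all digests).
    """
    counts = Counter()
    for line in listing.splitlines():
        parts = line.split(maxsplit=1)
        # Skip blank lines and anything whose first field is not
        # 64 characters long (command lines, "..." markers, and so on).
        if len(parts) != 2 or len(parts[0]) != 64:
            continue
        counts[parts[0].lower()] += 1
    return counts[expected.lower()], counts


# A few lines taken from the listing above.
sample = """\
7bff13155c38b825def06e269fae3543395f7273104aff8eb7c8b488419d09fe *0.bin
2eae29bc03989b7594550837dacacdebedc8757bec6b889e39e8289492653881 *1.bin
c5ce0d7596c26b18a11eb0609abcd1ba5a4fc12cedcf5ce011a4bf1e227347ae *318.bin
c5ce0d7596c26b18a11eb0609abcd1ba5a4fc12cedcf5ce011a4bf1e227347ae *323.bin
"""
expected = "c5ce0d7596c26b18a11eb0609abcd1ba5a4fc12cedcf5ce011a4bf1e227347ae"
good, counts = tally_checksums(sample, expected)
print(f"{good} correct, {len(counts)} distinct checksums")
```

The returned Counter also shows how many files share each wrong csum, which is the interesting part here since long runs of files came back with identical wrong data.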
=================================
Test 3)
1) Remove io_uring from vfs objects in smb.conf and restart Samba.
2) Copy all original files from Test 2
3) All 1001 files' csums are now correct.
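On the earlier request for checksums on file fragments: I have not done this yet, but a minimal Python sketch of how it could be done is below. It assumes a 1 MiB chunk size, which is an arbitrary choice, and the names `chunk_digests` and `differing_chunks` are invented for this example:

```python
import hashlib


def chunk_digests(path, chunk_size=1 << 20):
    """Yield (offset, hex SHA-256) for each chunk_size piece of the file."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            yield offset, hashlib.sha256(block).hexdigest()
            offset += len(block)


def differing_chunks(good_path, bad_path, chunk_size=1 << 20):
    """Return offsets of chunks whose digests differ between the two files.

    If the files have different lengths, the extra chunks of the longer
    file are reported as differing too.
    """
    good = dict(chunk_digests(good_path, chunk_size))
    bad = dict(chunk_digests(bad_path, chunk_size))
    return sorted(off for off in good.keys() | bad.keys()
                  if good.get(off) != bad.get(off))
```

`differing_chunks` needs both files readable from one machine. When the original stays on the server and the copy on the Windows disk, the output of `chunk_digests` can be saved on each side and the two lists compared instead. A smaller chunk size narrows down the offset at the cost of a longer listing.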
=================================
smb.conf:
###########################
[global]
log level = 1
workgroup = WORKGROUP
netbios name = SAMBA
server string = Samba Server
server role = standalone server
hosts allow = 192.168.0. 127.
interfaces = lan
max protocol = SMB3_11
log file = /var/log/samba/%I.log
max log size = 10240
security = user
passdb backend = tdbsam
wins support = yes
dns proxy = yes
[usb-backup]
comment = USB Backup - Media files
path = /media/usb-backup
writeable = no
browseable = yes
read only = yes
create mask = 0664
directory mask = 0775
guest only = Yes
guest ok = Yes
force user = nasuser
force group = nas
store dos attributes = yes
ea support = no
acl group control = no
inherit owner = Yes
vfs objects = btrfs, io_uring
###########################
The Samba log file does not contain much. These are the entries logged during the test:
[2020/04/27 23:12:54.432587, 1] ../../source3/param/loadparm.c:2512(lp_idmap_range)
  idmap range not specified for domain '*'
[2020/04/27 23:13:02.227075, 1] ../../source3/param/loadparm.c:2512(lp_idmap_range)
  idmap range not specified for domain '*'
[2020/04/27 23:13:02.882414, 1] ../../source3/param/loadparm.c:2512(lp_idmap_range)
  idmap range not specified for domain '*'
Regards,
Anders