Data Corruption bug with Samba's vfs_iouring and Linux 5.6.7/5.7rc3

Thu May 7 16:50:40 UTC 2020

On 5/7/20 10:48 AM, Jeremy Allison wrote:
> On Thu, May 07, 2020 at 10:43:17AM -0600, Jens Axboe wrote:
>>
>> Replying here, as I missed the storm yesterday... The reason why it's
>> different is that later kernels no longer attempt to prevent the short
>> reads. They happen when you get overlapping buffered IO. Then one sqe
>> will find that X of the Y range is already in cache, and return that.
>> We don't retry the latter blocking. We previously did, but there was
>> a few issues with it:
>>
>> - You're redoing the whole IO, which means more copying
>>
>> - It's not safe to retry, it'll depend on the file type. For socket,
>>   pipe, etc we obviously cannot. This is the real reason it got disabled,
>>   as it was broken there.
>>
>> Just like for regular system calls, applications must be able to deal
>> with short IO.
> 
> Thanks, that's a helpful definitive reply. Of course, the SMB3
> protocol is designed to deal with short IO replies as well, and
> the Samba and linux kernel clients are well-enough written that
> they do so. MacOS and Windows however..

I'm honestly surprised that such broken clients exists! Even being
a somewhat old timer cynic...

> Unfortunately they're the most popular clients on the planet,
> so we'll probably have to fix Samba to never return short IOs.

That does sound like the best way forward, short IOs is possible
with regular system calls as well, but will definitely be a lot
more frequent with io_uring depending on the access patterns,
page cache, number of threads, and so on.

-- 
Jens Axboe