Data Corruption bug with Samba's vfs_iouring and Linux 5.6.7/5.7rc3

Jens Axboe axboe at
Thu May 7 16:43:17 UTC 2020

On 5/6/20 9:42 AM, Pavel Begunkov wrote:
> On 06/05/2020 18:20, Stefan Metzmacher wrote:
>> Am 06.05.20 um 14:55 schrieb Pavel Begunkov:
>>> On 05/05/2020 23:19, Stefan Metzmacher wrote:
>>> AFAIK, it can. io_uring first tries to submit a request with IOCB_NOWAIT,
>>> in short, for performance reasons. And it has been doing so from the beginning
>>> or so. The same is true for writes.
>> See the other mails in the thread.
> Cool you resolved the issue!
>> The test I wrote shows the implicit IOCB_NOWAIT was not exposed to the
>> caller (at least in 5.3 and 5.4).
> # git show remotes/origin/for-5.3/io_uring:fs/io_uring.c | grep "kiocb->ki_flags |= IOCB_NOWAIT" -A 5 -B 5
> if (force_nonblock)
>         kiocb->ki_flags |= IOCB_NOWAIT;
> And it has been there since 5.2 or even earlier. I don't know, your results
> could be because of a different policy in the block layer, something unexpected in
> io_uring, etc., but it's how it was intended to be.
>> I think the typical user doesn't want it to be exposed!
>> I'm not sure for blocking reads on a socket, but for files
>> below EOF it's really not what's expected.
> Hard to say, but even read(2) without any NONBLOCK doesn't guarantee that.
> Hopefully, BPF will help us with that in the future.

Replying here, as I missed the storm yesterday... The reason why it's
different is that later kernels no longer attempt to prevent the short
reads. They happen when you get overlapping buffered IO. Then one sqe
will find that X of the Y range is already in cache, and return that.
We don't retry the rest with a blocking attempt. We previously did, but
there were a few issues with it:

- You're redoing the whole IO, which means more copying

- It's not always safe to retry; that depends on the file type. For sockets,
  pipes, etc. we obviously cannot. This is the real reason it got disabled,
  as it was broken there.

Just like for regular system calls, applications must be able to deal
with short IO.
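
As an illustration of what dealing with short IO can look like for a
regular file, here is a minimal sketch using plain pread(2) rather than
io_uring. The helper name pread_full is hypothetical (it is not from
Samba or liburing), and the sketch assumes a seekable file, since a
positioned retry makes no sense for sockets or pipes:

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (assumption, not an existing API): keep calling
 * pread(2) until `len` bytes have been read, EOF is hit, or an error
 * occurs. Returns the number of bytes read, or -1 with errno set.
 * Note that bytes read before an error are not reported.
 */
ssize_t pread_full(int fd, void *buf, size_t len, off_t offset)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, try again */
            return -1;      /* real error, errno is set by pread */
        }
        if (n == 0)
            break;          /* EOF: nothing more to read */
        done += (size_t)n;  /* short read: advance and ask for the rest */
    }
    return (ssize_t)done;
}
```

With io_uring the same idea would apply at completion time: compare
cqe->res with the requested length and, for a regular file, submit a new
sqe for the remaining range.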

Jens Axboe

More information about the samba-technical mailing list