Data Corruption bug with Samba's vfs_iouring and Linux 5.6.7/5.7rc3

Fri May 8 18:53:39 UTC 2020

On Fri, May 08, 2020 at 09:35:40PM +0530, Anoop C S wrote:
> 
> Jeremy's patch(for handling short reads) still works with updated
> kernel(v5.6.10).
> 
> > Can you please also test the patch I posted here:
> > https://bugzilla.samba.org/show_bug.cgi?id=14361#c21
> 
> Unfortunately copy failed to start with this patch set. I don't know if
> its because of updated kernel(v5.6.8 -> v5.6.10) but smbd terminated
> with SIGABRT right at the beginning of copy. I have attached backtrace
> and log file. Please have a look.

I see the problem. It's the reason I initially introduced
an immediate event, and then refactored my code to call
_vfs_io_uring_queue_run() in a loop if it returned 'need_retry'.

Metze's current code is recursive, and it's running out
of stack.

It uses:

vfs_io_uring_pread_send()
	->vfs_io_uring_pread_submit()  <-----------------------------------
		->vfs_io_uring_request_submit()                           |
			->vfs_io_uring_queue_run()                        |
                                                                          |
But inside vfs_io_uring_queue_run() once a read                           |
completes it calls:                                                       |
                                                                          |
vfs_io_uring_queue_run()                                                  |
	->vfs_io_uring_finish_req()                                       |
		->cur->completion_fn()                                    |
                                                                          |
cur->completion_fn() for pread is set to vfs_io_uring_pread_completion()  |
                                                                          |
vfs_io_uring_pread_completion()                                           |
	->vfs_io_uring_pread_submit() -------------------------------------

(the ascii art will probably only work in fixed-font text mailers
but you should get the idea).

I don't think this pattern will work. You have to exit
from vfs_io_uring_queue_run() all the way out so that
you've called io_uring_cq_advance() to move the completion
queue forward before you can submit another request to
finish the short IO.

That's what my initial patch did with an immedate event,
until I got the idea of wrapping vfs_io_uring_queue_run()
inside a retry wrapper function.

I'll take a look to see if I can fix this so the short
read re-submission happens *outside* of vfs_io_uring_queue_run(),
as I think that's the only way to make this work.

Jeremy.