Hitting an AIO problem on FreeBSD when we hit the limit of 100 (actually) queued requests

Jeremy Allison jra at samba.org
Fri Jan 4 16:48:18 MST 2013


On Thu, Jan 03, 2013 at 06:41:17PM -0800, Richard Sharpe wrote:
> Hi folks,
> 
> I am hitting a problem with AIO (write) on 3.6.6 when we hit the limit
> of outstanding AIO requests.
> 
> My debugging shows the following:
> 
> [2013/01/03 17:57:14.260653,  5] smbd/smb2_server.c:879(smbd_smb2_request_pendin
> g_queue)
>   setting pending: MID = 0x2e4, req->subreq = 0x804a77550
> [2013/01/03 17:57:14.260974, 10] smbd/smb2_server.c:2574(smbd_smb2_request_incom
> ing)
>   smbd_smb2_request_incoming: idx[1] of 4 vectors
> [2013/01/03 17:57:14.261175, 10] smbd/smb2_server.c:354(smb2_validate_message_id
> )
>   smb2_validate_message_id: clearing id 741 (position 741) from bitmap
> [2013/01/03 17:57:14.261240, 10] smbd/smb2_server.c:1224(smbd_smb2_request_dispa
> tch)
>   smbd_smb2_request_dispatch: opcode[SMB2_WRITE] mid = 741
> [2013/01/03 17:57:14.261314,  4] smbd/uid.c:351(change_to_user)
>   Skipping user change - already user
> [2013/01/03 17:57:14.261395, 10] smbd/smb2_write.c:243(smbd_smb2_write_send)
>   smbd_smb2_write: another-big-one.txt - fnum[19485]
> [2013/01/03 17:57:14.261474,  3] smbd/aio.c:534(schedule_aio_smb2_write)
>   smb2: Already have 100 aio activities outstanding.
> 
> Thus, after MID 0x2e4 we hit the limit.
> 
> Then I see the following completions:
> 
>   setting pending: MID = 0x2e4, req->subreq = 0x804a77550
> **** After this we hit the limit and we see the following requests complete
>   aio_write:       MID = 0x2e0, subreq = 0x804a27550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2e1, subreq = 0x804a3b550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2e2, subreq = 0x804a4f550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2e3, subreq = 0x804a63550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2e4, subreq = 0x804a77550, errcode: Unknown error: 0
>   aio_write:       MID = 0x282, subreq = 0x8042b1550, errcode: Unknown error: 0
>   aio_write:       MID = 0x29f, subreq = 0x8044f2550, errcode: Unknown error: 0
>   aio_write:       MID = 0x284, subreq = 0x8042b7550, errcode: Unknown error: 0
>   aio_write:       MID = 0x286, subreq = 0x8042f8550, errcode: Unknown error: 0
> ...
>   aio_write:       MID = 0x2da, subreq = 0x8049ad550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2db, subreq = 0x8049c1550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2dc, subreq = 0x8049d5550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2dd, subreq = 0x8049e9550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2de, subreq = 0x8049fd550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2df, subreq = 0x804a13550, errcode: Unknown error: 0
>   aio_write:       MID = 0x2e0, subreq = 0, errcode: Invalid argument
> 
> At this point we crash, because 0x2e0 has been replayed. It is the
> 100th element and it has already been responded to first in the list
> above, so the subreq has been zeroed out. I think the invalid argument
> response is because the kernel knows that the AIOCB is not in the
> system.
> 
> I will continue to look at this, but it looks like there is a problem
> in the tevent signal handling stuff and an off-by-one problem. Has
> anyone seen this?

No, I'm not aware of anything here. Let me know if you want some
help tracking this one down !

Jeremy.


More information about the samba-technical mailing list