[PATCH v3 0/4] smb: smbdirect/client/server: relax WARN_ON_ONCE(SMBDIRECT_SOCKET_*) checks

Stefan Metzmacher metze at samba.org
Thu Nov 27 15:45:16 UTC 2025


On 27.11.25 at 00:10, Namjae Jeon wrote:
> On Thu, Nov 27, 2025 at 1:03 AM Stefan Metzmacher <metze at samba.org> wrote:
>>
>> On 26.11.25 at 16:18, Stefan Metzmacher via samba-technical wrote:
>>> On 26.11.25 at 16:17, Namjae Jeon wrote:
>>>> On Wed, Nov 26, 2025 at 4:16 PM Stefan Metzmacher <metze at samba.org> wrote:
>>>>>
>>>>> On 26.11.25 at 00:50, Namjae Jeon wrote:
>>>>>> On Tue, Nov 25, 2025 at 11:22 PM Stefan Metzmacher <metze at samba.org> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> here are some small cleanups for a problem Namjae reported,
>>>>>>> where two WARN_ON_ONCE(sc->status != ...) checks were triggered
>>>>>>> by a Windows 11 client.
>>>>>>>
>>>>>>> The patches should relax the checks if an error happened before,
>>>>>>> they are intended for 6.18 final, as far as I can see the
>>>>>>> problem was introduced during the 6.18 cycle only.
>>>>>>>
>>>>>>> Given that v1 of this patchset produced a very useful WARN_ONCE()
>>>>>>> message, I'd really propose to keep this for 6.18, also for the
>>>>>>> client, where the actual problem may not exist, but if it
>>>>>>> does, it will be useful to have the more informative messages
>>>>>>> in 6.18 final.
>>>>>> First, the warning message has been improved. Thanks.
>>>>>> However, when copying a 6-7GB file on a Windows client, the following
>>>>>> error occurs. These error messages did not occur when testing with the
>>>>>> older ksmbd rdma (https://github.com/namjaejeon/ksmbd).
>>>>>
>>>>> With transport_rdma.* restored from 6.17?
>>>> I just tested it and this issue does not occur on ksmbd rdma of the 6.17 kernel.
>>>
>>> 6.17 or just transport_rdma.* from 6.17, but the rest from 6.18?
>>>
>>
>> Can you also test with 6.17 + fad988a2158d743da7971884b93482a73735b25e
>> Maybe that changed things in order to let RDMA READs fail or cause a
>> disconnect.
> The kernel version I tested was 6.17.9 and this patch was already applied.

Ah, good, it also has:
smb: server: let smb_direct_flush_send_list() invalidate a remote key first

>> Otherwise I'd suggest to test the following commits in order
>> to find where the problem was introduced:
>> 177368b9924314bde7d2ea6dc93de0d9ba728b61
> I don't have time to do this right now due to other work.
> Did you test it with a Windows client and can't find this issue?

I can only reproduce the problem this patchset is fixing,
(recv completion before getting the ESTABLISHED callback).
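
As a rough illustration of what "relaxing" such a status check can mean
(warn about an unexpected status only when no earlier error already
explains it), here is a small userspace sketch. It is not the code from
the patches: the enum values, struct sketch_socket, its fields and
sketch_check_status() are hypothetical names made up for this example,
and fprintf() stands in for the kernel's WARN_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified connection states; not the kernel's enum. */
enum sketch_status {
	SKETCH_NEGOTIATE_NEEDED,
	SKETCH_CONNECTED,
	SKETCH_DISCONNECTED,
};

/* Hypothetical stand-in for the per-connection state. */
struct sketch_socket {
	enum sketch_status status;
	int first_error;	/* 0 until the first error is recorded (assumed) */
	bool warned;		/* emulates the "once" of WARN_ON_ONCE() */
};

/*
 * Relaxed check: report a status mismatch only if no error was
 * recorded before, and only once per socket.
 * Returns true if a warning was emitted.
 */
static bool sketch_check_status(struct sketch_socket *sc,
				enum sketch_status expected)
{
	assert(sc != NULL);

	if (sc->status == expected)
		return false;	/* nothing unexpected */
	if (sc->first_error != 0)
		return false;	/* an earlier error explains the mismatch */
	if (sc->warned)
		return false;	/* already reported once */

	sc->warned = true;
	fprintf(stderr, "unexpected status %d (expected %d)\n",
		(int)sc->status, (int)expected);
	return true;
}
```

With this sketch, a socket that is already in SKETCH_DISCONNECTED with a
non-zero first_error stays silent, while a mismatch without any recorded
error is still reported exactly once.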

I tested with an Intel-E810-CQDA2 card in RoCEv2 mode
and a Windows Server 2025 machine (acting as client against ksmbd).

And I can only reproduce the problem with the recv completion
before the ESTABLISHED event. So this patchset is not only
useful for setups with a ConnectX-7. By the way, were you testing
with InfiniBand or RoCEv2?

Otherwise copying a 2G and 5G file to and from the share works
without problems.

I used this to verify that rdma offload was used:

root@rdmatest04l0:~# cat ksmbd-rdma-xmit.bt
kprobe:smb_direct_rdma_xmit {
         printf("%s pid=%d %s\n", comm, pid, func)
}
root@rdmatest04l0:~# bpftrace ksmbd-rdma-xmit.bt

And it printed a lot of kworker/4:1 pid=6162 smb_direct_rdma_xmit lines...

From the logs you sent, it seems the client terminated the TCP and RDMA connections.
Do you see anything in the client's event log?

metze


