Work in progress SMB-Direct driver for the linux kernel

Tom Talpey ttalpey at microsoft.com
Fri Feb 2 14:52:42 UTC 2018


> -----Original Message-----
> From: linux-cifs-owner at vger.kernel.org <linux-cifs-owner at vger.kernel.org> On
> Behalf Of Stefan Metzmacher
> Sent: Friday, February 2, 2018 4:25 AM
> To: Richard Sharpe <realrichardsharpe at gmail.com>; Jason Gunthorpe
> <jgg at ziepe.ca>
> Cc: linux-cifs at vger.kernel.org; linux-rdma at vger.kernel.org; Samba Technical
> <samba-technical at lists.samba.org>; Steve French <smfrench at gmail.com>;
> David Disseldorp <ddiss at samba.org>
> Subject: Re: Work in progress SMB-Direct driver for the linux kernel
> 
> Hi Jason,
> 
> >>> The first goal is to provide a socket fd to userspace (or in kernel
> >>> consumers)
> >>> which provides semantics like a TCP socket which is used as transport
> >>> for SMB3. Basically frames are submitted with a 4 byte length header.
> >>
> >> Part of the point of RDMA is that we don't need to make protocol
> >> specific kernel modules like this - is there a specific reason this
> >> needs to be in the kernel like this?
> >
> > If I had to guess it would be because Samba currently uses a fork
> > model ... it might be years before it gets to a completely threaded
> > model.
> 
> Yes, and it also means that our client and server code only need
> minimal changes in order to work in the same way it would work
> over tcp.
> 
> Only the RDMA read and writes need some more work, but I have
> some ideas where the userspace gives the kernel an fd, offset and length
> plus a remote memory descriptor as ioctl on the connection fd. Then the
> kernel can get the content from the filesystem and directly pass it to
> the rdma adapter, avoiding the copy from kernel to userspace and back.
> From userspace we'll just wait in the syscall and don't have to care
> about memory registrations and all other complex stuff.

Doesn't this sort of transport shimming put back all the overhead it was
trying to avoid? Stripping off the 4-byte record marker, rearranging the
read/write data and SMB3_READ operation header to add the channel
(memory registration) handles, and most importantly placing the data
in bounce buffers to accommodate the readv()/writev() calls are quite
complex and expensive. And, just to present a file descriptor? 
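
To make the record-marker step concrete, here is a minimal sketch in
Python. It assumes the "4 byte length header" is the one used by SMB over
TCP: a zero byte followed by a 3-byte big-endian length. The function names
are hypothetical and are not taken from the proposed driver. Even this
simplest step has to copy every payload out of the byte stream.

```python
import struct
from typing import List

# Assumption: the marker is a zero byte plus a 3-byte big-endian length,
# so a payload can be at most 2**24 - 1 bytes.
MAX_FRAME = 0xFFFFFF


def add_record_marker(payload: bytes) -> bytes:
    """Prefix one message with its 4-byte record marker."""
    if len(payload) > MAX_FRAME:
        raise ValueError("payload too large for a 3-byte length")
    # Packing as a 32-bit big-endian integer leaves the top byte zero.
    return struct.pack(">I", len(payload)) + payload


def strip_record_markers(stream: bytes) -> List[bytes]:
    """Split a byte stream into payloads, dropping each record marker.

    A trailing incomplete frame is left unparsed. Each slice below is a
    copy, which is the bounce-buffer cost in miniature.
    """
    frames = []
    offset = 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        end = offset + 4 + length
        if end > len(stream):
            break  # incomplete frame, wait for more data
        frames.append(stream[offset + 4:end])
        offset = end
    return frames
```

A shim that presents a stream socket has to run something like
strip_record_markers() on every send before it can build RDMA work
requests, and something like add_record_marker() on every receive.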

Experience with early NFS/RDMA and Windows Sockets Direct has taught
that transparency above the RDMA transport interface is generally the
enemy of performance. The shims are forced to perform additional syscalls,
RDMA work requests, and sometimes even network round trips. Do you
have performance results for yours?

> It also happens that smbd sometimes blocks in syscalls like unlink for
> a long time. It's good to have the kernel as 2nd entity that takes care
> of keepalives.

I agree that implementing SMB Direct in your userspace SMB3 daemon
may be problematic. But what of the existing SMB Direct code in the
CIFS kernel client? How will that coexist going forward?

Tom.

