[[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to receive data

Mon Aug 14 23:35:32 UTC 2017

> -----Original Message-----
> From: Long Li
> Sent: Monday, August 14, 2017 7:25 PM
> To: Tom Talpey <ttalpey at microsoft.com>; Steve French <sfrench at samba.org>;
> linux-cifs at vger.kernel.org; samba-technical at lists.samba.org; linux-
> kernel at vger.kernel.org
> Subject: RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to
> receive data
> 
> 
> 
> > -----Original Message-----
> > From: Tom Talpey
> > Sent: Monday, August 14, 2017 1:57 PM
> > To: Long Li <longli at microsoft.com>; Steve French <sfrench at samba.org>;
> > linux-cifs at vger.kernel.org; samba-technical at lists.samba.org; linux-
> > kernel at vger.kernel.org
> > Subject: RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer
> > to receive data
> >
> > > -----Original Message-----
> > > From: linux-cifs-owner at vger.kernel.org [mailto:linux-cifs-
> > > owner at vger.kernel.org] On Behalf Of Long Li
> > > Sent: Wednesday, August 2, 2017 4:11 PM
> > > To: Steve French <sfrench at samba.org>; linux-cifs at vger.kernel.org;
> > > samba- technical at lists.samba.org; linux-kernel at vger.kernel.org
> > > Cc: Long Li <longli at microsoft.com>
> > > Subject: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer
> > > to receive data
> > >
> > >  /*
> > > + * Read data from receive reassembly queue
> > > + * All the incoming data packets are placed in reassembly queue
> > > + * buf: the buffer to read data into
> > > + * size: the length of data to read
> > > + * return value: actual data read
> > > + */
> > > +int cifs_rdma_read(struct cifs_rdma_info *info, char *buf, unsigned
> > > +int size) {
> > >...
> > > +       spin_lock_irqsave(&info->reassembly_queue_lock, flags);
> > > +       log_cifs_read("size=%d info->reassembly_data_length=%d\n", size,
> > > +               atomic_read(&info->reassembly_data_length));
> > > +       if (atomic_read(&info->reassembly_data_length) >= size) {
> >
> > If the reassembly queue is protected by a lock, why is an atomic_read() of its
> > length needed?
> 
> Will change this to non-atomic.
> 
> >
> > > +                       // this is for reading rfc1002 length
> > > +                       if (response->first_segment && size==4) {
> > > +                               unsigned int rfc1002_len =
> > > +                                       data_length + remaining_data_length;
> > > +                               *((__be32*)buf) = cpu_to_be32(rfc1002_len);
> > > +                               data_read = 4;
> > > +                               response->first_segment = false;
> > > +                               log_cifs_read("returning rfc1002 length %d\n",
> > > +                                       rfc1002_len);
> > > +                               goto read_rfc1002_done;
> > > +                       }
> >
> > I am totally confused. What does RFC1002 framing have to do with receiving
> > an SMB Direct packet???
> 
> The upper layer expects RFC1002 length at the beginning of the payload. A lot
> of protocol processing logic check and act on this value. Returning this value
> will avoid changes to lots of other upper layer code.
> 
> This will be eventually fixed when a transport layer is added to upper layer
> code. I recommend we do it in another patch.

It's totally non-obvious that you are *inserting* an RFC1002 length into the received
message. Need to state that, and include the above explanation.

OK on deferring that work. So far so good!

> > > +                       to_copy = min_t(int, data_length - offset, to_read);
> > > +                       memcpy(
> > > +                               buf + data_read,
> > > +                               (char*)data_transfer + data_offset + offset,
> > > +                               to_copy);
> >
> > Is it really necessary to perform all these data copies, especially under the
> > reassembly_queue spinlock? This seems quite inefficient. Can the receive
> > buffers not be loaned out and chained logically?
> 
> This will require upper layer code changes to move to use new buffers
> allocated/loaned this way, and also deal with packet boundaries.
> 
> This code is not used to actually carry file data, which are normally done
> through RDMA read/write.

Disagree - RDMA will only be used when the size exceeds a threshold, perhaps 4KB or
even 8KB. That means you'll be performing several memcpy()s for such workloads.
RDMA is only used for bulk data.

At the very least, try to rearrange the code to avoid holding the reassembly lock for
so long, definitely not when memcpy'ing. It will significantly single-thread all your receives.

> If we want to do it, I suggest do another patch since more changes other than
> transport are involved.

Sure, that's fine but 1) we will definitely want to do it and 2) this needs a comment.

Tom.