[linux-cifs-client] [patch] Increase send time out on a socket long enough in order to eliminate any timeouts on large sends

Jeff Layton jlayton at redhat.com
Thu Jul 23 09:34:39 MDT 2009


On Thu, 23 Jul 2009 09:51:32 -0500
Shirish Pargaonkar <shirishpargaonkar at gmail.com> wrote:

> On Thu, Jul 23, 2009 at 7:05 AM, Jeff Layton<jlayton at redhat.com> wrote:
> > On Wed, 22 Jul 2009 20:14:38 -0500
> > Shirish Pargaonkar <shirishpargaonkar at gmail.com> wrote:
> >
> >> In spite of a set of data integrity patches in cifs last year, errors
> >> caused by timeouts still persist; the timeouts result in incomplete
> >> data being sent and hence in data integrity errors.
> >>
> >> The proposed socket send timeout is large enough to eliminate that possibility.
> >
> > On what evidence do you base the above statement? Who's to say that 30s
> > is long enough if someone has a sufficiently high-latency connection?
> >
> >> Tests with this patch eliminated data integrity errors over an
> >> 80-hour test run; without it, they manifest within a matter of hours.
> >>
> >
> > Also, can you give some details about these data integrity errors? Were
> > writes failing? If so, were they not reported at fsync or close?
> 
> The errors logged by the cifs client looked like this.
> This is what I had seen last year when the patches were developed.
> The entire write could not be sent because of a socket timeout; another
> thread then fills in the rest of the 56K write, so the second 56K write
> gets no response and the client logs 'No response for cmd'.
> The longer timeout seems to be long enough for the server to receive the
> entire smbwrite (56K).
> 
> May 12 05:17:09 voyBCSsles11-rc3 kernel:  CIFS VFS: server not responding
> May 12 05:17:09 voyBCSsles11-rc3 kernel:  CIFS VFS: No response for cmd 50 mid 20646
> May 12 05:17:09 voyBCSsles11-rc3 kernel:  CIFS VFS: No response to cmd 47 mid 20647
> May 12 05:17:09 voyBCSsles11-rc3 kernel:  CIFS VFS: Write2 ret -11, wrote 0
> May 12 05:17:11 voyBCSsles11-rc3 kernel:  CIFS VFS: Write2 ret -9, wrote 0
> May 12 05:17:39 voyBCSsles11-rc3 kernel:  CIFS VFS: server not responding
> May 12 05:17:39 voyBCSsles11-rc3 kernel:  CIFS VFS: No response for cmd 50 mid 21347
> May 12 05:17:39 voyBCSsles11-rc3 kernel:  CIFS VFS: No response to cmd 47 mid 21348
> May 12 05:17:39 voyBCSsles11-rc3 kernel:  CIFS VFS: Write2 ret -11, wrote 0
> May 12 05:17:39 voyBCSsles11-rc3 kernel:  CIFS VFS: Write2 ret -9, wrote 0
> May 12 05:18:09 voyBCSsles11-rc3 kernel:  CIFS VFS: server not responding
> May 12 05:18:09 voyBCSsles11-rc3 kernel:  CIFS VFS: No response to cmd 46 mid 24859
> May 12 05:18:09 voyBCSsles11-rc3 kernel:  CIFS VFS: Send error in read = -11
> May 12 05:18:09 voyBCSsles11-rc3 kernel:  CIFS VFS: No response for cmd 50 mid 24858
> 
> 
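
For reading the return codes in the log above: with Linux errno numbering,
11 is EAGAIN and 9 is EBADF, so "ret -11" is -EAGAIN and "ret -9" is
-EBADF. The small helper below is only a sketch for decoding such lines.
The table, the regular expression and the name decode_ret are made up for
illustration and are not part of cifs or any existing tool.

```python
import re

# Assumption: Linux errno numbering. Only the two values seen in the log.
LINUX_ERRNO = {9: "EBADF", 11: "EAGAIN"}

# Matches "ret -11" and "= -11" as they appear in the quoted log lines.
RET_PATTERN = re.compile(r"(?:ret|=) (-\d+)")


def decode_ret(line):
    """Return the errno name for a negative return code in a log line.

    Returns None when the line carries no return code, and "unknown"
    when the code is not in the small table above.
    """
    match = RET_PATTERN.search(line)
    if match is None:
        return None
    return LINUX_ERRNO.get(-int(match.group(1)), "unknown")


if __name__ == "__main__":
    for text in ("CIFS VFS: Write2 ret -11, wrote 0",
                 "CIFS VFS: Write2 ret -9, wrote 0",
                 "CIFS VFS: Send error in read = -11"):
        print(text, "->", decode_ret(text))
```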

It sounds like the original bug was never fixed then, only made less
likely by changing the timing. This patch looks like it just does the
same thing.

Rather than papering over the bug by increasing the timeout, I think a
patch is needed that fixes the actual bug. That is, you need to make it
impossible for these sorts of interleaved sends to occur.
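
To illustrate what ruling out interleaved sends could look like, here is a
minimal user-space sketch in Python. It is not cifs code and not a proposed
patch. It assumes a single lock shared by all senders on one socket and
holds that lock until a whole frame has been written, so a short send()
cannot leave a gap for another thread to fill. The names send_frame,
recv_exact and run_demo are hypothetical; FRAME is set to 56K only because
that is the write size mentioned above.

```python
import socket
import threading

FRAME = 56 * 1024  # 56K, the write size mentioned in this thread


def send_frame(sock, send_lock, payload):
    """Send one whole frame while holding send_lock.

    The lock is held until every byte has been written, so a short
    send() never lets another thread's data land mid-frame.
    """
    view = memoryview(payload)
    with send_lock:
        while view:
            sent = sock.send(view)  # may write fewer bytes than requested
            view = view[sent:]


def recv_exact(sock, count):
    """Read exactly count bytes, or fewer if the peer closes early."""
    chunks = []
    while count > 0:
        chunk = sock.recv(min(count, 65536))
        if not chunk:
            break
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def run_demo(frames_per_writer=20):
    """Two writers share one socket; return the frames the reader saw."""
    left, right = socket.socketpair()
    send_lock = threading.Lock()
    received = []

    def reader():
        for _ in range(2 * frames_per_writer):
            received.append(recv_exact(right, FRAME))

    def writer(fill):
        for _ in range(frames_per_writer):
            send_frame(left, send_lock, fill * FRAME)

    threads = [threading.Thread(target=reader),
               threading.Thread(target=writer, args=(b"A",)),
               threading.Thread(target=writer, args=(b"B",))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    left.close()
    right.close()
    return received


if __name__ == "__main__":
    frames = run_demo()
    # Check that every frame holds bytes from exactly one writer.
    print(all(len(set(frame)) == 1 for frame in frames))
```

The sketch uses a blocking socket with no send timeout. With a timeout, a
send can still fail partway through a frame, and the code would then have
to decide while still holding the lock whether to retry the remainder or
give up on the connection; the sketch leaves that decision out.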

-- 
Jeff Layton <jlayton at redhat.com>

