[linux-cifs-client] Re: read/write retry behavior with dead network/server/disconnected cable

Steven French sfrench at us.ibm.com
Tue Aug 29 18:31:18 GMT 2006


Did more research on this. For the paradoxical case of cifs reading
successfully with the cable unplugged, there is no problem that I can
see: the data being returned to the application had already been read
ahead into the page cache on the client.

In my testing using 4K reads, I see the next 16 reads succeed (after the
cable is unplugged or the server or network is stopped) because the last
successful reads over the network had already read 64K beyond where the
application was reading, and 64K / 4K = 16. So this is a simple case of
readahead working as it should. I checked the data returned on each read,
and it was correct (I had a file with a distinct pattern incrementing
every eight bytes).
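
For anyone who wants to repeat the check, here is a rough sketch of the
arithmetic and of a pattern check. The exact pattern layout (a 64-bit
little-endian counter in each eight bytes) and the helper names are
assumptions made for illustration; the test file only needs some
pattern that changes every eight bytes.

```python
import struct

BLOCK = 4096            # application read size used in the test (4K)
READAHEAD = 64 * 1024   # data already read beyond the application's offset (64K)
WORD = 8                # the pattern changes every eight bytes


def cached_reads(readahead=READAHEAD, block=BLOCK):
    """Number of block-sized reads that can be satisfied from the cache."""
    return readahead // block


def make_pattern(nbytes):
    """Build nbytes of test data: one incrementing counter per 8 bytes.

    Assumed layout (not taken from the original test file): unsigned
    64-bit little-endian counter starting at 0.
    """
    assert nbytes % WORD == 0
    return b"".join(struct.pack("<Q", i) for i in range(nbytes // WORD))


def check_block(data, offset):
    """Return True if data, read at file offset `offset`, fits the pattern."""
    if offset % WORD or len(data) % WORD:
        return False
    first = offset // WORD
    words = struct.unpack("<%dQ" % (len(data) // WORD), data)
    return all(w == first + i for i, w in enumerate(words))
```

Write make_pattern() output to a file on the share, then call
check_block() on each 4K buffer the application gets back, passing the
offset it was read from.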

/proc/fs/cifs/DebugData would show the state of the session as
disconnected, but the client thinks that there is good data in the cache.

We could make the behavior configurable (i.e. not have cifs call
generic_file_read in this case, so that we do not attempt to read from
the page cache when the connection is dead), but I am not sure that is
needed.

Am investigating the write case, but the read behavior seems fine.
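
The cases quoted below come down to timing each call and recording its
return value and errno. A minimal probe along these lines might help;
the function names are made up for illustration and this is not the
program used in the original testing.

```python
import errno
import os
import time


def timed_read(fd, size=4096):
    """Issue one read(2); return (seconds, bytes_read, errno_name_or_None)."""
    start = time.monotonic()
    try:
        data = os.read(fd, size)
        err = None
    except OSError as e:
        # Report the symbolic errno name (e.g. EHOSTDOWN) when it is known.
        data = b""
        err = errno.errorcode.get(e.errno, str(e.errno))
    return time.monotonic() - start, len(data), err


def probe(path, count=32, size=4096):
    """Read `count` sequential blocks, recording how each call behaved."""
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(count):
            results.append(timed_read(fd, size))
    finally:
        os.close(fd)
    return results
```

Run probe() against a file on the cifs mount and pull the cable partway
through; the elapsed time and errno of each call after that point are
what distinguish the cases.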


Steven French/Austin/IBM wrote on 08/18/2006 05:45:17 PM:

> Shirish did some useful research today, confirming that there is one
> cifs write case (case d below) that needs more investigation in regards
> to retry behavior when the cable is unplugged (or network or server is
> down). Cases a) and b) look fine. See below:
> 
> a) with hard option, read/write block forever but resume once the link
> is restored.
> 
> b) with forcedirectio, read/write block for approximately 30/50 seconds
> or so and then return with return code of -1 and errno set to 0x70 (112
> EHOSTDOWN).
> 
> c) with default options, read blocks for approximately 30 seconds or so
> and then returns without any error, with return code of 0x0. The read
> call keeps reading for 3 seconds, then blocks one more time for
> approximately 30 seconds and returns with return code of -1 and errno
> set to 0x70. I do not know what it is reading (cached data) for those
> three seconds without returning an error code.
> 
> d) with default option, write blocks for approximately 50 seconds and
> then keeps on writing every 10 seconds without any error. I do not know
> what it is writing and where.
> 
> 
> Steve French
> Senior Software Engineer
> Linux Technology Center - IBM Austin
> phone: 512-838-2294
> email: sfrench at-sign us dot ibm dot com


