[linux-cifs-client] close-to-open cache consistency and cifs

Fri Dec 19 16:26:42 GMT 2008

On Fri, 19 Dec 2008 09:50:58 -0600
Steven French <sfrench at us.ibm.com> wrote:

> Yes - Shirish is correct.
> 
> We don't reconnect on close because close on the server side is done 
> automatically when the session drops.   Before we get to cifs_close on the 
> client side, the VFS (fput) does flush so we have written out all dirty 
> pages by this point.   We have talked about trying filemap_fdatawait twice 
> in flush to handle errors, but by the time we get to close we have tried 
> hard to flush the pages.
> 

If that's the case, then why are we bothering with all of this waiting
in cifs_close? I have to express some reservation with any scheme that
is basically "wait for a little while and then give up". There are
places where that's appropriate, but it doesn't seem like flush/close
is one of them.

> On the point of close to open consistency, I don't believe NFS does it in 
> a way that is safe, and couldn't really do it in any case without a close 
> and open operation.   Between the write and the getattr can be writes from 
> other clients which are not reflected in the page cache on the client, and 
> this could be a very big race if time stamps are not very granular (e.g. 
> if 1 second ext3 time stamps).   CIFS can get the time from the close 
> request of course, which makes it much safer in that regard.   

Actually, I misspoke. NFS uses the attrs on the last write reply when
closing. That should make the scheme safe (well, at least as safe as it
can be assuming sufficient time granularity).

> Should make a correction ... we can set the time on SMB Close
> 
> SMB2 close can return the last write time (which is more useful).

That's a good point. We can't guarantee CTO consistency if we don't get
attributes from the last write call. I don't think even getting the
attrs from SMB2 close is sufficient unless we were holding an oplock.
If we don't have an oplock then another writer could race in and change
the file before we issue the close.

> If there is a failure in filemap_fdatawrite then we should be retrying at 
> least once (remember that we could be failing due to memory pressure that 
> is unresolvable unless we return back to flush), and we are not going to 
> close a file handle while there are writes on that handle (that has 
> nothing to do with flush/close races ... that is simply due to the fact 
> that cifs_writepages does not have any way of picking the correct file 
> handle and can pick one that is going to be closed as soon as 
> filemap_fdatawrite/filemap_fdatawait finishes - remember that the kernel 
> does not tell us which file handle to use in writepages so when a file is 
> open multiple times we may have to wait in close if writepages is using 
> the "wrong" handle).
> 
> In flush we could be doing filemap_fdatawait but I don't think it matters. 
>   Our writes in cifs_writepages are synchronous anyway.

I suppose my point here is that there are possible situations where the
server filehandle might never be closed. The scheme in cifs_flush is
"wait for a little while and then give up". Once we "give up", what
closes the filehandle?

-- 
Jeff Layton <jlayton at redhat.com>