[linux-cifs-client] close-to-open cache consistency and cifs

Steven French sfrench at us.ibm.com
Fri Dec 19 16:01:14 GMT 2008


If there is a failure in filemap_fdatawrite then we should be retrying at 
least once (remember that we could be failing due to memory pressure that 
is unresolvable unless we return back to flush), and we are not going to 
close a file handle while there are writes on that handle (that has 
nothing to do with flush/close races ... that is simply due to the fact 
that cifs_writepages does not have any way of picking the correct file 
handle and can pick one that is going to be closed as soon as 
filemap_fdatawrite/filemap_fdatawait finishes - remember that the kernel 
does not tell us which file handle to use in writepages so when a file is 
open multiple times we may have to wait in close if writepages is using 
the "wrong" handle).

In flush we could be doing filemap_fdatawait but I don't think it matters. 
  Our writes in cifs_writepages are synchronous anyway.


Steve French
Senior Software Engineer
Linux Technology Center - IBM Austin
phone: 512-838-2294
email: sfrench at-sign us dot ibm dot com



Jeff Layton <jlayton at redhat.com> 
Sent by: linux-cifs-client-bounces+sfrench=us.ibm.com at lists.samba.org
12/19/2008 09:05 AM

To
"Shirish Pargaonkar" <shirishpargaonkar at gmail.com>
cc
ssorce at redhat.com, linux-cifs-client at lists.samba.org
Subject
Re: [linux-cifs-client] close-to-open cache consistency and cifs






On Fri, 19 Dec 2008 08:44:53 -0600
"Shirish Pargaonkar" <shirishpargaonkar at gmail.com> wrote:

> On Fri, Dec 19, 2008 at 7:08 AM, Jeff Layton <jlayton at redhat.com> wrote:
> > One of the things that NFS implements is close-to-open cache
> > consistency. NFS flushes all pages to the server on close and then 
does
> > a GETATTR call to make sure the attribute cache is up to date. Then, 
on
> > open() it does another getattr to ensure that the file hasn't changed.
> > This allows it to avoid unnecessary cache invalidation. It also does
> > some similar things around posix locks so that they are also used as
> > cache consistency points.
> >
> > I think we should shoot for implementing the same semantics in CIFS. 
To
> > do this though, we need to tighten up the behavior at close and open:
> >
> > cifs_close has this:
> >
> >                if (pTcon) {
> >                        /* no sense reconnecting to close a file that 
is
> >                           already closed */
> >                        if (!pTcon->need_reconnect) {
> >                                write_unlock(&GlobalSMBSeslock);
> >                                timeout = 2;
> >                                while 
((atomic_read(&pSMBFile->wrtPending) != 0)
> >
> > ...I'm not sure that I agree with the "no sense" comment there. What 
if
> > there are still dirty pages? Should we turn all of that business into 
a
> > filemap_fdatawait (or just call cifs_fsync)?
> >
> > --
> > Jeff Layton <jlayton at redhat.com>
> > _______________________________________________
> > linux-cifs-client mailing list
> > linux-cifs-client at lists.samba.org
> > https://lists.samba.org/mailman/listinfo/linux-cifs-client
> >
> 
> 
> Jeff,
> 
> Does not flush happen before close, and cifs_flush calls
> filemap_fdatawrite, so cifs does flush out pages.
>

Consider this scenario:

We go to close the file and cifs_flush is called. The server times out
while trying to write out the pages and needs_reconnect gets set.  We
then call into cifs_close, needs_reconnect is set so we don't issue the
close call. cifs_writepages picks back up again and then we end up
reconnecting the socket and reopening the file to flush the pages. When
does that FH actually get closed?

To come back to my original point...we want to get a read of the mtime of
the file after we've flushed all the pages. This allows us to recheck the
mtime when we reopen a file and determine whether it has been changed and
the cache is consistent. We can't do that in a deterministic fashion
though unless we can ensure that all of the pages have been flushed.

The problem here is that we have a lot places in this code where we sleep
for a little while and wait for something to occur, and eventually give
up. This is:

(a) inefficient since you periodically have to wake up to check the
condition

(b) wrong. If you're planning to give up after a while, then why wait
in the first place? Either you need to wait for the thing to happen or
not.

IMO, if we're going to just kick off filemap_fdatawrite in the cifs_flush
then we should do a filemap_fdatawait in cifs_close to ensure that the
pages under writeback have all actually been flushed. That gets rid of
the need as well for all of this unreliable "sleep for a little while"
business as well...

-- 
Jeff Layton <jlayton at redhat.com>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client at lists.samba.org
https://lists.samba.org/mailman/listinfo/linux-cifs-client




More information about the linux-cifs-client mailing list