[linux-cifs-client] close-to-open cache consistency and cifs

Fri Dec 19 15:05:53 GMT 2008

On Fri, 19 Dec 2008 08:44:53 -0600
"Shirish Pargaonkar" <shirishpargaonkar at gmail.com> wrote:

> On Fri, Dec 19, 2008 at 7:08 AM, Jeff Layton <jlayton at redhat.com> wrote:
> > One of the things that NFS implements is close-to-open cache
> > consistency. NFS flushes all pages to the server on close and then does
> > a GETATTR call to make sure the attribute cache is up to date. Then, on
> > open() it does another getattr to ensure that the file hasn't changed.
> > This allows it to avoid unnecessary cache invalidation. It also does
> > some similar things around posix locks so that they are also used as
> > cache consistency points.
> >
> > I think we should shoot for implementing the same semantics in CIFS. To
> > do this though, we need to tighten up the behavior at close and open:
> >
> > cifs_close has this:
> >
> >                if (pTcon) {
> >                        /* no sense reconnecting to close a file that is
> >                           already closed */
> >                        if (!pTcon->need_reconnect) {
> >                                write_unlock(&GlobalSMBSeslock);
> >                                timeout = 2;
> >                                while ((atomic_read(&pSMBFile->wrtPending) != 0)
> >
> > ...I'm not sure that I agree with the "no sense" comment there. What if
> > there are still dirty pages? Should we turn all of that business into a
> > filemap_fdatawait (or just call cifs_fsync)?
> >
> > --
> > Jeff Layton <jlayton at redhat.com>
> > _______________________________________________
> > linux-cifs-client mailing list
> > linux-cifs-client at lists.samba.org
> > https://lists.samba.org/mailman/listinfo/linux-cifs-client
> >
> 
> 
> Jeff,
> 
> Does not flush happen before close, and cifs_flush calls
> filemap_fdatawrite, so cifs does flush out pages.
>

Consider this scenario:

We go to close the file and cifs_flush is called. The server times out
while trying to write out the pages and needs_reconnect gets set.  We
then call into cifs_close, needs_reconnect is set so we don't issue the
close call. cifs_writepages picks back up again and then we end up
reconnecting the socket and reopening the file to flush the pages. When
does that FH actually get closed?

To come back to my original point...we want to get a read of the mtime of
the file after we've flushed all the pages. This allows us to recheck the
mtime when we reopen a file and determine whether it has been changed and
the cache is consistent. We can't do that in a deterministic fashion
though unless we can ensure that all of the pages have been flushed.

The problem here is that we have a lot places in this code where we sleep
for a little while and wait for something to occur, and eventually give
up. This is:

(a) inefficient since you periodically have to wake up to check the
condition

(b) wrong. If you're planning to give up after a while, then why wait
in the first place? Either you need to wait for the thing to happen or
not.

IMO, if we're going to just kick off filemap_fdatawrite in the cifs_flush
then we should do a filemap_fdatawait in cifs_close to ensure that the
pages under writeback have all actually been flushed. That gets rid of
the need as well for all of this unreliable "sleep for a little while"
business as well...

-- 
Jeff Layton <jlayton at redhat.com>