[linux-cifs-client] Re: fsx-linux failing with latest cifs-2.6 git tree

Jeff Layton jlayton at redhat.com
Sat Nov 22 15:39:16 GMT 2008


On Fri, 21 Nov 2008 22:47:09 -0600
Dave Kleikamp <shaggy at linux.vnet.ibm.com> wrote:

> Nick's point is that it isn't really safe to mark the page Uptodate
> until cifs_write_end(), since the data being overwritten hasn't yet
> been.
> 
> How about cifs_write_begin() marking the page PageChecked instead of
> PageUptodate, of course leaving it PageUptodate if it already is?  Then
> cifs_write_end() only has to handle the slow path when the page is
> neither PageUptodate or PageChecked.
> 
> PageChecked == PG_owner_priv_1, which is reserved for whatever the
> filesystem wants to use it for.
> 

Right -- Nick's point about the uptodate flag was definitely correct,
though I'm not sure I understand when we'd get less data eventually
copied to the page than we expected in write_begin...

I had the same thought last night about using a different flag. I was
considering PG_private, but PG_checked sounds like a better choice.
Thoughts on this patch? I've run some basic regression tests against
it and it seems to work.

---------------------[snip]-------------------
From 2ac44918f498f722ccfd3e5293ad0b966a9261ca Mon Sep 17 00:00:00 2001
From: Jeff Layton <jlayton at redhat.com>
Date: Sat, 22 Nov 2008 10:12:49 -0500
Subject: [PATCH] cifs: fix regression in cifs_write_begin/cifs_write_end

The conversion to write_begin/write_end interfaces had a bug where we
were passing a bad parameter to cifs_readpage_worker. Rather than
passing the page offset of the start of the write, we needed to pass the
offset of the beginning of the page. This was reliably showing up as
data corruption in the fsx-linux test from LTP.

It also became evident that this code was occasionally doing unnecessary
read calls. Optimize those away by co-opting the PG_checked flag and
using that in place of PG_uptodate to indicate whether cifs_write_end
should do a sync write.

Signed-off-by: Jeff Layton <jlayton at redhat.com>
Signed-off-by: Steve French <smfrench at us.ibm.com>
---
 fs/cifs/file.c |   83 +++++++++++++++++++++++++++++++++++++++++---------------
 1 files changed, 61 insertions(+), 22 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index b691b89..f890e08 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1475,10 +1475,11 @@ static int cifs_write_end(struct file *file, struct address_space *mapping,
 	cFYI(1, ("write_end for page %p from pos %lld with %d bytes",
 		 page, pos, copied));
 
-	if (!PageUptodate(page) && copied == PAGE_CACHE_SIZE)
-		SetPageUptodate(page);
-
-	if (!PageUptodate(page)) {
+	/*
+	 * if PG_checked is set, then the parts of the page that weren't
+	 * copied over are expected to be up to date.
+	 */
+	if (!PageChecked(page)) {
 		char *page_data;
 		unsigned offset = pos & (PAGE_CACHE_SIZE - 1);
 		int xid;
@@ -1496,6 +1497,8 @@ static int cifs_write_end(struct file *file, struct address_space *mapping,
 
 		FreeXid(xid);
 	} else {
+		SetPageUptodate(page);
+		ClearPageChecked(page);
 		rc = copied;
 		pos += copied;
 		set_page_dirty(page);
@@ -2056,45 +2059,81 @@ bool is_size_safe_to_change(struct cifsInodeInfo *cifsInode, __u64 end_of_file)
 		return true;
 }
 
+/*
+ * Prepare a page for writing. cifs_write_end expects that the page will have
+ * PG_checked set if the unwritten parts of the page are up to date.
+ */
 static int cifs_write_begin(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned flags,
 			struct page **pagep, void **fsdata)
 {
 	pgoff_t index = pos >> PAGE_CACHE_SHIFT;
 	loff_t offset = pos & (PAGE_CACHE_SIZE - 1);
+	loff_t page_start = pos & PAGE_MASK;
+	loff_t i_size;
+	struct inode *inode;
+	struct page *page;
+	char *page_data;
+	int rc = 0;
 
 	cFYI(1, ("write_begin from %lld len %d", (long long)pos, len));
 
-	*pagep = __grab_cache_page(mapping, index);
-	if (!*pagep)
-		return -ENOMEM;
+	page = __grab_cache_page(mapping, index);
+	if (!page) {
+		rc = -ENOMEM;
+		goto out;
+	}
 
-	if (PageUptodate(*pagep))
-		return 0;
+	if (PageUptodate(page)) {
+		SetPageChecked(page);
+		goto out;
+	}
 
 	/* If we are writing a full page it will be up to date,
 	   no need to read from the server */
-	if (len == PAGE_CACHE_SIZE && flags & AOP_FLAG_UNINTERRUPTIBLE)
-		return 0;
-
-	if ((file->f_flags & O_ACCMODE) != O_WRONLY) {
-		int rc;
+	if (len == PAGE_CACHE_SIZE && flags & AOP_FLAG_UNINTERRUPTIBLE) {
+		SetPageChecked(page);
+		goto out;
+	}
 
-		/* might as well read a page, it is fast enough */
-		rc = cifs_readpage_worker(file, *pagep, &offset);
+	/*
+	 * optimize away the read when we have an oplock, and we're not
+	 * expecting to use any of the data we'd be reading in. That is, when
+	 * the page lies beyond the EOF, or straddles the EOF and the write
+	 * would cover all of the existing data in the file.
+	 */
+	inode = page->mapping->host;
+	if (CIFS_I(inode)->clientCanCacheRead) {
+		i_size = i_size_read(inode);
+		if (page_start >= i_size ||
+		    (offset == 0 && (pos + len) >= i_size)) {
+			page_data = kmap(page);
+			memset(page_data, 0, offset);
+			memset(page_data + offset + len, 0,
+			       PAGE_CACHE_SIZE - (offset + len));
+			kunmap(page);
+			SetPageChecked(page);
+			goto out;
+		}
+	}
 
-		/* we do not need to pass errors back
-		   e.g. if we do not have read access to the file
-		   because cifs_write_end will attempt synchronous writes
-		   -- shaggy */
+	if ((file->f_flags & O_ACCMODE) != O_WRONLY) {
+		/*
+		 * might as well read a page, it is fast enough. If we get
+		 * an error, we don't need to return it. cifs_write_end can
+		 * do a sync write instead.
+		 */
+		if (!cifs_readpage_worker(file, page, &page_start))
+			SetPageChecked(page);
 	} else {
 		/* we could try using another file handle if there is one -
 		   but how would we lock it to prevent close of that handle
 		   racing with this read? In any case
 		   this will be written out by write_end so is fine */
 	}
-
-	return 0;
+out:
+	*pagep = page;
+	return rc;
 }
 
 const struct address_space_operations cifs_addr_ops = {
-- 
1.5.5.1



More information about the linux-cifs-client mailing list