[linux-cifs-client] OOM kills when running fsstress on CIFS

Jeff Layton jlayton at redhat.com
Tue May 25 05:49:29 MDT 2010


On Tue, 25 May 2010 21:16:39 +1000
Nick Piggin <npiggin at suse.de> wrote:

> On Tue, May 25, 2010 at 06:57:05AM -0400, Jeff Layton wrote:
> > Since 2.6.34, I've been able to consistently reproduce OOM kills when running fsstress (from the LTP suite) on CIFS. I spent some time yesterday and bisected it down to this patch:
> > 
> > ---------------------[snip]---------------------
> > commit 315e995c63a15cb4d4efdbfd70fe2db191917f7a
> > Author: Nick Piggin <npiggin at suse.de>
> > Date:   Wed Apr 21 03:18:28 2010 +0000
> > 
> >     [CIFS] use add_to_page_cache_lru
> >     
> >     add_to_page_cache_lru is exported, so it should be used. Benefits over
> >     using a private pagevec: neater code, 128 bytes fewer stack used, percpu
> >     lru ordering is preserved, and finally don't need to flush pagevec
> >     before returning so batching may be shared with other LRU insertions.
> >     
> >     Signed-off-by: Nick Piggin <npiggin at suse.de>
> >     Reviewed-by: Dave Kleikamp <shaggy at linux.vnet.ibm.com>
> >     Signed-off-by: Steve French <sfrench at us.ibm.com>
> > ---------------------[snip]---------------------
> > 
> > Here's how I've been reproducing it:
> > 
> > Mount up a samba share with -o sec=krb5i,nounix,noserverino
> > 
> > Run: fsstress -d /path/to/dir/on/cifs/ -n 1000 -l0 -p8
> > 
> > ...within an hour or two, I start getting OOM kills. After backing out
> > the patch above, I was able to run the test overnight. I'm not sure yet
> > what the actual problem is, but there seems to be something wrong with
> > that patch.
> > 
> > Thoughts?
> 
> Yep, it's my fault. The problem is the refcounting. Previously the
> code hands off the references to the LRU, wheras now the lru takes
> a new reference. (the other filesystems converted to use this
> function seemed to more conventionally open-code lru_cache_add).
> 
> Can we get rid of a refcount increment anywhere? Otherwise we'll
> need to just drop the references after adding the pages.
> 

The only caller of this function is cifs_readpages, and I don't see
where it takes any references. I'm guessing that the pages come from
the VFS with the refcount already incremented?

-- 
Jeff Layton <jlayton at redhat.com>


More information about the linux-cifs-client mailing list