[linux-cifs-client] Re: linux-cifs-client Digest, Vol 39, Issue 19

Steve French (smfltc) smfltc at us.ibm.com
Mon Feb 26 04:36:37 GMT 2007


linux-cifs-client-request at lists.samba.org wrote:

>
>Today's Topics:
>
>   1. i_mutex and deadlock (Steve French (smfltc))
>   2. Re: i_mutex and deadlock (Dave Kleikamp)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Fri, 23 Feb 2007 10:02:16 -0600
>From: "Steve French (smfltc)" <smfltc at us.ibm.com>
>Subject: [linux-cifs-client] i_mutex and deadlock
>To: linux-fsdevel at vger.kernel.org
>Cc: linux-cifs-client at lists.samba.org
>Message-ID: <45DF1008.3090903 at us.ibm.com>
>Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>A field in i_size_write (i_size_seqcount) must be protected against
>simultaneous update; otherwise we risk looping forever in i_size_read.
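
For reference, the 32-bit SMP variant of these helpers in
include/linux/fs.h looks roughly like this (condensed; on 64-bit the
read and write are a single load and store, so none of this applies):

    static inline loff_t i_size_read(struct inode *inode)
    {
            loff_t i_size;
            unsigned int seq;

            do {
                    seq = read_seqcount_begin(&inode->i_size_seqcount);
                    i_size = inode->i_size;
            } while (read_seqcount_retry(&inode->i_size_seqcount, seq));
            return i_size;
    }

    static inline void i_size_write(struct inode *inode, loff_t i_size)
    {
            write_seqcount_begin(&inode->i_size_seqcount);
            inode->i_size = i_size;
            write_seqcount_end(&inode->i_size_seqcount);
    }

The seqcount increments on the write side are plain non-atomic updates,
so two racing writers can lose an increment and leave i_size_seqcount
odd - after which every i_size_read retry loop spins forever.
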
>
>The suggestion in fs.h is to use i_mutex, which seems too dangerous due
>to the possibility of deadlock.
>
>There are 65 places in the fs directory which take i_mutex, and seven
>more in the mm directory.  The vfs clearly does lock file inodes in
>some paths before calling into a particular filesystem (nfs, ext3, cifs,
>etc.) - in particular for fsync, but probably for others that are harder
>to trace.  This introduces the possibility of deadlock if a
>filesystem also uses i_mutex to protect file size updates.
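
To make the deadlock concern concrete: do_fsync takes i_mutex around
the filesystem's fsync method, so if a size-update helper in the
filesystem (the cifs helper named below is hypothetical) also took
i_mutex, the same task would block on a mutex it already holds:

    /*
     * do_fsync() in fs/sync.c, roughly:
     *
     *   mutex_lock(&mapping->host->i_mutex);   <- vfs takes i_mutex
     *   err = file->f_op->fsync(file, ...);    <- enters cifs_fsync()
     *     ...
     *     cifs_update_file_size()              <- hypothetical helper
     *       mutex_lock(&inode->i_mutex);       <- same mutex, same task:
     *                                             deadlock
     */
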
>
>Documentation/filesystems/Locking describes the use of i_mutex (formerly
>"i_sem") and indicates that it is held by the vfs across three
>additional calls on file inodes which concern me as deadlock
>possibilities: setattr, truncate and unlink.
>
>nfs seems to limit its use of i_mutex to llseek and invalidate_mapping,
>and does not appear to grab i_mutex (or any semaphore, for that matter)
>to protect i_size_write (nfs calls i_size_write in nfs_grow_file).  In
>the case of nfs_fhget (which bypasses i_size_write and sets i_size
>directly) it does not seem to grab i_mutex either.
>
>ext3 also does not use i_mutex for this purpose (protecting
>i_size_write) - only to protect a journalling ioctl.
>
>I am concerned about using i_mutex to protect the cifs calls to 
>i_size_write (although it seems to fix a problem reported in i_size_read 
>under stress) because of the following:
>
>1) no one else calls i_size_write AFAIK (on our file inodes)
>2) we don't block inside i_size_write, do we? ... (so why in the world
>take a slow mutex instead of a fast spinlock)
>3) we don't really know what happens inside fsync (the paths through the
>page cache code seem complex, and we don't want to reenter writepage in
>low-memory conditions and deadlock updating the file size), and there is
>some concern that the vfs takes i_mutex in other paths on file
>inodes before entering our code and could deadlock.
>
>Any reason why an fs shouldn't simply use something else (a spinlock)
>instead of i_mutex to protect the i_size_write call?
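
Something like this minimal sketch is what I have in mind - i_size_write
itself never sleeps (it is just the seqcount bump around a store), so a
spinlock is sufficient:

    spin_lock(&inode->i_lock);       /* serialize writers; no sleeping */
    i_size_write(inode, new_size);
    spin_unlock(&inode->i_lock);
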
>
>
>------------------------------
>
>Message: 2
>Date: Fri, 23 Feb 2007 10:29:53 -0600
>From: Dave Kleikamp <shaggy at linux.vnet.ibm.com>
>Subject: Re: [linux-cifs-client] i_mutex and deadlock
>To: "Steve French (smfltc)" <smfltc at us.ibm.com>
>Cc: linux-fsdevel at vger.kernel.org, linux-cifs-client at lists.samba.org
>Message-ID: <1172248194.15810.10.camel at kleikamp.austin.ibm.com>
>Content-Type: text/plain
>
>On Fri, 2007-02-23 at 10:02 -0600, Steve French (smfltc) wrote:
>  
>
>>A field in i_size_write (i_size_seqcount) must be protected against
>>simultaneous update; otherwise we risk looping forever in i_size_read.
>>
>>The suggestion in fs.h is to use i_mutex, which seems too dangerous due
>>to the possibility of deadlock.
>>    
>>
>
>I'm not sure it's so much a suggestion as a way of documenting the
>locking that exists (or existed when the comment was written).
>
>"... i_size_write() does need locking around it  (normally i_mutex) ..."
>
>  
>
>>There are 65 places in the fs directory which take i_mutex, and seven
>>more in the mm directory.  The vfs clearly does lock file inodes in
>>some paths before calling into a particular filesystem (nfs, ext3, cifs,
>>etc.) - in particular for fsync, but probably for others that are harder
>>to trace.  This introduces the possibility of deadlock if a
>>filesystem also uses i_mutex to protect file size updates.
>>
>>I am concerned about using i_mutex to protect the cifs calls to 
>>i_size_write (although it seems to fix a problem reported in i_size_read 
>>under stress) because of the following:
>>
>>1) no one else calls i_size_write AFAIK (on our file inodes)
>>    
>>
>
>I think you're right.
>
>  
>
I produced a patch (attached to
http://bugzilla.kernel.org/show_bug.cgi?id=7903) which fixes this.  It
has tested out fine, has no side effects or changes outside of cifs, and
simply uses inode->i_lock (a spinlock) to protect i_size_write calls on
cifs inodes.  That seemed faster and safer, especially as there was only
one other path outside the cifs code which calls i_size_write on our
inodes (vmtruncate in mm/memory.c), and that was easy to work around
since cifs could call its own version of that small function.  This is
also a lot easier to prove correct and to understand: there were 72
places outside of cifs (ie in mm and fs) which take i_mutex, many of
which would be potential deadlocks if cifs used i_mutex to protect its
i_size_write calls.  (Network filesystems update the file size in more
places than a local filesystem does, of course, so we have to be careful
about grabbing the same mutex twice.)
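
In outline, every cifs update of the file size now funnels through a
small helper along these lines (the name here is illustrative - see the
bugzilla attachment for the actual patch):

    static void cifs_i_size_write(struct inode *inode, loff_t end_of_file)
    {
            /* writers serialized by i_lock rather than i_mutex */
            spin_lock(&inode->i_lock);
            i_size_write(inode, end_of_file);
            spin_unlock(&inode->i_lock);
    }

and cifs gets a private copy of vmtruncate that takes the same spinlock
before updating i_size.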

I don't want to push this over akpm's objections, as he knows the mm
code well, but it seems the simplest and safest approach.

