fs:cifs deadlock due to server drop-off

david.kondrad at legrand.us
Fri Sep 3 08:07:48 MDT 2010


Greetings:

I'm not sure whether this list covers the kernel cifs implementation,
but here goes:

There's a nasty deadlock possibility with kernel cifs code whereby you
connect to a server (say a share on a laptop) and then that laptop
leaves the network while you are still connected.

If you then try to perform any access (ls, read, write, etc.) or
unmount the share, the command that issued the filesystem call will
deadlock until that device returns to the network. In most cases, you
will be waiting a long time, since that device has gone home with a
co-worker for the day!

Upon looking at fs/cifs/connect.c, it appears that there are two main
factors contributing to this issue:

Issue 1:

cifs_reconnect loops while tcpStatus is neither CifsExiting nor
CifsGood. Since the server is no longer on the network, every attempt
to open a socket to it fails, and thus tcpStatus never transitions to
either of those states.
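
To make the loop shape concrete, here is a rough userspace model of it.
This is only a sketch, not the code from fs/cifs/connect.c: the enum
values, try_connect(), reconnect_model() and the max_iters guard are all
made-up names, and the guard exists purely so the demonstration
terminates.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tcpStatus values named above. */
enum tcp_status { CIFS_NEED_RECONNECT, CIFS_GOOD, CIFS_EXITING };

/* Hypothetical connect attempt: succeeds only if the peer is reachable. */
static bool try_connect(bool server_reachable)
{
    return server_reachable;
}

/*
 * Models the loop described above. max_iters is a demo-only bound;
 * the loop as described in this post has no such escape.
 * Returns the number of failed connection attempts.
 */
static unsigned int reconnect_model(enum tcp_status *status,
                                    bool server_reachable,
                                    unsigned int max_iters)
{
    unsigned int attempts = 0;

    while (*status != CIFS_EXITING && *status != CIFS_GOOD) {
        if (attempts == max_iters)
            break;                  /* demo-only escape hatch */
        if (try_connect(server_reachable))
            *status = CIFS_GOOD;    /* the only way out of the loop */
        else
            attempts++;             /* real code would sleep and retry */
    }
    return attempts;
}
```

With server_reachable set to false, the status never changes and only
the artificial bound ends the loop, which is the behaviour seen when the
laptop has left the network.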

Issue 2:

Callers of cifs_reconnect do not check the return code of the function
and automatically assume that if it returns then the connection is
good. This may be why the loop was put there in the first place
(kludge?). So, even if the code were modified to break out of the loop
after a number of retries, the calling code would also have to be aware
that cifs_reconnect may return with an error (in fact it already does
in a few instances).
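
A minimal sketch of the direction both issues point to, assuming a
bounded retry count and a caller that checks the result. Everything here
is hypothetical (struct conn, bounded_reconnect(), do_request(), the
retry limit of 3 and the choice of errno values); it is a userspace
illustration, not a patch against the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical connection states, loosely mirroring tcpStatus. */
enum conn_status { CONN_NEED_RECONNECT, CONN_GOOD, CONN_EXITING };

struct conn {
    enum conn_status status;
    bool server_reachable;      /* models whether the peer is on the network */
};

/*
 * Try to reconnect at most max_retries times.
 * Returns 0 on success or a negative errno value on failure.
 */
static int bounded_reconnect(struct conn *c, unsigned int max_retries)
{
    unsigned int attempt;

    for (attempt = 0; attempt < max_retries; attempt++) {
        if (c->status == CONN_EXITING)
            return -ECANCELED;  /* teardown in progress, stop retrying */
        if (c->server_reachable) {
            c->status = CONN_GOOD;
            return 0;
        }
        /* real code would sleep between attempts */
    }
    return -EHOSTUNREACH;       /* give up instead of looping forever */
}

/* Caller that propagates the error rather than assuming success. */
static int do_request(struct conn *c)
{
    if (c->status != CONN_GOOD) {
        int rc = bounded_reconnect(c, 3);
        if (rc)
            return rc;
    }
    return 0;                   /* the request itself would be sent here */
}
```

The point of the second function is that an ls or umount against a
vanished server would get an error back instead of blocking.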

Unfortunately, we're using an ancient kernel for our embedded product
(2.6.10 + many patches from Montavista), so any patches I make will
certainly not apply upstream. That said, I'm fairly certain I've also
seen this on my openSUSE 11.2 install.

Does anyone have enough experience with this code to be able to fix
this upstream while I attempt to fix it for the version we have? Is
there anything I need to be especially careful of in order not to
introduce regressions in that code?

Best regards,
Dave

--
David Kondrad
Software Design Engineer
Home Systems Division
Legrand, North America

717.546.5442
david.kondrad at legrand.us
www.legrand.us/onq


