[linux-cifs-client] system lockup copying files to cifs (2.6.9-mm1)

Steven French sfrench at us.ibm.com
Mon Dec 6 20:52:55 GMT 2004


> When I back up my system to a cifs filesystem, the system hangs
> randomly. The hang is rare, roughly once per 10 GB copied.

Thanks for the information. I am not aware of an exact match for the 
problem you are seeing, but a little more information could help prove it 
one way or the other.

It is not that unusual, in my experience, for the server (or client) 
tcp/ip stack to get stuck for 30+ seconds sending data, at which point the 
client hangs up the session and -EAGAIN (rc=11) is returned - especially 
after copying gigabytes of data (lots of strange things can happen below 
the sockets layer, even without buggy hardware or routers).  The client 
normally recovers from that fine (ie it reopens the session and retries 
the operation after getting EAGAIN).  The real issue is the hang, so to 
debug that: if you can get the kernel stack traces of the active processes 
dumped to dmesg via sysrq (in /proc), that would be helpful.  The cifs 
code has, for a while now, dumped active requests - of course, if the 
whole system is frozen (and you can't get to the client keyboard), that is 
harder, unless you can run a network trace on the server and catch the 
last smb/cifs request sent or the last response received (my guess is that 
it will be SMBwriteX or SMBreadX, of course, but it would be helpful to 
know that the client actually sent it).  If it is just that one session 
that hangs, I would love to see the output of /proc/fs/cifs/DebugData from 
a more recent version, which logs the active operations (smb multiplex 
ids), including the pending command(s).
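Roughly, the stack-trace dump could be captured like this (a sketch, 
assuming magic-sysrq support is built into the kernel; the guard only 
matters if /proc/sysrq-trigger is missing or not writable):

```shell
# Sketch: ask the kernel to dump every task's stack trace to the
# kernel log, then read it back with dmesg.  Requires root and a
# kernel built with magic-sysrq support.
if [ -w /proc/sysrq-trigger ]; then
    echo t > /proc/sysrq-trigger   # 't' = dump task states and stacks
    dmesg | tail -n 200            # traces land in the kernel ring buffer
else
    echo "need root and magic-sysrq support"
fi
```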

Low memory issues have plagued the current, somewhat primitive 
implementation of cifs_writepage - since it is not asynchronous, the 
client will cache aggressively.

Any chance you could run the refreshed version of the code (it simply 
overlays the files in the fs/cifs directory)?

And of course, if you are backing up (large sequential reads or writes), 
you are probably better off using even more current code (I will need to 
post a version of this to the project page, or you would need to pick up 
2.6.10-rc3-mm5 when it comes out).  With it you can mount with the new 
"direct" flag, which bypasses the client page cache on reads and writes 
and could give huge memory improvements (because client buffers are not 
aggressively allocated to cache the client-side inodes on that mount 
point).
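For example, the backup share could then be mounted with an fstab line 
along these lines (server name, mount point, and credentials are 
placeholders, and the option name assumes the flag is spelled "direct" as 
above):

```
//server/backup  /mnt/backup  cifs  username=guest,direct  0  0
```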

Steve French
Senior Software Engineer
Linux Technology Center - IBM Austin
phone: 512-838-2294
email: sfrench at-sign us dot ibm dot com

