[linux-cifs-client] system lockup copying files to cifs
(2.6.9-mm1)
Steven French
sfrench at us.ibm.com
Mon Dec 6 20:52:55 GMT 2004
> When I back up my system to a cifs filesystem, my system hangs randomly. The
> hang happens rarely, roughly once per 10 GB.
Thanks for the information. I am not aware of an exact match for the
problem you are seeing, but a little more information could help prove it
one way or the other.
That the server (or client) TCP/IP stack gets stuck for 30+ seconds
sending data, the client then hangs the session up, and -EAGAIN (rc = 11) is
returned is not that unusual in my experience - especially after copying
gigabytes of data (lots of strange things can happen below the sockets
layer, even without buggy hardware or routers). The client normally
recovers from that fine (ie it reopens the session and retries the
operation after getting EAGAIN). The real issue is the hang, so to debug
that: if you can get the kernel stack traces of the active processes
dumped via sysrq (in /proc) to dmesg, that would be helpful. The cifs
code has for a while been able to dump its active requests. Of course, if
the whole client system is frozen (and you can't get to the client
keyboard) that is harder, unless you can run a network trace on the server
and catch the last smb/cifs request sent or the last response received (my
guess is that it will be an SMBwriteX or SMBreadX, of course, but it would
be helpful to know that the client actually sent it). If just that one
session is hung, I would love to see the output of
/proc/fs/cifs/DebugData from a more recent version, which logs the active
operations (smb multiplex ids), including the pending command(s).
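If the machine still responds, the stack dumps and the cifs debug state can be captured roughly as follows (a sketch using the standard Linux sysrq and /proc paths; run as root, and note the file names are just illustrative):

```shell
# Enable the sysrq key handling, then dump all task stack traces
# into the kernel log and save them
echo 1 > /proc/sys/kernel/sysrq
echo t > /proc/sysrq-trigger
dmesg > sysrq-stacks.txt

# Capture the cifs module's view of sessions and pending requests
cat /proc/fs/cifs/DebugData > cifs-debug.txt
```

Pressing Alt-SysRq-T at the console has the same effect as writing 't' to the trigger file, which can help when the shell itself is wedged.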
Low-memory issues have plagued the current, somewhat primitive
implementation of cifs_writepage - since it is not asynchronous, the
client caches aggressively.
Any chance you could run the refreshed version of the code? (It simply
overlays the files in the fs/cifs directory.)
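The overlay step can be sketched like this, assuming a 2.6 kernel source tree at /usr/src/linux and the refreshed cifs sources unpacked in ./cifs (both paths are illustrative assumptions, not from the original mail):

```shell
# Copy the refreshed cifs sources over the in-tree ones
cp cifs/*.c cifs/*.h /usr/src/linux/fs/cifs/

# Rebuild only the cifs module against the running tree
# (SUBDIRS= was the usual 2.6-era way to build a single directory)
cd /usr/src/linux
make modules SUBDIRS=fs/cifs

# Reload the freshly built module (unmount any cifs shares first)
rmmod cifs 2>/dev/null
insmod fs/cifs/cifs.ko
```
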
And of course if you are backing up (large sequential reads or writes),
you are probably better off using even more current code (I will need to
post a version of it to the project page, or you can pick up
2.6.10-rc3-mm5 when it comes out): it lets you mount with the new
"direct" flag, which bypasses the client page cache on reads and writes
and could give huge memory improvements, because client buffers are not
aggressively allocated to cache the client-side inodes on that mount
point.
Steve French
Senior Software Engineer
Linux Technology Center - IBM Austin
phone: 512-838-2294
email: sfrench at-sign us dot ibm dot com