[linux-cifs-client] when a mounted remote smb server goes down - cifs vfs tries to re-connect - with strange side-effects
Jeff Layton
jlayton at samba.org
Wed Jan 6 05:12:23 MST 2010
On Wed, 6 Jan 2010 03:16:25 +0100
Günter Kukkukk <linux at kukkukk.com> wrote:
> On Tuesday 05 January 2010 13:18:04 Jeff Layton wrote:
> > On Tue, 5 Jan 2010 05:24:43 +0100
> >
> > Günter Kukkukk <linux at kukkukk.com> wrote:
> > > Hi Jeff, Steve, ...
> > >
> > > I'm running here on opensuse 11.2 - tried default kernel - and now
> > > 2.6.33-rc2-0.1-default from latest cifs git.
> > >
> > > When mounting a remote smb server - and then that server goes down (just
> > > shut down, no network cable unplugged), my client KDE4.x desktop becomes
> > > _very_ unresponsive - the (kicker) taskbar is now _unusable_ - no
> > > response at all - which also means that a user can't even use the GUI to
> > > shut down.
> > >
> > > As soon as I restart the remote smb server, all is fine again.
> > >
> > > Used many tools (like those from sysstat) - but at least CPU usage seems
> > > to be _not_ the problem.
> > >
> > > Could it be some kernel semaphores....?
> > >
> > > When the remote server is down, wireshark only shows some 3 seconds
> > > tcp/ip traffic on ports 139 and 445.
> > >
> > > From the cifs (debug 7) kernel log:
> > >
> > > Jan 5 04:38:01 linux300 kernel: [46253.309431] fs/cifs/inode.c: Getting info on //linux700/homegk
> > > Jan 5 04:38:01 linux300 kernel: [46253.309438] fs/cifs/cifssmb.c: In QPathInfo (Unix) the path //linux700/homegk
> > > Jan 5 04:38:04 linux300 kernel: [46256.080085] fs/cifs/connect.c: Socket created
> > > Jan 5 04:38:04 linux300 kernel: [46256.080740] fs/cifs/connect.c: Error -111 connecting to server via ipv4
> > > Jan 5 04:38:04 linux300 kernel: [46256.080770] fs/cifs/connect.c: reconnect error -111
> > > Jan 5 04:38:07 linux300 kernel: [46259.084106] fs/cifs/connect.c: Socket created
> > > Jan 5 04:38:07 linux300 kernel: [46259.084773] fs/cifs/connect.c: Error -111 connecting to server via ipv4
> > > Jan 5 04:38:07 linux300 kernel: [46259.084804] fs/cifs/connect.c: reconnect error -111
> > > Jan 5 04:38:10 linux300 kernel: [46262.088085] fs/cifs/connect.c: Socket created
> > > Jan 5 04:38:10 linux300 kernel: [46262.088728] fs/cifs/connect.c: Error -111 connecting to server via ipv4
> > > Jan 5 04:38:10 linux300 kernel: [46262.088758] fs/cifs/connect.c: reconnect error -111
> > > Jan 5 04:38:11 linux300 kernel: [46263.308062] fs/cifs/cifssmb.c: gave up waiting on reconnect in smb_init
> > > Jan 5 04:38:11 linux300 kernel: [46263.308082] fs/cifs/inode.c: error on getting revalidate info -112
> > > Jan 5 04:38:11 linux300 kernel: [46263.308090] fs/cifs/inode.c: cifs_revalidate - inode unchanged
> > > Jan 5 04:38:11 linux300 kernel: [46263.308102] fs/cifs/inode.c: CIFS VFS: leaving cifs_revalidate (xid = 224) rc = -112
> > > Jan 5 04:38:11 linux300 kernel: [46263.310204] fs/cifs/inode.c: CIFS VFS: in cifs_revalidate as Xid: 225 with uid: 1000
> > > Jan 5 04:38:11 linux300 kernel: [46263.310226] fs/cifs/inode.c: Revalidate: //linux700/homegk inode 0xf41c3e84 count 1 dentry: 0xdaa7325c d_time 1951477551 jiffies 11490827
> > > Jan 5 04:38:11 linux300 kernel: [46263.310236] fs/cifs/inode.c: Getting info on //linux700/homegk
> > > Jan 5 04:38:11 linux300 kernel: [46263.310243] fs/cifs/cifssmb.c: In QPathInfo (Unix) the path //linux700/homegk
> > > Jan 5 04:38:13 linux300 kernel: [46265.092103] fs/cifs/connect.c: Socket created
> > > Jan 5 04:38:13 linux300 kernel: [46265.092744] fs/cifs/connect.c: Error -111 connecting to server via ipv4
> > > Jan 5 04:38:13 linux300 kernel: [46265.092774] fs/cifs/connect.c: reconnect error -111
> > > Jan 5 04:38:16 linux300 kernel: [46268.096101] fs/cifs/connect.c: Socket created
> > > Jan 5 04:38:16 linux300 kernel: [46268.096751] fs/cifs/connect.c: Error -111 connecting to server via ipv4
> > > Jan 5 04:38:16 linux300 kernel: [46268.096781] fs/cifs/connect.c: reconnect error -111
> > > Jan 5 04:38:19 linux300 kernel: [46271.100088] fs/cifs/connect.c: Socket created
> > > Jan 5 04:38:19 linux300 kernel: [46271.100729] fs/cifs/connect.c: Error -111 connecting to server via ipv4
> > > Jan 5 04:38:19 linux300 kernel: [46271.100758] fs/cifs/connect.c: reconnect error -111
> > > Jan 5 04:38:21 linux300 kernel: [46273.308054] fs/cifs/cifssmb.c: gave up waiting on reconnect in smb_init
> > > Jan 5 04:38:21 linux300 kernel: [46273.308075] fs/cifs/inode.c: error on getting revalidate info -112
> > > Jan 5 04:38:21 linux300 kernel: [46273.308083] fs/cifs/inode.c: cifs_revalidate - inode unchanged
> > > Jan 5 04:38:21 linux300 kernel: [46273.308097] fs/cifs/inode.c: CIFS VFS: leaving cifs_revalidate (xid = 225) rc = -112
> > > Jan 5 04:38:21 linux300 kernel: [46273.311877] fs/cifs/inode.c: CIFS VFS: in cifs_revalidate as Xid: 226 with uid: 1000
> > > Jan 5 04:38:21 linux300 kernel: [46273.311901] fs/cifs/inode.c: Revalidate: //linux700/homegk inode 0xf41c3e84 count 1 dentry: 0xdaa7325c d_time 1951477551 jiffies 11493327
> > > Jan 5 04:38:21 linux300 kernel: [46273.311912] fs/cifs/inode.c: Getting info on //linux700/homegk
> >
> > Most likely the problem is that you have programs that are trying to
> > repeatedly access the CIFS share while the server is down. The kernel
> > looks like it's doing what it's supposed to do, which is to attempt to
> > reconnect to the server so it can satisfy the syscall.
> >
> > It looks like the program is just retrying the syscall over and over
> > here, but you might want to try and verify that by stracing the program
> > (if you can track it down). If so, then there's not much we can do at
> > the kernel level. You might be able to get away with doing a lazy umount
> > (umount -l) to work around it if the program is retrying the syscalls.
> >
>
> Hi Jeff,
> I was absolutely sure that no other program was accessing that mount.
> Opened a root console and did that mount, then stopped smbd on
> the remote server. The kde4 taskbar problem was immediately seen.
>
> So I was hunting for a while - and you won't believe it.
> For some testing purpose I had added the following line to /etc/fstab
> some weeks ago (and forgot about it):
>
> //linux700/homegk /mnt/linux cifs cred=/root/creds/gk.creds,rw,noauto 0 0
>
> As soon as I remove that line and save fstab, all behaves correctly again.
> I can do so back and forth interactively with emacs - and the change is
> immediately (!) caught by some "watcher" .... atm no idea who it is. :-)
>
Interesting. One thing that might be good to do is to change the
cFYI/cERROR macros to printk the PID of the process as well
(current->tgid). That might help to track down what's doing this.
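
To illustrate the idea, here is a rough userspace sketch of a debug macro
that prefixes every message with the caller's PID. It is only an analogue,
not the actual CIFS code: the names dbg_format and DBG_WITH_PID are
hypothetical, and in the kernel the prefix would come from current->tgid
and the output would go through printk rather than fprintf. In userspace,
getpid() returns the thread group ID, so it plays the same role here.

```c
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "[pid N] <formatted message>" into buf.
 * Returns the length of the full message (snprintf-style), or a
 * negative value on a formatting error.
 */
static int dbg_format(char *buf, size_t len, long pid, const char *fmt, ...)
{
    va_list ap;
    int n, m;

    n = snprintf(buf, len, "[pid %ld] ", pid);
    if (n < 0 || (size_t)n >= len)
        return n;               /* error, or no room left for the message */

    va_start(ap, fmt);
    m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);

    return m < 0 ? m : n + m;
}

/*
 * Hypothetical stand-in for a cFYI-style macro. The ##__VA_ARGS__ form
 * is a GNU extension (accepted by gcc and clang), as used in kernel code.
 */
#define DBG_WITH_PID(fmt, ...)                                          \
    do {                                                                \
        char dbg_buf_[256];                                             \
        dbg_format(dbg_buf_, sizeof(dbg_buf_), (long)getpid(),          \
                   fmt, ##__VA_ARGS__);                                 \
        fprintf(stderr, "%s\n", dbg_buf_);                              \
    } while (0)
```

A call such as DBG_WITH_PID("Getting info on %s", path) would then tag the
"Getting info on ..." message with the PID of whichever process triggered
the revalidate, which is what we'd need to identify the watcher.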
--
Jeff Layton <jlayton at samba.org>