[linux-cifs-client] when a mounted remote smb server goes down - cifs vfs tries to re-connect - with strange side-effects

Jeff Layton jlayton at samba.org
Wed Jan 6 05:12:23 MST 2010


On Wed, 6 Jan 2010 03:16:25 +0100
Günter Kukkukk <linux at kukkukk.com> wrote:

> Am Dienstag 05 Januar 2010 13:18:04 schrieb Jeff Layton:
> > On Tue, 5 Jan 2010 05:24:43 +0100
> > 
> > Günter Kukkukk <linux at kukkukk.com> wrote:
> > > Hi Jeff, Steve, ...
> > >
> > > I'm running here on opensuse 11.2 - tried default kernel - and now
> > > 2.6.33-rc2-0.1-default from latest cifs git.
> > >
> > > When mounting a remote smb server - and then that server goes down (just
> > > shutdown, no network cable unplugged), my client KDE4.x desktop becomes
> > > _very_ unresponsive - the (kicker) taskbar is now _unusable_ - no
> > > response at all - which also means that a user can't even use the GUI to
> > > shutdown.
> > >
> > > As soon i restart the remote smb server, all is fine again.
> > >
> > > Used many tools (like those from sysstat) - but at least CPU usage seem
> > > to be  _not_ the problem.
> > >
> > > Could it be some kernel semaphores....?
> > >
> > > When the remote server is down, wireshark only shows some 3 seconds
> > > tcp/ip traffic on ports 139 and 445.
> > >
> > > From the cifs (debug 7) kernel log:
> > >
> > > Jan  5 04:38:01 linux300 kernel: [46253.309431]  fs/cifs/inode.c: Getting
> > > info on //linux700/homegk Jan  5 04:38:01 linux300 kernel: [46253.309438]
> > >  fs/cifs/cifssmb.c: In QPathInfo (Unix) the path //linux700/homegk Jan  5
> > > 04:38:04 linux300 kernel: [46256.080085]  fs/cifs/connect.c: Socket
> > > created Jan  5 04:38:04 linux300 kernel: [46256.080740] 
> > > fs/cifs/connect.c: Error -111 connecting to server via ipv4 Jan  5
> > > 04:38:04 linux300 kernel: [46256.080770]  fs/cifs/connect.c: reconnect
> > > error -111 Jan  5 04:38:07 linux300 kernel: [46259.084106] 
> > > fs/cifs/connect.c: Socket created Jan  5 04:38:07 linux300 kernel:
> > > [46259.084773]  fs/cifs/connect.c: Error -111 connecting to server via
> > > ipv4 Jan  5 04:38:07 linux300 kernel: [46259.084804]  fs/cifs/connect.c:
> > > reconnect error -111 Jan  5 04:38:10 linux300 kernel: [46262.088085] 
> > > fs/cifs/connect.c: Socket created Jan  5 04:38:10 linux300 kernel:
> > > [46262.088728]  fs/cifs/connect.c: Error -111 connecting to server via
> > > ipv4 Jan  5 04:38:10 linux300 kernel: [46262.088758]  fs/cifs/connect.c:
> > > reconnect error -111 Jan  5 04:38:11 linux300 kernel: [46263.308062] 
> > > fs/cifs/cifssmb.c: gave up waiting on reconnect in smb_init Jan  5
> > > 04:38:11 linux300 kernel: [46263.308082]  fs/cifs/inode.c: error on
> > > getting revalidate info -112 Jan  5 04:38:11 linux300 kernel:
> > > [46263.308090]  fs/cifs/inode.c: cifs_revalidate - inode unchanged Jan  5
> > > 04:38:11 linux300 kernel: [46263.308102]  fs/cifs/inode.c: CIFS VFS:
> > > leaving cifs_revalidate (xid = 224) rc = -112
> > > Jan  5 04:38:11 linux300 kernel: [46263.310204]  fs/cifs/inode.c: CIFS
> > > VFS: in cifs_revalidate as Xid: 225 with uid: 1000
> > > Jan  5 04:38:11 linux300 kernel: [46263.310226]  fs/cifs/inode.c:
> > > Revalidate: //linux700/homegk inode 0xf41c3e84 count 1 dentry: 0xdaa7325c
> > > d_time 1951477551 jiffies 11490827
> > > Jan  5 04:38:11 linux300 kernel: [46263.310236]  fs/cifs/inode.c: Getting
> > > info on //linux700/homegk Jan  5 04:38:11 linux300 kernel: [46263.310243]
> > >  fs/cifs/cifssmb.c: In QPathInfo (Unix) the path //linux700/homegk Jan  5
> > > 04:38:13 linux300 kernel: [46265.092103]  fs/cifs/connect.c: Socket
> > > created Jan  5 04:38:13 linux300 kernel: [46265.092744] 
> > > fs/cifs/connect.c: Error -111 connecting to server via ipv4 Jan  5
> > > 04:38:13 linux300 kernel: [46265.092774]  fs/cifs/connect.c: reconnect
> > > error -111 Jan  5 04:38:16 linux300 kernel: [46268.096101] 
> > > fs/cifs/connect.c: Socket created Jan  5 04:38:16 linux300 kernel:
> > > [46268.096751]  fs/cifs/connect.c: Error -111 connecting to server via
> > > ipv4 Jan  5 04:38:16 linux300 kernel: [46268.096781]  fs/cifs/connect.c:
> > > reconnect error -111 Jan  5 04:38:19 linux300 kernel: [46271.100088] 
> > > fs/cifs/connect.c: Socket created Jan  5 04:38:19 linux300 kernel:
> > > [46271.100729]  fs/cifs/connect.c: Error -111 connecting to server via
> > > ipv4 Jan  5 04:38:19 linux300 kernel: [46271.100758]  fs/cifs/connect.c:
> > > reconnect error -111 Jan  5 04:38:21 linux300 kernel: [46273.308054] 
> > > fs/cifs/cifssmb.c: gave up waiting on reconnect in smb_init Jan  5
> > > 04:38:21 linux300 kernel: [46273.308075]  fs/cifs/inode.c: error on
> > > getting revalidate info -112 Jan  5 04:38:21 linux300 kernel:
> > > [46273.308083]  fs/cifs/inode.c: cifs_revalidate - inode unchanged Jan  5
> > > 04:38:21 linux300 kernel: [46273.308097]  fs/cifs/inode.c: CIFS VFS:
> > > leaving cifs_revalidate (xid = 225) rc = -112
> > > Jan  5 04:38:21 linux300 kernel: [46273.311877]  fs/cifs/inode.c: CIFS
> > > VFS: in cifs_revalidate as Xid: 226 with uid: 1000
> > > Jan  5 04:38:21 linux300 kernel: [46273.311901]  fs/cifs/inode.c:
> > > Revalidate: //linux700/homegk inode 0xf41c3e84 count 1 dentry: 0xdaa7325c
> > > d_time 1951477551 jiffies 11493327
> > > Jan  5 04:38:21 linux300 kernel: [46273.311912]  fs/cifs/inode.c: Getting
> > > info on //linux700/homegk
> > 
> > Most likely the problem is that you have programs that are trying to
> > repeatedly access the CIFS share while the server is down. The kernel
> > looks like it's doing what it's supposed to do, which is to attempt to
> > reconnect to the server so it can satisfy the syscall.
> > 
> > It looks like the program is just retrying the syscall over and over
> > here, but you might want to try and verify that by stracing the program
> > (if you can track it down). If so, then there's not much we can do at
> > the kernel level. You might be able to get away with doing a lazy umount
> > (umount -l) to work around it if the program is retrying the syscalls.
> > 
> 
> Hi Jeff,
> i was absolutely sure that no other program was accessing that mount.
> Opened a root console and did that mount, then stopped smbd on
> the remote server. The kde4 taskbar problem was immediately seen.
> 
> So i was hunting for a while - and you won't believe.
> For some testing purpose i had added the following line to /etc/fstab
> some weeks ago (and forgot about it):
> 
> //linux700/homegk    /mnt/linux           cifs       cred=/root/creds/gk.creds,rw,noauto 0 0
> 
> As soon as i remove that line and save fstab, all behaves correctly again.
> I can do so back and forth interactively with emacs - and the change is
> immediately (!) caught by some "watcher" .... atm no idea who it is.  :-)
> 

Interesting. One thing that might be good to do is to change the
cFYI/cERROR macros to printk the PID of the process as well
(current->tgid). That might help to track down what's doing this.

-- 
Jeff Layton <jlayton at samba.org>


More information about the linux-cifs-client mailing list