[linux-cifs-client] when a mounted remote smb server goes down - cifs vfs tries to re-connect - with strange side-effects

Günter Kukkukk linux at kukkukk.com
Tue Jan 5 19:16:25 MST 2010


Am Dienstag 05 Januar 2010 13:18:04 schrieb Jeff Layton:
> On Tue, 5 Jan 2010 05:24:43 +0100
> 
> Günter Kukkukk <linux at kukkukk.com> wrote:
> > Hi Jeff, Steve, ...
> >
> > I'm running here on opensuse 11.2 - tried default kernel - and now
> > 2.6.33-rc2-0.1-default from latest cifs git.
> >
> > When mounting a remote smb server - and then that server goes down (just
> > shutdown, no network cable unplugged), my client KDE4.x desktop becomes
> > _very_ unresponsive - the (kicker) taskbar is now _unusable_ - no
> > response at all - which also means that a user can't even use the GUI to
> > shutdown.
> >
> > As soon i restart the remote smb server, all is fine again.
> >
> > Used many tools (like those from sysstat) - but at least CPU usage seem
> > to be  _not_ the problem.
> >
> > Could it be some kernel semaphores....?
> >
> > When the remote server is down, wireshark only shows some 3 seconds
> > tcp/ip traffic on ports 139 and 445.
> >
> > From the cifs (debug 7) kernel log:
> >
> > Jan  5 04:38:01 linux300 kernel: [46253.309431]  fs/cifs/inode.c: Getting
> > info on //linux700/homegk Jan  5 04:38:01 linux300 kernel: [46253.309438]
> >  fs/cifs/cifssmb.c: In QPathInfo (Unix) the path //linux700/homegk Jan  5
> > 04:38:04 linux300 kernel: [46256.080085]  fs/cifs/connect.c: Socket
> > created Jan  5 04:38:04 linux300 kernel: [46256.080740] 
> > fs/cifs/connect.c: Error -111 connecting to server via ipv4 Jan  5
> > 04:38:04 linux300 kernel: [46256.080770]  fs/cifs/connect.c: reconnect
> > error -111 Jan  5 04:38:07 linux300 kernel: [46259.084106] 
> > fs/cifs/connect.c: Socket created Jan  5 04:38:07 linux300 kernel:
> > [46259.084773]  fs/cifs/connect.c: Error -111 connecting to server via
> > ipv4 Jan  5 04:38:07 linux300 kernel: [46259.084804]  fs/cifs/connect.c:
> > reconnect error -111 Jan  5 04:38:10 linux300 kernel: [46262.088085] 
> > fs/cifs/connect.c: Socket created Jan  5 04:38:10 linux300 kernel:
> > [46262.088728]  fs/cifs/connect.c: Error -111 connecting to server via
> > ipv4 Jan  5 04:38:10 linux300 kernel: [46262.088758]  fs/cifs/connect.c:
> > reconnect error -111 Jan  5 04:38:11 linux300 kernel: [46263.308062] 
> > fs/cifs/cifssmb.c: gave up waiting on reconnect in smb_init Jan  5
> > 04:38:11 linux300 kernel: [46263.308082]  fs/cifs/inode.c: error on
> > getting revalidate info -112 Jan  5 04:38:11 linux300 kernel:
> > [46263.308090]  fs/cifs/inode.c: cifs_revalidate - inode unchanged Jan  5
> > 04:38:11 linux300 kernel: [46263.308102]  fs/cifs/inode.c: CIFS VFS:
> > leaving cifs_revalidate (xid = 224) rc = -112
> > Jan  5 04:38:11 linux300 kernel: [46263.310204]  fs/cifs/inode.c: CIFS
> > VFS: in cifs_revalidate as Xid: 225 with uid: 1000
> > Jan  5 04:38:11 linux300 kernel: [46263.310226]  fs/cifs/inode.c:
> > Revalidate: //linux700/homegk inode 0xf41c3e84 count 1 dentry: 0xdaa7325c
> > d_time 1951477551 jiffies 11490827
> > Jan  5 04:38:11 linux300 kernel: [46263.310236]  fs/cifs/inode.c: Getting
> > info on //linux700/homegk Jan  5 04:38:11 linux300 kernel: [46263.310243]
> >  fs/cifs/cifssmb.c: In QPathInfo (Unix) the path //linux700/homegk Jan  5
> > 04:38:13 linux300 kernel: [46265.092103]  fs/cifs/connect.c: Socket
> > created Jan  5 04:38:13 linux300 kernel: [46265.092744] 
> > fs/cifs/connect.c: Error -111 connecting to server via ipv4 Jan  5
> > 04:38:13 linux300 kernel: [46265.092774]  fs/cifs/connect.c: reconnect
> > error -111 Jan  5 04:38:16 linux300 kernel: [46268.096101] 
> > fs/cifs/connect.c: Socket created Jan  5 04:38:16 linux300 kernel:
> > [46268.096751]  fs/cifs/connect.c: Error -111 connecting to server via
> > ipv4 Jan  5 04:38:16 linux300 kernel: [46268.096781]  fs/cifs/connect.c:
> > reconnect error -111 Jan  5 04:38:19 linux300 kernel: [46271.100088] 
> > fs/cifs/connect.c: Socket created Jan  5 04:38:19 linux300 kernel:
> > [46271.100729]  fs/cifs/connect.c: Error -111 connecting to server via
> > ipv4 Jan  5 04:38:19 linux300 kernel: [46271.100758]  fs/cifs/connect.c:
> > reconnect error -111 Jan  5 04:38:21 linux300 kernel: [46273.308054] 
> > fs/cifs/cifssmb.c: gave up waiting on reconnect in smb_init Jan  5
> > 04:38:21 linux300 kernel: [46273.308075]  fs/cifs/inode.c: error on
> > getting revalidate info -112 Jan  5 04:38:21 linux300 kernel:
> > [46273.308083]  fs/cifs/inode.c: cifs_revalidate - inode unchanged Jan  5
> > 04:38:21 linux300 kernel: [46273.308097]  fs/cifs/inode.c: CIFS VFS:
> > leaving cifs_revalidate (xid = 225) rc = -112
> > Jan  5 04:38:21 linux300 kernel: [46273.311877]  fs/cifs/inode.c: CIFS
> > VFS: in cifs_revalidate as Xid: 226 with uid: 1000
> > Jan  5 04:38:21 linux300 kernel: [46273.311901]  fs/cifs/inode.c:
> > Revalidate: //linux700/homegk inode 0xf41c3e84 count 1 dentry: 0xdaa7325c
> > d_time 1951477551 jiffies 11493327
> > Jan  5 04:38:21 linux300 kernel: [46273.311912]  fs/cifs/inode.c: Getting
> > info on //linux700/homegk
> 
> Most likely the problem is that you have programs that are trying to
> repeatedly access the CIFS share while the server is down. The kernel
> looks like it's doing what it's supposed to do, which is to attempt to
> reconnect to the server so it can satisfy the syscall.
> 
> It looks like the program is just retrying the syscall over and over
> here, but you might want to try and verify that by stracing the program
> (if you can track it down). If so, then there's not much we can do at
> the kernel level. You might be able to get away with doing a lazy umount
> (umount -l) to work around it if the program is retrying the syscalls.
> 

Hi Jeff,
i was absolutely sure that no other program was accessing that mount.
Opened a root console and did that mount, then stopped smbd on
the remote server. The kde4 taskbar problem was immediately seen.

So i was hunting for a while - and you won't believe.
For some testing purpose i had added the following line to /etc/fstab
some weeks ago (and forgot about it):

//linux700/homegk    /mnt/linux           cifs       cred=/root/creds/gk.creds,rw,noauto 0 0

As soon as i remove that line and save fstab, all behaves correctly again.
I can do so back and forth interactively with emacs - and the change is
immediately (!) caught by some "watcher" .... atm no idea who it is.  :-)

Probably this behavior should be documented somewhere.
Cheers, Günter
 


More information about the linux-cifs-client mailing list