RFC: can we improve the mount program's error messages?

David Collier-Brown davec-b at rogers.com
Sat Jun 2 12:56:29 MDT 2012


  Fred Weigel, one of my Smarter Colleagues at work, made an smb mount of
a drive on our internal network, and some hours later connected to a VPN
to a customer site.  This all was fine until he wanted to do an ls on
his local machine, and it */hung/*....

Approximately three minutes later it unstuck, and said
---
[fred@dejah ~]$ ls
ls: cannot access UserHome: Host is down
ls: cannot access Solutions: Host is down
2ax3                  gambc-stuff                scheme
a                        gfortran_switches       scm.2
a.c                     gsc-config                   setfpucw.txt
ambeg               gscript                        show-passes.py
---

We tried a GUI to do the ls (using Thunar) but it hung too, and didn't
even bring up its initial window...

Eventually we got the ls error message, then Thunar started and
displayed the same message.  UserHome and Solutions were the names of
shares, and while "Host is down" wasn't actually correct, it did allow
us to see it was a software problem. This was a great relief, as we'd
both suspected a hardware problem.

Looking at syslog gave us a bit more information
---
[fred@dejah ~]$ sudo tail /var/log/messages
May 31 14:46:51 dejah kernel: [63915.904107] CIFS VFS: Unexpected lookup
error -112
May 31 14:46:54 dejah pptp[13117]: nm-pptp-service-13107
log[logecho:pptp_ctrl.c:692]: Echo Reply received.
May 31 14:46:54 dejah pptp[13117]: nm-pptp-service-13107
log[logecho:pptp_ctrl.c:694]: no more Echo Reply/Request packets will be
reported.
May 31 14:47:11 dejah kernel: [63935.906107] CIFS VFS: Unexpected lookup
error -112
May 31 14:47:31 dejah kernel: [63955.908132] CIFS VFS: Unexpected lookup
error -112
May 31 14:47:51 dejah kernel: [63975.910114] CIFS VFS: Unexpected lookup
error -112
May 31 14:48:11 dejah kernel: [63995.912131] CIFS VFS: Unexpected lookup
error -112
May 31 14:48:31 dejah kernel: [64015.913287] CIFS VFS: Unexpected lookup
error -112
May 31 14:48:45 dejah wall[13654]: wall: user fred broadcasted 1 lines
(14 chars)
May 31 14:48:51 dejah kernel: [64035.915128] CIFS VFS: Unexpected lookup
error -112
---
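
As an aside, the "-112" in those kernel lines is a negated errno value.
A minimal sketch of decoding it, assuming a Linux machine where errno 112
is EHOSTDOWN (the numbers differ on other systems, and the helper name
describe_kernel_error is made up for this illustration):

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Turn a kernel-style (negative) error code into 'NAME: message'."""
    number = abs(code)  # the kernel logs errno values negated
    # errno.errorcode maps numbers to symbolic names on this platform
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# The value seen in the syslog excerpt; the result is platform-dependent.
print(describe_kernel_error(-112))
```

On a typical Linux box this should name EHOSTDOWN, which is where the
"Host is down" text in the ls output comes from.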

After a few moments, we figured out what had happened.  The VPN had
changed the default route, and we had tried to do an SMB request to a
machine on the other end of the VPN, so of course it failed, and
eventually timed out.
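
One way to spot this kind of routing change is to ask the kernel which
local address it would use to reach the server, once before and once
after the VPN comes up. This is only a diagnostic sketch, not something
mount.cifs does: the function name local_address_for is hypothetical, it
handles IPv4 only, and port 445 is assumed as the SMB port.

```python
import socket


def local_address_for(host: str, port: int = 445) -> str:
    """Return the local IPv4 address the routing table picks for host.

    Connecting a UDP socket sends no packets; it only makes the kernel
    choose a route, so this works even if the host is unreachable.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((host, port))
        return sock.getsockname()[0]


# Loopback is used here so the example runs anywhere; substitute the
# address of the SMB server to compare before and after the VPN starts.
print(local_address_for("127.0.0.1"))
```

If the address reported for the file server changes to the VPN
interface's address, the requests are going down the tunnel.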

This therefore was NOT a samba/mount.cifs problem, but rather a problem
with the VPN, merely exacerbated by the longish timeout and the
minimalist error messages.  We can help a bit by improving the messages,
and I specifically would like your opinion about the famous

        NFS server foo not responding, still trying

message from the early days of network filesystems. 

That message existed because the failure was at the kernel level, and
the nfs program didn't know what user or users were affected by the
problem. So it used wall(1) to notify everyone.  If you didn't like wall
messages, you set "mesg n" and shut it up.

We still suffer the same problem as early Unix, only now it's worse. 
It's hard to tell who is suffering from a slow, unreachable or failed
host, and the GUIs don't help.  What might help are:
1) better messages
2) a notifier like wall that works with the major GUIs (is there one???)
3) a way to make the timeouts shorter without introducing new problems.

Today, I'd like to propose two variants of (1):
i) use wall for the first failed response in a specified time period
ii) make the message "SMB server //<host>/<share> not responding, still
trying"

The ls messages only supply the share, and they say "Host is down",
which will make non-techies go looking for a host named <share>. The
syslog messages are precise but not meaningful, so I'd append the "not
responding" message to them.
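
To make (i) and (ii) concrete, here is a rough sketch of the rate
limiting, written in Python purely for illustration. Nothing here is
existing Samba or kernel code: the class name, the 300-second period and
the host name "fileserver" are all assumptions, and the emit callback
stands in for whatever would actually deliver the text (wall, syslog, or
a GUI notifier).

```python
import time
from typing import Callable, Dict, Tuple


class NotRespondingNotifier:
    """Emit one 'not responding' message per share per time period."""

    def __init__(self, period_seconds: float,
                 emit: Callable[[str], None] = print,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.period_seconds = period_seconds
        self.emit = emit      # hypothetical delivery hook (wall, syslog, ...)
        self.clock = clock    # injectable so the logic can be tested
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def report_failure(self, host: str, share: str) -> bool:
        """Record a failed request; return True if a message was emitted."""
        key = (host, share)
        now = self.clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.period_seconds:
            return False  # already told the user within this period
        self._last_sent[key] = now
        self.emit(f"SMB server //{host}/{share} not responding, still trying")
        return True


# Seven lookups failed in about two minutes in the log above; with a
# 300-second period only the first one would produce a message.
notifier = NotRespondingNotifier(period_seconds=300)
notifier.report_failure("fileserver", "UserHome")
notifier.report_failure("fileserver", "UserHome")
```

Keying on the (host, share) pair means each share gets its own first
message, and the text names the host as well as the share.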


--dave
Sigh: I hardly get to code any more, just diagnose stuff. Maybe we
should make the messages *more* obscure, so I'd get asked interesting
questions more often...

-- 
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb at spamcop.net           |                      -- Mark Twain
(416) 223-8968



More information about the samba-technical mailing list