[Samba] Intermittent Internal Error Signal 11 with 3.0.24

Joe Murphy joe.murphy at clear.net.nz
Thu May 24 00:09:45 GMT 2007


Hi Samba list,
 
We're experiencing some issues with our Samba 3.0.24
environments. Hopefully somebody can offer suggestions or
guidance.
 
A bit of background. We have 3 application environments,
which consist of a Samba host providing file sharing
services to 7 Windows application servers.
 
These Samba hosts intermittently experience problems
providing file sharing. So far we haven't established a
pattern to the failures; the best we can say is that
every couple of days a Samba host will experience an
Internal Error (signal 11) in an smbd process.
From that point onwards the smbd process operates
unreliably: Windows clients generally cannot connect to
the share, file copies that were underway abort with
errors, etc. Clearing this requires a restart of the Linux
host, and once restarted things are fine.
 
All three environments are identical in hardware, OS and
software, and operate independently of each other. All
experience the same issue. Apart from this we have no
other Samba problems; the file shares run without trouble
until a signal 11 occurs.
 
- SuSE Enterprise Linux 9 (2.6.5-7.97-bigsmp)
- Samba 3.0.24 
- /data (1TB total, 0.5TB in use) on /dev/sdc1, type ext3
  (rw,acl,user_xattr)
 
The signal 11 crashes appear to have started after we
upgraded to Samba 3.0.24 in March 2007.
 
An example message is attached in signal_11.txt. I've
attached it rather than placing it inline because my
webmail's fixed-width formatting mangles the syslog lines.
I hope this is okay.
 
Things we've tested:
 
- fsck
- testparm
- Samba config changes: 
  kernel oplocks = no
  oplocks = False
  level2 oplocks = False
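
The three oplock settings above can be checked mechanically on each host. The following is a minimal Python sketch, not part of Samba: `parse_bool` and `oplock_settings` are hypothetical helper names, and the accepted boolean spellings (yes/true/1 and no/false/0, case-insensitive) are an assumption. It ignores section boundaries, so a later per-share value simply replaces an earlier one.

```python
# Assumed boolean spellings; adjust if your Samba version accepts others.
SAMBA_TRUE = {"yes", "true", "1"}
SAMBA_FALSE = {"no", "false", "0"}


def parse_bool(value):
    """Map a Samba-style boolean string to a Python bool."""
    v = value.strip().lower()
    if v in SAMBA_TRUE:
        return True
    if v in SAMBA_FALSE:
        return False
    raise ValueError("not a recognised boolean: %r" % value)


def oplock_settings(conf_text):
    """Return {parameter name: bool} for every '... oplocks = ...' line."""
    found = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments (# or ;) and section headers.
        if not line or line[0] in "#;[":
            continue
        name, sep, value = line.partition("=")
        if sep and "oplocks" in name.lower():
            # Normalise case and internal whitespace in the parameter name.
            key = " ".join(name.lower().split())
            found[key] = parse_bool(value)
    return found
```

Running this against the text of smb.conf on each of the three hosts and comparing the results would show whether the change was applied consistently.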
 
I thought I'd preemptively post this to the mailing list to
see if anyone has experienced similar issues. I will post
some 'gdb smbd PID' output once I'm able to catch it.
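
One way to capture that gdb output without an interactive session is to run gdb in batch mode against the misbehaving smbd. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the binary path is taken from the backtrace in the attachment, and the installed gdb is assumed to support the `-batch` and `-ex` options. Attaching to another user's process normally requires root.

```python
import subprocess


def gdb_backtrace_command(pid, binary="/usr/sbin/smbd"):
    """Build a gdb command line that attaches to pid and prints 'bt full'."""
    # -batch makes gdb exit after running the -ex command.
    return ["gdb", "-batch", "-ex", "bt full", binary, str(pid)]


def capture_backtrace(pid, binary="/usr/sbin/smbd"):
    """Run gdb against a live process and return whatever it printed."""
    result = subprocess.run(
        gdb_backtrace_command(pid, binary),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero if the process has already gone
    )
    return result.stdout
```

The process has to be alive when gdb attaches, so this only helps for an smbd that is still running after it starts misbehaving.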
 
Our suspicion is that this occurs under load, though we've
not yet been able to reproduce the problem in testing.
Upgrading to 3.0.25 is an option, although we'd prefer to do
so only once we have more clearly identified the cause and
fix.
 
Finally, an example of the volume of errors we're
experiencing (from a single host) is attached in volume.txt.
 
Happy to post other info.
 
Kind regards
Joe Murphy
Info Systems Technical Team
joe.murphy at clear.net.nz

-------------- next part --------------
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] smbd/oplock.c:oplock_timeout_handler(351)
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] lib/fault.c:fault_report(41)
May 23 13:47:54 host smbd[5799]:  ===============================================================
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] lib/fault.c:fault_report(42)
May 23 13:47:54 host smbd[5799]:   INTERNAL ERROR: Signal 11 in pid 5799 (3.0.24-SerNet-SuSE)
May 23 13:47:54 host smbd[5799]:   Please read the Trouble-Shooting section of the Samba3-HOWTO
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] lib/fault.c:fault_report(44)
May 23 13:47:54 host smbd[5799]: 
May 23 13:47:54 host smbd[5799]:   From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] lib/fault.c:fault_report(45)
May 23 13:47:54 host smbd[5799]:  ===============================================================
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] lib/util.c:smb_panic(1599)
May 23 13:47:54 host smbd[5799]:   PANIC (pid 5799): internal error
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] lib/util.c:log_stack_trace(1706)
May 23 13:47:54 host smbd[5799]:   BACKTRACE: 14 stack frames:
May 23 13:47:54 host smbd[5799]:    #0 /usr/sbin/smbd(log_stack_trace+0x22) [0x822b6fb]
May 23 13:47:54 host smbd[5799]:    #1 /usr/sbin/smbd(smb_panic+0x6f) [0x822b59a]
May 23 13:47:54 host smbd[5799]:    #2 /usr/sbin/smbd[0x8219b3f]
May 23 13:47:54 host smbd[5799]:    #3 /usr/sbin/smbd[0x8219b50]
May 23 13:47:54 host smbd[5799]:    #4 [0xffffe420]
May 23 13:47:54 host smbd[5799]:    #5 /lib/tls/libc.so.6(vsnprintf+0xb6) [0x40284656]
May 23 13:47:54 host smbd[5799]:    #6 /usr/sbin/smbd(dbgtext+0x2e) [0x8219956]
May 23 13:47:54 host smbd[5799]:    #7 /usr/sbin/smbd [0x825b360]
May 23 13:47:54 host smbd[5799]:    #8 /usr/sbin/smbd(run_events+0x15f) [0x8242d7d]
May 23 13:47:54 host smbd[5799]:    #9 /usr/sbin/smbd [0x80f2801]
May 23 13:47:54 host smbd[5799]:    #10 /usr/sbin/smbd(smbd_process+0x10e) [0x80f4122]
May 23 13:47:54 host smbd[5799]:    #11 /usr/sbin/smbd(main+0x946) [0x82beea9]
May 23 13:47:54 host smbd[5799]:    #12 /lib/tls/libc.so.6(__libc_start_main+0xe0) [0x4023f250]
May 23 13:47:54 host smbd[5799]:    #13 /usr/sbin/smbd [0x808ceb1]
May 23 13:47:54 host smbd[5799]: [2007/05/23 13:47:54, 0] lib/fault.c:dump_core(173)
May 23 13:47:54 host smbd[5799]:   dumping core in /var/log/samba/cores/smbd
May 23 13:47:54 host smbd[5799]:
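
To compare traces like the one above across several crashes, the frame lines can be reduced to their symbol names. The following is a minimal Python sketch; `parse_frames` and `symbol_path` are hypothetical helper names, and the regular expression only assumes the frame layout seen in this log (`#N module(symbol+0xoff) [0xaddr]`, with the module and symbol parts optional).

```python
import re

# One backtrace frame: index, optional module path, optional symbol, address.
FRAME = re.compile(
    r"#(\d+)\s+([^\s(\[]*)(?:\(([^+)]+)\+0x[0-9a-f]+\))?\s*\[(0x[0-9a-f]+)\]"
)


def parse_frames(lines):
    """Return (index, module, symbol, address) for each frame line found."""
    frames = []
    for line in lines:
        m = FRAME.search(line)
        if m:
            # module is None when the frame has no path; symbol is None
            # when the frame was not resolved to a function name.
            frames.append((int(m.group(1)), m.group(2) or None,
                           m.group(3), m.group(4)))
    return frames


def symbol_path(frames):
    """Names of the resolved frames, in the order they were logged."""
    return [symbol for _, _, symbol, _ in frames if symbol]
```

If the same sequence of names shows up in every crash, that points at a single code path rather than random memory corruption; unresolved frames would still need gdb or the core file to identify.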
-------------- next part --------------
host ~> zgrep -i "internal error: signal 11" /var/log/messages-20070*.gz
/var/log/messages-20070311.gz:Mar  6 15:33:50 host smbd[26829]:   INTERNAL ERROR: Signal 11 in pid 26829 (3.0.24-SerNet-SuSE)
/var/log/messages-20070311.gz:Mar  8 16:36:16 host smbd[18948]:   INTERNAL ERROR: Signal 11 in pid 18948 (3.0.24-SerNet-SuSE)
/var/log/messages-20070322.gz:Mar 20 15:26:38 host smbd[28620]:   INTERNAL ERROR: Signal 11 in pid 28620 (3.0.24-SerNet-SuSE)
/var/log/messages-20070322.gz:Mar 20 17:51:06 host smbd[9397]:   INTERNAL ERROR: Signal 11 in pid 9397 (3.0.24-SerNet-SuSE)
/var/log/messages-20070322.gz:Mar 21 11:27:37 host smbd[2055]:   INTERNAL ERROR: Signal 11 in pid 2055 (3.0.24-SerNet-SuSE)
/var/log/messages-20070331.gz:Mar 22 20:27:56 host smbd[13374]:   INTERNAL ERROR: Signal 11 in pid 13374 (3.0.24-SerNet-SuSE)
/var/log/messages-20070331.gz:Mar 23 17:15:33 host smbd[20928]:   INTERNAL ERROR: Signal 11 in pid 20928 (3.0.24-SerNet-SuSE)
/var/log/messages-20070331.gz:Mar 30 08:15:00 host smbd[20047]:   INTERNAL ERROR: Signal 11 in pid 20047 (3.0.24-SerNet-SuSE)
/var/log/messages-20070331.gz:Mar 30 08:53:50 host smbd[22358]:   INTERNAL ERROR: Signal 11 in pid 22358 (3.0.24-SerNet-SuSE)
/var/log/messages-20070331.gz:Mar 30 11:47:19 host smbd[12808]:   INTERNAL ERROR: Signal 11 in pid 12808 (3.0.24-SerNet-SuSE)
/var/log/messages-20070404.gz:Apr  2 12:44:47 host smbd[32219]:   INTERNAL ERROR: Signal 11 in pid 32219 (3.0.24-SerNet-SuSE)
/var/log/messages-20070404.gz:Apr  3 10:06:34 host smbd[4661]:   INTERNAL ERROR: Signal 11 in pid 4661 (3.0.24-SerNet-SuSE)
/var/log/messages-20070404.gz:Apr  3 14:04:05 host smbd[29901]:   INTERNAL ERROR: Signal 11 in pid 29901 (3.0.24-SerNet-SuSE)
/var/log/messages-20070404.gz:Apr  3 14:15:13 host smbd[4787]:   INTERNAL ERROR: Signal 11 in pid 4787 (3.0.24-SerNet-SuSE)
/var/log/messages-20070412.gz:Apr  4 08:05:50 host smbd[21222]:   INTERNAL ERROR: Signal 11 in pid 21222 (3.0.24-SerNet-SuSE)
/var/log/messages-20070412.gz:Apr  4 11:06:32 host smbd[21853]:   INTERNAL ERROR: Signal 11 in pid 21853 (3.0.24-SerNet-SuSE)
/var/log/messages-20070412.gz:Apr 11 17:25:47 host smbd[17109]:   INTERNAL ERROR: Signal 11 in pid 17109 (3.0.24-SerNet-SuSE)
/var/log/messages-20070418.gz:Apr 17 16:29:24 host smbd[5035]:   INTERNAL ERROR: Signal 11 in pid 5035 (3.0.24-SerNet-SuSE)
/var/log/messages-20070421.gz:Apr 19 14:07:49 host smbd[24857]:   INTERNAL ERROR: Signal 11 in pid 24857 (3.0.24-SerNet-SuSE)
/var/log/messages-20070421.gz:Apr 19 21:13:24 host smbd[29483]:   INTERNAL ERROR: Signal 11 in pid 29483 (3.0.24-SerNet-SuSE)
/var/log/messages-20070421.gz:Apr 20 11:41:58 host smbd[19938]:   INTERNAL ERROR: Signal 11 in pid 19938 (3.0.24-SerNet-SuSE)
/var/log/messages-20070421.gz:Apr 20 19:06:56 host smbd[1294]:   INTERNAL ERROR: Signal 11 in pid 1294 (3.0.24-SerNet-SuSE)
/var/log/messages-20070421.gz:Apr 20 20:53:02 host smbd[3943]:   INTERNAL ERROR: Signal 11 in pid 3943 (3.0.24-SerNet-SuSE)
/var/log/messages-20070421.gz:Apr 21 00:37:30 host smbd[4928]:   INTERNAL ERROR: Signal 11 in pid 4928 (3.0.24-SerNet-SuSE)
/var/log/messages-20070424.gz:Apr 23 11:53:50 host smbd[16250]:   INTERNAL ERROR: Signal 11 in pid 16250 (3.0.24-SerNet-SuSE)
/var/log/messages-20070424.gz:Apr 23 12:16:20 host smbd[18079]:   INTERNAL ERROR: Signal 11 in pid 18079 (3.0.24-SerNet-SuSE)
/var/log/messages-20070424.gz:Apr 23 12:22:57 host smbd[16893]:   INTERNAL ERROR: Signal 11 in pid 16893 (3.0.24-SerNet-SuSE)
/var/log/messages-20070424.gz:Apr 23 13:04:45 host smbd[17795]:   INTERNAL ERROR: Signal 11 in pid 17795 (3.0.24-SerNet-SuSE)
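
To look for a pattern in a listing like the one above, the matches can be tallied per calendar day. This is a minimal Python sketch; `crashes_per_day` is a hypothetical helper name, and the pattern assumes syslog lines in the format shown here, with or without the `file.gz:` prefix that zgrep adds.

```python
import re
from collections import Counter

# Syslog month and day, optionally preceded by zgrep's "file.gz:" prefix,
# on a line that reports the signal 11 panic.
SIG11 = re.compile(
    r"(?:^|:)(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+\d{2}:\d{2}:\d{2}\s"
    r".*INTERNAL ERROR: Signal 11"
)


def crashes_per_day(lines):
    """Count signal 11 reports per (month, day) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = SIG11.search(line)
        if m:
            counts[(m.group("month"), int(m.group("day")))] += 1
    return counts
```

Passing `sys.stdin` as `lines` lets the zgrep output be piped straight in; days with several crashes can then be compared against load or backup schedules on the host.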

