[Samba] Re: Intermittent "INTERNAL ERROR: Signal 11" with 3.0.24

Joe Murphy joe.murphy at clear.net.nz
Wed Jun 13 04:46:01 GMT 2007


Hi all

Following up on my earlier post (quoted below): we've been able
to capture a gdb backtrace. Can anyone offer guidance on what it
means? See below:

(gdb) bt
#0  0xffffe410 in ?? ()
#1  0x00000001 in ?? ()
#2  0x00000000 in ?? ()
#3  0xbfffc9d8 in ?? ()
#4  0x402b36e3 in __waitpid_nocancel () from /lib/tls/libc.so.6
#5  0x4025ef58 in do_system () from /lib/tls/libc.so.6
#6  0x402268dd in system () from /lib/tls/libpthread.so.0
#7  0x0822b612 in smb_panic (why=0x0) at lib/util.c:1608
#8  0x08219b3f in fault_report (sig=-512) at lib/fault.c:47
#9  0x08219b50 in sig_fault (sig=-512) at lib/fault.c:70
#10 <signal handler called>
#11 0x40292d1b in strlen () from /lib/tls/libc.so.6
#12 0x40268242 in vfprintf () from /lib/tls/libc.so.6
#13 0x40285e76 in vsnprintf () from /lib/tls/libc.so.6
#14 0x08219956 in dbgtext (format_str=0x6d2e5c73 "") at lib/debug.c:1011
#15 0x0825b360 in oplock_timeout_handler (te=0x844ce10, now=0xbfffd9c0,
    private_data=0x84492f0) at smbd/oplock.c:351
#16 0x08242d7d in run_events () at lib/events.c:102
#17 0x080f2801 in receive_message_or_smb (buffer=0x40433008 "",
    buffer_len=131137, timeout=60000) at smbd/process.c:457
#18 0x080f4122 in smbd_process () at smbd/process.c:1649
#19 0x082beea9 in main (argc=1831754867, argv=0xbfffdd34) at smbd/server.c:1024
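
One detail in the trace may be worth a closer look: format_str in
frame #14 (0x6d2e5c73) and argc in frame #19 (1831754867) appear to be
the same 32-bit number, and its bytes are printable ASCII. Below is a
minimal Python sketch to check this. The helper name decode_le is only
illustrative, and it assumes a little-endian 32-bit x86 host, which the
addresses in the trace suggest but do not prove.

```python
import struct

def decode_le(value: int) -> str:
    """Interpret a 32-bit value as four little-endian bytes of ASCII text."""
    return struct.pack("<I", value).decode("ascii", errors="replace")

format_str = 0x6d2e5c73   # frame #14: dbgtext(format_str=...)
argc = 1831754867         # frame #19: main(argc=...)

print(hex(argc))                     # 0x6d2e5c73
print(argc == format_str)            # True
print(repr(decode_le(format_str)))   # 's\\.m', i.e. the characters s \ . m
```

If that reading is right, neither value is a real pointer or argument
count. They look like a fragment of string data, which could mean gdb
is showing stale register contents from an optimized build, or that
memory was overwritten before the DEBUG call at smbd/oplock.c:351.
This is a guess, not a diagnosis.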

This is similar to the following panic message recorded in
syslog:

Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] smbd/oplock.c:oplock_timeout_handler(351)
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(41)
Jun 13 12:57:29 uhti02 smbd[16322]:   ===============================================================
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(42)
Jun 13 12:57:29 uhti02 smbd[16322]:   INTERNAL ERROR: Signal 11 in pid 16322 (3.0.24-SerNet-SuSE)
Jun 13 12:57:29 uhti02 smbd[16322]:   Please read the Trouble-Shooting section of the Samba3-HOWTO
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(44)
Jun 13 12:57:29 uhti02 smbd[16322]:
Jun 13 12:57:29 uhti02 smbd[16322]:   From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(45)
Jun 13 12:57:29 uhti02 smbd[16322]:   ===============================================================
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/util.c:smb_panic(1599)
Jun 13 12:57:29 uhti02 smbd[16322]:   PANIC (pid 16322): internal error
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/util.c:log_stack_trace(1706)
Jun 13 12:57:29 uhti02 smbd[16322]:   BACKTRACE: 14 stack frames:
Jun 13 12:57:29 uhti02 smbd[16322]:    #0 /usr/sbin/smbd(log_stack_trace+0x22) [0x822b6fb]
Jun 13 12:57:29 uhti02 smbd[16322]:    #1 /usr/sbin/smbd(smb_panic+0x6f) [0x822b59a]
Jun 13 12:57:29 uhti02 smbd[16322]:    #2 /usr/sbin/smbd [0x8219b3f]
Jun 13 12:57:29 uhti02 smbd[16322]:    #3 /usr/sbin/smbd [0x8219b50]
Jun 13 12:57:29 uhti02 smbd[16322]:    #4 [0xffffe420]
Jun 13 12:57:29 uhti02 smbd[16322]:    #5 /lib/tls/libc.so.6(vsnprintf+0xb6) [0x40285e76]
Jun 13 12:57:29 uhti02 smbd[16322]:    #6 /usr/sbin/smbd(dbgtext+0x2e) [0x8219956]
Jun 13 12:57:29 uhti02 smbd[16322]:    #7 /usr/sbin/smbd [0x825b360]
Jun 13 12:57:29 uhti02 smbd[16322]:    #8 /usr/sbin/smbd(run_events+0x15f) [0x8242d7d]
Jun 13 12:57:29 uhti02 smbd[16322]:    #9 /usr/sbin/smbd [0x80f2801]
Jun 13 12:57:29 uhti02 smbd[16322]:    #10 /usr/sbin/smbd(smbd_process+0x10e) [0x80f4122]
Jun 13 12:57:29 uhti02 smbd[16322]:    #11 /usr/sbin/smbd(main+0x946) [0x82beea9]
Jun 13 12:57:29 uhti02 smbd[16322]:    #12 /lib/tls/libc.so.6(__libc_start_main+0xd0) [0x40240210]
Jun 13 12:57:29 uhti02 smbd[16322]:    #13 /usr/sbin/smbd [0x808ceb1]
Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/util.c:smb_panic(1607)
Jun 13 12:57:29 uhti02 smbd[16322]:   smb_panic(): calling panic action [/bin/sleep 90000]
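
To look for a pattern in when the crashes occur, the INTERNAL ERROR
lines could be tallied per host and day. The following is a rough
Python sketch, assuming each syslog entry sits on a single line (as in
the real log file, before mail wrapping). The function name
count_panics and the sample list are illustrative only.

```python
import re
from collections import Counter

# Matches lines such as:
# "Jun 13 12:57:29 uhti02 smbd[16322]:   INTERNAL ERROR: Signal 11 in pid 16322 (...)"
PANIC_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+[\d:]{8}\s+(?P<host>\S+)\s+"
    r"smbd\[\d+\]:\s+INTERNAL ERROR: Signal \d+"
)

def count_panics(lines):
    """Count smbd INTERNAL ERROR lines, keyed by (host, month, day)."""
    counts = Counter()
    for line in lines:
        m = PANIC_RE.match(line)
        if m:
            counts[(m["host"], m["month"], int(m["day"]))] += 1
    return counts

# Two lines taken from the log above; only the second is a panic line.
sample = [
    "Jun 13 12:57:29 uhti02 smbd[16322]: [2007/06/13 12:57:29, 0] lib/fault.c:fault_report(42)",
    "Jun 13 12:57:29 uhti02 smbd[16322]:   INTERNAL ERROR: Signal 11 in pid 16322 (3.0.24-SerNet-SuSE)",
]

for (host, month, day), n in sorted(count_panics(sample).items()):
    print(host, month, day, n)
```

Against a real log, count_panics can be passed an open file object
instead of the sample list. The log file location depends on the
syslog configuration, so no path is assumed here.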

Versions:
Kernel: 2.6.5-7.97-bigsmp
smbd, nmbd, winbindd: Version 3.0.24-SerNet-SuSE

As I said earlier, this problem occurs intermittently, every
2-3 days, in 2 separate Samba installations, and when it
occurs Samba requires a restart to clear.

Much appreciated.

Joe




----- Original Message Follows -----
> Hi Samba list,
>  
> We're experiencing some issues with our Samba 3.0.24
> environments. Hopefully somebody can offer suggestions or
> guidance.
>  
> A bit of background. We have 3 application environments,
> which consist of a Samba host providing file sharing
> services to 7 Windows application servers.
>  
> These Samba hosts intermittently experience problems
> providing file sharing. So far we haven't established a
> pattern with the failures, so for now the best we can
> establish is that every couple of days a Samba host will
> experience an Internal Error (signal 11) in an smbd
> process.  From that point onwards the smbd process will
> operate unreliably, such that Windows clients will
> generally not be able to connect to the share, file copies
> that were underway will abort with errors, etc. All this
> will require a restart of the Linux host to clear, and
> once restarted things are fine.
>  
> All three environments are the same for hardware/OS and
> software. They operate independently of each other. All
> experience the same issue. Other than this issue we do not
> experience any other Samba problems, the file shares run
> without problems, until a signal 11 occurs.
>  
> - SuSE Enterprise Linux 9 (2.6.5-7.97-bigsmp)
> - Samba 3.0.24 
> - /data (total 1TB, .5TB in use) - /dev/sdc1 type ext3
> (rw,acl,user_xattr)
>  
> The signal 11 crashes appear to have started following our
> upgrading to Samba 3.0.24 in March 2007.
>  
> Example message attached in signal_11.txt. I've attached
> these instead of placing inline as my webmail has fixed
> width formatting which messes up the syslog line - hope
> this is okay.
>  
> Things we've tested:
>  
> - fsck
> - testparm
> - Samba config changes: 
>   kernel oplocks = no
>   oplocks = False
>   level2 oplocks = False
>  
> I thought I'd preemptively post this to the mailing list to
> see if anyone has experienced similar issues. I will post
> some 'gdb smb PID' output once I'm able to catch it.
>  
> Our suspicion is that this occurs under load, though we've
> not yet been able to reproduce the problem under testing.
> Upgrading to 3.0.25 is an option, although we'd like to do
> this once we've more clearly identified the cause and fix.
>  
> Finally, an example of the volume of errors we're
> experiencing (from a single host) is attached in
> volume.txt.
>  
> Happy to post other info.
>  
> Kind regards
> Joe Murphy
> Info Systems Technical Team
> joe.murphy at clear.net.nz
> 
> 
> [Attachment: signal_11.txt]
> [Attachment: volume.txt]

