Issues with smbd hanging; smbstatus

Elliot Wasem e.wasem at opendrives.com
Tue Sep 5 20:06:29 UTC 2023


Hello,


I am having trouble debugging some uniquely strange behavior that occurred
on Samba 4.13.13 Debian Buster. We’re currently scratching out head about
it, so hopefully it’s familiar to someone on the lists!

We were experiencing the following symptoms:

When we ran the command "smbstatus", we observed that the command hung and
did not return. This was resolved by restarting the machine, though we’ve
been able to reproduce it a few times by invoking “smbstatus” again. While
smbstatus was hung, users could no longer use their active Samba mounts, or
mount new clients. This happened around the time that a smbd panic was
logged in syslog, and so we are curious if the panic is related. We also
noticed that smbd is reporting not being able to connect to winbind,
despite winbind seeming otherwise healthy according to systemd logs.

We were unable to successfully read the core dump via gdb, even after
installing debug symbols.

I have listed below some excerpts from syslog, and the GDB session I ran in
attempting to debug this issue.

As background, we had a few issues recently that may be related;

   -

   We were seeing many (over 2k occurrences) entries like the following
   over the past few days, prior to the problems we encountered. To remedy
   this, we increased the ulimit on open files to (2^16)-1. We did not see
   this occur around the specific area of interest, but thought it was worth
   mentioning here. The most recent occurrence of this prior to the primary
   area of concern is 2 days before.
   -

      DATET11:56:00.017245,host1,smbd[2596270]: [DATE 11:56:00.016924,  0]
      ../../source3/smbd/open.c:827(fd_open)
      -

      DATET11:56:00.017394,host1,smbd[2596270]:   Too many open files,
      unable to open more!  smbd's max open files = 16424
      -

   We also had an issue wherein winbind couldn’t always reliably reach its
   DCs, and would sometimes be unable to connect for authentication. We
   restricted winbind to use a subset of available DCs via the smb.conf
   “password servers” setting and fixed the list of DCs that winbind would try
   to connect to to a short list of known-good servers. This was resolved
   after observing the “smbstatus” behavior and we have not performed further
   testing on this issue since the setting was applied.



I am under the impression that "smbstatus" is merely revealing symptoms,
and is not the root cause of the issue. I believe this because we haven't
seen this issue elsewhere other than these systems, and "smbstatus" is used
regularly when we formulate SMB status reports, which generally works
without issue



I have a few main questions:

    1. Does this give any insight at all into what may have happened, or
sound familiar to anyone?

    2. The stack in the core file, as analyzed by GDB, is only a few items
long when analyzed with the "bt" command. I imagine the core dump facility
of SMBD probably isn’t configured correctly on my machine, or GDB isn’t
picking up the data correctly. Can anyone point me to documentation around
gathering and reading samba cores?

3. Any recommendations for further debug or diagnostic measures to take to
diagnose this issue? It’s a bit of an odd one, and the systems seem largely
correctly configured.


Further details listed below. Thank you for your time and consideration!

Thank you,

Elliot

SYSLOG ENTRIES:

DATET00:18:11.752254,host1,smbd[2254747]: [DATE 00:18:11.751973,  0]
../../source3/auth/auth_winbind.c:120(check_winbind_security)

DATET00:18:11.752729,host1,smbd[2254747]:   check_winbind_security:
winbindd not running - but required as domain member:
NT_STATUS_NO_LOGON_SERVERS

DATET00:18:12.489698,host1,Mount(Mount_Management)[2259973]: INFO:
Mount_Management monitor : 0:159(smb_panic_log)

DATET00:18:13.341266,host1,smbd[69758]: [DATE 00:18:13.340801,  0]
../../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)

DATET00:18:13.341445,host1,smbd[69758]:   Bad talloc magic value - unknown
value

DATET00:18:13.341484,host1,smbd[69758]: [DATE 00:18:13.341090,  0]
../../lib/util/fault.c:159(smb_panic_log)

DATET00:18:13.341526,host1,smbd[69758]:
===============================================================

DATET00:18:13.341556,host1,smbd[69758]: [DATE 00:18:13.341106,  0]
../../lib/util/fault.c:160(smb_panic_log)

DATET00:18:13.341699,host1,smbd[69758]:   INTERNAL ERROR: Bad talloc magic
value - unknown value in pid 69758 (4.13.13-Debian)

DATET00:18:13.341737,host1,smbd[69758]: [DATE 00:18:13.341128,  0]
../../lib/util/fault.c:164(smb_panic_log)

DATET00:18:13.341771,host1,smbd[69758]:   If you are running a recent Samba
version, and if you think this problem is not yet fixed in the latest
versions, please consider reporting this bug, see
https://wiki.samba.org/index.php/Bug_Reporting

DATET00:18:13.341799,host1,smbd[69758]: [DATE 00:18:13.341142,  0]
../../lib/util/fault.c:169(smb_panic_log)

DATET00:18:13.341841,host1,smbd[69758]:
===============================================================

DATET00:18:13.341865,host1,smbd[69758]: [DATE 00:18:13.341157,  0]
../../lib/util/fault.c:170(smb_panic_log)

DATET00:18:13.341893,host1,smbd[69758]:   PANIC (pid 69758): Bad talloc
magic value - unknown value in 4.13.13-Debian

DATET00:18:13.341921,host1,smbd[69758]: [DATE 00:18:13.341396,  0]
../../lib/util/fault.c:232(log_stack_trace)

DATET00:18:13.341949,host1,smbd[69758]:   BACKTRACE:

DATET00:18:13.342215,host1,smbd[69758]:    #0 log_stack_trace + 0x39
[ip=0x7f460eb27bd9] [sp=0x7ffff2fbe150]

DATET00:18:13.342928,host1,smbd[69758]:    #1 smb_panic + 0x26
[ip=0x7f460eb27f56] [sp=0x7ffff2fbea90]

DATET00:18:13.343334,host1,smbd[69758]:    #2 <unknown symbol>
[ip=0x7f460e442e58] [sp=0x7ffff2fbeba0]

DATET00:18:13.343334,host1,smbd[69758]:    #3 smbd_request_guid + 0x242
[ip=0x7f460e915562] [sp=0x7ffff2fbebd0]

DATET00:18:13.345131,host1,smbd[69758]:    #4
tevent_common_invoke_immediate_handler + 0x14a [ip=0x7f460e431b4a]
[sp=0x7ffff2fbebf0]

DATET00:18:13.345669,host1,smbd[69758]:    #5 tevent_common_loop_immediate
+ 0x1a [ip=0x7f460e431b6a] [sp=0x7ffff2fbec50]

DATET00:18:13.346065,host1,smbd[69758]:    #6 tevent_wakeup_recv + 0xe6c
[ip=0x7f460e43784c] [sp=0x7ffff2fbec70]

DATET00:18:13.346455,host1,smbd[69758]:    #7
tevent_cleanup_pending_signal_handlers + 0xb7 [ip=0x7f460e435c07]
[sp=0x7ffff2fbecd0]

DATET00:18:13.346849,host1,smbd[69758]:    #8 _tevent_loop_once + 0x94
[ip=0x7f460e430df4] [sp=0x7ffff2fbecf0]

DATET00:18:13.347250,host1,smbd[69758]:    #9 tevent_common_loop_wait +
0x1b [ip=0x7f460e43109b] [sp=0x7ffff2fbed20]

DATET00:18:13.347639,host1,smbd[69758]:    #10
tevent_cleanup_pending_signal_handlers + 0x57 [ip=0x7f460e435ba7]
[sp=0x7ffff2fbed40]

DATET00:18:13.348114,host1,smbd[69758]:    #11 smbd_process + 0x828
[ip=0x7f460e909398] [sp=0x7ffff2fbed60]

DATET00:18:13.348195,host1,smbd[69758]:    #12 _start + 0x2365
[ip=0x55dabd16e315] [sp=0x7ffff2fbedf0]

DATET00:18:13.348726,host1,smbd[69758]:    #13
tevent_common_invoke_fd_handler + 0x7d [ip=0x7f460e43170d]
[sp=0x7ffff2fbeed0]

DATET00:18:13.349124,host1,smbd[69758]:    #14 tevent_wakeup_recv + 0x1097
[ip=0x7f460e437a77] [sp=0x7ffff2fbef00]

DATET00:18:13.349528,host1,smbd[69758]:    #15
tevent_cleanup_pending_signal_handlers + 0xb7 [ip=0x7f460e435c07]
[sp=0x7ffff2fbef60]

DATET00:18:13.349914,host1,smbd[69758]:    #16 _tevent_loop_once + 0x94
[ip=0x7f460e430df4] [sp=0x7ffff2fbef80]

DATET00:18:13.350297,host1,smbd[69758]:    #17 tevent_common_loop_wait +
0x1b [ip=0x7f460e43109b] [sp=0x7ffff2fbefb0]

DATET00:18:13.350676,host1,smbd[69758]:    #18
tevent_cleanup_pending_signal_handlers + 0x57 [ip=0x7f460e435ba7]
[sp=0x7ffff2fbefd0]

DATET00:18:13.350719,host1,smbd[69758]:    #19 main + 0x1ddf
[ip=0x55dabd16be3f] [sp=0x7ffff2fbeff0]

DATET00:18:13.351315,host1,smbd[69758]:    #20 __libc_start_main + 0xea
[ip=0x7f460e116d0a] [sp=0x7ffff2fbf3d0]

DATET00:18:13.351378,host1,smbd[69758]:    #21 _start + 0x2a
[ip=0x55dabd16bfda] [sp=0x7ffff2fbf4a0]

DATET00:18:13.351569,host1,smbd[69758]: [DATE 00:18:13.351383,  0]
../../source3/lib/dumpcore.c:315(dump_core)

DATET00:18:13.351607,host1,smbd[69758]:   dumping core in
/var/log/samba/cores/smbd

DATET00:18:13.351631,host1,smbd[69758]:

GDB BACKTRACE:

$ gdb --exec=$(which smbd) --core=nfl-smbd-core --readnow

GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git

Copyright (C) 2021 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html
>

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law.

Type "show copying" and "show warranty" for details.

This GDB was configured as "x86_64-linux-gnu".

Type "show configuration" for configuration details.

For bug reporting instructions, please see:

<https://www.gnu.org/software/gdb/bugs/>.

Find the GDB manual and other documentation resources online at:

    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".

Type "apropos word" to search for commands related to "word".

warning: Can't open file /var/lib/samba/private/passdb.tdb during
file-backed mapping note processing

warning: Can't open file /run/samba/smbXsrv_open_global.tdb during
file-backed mapping note processing

warning: Can't open file /run/samba/smbXsrv_tcon_global.tdb during
file-backed mapping note processing

warning: Can't open file /run/samba/smbXsrv_session_global.tdb during
file-backed mapping note processing

warning: Can't open file /var/lib/samba/account_policy.tdb during
file-backed mapping note processing

warning: Can't open file /var/lib/samba/group_mapping.tdb during
file-backed mapping note processing

warning: Can't open file /run/samba/smbXsrv_client_global.tdb during
file-backed mapping note processing

warning: Can't open file /var/lib/samba/share_info.tdb during file-backed
mapping note processing

warning: Can't open file /var/lib/samba/private/secrets.tdb during
file-backed mapping note processing

warning: Can't open file /run/samba/smbXsrv_version_global.tdb during
file-backed mapping note processing

warning: Can't open file /run/samba/names.tdb during file-backed mapping
note processing

[New LWP 69758]

[New LWP 2260102]

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Core was generated by `/usr/sbin/smbd --foreground --no-process-group'.

Program terminated with signal SIGABRT, Aborted.

#0  0x00007f460e12bce1 in __modf (x=1.5810100666919889e-322, iptr=0x2) at
../sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c:40

40  ../sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c: No such file or
directory.

[Current thread is 1 (Thread 0x7f460a01fa80 (LWP 69758))]

(gdb) bt

#0  0x00007f460e12bce1 in __modf (x=1.5810100666919889e-322, iptr=0x2) at
../sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c:40

#1  0x0000000000001880 in ?? ()

#2  0x0000000000000000 in ?? ()

(gdb) thread 2

[Switching to thread 2 (Thread 0x7f460713c700 (LWP 2260102))]

#0  futex_abstimed_wait_cancelable (private=0, abstime=0x7f460713bb60,
clockid=118733536, expected=0, futex_word=0x55dabeb4f104) at
../sysdeps/nptl/futex-internal.h:323

323 ../sysdeps/nptl/futex-internal.h: No such file or directory.

(gdb) bt

#0  futex_abstimed_wait_cancelable (private=0, abstime=0x7f460713bb60,
clockid=118733536, expected=0, futex_word=0x55dabeb4f104) at
../sysdeps/nptl/futex-internal.h:323

#1  __pthread_cond_wait_common (abstime=0x7f460713bb60, clockid=118733536,
mutex=0x55dabeb4f0b0, cond=0x55dabeb4f0d8) at pthread_cond_wait.c:520

#2  __pthread_cond_timedwait (cond=cond at entry=0x55dabeb4f0d8,
mutex=mutex at entry=0x55dabeb4f0b0, abstime=abstime at entry=0x7f460713bb60) at
pthread_cond_wait.c:656

#3  0x00007f460dadb1e2 in pthreadpool_server (arg=0x55dabeb4f0a0) at
../../lib/pthreadpool/pthreadpool.c:595

#4  0x00007f460e40fea7 in start_thread (arg=<optimized out>) at
pthread_create.c:477

#5  0x00007f460e1efaef in ?? () from /lib/x86_64-linux-gnu/libc.so.6

#6  0x0000000000000000 in ?? ()

TESTPARM:
Load smb config files from /etc/samba/smb.conf

Loaded services file OK.

Weak crypto is allowed

WARNING: The 'netbios name' is too long (max. 15 chars).

Server role: ROLE_DOMAIN_MEMBER

Press enter to see a dump of your service definitions

# Global parameters

[global]

disable spoolss = Yes

dns proxy = No

load printers = No

local master = No

log file = /var/log/samba/clients/%m.log

logging = syslog at 0 file

max log size = 100

max stat cache size = 65536

max xmit = 1048676

name resolve order = host wins bcast

ntlm auth = ntlmv1-permitted

printcap name = /dev/null

realm = REALM

security = ADS

server multi channel support = Yes

socket options = IPTOS_LOWDELAY TCP_NODELAY

template homedir = /home/%U

usershare path =

winbind use default domain = Yes

workgroup = US

idmap config us : range = 10000-1999999

idmap config us : backend = rid

idmap config * : range = 3000-7999

idmap config * : backend = tdb

include = /etc/samba/smb-shares.conf.local

printing = bsd


[share1]

access based share enum = Yes

case sensitive = Yes

create mask = 0666

delete veto files = Yes

directory mask = 0777

mangled names = no

path = /P01/share1

read only = No

strict sync = No

use sendfile = Yes

vfs objects = catia fruit streams_xattr acl_xattr aio_pthread

readdir_attr:aapl_rsize = no

readdir_attr:aapl_max_access = yes

readdir_attr:aapl_finder_info = yes

fruit:zero_file_id = yes

fruit:wipe_intentionally_left_blank_rfork = yes

fruit:veto_appledouble = no

fruit:resource = file

fruit:posix_rename = yes

fruit:nfs_aces = yes

fruit:model = MacSamba

fruit:metadata = netatalk

fruit:locking = none

fruit:encoding = private

fruit:delete_empty_adfiles = yes

fruit:copyfile = yes

fruit:aapl = yes


[share2]

access based share enum = Yes

case sensitive = Yes

create mask = 0666

delete veto files = Yes

directory mask = 0777

mangled names = no

path = /P01/share2

read only = No

strict sync = No

use sendfile = Yes

vfs objects = catia fruit streams_xattr acl_xattr aio_pthread

readdir_attr:aapl_rsize = no

readdir_attr:aapl_max_access = yes

readdir_attr:aapl_finder_info = yes

fruit:zero_file_id = yes

fruit:wipe_intentionally_left_blank_rfork = yes

fruit:veto_appledouble = no

fruit:resource = file

fruit:posix_rename = yes

fruit:nfs_aces = yes

fruit:model = MacSamba

fruit:metadata = netatalk

fruit:locking = none

fruit:encoding = private

fruit:delete_empty_adfiles = yes

fruit:copyfile = yes

fruit:aapl = yes


[share3]

access based share enum = Yes

case sensitive = Yes

create mask = 0666

delete veto files = Yes

directory mask = 0777

mangled names = no

path = /P01/share3

read only = No

strict sync = No

use sendfile = Yes

vfs objects = fileid catia streams_xattr fruit acl_xattr aio_pthread

fileid:algorithm = fsname

readdir_attr:aapl_rsize = no

readdir_attr:aapl_max_access = yes

readdir_attr:aapl_finder_info = yes

fruit:zero_file_id = yes

fruit:wipe_intentionally_left_blank_rfork = yes

fruit:veto_appledouble = no

fruit:resource = file

fruit:posix_rename = yes

fruit:nfs_aces = yes

fruit:model = MacSamba

fruit:metadata = netatalk

fruit:locking = none

fruit:encoding = private

fruit:delete_empty_adfiles = yes

fruit:copyfile = yes

fruit:aapl = yes

-- 
 
<https://opendrives.ac-page.com/ibc2023?test=true&utm_source=AC&utm_medium=emailsig&utm_campaign=IBC2023>


Need a free guest pass? Use code: *IBC8340*.




More information about the samba-technical mailing list