Issues with smbd hanging; smbstatus
Elliot Wasem
e.wasem at opendrives.com
Tue Sep 5 20:06:29 UTC 2023
Hello,
I am having trouble debugging some uniquely strange behavior that occurred
on Samba 4.13.13 Debian Buster. We’re currently scratching out head about
it, so hopefully it’s familiar to someone on the lists!
We were experiencing the following symptoms:
When we ran the command "smbstatus", we observed that the command hung and
did not return. This was resolved by restarting the machine, though we’ve
been able to reproduce it a few times by invoking “smbstatus” again. While
smbstatus was hung, users could no longer use their active Samba mounts, or
mount new clients. This happened around the time that a smbd panic was
logged in syslog, and so we are curious if the panic is related. We also
noticed that smbd is reporting not being able to connect to winbind,
despite winbind seeming otherwise healthy according to systemd logs.
We were unable to successfully read the core dump via gdb, even after
installing debug symbols.
I have listed below some excerpts from syslog, and the GDB session I ran in
attempting to debug this issue.
As background, we had a few issues recently that may be related;
-
We were seeing many (over 2k occurrences) entries like the following
over the past few days, prior to the problems we encountered. To remedy
this, we increased the ulimit on open files to (2^16)-1. We did not see
this occur around the specific area of interest, but thought it was worth
mentioning here. The most recent occurrence of this prior to the primary
area of concern is 2 days before.
-
DATET11:56:00.017245,host1,smbd[2596270]: [DATE 11:56:00.016924, 0]
../../source3/smbd/open.c:827(fd_open)
-
DATET11:56:00.017394,host1,smbd[2596270]: Too many open files,
unable to open more! smbd's max open files = 16424
-
We also had an issue wherein winbind couldn’t always reliably reach its
DCs, and would sometimes be unable to connect for authentication. We
restricted winbind to use a subset of available DCs via the smb.conf
“password servers” setting and fixed the list of DCs that winbind would try
to connect to to a short list of known-good servers. This was resolved
after observing the “smbstatus” behavior and we have not performed further
testing on this issue since the setting was applied.
I am under the impression that "smbstatus" is merely revealing symptoms,
and is not the root cause of the issue. I believe this because we haven't
seen this issue elsewhere other than these systems, and "smbstatus" is used
regularly when we formulate SMB status reports, which generally works
without issue
I have a few main questions:
1. Does this give any insight at all into what may have happened, or
sound familiar to anyone?
2. The stack in the core file, as analyzed by GDB, is only a few items
long when analyzed with the "bt" command. I imagine the core dump facility
of SMBD probably isn’t configured correctly on my machine, or GDB isn’t
picking up the data correctly. Can anyone point me to documentation around
gathering and reading samba cores?
3. Any recommendations for further debug or diagnostic measures to take to
diagnose this issue? It’s a bit of an odd one, and the systems seem largely
correctly configured.
Further details listed below. Thank you for your time and consideration!
Thank you,
Elliot
SYSLOG ENTRIES:
DATET00:18:11.752254,host1,smbd[2254747]: [DATE 00:18:11.751973, 0]
../../source3/auth/auth_winbind.c:120(check_winbind_security)
DATET00:18:11.752729,host1,smbd[2254747]: check_winbind_security:
winbindd not running - but required as domain member:
NT_STATUS_NO_LOGON_SERVERS
DATET00:18:12.489698,host1,Mount(Mount_Management)[2259973]: INFO:
Mount_Management monitor : 0:159(smb_panic_log)
DATET00:18:13.341266,host1,smbd[69758]: [DATE 00:18:13.340801, 0]
../../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
DATET00:18:13.341445,host1,smbd[69758]: Bad talloc magic value - unknown
value
DATET00:18:13.341484,host1,smbd[69758]: [DATE 00:18:13.341090, 0]
../../lib/util/fault.c:159(smb_panic_log)
DATET00:18:13.341526,host1,smbd[69758]:
===============================================================
DATET00:18:13.341556,host1,smbd[69758]: [DATE 00:18:13.341106, 0]
../../lib/util/fault.c:160(smb_panic_log)
DATET00:18:13.341699,host1,smbd[69758]: INTERNAL ERROR: Bad talloc magic
value - unknown value in pid 69758 (4.13.13-Debian)
DATET00:18:13.341737,host1,smbd[69758]: [DATE 00:18:13.341128, 0]
../../lib/util/fault.c:164(smb_panic_log)
DATET00:18:13.341771,host1,smbd[69758]: If you are running a recent Samba
version, and if you think this problem is not yet fixed in the latest
versions, please consider reporting this bug, see
https://wiki.samba.org/index.php/Bug_Reporting
DATET00:18:13.341799,host1,smbd[69758]: [DATE 00:18:13.341142, 0]
../../lib/util/fault.c:169(smb_panic_log)
DATET00:18:13.341841,host1,smbd[69758]:
===============================================================
DATET00:18:13.341865,host1,smbd[69758]: [DATE 00:18:13.341157, 0]
../../lib/util/fault.c:170(smb_panic_log)
DATET00:18:13.341893,host1,smbd[69758]: PANIC (pid 69758): Bad talloc
magic value - unknown value in 4.13.13-Debian
DATET00:18:13.341921,host1,smbd[69758]: [DATE 00:18:13.341396, 0]
../../lib/util/fault.c:232(log_stack_trace)
DATET00:18:13.341949,host1,smbd[69758]: BACKTRACE:
DATET00:18:13.342215,host1,smbd[69758]: #0 log_stack_trace + 0x39
[ip=0x7f460eb27bd9] [sp=0x7ffff2fbe150]
DATET00:18:13.342928,host1,smbd[69758]: #1 smb_panic + 0x26
[ip=0x7f460eb27f56] [sp=0x7ffff2fbea90]
DATET00:18:13.343334,host1,smbd[69758]: #2 <unknown symbol>
[ip=0x7f460e442e58] [sp=0x7ffff2fbeba0]
DATET00:18:13.343334,host1,smbd[69758]: #3 smbd_request_guid + 0x242
[ip=0x7f460e915562] [sp=0x7ffff2fbebd0]
DATET00:18:13.345131,host1,smbd[69758]: #4
tevent_common_invoke_immediate_handler + 0x14a [ip=0x7f460e431b4a]
[sp=0x7ffff2fbebf0]
DATET00:18:13.345669,host1,smbd[69758]: #5 tevent_common_loop_immediate
+ 0x1a [ip=0x7f460e431b6a] [sp=0x7ffff2fbec50]
DATET00:18:13.346065,host1,smbd[69758]: #6 tevent_wakeup_recv + 0xe6c
[ip=0x7f460e43784c] [sp=0x7ffff2fbec70]
DATET00:18:13.346455,host1,smbd[69758]: #7
tevent_cleanup_pending_signal_handlers + 0xb7 [ip=0x7f460e435c07]
[sp=0x7ffff2fbecd0]
DATET00:18:13.346849,host1,smbd[69758]: #8 _tevent_loop_once + 0x94
[ip=0x7f460e430df4] [sp=0x7ffff2fbecf0]
DATET00:18:13.347250,host1,smbd[69758]: #9 tevent_common_loop_wait +
0x1b [ip=0x7f460e43109b] [sp=0x7ffff2fbed20]
DATET00:18:13.347639,host1,smbd[69758]: #10
tevent_cleanup_pending_signal_handlers + 0x57 [ip=0x7f460e435ba7]
[sp=0x7ffff2fbed40]
DATET00:18:13.348114,host1,smbd[69758]: #11 smbd_process + 0x828
[ip=0x7f460e909398] [sp=0x7ffff2fbed60]
DATET00:18:13.348195,host1,smbd[69758]: #12 _start + 0x2365
[ip=0x55dabd16e315] [sp=0x7ffff2fbedf0]
DATET00:18:13.348726,host1,smbd[69758]: #13
tevent_common_invoke_fd_handler + 0x7d [ip=0x7f460e43170d]
[sp=0x7ffff2fbeed0]
DATET00:18:13.349124,host1,smbd[69758]: #14 tevent_wakeup_recv + 0x1097
[ip=0x7f460e437a77] [sp=0x7ffff2fbef00]
DATET00:18:13.349528,host1,smbd[69758]: #15
tevent_cleanup_pending_signal_handlers + 0xb7 [ip=0x7f460e435c07]
[sp=0x7ffff2fbef60]
DATET00:18:13.349914,host1,smbd[69758]: #16 _tevent_loop_once + 0x94
[ip=0x7f460e430df4] [sp=0x7ffff2fbef80]
DATET00:18:13.350297,host1,smbd[69758]: #17 tevent_common_loop_wait +
0x1b [ip=0x7f460e43109b] [sp=0x7ffff2fbefb0]
DATET00:18:13.350676,host1,smbd[69758]: #18
tevent_cleanup_pending_signal_handlers + 0x57 [ip=0x7f460e435ba7]
[sp=0x7ffff2fbefd0]
DATET00:18:13.350719,host1,smbd[69758]: #19 main + 0x1ddf
[ip=0x55dabd16be3f] [sp=0x7ffff2fbeff0]
DATET00:18:13.351315,host1,smbd[69758]: #20 __libc_start_main + 0xea
[ip=0x7f460e116d0a] [sp=0x7ffff2fbf3d0]
DATET00:18:13.351378,host1,smbd[69758]: #21 _start + 0x2a
[ip=0x55dabd16bfda] [sp=0x7ffff2fbf4a0]
DATET00:18:13.351569,host1,smbd[69758]: [DATE 00:18:13.351383, 0]
../../source3/lib/dumpcore.c:315(dump_core)
DATET00:18:13.351607,host1,smbd[69758]: dumping core in
/var/log/samba/cores/smbd
DATET00:18:13.351631,host1,smbd[69758]:
GDB BACKTRACE:
$ gdb --exec=$(which smbd) --core=nfl-smbd-core --readnow
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html
>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
warning: Can't open file /var/lib/samba/private/passdb.tdb during
file-backed mapping note processing
warning: Can't open file /run/samba/smbXsrv_open_global.tdb during
file-backed mapping note processing
warning: Can't open file /run/samba/smbXsrv_tcon_global.tdb during
file-backed mapping note processing
warning: Can't open file /run/samba/smbXsrv_session_global.tdb during
file-backed mapping note processing
warning: Can't open file /var/lib/samba/account_policy.tdb during
file-backed mapping note processing
warning: Can't open file /var/lib/samba/group_mapping.tdb during
file-backed mapping note processing
warning: Can't open file /run/samba/smbXsrv_client_global.tdb during
file-backed mapping note processing
warning: Can't open file /var/lib/samba/share_info.tdb during file-backed
mapping note processing
warning: Can't open file /var/lib/samba/private/secrets.tdb during
file-backed mapping note processing
warning: Can't open file /run/samba/smbXsrv_version_global.tdb during
file-backed mapping note processing
warning: Can't open file /run/samba/names.tdb during file-backed mapping
note processing
[New LWP 69758]
[New LWP 2260102]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/smbd --foreground --no-process-group'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f460e12bce1 in __modf (x=1.5810100666919889e-322, iptr=0x2) at
../sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c:40
40 ../sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c: No such file or
directory.
[Current thread is 1 (Thread 0x7f460a01fa80 (LWP 69758))]
(gdb) bt
#0 0x00007f460e12bce1 in __modf (x=1.5810100666919889e-322, iptr=0x2) at
../sysdeps/ieee754/dbl-64/wordsize-64/s_modf.c:40
#1 0x0000000000001880 in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f460713c700 (LWP 2260102))]
#0 futex_abstimed_wait_cancelable (private=0, abstime=0x7f460713bb60,
clockid=118733536, expected=0, futex_word=0x55dabeb4f104) at
../sysdeps/nptl/futex-internal.h:323
323 ../sysdeps/nptl/futex-internal.h: No such file or directory.
(gdb) bt
#0 futex_abstimed_wait_cancelable (private=0, abstime=0x7f460713bb60,
clockid=118733536, expected=0, futex_word=0x55dabeb4f104) at
../sysdeps/nptl/futex-internal.h:323
#1 __pthread_cond_wait_common (abstime=0x7f460713bb60, clockid=118733536,
mutex=0x55dabeb4f0b0, cond=0x55dabeb4f0d8) at pthread_cond_wait.c:520
#2 __pthread_cond_timedwait (cond=cond at entry=0x55dabeb4f0d8,
mutex=mutex at entry=0x55dabeb4f0b0, abstime=abstime at entry=0x7f460713bb60) at
pthread_cond_wait.c:656
#3 0x00007f460dadb1e2 in pthreadpool_server (arg=0x55dabeb4f0a0) at
../../lib/pthreadpool/pthreadpool.c:595
#4 0x00007f460e40fea7 in start_thread (arg=<optimized out>) at
pthread_create.c:477
#5 0x00007f460e1efaef in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x0000000000000000 in ?? ()
TESTPARM:
Load smb config files from /etc/samba/smb.conf
Loaded services file OK.
Weak crypto is allowed
WARNING: The 'netbios name' is too long (max. 15 chars).
Server role: ROLE_DOMAIN_MEMBER
Press enter to see a dump of your service definitions
# Global parameters
[global]
disable spoolss = Yes
dns proxy = No
load printers = No
local master = No
log file = /var/log/samba/clients/%m.log
logging = syslog at 0 file
max log size = 100
max stat cache size = 65536
max xmit = 1048676
name resolve order = host wins bcast
ntlm auth = ntlmv1-permitted
printcap name = /dev/null
realm = REALM
security = ADS
server multi channel support = Yes
socket options = IPTOS_LOWDELAY TCP_NODELAY
template homedir = /home/%U
usershare path =
winbind use default domain = Yes
workgroup = US
idmap config us : range = 10000-1999999
idmap config us : backend = rid
idmap config * : range = 3000-7999
idmap config * : backend = tdb
include = /etc/samba/smb-shares.conf.local
printing = bsd
[share1]
access based share enum = Yes
case sensitive = Yes
create mask = 0666
delete veto files = Yes
directory mask = 0777
mangled names = no
path = /P01/share1
read only = No
strict sync = No
use sendfile = Yes
vfs objects = catia fruit streams_xattr acl_xattr aio_pthread
readdir_attr:aapl_rsize = no
readdir_attr:aapl_max_access = yes
readdir_attr:aapl_finder_info = yes
fruit:zero_file_id = yes
fruit:wipe_intentionally_left_blank_rfork = yes
fruit:veto_appledouble = no
fruit:resource = file
fruit:posix_rename = yes
fruit:nfs_aces = yes
fruit:model = MacSamba
fruit:metadata = netatalk
fruit:locking = none
fruit:encoding = private
fruit:delete_empty_adfiles = yes
fruit:copyfile = yes
fruit:aapl = yes
[share2]
access based share enum = Yes
case sensitive = Yes
create mask = 0666
delete veto files = Yes
directory mask = 0777
mangled names = no
path = /P01/share2
read only = No
strict sync = No
use sendfile = Yes
vfs objects = catia fruit streams_xattr acl_xattr aio_pthread
readdir_attr:aapl_rsize = no
readdir_attr:aapl_max_access = yes
readdir_attr:aapl_finder_info = yes
fruit:zero_file_id = yes
fruit:wipe_intentionally_left_blank_rfork = yes
fruit:veto_appledouble = no
fruit:resource = file
fruit:posix_rename = yes
fruit:nfs_aces = yes
fruit:model = MacSamba
fruit:metadata = netatalk
fruit:locking = none
fruit:encoding = private
fruit:delete_empty_adfiles = yes
fruit:copyfile = yes
fruit:aapl = yes
[share3]
access based share enum = Yes
case sensitive = Yes
create mask = 0666
delete veto files = Yes
directory mask = 0777
mangled names = no
path = /P01/share3
read only = No
strict sync = No
use sendfile = Yes
vfs objects = fileid catia streams_xattr fruit acl_xattr aio_pthread
fileid:algorithm = fsname
readdir_attr:aapl_rsize = no
readdir_attr:aapl_max_access = yes
readdir_attr:aapl_finder_info = yes
fruit:zero_file_id = yes
fruit:wipe_intentionally_left_blank_rfork = yes
fruit:veto_appledouble = no
fruit:resource = file
fruit:posix_rename = yes
fruit:nfs_aces = yes
fruit:model = MacSamba
fruit:metadata = netatalk
fruit:locking = none
fruit:encoding = private
fruit:delete_empty_adfiles = yes
fruit:copyfile = yes
fruit:aapl = yes
--
<https://opendrives.ac-page.com/ibc2023?test=true&utm_source=AC&utm_medium=emailsig&utm_campaign=IBC2023>
Need a free guest pass? Use code: *IBC8340*.
More information about the samba-technical
mailing list