[Samba] second dc not working properly

Jason Keltz jas at eecs.yorku.ca
Thu Dec 10 00:26:45 UTC 2020


On 12/9/2020 6:49 PM, Jason Keltz via samba wrote:
>
> On 12/9/2020 6:14 AM, Rowland penny via samba wrote:
>> On 08/12/2020 23:20, Jason Keltz via samba wrote:
>>>
>>> On 12/8/2020 4:35 PM, Rowland penny via samba wrote:
>>>> On 08/12/2020 21:09, Jason Keltz via samba wrote:
>>>>> I'm running Samba 4.11.16 on CentOS 7 and not having much luck 
>>>>> with failover to a second domain controller.  I could *really* use 
>>>>> some help.
>>>>>
>>>>> I know my Samba config is fine.  I know that adding the second 
>>>>> domain controller was fine.  Replication is working perfectly. No
>>>>> errors.   If I stop the DC processes on either server, Windows 
>>>>> clients appear to failover perfectly fine.
>>>>>
>>>>> The problem seems to affect my Linux clients (CentOS 7) running 
>>>>> winbind.
>>>>>
>>>>> Let's say a CentOS 7 client X is connected to dc2, and I stop the 
>>>>> DC processes on dc2....  The odd time, the client will connect to 
>>>>> dc1 almost right away, and everything just works the way it should 
>>>>> always work.
>>>>>
>>>>> However, most of the time, when I stop the DC processes on dc2, the
>>>>> client connects to dc1 and I can even run "wbinfo -u" or "wbinfo
>>>>> -g", but "whoami" reports "user doesn't exist". Somewhere between
>>>>> 20 and 50 minutes later, it just "magically" works.  The timing
>>>>> isn't consistent. Even a reboot doesn't fix things when
>>>>> it's in this state.
>>>>>
>>>>> I've tried to follow the Samba logs, but I really can't figure out 
>>>>> what's up.  Andrew? Jeremy? Anyone?
>>>>>
>>>>> I don't think this can be just my system.  I suspect there's a lot 
>>>>> of users out there running multiple DCs with a similar setup to 
>>>>> me, believing that it's all working, and maybe, because there 
>>>>> hasn't been a failure, everything works great, but who knows what 
>>>>> will happen when there's actually a failure.
>>>>>
>>>>> Jason.
>>>>>
>>>>>
>>>> Try adding these lines to the /etc/resolv.conf on the Linux clients:
>>>>
>>>> options rotate
>>>>
>>>> options timeout:1
>>>>
>>>> Rowland
>>>
>>>
>>> Hi Rowland,
>>>
>>>
>>> Here's something that may help jog your memory if you've heard of 
>>> this happening before.....
>>>
>>> So my machine was connected to dc2...  wbinfo -u is giving me 
>>> nothing now, yet wbinfo -g is working fine.
>>>
>>>
>>> This sure has me puzzled.
>>>
>> The fact that 'wbinfo -g' works, seems to suggest that the DC is 
>> being connected to, so why does 'wbinfo -u' not work ?
>>
>> Unfortunately you cannot select which DC to use with wbinfo, but you
>> can with net, so try this when 'wbinfo -u' doesn't work: net 
>> usersidlist -S  DCHOSTNAME
>>
>> Replace 'DCHOSTNAME' with the running DC's hostname.
>>
>> If you get output it shows that your DC is working and the problem 
>> lies elsewhere.
>>
>> Rowland
>
> Hi Rowland,
>
> There must have been a problem with mail on Samba list today.. you 
> sent your message at 6:14 AM, and I got it at night.. lol.
>
> I just tried again.
>
> Host is on dc2.
>
> I stopped DC services on dc2.
>
> I ran on host, wbinfo -u - it quickly gave me a whole user list. I ran 
> wbinfo -g - it quickly gave me a group list.
>
> In netstat output I see:
>
> host->dc1:ldap ESTABLISHED
>
> host->dc1:microsoft-ds ESTABLISHED
>
> Exactly as it should... But try to become me...
>
> # su - jas
> su: user jas does not exist
> # getent passwd jas
> <nothing>
>
> # net usersidlist -S dc1
> <hangs for 1 min 38 s> (there's over 4500 users)...
>
> Immediately following this, I could "getent passwd jas" and it worked!
>
> So I repeated the exact same test...
>
> Put DC2 back up.
>
> Connect host to dc2.
>
> Everything is working.
>
> I stopped DC services on dc2.
>
> I ran on host, wbinfo -u - it quickly gave me a whole user list. I ran 
> wbinfo -g - it quickly gave me a group list.
>
> In netstat output I see:
>
> host->dc1:ldap ESTABLISHED
>
> host->dc1:microsoft-ds ESTABLISHED
>
> Exactly as it should... But try to become me...
>
> # su - jas
> su: user jas does not exist
> # getent passwd jas
> <nothing>
>
> # net usersidlist -S dc1
> <hangs for 1 min 38 s> (there's over 4500 users)...
>
> But this time, "net usersidlist -S dc1" still takes the same amount of
> time to process the data, yet "su - jas" still doesn't work
> afterwards...
>
> I know it will eventually "unstick", and the host will be working on
> dc1 without me doing anything, but it will take some time... If I
> leave for a while and come back, it will be "fixed".
>
> There's definitely an issue between nsswitch and winbind.
>
> Slightly shrunk version of strace on "su - jas" when it's "broken":
>
>> 9862  execve("/bin/su", ["su", "-", "jas"], 0x7ffde4219898 /* 46 vars 
>> */) = 0
>> 9862  brk(NULL)                         = 0x56097cefa000
>> 9862  mmap(NULL, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc870a000
>> 9862  open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
>> 9862  fstat(3, {st_mode=S_IFREG|0644, st_size=209922, ...}) = 0
>> 9862  mmap(NULL, 209922, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fdcc86d6000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libpam.so.0", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`&\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=61680, ...}) = 0
>> 9862  mmap(NULL, 2155088, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc82db000
>> 9862  mprotect(0x7fdcc82e8000, 2097152, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc84e8000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd000) = 0x7fdcc84e8000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libpam_misc.so.0", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\20\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=15648, ...}) = 0
>> 9862  mmap(NULL, 2109752, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc80d7000
>> 9862  mprotect(0x7fdcc80da000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc82d9000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fdcc82d9000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`&\2\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=2156240, ...}) = 0
>> 9862  mmap(NULL, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc86d5000
>> 9862  mmap(NULL, 3985920, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc7d09000
>> 9862  mprotect(0x7fdcc7ecc000, 2097152, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc80cc000, 24576, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c3000) = 0x7fdcc80cc000
>> 9862  mmap(0x7fdcc80d2000, 16896, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fdcc80d2000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libaudit.so.1", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@2\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=127184, ...}) = 0
>> 9862  mmap(NULL, 2261896, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc7ae0000
>> 9862  mprotect(0x7fdcc7afe000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc7cfd000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d000) = 0x7fdcc7cfd000
>> 9862  mmap(0x7fdcc7cff000, 37768, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fdcc7cff000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\16\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=19248, ...}) = 0
>> 9862  mmap(NULL, 2109744, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc78dc000
>> 9862  mprotect(0x7fdcc78de000, 2097152, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc7ade000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fdcc7ade000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libcap-ng.so.0", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\25\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=23968, ...}) = 0
>> 9862  mmap(NULL, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc86d4000
>> 9862  mmap(NULL, 2118016, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc76d6000
>> 9862  mprotect(0x7fdcc76da000, 2097152, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc78da000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4000) = 0x7fdcc78da000
>> 9862  close(3)                          = 0
>> 9862  mmap(NULL, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc86d3000
>> 9862  mmap(NULL, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc86d1000
>> 9862  arch_prctl(ARCH_SET_FS, 0x7fdcc86d1780) = 0
>> 9862  mprotect(0x7fdcc80cc000, 16384, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc78da000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc7ade000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc7cfd000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc84e8000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc82d9000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x56097bff3000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc870b000, 4096, PROT_READ) = 0
>> 9862  munmap(0x7fdcc86d6000, 209922)    = 0
>> 9862  brk(NULL)                         = 0x56097cefa000
>> 9862  brk(0x56097cf1b000)               = 0x56097cf1b000
>> 9862  brk(NULL)                         = 0x56097cf1b000
>> 9862  open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
>> 9862  fstat(3, {st_mode=S_IFREG|0644, st_size=106172832, ...}) = 0
>> 9862  mmap(NULL, 106172832, PROT_READ, MAP_PRIVATE, 3, 0) = 
>> 0x7fdcc1194000
>> 9862  close(3)                          = 0
>> 9862  getuid()                          = 0
>> 9862  geteuid()                         = 0
>> 9862  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
>> 9862  close(3)                          = 0
>> 9862  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
>> 9862  close(3)                          = 0
>> 9862  open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
>> 9862  fstat(3, {st_mode=S_IFREG|0644, st_size=1677, ...}) = 0
>> 9862  mmap(NULL, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc8709000
>> 9862  read(3, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 4096) = 1677
>> 9862  read(3, "", 4096)                 = 0
>> 9862  close(3)                          = 0
>> 9862  munmap(0x7fdcc8709000, 4096)      = 0
>> 9862  open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
>> 9862  fstat(3, {st_mode=S_IFREG|0644, st_size=209922, ...}) = 0
>> 9862  mmap(NULL, 209922, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fdcc86d6000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260!\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=61560, ...}) = 0
>> 9862  mmap(NULL, 2173048, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc0f81000
>> 9862  mprotect(0x7fdcc0f8d000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc118c000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xb000) = 0x7fdcc118c000
>> 9862  mmap(0x7fdcc118e000, 22648, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fdcc118e000
>> 9862  close(3)                          = 0
>> 9862  mprotect(0x7fdcc118c000, 4096, PROT_READ) = 0
>> 9862  munmap(0x7fdcc86d6000, 209922)    = 0
>> 9862  open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
>> 9862  fstat(3, {st_mode=S_IFREG|0644, st_size=2461, ...}) = 0
>> 9862  mmap(NULL, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc8709000
>> 9862  read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 2461
>> 9862  read(3, "", 4096)                 = 0
>> 9862  close(3)                          = 0
>> 9862  munmap(0x7fdcc8709000, 4096)      = 0
>> 9862  open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
>> 9862  fstat(3, {st_mode=S_IFREG|0644, st_size=209922, ...}) = 0
>> 9862  mmap(NULL, 209922, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fdcc86d6000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libnss_winbind.so.2", O_RDONLY|O_CLOEXEC) = 3 
>> [note: /lib64/libnss_winbind.so.2 -> 
>> /xsys/pkg/samba/lib/libnss_winbind.so.2]
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\f\0\0\0\0\0\0"..., 
>> 832) = 832 <--- I wish I could understand what it is reading here...
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=18400, ...}) = 0
>> 9862  mmap(NULL, 2135848, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc0d77000
>> 9862  mprotect(0x7fdcc0d7a000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc0f79000, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fdcc0f79000
>> 9862  mmap(0x7fdcc0f7a000, 26408, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fdcc0f7a000
>> 9862  close(3)                          = 0
>> 9862 
>> open("/xsys/pkg/samba-4.11.16/lib/private/libwinbind-client-samba4.so", 
>> O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\16\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=18688, ...}) = 0
>> 9862  mmap(NULL, 2109576, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc0b73000
>> 9862  mprotect(0x7fdcc0b76000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc0d75000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fdcc0d75000
>> 9862  close(3)                          = 0
>> 9862 open("/xsys/pkg/samba-4.11.16/lib/private/libreplace-samba4.so", 
>> O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\r\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=13616, ...}) = 0
>> 9862  mmap(NULL, 2105352, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc0970000
>> 9862  mprotect(0x7fdcc0972000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc0b71000, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x7fdcc0b71000
>> 9862  mmap(0x7fdcc0b72000, 8, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fdcc0b72000
>> 9862  close(3)                          = 0
>> 9862  stat("/xsys/pkg/samba-4.11.16/lib", {st_mode=S_IFDIR|0755, 
>> st_size=4096, ...}) = 0
>> 9862  stat("/xsys/lib64", {st_mode=S_IFDIR|0755, st_size=12288, ...}) 
>> = 0
>> 9862  open("/lib64/libcrypt.so.1", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\16\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=40600, ...}) = 0
>> 9862  mmap(NULL, 2318912, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc0739000
>> 9862  mprotect(0x7fdcc0741000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc0940000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7000) = 0x7fdcc0940000
>> 9862  mmap(0x7fdcc0942000, 184896, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fdcc0942000
>> 9862  close(3)                          = 0
>> 9862  open("/lib64/libfreebl3.so", O_RDONLY|O_CLOEXEC) = 3
>> 9862  read(3, 
>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\v\0\0\0\0\0\0"..., 
>> 832) = 832
>> 9862  fstat(3, {st_mode=S_IFREG|0755, st_size=11392, ...}) = 0
>> 9862  mmap(NULL, 2105536, PROT_READ|PROT_EXEC, 
>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fdcc0536000
>> 9862  mprotect(0x7fdcc0538000, 2093056, PROT_NONE) = 0
>> 9862  mmap(0x7fdcc0737000, 8192, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x7fdcc0737000
>> 9862  close(3)                          = 0
>> 9862  mprotect(0x7fdcc0737000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc0940000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc0b71000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc0d75000, 4096, PROT_READ) = 0
>> 9862  mprotect(0x7fdcc0f79000, 4096, PROT_READ) = 0
>> 9862  munmap(0x7fdcc86d6000, 209922)    = 0
>> 9862  getpid()                          = 9862
>> 9862  lstat("/run/winbindd", {st_mode=S_IFDIR|0755, st_size=60, ...}) 
>> = 0
>> 9862  lstat("/run/winbindd/pipe", {st_mode=S_IFSOCK|0777, st_size=0, 
>> ...}) = 0
>> 9862  socket(AF_UNIX, SOCK_STREAM, 0)   = 3
>> 9862  fcntl(3, F_GETFL)                 = 0x2 (flags O_RDWR)
>> 9862  fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>> 9862  fcntl(3, F_GETFD)                 = 0
>> 9862  fcntl(3, F_SETFD, FD_CLOEXEC)     = 0
>> 9862  connect(3, {sa_family=AF_UNIX, sun_path="/run/winbindd/pipe"}, 
>> 110) = 0
>> 9862  poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 
>> ([{fd=3, revents=POLLOUT}])
>> 9862  write(3, 
>> "P\10\0\0\0\0\0\0\0\0\0\0\206&\0\0\0\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
>> 2128) = 2128
>> 9862  poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, 
>> revents=POLLIN}])
>> 9862  read(3, 
>> "\250\17\0\0\2\0\0\0\37\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
>> 4008) = 4008
>> 9862  poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 
>> ([{fd=3, revents=POLLOUT}])
>> 9862  write(3, 
>> "P\10\0\0\1\0\0\0\0\0\0\0\206&\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
>> 2128) = 2128
>> 9862  poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, 
>> revents=POLLIN}])
>> 9862  read(3, 
>> "\250\17\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
>> 4008) = 4008
>> 9862  open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 4
>> 9862  fstat(4, {st_mode=S_IFREG|0644, st_size=2502, ...}) = 0
>> 9862  mmap(NULL, 4096, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdcc8709000
>> 9862  read(4, "# Locale name alias data base.\n#"..., 4096) = 2502
>> 9862  read(4, "", 4096)                 = 0
>> 9862  close(4)                          = 0
>> 9862  munmap(0x7fdcc8709000, 4096)      = 0
>> 9862  write(2, "su: ", 4)               = 4
>> 9862  write(2, "user jas does not exist", 23) = 23
>> 9862  write(2, "\n", 1)                 = 1
>> 9862  close(1)                          = 0
>> 9862  close(2)                          = 0
>> 9862  close(3)                          = 0
>> 9862  exit_group(1)                     = ?
>> 9862  +++ exited with 1 +++
>
> I even stopped winbind on the client, deleted /run/winbindd, and 
> restarted winbindd... still nothing... (though wbinfo -u and wbinfo -g 
> are working)...
>
> I don't think there's a way to "reset" nsswitch, and there's not even 
> a cache because I'm not using nscd.
>
> Anything else to try?
>
> Jason. 

I think I'm on to something...  So I ran winbind in debug mode, 
interactive, and this is what I saw...

winbindd version 4.11.16 started.
Copyright Andrew Tridgell and the Samba Team 1992-2019
lp_load_ex: refreshing parameters
Initialising global parameters
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[global]"
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_DMALLOC_MARK and LOG_CHANGED
lp_load_ex: refreshing parameters
Initialising global parameters
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[global]"
added interface enp0s3 ip=130.63.97.152 bcast=130.63.97.255 
netmask=255.255.255.0
added interface enp0s3 ip=130.63.97.152 bcast=130.63.97.255 
netmask=255.255.255.0
tdb '/local/samba/locks/winbindd_cache.tdb' is valid
Created backup '/local/samba/locks/winbindd_cache.tdb.bak' of tdb 
'/local/samba/locks/winbindd_cache.tdb'
add_trusted_domain: Added domain [BUILTIN] [(null)] [S-1-5-32]
add_trusted_domain: Added domain [J2] [(null)] 
[S-1-5-21-4255622434-1312408701-3568591385]
add_trusted_domain: Added domain [EECSYORKUCA] [AD.EECS.YORKU.CA] 
[S-1-5-21-1981678738-1545235886-4256466701]
connection_ok: Connection to (null) for domain EECSYORKUCA is not connected
Successfully contacted LDAP server 130.63.94.66
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
Connecting to 130.63.94.66 at port 445
ldb_wrap open of secrets.ldb
GENSEC backend 'gssapi_spnego' registered
GENSEC backend 'gssapi_krb5' registered
GENSEC backend 'gssapi_krb5_sasl' registered
GENSEC backend 'spnego' registered
GENSEC backend 'schannel' registered
GENSEC backend 'naclrpc_as_system' registered
GENSEC backend 'sasl-EXTERNAL' registered
GENSEC backend 'ntlmssp' registered
GENSEC backend 'ntlmssp_resume_ccache' registered
GENSEC backend 'http_basic' registered
GENSEC backend 'http_ntlm' registered
GENSEC backend 'http_negotiate' registered
GENSEC backend 'krb5' registered
GENSEC backend 'fake_gssapi_krb5' registered
winbindd_dual_list_trusted_domains: [ 12149]: list trusted domains
ads: trusted_domains
ldb_wrap open of secrets.ldb
Connecting to 130.63.94.66 at port 135
Connecting to 130.63.94.66 at port 49152
Connecting to 130.63.94.66 at port 135
Connecting to 130.63.94.66 at port 49152
winbindd_dual_list_trusted_domains: [ 12149]: list trusted domains
ads: trusted_domains
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
Successfully contacted LDAP server 130.63.94.66
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"

So it looks like everything is settled... we're going to talk to dc1.
We even successfully contacted the LDAP server at 130.63.94.66 (dc1).

But then every time I do the "su", it changes its mind:

[jas@j2 jas]# su - jas
winbindd_interface_version: [nss_winbind (12180)]: request interface 
version (version = 31)
winbindd_getpwnam_send: [nss_winbind (12180)] getpwnam jas
Connecting to 130.63.94.66 at port 135
Connecting to 130.63.94.66 at port 49153
idmap backend ad not found

load_module_absolute_path: Module 
'/xsys/pkg/samba-4.11.16/lib/idmap/ad.so' loaded

Connecting to 130.63.94.67 at port 389 <- now it tries to connect to dc2!!
su: user jas does not exist
[jas@j2 jas]# winbindd_interface_version: [nss_winbind (12183)]: request 
interface version (version = 31)
winbindd_interface_version: [nss_winbind (12182)]: request interface 
version (version = 31)
winbindd_getgroups_send: [nss_winbind (12182)] getgroups root
winbindd_getgroups_send: [nss_winbind (12183)] getgroups root
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
Successfully contacted LDAP server 130.63.94.66
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"
get_dc_list: preferred server list: "dc1.ad.eecs.yorku.ca, *"

And every time I try to "su - jas", it again tries to connect to dc2!!
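
A minimal sketch for spotting this switch in winbindd debug output, assuming the "Connecting to <ip> at port <n>" line format shown above. The function names are made up for illustration, and 130.63.94.66 is dc1 as noted earlier.

```python
import re
from collections import Counter

# Matches winbindd debug lines such as "Connecting to 130.63.94.66 at port 389".
CONNECT_RE = re.compile(r"Connecting to (\d{1,3}(?:\.\d{1,3}){3}) at port (\d+)")


def connection_targets(log_text):
    """Count the (ip, port) pairs seen in 'Connecting to ... at port ...' lines."""
    return Counter(
        (m.group(1), int(m.group(2))) for m in CONNECT_RE.finditer(log_text)
    )


def unexpected_targets(log_text, expected_ips):
    """Return the (ip, port) counts whose IP is not one of expected_ips."""
    expected = set(expected_ips)
    return {
        target: count
        for target, count in connection_targets(log_text).items()
        if target[0] not in expected
    }


if __name__ == "__main__":
    sample = """\
winbindd_getpwnam_send: [nss_winbind (12180)] getpwnam jas
Connecting to 130.63.94.66 at port 135
Connecting to 130.63.94.66 at port 49153
Connecting to 130.63.94.67 at port 389
"""
    # Only dc1 (130.63.94.66) is expected; any other address is reported.
    print(unexpected_targets(sample, ["130.63.94.66"]))
    # prints {('130.63.94.67', 389): 1}
```

Fed the full debug output, this would list every connection attempt that goes to an address other than the DC that is still running.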

Winbind even creates a krb5.conf._JOIN_ file that points at dc1:

[realms]
         AD.EECS.YORKU.CA = {
                 kdc = 130.63.94.66
         }

But eventually....... after a long period of time...

It works:

# su - jas
winbindd_interface_version: [nss_winbind (13046)]: request interface 
version (version = 31)
winbindd_getpwnam_send: [nss_winbind (13046)] getpwnam jas
resolve_hosts: Attempting host lookup for name dc1.ad.eecs.yorku.ca<0x20>
Connecting to 130.63.94.66 at port 389
....

Now I realize this isn't an nsswitch bug at all; it's hiding in winbind.
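
One way to put a number on the "long period of time" is to poll the same NSS lookup that "getent passwd jas" performs and record how long it takes to start succeeding. This is a rough sketch, assuming Python is available on the client. wait_for_user, its parameters, and the 30-second interval are illustrative choices, not anything from Samba.

```python
import pwd
import time


def user_resolves(name):
    """True if the name resolves through NSS, like 'getent passwd <name>'."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        # pwd.getpwnam raises KeyError when no passwd entry is found.
        return False


def wait_for_user(name, interval=30.0, max_wait=3600.0,
                  lookup=user_resolves, clock=time.monotonic, sleep=time.sleep):
    """Poll until the user resolves.

    Returns the elapsed seconds, or None if max_wait passes first.
    lookup, clock and sleep are injectable so the loop can be tested
    without waiting in real time.
    """
    start = clock()
    while True:
        if lookup(name):
            return clock() - start
        if clock() - start >= max_wait:
            return None
        sleep(interval)
```

Calling wait_for_user("jas") right after stopping the DC processes on dc2 would return roughly how many seconds passed before the lookup started working, which makes it easier to compare runs.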

Jason.




