[Samba] samba AD problem after re-join domain

Jason Keltz jas at eecs.yorku.ca
Mon Oct 12 16:39:11 UTC 2020


On 10/12/2020 11:51 AM, Rowland penny via samba wrote:

>>> and since that reboot, I've seen a few of them do this:
>>>
>>> [2020/10/12 10:00:19.814381,  1, pid=36145, effective(0, 0), real(0, 
>>> 0)] ../../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
>>>   Reducing LDAP page size from 500 to 250 due to IO_TIMEOUT
>>> [2020/10/12 10:16:19.557261,  1, pid=36145, effective(0, 0), real(0, 
>>> 0)] ../../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
>>>   Reducing LDAP page size from 250 to 125 due to IO_TIMEOUT
>>>
>>> Two of them are virtualbox VMs, so I figured maybe it's some kind of 
>>> virtualbox thing, but one of them is an actual machine and still has 
>>> the same error.  The DC is very lightly loaded. How would I debug 
>>> what is causing this reduction in IO?
> I would check your network connections etc.

There is minimal load on the server which has plenty of network 
capacity.  Is there a particular way to ask winbind to test IO between 
itself and the server?  I've seen this question asked a lot on the 
list, and I'm concerned there could be a bug.  There should be a winbind 
option to do an IO checkup...

Here's iperf output between one client that has those messages and the 
server:

[  5]   0.00-1.00   sec   108 MBytes   905 Mbits/sec
[  5]   1.00-2.00   sec   112 MBytes   942 Mbits/sec
[  5]   2.00-3.00   sec   112 MBytes   941 Mbits/sec
[  5]   3.00-4.00   sec   112 MBytes   942 Mbits/sec
[  5]   4.00-5.00   sec   111 MBytes   935 Mbits/sec
[  5]   5.00-6.00   sec   112 MBytes   938 Mbits/sec
[  5]   6.00-7.00   sec   112 MBytes   941 Mbits/sec
[  5]   7.00-8.00   sec   112 MBytes   941 Mbits/sec
[  5]   8.00-9.00   sec   112 MBytes   941 Mbits/sec
[  5]   9.00-10.00  sec   112 MBytes   942 Mbits/sec
[  5]  10.00-10.04  sec  4.31 MBytes   939 Mbits/sec

I don't know what other test to do.
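
iperf measures sustained throughput; it says nothing about how long an
individual connection to the DC takes. A rough sketch of a complementary
check, assuming the DC accepts LDAP on TCP port 389 (the standard LDAP
port). The function name is made up for illustration, and this only
times TCP connects, not the paged LDAP searches that winbind actually
performs:

```python
import socket
import time


def connect_latencies(host, port=389, attempts=5, timeout=5.0):
    """Time repeated TCP connects to host:port.

    Returns one entry per attempt: elapsed seconds as a float,
    or None if the connect failed or timed out.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection resolves the name and opens the socket;
            # the context manager closes it again immediately.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.monotonic() - start)
        except OSError:
            # Covers refused connections, timeouts and resolution failures.
            results.append(None)
    return results
```

Calling connect_latencies() with the DC's hostname from an affected
client and from an unaffected one would at least show whether connects
occasionally stall or fail, which iperf would not reveal.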

>>> I know that various errors in the Samba logs are not "issues" but 
>>> this one seems to be an issue.  I don't like seeing IO_TIMEOUTs.
>>>
>>> Another distracting error in the log included:
>>>
>>> [2020/10/11 22:43:29.843630,  1, pid=969, effective(0, 0), real(0, 
>>> 0)] ../../source3/libads/ldap.c:565(ads_find_dc)
>>>   ads_find_dc: name resolution for realm 'AD.EECS.YORKU.CA' (domain 
>>> 'EECSYORKUCA') failed: NT_STATUS_NO_LOGON_SERVERS
>
> That makes me think of dns/network problems.
>
I was able to reboot multiple times, and when I tried to log in while the 
system was coming up, that error showed up in the logs (followed by a 
successful login).  If I waited until the system was completely booted 
before logging in, the error wouldn't show up.
>
>>>
>>> ... after boot which sounds serious but it turns out if I try to 
>>> authenticate before everything is up and running, that's what I get. 
>>> The error makes sense but there's no "follow up" to say: "Ok ok - I 
>>> found it now - Sorry to give you a heart attack.". It's all a 
>>> learning experience.
>>>
>>> <snipped>
>>> Jason
>>
>>
>>
>> I wonder if this is a regular error and everyone is seeing this in their 
>> logs?  Just for fun, I tried to change the permission of 
>> /etc/krb5.keytab temporarily to 644, and sure enough, the error goes 
>> away....  so somehow when the user is logging in, it seems that 
>> winbind is trying to read the keytab as user.  It's not clear why 
>> that would be, but while a google search hasn't revealed the reason 
>> for this error, I do see it in a whole lot of log files. It's just 
>> that when I'm trying to ensure there are no problems with my setup, 
>> and trying to understand the errors that do show up, it can cause 
>> panic.  Whether it's a problem or not, I do not know
>
> The keytab shouldn't be a problem, what are the permissions on 
> /etc/krb5.conf ?

/etc/krb5.conf is root:root 644.
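
To compare the two files the same way, here is a small sketch using only
the Python standard library. The function names are mine, for
illustration only:

```python
import os
import stat


def describe_permissions(path):
    """Return 'uid:gid mode' for path, with the mode in octal."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # strip the file-type bits
    return f"{st.st_uid}:{st.st_gid} {mode:o}"


def world_readable(path):
    """True if 'other' has read permission on path."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

Running both functions on /etc/krb5.conf and /etc/krb5.keytab shows at a
glance whether a process running as the login user could read either
file. A keytab that is not world readable would be consistent with the
error going away when I temporarily set it to 644.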

After more experiments, the error I included is only shown when SSHing 
from a system not in the domain to a system in the domain.  I've 
discovered that if I ssh from one system in the domain to another system 
in the domain, the error doesn't appear in the log.  It's another log 
error not to be concerned about, I think.

By the way, you read my second email, so you missed the question from my 
first.  Please let me know if you have any additional feedback on 
this.  Thanks.

The real reason I was trying to change the hostnames was to deal with a 
scenario particular to our environment.  We have many dual-boot machines 
running Windows and Linux.  I know that I can't join the domain with the 
same name on both Linux and Windows, because joining one would change 
the machine password, and then the other would no longer be joined, etc.  
I understand that it's possible to generate a machine password manually 
and use it from both sides, but as I understand it, this interferes with 
the system's ability to change the machine password regularly, which 
seems more secure.  I don't know whether Samba changes the machine 
password regularly.  I also don't want a separate IP address for each 
side, because that would be wasteful.  I would prefer that the hostname 
be the same on both sides as well.

I was trying to explore how tightly the name in the AD computer 
database is tied to the "real" DNS name of the host.  What I was trying 
to do was to add to /etc/samba/smb.conf: netbios name=<system 
hostname>-linux so that when I joined the hosts under Linux, they would 
take on a "-linux" name, but only in the AD computer database.  When the 
host was booted into Linux, it would have an AD name of <system 
hostname>-linux, but a real name of just "<system hostname>".  On 
Windows, both the AD name and hostname would be "<system hostname>".  
This would mean that on Windows you could have a computer called 
"test", and under Linux "test-linux", but both would really be the same 
physical PC, and both would be host "test" with one IP.

It wasn't working.  I am pretty sure I forgot the nfs/X entries on the 
NFS servers after rejoining the domain, so that may be the issue.  
However, thinking back, I also think that "net ads keytab" would not let 
me add an entry for "host/test...." because it wanted 
"host/test-linux....", but I could be wrong.  If the host *had* to take 
on its real identity "test-linux", then test-linux could just be an 
alias for test, I guess, but then the machine build would be a 
headache....  And when the Linux machines boot, they use dhcp (just like 
Windows), so the machine wouldn't know whether it's "test" or 
"test-linux".  Lots of "fun".

Jason.




