[Samba] samba AD problem after re-join domain
Jason Keltz
jas at eecs.yorku.ca
Mon Oct 12 16:39:11 UTC 2020
On 10/12/2020 11:51 AM, Rowland penny via samba wrote:
>>> and since that reboot, I've seen a few of them do this:
>>>
>>> [2020/10/12 10:00:19.814381, 1, pid=36145, effective(0, 0), real(0,
>>> 0)] ../../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
>>> Reducing LDAP page size from 500 to 250 due to IO_TIMEOUT
>>> [2020/10/12 10:16:19.557261, 1, pid=36145, effective(0, 0), real(0,
>>> 0)] ../../source3/libads/ldap_utils.c:93(ads_do_search_retry_internal)
>>> Reducing LDAP page size from 250 to 125 due to IO_TIMEOUT
>>>
>>> Two of them are virtualbox VMs, so I figured maybe it's some kind of
>>> virtualbox thing, but one of them is an actual machine and still has
>>> the same error. The DC is very lightly loaded. How would I debug
>>> what is causing this reduction in IO?
> I would be checking your network connections etc.
There is minimal load on the server, which has plenty of network
capacity. Is there a particular way to ask winbind to test IO between
itself and the server? I've seen this question asked a lot on the
list, and I'm concerned there could be a bug. There should be a winbind
option to do an IO checkup...
Here's iperf output between one client that has those messages and the
server:
[ 5] 0.00-1.00 sec 108 MBytes 905 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 942 Mbits/sec
[ 5] 2.00-3.00 sec 112 MBytes 941 Mbits/sec
[ 5] 3.00-4.00 sec 112 MBytes 942 Mbits/sec
[ 5] 4.00-5.00 sec 111 MBytes 935 Mbits/sec
[ 5] 5.00-6.00 sec 112 MBytes 938 Mbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 941 Mbits/sec
[ 5] 7.00-8.00 sec 112 MBytes 941 Mbits/sec
[ 5] 8.00-9.00 sec 112 MBytes 941 Mbits/sec
[ 5] 9.00-10.00 sec 112 MBytes 942 Mbits/sec
[ 5] 10.00-10.04 sec 4.31 MBytes 939 Mbits/sec
I don't know what other test to do.
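One further test that might be worth trying: iperf measures throughput,
whereas a timeout on an LDAP search may have more to do with latency or
with how quickly the server answers. Below is a minimal sketch, not a
winbind feature, that times repeated TCP connects to the LDAP port of
the DC. The port (389), the attempt count, and the function names are
assumptions made for illustration, and it only measures connection
setup, not the LDAP search itself.

```python
import socket
import statistics
import time


def tcp_connect_times(host, port=389, attempts=10, timeout=5.0):
    """Return the time in seconds taken by each successful TCP connect.

    Port 389 (plain LDAP) is an assumption; adjust it for your setup.
    """
    samples = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close a connection; only setup time is measured.
            with socket.create_connection((host, port), timeout=timeout):
                samples.append(time.monotonic() - start)
        except OSError as exc:
            print(f"connect to {host}:{port} failed: {exc}")
    return samples


def summarize(samples):
    """Return (min, median, max) of the samples, or None if there are none."""
    if not samples:
        return None
    return (min(samples), statistics.median(samples), max(samples))
```

Calling summarize(tcp_connect_times("<your DC hostname>")) from one of
the affected clients and from an unaffected one would give a rough
comparison. Large or erratic values would point at the network or DNS
rather than at winbind.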
>>> I know that various errors in the Samba logs are not "issues" but
>>> this one seems to be an issue. I don't like seeing IO_TIMEOUTs.
>>>
>>> Another distracting error in the log included:
>>>
>>> [2020/10/11 22:43:29.843630, 1, pid=969, effective(0, 0), real(0,
>>> 0)] ../../source3/libads/ldap.c:565(ads_find_dc)
>>> ads_find_dc: name resolution for realm 'AD.EECS.YORKU.CA' (domain
>>> 'EECSYORKUCA') failed: NT_STATUS_NO_LOGON_SERVERS
>
> That makes me think of DNS/network problems.
>
I was able to reboot multiple times, and when I tried to log in while
the system was coming up, that error showed up in the logs (followed by
a successful login). If I waited until the system was completely booted
before logging in, the error wouldn't show up.
>
>>>
>>> ... after boot which sounds serious but it turns out if I try to
>>> authenticate before everything is up and running, that's what I get.
>>> The error makes sense but there's no "follow up" to say: "Ok ok - I
>>> found it now - Sorry to give you a heart attack.". It's all a
>>> learning experience.
>>>
>>> <snipped>
>>> Jason
>>
>>
>>
>> I wonder if this is a regular error and everyone is seeing this in their
>> logs? Just for fun, I tried to change the permission of
>> /etc/krb5.keytab temporarily to 644, and sure enough, the error goes
>> away.... so somehow when the user is logging in, it seems that
>> winbind is trying to read the keytab as user. It's not clear why
>> that would be, but while a google search hasn't revealed the reason
>> for this error, I do see it in a whole lot of log files. It's just
>> that when I'm trying to ensure there are no problems with my setup,
>> and trying to understand the errors that do show up, it can cause
>> panic. Whether it's a problem or not, I do not know.
>
> The keytab shouldn't be a problem, what are the permissions on
> /etc/krb5.conf ?
/etc/krb5.conf is root:root 644.
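For anyone who wants to compare the same settings on their own
machines, here is a small sketch that reports the owner and mode of a
file and whether it is world-readable. The helper names are made up for
illustration; the paths to check would be /etc/krb5.conf and
/etc/krb5.keytab.

```python
import os
import stat


def describe(path):
    """Return 'uid:gid mode' for path, for example '0:0 644'."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only
    return f"{st.st_uid}:{st.st_gid} {mode:o}"


def world_readable(path):
    """Return True if 'other' has read permission on path."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

With the settings described in this thread, world_readable() would be
expected to return True for /etc/krb5.conf (644), and for
/etc/krb5.keytab only while it was temporarily set to 644.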
After more experiments, the error I included is only shown when SSHing
from a system not in the domain to a system in the domain. I've
discovered that if I ssh from one system in the domain to another system
in the domain, the error doesn't appear in the log. It's another log
error not to be concerned about, I think.
By the way, you read my second email, so you missed the question from my
first... please let me know if you have any additional feedback on
this. Thanks.
The real reason I was trying to change the hostnames was to deal with a
scenario particular to our environment. We have many dual-boot machines
running Windows and Linux. I know that I can't join the domain with the
same name on both the Linux and Windows sides, because joining one would
change the machine password and the other would then no longer be
joined. I understand that it's possible to generate a machine password
manually and use it from both sides, but as I understand it, this
interferes with the system's ability to change the machine password
regularly, which seems more secure. I don't know if Samba does that. I
also don't want a different IP address for each side, because that
would be wasteful, and I would prefer the hostname to be the same on
both sides as well.

I was trying to explore how tightly the name in the AD computer
database is tied to the "real" DNS name of the host. What I was trying
to do was add to /etc/samba/smb.conf: netbios name=<system
hostname>-linux so that when I joined the hosts under Linux, they would
take on a "-linux" name, but only in the AD computer database. Once
booted into Linux, the host would have an AD name of <system
hostname>-linux, but a real name of just "<system hostname>". On
Windows, both the AD name and hostname would be "<system hostname>".
This would mean that on Windows you could have a computer called
"test", and under Linux "test-linux", but both would really be the same
physical PC and both would be host "test" with one IP.

It wasn't working. I am pretty sure I forgot the nfs/X entries on the
NFS servers after rejoining the domain, so that may be the issue.
However, thinking back, I also think that "net ads keytab" would not let
me add an entry for "host/test...." because it wanted
"host/test-linux....", but I could be wrong. If the host *had* to take
on its real identity "test-linux", then test-linux could just be an
alias for test, I guess, but then the machine build would be a
headache... and when the Linux machines boot they use DHCP (just like
Windows), so the machine wouldn't know whether it's "test" or
"test-linux". Lots of "fun".
Jason.
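
P.S. One practical constraint on the "-linux" naming idea: NetBIOS
computer names are limited to 15 characters, so a 6-character suffix
leaves only 9 for the hostname itself. A small sketch of deriving the
name and checking that limit follows. The function name and the default
suffix are just illustrations of the scheme described above, not
anything Samba provides.

```python
NETBIOS_MAX_LEN = 15  # NetBIOS computer names are limited to 15 characters


def linux_netbios_name(hostname, suffix="-linux"):
    """Build the AD/NetBIOS name for the Linux side of a dual-boot host.

    Uses the short host name (the part before the first dot), appends
    the suffix, and raises ValueError if the result is too long.
    """
    short = hostname.split(".", 1)[0]
    name = short + suffix
    if len(name) > NETBIOS_MAX_LEN:
        raise ValueError(
            f"{name!r} is {len(name)} characters; "
            f"the NetBIOS limit is {NETBIOS_MAX_LEN}"
        )
    return name
```

For the example host above, linux_netbios_name("test") gives
"test-linux", which could then be used as the value of "netbios name"
in smb.conf.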