[Samba] Bug 6870 resurfaced in Samba 4.2.10

Mon Oct 17 16:13:08 UTC 2016

Hi, 

So I did some digging into the source code, and I think I've found the
issue. Around line 120 of source3/libads/cldap.c: 

for (i=0; i<num_servers; i++) {
  NTSTATUS status; 

  status = cldap_socket_init(state->cldap,
    NULL, /* local_addr */
    state->servers[i],
    &state->cldap[i]); 

  if (tevent_req_nterror(req, status)) {
    return tevent_req_post(req, ev);
  } 

  /* Code omitted for brevity */ 

} 

This is in cldap_multi_netlogon_send(), a function that sends CLDAP
requests to multiple DCs in one go. The loop here sets up a socket for
each DC. cldap_socket_init() in turn (possibly several calls deeper)
sets up the UDP socket, and calls connect() on it, which fails with
"Network unreachable". This bubbles up the chain and comes back to
cldap_multi_netlogon_send() as NT_STATUS_NETWORK_UNREACHABLE. 
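
The failing step can be reproduced outside Samba. What follows is a minimal sketch, not Samba code: udp_connect_errno() is a hypothetical helper of my own naming that parses an address, creates a UDP socket and calls connect() on it, returning 0 or the errno of the step that failed. It assumes a POSIX sockets environment.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Samba): connect() a UDP socket to
 * addr:port. Returns 0 on success, otherwise the errno of the failing
 * step. connect() on a UDP socket sends no packets; it only records the
 * peer address and performs a route lookup, which is why it can fail
 * immediately when there is no route to the destination.
 */
static int udp_connect_errno(int family, const char *addr, unsigned short port)
{
	struct sockaddr_storage ss;
	socklen_t len;
	int fd;
	int err = 0;

	memset(&ss, 0, sizeof(ss));

	if (family == AF_INET) {
		struct sockaddr_in *in4 = (struct sockaddr_in *)&ss;

		in4->sin_family = AF_INET;
		in4->sin_port = htons(port);
		if (inet_pton(AF_INET, addr, &in4->sin_addr) != 1) {
			return EINVAL; /* not a valid IPv4 literal */
		}
		len = sizeof(*in4);
	} else if (family == AF_INET6) {
		struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)&ss;

		in6->sin6_family = AF_INET6;
		in6->sin6_port = htons(port);
		if (inet_pton(AF_INET6, addr, &in6->sin6_addr) != 1) {
			return EINVAL; /* not a valid IPv6 literal */
		}
		len = sizeof(*in6);
	} else {
		return EAFNOSUPPORT;
	}

	fd = socket(family, SOCK_DGRAM, 0);
	if (fd == -1) {
		return errno;
	}

	if (connect(fd, (struct sockaddr *)&ss, len) == -1) {
		err = errno; /* e.g. ENETUNREACH when no route exists */
	}

	close(fd);
	return err;
}
```

On the affected box I would expect udp_connect_errno(AF_INET6, "2001:8b0:1627:1::2", 389) to return ENETUNREACH and the AF_INET call for 192.168.81.132 to return 0, which is what the strace shows.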

Note, however, the early return inside the loop: the function returns an
error as soon as socket setup fails for *any* server, even if setup
succeeded for the others.

In my case, even though server 0 (IPv4) succeeds, this call returns an
error because server 1 (IPv6) could not be reached. 
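
To illustrate the behaviour I would expect instead, here is a minimal sketch of a loop that tolerates per-server failures. It is not a patch against Samba: init_servers_tolerant(), server_init_fn and fake_init_v6_unreachable() are hypothetical names, and plain errno-style ints stand in for NTSTATUS and the tevent machinery used by the real code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-server setup callback: 0 on success, errno-style code on failure. */
typedef int (*server_init_fn)(const char *server);

/*
 * Try to set up every server and record which ones are usable, instead
 * of aborting on the first failure. Returns 0 if at least one server was
 * set up, otherwise the last error seen. (With num_servers == 0 this
 * also returns 0; a real caller would have to reject that case itself.)
 */
static int init_servers_tolerant(const char *const *servers, size_t num_servers,
				 server_init_fn init, int *usable)
{
	size_t i;
	size_t num_ok = 0;
	int last_err = 0;

	for (i = 0; i < num_servers; i++) {
		int err = init(servers[i]);

		usable[i] = (err == 0);
		if (err == 0) {
			num_ok++;
		} else {
			last_err = err; /* remember it, but keep going */
		}
	}

	return (num_ok > 0) ? 0 : last_err;
}

/*
 * Stand-in for the real socket setup, for demonstration only: pretends
 * that every IPv6 literal (anything containing ':') is unreachable.
 */
static int fake_init_v6_unreachable(const char *server)
{
	return (strchr(server, ':') != NULL) ? ENETUNREACH : 0;
}
```

With the two DCs from my setup, this approach would mark the IPv4 server as usable and skip the IPv6 one; the overall call would fail only if no server at all could be set up.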

To reiterate, this is in Samba 4.2.10, which ships with Debian 8
(Jessie), and occurs when running "net ads workgroup". 

This is the relevant section of the debug level 10 log (compare with the
strace from my previous email):

Adding 2 DC's from auto lookup
check_negative_conn_cache returning result 0 for domain
FEDERATION.STARFLEET-NET.CO.UK server 192.168.81.132
check_negative_conn_cache returning result 0 for domain
FEDERATION.STARFLEET-NET.CO.UK server 2001:8b0:1627:1::2
remove_duplicate_addrs2: looking for duplicate address/port pairs
get_dc_list: returning 2 ip addresses in an ordered list
get_dc_list: 192.168.81.132:389 2001:8b0:1627:1::2:389 
check_negative_conn_cache returning result 0 for domain
FEDERATION.STARFLEET-NET.CO.UK server 192.168.81.132
check_negative_conn_cache returning result 0 for domain
FEDERATION.STARFLEET-NET.CO.UK server 2001:8b0:1627:1::2
ads_try_connect: sending CLDAP request to 2 servers (realm:
FEDERATION.STARFLEET-NET.CO.UK)
ads_cldap_netlogon: cldap_multi_netlogon failed:
NT_STATUS_NETWORK_UNREACHABLE
ads_try_connect: CLDAP request failed.
Adding cache entry with
key=[NEG_CONN_CACHE/FEDERATION.STARFLEET-NET.CO.UK,192.168.81.132] and
timeout=[Mon Oct 17 16:51:46 2016 BST] (60 seconds ahead)
add_failed_connection_entry: added domain FEDERATION.STARFLEET-NET.CO.UK
(192.168.81.132) to failed conn cache
Adding cache entry with
key=[NEG_CONN_CACHE/FEDERATION.STARFLEET-NET.CO.UK,2001:8b0:1627:1::2]
and timeout=[Mon Oct 17 16:51:46 2016 BST] (60 seconds ahead)
add_failed_connection_entry: added domain FEDERATION.STARFLEET-NET.CO.UK
(2001:8b0:1627:1::2) to failed conn cache
ads_connect: No logon servers 

Would this be better sent to samba-technical, where it might also get
more active attention?

Regards 

Rebecca Gellman 

On 2016-10-14 07:40, L.P.H. van Belle via samba wrote: 

> Hai, 
> 
> Did you check whether ifconfig still shows IPv6 addresses (even ::1)?
> 
> Can you check that?
> 
> I have several machines with IPv6 on and several with only IPv4.
> As of 4.1.17+ I haven't seen this happening here. I'm now on 4.4.5.
> I think you have forgotten something.
> 
> Greetz, 
> 
> Louis
> 
>> -----Original message-----
>> From: samba [mailto:samba-bounces at lists.samba.org] On behalf of Rebecca Gellman
>> via samba
>> Sent: Thursday 13 October 2016 17:07
>> To: samba at lists.samba.org
>> Subject: [Samba] Bug 6870 resurfaced in Samba 4.2.10
>> 
>> According to this bugzilla entry, bug 6870 has been fixed as of at least
>> version 3.5:
>> 
>> https://bugzilla.samba.org/show_bug.cgi?id=6870
>> 
>> However, I assert that it is present in 4.2.10, which ships with Debian
>> Jessie.
>> 
>> On my home network (IPv4 and IPv6), a box with Samba 4.2.10 with IPv6
>> disabled (via sysctl), will fail to contact a DC because the IPv6
>> connect fails immediately before the v4 connect has a chance to succeed.
>> 
>> I determined this by strace'ing the "net ads workgroup" command, which
>> resulted in the following:
>> 
>> 11:41:52 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 11 <0.000027>
>> 11:41:52 fcntl(11, F_GETFL) = 0x2 (flags O_RDWR) <0.000015>
>> 11:41:52 fcntl(11, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000016>
>> 11:41:52 fcntl(11, F_GETFD) = 0 <0.000015>
>> 11:41:52 fcntl(11, F_SETFD, FD_CLOEXEC) = 0 <0.000016>
>> 11:41:52 connect(11, {sa_family=AF_INET, sin_port=htons(389),
>> sin_addr=inet_addr("192.168.81.132")}, 16) = 0 <0.000050>
>> 11:41:52 socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 12 <0.000025>
>> 11:41:52 fcntl(12, F_GETFL) = 0x2 (flags O_RDWR) <0.000016>
>> 11:41:52 fcntl(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000015>
>> 11:41:52 fcntl(12, F_GETFD) = 0 <0.000016>
>> 11:41:52 fcntl(12, F_SETFD, FD_CLOEXEC) = 0 <0.000015>
>> 11:41:52 setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0 <0.000018>
>> 11:41:52 connect(12, {sa_family=AF_INET6, sin6_port=htons(389),
>> inet_pton(AF_INET6, "2001:8b0:1627:1::2", &sin6_addr), sin6_flowinfo=0,
>> sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)
>> <0.000032>
>> 11:41:52 close(12) = 0 <0.000028>
>> 11:41:52 close(11) = 0 <0.000024>
>> 11:41:52 close(10) = 0 <0.000020>
>> 11:41:52 fcntl(8, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=288,
>> len=1}) = 0 <0.000021>
>> 11:41:52 fcntl(8, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=288,
>> len=1}) = 0 <0.000018>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=48,
>> len=1}) = 0 <0.000019>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=48,
>> len=1}) = 0 <0.000032>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=48,
>> len=1}) = 0 <0.000019>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=48,
>> len=1}) = 0 <0.000017>
>> 11:41:52 write(2, "ads_connect: No logon servers\n", 30ads_connect: No
>> logon servers
>> 
>> As you can see, sockets 11 and 12 are set up to contact the DC, 11 for v4,
>> and 12 to v6. connect() on socket 11 is successful (returns 0), but
>> connect() on socket 12 returns -1 due to "Network unreachable" - this is
>> correct as the box in question does not have IPv6.
>> 
>> The attempt is abandoned (implied by the immediate closing of sockets 11
>> and 12, and the writing of "No logon servers" to stderr) before any
>> attempt is made to talk on socket 11 (v4).
>> 
>> After futzing the box to have an IPv6 address with appropriate routing,
>> the attempt succeeds as expected. However, for reasons (too long to go
>> into here) this is not a solution, only a means of proving the problem.
>> 
>> Since most DCs in an AD setup publish v6 records of some kind in DNS
>> these days, this behaviour seems to need fixing urgently.
>> 
>> Any comments from the samba bods, or should I forward this on to
>> samba-technical?
>> 
>> Thanks
>> 
>> -- Rebecca Gellman