[Samba] winbind : suspend nightmare

Prunk Dump prunkdump at gmail.com
Wed Oct 23 08:01:31 UTC 2019


On Mon, Oct 21, 2019 at 11:21, L.P.H. van Belle via samba
<samba at lists.samba.org> wrote:
>
> Hai,
>
>
> #Fix Internet after Suspend
> alias fix-internet="sudo modprobe -r r8169 && sleep 10 && sudo modprobe r8169"
>
> Depending on what is used, you still have more options.
> For example, in networkmanager.conf, try "carrier-wait-timeout" and "ignore-carrier"
>
> Another thing you might encounter is that the network device name changes after suspending.
> Then you also need to use /etc/udev/rules.d/10-network.rules:
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="aa:bb:cc:dd:ee:ff", NAME="eth1"
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="ff:ee:dd:cc:bb:aa", NAME="eth0"
> Greetz,
>
> Louis

How do you know all of these things, Louis? Impressive...

Following your advice, I have written some "monitor" service scripts that:
-> check and record kernel logs about the NIC module on wake from
suspend or on DHCP discover
-> record dhclient/network-manager logs
-> record interface names
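
As an illustration of the "record interface names" step, here is a minimal
sketch, not the actual monitor script. The function name snapshot_ifaces and
the optional directory argument are hypothetical; only the /sys/class/net
layout (one directory per interface, each with an "address" file holding the
MAC) is standard Linux.

```shell
#!/bin/sh
# Hypothetical helper: print "name mac" for every interface found under a
# sysfs-style directory (defaults to /sys/class/net).
snapshot_ifaces() {
    net_dir="${1:-/sys/class/net}"
    for dev in "$net_dir"/*; do
        # Skip entries without an address file (also covers an empty directory,
        # where the glob stays unexpanded).
        [ -e "$dev/address" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
    done
}
```

One possible use is to call it from a script placed in the systemd
system-sleep hook directory, which is run with "pre" or "post" as its first
argument, and append the output with a timestamp to a log file. Comparing the
"pre" and "post" snapshots would show whether an interface was renamed across
the suspend.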

But as my problem appears only about once a day (the suspend must be
long enough, at least 5 hours), I can't give the results yet.
I will keep you informed.


On Wed, Oct 23, 2019 at 07:26, Jeremy Allison <jra at samba.org> wrote:
>
> On Mon, Oct 21, 2019 at 10:07:20AM +0200, Prunk Dump via samba wrote:
> >
> > I don't know if winbind "officially" supports suspending. Currently I
> > have written a systemd hook that kills winbind before suspend and
> > restarts it afterwards.
>
> It hasn't been tested in that mode as far as I know.
>
> Congratulations, you're the first ! :-).
>

Thank you very much Jeremy !

Here is the systemd hook I use. It solves the issue when recovering from
suspend, but it doesn't solve the DHCPDISCOVER problem.
Maybe you are also interested in an strace of the "wake" problem, but
I think it would be too difficult for me to work on two problems at the
same time. As I have a workaround for the first one, I prefer
to work on the DHCPDISCOVER problem first.

~# cat /lib/systemd/system/sleep-winbind.service
[Unit]
Description=winbind sleep hook
Before=sleep.target
StopWhenUnneeded=yes

[Service]
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Type=oneshot
RemainAfterExit=yes
ExecStart=-systemctl stop winbind
ExecStop=-systemctl start winbind

[Install]
WantedBy=sleep.target
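
For readers who want to try the unit above: it is pulled in when
sleep.target starts, so ExecStart stops winbind before the machine sleeps;
once sleep.target is stopped on resume, StopWhenUnneeded=yes stops this unit
too, and ExecStop starts winbind again. A minimal sketch of activating it,
assuming the file is already in place (the function name enable_sleep_hook is
hypothetical; the systemctl subcommands are standard):

```shell
#!/bin/sh
# Hypothetical helper: reload unit files, enable the hook, and report its state.
enable_sleep_hook() {
    unit="${1:-sleep-winbind.service}"
    systemctl daemon-reload &&
    systemctl enable "$unit" &&
    systemctl is-enabled "$unit"
}
```

Run as root, for example: enable_sleep_hook sleep-winbind.service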


> > 07:44:43  connection_ok: Connection to fichdc01.samdom.com for domain
> > SAMDOM is not connected
> > 07:44:43  Successfully contacted LDAP server 172.16.0.30
> > 07:44:43  get_dc_list: preferred server list: "fichdc01.samdom.com, *"
> > 07:44:43  Connecting to 172.16.0.30 at port 445
> > 07:44:43  ldb_wrap open of secrets.ldb
> > 07:44:43  Connecting to 172.16.0.30 at port 135
> > 07:44:43  Connecting to 172.16.0.30 at port 49153
> > 07:44:43  Connecting to 172.16.0.30 at port 135
> > 07:44:43  Connecting to 172.16.0.30 at port 49153
> > 07:44:43  Connecting to 172.16.0.30 at port 135
> > 07:44:43  Connecting to 172.16.0.30 at port 49152
> > 07:44:45  ads: fetch sequence_number for SAMDOM
> > 07:44:45  get_dc_list: preferred server list: "fichdc01.samdom.com, *"
> > 07:44:45  Successfully contacted LDAP server 172.16.0.30
> > 07:44:45  Connected to LDAP server fichdc01.samdom.com
> > 07:46:40  connection_ok: Connection to fichdc01.samdom.com for domain
> > SAMDOM is not connected
> > 07:46:40  cldap_multi_netlogon_send: cldap_socket_init failed for
> > ipv4:172.16.0.30:389  error NT_STATUS_NETWORK_UNREACHABLE
>
> OK, the above line is the problem. Why does that
> happen if above we have:
>
> 07:44:45  Successfully contacted LDAP server 172.16.0.30
>
> cldap_multi_netlogon_send() does a UDP cldap ping
> to the server (172.16.0.30). Getting NT_STATUS_NETWORK_UNREACHABLE
> looks like the network interface isn't up yet.
>
> Can you start winbind under strace in this case so we
> can see what syscalls are being done and exactly how they're
> failing ?
>
> Thanks,
>
> Jeremy.

Yes, I will do that and send the result.
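
A minimal sketch of how such a trace could be captured, assuming the winbind
service is stopped first and the command is run as root. The function name
trace_winbindd and the output path are hypothetical; the strace options and
winbindd's -F (stay in the foreground) option are standard.

```shell
#!/bin/sh
# Hypothetical helper: run winbindd in the foreground under strace.
trace_winbindd() {
    out="${1:-/tmp/winbindd.strace}"
    # -f               follow child processes (winbindd forks per-domain children)
    # -tt              prefix each line with a wall-clock time, microsecond precision
    # -e trace=network log only network-related syscalls (socket, connect, sendto, ...)
    # -o FILE          write the trace to FILE instead of stderr
    strace -f -tt -e trace=network -o "$out" winbindd -F
}
```

The timestamps from -tt make it possible to line the trace up with the
winbind log above; a failing connect() or sendto() to 172.16.0.30 around the
NT_STATUS_NETWORK_UNREACHABLE line would presumably be the call to look at.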

But the first thing I don't understand is why winbind starts reacting
exactly when DHCPDISCOVER is launched.

The first log lines:
07:44:43  connection_ok: Connection to fichdc01.samdom.com for domain
SAMDOM is not connected
07:44:43  Successfully contacted LDAP server 172.16.0.30

happen exactly when the DHCPDISCOVER response is received. Strange, no?
How does winbind know that?

There is also a hook script for ntp in Debian:
/etc/dhcp/dhclient-exit-hooks.d/ntp

I'm investigating whether this problem can be caused by ntp.

There is also a Samba hook:
/etc/dhcp/dhclient-enter-hooks.d/samba

But it seems related only to smbd, not winbind. Can smbd send
some signal to winbind?

Thank you very much, Samba Team!!

Baptiste.


