delayed network interfaces and ctdb service startup

Steve French smfrench at gmail.com
Thu Sep 8 03:45:41 UTC 2016


I can try your attached patch (it seems simpler and better), but first
I want to see what happens tomorrow when they run the update with
Restart=on-failure set.
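
For reference, the two approaches discussed in this thread could be
combined in a systemd drop-in roughly as follows. This is only a sketch
and is not the attached patch: the drop-in path is an assumption, and
the restart interval is simply the value floated in the quoted message
below.

  # /etc/systemd/system/ctdb.service.d/network-online.conf
  # (hypothetical drop-in path; adjust for the distribution)
  [Unit]
  # Pull in and order after network-online.target, which waits until
  # the network is configured, rather than network.target, which only
  # indicates that the network management stack is up.
  Wants=network-online.target
  After=network-online.target

  [Service]
  # Fallback: if the daemon still exits with an error (for example
  # because an address or interface is not ready yet), retry after a
  # delay.
  Restart=on-failure
  RestartSec=60s

A drop-in like this would take effect after "systemctl daemon-reload".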

On Wed, Sep 7, 2016 at 9:34 PM, Martin Schwenke <martin at meltin.net> wrote:
> Hi Steve,
>
> On Wed, 7 Sep 2016 20:32:37 -0500, Steve French <smfrench at gmail.com>
> wrote:
>
>> Ran into an issue with ctdb service startup failing (on RHEL) when a
>> software update was applied during a reboot. It normally works fine
>> on the next boot after a software update (or on a boot before one),
>> so presumably the network comes up late while the software update
>> is in progress.
>
> It would be interesting to know what the actual failure was.  ;-)
>
>> Should ctdb service file be configured with something like
>>
>> Restart=on-failure
>> RestartSec=60s
>>
>> or Restart=on-abnormal ....
>>
>> I noticed that ctdb was being started after networking, but presumably
>> that is not good enough to guarantee that the socket is usable
>>
>> [Unit]
>> Description=CTDB
>> After=network.target
>
> https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
> suggests that we need After=network-online.target instead of just
> network.target.
>
> The above systemd documentation says:
>
>   network.target has very little meaning during start-up. It only
>   indicates that the network management stack is up after it has been
>   reached. Whether any network interfaces are already configured when
>   it is reached is undefined. It's primary purpose is for ordering
>   things properly at shutdown: [...]
>
> and:
>
>   network-online.target is a target that actively waits until the
>   network is "up" [...]
>   It is strongly recommended not to pull in this target too liberally:
>   for example network server software should generally not pull this in
>   (since server software generally is happy to accept local connections
>   even before any routable network interface is up), it's primary
>   purpose is network client software that cannot operate without
>   network.
>
> to which I quite naturally say: Yay!  Systemd!  That wasn't obvious...  :-(
>
> So, I can guess 2 reasons for CTDB not starting:
>
> * CTDB insists on binding a known node address to its TCP socket.  If
>   the address isn't available/local then the bind will fail.
>
> * The public address setup is also unforgiving.  If a specified
>   interface is not available then CTDB will abort early.
>
> Can you please test if the attached patch improves things and, if so,
> give it a Reviewed-by: ?   :-)
>
> Thanks...
>
> peace & happiness,
> martin



-- 
Thanks,

Steve



More information about the samba-technical mailing list