[Samba] winbind causes Linux to lockup when connectivity to AD is lost (subject line edited for clarity)

Jason Haar Jason.Haar at trimble.co.nz
Thu Oct 22 17:13:22 MDT 2009

On 10/23/2009 11:45 AM, Robert LeBlanc wrote:
> I'm using 3.4.2 right now and I'm seeing a similar problem. We are
> using winbind to authenticate our users on our Linux cluster. The
> worker and interactive nodes are on a private subnet that is NATed to
> the local LAN. Two head nodes provide failover for the NATing. When
> failover is happening, winbind whacks out. The system is not unusable,
> but no authentication happens for about 30 minutes after the failover.
> I'm going to see if I can get iptables to share state between machines
> to help prevent this, but there needs to be a faster reconnection
> after domain controllers seem to be down.

What I see (as a winbind laptop user) is that winbind sometimes thinks
it has working connections to domain controllers when the network is
either down or no longer the corporate network. For example, I can be
logged in at work, suspend my laptop and take it home. After it resumes,
"netstat -t" still shows ESTABLISHED TCP sessions to domain controllers,
even though my home network has no access to my work network. I think
winbind then gets into a state where it keeps trying to talk to these
unreachable domain controllers and never gives up, so offline mode
never kicks in.

It's got so bad that I now have scripts that run whenever a network
change occurs, to check whether winbind is "stuck" and restart it if so.
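
For anyone wanting to try the same workaround, here is a rough sketch of
what such a check could look like. It is not my actual script, and the
details are assumptions: the port list, the function names and the
restart command are all illustrative, and "stuck" is approximated as
"there are ESTABLISHED sessions on AD-type ports, but none of those peers
accepts a new connection". Note that "netstat -tn" lists sessions from
every process, not just winbind.

```python
import socket
import subprocess

# Assumption: ports typically used for AD traffic
# (Kerberos, RPC endpoint mapper, NetBIOS session, LDAP, SMB).
DC_PORTS = {88, 135, 139, 389, 445}


def established_peers(netstat_output, ports=DC_PORTS):
    """Return (address, port) for each ESTABLISHED TCP session to `ports`."""
    peers = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Linux "netstat -tn" rows: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        host, _, port = fields[4].rpartition(":")
        if port.isdigit() and int(port) in ports:
            peers.append((host, int(port)))
    return peers


def reachable(host, port, timeout=3.0):
    """Try a fresh TCP connection; True if the peer answers in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def winbind_looks_stuck(netstat_output, probe=reachable):
    """Heuristic: sessions to DC ports exist, yet no peer is reachable."""
    peers = established_peers(netstat_output)
    return bool(peers) and not any(probe(h, p) for h, p in peers)


def check_and_restart():
    """Meant to be called from a network-change hook."""
    out = subprocess.run(["netstat", "-tn"], capture_output=True,
                         text=True, check=True).stdout
    if winbind_looks_stuck(out):
        # Assumption: the restart command differs between distributions.
        subprocess.run(["service", "winbind", "restart"], check=False)
```

Calling check_and_restart() from whatever your system runs on a network
change would approximate the behaviour described above. The probe is
passed in as a parameter so the heuristic can be exercised without
touching the network.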


