[Samba] Restart Winbind

Linux Addict linuxaddict7 at gmail.com
Fri Sep 11 10:32:09 MDT 2009

On Thu, Sep 10, 2009 at 11:27 PM, Adam Nielsen <adam.nielsen at uq.edu.au> wrote:

> > I wish I could attach gdb, but when the tdb files get corrupted I can't
> > log in to the host, even as a local user on the console. Winbind seems
> > to be locking the whole authentication stream. I don't understand why
> > even a local user can't log in.
> It's because normally (depending on /etc/nsswitch.conf) winbind will be
> queried first before local files like /etc/passwd.  If you swap the
> order you can make it check local auth files first.
> Alternatively you should be able to get around that by either leaving a
> console or SSH connection open to the server 24/7 until it breaks, or
> perhaps using SSH with public keys, which should bypass the normal
> authentication scheme.  Of course then even something like "ls" will
> probably lock up, since it will query winbind to map UIDs back to
> usernames...

Thank you for taking time to respond.

I do have nsswitch set to files and then winbind, and it works as
expected when everything is fine. E.g. if I stop winbind, I can still log
in as a local user. The issue happens only when winbind takes all the CPU.
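
The lookup order discussed above can be checked from a script. The
following is only a sketch: the function names nss_sources and
files_before_winbind are made up for illustration, and it assumes the local
source is spelled "files" (some systems use "compat" instead).

```shell
# Print the sources configured for one database (e.g. passwd) in an
# nsswitch.conf-style file.
# $1 = database name, $2 = file to read (defaults to /etc/nsswitch.conf)
nss_sources() {
    awk -v db="$1:" '$1 == db { $1 = ""; sub(/^ +/, ""); print; exit }' \
        "${2:-/etc/nsswitch.conf}"
}

# Succeed only if "files" is listed, and listed before "winbind",
# for the given database.
files_before_winbind() {
    nss_sources "$1" "$2" | awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "files")   f = i
                if ($i == "winbind") w = i
            }
        }
        END { exit !(f && (!w || f < w)) }'
}
```

For example, "files_before_winbind passwd && echo local-first" prints
local-first only when local accounts are consulted before winbind.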

I can keep a session open directly on the console, but it's very random.

> > That's why I'm working on a script to run from cron, so that when
> > winbind consumes more than 40% CPU, the script restarts winbind.
> Short of tracking down the bug with gdb and fixing it, this is probably
> the only alternative.
> > I wanted to ask another question on the same subject. When I start the
> > winbind using the init script, it forks 4 processes. The pid on
> > /var/run/winbindd.pid is the parent process. So is that the pid I need
> > to monitor to capture the true cpu utilization?
> I'm afraid I can't answer that, but it's possible that any of the
> instances might lock up, so you would probably need to monitor all of
> them.  Perhaps an easier option could be to time how long it takes to
> run a command, and when winbind locks up and that command doesn't
> complete, then you know winbind must be restarted.  (Even something like
> "rm /tmp/heartbeat; ls; touch /tmp/heartbeat" would mean that if
> /tmp/heartbeat disappeared for more than a few seconds you know
> something is wrong.)  "monit" probably has a test for this already and
> would save cronjob scripting.
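
The "time a command" idea can be sketched with the coreutils timeout
utility, which exits with status 124 when the time limit is hit. The
function name probe_hung and the 10-second limit are illustrative
assumptions, not something from this thread.

```shell
# Run a probe command under a time limit and succeed only if the probe
# had to be killed because it ran past the limit.
# $1 = limit in seconds, remaining arguments = the probe command
probe_hung() {
    limit="$1"; shift
    timeout "$limit" "$@" > /dev/null 2>&1
    # timeout(1) reports 124 when the command timed out.
    [ $? -eq 124 ]
}

# Possible cron usage (assumption: "wbinfo -p" pings winbindd and would
# hang along with it; "ls -l" on a directory with domain-owned files is
# another candidate probe):
#   probe_hung 10 wbinfo -p && /etc/init.d/winbind restart
```

A probe that fails quickly for some other reason is not treated as a hang,
so this only triggers on the lock-up case.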

I am doing something similar. I collect the CPU usage of every winbind
process and average it. If the average crosses, say, 10%, the script clears
the tdb files and restarts winbind:

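## This script cleans up winbind if it causes a CPU issue.
#WBCPU=`/bin/ps -eo pcpu,pid,user,args,cputime | grep winbind|grep -v grep|awk '{print $1}' > /tmp/wbind.dont.d
# Write one %CPU value per winbindd process to a scratch file.
# Assumes the default top layout, where %CPU is field 9.
top -b -n1 | grep winbindd | awk '{print $9}' > /tmp/wbind.dont.del
WBCOUNT=`wc -l /tmp/wbind.dont.del|awk '{print $1}'`
WBCPUTOT=`echo $(sed -e 's/$/+/' /tmp/wbind.dont.del) 0|bc`
# Integer average, so the -gt test below works; 0 if no process was found.
WBCPUAVG=`awk -v t="$WBCPUTOT" -v c="$WBCOUNT" 'BEGIN { if (c > 0) print int(t / c); else print 0 }'`
#echo Count is  $WBCOUNT and Tot is $WBCPUTOT and Avg is $WBCPUAVG
if [ "$WBCPUAVG" -gt 10 ]
then
        rm -rf /var/lib/samba/* > /dev/null
        /etc/init.d/winbind restart
fi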
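
The averaging step can also be written without a scratch file. This is a
sketch under assumptions: avg_cpu, max_cpu and cpu_over are hypothetical
helper names, and the input is one %CPU figure per line, such as the output
of "ps -C winbindd -o pcpu=". Note that ps reports CPU use averaged over
the whole lifetime of a process, so it reacts more slowly than top.

```shell
# Read one %CPU value per line on stdin; print the integer average.
# Prints 0 when there is no input.
avg_cpu() {
    awk '{ total += $1; n++ }
         END { if (n > 0) print int(total / n); else print 0 }'
}

# Read one %CPU value per line on stdin; print the largest, as an integer.
# Useful because an average over four processes can hide one that spins.
max_cpu() {
    awk 'BEGIN { m = 0 } $1 > m { m = $1 } END { print int(m) }'
}

# Succeed when the average read from stdin exceeds the threshold in $1.
cpu_over() {
    [ "$(avg_cpu)" -gt "$1" ]
}
```

A cron job could then run something like
"ps -C winbindd -o pcpu= | cpu_over 10 && /etc/init.d/winbind restart".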

I am more or less limited to the tdb backend, as the ldap backend doesn't
seem to support trusted domains.

> Cheers,
> Adam.
