[Samba] winbind craps out, NT_STATUS_PIPE_BROKEN

Jay Sullivan jpspgd at rit.edu
Tue Apr 26 08:20:23 MDT 2011

Good morning, Samba list.  =)

I've been experiencing intermittent winbind failures over the past few weeks.  The symptom is that users who haven't connected in a while, and thus aren't in the winbind cache, are unable to connect to any shares.  I see a lot of NT_STATUS_PIPE_BROKEN in the logs when the failures occur, like this one from log.winbind:
[2011/04/26 09:20:54.671225,  5] winbindd/winbindd_getpwnam.c:138(winbindd_getpwnam_recv)
  Could not convert sid S-1-5-21-1060284298-1450960922-725345543-818338: NT_STATUS_PIPE_BROKEN

At the same time, this is from log.somehostname:
[2011/04/26 09:17:30.597274,  2] auth/auth.c:314(check_ntlm_password)
  check_ntlm_password:  Authentication for user [coolusername] -> [coolusername] FAILED with error NT_STATUS_NO_SUCH_USER

A few minutes later, I restarted winbind and everything was groovy again.

Users who are still cached seem to work fine, sometimes for hours.  It's only _some_ new connections that exhibit this problem.  Restarting winbindd fixes the issue 100% of the time.  I've tried to figure out how to 'detect' when winbind has crapped out on me, but wbinfo -tp and net ads testjoin both return what would be expected.  Most of the time, `id username` works fine, too, even for users who are most certainly not cached.
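
Since the usual health checks come back clean, one stopgap might be to watch the log itself for the error string.  What follows is only a rough sketch in Python, not something I have running in production: the function names, the five-minute window, and the threshold of one hit are all assumptions of mine, and it relies on the two-line entry format shown in the log excerpts above (a bracketed timestamp header, then an indented message line).

```python
import re
from datetime import datetime, timedelta

# Header format taken from the log excerpts above, e.g.
# [2011/04/26 09:20:54.671225,  5] winbindd/winbindd_getpwnam.c:138(...)
HEADER_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)?,\s*\d+\]"
)


def pipe_broken_times(lines):
    """Return the timestamp of every log line mentioning NT_STATUS_PIPE_BROKEN.

    The timestamp used is that of the most recent header line seen.
    """
    hits = []
    current = None
    for line in lines:
        match = HEADER_RE.match(line)
        if match:
            current = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        if current is not None and "NT_STATUS_PIPE_BROKEN" in line:
            hits.append(current)
    return hits


def looks_broken(lines, now, window=timedelta(minutes=5), threshold=1):
    """Guess whether winbind is failing: enough hits inside the window before `now`.

    `window` and `threshold` are arbitrary illustrative defaults.
    """
    recent = [
        t for t in pipe_broken_times(lines)
        if timedelta(0) <= now - t <= window
    ]
    return len(recent) >= threshold
```

It could be fed something like `open(path).readlines()` from a cron job, where `path` points at wherever log.winbind lives on the box, and the result used to raise an alert or trigger a restart.  Note that the message above is logged at debug level 5, so this would only see anything if winbindd is logging at least that verbosely.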

I also see a lot of noise in the logs about "more than 200 winbind connections, cleaning up idle connections".  This particular server usually has at least 400 client connections all the time, and a good chunk of those are rather active (distributed renderfarm).  I see that smb.conf will recognize "winbind max clients" in 3.6.  I also understand that I could change WINBINDD_MAX_SIMULTANEOUS_CLIENTS and recompile a new winbindd, but that's not an option for me for a few weeks.

Output from uname -a:
Linux cias-files 2.6.32-5-amd64 #1 SMP Mon Mar 7 21:35:22 UTC 2011 x86_64 GNU/Linux

I'm using Samba 3.5.6 on Debian Squeeze in ADS mode joined to a 2008R2 domain.  I'm using the RID idmap backend.

Any thoughts on how I can further troubleshoot/debug this problem?  Do you think that the 200 clients thing is an issue for me?



Jay Sullivan
jay.sullivan at rit.edu
