[Samba] Winbind occasionally forgets some users (failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND)

Jeremy Allison jra at samba.org
Thu Aug 25 00:22:30 UTC 2016

On Wed, Aug 24, 2016 at 12:12:29PM -0700, Aaron C. de Bruyn via samba wrote:
> I've been googling, but can't seem to find an answer to this one.
> We have a Windows network with ~25 sites.  Each site has a local Windows
> DC.  Each site has a Debian 8.5 box running Samba 4.2.10-Debian.
> We decided to test moving shares from the Windows Server to the local
> Debian machine.  (zfs snapshot is *really* handy when someone decides to
> open cryptolocker).
> File sharing has been working perfectly for about 6 months with one
> exception.
> Occasionally (for no reason I can find), winbind 'forgets' a handful of
> users.
> If I run 'wbinfo -i <a working user>' I get their name, home directory,
> shell, etc...
> If I run it against a non-working-user, I get:
> failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
> Could not get info for user <some user>
> These users can no longer connect to shares.  They don't show up in the
> 'net ads user' list, etc...
> I don't think it's a disconnect with AD because newly created users will
> sometimes show up after waiting ~15 minutes for the replication delay.
> Sometimes they won't.
> A 'net cache flush' doesn't fix the issue.
> I have to stop winbind and samba (causing problems for all users), and
> basically rm -rf everything under /var/lib/samba/, then run 'net ads join'
> and start the services back up.
> All users show up at that point.
> It's difficult to test because the issue appears to happen randomly between
> a few days and a few weeks.
> Logs don't reveal anything.

Logs will be key here. Read alongside the
source code, they should show where a
successful lookup and a failed lookup
diverge, and let you trace the different
code paths.

Can't help much more without them.
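
One way to do that comparison, as a minimal sketch: capture the winbindd
log output for one working 'wbinfo -i' and one failing 'wbinfo -i' (with
the debug level raised, e.g. 'smbcontrol winbindd debug 10'), then diff
the two excerpts after blanking out the fields that always differ. The
function names and the header pattern below are assumptions for
illustration; adjust the regular expression to whatever your log headers
actually look like.

```python
import difflib
import re

# Assumption: log header lines carry a timestamp such as
# "2016/08/24 12:12:29.123456" and a "pid=NNNN" field. Both differ on
# every run, so they are blanked out before comparing.
HEADER_NOISE = re.compile(
    r"\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)?|pid=\d+"
)


def normalize(log_text):
    """Return non-empty log lines with timestamps and pids removed."""
    return [
        HEADER_NOISE.sub("", line).strip()
        for line in log_text.splitlines()
        if line.strip()
    ]


def diff_lookups(good_log, bad_log):
    """Unified diff of a working lookup's log against a failing one.

    Lines prefixed with '-' appear only in the working lookup, lines
    prefixed with '+' appear only in the failing lookup.
    """
    return list(
        difflib.unified_diff(
            normalize(good_log),
            normalize(bad_log),
            fromfile="working-user",
            tofile="failing-user",
            lineterm="",
        )
    )
```

To use it, save the two log excerpts to files (the names good.log and
bad.log here are hypothetical), read them in, and print the result:
print("\n".join(diff_lookups(open("good.log").read(), open("bad.log").read()))).
The first '+' or '-' line is usually the place to start reading the
source.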
