[Samba] intermittent pam_winbind authentication failure

Rowland Penny rpenny at samba.org
Wed Jun 11 13:22:54 UTC 2025


On Wed, 11 Jun 2025 13:52:55 +0100
James Dingwall via samba <samba at lists.samba.org> wrote:

> Hi Rowland,
> 
> Thank you for taking the time to reproduce on your system.  I've
> added some answers to your questions inline and given some details
> about the current state of my testing.
> 
> > Date: Tue, 3 Jun 2025 13:46:23 +0100
> > From: Rowland Penny <rpenny at samba.org>
> > To: samba at lists.samba.org
> > Subject: Re: [Samba] intermittent pam_winbind authentication failure
> > 
> > On Tue, 3 Jun 2025 11:10:54 +0100
> > James Dingwall via samba <samba at lists.samba.org> wrote:
> > 
> > > Hi,
> > > 
> > > We've been having an intermittent issue with pam_winbind on Ubuntu
> > > 24.04. The test case we have to demonstrate this is to run this
> > > loop after logging in as a domain user:
> > > 
> > > $ while sleep 1 ; do sudo -k ; sudo -K ; date ; echo "password" |
> > > sudo -S /bin/echo "sudo success" || break ; done
> > > 
> > > The loop will run successfully, sometimes for 50+ iterations, but
> > > eventually fails:
> > > 
> > > [sudo] password for DOMAIN\user: sudo success
> > > ...
> > > [sudo] password for DOMAIN\user: sudo success
> > > [sudo] password for DOMAIN\user: Sorry, try again.
> > > [sudo] password for DOMAIN\user:
> > > sudo: no password was provided
> > > sudo: 1 incorrect password attempt
> > > 
> > > Rejection of the correct password can also happen with a console
> > > logon or remote ssh.  The system is joined to a Windows 2012 based
> > > domain.  Ubuntu 24.04 ships with packages based on Samba 4.19.5,
> > > but packages rebuilt from 25.04 (Samba 4.21.4) have the same
> > > issue.
> > > 
> > > If I try to run winbindd with -d 3 or higher or with strace
> > > attached I'm unable to reproduce the issue which makes me suspect
> > > a timing issue passing the response between winbindd and
> > > pam_winbind.  At -d 2 the winbindd logs associated with the
> > > failure:
> > > 
> > > May 22 09:14:36 hostname winbindd[853342]: ads_krb5_mk_req:
> > > smb_krb5_get_credentials failed for S00099-HOST$@DOMAIN.COM
> > > (Preauthentication failed)
> > > May 22 09:14:36 hostname winbindd[853342]: failed to get ticket for
> > > S00099-HOST$@DOMAIN.COM: Preauthentication failed
> > > May 22 09:14:36 hostname winbindd[853342]: _wbint_PamAuth: Plain-text
> > > authentication for user DOMAIN\user returned
> > > NT_STATUS_LOGON_FAILURE (PAM: 7)
> > > May 22 09:14:36 hostname winbindd[853342]: Auth: [winbind,PAM_AUTH,
> > > PAM_WINBIND[sudo], 871758] user [DOMAIN]\[user] at [Thu, 22 May 2025
> > > 09:14:36.413942 BST] with [Plaintext] status [NT_STATUS_LOGON_FAILURE]
> > > workstation [(null)] remote host [unix:] mapped to
> > > [(null)]\[(null)]. local host [unix:]
> > > May 22 09:14:36 hostname winbindd[853342]: {"timestamp":
> > > "2025-05-22T09:14:36.424126+0100", "type": "Authentication",
> > > "Authentication": {"version": {"major": 1, "minor": 3},
> > > "eventId": 4625, "logonId": "db5db55c0b2b6903", "logonType": 8,
> > > "status": "NT_STATUS_LOGON_FAILURE", "localAddress": "unix:",
> > > "remoteAddress": "unix:", "serviceDescription": "winbind",
> > > "authDescription": "PAM_AUTH, PAM_WINBIND[sudo], 871758",
> > > "clientDomain": "DOMAIN", "clientAccount": "user", "workstation":
> > > null, "becameAccount": "", "becameDomain": "", "becameSid": null,
> > > "mappedAccount": null, "mappedDomain": null, "netlogonComputer":
> > > null, "netlogonTrustAccount": null, "netlogonNegotiateFlags":
> > > "0x00000000", "netlogonSecureChannelType": 0,
> > > "netlogonTrustAccountSid": null, "passwordType": "Plaintext",
> > > "clientPolicyAccessCheck": null, "serverPolicyAccessCheck": null,
> > > "duration": 98176}}
> > > 
> > > Occasionally, on a successful attempt, I get:
> > > 
> > > Jun 03 08:27:44 hostname winbindd[774711]: final write to client
> > > failed: Broken pipe
> > > 
> > > In the domain controller event log I get a pair of 4768 / 4769
> > > event ids for the preceding success cases and the failure case
> > > (clocks are in sync so I'm reasonably confident I've matched
> > > these up correctly) so it seems winbindd has had a successful
> > > exchange with the domain controller.
> > > 
> > > A similar loop to the test case using `echo "password" | wbinfo
> > > --pam-logon="${USER}"` runs reliably.  However I don't see
> > > anything in the winbind logs so I'm not sure that "Attempt to
> > > authenticate a user in the same way pam_winbind would do." is
> > > identical in the technical implementation.
> > > 
> > > Are there any alternative approaches I could take to try and
> > > uncover what is happening?
> > > 
> > > Thanks,
> > > James
> > > 
> > > 
> > > /etc/pam.d/common-auth includes:
> > > 
> > > auth    [success=ignore default=die]    pam_faillock.so preauth deny=6 unlock_time=1800 silent
> > > auth    [success=ok default=1]          pam_localuser.so
> > > auth    [success=3 default=ignore]      pam_unix.so try_first_pass
> > > auth    [success=2 default=ignore]      pam_winbind.so krb5_auth krb5_ccache_type=FILE cached_login try_first_pass debug
> > > auth    optional                        pam_faillock.so authfail deny=6 unlock_time=1800
> > > auth    requisite                       pam_deny.so
> > > 
> > > 
> > > The smb.conf we're using:
> > > 
> > > [global]
> > >   workgroup = DOMAIN
> > >   realm = DOMAIN.COM
> > >   netbios name = S00099-host
> > >   security = ads
> > >   server role = member server
> > >   dedicated keytab file = /etc/krb5.keytab
> > >   kerberos method = secrets and keytab
> > >   allow trusted domains = no
> > >   server string = %h server (Samba, Ubuntu)
> > >   disable netbios = yes
> > >   password server = *
> > >   winbind enum groups = yes
> > >   winbind enum users = yes
> > >   winbind nested groups = yes
> > >   winbind refresh tickets = no
> > >   template shell = /bin/bash
> > >   template homedir = /home/local/%D/%U
> > >   idmap config * : backend              = autorid
> > >   idmap config * : range                = 2900000001-3000000000
> > >   idmap config * : ignore builtin       = yes
> > >   idmap config DOMAIN : backend   = rid
> > >   idmap config DOMAIN : range     = 3000000001-3100000000
> > >   map to guest = bad user
> > >   guest account = nobody
> > >   log file = /var/log/samba/log.%m
> > >   log level = 1
> > >   max log size = 5000
> > >    load printers = no
> > >    printing = bsd
> > >    printcap name = /dev/null
> > >    disable spoolss = yes
> > >    dns proxy = no
> > >    wins support = no
> > >    domain master = no
> > >    local master = no
> > >    preferred master = no
> > >    store dos attributes = yes
> > >    map hidden = no
> > >    map readonly = no
> > >    map system = no
> > >    map archive = no
> > >    hide dot files = no
> > >    enable core files = yes
> > >    min receivefile size = 131072
> > >    aio read size = 1
> > >    aio write size = 1
> > >    use sendfile = yes
> > >    unix charset = UTF8
> > >    ea support = yes
> > >    map acl inherit = yes
> > >    acl map full control = no
> > >    unix extensions = no
> > >    inherit acls = no
> > >    follow symlinks = yes
> > >    wide links = yes
> > > 
> > 
> > OK, I took your one line script and turned it into this:
> > 
> > #!/bin/bash
> > 
> > password=$1
> > 
> > counter=0
> > while sleep 1 ; do
> > 	echo "TRY: $counter"
> > 	sudo -k
> > 	sudo -K
> > 	date
> > 	echo "$password" | sudo -S /bin/echo "sudo success" || break
> > 	counter=$((counter+1))
> > done
> > 
> > I then ran it and it is still running, 'TRY' is now in excess of
> > 2,500 and still going up.
> > 
> > I feel your problems possibly lie in your /etc/pam.d/common-auth
> > file and rather strange (well, strange to me) idmap config lines in
> > smb.conf.
> > 
> > My common-auth contains this:
> > 
> > auth	[success=2 default=ignore]	pam_unix.so nullok
> > auth	[success=1 default=ignore]	pam_winbind.so krb5_auth krb5_ccache_type=FILE cached_login try_first_pass
> > auth	requisite			pam_deny.so
> > auth	required			pam_permit.so
> > 
> > What do you expect your extra/different lines to do?
> 
> We've added pam_faillock to have consistent password lockout
> behaviour for a system regardless of any domain controller policies
> or whether the user is local or domain.  The pam_localuser line is to
> skip pam_unix if the user is not present in /etc/passwd.  pam_unix
> was complaining for non-local accounts and when we were ingesting
> logs from *lots* of systems to a central processor this was very
> noisy so skipping over prevents that.

You really need very few 'local users' (by which I mean users that are
in /etc/passwd but aren't 'system' users). I have only one, just in
case a domain connection error occurs; otherwise members of the Domain
Admins group are allowed admin access (use another group if you use the
'ad' idmap backend).

> 
> > 
> > You appear to be using both the 'autorid' and 'rid' idmap backends
> > and you are telling autorid to ignore the BUILTIN domain, so what
> > is mapping the required BUILTIN users and groups, I haven't a clue.
> > I suggest you change 'autorid' to 'tdb'. I also have to ask, why
> > are you using such high numbers for the ranges ?
> > 
> > These are my idmap config lines:
> > 
> >   idmap config * : backend = tdb
> >   idmap config * : range = 3000-7999
> >   idmap config SAMDOM : backend  = rid
> >   idmap config SAMDOM : range = 10000-999999
> > 
> > By the way, 'TRY' is now approaching 3,000 and still going.
> 
> We didn't (obviously) have a need for BUILTIN for just winbind, there
> are no shares or printing etc.  We felt that the idmap_autorid was
> more suitable for our environment as it does not require a database
> which needs to be managed.  The ranges are large to reduce the
> chances of a collision in the autorid scheme.

Yes, I can understand using 'autorid', just not with 'rid' at the same
time. I have never set up Samba in the way you have, but I presume that
'rid' is mapping your 'DOMAIN' users. I have no idea if 'autorid' is
doing anything, because you have stopped it from mapping the BUILTIN
domain and, yes, you do need the BUILTIN domain.
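
For reference, a rough sketch of the arithmetic the 'rid' backend is
documented to use (ID = RID - BASE_RID + LOW_RANGE_ID), applied to the
two ranges quoted in this thread. The function name and the RID 1104
are made up for illustration, and base_rid is assumed to be the
default of 0; this is not Samba code.

```python
def rid_to_unix_id(rid, range_low, range_high, base_rid=0):
    """Map a domain RID to a Unix ID the way idmap_rid documents it.

    Returns None when the result falls outside the configured range,
    which is the case where no mapping would be handed out.
    """
    unix_id = rid - base_rid + range_low
    if range_low <= unix_id <= range_high:
        return unix_id
    return None


# Hypothetical RID 1104 under the two 'rid' ranges from this thread.
print(rid_to_unix_id(1104, 10000, 999999))           # 11104
print(rid_to_unix_id(1104, 3000000001, 3100000000))  # 3000001105
```

Either range maps the same RID deterministically; only the offset
differs, so the size of the numbers by itself should not matter to
'rid'.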
 
> 
> The current state is that I have found the code responsible for
> generating the "(Preauthentication failed)" error, from
> third_party/heimdal/lib/krb5/fast.c:
> 
> krb5_error_code
> _krb5_fast_unwrap_kdc_rep(krb5_context context, int32_t nonce,
>                           krb5_data *chksumdata,
>                           struct krb5_fast_state *state, AS_REP *rep)
> {
> ...
>     if (nonce != (int32_t)fastrep.nonce) {
>         ret = KRB5KDC_ERR_PREAUTH_FAILED;
>         goto out;
>     }
> ...
> }
> 
> I added a debug statement there and for one failure:
> 
> 2025-06-11T10:22:39 _krb5_fast_unwrap_kdc_rep: ret =
> KRB5KDC_ERR_PREAUTH_FAILED (3) (-4476732 != 12300484)
> 
> Obviously the numbers aren't the same but printing the binary
> representation it's almost a perfect bit flip (I think this is two's
> complement representation because I used %d)
> 
> >>> "{0:b}".format(12300484)
> '101110111011000011000100'
> >>> "{0:b}".format(-4476732)
> '-10001000100111100111100'
> 
> For another occurrence, printing the nonce as %lu:
> 
> 
> 2025-06-11T12:59:10 get_cred_kdc: generated nonce: 4286841863
> 2025-06-11T12:59:10 _krb5_fast_unwrap_kdc_rep: ret =
> KRB5KDC_ERR_PREAUTH_FAILED (3) (4286841863 != 8651783)
> 
> >>> "{0:b}".format(4286841863)
> '11111111100001000000010000000111'
> >>> "{0:b}".format(8651783)
>         '100001000000010000000111'
> 
> The left hand side is definitely the generated value so perhaps an
> issue sending the request or unpacking the response?  (If I
> understood it correctly, the value is sent to the domain controller
> and then returned in the reply.)
> 
> Thanks,
> James
> 
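
The two nonce pairs above can be compared as unsigned 32-bit values,
which is roughly what the int32_t comparison in fast.c sees. This is
only a sketch of the arithmetic on the four numbers from the debug
output; the helper name as_u32 is made up, and it says nothing about
where the difference is introduced.

```python
def as_u32(value: int) -> int:
    """Reinterpret a Python int as an unsigned 32-bit two's complement value."""
    return value & 0xFFFFFFFF


# (generated nonce, nonce unpacked from the reply), copied from the debug output
pairs = [(-4476732, 12300484), (4286841863, 8651783)]

for sent, received in pairs:
    s, r = as_u32(sent), as_u32(received)
    print(f"sent={s:#010x} received={r:#010x} xor={s ^ r:#010x}")

# sent=0xffbbb0c4 received=0x00bbb0c4 xor=0xff000000
# sent=0xff840407 received=0x00840407 xor=0xff000000
```

Seen this way, neither case is a bit flip across the whole value: the
low 24 bits match exactly and only the top byte differs (0xff sent,
0x00 received). Python's "{0:b}" prints a minus sign followed by the
magnitude rather than two's complement, which is why the first pair
looked inverted.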

If it is of any interest, I finally killed my script when it got over
9000; it just wouldn't fail for me.

Rowland


