[Samba] Users created in last few years cannot login after 4.7 -> 4.8 + winbind

Paul Raines raines at nmr.mgh.harvard.edu
Thu Jan 3 22:46:51 UTC 2019

TLDR: after upgrading our CentOS 7.5 servers (Samba 4.7.x, security = ads,
no winbind) to CentOS 7.6 (Samba 4.8.x, security = ads + winbind), all user
accounts created in the last few years can no longer log in.

Explaining this requires a fairly long backstory.

Corporate is primarily a Windows shop, while our own research department
primarily uses Linux.  For over a decade we used our own account/group/file
namespace in our Linux infrastructure, totally separate from corporate.

A couple of years ago, for security hardening purposes, corporate dictated
that all logins be based off their AD server so they can
manage/monitor/enforce password changes, access, etc.

The issue was that we had petabytes of data using our accounts, which in
most cases had both different names and different underlying user IDs. For
example, my Linux username is raines with ID 5829 and my corporate/AD
username is per2 with ID 2040470.  And groups have no relation whatsoever.
Simply reconfiguring our Linux servers to do straight LDAP or winbind/nss
to corporate AD was not possible without a wholesale, painful re-ID-ing of
files and breakage of lots of apps that hard-code usernames in settings.

For all non-Samba resources (login, web, LDAP-based apps, ...) I could solve
this issue using LDAP SASL passthru.  In this scheme you set the
userPassword field of the user's LDAP record to something like

userPassword: {SASL}per2

and any authentication to the LDAP server for user 'raines' is passed
through to the AD server as authentication for user 'per2'.
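
As a rough illustration of what the {SASL} marker means (a Python sketch
only, not how the LDAP server is implemented; passthrough_identity is a
hypothetical helper name): a stored value that starts with {SASL} carries
no hash, only the identity to check against the external server.

```python
from typing import Optional

SASL_PREFIX = "{SASL}"


def passthrough_identity(user_password: str) -> Optional[str]:
    """Return the external identity named by a {SASL} value.

    Returns None when the value does not use the {SASL} scheme, i.e. when
    it is an ordinary locally verified password value.
    """
    if user_password.startswith(SASL_PREFIX):
        return user_password[len(SASL_PREFIX):]
    return None


# The record for Linux user 'raines' points at AD user 'per2'.
print(passthrough_identity("{SASL}per2"))
```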

The issue was that this did not work for Samba.  The solution I came up
with was to create a username map file, /etc/samba/users.map, with lines like

raines = MYDOMAIN\per2
aea32 = MYDOMAIN\aea32

and then have in smb.conf

         workgroup = MYDOMAIN
         security = ads
         passdb backend = tdbsam
         realm = MYDOMAIN.ORG

         dedicated keytab file = /etc/krb5.keytab
         kerberos method = secrets and keytab
         preferred master = no
         encrypt passwords = yes

         socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=65536 

         idmap config *:backend = tdb
         idmap config *:range = 100-999999
         idmap config MYDOMAIN:backend = ad
         idmap config MYDOMAIN:schema_mode = rfc2307
         idmap config MYDOMAIN:range = 1000000-9999999

         username map = /etc/samba/users.map
         username map cache time = 60

         winbind nss info = rfc2307
         winbind trusted domains only = no
         winbind use default domain = yes
         winbind enum users = yes
         winbind enum groups = yes
         winbind nested groups = yes
         winbind refresh tickets = yes
         allow trusted domains = yes

         server signing = auto
         client signing = auto
         client ntlmv2 auth = yes
         ntlm auth = yes
         lanman auth = no
         max protocol = SMB2

         map acl inherit = yes
         nt acl support = yes
         map archive = no
         create mode = 0770
         directory mode = 0770

     comment = poster volume
     path = /home/posters
     valid users = +webgp
     public = no
     writable = yes
     printable = no
     create mask = 664
     force create mode = 664
     directory mask = 2775
     force directory mode = 2775

but I would NOT run winbind.  When I did run winbind, I saw the same
behavior I see now with 4.8.x + winbind: many accounts could not log in.
Since I did not need winbind and everything worked without it, I just
didn't run it and didn't investigate further.
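
To make the two idmap ranges in the config above concrete, here is a
minimal Python sketch showing which configured range each of the two
example IDs falls into. The helper name owning_domain is hypothetical,
and Samba's real range handling is more involved than a simple interval
check.

```python
from typing import Dict, Optional, Tuple

# Ranges copied from the 'idmap config' lines in the smb.conf above.
IDMAP_RANGES: Dict[str, Tuple[int, int]] = {
    "*": (100, 999999),
    "MYDOMAIN": (1000000, 9999999),
}


def owning_domain(unix_id: int,
                  ranges: Dict[str, Tuple[int, int]] = IDMAP_RANGES
                  ) -> Optional[str]:
    """Return the idmap domain whose range contains unix_id, else None."""
    for domain, (low, high) in ranges.items():
        if low <= unix_id <= high:
            return domain
    return None


print(owning_domain(5829))     # legacy Linux ID of 'raines'
print(owning_domain(2040470))  # corporate/AD ID of 'per2'
```

The legacy ID lands in the default (*) range and the AD ID lands in the
MYDOMAIN range, so the two kinds of account are handled by different
backends.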

Also worth noting: since this policy was dictated, all new accounts we
create on the Linux side have usernames and underlying UIDs matching AD.
Groups are still totally independent.

Now with the 4.8.x upgrade in CentOS 7.6 I have to run winbind.

(Actually, curiously, some users still COULD log in with 4.8.x and no
  winbind running.  I have no idea how it worked, but it was a
  very small subset of very old users.)

And now with 4.8.x and winbind running, recently created users are unable
to log in.  Curiously, this does not coincide exactly with the username/UID
matching changeover: users created in approximately the first year after
that changeover CAN log in.

From the log files for a user where things work I will see lines like

   check_ntlm_password:  authentication for user [per2] -> [per2] -> [raines] 
   pdb_getsampwnam (TDB): error fetching database.
    Key: USER_raines
   Adding homes service for user 'raines' using home directory: 
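
The [per2] -> [per2] -> [raines] chain in the first log line matches the
users.map entry shown earlier. A simplified Python sketch of that lookup
follows. It is an illustration under assumptions: parse_username_map and
map_to_unix are hypothetical names, matching is assumed to be
case-insensitive, and features of the real username map format such as
quoted names, @group entries and the ! prefix are not modelled.

```python
from typing import Dict


def parse_username_map(text: str) -> Dict[str, str]:
    """Map each client-side name (right of '=') to a Unix name (left)."""
    mapping: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":   # skip blanks and comments
            continue
        unix_name, sep, client_names = line.partition("=")
        if not sep:                        # not a 'unix = client' line
            continue
        for client_name in client_names.split():
            mapping[client_name.lower()] = unix_name.strip()
    return mapping


def map_to_unix(mapping: Dict[str, str], client_name: str) -> str:
    """Return the mapped Unix name, or the name unchanged if unmapped."""
    return mapping.get(client_name.lower(), client_name)


USERS_MAP = "raines = MYDOMAIN\\per2\naea32 = MYDOMAIN\\aea32\n"
mapping = parse_username_map(USERS_MAP)
print(map_to_unix(mapping, "MYDOMAIN\\per2"))
print(map_to_unix(mapping, "MYDOMAIN\\aea32"))
```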

For a user where things do NOT work I see

   check_ntlm_password:  authentication for user [aea32] -> [aea32] -> [aea32] 
   sid_to_gid(S-1-5-21-8915387-943144406-1916815836-513) failed
   Failed to generate session_info (user and group token) for session setup: 
   NT error packet at ../source3/smbd/sesssetup.c(263) cmd=115 (SMBsesssetupX) 
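
One detail worth pulling out of the failing log: the SID in the
sid_to_gid line ends in 513, which is the well-known RID of the Domain
Users group. That suggests it is the group, rather than the user itself,
that fails to map to a gid. A small Python sketch of splitting a SID into
its domain part and RID (split_sid is a hypothetical helper name):

```python
from typing import Tuple

# Well-known RIDs for built-in domain groups.
WELL_KNOWN_DOMAIN_RIDS = {
    512: "Domain Admins",
    513: "Domain Users",
    514: "Domain Guests",
}


def split_sid(sid: str) -> Tuple[str, int]:
    """Split 'S-1-5-21-...-RID' into (domain SID, RID)."""
    domain_sid, _, rid = sid.rpartition("-")
    return domain_sid, int(rid)


domain_sid, rid = split_sid("S-1-5-21-8915387-943144406-1916815836-513")
print(domain_sid)
print(rid, WELL_KNOWN_DOMAIN_RIDS.get(rid, "not a well-known RID"))
```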

If I run 'strings -a /var/lib/samba/group_mapping.tdb' I see

TDB file
TDB file

Also I see

# wbinfo --domain=MYDOMAIN
MYDOMAIN\Domain Users 2

When I downgrade back to 4.7.x with NO winbind everything works again
for everybody as it did before.

I have tried rejoining the domain.  I have tried wiping out everything
in /var/lib/samba.

Any clues as to what is going on?

To make things more confusing, there is one 4.8.x + winbind server
where new users CAN log in fine.  I have tried recreating its setup
exactly on a different server, but it behaves like the others, denying all
accounts created in the last few years.


It might be related to this thread but maybe not



I thought I had a solution: I changed my 'idmap config' stanza to be just
the two lines

         idmap config *:backend = tdb
         idmap config *:range = 100-9999999

covering the entire range.  This worked on one server for one new user
who could not log in before, but it did not work for that same user on
another server.  And when a second user who previously was unable
to log in tried, it failed on both servers.

The error message changes slightly in the log:

   check_ntlm_password:  authentication for user [aea32] -> [aea32] -> [aea32] 
[2019/01/03 15:55:56.770349,  1] 
   SID S-1-5-21-8915387-943144406-1916815836-1333656 -> getpwuid(100) failed
[2019/01/03 15:55:56.770401,  1] 
   Failed to generate session_info (user and group token) for session setup: 
[2019/01/03 15:55:56.770488,  3] ../source3/smbd/error.c:82(error_packet_set)
   NT error packet at ../source3/smbd/sesssetup.c(263) cmd=115 (SMBsesssetupX) 
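
The getpwuid(100) in that log is the low end of the configured range,
which looks as if the SID was given a freshly allocated ID that no
passwd entry backs. Whether a given ID resolves through NSS can be
checked outside Samba. A minimal Python sketch (uid_resolves is a
hypothetical helper name; the lookup parameter exists only so the check
can be exercised without a real passwd database):

```python
import pwd
from typing import Callable


def uid_resolves(uid: int,
                 lookup: Callable[[int], object] = pwd.getpwuid) -> bool:
    """Return True if the UID has a passwd entry visible through NSS."""
    try:
        lookup(uid)
    except KeyError:   # pwd.getpwuid raises KeyError for unknown UIDs
        return False
    return True


# IDs taken from the post; results depend on the machine this runs on.
for uid in (100, 5829, 1000000):
    print(uid, uid_resolves(uid))
```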

I tried 'idmap config *:backend = nss' and, seemingly at random, it worked
on one server but failed on another with

   check_ntlm_password:  authentication for user [nmr27] -> [nmr27] -> [nmr27] 
[2019/01/03 17:28:28.480615,  1] 
   SID S-1-5-21-8915387-943144406-1916815836-949594 -> getpwuid(1000000) failed
[2019/01/03 17:28:28.480673,  3] 
   Failed to add local groups
[2019/01/03 17:28:28.480738,  3] 
   smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] 
status[NT_STATUS_UNSUCCESSFUL] || at ../source3/smbd/smb2_sesssetup.c:137

A lot of this just seems to be totally random.

