[Samba] id mapping

Mon Sep 20 14:42:13 UTC 2021

First of all, I really appreciate the helpful discussion and comments. 
Based on Rowland's comment below, which I believe I also saw verbatim in 
a StackExchange thread, I'm switching gears for a second to an entirely 
different system I maintain which is larger and more critical.

On 9/19/21 4:25 PM, Rowland Penny via samba wrote:
> 
> You used to be able to use sssd with Samba, but from Samba 4.8.0 , the
> smbd binary must go via winbind to get to AD. This means, because sssd
> has its own version of the winbind libs, you cannot use sssd anymore.
> It may seem to work, but it will not work correctly and it isn't
> supported any longer.
>   

I'm the administrator for a structural biology lab. We did the initial 
SARS-CoV-2 imaging used to design both the Pfizer and Moderna mRNA 
vaccines. You might have noticed that is an ongoing situation, so there 
is some mild interest in keeping this facility operational.

The system consists mostly of linux workstations and servers, but with a 
few Windows machines used for some computational biology packages that 
only run on Windows.  Also, the microscope vendors supply control PCs 
which all run Windows, and I have to interface with these.

The university we're a part of runs a Windows Active Directory domain 
service (AUSTIN) with some rather unfortunate design constraints (and 
architectural decisions) that I have no control over.

This domain is itself populated dynamically from an authoritative X.500 
directory. This means we can't use (or maybe the AD team can't figure 
out how to facilitate) RFC2307 extensions because the entire directory 
is rewritten every day. Furthermore, they refuse to allow any domain 
trusts. The latter matters because, when all else is lost, you can 
always add another layer of indirection, say by having your linux 
users authenticate against a FreeIPA directory which has a trust 
relationship with the domain, as discussed in a series of blog posts 
of which this is a representative one:
https://www.redhat.com/en/blog/i-really-cant-rename-my-hosts
(This might work with a Samba AD instead of FreeIPA, too; I'm not sure. 
Also, if someone looks at the blog and is confused by my comments, AFAIK 
IdM is basically FreeIPA supported by Red Hat.)

Largely based on the chutzpah of naivete, we decided to use the 
University's AUSTIN directory for authentication and authorization 
anyway. So far this has worked pretty well using both sssd and Samba.

All the linux hosts are bound to the domain using sssd, and the file 
servers also run smbd. Samba seems to be required on all the linux 
machines regardless of whether they serve files; I'm not exactly sure 
why. It took a while to get this all working, and now I just use a 
recipe to set up new machines.

Because sssd uses a fixed algorithm for mapping SIDs to UIDs, the users' 
UIDs are consistent across all machines.
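
For anyone unfamiliar with why the UIDs come out the same everywhere, here 
is a rough sketch of the arithmetic as I understand it from the sssd-ldap 
man page. The range values are the documented defaults for 
ldap_idmap_range_min and ldap_idmap_range_size; the function names, the 
example SID, and the slice numbers are made up for illustration. The step 
where sssd picks the slice (a hash of the domain SID) is not reproduced; 
the slice is simply passed in.

```shell
#!/bin/sh
# Sketch only: mirrors the documented sssd idmap arithmetic, not sssd's code.
RANGE_MIN=200000     # assumed default of ldap_idmap_range_min
RANGE_SIZE=200000    # assumed default of ldap_idmap_range_size

# The RID is the last dash-separated component of the SID.
sid_rid() {
    printf '%s\n' "${1##*-}"
}

# sid_to_uid SID SLICE: UID = range_min + slice * range_size + RID.
# SLICE is an input here; sssd derives it from the domain SID.
sid_to_uid() {
    sid=$1
    slice=$2
    rid=$(sid_rid "$sid")
    if [ "$rid" -ge "$RANGE_SIZE" ]; then
        echo "RID $rid does not fit in a single slice" >&2
        return 1
    fi
    echo $(( RANGE_MIN + slice * RANGE_SIZE + rid ))
}

# Hypothetical SID with RID 1104 in slice 0
sid_to_uid S-1-5-21-1-2-3-1104 0    # prints 201104
```

Since every host does the same computation from the same SID, the result 
only stays consistent as long as the idmap range settings in sssd.conf are 
identical on all machines.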

Access is controlled using AD Security Groups. A GPO is associated with 
each host that restricts console and/or remote access to one or more 
security groups.

In a typically open scientific context, that would be the end, but 
because we have groups that have contracts for work with, say, Pfizer, I 
have to make sure that some data has read access restricted to a 
particular group, with further restrictions on who can write to the 
collection. The way I implemented this is to have all files owned by a 
local dummy user/group on the file server with very lax permissions (but 
no ability to log in) and then use POSIX extended ACLs for actual access 
restrictions; e.g.

   # setfacl -d -m g:cns-smithlabusers:rX smithdata
   # setfacl -d -m u:smith8437:rwX smithdata

where cns-smithlabusers is an AD Security Group consisting of Smith Lab 
researchers, and smith8437 is the AD UserName of the PI, the only 
person authorized to edit the data in this case. To apply the same 
restrictions to files that already exist, something like

   # setfacl -R -m g:cns-smithlabusers:rX smithdata
   # setfacl -R -m u:smith8437:rwX smithdata

This works remarkably well and is pretty flexible. For example, we run a 
distributed software stack which needs read access to all the image 
data and which runs as a local user.  To facilitate this:

   setfacl -d -m u:cryosparc_user:rX EMimages
   setfacl -R -m u:cryosparc_user:rX EMimages

cryosparc_user is a local user on the linux workstations, and is then 
able to access the data, which is NFS-mounted from one or more of the 
fileservers (using sec=sys), so authorization questions are deferred to 
the file server.  Yes, I understand that this isn't really secure against 
a determined attempt to access the data; it's a convenience/security 
trade-off the PIs are OK with.
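
The default-plus-recursive pattern in the setfacl examples above could be 
wrapped in a small helper so the two commands never get out of sync. This 
is only a sketch: apply_acl and the RUN variable are names I made up, and 
RUN=echo is just a dry-run convenience that prints the commands instead of 
executing them.

```shell
#!/bin/sh
# Hypothetical helper: apply one ACL entry both as a default ACL on the
# directory (inherited by newly created files) and recursively to what
# already exists. Set RUN=echo to print the commands without running them.
apply_acl() {
    entry=$1
    dir=$2
    ${RUN:-} setfacl -d -m "$entry" "$dir"
    ${RUN:-} setfacl -R -m "$entry" "$dir"
}

# Dry run with the names from the examples above
RUN=echo apply_acl g:cns-smithlabusers:rX smithdata
RUN=echo apply_acl u:smith8437:rwX smithdata
```

Without RUN set, the same calls run setfacl for real, and getfacl on the 
directory shows the resulting entries.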

Back to Rowland's comment.  The linux workstations share data over NFS, 
which defers POSIX ACL authorization back to the file server, so no 
problems there. And so far, everything has worked properly on the 
Windows machines as well; i.e. users are able to log in to the Windows 
machines using their University credentials and their home directories 
and relevant image collections are mounted automatically from the 
fileservers (again, facilitated via Group Policy). However, I checked, 
and our main file server is running an older version of Samba:

cnsit@kraken:/EM$ dpkg -l | grep " samba "
ii  samba  2:4.7.6+dfsg~ubuntu-0ubuntu2.11  amd64  SMB/CIFS file, print, and login server for Unix
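
To check a batch of machines against the 4.8.0 cutoff Rowland mentions, 
something along these lines might do. It is a sketch, not a tested tool: 
the function names are invented, the version string is the one from the 
dpkg output above, and it assumes a sort that supports -V (GNU coreutils 
does).

```shell
#!/bin/sh
# Reduce a Debian/Ubuntu package version to the upstream Samba version,
# e.g. "2:4.7.6+dfsg~ubuntu-0ubuntu2.11" becomes "4.7.6".
samba_upstream_version() {
    v=${1#*:}                      # drop the epoch ("2:"), if any
    printf '%s\n' "${v%%[+~-]*}"   # drop the packaging suffix
}

# version_ge A B: succeeds when version A is at least version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed=$(samba_upstream_version "2:4.7.6+dfsg~ubuntu-0ubuntu2.11")
if version_ge "$installed" 4.8.0; then
    echo "$installed: at or past 4.8.0, smbd goes via winbind"
else
    echo "$installed: predates 4.8.0"
fi
```

On a live host the version string could come from 
dpkg-query -W -f='${Version}' samba instead of being hard-coded.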

Now it looks like I'm going to have to rethink the entire system 
architecture if I want to upgrade the file server from Ubuntu 18.04 to 
anything newer?  (Ubuntu 20.04 ships 4.11.6).  This is going to be a 
problem, as all the files are tied to the UIDs and GIDs generated by 
sssd. I'm not sure a re-architecture is realistic in a very active 
research environment. The solution is likely going to involve 
virtualizing all the Windows machines and using IOMMU to provide PCIe 
passthrough for whatever GPUs they need for processing.

Any thoughts on this would be appreciated.