Duplicate SMB file_ids leading to Windows client cache poisoning

Christof Schmitt cs at samba.org
Fri Dec 10 19:08:46 UTC 2021

On Fri, Dec 10, 2021 at 09:57:07AM -0800, Jeremy Allison via samba-technical wrote:
> On Fri, Dec 10, 2021 at 06:06:56PM +0100, Ralph Boehme via samba-technical wrote:
> > On 12/10/21 17:56, Andrew Walker wrote:
> > > That's a good point, but if MacOS SMB client is faking up an inode
> > > number based on a hash of the filename in the zero-id case, isn't it
> > > even more likely to yield a collision at some point?
> > 
> > well, it's somehash(name) combined with the parent-inode number. I know,
> > it's not ideal.
> > 
> > But going back to inode numbers brings us back to:
> > 
> > https://bugzilla.samba.org/show_bug.cgi?id=12715
> > 
> > *scratches head*
> OK, seems to me that we need inode numbers by default,
> as that's what works with both Windows and Linux clients.
> If Macs need special handling here, then we have the
> capability to detect them and switch out the inode
> numbers for Mac clients (fruit... :-) :-).

Not every Samba server where Mac clients connect has vfs_fruit enabled.
And requiring vfs_fruit to prevent data corruption seems like a big
step. The requirement for Mac clients is the same, no matter whether
the fileid is generated in vfs_default or vfs_fruit.

The initial problem is fairly easy to recreate: Use a Samba version that
reports inode numbers as file ids, create 100 different files with
different data from MacOS. Now go to the file system, delete those files
and create files with the same name and different data. Then read those
files on the Mac client. Chances are that the Mac client will now show
the data from the old files because of the fileid-based caching: if a
new file got the same inode, it has the same fileid.
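The server-side half of that recipe can be checked without any SMB
client. The following is only a rough sketch, not something from this
thread: the helper name count_reused_inodes and the file names are made
up for illustration, and how many inode numbers get recycled depends
entirely on the local file system.

```python
import os
import tempfile


def count_reused_inodes(directory, count=100):
    """Create files, delete them, recreate them with the same names,
    and count how many new files got a previously used inode number."""
    names = [os.path.join(directory, f"file{i:03d}.txt") for i in range(count)]

    # First generation: the data a client would cache under each file id.
    old_inodes = set()
    for name in names:
        with open(name, "w") as f:
            f.write("old data for " + name)
        old_inodes.add(os.stat(name).st_ino)

    # Delete everything directly on the file system.
    for name in names:
        os.unlink(name)

    # Second generation: same names, different data.
    reused = 0
    for name in names:
        with open(name, "w") as f:
            f.write("new data for " + name)
        if os.stat(name).st_ino in old_inodes:
            reused += 1
    return reused


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(count_reused_inodes(d), "of 100 new files reuse an old inode number")
```

Any nonzero count means that a server reporting plain inode numbers as
file ids would hand out an id that a client may still have cached for
the old data.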

As this thread shows, timestamps are not a universal solution.
In vfs_gpfs, we are able to retrieve the inode number and the inode
generation and use that to generate unique ids. That seems to work
reasonably well. But even the standard Linux statx() call does not
return the generation number. So that would only be a solution for file
systems that provide an interface to query the generation number.
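To illustrate the idea of combining the two values, here is a minimal
sketch. The bit layout (generation in the upper 32 bits, low 32 bits of
the inode number below it) and the name make_file_id are assumptions
chosen for the example; this is not the code used in vfs_gpfs.

```python
def make_file_id(inode, generation):
    """Pack a generation number and an inode number into one 64-bit id.

    Hypothetical layout: upper 32 bits hold the generation, lower 32
    bits hold the (truncated) inode number.
    """
    return ((generation & 0xFFFFFFFF) << 32) | (inode & 0xFFFFFFFF)


# The same inode number, reused after a delete with a new generation,
# now maps to a different file id.
old_id = make_file_id(1234, 1)
new_id = make_file_id(1234, 2)
assert old_id != new_id
```

With this layout, inode numbers that differ only above bit 31 would
collide, so a real implementation would need a mapping that fits the
value ranges of the file system in question.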

For additional fun, the spec says:

    2.1.9 64-bit file ID
    The identifier SHOULD<10> be unique to the volume and stable until the
    file is deleted.

So technically the Mac is wrong to expect identifiers to be unique
across file deletions. I assume this comes from the old MacOS file
systems that have a CNID that is only increasing.

I am not sure how to solve this for the general case...

