Can fallocate() ops be emulated better using SMB request compounding?

David Howells dhowells at redhat.com
Thu Dec 7 15:58:46 UTC 2023


Hi Steve, Namjae, Jeremy,

At the moment certain fallocate() operations aren't very well implemented in
the cifs filesystem on Linux, either because the protocol doesn't fully
support them or because the ops being used don't also set the EOF marker at
the same time and a separate RPC must be made to do that.

For instance:

 - FALLOC_FL_ZERO_RANGE does some zeroing and then sets the EOF as two
   separate operations.  The code refuses to do this op under some
   circumstances because the client doesn't hold an oplock and doesn't want
   to race with a third party (note that smb3_punch_hole() doesn't have this
   check).

 - FALLOC_FL_COLLAPSE_RANGE uses COPYCHUNK to move the file data down and then
   sets the EOF, as two separate operations, because there is no protocol op
   for this.  However, the copy will likely fail if the source and destination
   ranges overlap, and the sequence is non-atomic with respect to a third
   party.

 - FALLOC_FL_INSERT_RANGE has the same issues as FALLOC_FL_COLLAPSE_RANGE.
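
To make the overlap problem concrete, here is a rough sketch of the arithmetic
behind emulating a collapse with a copy plus a set-EOF.  The struct and
function names are made up for illustration and are not anything in the cifs
code; it also ignores the block-alignment rules that the real fallocate()
imposes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the two-step emulation; not a cifs structure. */
struct collapse_plan {
	uint64_t src_off;   /* where the tail of the file currently starts */
	uint64_t dst_off;   /* where the tail has to end up */
	uint64_t copy_len;  /* number of tail bytes to move down */
	uint64_t new_eof;   /* EOF to set once the copy is done */
	bool     overlaps;  /* source and destination ranges intersect */
};

/*
 * Work out the copy and the new EOF needed to emulate collapsing
 * [off, off + len) out of a file whose current size is eof.
 * Returns false if the range is empty or reaches/passes the EOF.
 */
static bool plan_collapse(uint64_t off, uint64_t len, uint64_t eof,
			  struct collapse_plan *p)
{
	if (len == 0 || off >= eof || len >= eof - off)
		return false;

	p->src_off  = off + len;
	p->dst_off  = off;
	p->copy_len = eof - p->src_off;
	p->new_eof  = eof - len;
	/* dst < src, so the ranges intersect when dst + copy_len > src. */
	p->overlaps = p->copy_len > len;
	return true;
}
```

So the two ranges overlap whenever the tail being moved is longer than the
hole being collapsed, which is the common case for a small collapse near the
start of a large file.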

Question: Would it be possible to do all of these better by using compounding
with SMB2_FLAGS_RELATED_OPERATIONS?  In particular, if two components of a
compound are marked related, does the second get skipped if the first fails?
Further, are the two ops then essentially done atomically?

If this is the case, then for FALLOC_FL_ZERO_RANGE, just compounding the
FSCTL_SET_ZERO_DATA with the SET-EOF would reduce or eliminate the race
window.

For FALLOC_FL_COLLAPSE/INSERT_RANGE, we could compound the COPYCHUNK and
SET-EOF.  As long as the SET-EOF won't happen if the COPYCHUNK fails, this
will reduce the race.

However, for COLLAPSE/INSERT, we can go further: recognise the { COPYCHUNK,
SET-EOF } compound on the server and see if the file positions, chunk length,
current EOF and future EOF are consistent with a collapse/insert request and,
if so, convert the pair of them to a single fallocate() call and try that; if
that fails, fall back to copy_file_range() and ftruncate().
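
The consistency check might look something like the sketch below.  All the
names are hypothetical and this is not Samba or ksmbd code.  It assumes the
caller has already established that the COPYCHUNK source and destination are
the same file and that the chunks form one contiguous span, and it leaves the
filesystem's alignment requirements to the fallocate() call itself.

```c
#include <stdint.h>

enum range_op { RANGE_OP_NONE, RANGE_OP_COLLAPSE, RANGE_OP_INSERT };

/* Hypothetical result: the fallocate() request the pair is equivalent to. */
struct range_req {
	enum range_op op;
	uint64_t off;
	uint64_t len;
};

/*
 * Decide whether a same-file copy of copy_len bytes from src_off to dst_off,
 * followed by setting the EOF to new_eof, matches a collapse or an insert on
 * a file whose size is currently cur_eof.
 */
static struct range_req classify_copy_seteof(uint64_t src_off, uint64_t dst_off,
					     uint64_t copy_len,
					     uint64_t cur_eof, uint64_t new_eof)
{
	struct range_req none = { RANGE_OP_NONE, 0, 0 };
	struct range_req r = none;

	/* The copy must move the whole tail, i.e. run right up to the EOF. */
	if (copy_len == 0 || src_off >= cur_eof || copy_len != cur_eof - src_off)
		return none;

	if (dst_off < src_off) {
		/* Tail moves down: the file must shrink by the same amount. */
		r.len = src_off - dst_off;
		if (new_eof >= cur_eof || cur_eof - new_eof != r.len)
			return none;
		r.op = RANGE_OP_COLLAPSE;
		r.off = dst_off;
	} else if (dst_off > src_off) {
		/* Tail moves up: the file must grow by the same amount. */
		r.len = dst_off - src_off;
		if (new_eof <= cur_eof || new_eof - cur_eof != r.len)
			return none;
		r.op = RANGE_OP_INSERT;
		r.off = src_off;
	}
	return r;
}
```

If this returns RANGE_OP_COLLAPSE or RANGE_OP_INSERT, the server could try
fallocate() with the matching flag at (off, len); RANGE_OP_NONE means the pair
should just be processed as an ordinary copy followed by a set-EOF.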


As an alternative, at least for removing the 3rd-party races, is it possible
to make sure we have an appropriate oplock around the two components in each
case?  It would mean potentially more trips to the server, but would remove
the window, I think.

David




More information about the samba-technical mailing list