[LSF/MM/BPF TOPIC] Enhancing Linux Copy Performance and Function and improving backup scenarios

Steve French smfrench at gmail.com
Sat Feb 1 19:54:46 UTC 2020

On Wed, Jan 29, 2020 at 7:54 PM Darrick J. Wong <darrick.wong at oracle.com> wrote:
> On Wed, Jan 22, 2020 at 05:13:53PM -0600, Steve French wrote:
> > As discussed last year:
> >
> > Current Linux copy tools have various problems compared to other
> > platforms - small I/O sizes (and most don't allow it to be
> > configured), lack of parallel I/O for multi-file copies, inability to
> > reduce metadata updates by setting file size first, lack of cross
> ...and yet weirdly we tell everyone on xfs not to do that or to use
> fallocate, so that delayed speculative allocation can do its thing.
> We also tell them not to create deep directory trees because xfs isn't
> ext4.

Delayed speculative allocation may help xfs, but changing the file size
thousands of times during a single file copy can be a disaster for
network and cluster file systems (due to the excessive cost it adds to
metadata sync time). So there are file systems where setting the file
size first can help.
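
To illustrate "setting the file size first", here is a minimal sketch,
not taken from any existing copy tool. The function name copy_presized
and the 8 MiB default chunk size are assumptions made for the example.
It sets the destination size once with ftruncate and then writes in
large chunks, so the writes themselves do not keep extending the file.
Whether this helps is file system dependent, as noted above for xfs.

```python
import os


def copy_presized(src_path, dst_path, chunk_size=8 * 1024 * 1024):
    """Copy src_path to dst_path, setting the final size before writing.

    copy_presized and chunk_size are hypothetical names for this sketch.
    Returns the number of bytes written.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        # Set the destination size once, up front, instead of letting
        # every write extend the file.
        os.ftruncate(dst.fileno(), size)
        copied = 0
        while True:
            buf = src.read(chunk_size)  # large reads, configurable
            if not buf:
                break
            dst.write(buf)
            copied += len(buf)
    return copied
```

A caller would pick chunk_size per target, for example a few MiB for a
network mount, and could skip the ftruncate on file systems that prefer
to see the file grow.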

> >  And copy tools rely less on
> > the kernel file system (vs. code in the user space tool) in Linux than
> > would be expected, in order to determine which optimizations to use.
> What kernel interfaces would we expect userspace to use to figure out
> the confusing mess of optimizations? :)

copy_file_range and clone_file_range are a good start ... few tools
use them ...
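
For reference, a minimal sketch of how a tool might try copy_file_range
first and fall back when the kernel or file system refuses it. The
function name kernel_copy and the set of errno values treated as "fall
back" are assumptions for illustration, not a description of any
existing tool. It relies on os.copy_file_range, which is only available
on Linux with Python 3.8 or later.

```python
import errno
import os
import shutil

# Errors treated here as "not supported for this pair of files".
# The exact set is an assumption; EPERM is included because some
# sandboxes reject unknown syscalls that way.
FALLBACK_ERRNOS = {errno.EXDEV, errno.ENOSYS, errno.EINVAL,
                   errno.EOPNOTSUPP, errno.EPERM}


def kernel_copy(src_path, dst_path):
    """Copy a file, preferring copy_file_range (hypothetical helper).

    Returns "copy_file_range" or "fallback" to show which path ran.
    """
    try:
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            remaining = os.fstat(src.fileno()).st_size
            while remaining > 0:
                # The kernel may copy fewer bytes than requested.
                n = os.copy_file_range(src.fileno(), dst.fileno(), remaining)
                if n == 0:
                    break  # source ended early
                remaining -= n
        return "copy_file_range"
    except AttributeError:
        pass  # os.copy_file_range missing (older Python or non-Linux)
    except OSError as exc:
        if exc.errno not in FALLBACK_ERRNOS:
            raise
    # Start over with the standard library copy, which reopens and
    # truncates the destination.
    shutil.copyfile(src_path, dst_path)
    return "fallback"
```

The point of going through copy_file_range is that the kernel file
system, rather than the tool, gets to choose the mechanism (for example
a server-side copy on a network file system).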

> There's a whole bunch of xfs ioctls like dioinfo and the like that we
> ought to push to statx too.  Is that an example of what you mean?

That is a good example. The next step is getting tools to use these,
even if there are some file system dependent cases.

> > But some progress has been made since last year's summit, with new
> > copy tools being released and improvements to some of the kernel file
> > systems, and also some additional feedback on lwn and on the mailing
> > lists.  In addition these discussions have prompted additional
> > feedback on how to improve file backup/restore scenarios (e.g. to
> > mounts to the cloud from local Linux systems) which require preserving
> > more timestamps, ACLs and metadata, and preserving them efficiently.
> I suppose it would be useful to think a little more about cross-device
> fs copies considering that the "devices" can be VM block devs backed by
> files on a filesystem that supports reflink.  I have no idea how you
> manage that sanely though.

I trust XFS and BTRFS and SMB3 and cluster fs etc. to solve this better
than the block level (better locking, leases/delegation, state management, etc.)


