[Bug 5124] Parallelize the rsync run using multiple threads and/or connections

samba-bugs at samba.org
Thu Feb 7 02:24:32 UTC 2019


https://bugzilla.samba.org/show_bug.cgi?id=5124

--- Comment #8 from Michael <michael.williams at infatech.co.nz> ---
+1 from me on this.

We have several situations where we need to copy a large number of very small
files, and I expect that multiple file transfer threads, allowing say ~5
concurrent transfers, would speed up the process considerably. I expect this
would also make better use of the available network bandwidth: each transfer
appears to carry an overhead for starting and completing, which makes the
effective transfer rate far lower than the available bandwidth. One of our
pieces of backup software uses this method to speed up backups, and FileZilla
implements it for file transfers.

Consider a very large file that needs to be transferred along with a number of
small files. In single-transfer mode, all the other files have to wait while
the large file is transferred. With multiple concurrent transfers, the smaller
files continue transferring while the large file transfers. I have seen the
benefits of this sort of implementation in other software.
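
To make the idea concrete, here is a minimal sketch of concurrent transfers.
It is not rsync code and not how rsync works today; it uses plain local copies
in Python as a stand-in for network transfers. The names copy_one and
copy_tree_concurrently are hypothetical, and the default of 5 workers simply
mirrors the "~5 transfers" figure above.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_one(src_file: Path, src_root: Path, dst_root: Path) -> Path:
    """Copy a single file, recreating its relative path under dst_root."""
    target = dst_root / src_file.relative_to(src_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src_file, target)
    return target


def copy_tree_concurrently(src_root, dst_root, workers=5):
    """Copy every file under src_root using up to `workers` threads.

    Hypothetical helper for illustration only: a local copy stands in
    for a network transfer.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    files = [p for p in src_root.rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # A large file occupies one worker while the remaining workers
        # keep draining the small files.
        return list(pool.map(lambda f: copy_one(f, src_root, dst_root), files))
```

With this shape, one very large file ties up a single worker and the small
files keep moving through the others, which is the behaviour described above.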

I can also see benefits in having file transfers begin while rsync is still
comparing files. This could work if rsync builds a 'list' of files to be
transferred and begins transferring as soon as the list starts to be populated.
Where there are a large number of files and few of them have changed, the sync
could effectively be complete by the time rsync finishes comparing files, since
the few changed files may already have been transferred during the comparison.
This is also effectively implemented in FileZilla (consider copying a
directory, where FileZilla has to recurse into each directory and add each file
to the copy queue).
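
The "start transferring while still comparing" idea can be sketched as a
producer/consumer queue. Again this is only an illustration under my own
assumptions, not rsync's implementation: the comparison is a plain byte-for-byte
check, the transfer is a local copy, and scan_changed, transfer and
sync_pipelined are hypothetical names.

```python
import filecmp
import queue
import shutil
import threading
from pathlib import Path

_DONE = object()  # sentinel: the scan has finished


def scan_changed(src_root: Path, dst_root: Path, todo: queue.Queue) -> None:
    """Walk src_root and enqueue each file that is missing or different in dst_root."""
    for src_file in src_root.rglob("*"):
        if not src_file.is_file():
            continue
        target = dst_root / src_file.relative_to(src_root)
        if not target.exists() or not filecmp.cmp(src_file, target, shallow=False):
            # The transfer thread can pick this up right away,
            # while the scan carries on with the next file.
            todo.put((src_file, target))
    todo.put(_DONE)


def transfer(todo: queue.Queue, copied: list) -> None:
    """Copy queued files until the end-of-scan sentinel arrives."""
    while True:
        item = todo.get()
        if item is _DONE:
            break
        src_file, target = item
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, target)
        copied.append(target)


def sync_pipelined(src_root, dst_root):
    """Run the comparison and the transfers at the same time."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    todo = queue.Queue()
    copied = []
    worker = threading.Thread(target=transfer, args=(todo, copied))
    worker.start()                           # transfers begin before the scan ends
    scan_changed(src_root, dst_root, todo)   # scan runs in the calling thread
    worker.join()
    return copied
```

When only a few files differ, the transfer thread has usually emptied the queue
by the time the scan reaches its last file, so the sync finishes almost as soon
as the comparison does.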

Interestingly, I assumed this was already an option in rsync, so I went
looking for it. However, all I found were the previously mentioned hacks,
which weren't what I was after.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


