[Bug 3099] Please parallelize filesystem scan

ray vantassle rayvantassle at gmail.com
Fri Jul 17 18:29:07 UTC 2015


Ken, this just happens to be a special case where your configuration has a
huge number of spindles.  If you have multiple threads reading the same
spindle you'll just be thrashing the heads back & forth.  If there is one
thread reading at the front of the disk and another thread reading at the
end of the disk, it will be *slower* than if you have just one thread
reading first the front of the disk and then the end of the disk.  Two
threads will just have the head whipping back and forth.
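
To put rough numbers on that, here is a toy model of the head movement. It
is only a sketch under loud assumptions: the block numbers are made up, seek
cost is taken to be proportional to distance travelled, and rotational
latency, readahead, caching and the kernel's I/O scheduler are all ignored.
The helper names (total_head_travel, interleave) are hypothetical and have
nothing to do with rsync's code.

```python
def total_head_travel(requests, start=0):
    """Sum the absolute head movement needed to service requests in order."""
    position = start
    travel = 0
    for block in requests:
        travel += abs(block - position)
        position = block
    return travel


def interleave(a, b):
    """Alternate requests from two streams, as two competing threads might."""
    merged = []
    for x, y in zip(a, b):
        merged.extend([x, y])
    return merged


# Hypothetical layout: 100 blocks at the front of the disk, 100 near the end.
front = list(range(0, 100))
back = list(range(900_000, 900_100))

# One thread: read the front region, then the back region (one long seek).
one_thread = total_head_travel(front + back)

# Two threads: requests alternate between regions (a long seek every time).
two_threads = total_head_travel(interleave(front, back))

print("one thread :", one_thread)
print("two threads:", two_threads)
print("ratio      :", two_threads / one_thread)
```

In this model the single thread pays for one long seek, while the two
interleaved threads pay for a long seek on nearly every request. A real
drive and scheduler will soften that, but the direction of the effect is
the point.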

"one of my rsync jobs moving from a ZFS system ... has over 100 million
files"
Spread over how many spindles?

The problem is, the optimum way to access the disks depends on how the data
lies on the disks.  And that's something that a mere program cannot know.
Only the filesystem can know that information.  Whether it's ext4, md,
btrfs, zfs, or whatever -- a program like rsync cannot possibly know how
best to access the disk(s) and with how many simultaneous threads.