[Bug 3099] Please parallelize filesystem scan

samba-bugs at samba.org samba-bugs at samba.org
Fri Jul 17 09:01:04 UTC 2015


https://bugzilla.samba.org/show_bug.cgi?id=3099

--- Comment #7 from Rainer <rainer at voigt-home.net> ---
Hi,

I'm experiencing the very same problem: I'm trying to sync a set of VMware disk
files (about 2.5TB) with not too many changes, and direct copying is still
faster than checksumming by quite a large margin, because checksumming the
source and then the target one after the other doubles the time needed.

I think the point is that the GigE link between the PC and the NAS achieves
about 80MB/s, and the HDD read rate is not much higher (approx. 130MB/s). 

When doing the checksumming on source and target in parallel we could ideally
(if nothing changed) reach the read rate of the HDDs as 'transfer' bandwidth,
because this is the speed at which we can verify that the data is the same on
source and target. The current sequential approach reduces the initial check
to half the HDD read rate, so transferring unchanged files will only yield
about 65MB/s in my case, which is slower than simple copying.
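
To make the arithmetic above concrete, here is a rough sketch of the three
cases. It is only a back-of-the-envelope model, not anything taken from rsync
itself: the function names are made up for illustration, and it assumes both
HDDs read at about 130MB/s, the target's write speed is not the limit, the CPU
cost of checksumming is negligible, nothing has changed, and 2.5TB means
2.5e6 MB.

```python
def sequential_verify_rate(src_read, dst_read):
    """Effective MB/s when the source is checksummed first, then the target."""
    # Time per MB is 1/src_read + 1/dst_read; the rate is the inverse of that.
    return (src_read * dst_read) / (src_read + dst_read)


def parallel_verify_rate(src_read, dst_read):
    """Effective MB/s when both sides checksum at the same time."""
    # The slower disk determines when verification is finished.
    return min(src_read, dst_read)


def plain_copy_rate(src_read, link):
    """Effective MB/s for a direct copy (assumes target writes keep up)."""
    return min(src_read, link)


def hours(size_mb, rate_mb_s):
    """Wall-clock hours needed to cover size_mb at rate_mb_s."""
    return size_mb / rate_mb_s / 3600


if __name__ == "__main__":
    HDD = 130       # MB/s, approximate HDD read rate from this report
    LINK = 80       # MB/s, GigE throughput between PC and NAS
    SIZE_MB = 2.5e6  # about 2.5TB, assuming decimal units

    cases = [
        ("sequential checksum", sequential_verify_rate(HDD, HDD)),
        ("plain copy", plain_copy_rate(HDD, LINK)),
        ("parallel checksum", parallel_verify_rate(HDD, HDD)),
    ]
    for name, rate in cases:
        print(f"{name:<20} {rate:6.1f} MB/s  {hours(SIZE_MB, rate):5.1f} h")
```

Under these assumptions the full 2.5TB set would take roughly 10.7 hours with
sequential checksumming, 8.7 hours with a plain copy, and 5.3 hours if both
sides checksummed in parallel.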

Is the patch you proposed some years ago something I can apply to a current
rsync version and try out? If not, could you update it for the 3.1.x series so
I can benchmark the parallel checksumming in my situation?

Best Regards
Rainer



