checksum feature request

Bill Wichser bill at princeton.edu
Thu Oct 3 14:32:24 UTC 2019


Paul,

Thanks. I can certainly see your point.  I wasn't suggesting an all-out 
switch, just an option enabled with a flag.  We're doing a GPFS-to-GPFS 
transfer over a high-speed link, billions of files at the moment, so 
even a marginal increase in speed helps, which is why we were using MD4 
instead of MD5.

Because the code is well structured, we can easily maintain the patch 
ourselves.  The hope is that, once all the data is mirrored, we can take 
a smarter approach using the built-in inotify structure IBM has 
provided.  That will require a better understanding on our part before 
we can use it effectively.  For now we will continue to use the naive 
rsync approach with our modifications.

Thanks,
Bill

On 10/3/19 9:43 AM, Paul Slootman via rsync wrote:
> On Tue 01 Oct 2019, Bill Wichser via rsync wrote:
>>
>> Attached is the patch we applied.  Since xxhash is in the distro, a
>> dependency would be required for this RPM.  If nothing else, perhaps the
>> developers should just take a look as this could benefit many.
> 
> "The distro" is a bit vague for a tool like rsync that runs on many
> versions of Unix and Linux, and even Windows.
> 
> The problem is (AFAIK) that this would need a protocol version bump so
> that the checksum algorithm to be used can be decided upon by both ends
> of the transfer. It's not as simple as replacing the current
> algorithm: that would make it impossible to rsync to / from an older
> version of rsync.
> 
> It's an interesting idea, although I wonder how many users would
> actually profit from this. CPU is generally fast enough to handle what
> the IO subsystem can read for most people, I imagine.
> 
> Paul
> 
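
As a rough illustration of the negotiation Paul describes above, here is 
a minimal Python sketch of how two ends might agree on a checksum.  It is 
not rsync's actual protocol code: the function name, the algorithm 
labels, and the choice of "md5" as the fallback for a peer that 
advertises nothing are all assumptions made for the example.

```python
# Hypothetical sketch only: none of these names come from rsync itself.

# Assumed fallback for a peer too old to advertise any algorithms.
LEGACY_DEFAULT = "md5"


def negotiate_checksum(client_prefs, server_offer):
    """Pick the checksum algorithm both ends will use.

    client_prefs: the client's algorithms, most preferred first.
    server_offer: the algorithms the server advertised, or None if the
                  server is too old to advertise anything.

    Both ends must apply this same rule with the same ordering (here the
    client's), otherwise they could settle on different algorithms.
    """
    if server_offer is None:
        # Older peer: no negotiation is possible, so keep the old default.
        return LEGACY_DEFAULT
    offered = set(server_offer)
    for name in client_prefs:
        if name in offered:
            return name
    raise ValueError("no checksum algorithm in common")


if __name__ == "__main__":
    # Both ends know xxhash, so the client's first choice wins.
    print(negotiate_checksum(["xxhash", "md5", "md4"], ["md5", "xxhash"]))
    # The server advertises nothing, so fall back to the legacy default.
    print(negotiate_checksum(["xxhash", "md5", "md4"], None))
```

The fallback branch is what keeps transfers to and from an older version 
working, which is the compatibility concern raised in the quoted reply.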


