[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64

Jorrit Jongma jorrit.jongma+rsync at gmail.com
Mon May 18 20:23:18 UTC 2020

That is the goal, a drop-in optimization.

I don't know if xxhash has the required properties to be able to
replace the rolling checksum (bytes need to be able to easily be
shifted on/off at boths ends, see match.c).

However, as there's also talk of replacing the MD5 checksum with
xxhash (again, the rolling checksum isn't the MD5 checksum, they're
two different checksums used at different times for different
reasons), and that would lead to a much larger performance benefit
than replacing the rolling checksum, I suggest we keep xxhash over
there. xxhash isn't a cryptographic checksum, and if we replace MD5
with xxhash we still have two different checksums being verified when
blocks are replaced, which should increase their total strength
(decrease the odds of a collision).

You could argue xxhash is fast enough to replace both checksums even
if the rolling checksum has to be recalculated for every shifted byte,
but that would require extensive testing and some study into how to
refactor the double checksum into a single one while maintaining
backward compatibility (could be easy, could be hard).

> Still. You claim in your patch that
> | Benchmarks                   C           SSE2        SSSE3
> | - Intel i7-7700hq            1850 MB/s   2550 MB/s   4050 MB/s
> while xxhash [0] claims on a Core i5-3340M @2.7GHz that:
> |Version        Speed on 64-bit Speed on 32-bit
> |XXH64          13.8 GB/s       1.9 GB/s
> so using xxhash64 for that work would also boost !x86 platforms.
> However your patch has the benefit that no changes are required on the
> remote side. I like that.
> [0] https://github.com/Cyan4973/xxHash#benchmarks

More information about the rsync mailing list