[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64

Ben RUBSON ben.rubson at gmx.com
Mon May 18 16:20:07 UTC 2020


Thank you Jorrit for your detailed answer.

> On 18 May 2020, at 17:58, Jorrit Jongma via rsync <rsync at lists.samba.org> wrote:
> 
> Well, don't get too excited, get_checksum1() (the function optimized
> here) is not the great performance limiter in this case, it's
> get_checksum2() and sum_update(), which will be using MD5.

Surely all the other functions using MD5 could be updated to use SSE-optimized code as well,
so that we have full SSE support wherever rsync computes checksums (basis file checksum, rolling checksum, etc.).

I think one nice performance gain would come when the receiver checksums the (big/huge) basis file, because the sender is simply waiting during that time...

> Unfortunately, single stream MD5 cannot be effectively optimized with
> SSE, at least I've not seen an SSE version faster than pure C

I was about to tell you that we successfully implemented it in FreeBSD a few years ago, but that was CRC32, not MD5...
https://github.com/freebsd/freebsd/commit/c4b27423f57c30068aff3f234c912ae8d9ff1b6a
https://github.com/freebsd/freebsd/commit/5a798b035b4858923878c014a5faa48b2f9aa6e7
At least it sounds like the algorithm author / inspiration, Mark Adler, is the same :)
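
For reference, here is roughly what an Adler-32-style weak checksum of the kind get_checksum1() computes looks like in plain scalar C. This is only a minimal sketch and not rsync's actual code: the name weak_sum_sketch is made up, and it assumes signed bytes and a zero character offset. The two running sums are what a SIMD version would accumulate several bytes at a time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: a scalar sketch of an Adler-32-style weak checksum.
 * Assumptions: bytes are treated as signed, and no character offset is
 * added to each byte. */
uint32_t weak_sum_sketch(const char *data, size_t len)
{
    const signed char *buf = (const signed char *)data;
    uint32_t s1 = 0; /* running sum of the bytes */
    uint32_t s2 = 0; /* running sum of the successive s1 values */

    for (size_t i = 0; i < len; i++) {
        s1 += (uint32_t)buf[i];
        s2 += s1;
    }

    /* Pack the low 16 bits of s1 below the low 16 bits of s2. */
    return (s1 & 0xffff) + (s2 << 16);
}
```

For example, weak_sum_sketch("abc", 3) gives 0x024A0126 (s1 = 294, s2 = 586).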

Anyway, this is an interesting first step towards SSE-optimized checksums in rsync.

