[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
ben.rubson at gmx.com
Mon May 18 16:20:07 UTC 2020
Thank you Jorrit for your detailed answer.
> On 18 May 2020, at 17:58, Jorrit Jongma via rsync <rsync at lists.samba.org> wrote:
> Well, don't get too excited, get_checksum1() (the function optimized
> here) is not the great performance limiter in this case, it's
> get_checksum2() and sum_update(), which will be using MD5.
Certainly, all the other functions using MD5 could then be updated to use your SSE-optimized function,
so that we have full SSE MD5 support wherever rsync uses it (basis file checksum, rolling checksum, etc.).
I think one nice performance improvement would be at the point where the receiver checksums the (big/huge) basis file, because the sender is simply waiting during that time...
> Unfortunately, single stream MD5 cannot be effectively optimized with
> SSE, at least I've not seen an SSE version faster than pure C
I was about to tell you that we successfully implemented it in FreeBSD a few years ago, but that was CRC32, not MD5...
At least it sounds like the algorithm's author / inspiration, Mark Adler, is the same :)
Anyway, this is an interesting first step towards SSE MD5 support.
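
For readers following the thread, here is a minimal scalar sketch of the kind of Adler-style weak checksum that get_checksum1() is understood to compute: a plain sum of the bytes (s1) and a sum of the running sums (s2), packed into 32 bits. This is not rsync's source. The name weak_checksum() is hypothetical, bytes are treated as signed by assumption, and any constant offset or loop unrolling in the real code is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not rsync's actual get_checksum1().
 * s1 accumulates the bytes, s2 accumulates the running values of s1.
 * Bytes are treated as signed chars (an assumption of this sketch). */
uint32_t weak_checksum(const signed char *buf, size_t len)
{
    uint32_t s1 = 0, s2 = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        s1 += (uint32_t)(int32_t)buf[i]; /* sign-extend, then wrap mod 2^32 */
        s2 += s1;
    }

    /* Low 16 bits hold s1, high 16 bits hold s2. */
    return (s1 & 0xffff) + (s2 << 16);
}
```

A loop like this only does additions on adjacent bytes, which is why it lends itself to SSE2/SSSE3, whereas each MD5 round depends on the result of the previous one.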