[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64

Sebastian Andrzej Siewior rsync at ml.breakpoint.cc
Mon May 18 21:41:44 UTC 2020


On 2020-05-18 21:55:13 [+0200], Jorrit Jongma wrote:
> What do you base this on?

So my memory was wrong. SSE2 is supported by all x86-64 CPUs. Sorry
for that.

> would imply that SSSE3 is enabled out of the box on builds on machines
> that support it, this is not the case (it certainly isn't on my Ubuntu
> box). It would be preferred to detect this at runtime but getting that
> to work on GCC is (apparently) a mess, and would probably require
> modifications to configure/Makefile/etc that I'm not comfortable
> doing, as my lack of expertise on those would probably lead me to
> break the build for somebody else. If someone knowledgeable enough in
> that area wants to fix it, though...

My suggestion would be to have a get_checksum1_sse2() and a
get_checksum1_ssse3() and always build both; the compiler should support
that. Then at runtime you would check for SSSE3 and, based on the result,
get_checksum1() would invoke either the _sse2() or the _ssse3() variant.
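A minimal sketch of that dispatch, assuming GCC or Clang on x86-64. The
target attribute and the __builtin_cpu_init()/__builtin_cpu_supports()
builtins are compiler features, not something the patch uses today. The
signatures are simplified for illustration and do not match rsync's
get_checksum1() exactly, the helper checksum1_scalar() is hypothetical, and
the second variant is named get_checksum1_ssse3() because the patch targets
SSSE3. Both variant bodies are plain C placeholders where the intrinsics
code would go:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in body: plain C rolling sum (CHAR_OFFSET taken as 0). */
static uint32_t checksum1_scalar(const signed char *buf, size_t len)
{
    uint32_t s1 = 0, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 += (uint32_t)buf[i];
        s2 += s1;
    }
    return (s1 & 0xffff) + (s2 << 16);
}

/* Always built; the attribute enables SSE2 for this function only. */
__attribute__((target("sse2")))
static uint32_t get_checksum1_sse2(const signed char *buf, size_t len)
{
    /* SSE2 intrinsics would go here. */
    return checksum1_scalar(buf, len);
}

/* Always built; the attribute enables SSSE3 for this function only,
 * so no global -mssse3 flag is needed. */
__attribute__((target("ssse3")))
static uint32_t get_checksum1_ssse3(const signed char *buf, size_t len)
{
    /* SSSE3 intrinsics would go here. */
    return checksum1_scalar(buf, len);
}

/* Public entry point: picks a variant on the first call. */
uint32_t get_checksum1(const signed char *buf, size_t len)
{
    static uint32_t (*impl)(const signed char *, size_t);

    if (!impl) {
        __builtin_cpu_init();
        impl = __builtin_cpu_supports("ssse3") ? get_checksum1_ssse3
                                               : get_checksum1_sse2;
    }
    return impl(buf, len);
}
```

Callers keep calling get_checksum1() as before. The lazy initialisation
above is not synchronised, so a threaded caller would need to resolve the
pointer once at startup instead.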

Without auto-detection the SSSE3 version won't be used by distros, since
their packages are built for baseline x86-64. But yes, this could be
improved afterwards.

Sebastian


