[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
Sebastian Andrzej Siewior
rsync at ml.breakpoint.cc
Mon May 18 21:41:44 UTC 2020
On 2020-05-18 21:55:13 [+0200], Jorrit Jongma wrote:
> What do you base this on?
So my memory was wrong. SSE2 is supported by all x86-64 CPUs. Sorry
> would imply that SSSE3 is enabled out of the box on builds on machines
> that support it, this is not the case (it certainly isn't on my Ubuntu
> box). It would be preferred to detect this at runtime but getting that
> to work on GCC is (apparently) a mess, and would probably require
> modifications to configure/Makefile/etc that I'm not comfortable
> doing, as my lack of expertise on those would probably lead me to
> break the build for somebody else. If someone knowledgeable enough in
> that area wants to fix it, though...
My suggestion would be to have a get_checksum1_sse2() and
get_checksum1_sse3() and always build them. The compiler should support
it. Then at runtime you would check for SSSE3 and, based on the result,
get_checksum1() would invoke either the _sse2() or the _sse3() variant.
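
A rough sketch of that dispatch, assuming GCC (or clang) on x86-64. The
per-function target attribute lets both variants be built without
passing -mssse3 for the whole file, and __builtin_cpu_supports() does
the runtime check. The names checksum_scalar(), checksum1_fn and
pick_checksum1() are made up for illustration, the signatures are
simplified, and the variant bodies are placeholders that just run a
plain scalar loop, not the real SIMD code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder arithmetic (hypothetical): a simple scalar rolling sum that
 * stands in for the real checksum body in both variants below. */
static uint32_t checksum_scalar(const signed char *buf, size_t len)
{
    uint32_t s1 = 0, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 += (uint32_t)buf[i];
        s2 += s1;
    }
    return (s1 & 0xffff) + (s2 << 16);
}

/* Always built; the target attribute enables SSE2 for this function only.
 * A real version would use SSE2 intrinsics here. */
__attribute__((target("sse2")))
static uint32_t get_checksum1_sse2(const signed char *buf, size_t len)
{
    return checksum_scalar(buf, len);
}

/* Always built; the target attribute enables SSSE3 for this function only,
 * so the rest of the file does not need -mssse3. */
__attribute__((target("ssse3")))
static uint32_t get_checksum1_sse3(const signed char *buf, size_t len)
{
    return checksum_scalar(buf, len);
}

typedef uint32_t (*checksum1_fn)(const signed char *, size_t);

/* Ask the CPU once which variant it can run. */
static checksum1_fn pick_checksum1(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("ssse3") ? get_checksum1_sse3
                                           : get_checksum1_sse2;
}

/* Public entry point: resolves the variant on first call, then reuses it. */
uint32_t get_checksum1(const signed char *buf, size_t len)
{
    static checksum1_fn impl;
    if (!impl)
        impl = pick_checksum1();
    return impl(buf, len);
}
```

Callers keep using get_checksum1() and never see which variant ran. The
lazy initialisation above is not synchronised; that is harmless here
since every thread would store the same pointer, but resolving it once
at startup would be cleaner.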
Without auto detection it won't be utilized by distros. But yes, this
could be improved afterwards.