[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64

Jorrit Jongma jorrit.jongma+rsync at gmail.com
Mon May 18 19:55:13 UTC 2020


What do you base this on?

Per https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html :

"For the x86-32 compiler, you must use -march=cpu-type, -msse or
-msse2 switches to enable SSE extensions and make this option
effective. For the x86-64 compiler, these extensions are enabled by
default."

That reads to me like we're fine for SSE2. As stated in my comments,
SSSE3 support must be manually enabled at build time. Your comment
would imply that SSSE3 is enabled out of the box on builds on machines
that support it, but this is not the case (it certainly isn't on my
Ubuntu box). Detecting this at runtime would be preferable, but getting
that to work on GCC is (apparently) a mess, and would probably require
modifications to configure/Makefile/etc. that I'm not comfortable
making, as my lack of expertise with those would probably lead me to
break the build for somebody else. If someone knowledgeable enough in
that area wants to fix it, though...
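
For whoever picks that up, here is a rough sketch of what runtime
dispatch might look like using GCC's __builtin_cpu_supports() and the
target function attribute. It is not part of the patch. The names
checksum_generic, checksum_ssse3, have_ssse3 and pick_checksum are
hypothetical, the scalar loop is a simplified stand-in rather than
rsync's actual get_checksum1(), and the SSSE3 body is only a
placeholder.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified scalar rolling checksum; a stand-in, not rsync's real code. */
static uint32_t checksum_generic(const signed char *buf, size_t len)
{
    uint32_t s1 = 0, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 += buf[i];   /* running sum of bytes */
        s2 += s1;       /* running sum of the running sums */
    }
    return (s1 & 0xffff) + (s2 << 16);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* The target attribute lets this one function be compiled for SSSE3
 * without passing -mssse3 for the whole file. Placeholder body: a real
 * version would use intrinsics from <tmmintrin.h> here. */
__attribute__((target("ssse3")))
static uint32_t checksum_ssse3(const signed char *buf, size_t len)
{
    return checksum_generic(buf, len);
}

/* Ask the CPU at runtime whether SSSE3 is available. */
static int have_ssse3(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("ssse3") != 0;
}
#endif

typedef uint32_t (*checksum_fn)(const signed char *, size_t);

/* Pick an implementation once; callers keep the returned pointer. */
static checksum_fn pick_checksum(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (have_ssse3())
        return checksum_ssse3;
#endif
    return checksum_generic;
}
```

Whether this avoids the configure/Makefile changes entirely I can't
say for sure; older compilers may not support the attribute or the
built-in, so some kind of configure check would probably still be
needed.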

The only reason there's an SSE2 backport in the first place (you'll
find SSSE3 support on most CPUs up to nearly a decade old) is that, by
my understanding, SSE2 is supported on all x86-64 CPUs out of the box.
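
One way to check what a given build actually enables is to look at the
predefined macros __SSE2__ and __SSSE3__, which GCC sets according to
the active -m flags. A minimal sketch, with compiled_simd_level being a
hypothetical helper name:

```c
/* Report which SIMD level this translation unit was compiled for,
 * based on the compiler's predefined macros. */
static const char *compiled_simd_level(void)
{
#if defined(__SSSE3__)
    return "SSSE3";   /* built with -mssse3 or a -march that implies it */
#elif defined(__SSE2__)
    return "SSE2";    /* expected by default on x86-64 per the GCC docs */
#else
    return "none";
#endif
}
```

If my reading of the GCC documentation is right, a plain x86-64 build
should report "SSE2" here, and "SSSE3" only when that has been enabled
explicitly at build time.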

> You can't replace the code like that with SSE2+. You need runtime
> detection for this. Otherwise it can't be enabled by distros because it
> would fail on CPUs without SSE2+. Only SSE is part of generic x86-64.


