[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64

Sebastian Andrzej Siewior rsync at ml.breakpoint.cc
Mon May 18 19:15:43 UTC 2020


On 2020-05-18 17:06:51 [+0200], Jorrit Jongma via rsync wrote:
> diff --git a/checksum.c b/checksum.c
> index cd234038..4e696f3d 100644
> --- a/checksum.c
> +++ b/checksum.c
> @@ -99,6 +99,7 @@ int canonical_checksum(int csum_type)
>   return csum_type >= CSUM_MD4 ? 1 : 0;
>  }
> 
> +#ifndef __SSE2__  // see checksum_sse2.c for SSE2/SSSE3 version
>  /*
>    a simple 32 bit checksum that can be updated from either end
>    (inspired by Mark Adler's Adler-32 checksum)
> @@ -119,6 +120,7 @@ uint32 get_checksum1(char *buf1, int32 len)
>   }
>   return (s1 & 0xffff) + (s2 << 16);
>  }
> +#endif

You can't replace the code like that with SSE2+. You need runtime
detection for this. Otherwise it can't be enabled by distros because it
would fail on CPUs without SSSE3. Only SSE and SSE2 are part of generic x86-64.
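
A rough sketch of what runtime detection could look like, not taken from
the patch. It assumes GCC or Clang on x86, where __builtin_cpu_init() and
__builtin_cpu_supports() are available. The names get_checksum1_generic,
get_checksum1_ssse3 and pick_checksum1 are made up for illustration, the
generic loop is a simplified byte-at-a-time form of the checksum, and the
SSSE3 function is only a placeholder that forwards to it.

```c
#include <stdint.h>

/* Portable fallback: simplified byte-at-a-time rolling checksum. */
static uint32_t get_checksum1_generic(const char *buf1, int32_t len)
{
    const signed char *buf = (const signed char *)buf1;
    uint32_t s1 = 0, s2 = 0;

    for (int32_t i = 0; i < len; i++) {
        s1 += (uint32_t)buf[i];
        s2 += s1;
    }
    return (s1 & 0xffff) + (s2 << 16);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_CPU_DETECT 1

/* Hypothetical placeholder: a real build would put the SIMD code here,
 * compiled in its own file with the matching -m flags. */
static uint32_t get_checksum1_ssse3(const char *buf1, int32_t len)
{
    return get_checksum1_generic(buf1, len);
}
#endif

typedef uint32_t (*checksum1_fn)(const char *, int32_t);

/* Ask the CPU what it supports instead of trusting compile-time macros. */
static checksum1_fn pick_checksum1(void)
{
#ifdef HAVE_X86_CPU_DETECT
    __builtin_cpu_init();
    if (__builtin_cpu_supports("ssse3"))
        return get_checksum1_ssse3;
#endif
    return get_checksum1_generic;
}

/* Public entry point: resolves the implementation on first call.
 * Not thread-safe as written; call it once before starting threads. */
uint32_t get_checksum1(const char *buf, int32_t len)
{
    static checksum1_fn impl;

    if (!impl)
        impl = pick_checksum1();
    return impl(buf, len);
}
```

With something like this, one binary can carry both code paths and a
distro build still runs on CPUs that lack the newer instructions.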

Sebastian


