[ccache] [PATCH] Speed up "copy4", "copy64" on little-endian systems.
ams at codesourcery.com
Wed Nov 21 11:00:04 MST 2012
On 21/11/12 17:46, Andrew Stubbs wrote:
> The copy64 function implements an endian-safe copy routine for
> an array of 16 32-bit integers, but this is sub-optimal on machines
> where the byte-order is already correct. Likewise for copy4.
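
For readers without the source to hand, the idea in the quoted text can be sketched roughly as below. This is an illustration only, not the patch itself: the names copy64_portable, copy64_fast and copy4_portable are hypothetical, and the compile-time check assumes the GCC/Clang predefined macro __BYTE_ORDER__ (a configure-time test would serve the same purpose).

```c
#include <stdint.h>
#include <string.h>

/* Endian-safe: build each of the 16 words from 4 input bytes,
 * least significant byte first. Works on any host byte order. */
static void copy64_portable(uint32_t *M, const unsigned char *in)
{
	int i;
	for (i = 0; i < 16; i++) {
		M[i] = ((uint32_t)in[i*4 + 3] << 24) |
		       ((uint32_t)in[i*4 + 2] << 16) |
		       ((uint32_t)in[i*4 + 1] << 8) |
		       ((uint32_t)in[i*4 + 0]);
	}
}

/* On a little-endian host the bytes are already in the right order,
 * so a plain 64-byte copy gives the same result.
 * ASSUMPTION: __BYTE_ORDER__ is provided by the compiler. */
static void copy64_fast(uint32_t *M, const unsigned char *in)
{
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
	memcpy(M, in, 16 * sizeof(uint32_t));
#else
	copy64_portable(M, in);
#endif
}

/* The reverse direction for a single word: store x as 4 bytes,
 * least significant byte first. The same memcpy shortcut applies. */
static void copy4_portable(unsigned char *out, uint32_t x)
{
	out[0] = x & 0xFF;
	out[1] = (x >> 8) & 0xFF;
	out[2] = (x >> 16) & 0xFF;
	out[3] = (x >> 24) & 0xFF;
}
```

Using memcpy rather than a pointer cast keeps the fast path safe when the input buffer is not 4-byte aligned.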
I should add, I also had a look at "mdfour64", the other function that
shows up at the top of a profile's hot-functions list, but I found
nothing obviously inefficient about that implementation.
Given that the algorithm is completely linear (with every line depending
on the result of the previous), and that everything is completely
inlined already, I don't see that fiddling with the C code could do
anything that the compiler's transformations don't already do.
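
To illustrate what "every line depending on the result of the previous" means here, the following is a minimal sketch of the first four steps of MD4 round 1, written from the algorithm's public description rather than copied from ccache. The names rotl32, md4_F and md4_round1_first4 are hypothetical.

```c
#include <stdint.h>

/* Rotate left by n bits; n must be in 1..31. */
static uint32_t rotl32(uint32_t x, int n)
{
	return (x << n) | (x >> (32 - n));
}

/* MD4 round 1 auxiliary function. */
static uint32_t md4_F(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (~x & z);
}

/* First four steps of round 1. s[] holds the state as {a, b, c, d},
 * X[] holds the message words. Each step reads the value written by
 * the step before it, so the steps form one serial dependency chain
 * and cannot be reordered or run in parallel. */
static void md4_round1_first4(uint32_t s[4], const uint32_t X[4])
{
	s[0] = rotl32(s[0] + md4_F(s[1], s[2], s[3]) + X[0], 3);
	s[3] = rotl32(s[3] + md4_F(s[0], s[1], s[2]) + X[1], 7);  /* needs new a */
	s[2] = rotl32(s[2] + md4_F(s[3], s[0], s[1]) + X[2], 11); /* needs new d */
	s[1] = rotl32(s[1] + md4_F(s[2], s[3], s[0]) + X[3], 19); /* needs new c */
}
```

With the helpers inlined, a compiler is left with a straight line of adds, logical operations and rotates, which is about as tight as this chain gets.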