[ccache] Buffer size for IO operations is too small

Anders Björklund anders at itension.se
Sun Apr 3 20:59:29 UTC 2016

Anders Björklund wrote:
> Michael Kolomeytsev wrote:
>> I've discovered that the buffer size for IO in ccache is too small:
>> 16k or 10k
>> (in hash_fd, copy_fd, copy_file).
> But your observations are very interesting, and please post
> more if you have it. Would also be nice to have some follow-up
> on the observation about ccache problems with multiple cores:
> https://github.com/jrosdahl/ccache/issues/54 (also on OS X)
> I'm thinking that hash and copy could do with different macros...

Actually three macros, hash, compress/decompress and plain old copy.
Thought I'd move the "copy" case aside, away from the other buffers...

You'd think that copying a file would be a simple thing to do, right?
Actually, on some systems like Windows or Mac OS X it is. But on Linux:

Found this interesting blog post, that came with some benchmarks too:

So the first thing to do would be to make the I/O buffer size into a
whole multiple of the block size, that is: 16384 instead of 10240.
Avoids having to do partial page copies later. And then allocating the
buffer in kernel space instead of user space sounded like a good idea.

But having to look for various OS/kernel versions of sendfile()? Eww.
Might as well stick with splice(), since the other main systems
have solutions already: Win32 has CopyFile and OS X has copyfile.
And doing some "advise/allocate" sounded easy, but had pitfalls too.
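To illustrate the idea (this is a rough Linux-only sketch, not the patch
itself), a pipe+splice copy could look something like the code below. The
function name copy_fd_splice_sketch and the 65536 chunk size are made up,
and reading "advise/allocate" as posix_fadvise() and posix_fallocate() is
an assumption.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for splice() with glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy fd_in to fd_out through a pipe, so that the data stays in
 * kernel space. Returns 0 on success, -1 on error.
 * Note: fd_out must not have been opened with O_APPEND. */
int copy_fd_splice_sketch(int fd_in, int fd_out)
{
    struct stat st;
    int pipefd[2];
    int ret = -1;

    if (fstat(fd_in, &st) != 0)
        return -1;

    /* Assumed "advise/allocate" step: hints only, failures are ignored. */
    (void)posix_fadvise(fd_in, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (st.st_size > 0)
        (void)posix_fallocate(fd_out, 0, st.st_size);

    if (pipe(pipefd) != 0)
        return -1;

    for (;;) {
        /* Move up to one chunk from the input file into the pipe. */
        ssize_t n = splice(fd_in, NULL, pipefd[1], NULL, 65536,
                           SPLICE_F_MOVE | SPLICE_F_MORE);
        if (n == 0) {
            ret = 0;            /* end of input */
            break;
        }
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        /* Drain the pipe into the output file. */
        while (n > 0) {
            ssize_t w = splice(pipefd[0], NULL, fd_out, NULL, (size_t)n,
                               SPLICE_F_MOVE | SPLICE_F_MORE);
            if (w < 0 && errno == EINTR)
                continue;
            if (w <= 0)
                goto done;
            n -= w;
        }
    }
done:
    close(pipefd[0]);
    close(pipefd[1]);
    return ret;
}
```

A real version would probably want a read+write fallback for when splice()
is missing or returns EINVAL for the file system in question.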

Here is the end result, in case anyone is interested in a preview:

It sounded like a good idea, but needs some actual benchmarks to see
whether it was actually worth it. Probably should check st_blksize too.
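Checking st_blksize could be as simple as the sketch below, which rounds a
default buffer size up to a whole multiple of whatever fstat() reports. The
names DEFAULT_IO_BUFFER_SIZE, round_up_to_multiple and io_buffer_size_sketch
are hypothetical, and so is the choice of 16384 as the fallback.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical default, used when st_blksize is not available. */
#define DEFAULT_IO_BUFFER_SIZE 16384

/* Round size up to the next whole multiple of bs (bs must be > 0). */
size_t round_up_to_multiple(size_t size, size_t bs)
{
    return ((size + bs - 1) / bs) * bs;
}

/* Pick a buffer size for fd: the default, rounded up to a whole
 * multiple of the preferred I/O block size reported by fstat(). */
size_t io_buffer_size_sketch(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0 || st.st_blksize <= 0)
        return DEFAULT_IO_BUFFER_SIZE;
    return round_up_to_multiple(DEFAULT_IO_BUFFER_SIZE,
                                (size_t)st.st_blksize);
}
```

With a 4096-byte block size the old 10240 would round up to 12288, while
16384 is already a whole multiple and stays as it is.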

The actual I/O can probably be made twice as fast (e.g. for a 1M file).
The question is whether it makes any real impact on the ccache run time?

pipe+splice + advices + trunc   1175ns  1283ns  1290ns
read+write 4bs                  1537ns  2126ns  2210ns  (+ 30.8%)
read+write 10k                  2334ns  2356ns  2668ns  (+ 98.6%)
read+write bs                   2515ns  2692ns  4591ns  (+ 114.0%)

But 256K seemed like overkill (over 16K), at least for plain copy I/O.
Might still be some additional benefits when doing gzip or md4, though.


PS. We gave up on mmap already, for other reasons (high maintenance)
