[ccache] pathological direct mode behavior for files with many often changing include files
yiding at fb.com
Mon Apr 14 16:05:49 MDT 2014
When direct mode is enabled and a file has many dependencies that change,
`file_info` entries accumulate in its manifest over an extended period of
time. Since a manifest file is reaped only when the actual source file
changes (not when its dependent files change), this list can grow very large.
This has two effects:
1. Ccache execution becomes slower when writing. In a simple pathological
test, the time for a cache miss rose from 50ms to 150ms. That figure
includes the underlying compile time, so the slowdown attributable to
ccache itself is proportionally worse. The effect is minimal on cache hits.
2. Ccache loops forever while writing out the manifest once it holds more
than 65k `file_info` entries. In `write_manifest`, the loop counter is
16-bit while `n_file_infos` is 32-bit.
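The second effect can be illustrated in isolation. The following is a minimal sketch, not the code from `manifest.c`: the function name `counter16_loop_terminates` is hypothetical, and the early return stands in for the point where the real loop would wrap around and never finish.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a loop that walks n_file_infos entries with a
   16-bit counter. Returns true if the loop would finish, false if the
   counter would wrap to 0 before reaching the 32-bit bound. */
bool counter16_loop_terminates(uint32_t n_file_infos)
{
    for (uint16_t i = 0; i < n_file_infos; i++) {
        if (i == UINT16_MAX) {
            /* The next i++ wraps to 0, so i < n_file_infos stays true
               forever. Bail out here instead of looping. */
            return false;
        }
    }
    return true;
}
```

With 65535 entries the loop still finishes; with 65536 the counter can never reach the bound.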
This may be a somewhat rare occurrence in the field. We use ccache for
continuous integration, and this problem only shows up for a few files
(those with many recursively included frequently changing headers) after
tens of thousands of builds. Once it starts happening, however, subsequent
builds fail as the cache is effectively poisoned.
To reproduce:
1. Create 10000 header files with some random valid contents (such as
`extern int foo_RANDOMNUMBER;`)
2. Create a C file that references those header files, and does something
3. Compile with ccache in direct mode.
4. Goto 1.
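Steps 1 and 2 of the reproduction could be generated with a small helper along these lines. This is only a sketch: the function name `generate_inputs`, the file names `gen_hdr_N.h` and `many_includes.c`, and the trivial `main` in the generated source are all assumptions, not part of the original test.

```c
#include <stdio.h>
#include <stdlib.h>

/* Writes `count` headers <dir>/gen_hdr_<N>.h, each declaring
   `extern int foo_<random>;`, plus <dir>/many_includes.c that includes
   all of them. Returns 0 on success, -1 on an I/O error.
   All file names here are hypothetical. */
int generate_inputs(const char *dir, int count)
{
    char path[4096];

    snprintf(path, sizeof(path), "%s/many_includes.c", dir);
    FILE *src = fopen(path, "w");
    if (!src) {
        return -1;
    }

    for (int i = 0; i < count; i++) {
        snprintf(path, sizeof(path), "%s/gen_hdr_%d.h", dir, i);
        FILE *hdr = fopen(path, "w");
        if (!hdr) {
            fclose(src);
            return -1;
        }
        /* New random contents on every run, so each header "changes". */
        fprintf(hdr, "extern int foo_%d;\n", rand());
        fclose(hdr);
        fprintf(src, "#include \"gen_hdr_%d.h\"\n", i);
    }

    /* Assumed placeholder body for the generated source file. */
    fprintf(src, "int main(void) { return 0; }\n");
    return fclose(src) == 0 ? 0 : -1;
}
```

Calling `generate_inputs(".", 10000)` and then compiling `many_includes.c` through ccache, repeatedly, corresponds to the loop in steps 1 to 4.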
A simple patch is attached. It converts all the counters in `manifest.c` to
`uint32_t` from `uint16_t` and reaps a manifest file when the number of
`file_info` entries becomes larger than some threshold. I have not
experimented with this threshold, but 10000 should accommodate a sizable
number of file changes.
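The idea behind the patch can be sketched roughly as follows. This is not the attached patch itself: `struct manifest_sketch`, `should_reap`, `count_entries`, and the macro name are hypothetical, and only the `n_file_infos` field name and the value 10000 come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold mentioned above; it has not been tuned. */
#define MAX_MANIFEST_FILE_INFO_ENTRIES 10000u

/* Hypothetical, stripped-down view of a manifest. */
struct manifest_sketch {
    uint32_t n_file_infos;
};

/* True when the manifest has grown past the threshold and should be
   discarded rather than extended. */
bool should_reap(const struct manifest_sketch *mf)
{
    return mf->n_file_infos > MAX_MANIFEST_FILE_INFO_ENTRIES;
}

/* Walks all entries with a counter as wide as n_file_infos, so the loop
   terminates for any entry count. Returns the number of entries visited. */
uint32_t count_entries(const struct manifest_sketch *mf)
{
    uint32_t visited = 0;
    for (uint32_t i = 0; i < mf->n_file_infos; i++) {
        visited++;
    }
    return visited;
}
```

The widened counter removes the infinite loop, while the threshold check bounds how large the list can grow in the first place.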
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 1958 bytes