[ccache] pathological direct mode behavior for files with many often changing include files

Joel Rosdahl joel at rosdahl.net
Tue Apr 15 14:35:10 MDT 2014


> 2. ccache loops forever while writing out the manifest once it has more
> than 65k files. In `write_manifest`, the counter is 16-bit while
> `n_file_infos` is 32-bit.

Oh, that's an oversight in an early refactoring of the manifest format.
Thanks for finding and fixing the problem.

> Since the manifest file reaping is only based on the actual source
> file (not its dependent files) changing, this list can grow very large.

For the record, and just to clarify, a changed source file results in a
different manifest file, which is subject to the same cache cleanup logic
as other files in the cache. New object file entries are added to a
manifest file when the set (or content) of included header files for a
given source file (in combination with compiler options, etc.) has changed.
When the list of such entries becomes larger than 100, the manifest will be
emptied as a crude method of garbage collection.

Doing the same maneuver for a large number of file info entries, as you
suggest, sounds like a good idea. Patch applied.

-- Joel


On 15 April 2014 00:05, Yiding Jia <yiding at fb.com> wrote:

> When direct mode is enabled and a file has many dependencies that change,
> it will cause `file_info` entries to accumulate over an extended period of
> time. Since the manifest file reaping is only based on the actual source
> file (not its dependent files) changing, this list can grow very large.
>
> This has two effects:
>
> 1. ccache becomes slower when writing the manifest. In a simple
> pathological test, the time for a cache miss rose from 50 ms to 150 ms.
> That figure includes the underlying compile time, so the slowdown
> attributable to ccache itself is proportionally worse. The effect is
> minimal on cache hits.
> 2. ccache loops forever while writing out the manifest once it has more
> than 65k files. In `write_manifest`, the counter is 16-bit while
> `n_file_infos` is 32-bit.
>
> This may be a somewhat rare occurrence in the field. We use ccache for
> continuous integration, and this problem only shows up for a few files
> (those with many recursively included frequently changing headers) after
> tens of thousands of builds. Once it starts happening, however, subsequent
> builds fail as the cache is effectively poisoned.
>
>
>
> Repro steps:
>
> 1. Create 10000 header files with some random valid contents (such as
> `extern int foo_RANDOMNUMBER;`)
> 2. Create a C file that includes those header files and does something
> trivial.
> 3. Compile with ccache in direct mode.
> 4. Go to step 1.
>
>
> Simple patch attached, which converts all the counters in `manifest.c`
> from `uint16_t` to `uint32_t` and reaps a manifest file when the number of
> `file_info` entries becomes larger than some threshold. I have not
> experimented with this threshold, but 10000 should accommodate a sizable
> number of file changes.