[ccache] why is limit_multiple ignored?
joel at rosdahl.net
Tue Jan 16 21:29:27 UTC 2018
On 7 January 2018 at 14:02, Scott Bennett wrote:
> The design problem is that there is no centralized index maintained of
> cache entries' paths, their sizes, and their timestamps, necessitating
> the plumbing of the directory trees. [...]
Thanks for sharing your ideas!
I fully agree that the cleanup algorithm/design hasn't aged well. It has
essentially stayed the same since Tridge created ccache in 2002, when
storage devices were much smaller and a cache of one GB or two probably
was considered quite large.
Trying to improve the cleanup algorithm/design has not been a priority
since I personally haven't seen such pathological behavior that you
describe ("cleanups that can take over a half hour to run and hammer a
hard drive mercilessly"). However, I'm not at all convinced that
introducing a centralized index is the panacea you describe.
Do you have a sketch design of how to maintain a centralized index? Here
are some requirements to consider for the design:
A. It should cope with a ccache process being killed at any time.
B. It should work reasonably well on flaky and/or slow file systems,
C. It should not introduce lock contention for reasonable use cases.
D. It should be quick for cache misses (not only for cleanup).
E. It should handle cleanup quickly and gracefully.
I'm guessing that you envision having one centralized lock for the
index. The tiny stats files already suffer from lock contention in some
scenarios because they are so few. That's why ideas like
https://github.com/ccache/ccache/issues/168 and comments like
(comment number 2) pop up. Even if a centralized index only needs a lock
for writing, it would still serialize writes to the cache. I have
trouble seeing how that would work out well. But I'll gladly be proved
For reference: When updating the stats files, the current method is to
acquire a lock, write the new content to a temporary file, rename the
temporary file to the target file and release the lock. Writing the full
content to a temporary file and renaming it to the target solves A and
B, and having 16 files instead of 1 improves on C. Having no index
trivially solves D. (And E is not solved well at all.)
> The lack of a centralized index can also result in cache evictions
> that are not actually LRU.
Not having true LRU eviction doesn't bother me at all. I think that it's
a very reasonable trade-off to have "approximate LRU eviction" if the
performance is better and/or the implementation is easier.
> Where does the hysteresis of (0.9-0.8)max_size=0.1*max_size come from?
When the cache has filled up at least once, the fill grade of one of the
16 subdirectories is a random variable between 0.8 and 1.0 with uniform
distribution, so the probability of the total size of the 16
subdirectories is approximately a normal distribution with 0.9 as the
mean. In other words, it's likely that the cache size is around 0.9 and
much less likely that it's near 0.8 or 1.0. For serial usage of ccache,
> What I've seen is that the cleanups are usually triggered by
> 0.8*max_size, and that does not change when I set limit_multiple =
As already explained, nothing is triggered at 0.8*max_size or even at
limit_multiple*max_size, so the reason for your 24 GB cache is something
else. And that something else is most likely that when several ccache
invocations trigger a cleanup of the same subdirectory at the same time,
the net effect will be removal of more than
(1-limit_multiple)*max_size/16, potentially much more. I bet that if you
run something like "du $CCACHE_DIR/$X" for each X in [0-9a-f], or just
count the number of files in each subdirectory, you'll see some
subdirectories that are much smaller than limit_multiple*max_size/16 but
some that are near max_size/16.
*** *** *** *** *** *** *** *** *** *** *** *** ***
During a couple of recent walks with my daughter in the stroller, I've
been thinking more about how to improve ccache's cleanup. I think that I
have come up with something that will be significantly better, but I
don't have time to describe any details right now. Stay tuned.
More information about the ccache