The tool cache manifesto
September 4, 2024

In my career I’ve used many tools which keep local file caches of one kind or another: pip, Maven, npm, clj, go, staticcheck, and dozens of others. I’ve also written a few myself.

Besides running them on my local computers, I’ve also maintained automated test suites and CI systems that use these tools.

These experiences have led me to a set of ideas about what makes a tool cache good or bad.

Tool caches must be correct

The most important job of a tool cache is to be correct. Correctness means that the user cannot tell that the tool is using a cache, other than that the tool is faster.

Here are some specific behaviors of correct tool caches:

Here are some smells that may indicate a poor tool cache:

Caching is a privilege, not a right

If a tool’s cache behavior is not correct, it’s usually best to disable the cache. Better to be slow and correct than fast and broken.

Disabling the cache can mean running the tool in a no-cache mode or wiping the cache before each invocation.
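As a rough illustration of the second option, here is a minimal Python sketch of a wrapper that wipes a cache directory before every run. The function name `run_without_cache` and the idea that the tool keeps its whole cache under one known directory are assumptions for the example, not features of any particular tool.

```python
import shutil
import subprocess


def run_without_cache(cmd, cache_dir):
    """Delete cache_dir, then run cmd and return its exit code.

    Hypothetical helper: it assumes the tool's entire cache lives under
    cache_dir, so removing that directory forces a cold run every time.
    """
    # ignore_errors=True makes a missing cache directory a no-op.
    shutil.rmtree(cache_dir, ignore_errors=True)
    return subprocess.run(cmd).returncode
```

When a tool offers a documented no-cache mode, that is usually the simpler choice; wiping the directory is the fallback when it doesn't.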

Occasionally, when you do this, you’ll discover that a tool cache is not just buggy but also unnecessary, because the cache hardly makes the tool faster at all.

Cache location

By default, tools should locate their caches according to the system’s conventions. For example:

Dumping cache files in /tmp or ~/.bespoke may have been fine in 1995, but it’s not acceptable anymore.

It should be easy for the user to override a cache location, typically via an environment variable. This should be clearly and prominently documented.
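To make this concrete, here is a minimal Python sketch of how a tool might pick its cache directory. It assumes the usual per-user conventions: the XDG cache directory on Linux and other Unix-likes, ~/Library/Caches on macOS, and %LOCALAPPDATA% on Windows. The override variable name (the tool name uppercased plus `_CACHE_DIR`) and both function names are made up for the example.

```python
import os
import sys
from pathlib import Path


def default_cache_root(env=None, platform=None):
    """Return the per-user cache root for the given platform."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = Path(env.get("HOME") or Path.home())
    if platform == "darwin":
        return home / "Library" / "Caches"
    if platform == "win32":
        local = env.get("LOCALAPPDATA")
        return Path(local) if local else home / "AppData" / "Local"
    # Linux and other Unix-likes: follow the XDG Base Directory convention.
    # A relative XDG_CACHE_HOME is treated as invalid and ignored.
    xdg = env.get("XDG_CACHE_HOME")
    if xdg and os.path.isabs(xdg):
        return Path(xdg)
    return home / ".cache"


def cache_dir(tool, env=None, platform=None):
    """Return the cache directory for `tool`, honoring a user override."""
    env = os.environ if env is None else env
    # Hypothetical override variable, e.g. MYTOOL_CACHE_DIR for "mytool".
    override = env.get(f"{tool.upper()}_CACHE_DIR")
    if override:
        return Path(override)
    return default_cache_root(env, platform) / tool
```

Passing `env` and `platform` explicitly is only there to make the function easy to test; a real tool would normally call `cache_dir("mytool")` and let the defaults apply.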

Dependencies

If your tool has a code dependency that writes a file cache, you own that cache.

If a dependency’s caching behavior is broken, your tool needs to shield the user from it. For example, the tool might disable the cache, it might selectively delete the cache when it’s in a bad state that confuses the dependency, or it might add concurrency control if the dependency doesn’t properly handle concurrent cache use.
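For the concurrency case, one simple approach is to serialize every call into the dependency behind a lock file. The sketch below is a minimal Python version under stated assumptions: the name `cache_lock` is made up, the lock is purely advisory (it only works if every process that touches the cache uses it), and it makes no attempt to recover from a stale lock left behind by a crashed process.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def cache_lock(lock_path, timeout=30.0, poll=0.05):
    """Hold an exclusive cross-process lock for the duration of the block.

    Hypothetical helper. O_CREAT | O_EXCL makes file creation atomic, so
    only one process at a time can create the lock file.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            # Someone else holds the lock; retry until the deadline passes.
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        # Release: close the descriptor first, then remove the lock file.
        os.close(fd)
        os.unlink(lock_path)
```

A tool would wrap each call into the misbehaving dependency in `with cache_lock(path):`. A production version would also need a policy for stale locks, which this sketch leaves out.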

If a dependency places its cache in a weird location, your tool should override it.

If your tool has multiple caches, they should be consolidated in a single directory rather than scattered across the file system.
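When a dependency is something your tool invokes as a subprocess, relocating its cache is often a matter of setting the environment variable that dependency documents. The Python sketch below uses `GOCACHE` (the Go build cache) and `PIP_CACHE_DIR` (pip's cache) purely as examples; the function name and the subdirectory names are made up, and you would need to check each real dependency's documentation for the variable it honors.

```python
import os
from pathlib import Path


def consolidated_env(root, base_env=None):
    """Return a copy of the environment with dependency caches under root.

    Hypothetical helper. The variables set here are examples only; the
    input environment is copied, never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Use an absolute path; some tools reject a relative cache location.
    root = Path(os.path.abspath(root))
    env["GOCACHE"] = str(root / "go-build")
    env["PIP_CACHE_DIR"] = str(root / "pip")
    return env
```

The result can be passed as the `env` argument to `subprocess.run`, so that everything the tool spawns writes beneath the tool's own cache directory.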

The best cache is no cache

Modern hardware can read files and process data very quickly. A tool can do a huge amount of work in 5 seconds (or even 50 ms).

Good tools are fast. When a good tool uses a file cache for performance reasons, it’s because the problem at hand is fundamentally expensive enough that a cache makes a real difference after the work has been well-optimized.

Adding a cache to a slow, inefficient tool can result in a tool which is still slow some of the time but also has new correctness issues. If a tool’s authors haven’t delivered respectable performance, I’m skeptical that they can implement a solid file cache.

Doing tool caches right

I’d love to point to an article which explains how best to implement file caches for tools. Unfortunately, I don’t think it exists.

Each use case has its own particular needs; doing this task well requires thinking through those needs and coming up with a design that plays nicely with the underlying capabilities of the OSes you are targeting.

I can recommend studying the Go tool’s cache; it works extremely well.

Apenwarr’s blog post “mtime comparison considered harmful” is well worth a read. It’s focused on make-like tools, but many of the pitfalls he points out also await implementers of file caches.
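One alternative to comparing timestamps, and not necessarily the one that post recommends, is to key cache entries on a hash of everything that can affect the output. The Python sketch below is a minimal version of that idea. The function name `cache_key` and the choice of inputs (tool version, arguments, file paths, and file contents) are assumptions for illustration; a real tool has to work out its own complete list of inputs, and hashing large files has a cost of its own.

```python
import hashlib
import os


def cache_key(tool_version, args, input_paths):
    """Return a hex digest that changes whenever any listed input changes.

    Hypothetical helper: anything that influences the tool's output but
    is not fed in here will lead to stale cache hits.
    """
    h = hashlib.sha256()

    def add(data):
        # Length-prefix each field so adjacent fields cannot run together.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)

    add(tool_version.encode())
    add(str(len(args)).encode())
    for arg in args:
        add(arg.encode())
    # Sort so the key does not depend on the order paths were listed in.
    for path in sorted(input_paths):
        add(os.fsencode(path))
        with open(path, "rb") as f:
            add(f.read())
    return h.hexdigest()
```

Because the key depends on file contents rather than on modification times, touching a file without changing it leaves the key alone, while any edit produces a different key.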