How we cut build times in half without changing the stack
Every engineering team we know has a story about the build that got away. Ours started the same way most of them do: a pipeline that took eleven minutes when the repository was young, and forty-one minutes three years later. Nobody made a single decision that caused it. It was a hundred small ones, each reasonable at the time.
The obvious answer was a rewrite: new bundler, new CI provider, a fashionable caching layer. We resisted, mostly because rewrites have a habit of trading a known slow pipeline for an unknown broken one. Instead we set ourselves a constraint: cut the build time in half using only the tools we already had.
Start by measuring, not guessing
The first week produced nothing but graphs. We instrumented every stage of the pipeline and let it run for five days before touching anything. That patience paid for itself immediately: the stage everyone blamed, the test suite, accounted for barely a fifth of the wall-clock time. The real cost was hiding in dependency installation and a set of artefact uploads that ran serially for no reason anyone could remember.
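For readers who want to try the same kind of measurement, here is a minimal sketch in Python. It is not our actual instrumentation: the names `timed_stage`, `durations`, and `wall_clock_share` are hypothetical, and it assumes each pipeline stage can be wrapped in a block of Python. The idea is simply to record how long each stage takes and then express each stage as a fraction of the total.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory store: stage name -> list of durations in seconds.
durations: dict[str, list[float]] = {}


@contextmanager
def timed_stage(name: str):
    """Record the wall-clock duration of one pipeline stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the stage raised an exception.
        durations.setdefault(name, []).append(time.perf_counter() - start)


def wall_clock_share(totals: dict[str, float]) -> dict[str, float]:
    """Return each stage's fraction of the total wall-clock time."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: seconds / grand_total for name, seconds in totals.items()}
```

Wrapping each stage in `with timed_stage("install"):` and summing the recorded durations per stage gives the input for `wall_clock_share`. Collecting several days of runs before reading the shares, as we did, keeps a single unusual build from skewing the picture.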
“The stage everyone blamed accounted for barely a fifth of the wall-clock time. The real cost was hiding where nobody was looking.” Dana Okoro, on the first week of measurement
Once we could see the pipeline honestly, the fixes were almost boring. We cached the dependency install keyed on the lockfile, which removed nine minutes on the median run. We parallelised the artefact uploads, which removed six more. And we split the test suite by historical timing data rather than by directory, which balanced the shards and shaved the long tail off every run.
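Two of those fixes are easy to sketch in a few lines of Python. What follows is an illustration under stated assumptions, not our pipeline code. The function names `lockfile_cache_key` and `split_by_timing` are hypothetical, hashing the lockfile with SHA-256 is one reasonable way to key a cache on it, and the greedy "slowest file onto the lightest shard" rule is a common heuristic for balancing shards that we are using here as a stand-in for whatever splitting logic your CI tooling provides.

```python
import hashlib
import heapq
from pathlib import Path


def lockfile_cache_key(lockfile: Path, prefix: str = "deps") -> str:
    """Derive a cache key that changes only when the lockfile's bytes change."""
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    return f"{prefix}-{digest}"


def split_by_timing(timings: dict[str, float], shard_count: int) -> list[list[str]]:
    """Greedy split: place the slowest remaining test file on the lightest shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    shards: list[list[str]] = [[] for _ in range(shard_count)]
    # Min-heap of (seconds assigned so far, shard index).
    loads = [(0.0, index) for index in range(shard_count)]
    heapq.heapify(loads)
    # Slowest first; the name breaks ties so the split is deterministic.
    ordered = sorted(timings.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        load, index = heapq.heappop(loads)
        shards[index].append(name)
        heapq.heappush(loads, (load + seconds, index))
    return shards
```

The cache key stays stable across runs until a dependency actually changes, which is what lets the install step be skipped on most builds. The sharding function needs only a mapping from test file to its historical duration. A split by directory ignores that information, which is how one shard ends up holding all the slow tests.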
Three changes, no new tools, and the median build now finishes in nineteen minutes. The lesson we keep returning to is not about builds at all: when a system feels slow, the instinct to replace it is usually a way of avoiding the duller work of understanding it. Measure first. The stack was never the problem.