Last quarter our CI pipeline was spending more time compiling than our engineers were spending writing code. The average build had crept past eleven minutes, pull requests were sitting idle waiting for a green check, and the on-call rotation had started to feel like a queue management exercise rather than an engineering practice. Something had to change, but we had no appetite to swap the toolchain mid-cycle.

What followed was a four-week sprint focused entirely on instrumentation, incremental caching, and a handful of configuration changes that most of us had been meaning to look at for months. By the end we had brought the median build down to just under five minutes - a 54% reduction - without touching a single dependency version or introducing a new language runtime.

Start by measuring, not guessing

The first instinct when a build is slow is to guess at the bottleneck and reach for a solution. We have all done it: turn on parallelism here, add a cache layer there, hope for the best. What we did instead was spend the first week doing nothing but measuring. We instrumented each build phase with structured timing logs, exported the data into a simple dashboard, and stared at it until patterns emerged.
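To make "structured timing logs" concrete, here is a minimal sketch of per-phase instrumentation. It is not our actual tooling: the `PhaseTimer` class, the phase names, and the JSON-lines output format are all assumptions chosen for illustration.

```python
import json
import sys
import time
from contextlib import contextmanager


class PhaseTimer:
    """Collects one timing record per build phase (hypothetical helper)."""

    def __init__(self):
        self.records = []

    @contextmanager
    def phase(self, name):
        # perf_counter is monotonic, so it is safe for measuring durations.
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the phase raised an exception.
            self.records.append(
                {"phase": name, "seconds": time.perf_counter() - start}
            )

    def slowest(self):
        # The phase to look at first.
        return max(self.records, key=lambda r: r["seconds"])

    def dump(self, stream=sys.stdout):
        # One JSON object per line, which most log pipelines can ingest.
        for record in self.records:
            stream.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    timer = PhaseTimer()
    with timer.phase("compile"):       # phase names are placeholders
        time.sleep(0.01)
    with timer.phase("test_setup"):
        time.sleep(0.05)
    timer.dump()
```

Wrapping each phase in a context manager keeps the instrumentation out of the phase's own logic, and emitting one JSON object per line means a dashboard can be fed without a custom parser.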

The data was humbling. Our longest phase was not compilation - it was test setup. A shared database fixture that every integration test suite was spinning up from scratch, every time, accounted for nearly three minutes of a typical build. It had been that way for two years. Nobody had noticed because nobody had looked.

Once we had that visibility, the prioritisation became obvious. We converted the fixture to a shared, lazily initialised singleton scoped to the test runner process rather than to each suite. We added a restore-from-snapshot step for the local development path. And we moved a handful of slow integration tests that were running in the unit tier into a separate, parallelised stage that only ran on merge to main.
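The fixture change is the easiest of these to show in code. The following is a minimal sketch, assuming a Python test runner; `FakeDatabase` and `shared_database` are hypothetical stand-ins for the real fixture, which is not shown here.

```python
import functools


class FakeDatabase:
    """Stand-in for an expensive fixture (hypothetical)."""

    instances_created = 0

    def __init__(self):
        # In a real fixture this is where the slow setup would happen.
        FakeDatabase.instances_created += 1
        self.rows = {}


@functools.lru_cache(maxsize=None)
def shared_database():
    # The first call builds the fixture; later calls in the same process
    # return the cached instance, so setup cost is paid at most once.
    return FakeDatabase()


if __name__ == "__main__":
    first = shared_database()
    second = shared_database()
    print(first is second)                 # True
    print(FakeDatabase.instances_created)  # 1
```

Two caveats with this approach: suites now share mutable state, so each one needs to clean up after itself or work inside a transaction that is rolled back, and `lru_cache` does not stop two threads from both running the setup if they make the first call at the same time.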

None of these techniques is exotic. They are the kind of thing you read about in engineering blog posts, nod along to, and then never find the time to implement. The lesson, obvious in retrospect, is that measurement forces prioritisation. Without the dashboard, we would have kept guessing, and guessing tends to produce local optima at best.

There is more work ahead - we want to take another pass at the asset compilation step, which is still noisier than we would like - but we are starting from a position of knowledge rather than intuition. That, more than any individual optimisation, is the change worth carrying forward.