Thursday, August 5, 2021

Continuous Integration Benchmark Metrics

While Continuous Integration should be a professional software development standard by now, many organizations struggle to set it up in a way that actually works properly.

I've created a small infographic based on data taken from the CircleCI blog - to provide an overview of the key metrics you may want to track, along with some figures on what the numbers should look like when benchmarked against industry performance:



The underlying data is from 2019, as I could not find data from 2021.

Key Metrics

First things first - if you're successfully validating your build on every single changed line of code and it just takes a few seconds to get feedback, tracking the individual steps would be overkill. The metrics described in this article are intended to help you locate improvement potential when you're not there yet.


Build Frequency

Build frequency is concerned with how often you integrate code from your local environment. That's important because the assumption that your local version of the code is actually correct and consistent with the work of the rest of the team is just that - an assumption, which becomes less and less tenable as time passes.

By committing and creating a verified, valid build, you reset the timer on that assumption, thereby reducing the risk of future failure and rework.

A good rule of thumb is to build at least daily per team member - the elite would validate their changes every couple of minutes! If you're not doing all of the following, you may have serious issues:

  • Commit isolated changes
  • Commit small changes
  • Validate the build on every single change instead of bulking up
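
To check the "at least daily per team member" rule of thumb against your own numbers, a rough sketch in Python could look like this. The function name, the input shape (one author name per verified build) and the sample data are assumptions made for illustration; they are not taken from the CircleCI figures.

```python
from collections import Counter

def builds_per_member_per_day(build_authors, working_days):
    """Average verified builds per working day, for each team member.

    build_authors: one author name per verified build in the period.
    working_days:  number of working days in that period.
    """
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    counts = Counter(build_authors)
    return {author: n / working_days for author, n in counts.items()}

# Hypothetical sample: two working days, three builds by alice, one by bob.
sample_authors = ["alice", "alice", "bob", "alice"]
frequency = builds_per_member_per_day(sample_authors, working_days=2)

# Anyone below 1.0 is not yet integrating daily.
below_daily = sorted(a for a, f in frequency.items() if f < 1.0)
print(frequency, below_daily)
```

Team members without any build in the period do not show up in the result at all, so you may want to compare the keys against your team roster.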

Build Time

The amount of time from committing a change until the pipeline has successfully completed - indicating that the build is valid and ready for deployment into production.

Some organizations go insanely fast, with the top projects averaging 2 seconds from commit all the way into production - and it seems to work for them. I have no insight into whether there's much testing in the process - but hey, if their Mean Time to Restore (MTTR) on production failures is also just a couple of minutes, they have little to lose.

Well, let's talk about normal organizations - if you can go from Commit to Pass in about 3 and a half minutes, you're in the median range: half the organizations will still outperform you, half won't.

If you take longer than 28 minutes, you definitely have to improve - 95% of organizations can do better!
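
If you want to see where you stand, here is a minimal sketch that compares your own typical build time against the two figures quoted above (about 3.5 minutes for the median, 28 minutes for the slowest 5%). The function name, the rating labels and the sample durations are made up for illustration.

```python
from statistics import median

# Benchmark figures as quoted in the text (2019 data).
MEDIAN_MINUTES = 3.5
SLOWEST_5_PERCENT_MINUTES = 28.0

def rate_build_time(durations_minutes):
    """Return (typical duration, rating) for a list of commit-to-pass times."""
    typical = median(durations_minutes)
    if typical <= MEDIAN_MINUTES:
        rating = "at or better than median"
    elif typical <= SLOWEST_5_PERCENT_MINUTES:
        rating = "slower than median"
    else:
        rating = "slowest 5%"
    return typical, rating

# Hypothetical durations in minutes; one outlier does not move the median much.
sample_durations = [4.0, 6.5, 5.0, 31.0, 5.5]
print(rate_build_time(sample_durations))
```

Using the median rather than the mean is a deliberate choice in this sketch: a single stuck build would otherwise dominate the result.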


Build Failure Rate 

The percentage of committed changes causing a failure.

The specific root cause of the failure could be anything, from build verification to compilation errors to test automation. No matter what, I'm amazed to learn that 30% of projects seem to have their engineering practice and IDE tooling so well under control that they don't have that problem at all, and that's great to hear.

Well, if that's a problem for you about a fifth of the time, you'd still pass as average, but if a third or more of your changes are causing problems, you should look to improve quickly and drastically!
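
The calculation itself is simple. As a sketch, assuming you can export one pass/fail result per committed change (the function name and the sample data are hypothetical):

```python
def build_failure_rate(outcomes):
    """Share of builds that failed.

    outcomes: list of booleans, one per committed change (True = passed).
    """
    if not outcomes:
        raise ValueError("need at least one build")
    failures = sum(1 for passed in outcomes if not passed)
    return failures / len(outcomes)

# Hypothetical sample: 10 changes, 2 of them broke the build.
sample_outcomes = [True, True, False, True, True, True, False, True, True, True]
rate = build_failure_rate(sample_outcomes)

# "A third or more" is the threshold named in the text.
needs_attention = rate >= 1 / 3
print(rate, needs_attention)
```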


Pipeline Restoration Time

How long it takes to fix a problem in the pipeline.

Okay, failure happens. Not to everyone (see above), but to most. And when it does, you have failure demand - work only required because something failed. The top 10% of organizations can recover from such a failure within 10 minutes or less, so they don't sweat much when something goes awry. If you can recover within the hour, you're still average.

From there, we quickly get into a widely spread distribution - the median moves between 3 hours and 18 hours, and the bottom 5% take multiple days. The massive variation between 3 and 18 hours is easily explained - if you can't fix it before end of business (EOB), there's an entire night between issue and resolution.

Nightly builds, which were a pretty decent practice just a decade ago, would immediately put you at or below median - not working between 6pm and 8am automatically pushes you above 12 hours, which already puts you at the bottom.
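
To measure this yourself, one possible approach is to take the time between the first failing build of a red streak and the next passing build. The sketch below assumes a chronological list of (finish time, passed) pairs; the function name and the timestamps are invented for illustration.

```python
from datetime import datetime, timedelta

def restoration_times(builds):
    """Time from the first failure of each red streak to the next green build.

    builds: chronological list of (finished_at, passed) tuples.
    A streak that is still red at the end of the list is ignored.
    """
    times = []
    broken_since = None
    for finished_at, passed in builds:
        if not passed and broken_since is None:
            broken_since = finished_at           # pipeline just turned red
        elif passed and broken_since is not None:
            times.append(finished_at - broken_since)
            broken_since = None                  # pipeline is green again
    return times

# Hypothetical history: one quick fix, one failure left over night.
sample_history = [
    (datetime(2021, 8, 4, 10, 0), False),
    (datetime(2021, 8, 4, 10, 8), True),
    (datetime(2021, 8, 4, 17, 30), False),
    (datetime(2021, 8, 4, 17, 50), False),
    (datetime(2021, 8, 5, 8, 30), True),
]
print(restoration_times(sample_history))
```

The second incident in the sample shows the overnight effect described above: a failure at 17:30 that is only fixed the next morning already costs 15 hours.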


First-time Fix Rate

Assuming you do have problems in the pipeline - which many don't even have - you occasionally need to provide a fix to return your pipeline to Green condition.
If you do CI well, your only potential problem should be your latest commit. If you follow the rules on build frequency properly, the worst-case scenario is reverting your change - and if you're not certain that your fix will work, reverting is the best thing you can do to return to a valid build state.

Half the organizations seem to have this under control, while the bottom quartile still seems to enjoy a little bit of tinkering - with fixes being ineffective or leading to additional failures. 
If that's you, you have homework to do.
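
A rough way to calculate this from a build history is sketched below. It rests on a simplifying assumption that is mine, not the source data's: every build directly following a fresh failure counts as the first fix attempt. The function name and the sample are hypothetical.

```python
def first_time_fix_rate(outcomes):
    """Share of failures that were fixed by the very next build.

    outcomes: chronological list of booleans (True = passed).
    Returns None when there is no failure with a follow-up build.
    """
    incidents = 0
    fixed_first_time = 0
    for i, passed in enumerate(outcomes):
        previous_passed = i == 0 or outcomes[i - 1]
        is_new_failure = previous_passed and not passed
        has_fix_attempt = i + 1 < len(outcomes)
        if is_new_failure and has_fix_attempt:
            incidents += 1
            if outcomes[i + 1]:
                fixed_first_time += 1
    if incidents == 0:
        return None
    return fixed_first_time / incidents

# Hypothetical history: the first failure is fixed at once, the second needs two tries.
sample_run = [True, False, True, False, False, True]
print(first_time_fix_rate(sample_run))
```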


Deployment Frequency

The proof of the pudding: How often you successfully put an update into production.

Although Deployment Frequency is clearly located outside the core CI process, if you can't reliably and frequently deploy, you might have issues you maybe shouldn't have.

If you want to be great, aim for moving from valid build to installed build many times a day. If you're content with average, once a day is probably still fine. When you can't get at least one deployment a week, your deployment process is at the bottom of the barrel and you have plenty of room for improvement.

There are many root causes for lower deployment frequency, though: technical issues, organizational issues or just plain process issues. Depending on what they are, you're looking at an entirely different solution space: for example, improving technically won't help as long as your problem is an approval orgy with 17 different committees.
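
Whatever the root cause turns out to be, the number itself is easy to obtain. Here is a minimal sketch that checks the "at least one deployment a week" threshold from above; the function name, the reporting period and the deployment dates are assumptions for illustration.

```python
from datetime import date

def deployments_per_week(deploy_dates, period_start, period_end):
    """Average number of successful production deployments per week.

    deploy_dates: one date per successful deployment in the period.
    period_start, period_end: first and last day of the period (inclusive).
    """
    days = (period_end - period_start).days + 1
    if days <= 0:
        raise ValueError("period_end must not be before period_start")
    return len(deploy_dates) / (days / 7)

# Hypothetical sample: six deployments in a four-week period.
sample_deployments = [
    date(2021, 7, 6), date(2021, 7, 9), date(2021, 7, 15),
    date(2021, 7, 20), date(2021, 7, 27), date(2021, 7, 30),
]
per_week = deployments_per_week(
    sample_deployments, date(2021, 7, 5), date(2021, 8, 1)
)

# Less than one deployment a week is the lower bound named in the text.
below_weekly = per_week < 1.0
print(per_week, below_weekly)
```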


Conclusion

Continuous Integration is much more than having a pipeline.

Doing it well means:

  1. Integrating multiple times a day, preferably multiple times an hour
  2. Having such high quality that you can be pretty confident that there are no failures in the process
  3. Not breaking a sweat when a failure does happen and you have to fix it

And finally, your builds should always be in a deployable condition - and the deployment itself should be so safe and effortless that you can do it multiple times a day.

Thousands of companies world-wide can do that already. What's stopping you?



