Metrics That Matter
In a previous post on Product Engineering Best Practices I asserted that you are what you measure. To briefly unpack that assertion: metrics inform behavior, behavior defines who you are, and so you become what you measure.
What this means is that your metrics define the essence of who you are as a team. This post zooms in on how to view a metric for released bugs in the context of Waterfall vs Agile.
First, a look at the first couple of months after the release of a Waterfall project.
In typical Waterfall fashion there is a big bang release, and all the user-impacting bugs that slipped through the cracks become apparent at once. The highs and mediums are worked first, and some time later the team reaches diminishing returns. Not ideal, but then again this isn’t news to anyone.
What does this look like for projects that are frequently and iteratively releasing to Production?
In this view we notice that Agile doesn’t eliminate user-impacting bugs. In fact, over a similar period of time there may be a similar number of distinct bugs as in the Waterfall model. So why is the Agile approach better, and why might we want to measure something other than total bug count for Agile teams?
Measure What Matters: Area Under the Curve
Take a look at the above Waterfall image again. If we want to reduce how much released bugs impact users, we have two dials to adjust.
- Improve quality before release and use released bug count to measure success. This encourages behavior that brings the overall height of the graph down and therefore reduces impact to users.
- Scale up the number of dev hours dedicated towards fixing post-release bugs, either by scaling up the team or by encouraging overtime. This steepens the drop in bugs and reduces impact to users.
The second option is obviously unattractive, so it’s understandable that one would be tempted to lean heavily into the first option with Waterfall. Having said that, let’s not forget what both options are trying to accomplish. At the end of the day, both are trying to reduce impact to users or, put another way, to reduce the area under the curve. Why? Because the area under the curve represents the total impact to users.
User Impact = (# Bugs per user) × Time
Here is something worth celebrating: by moving to a more agile approach with frequent, iterative releases, the area under the curve is greatly diminished compared to Waterfall, even if the total number of bugs summed over all releases remains the same. Fewer bugs reach fewer users with each cycle, and each bug can be fixed soon after it ships, so it stays open for less time. Both effects shrink the curve area.
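To make the comparison concrete, here is a minimal sketch that computes the area under the curve for two made-up scenarios. Every number in it is an assumption chosen for illustration (12 bugs, one fix every 5 days, a release every 10 days), and `user_impact` is a hypothetical helper, not a standard metric from any tool.

```python
def user_impact(bugs, horizon_days):
    """Area under the open-bugs curve, in bug-days per user.

    Each bug is a tuple (release_day, fix_day, user_fraction), where
    user_fraction is the share of users exposed to the bug (0.0 to 1.0).
    """
    total = 0.0
    for release_day, fix_day, user_fraction in bugs:
        # A bug only counts while it is open and inside the time window.
        open_days = min(fix_day, horizon_days) - release_day
        total += max(open_days, 0) * user_fraction
    return total


# Hypothetical Waterfall project: all 12 bugs ship on day 0 to every user,
# and the team closes one bug every 5 days.
waterfall = [(0, 5 * (i + 1), 1.0) for i in range(12)]

# Hypothetical Agile team: the same 12 bugs, but 2 ship with each of 6
# releases (one release every 10 days), fixed at the same one-per-5-days pace.
agile = [
    (release_day, release_day + 5 * (j + 1), 1.0)
    for release_day in range(0, 60, 10)
    for j in range(2)
]

print(user_impact(waterfall, horizon_days=60))  # 390.0 bug-days
print(user_impact(agile, horizon_days=60))      # 90.0 bug-days
```

Both teams ship the same 12 bugs and fix them at the same pace, yet in this toy setup the Agile curve covers less than a quarter of the area, purely because no bug waits in a long queue behind the others.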
An agile approach gives us more than just two options for improving user experience through a higher-quality product. Some options include:
- Adjust the release cycle. Consider how continuous delivery could chop up the graph even more and further reduce curve area.
- Reduce how many users see each release. Consider targeting fewer users with especially risky releases and slowly scale up usage.
- Fully leverage sprint retrospectives. The reasons the area under the curve is larger for one team than for another may be team specific. Encourage teams to dig into root causes and experiment with solutions.
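The second option above, limiting how many users see a release, can be sketched the same way. This is an illustration under assumed numbers: the rollout steps (5%, 25%, 100%) and the 3-day fix time are invented, and `exposure_weighted_impact` is a hypothetical helper.

```python
def exposure_weighted_impact(rollout, fix_day):
    """Bug-days per user for one bug that ships on day 0 and is fixed on fix_day.

    rollout is a list of (start_day, user_fraction) steps sorted by start_day;
    each fraction applies until the next step starts.
    """
    impact = 0.0
    for i, (start, fraction) in enumerate(rollout):
        # A step ends when the next one begins, or when the bug is fixed.
        next_start = rollout[i + 1][0] if i + 1 < len(rollout) else fix_day
        end = min(next_start, fix_day)
        impact += max(end - start, 0) * fraction
    return impact


# Hypothetical full rollout: every user gets the release on day 0.
full = [(0, 1.0)]

# Hypothetical staged rollout: 5% of users, then 25% on day 2, then everyone on day 4.
staged = [(0, 0.05), (2, 0.25), (4, 1.0)]

# Assume the bug is found and fixed on day 3 in both cases.
print(round(exposure_weighted_impact(full, fix_day=3), 2))    # 3.0
print(round(exposure_weighted_impact(staged, fix_day=3), 2))  # 0.35
```

With these assumed numbers the bug is fixed before the release reaches everyone, so most users never see it. The count of released bugs is identical in both cases; only the area differs.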
You Are What You Measure
Metrics incentivize behavior and behavior defines who you are. Does a metric for counting released bugs matter? Do you want to become a bug minimizing machine? If you’re running Waterfall projects and avoiding developer burnout is important to you, then sure. Concentrating on that specific metric could make sense.
However, should Agile teams become bug minimizing machines?
Perhaps not.
Total bugs is not the metric that matters. In fact, getting too focused on minimizing released bugs can be detrimental to the iterative agile machinery. It may disincentivize teams from frequently releasing until they’re more confident the bug count is closer to zero, effectively moving the needle in a more Waterfall direction. Ironically, this could increase the area under the curve.
Let’s get to the punchline.
In Agile, bugs are ok as long as the overall user impact is minimized.
If your team is frequently and iteratively releasing value, it’s fine for some bugs to slip through because the overall impact will be small. Don’t ignore quality. Rather, recalibrate on which metrics truly give a quality signal.
When you focus on the area under the curve with an agile mindset, which is to say when you focus on reducing negative impact to users while frequently releasing value, who do you become? You are what you measure. In this view, you become a team that optimizes value delivery to end users while keeping a more holistic picture of quality in view.