Wednesday, January 19, 2022

Team archetypes: basic setups

"Which kind of team setup should we choose?" - a common question in many organizations with the luxury of having an IT department bigger than a single team. First of all: there's no universal best answer - the choice is often constrained by factors outside the teams' control - such as HR, legal, company politics and so on. Irrespective of such details, let's explore what the archetypes are.

As long as there are only enough people for a single team, the question "which setup do we choose?" is - fortunately - moot: build a team and get to work. Approaches like Skunkworks, which predate software development, conclude that a single team always incurs less overhead than multiple teams, and that overhead can become a performance killer - down to the point where a single, effective team can outperform half a dozen teams struggling to get their act together.

This entire article is written under the premise that a single team isn't enough to get the job done, and that we have already exhausted all feasible options of doing the same with fewer people - hence: there is an indisputable need to set up a multi-team structure.


Four Team Archetypes

Let's take a look at the team setup chart. For simplicity's sake, let's just assume that the only functional specializations are Analysis, Coding and Testing. The key points of the article don't change if we have further specializations, such as Frontend/Backend/Database development, or if we include Operations. If anything, the problems caused by specialization would be exacerbated.



Component Teams


One of the most common setups for organizations with just a handful of teams is the Component Team: each team consists of the people required to deliver a single component. This works splendidly as long as there are only a few components, and they are loosely coupled. In that case, each team can perform their work independently. It becomes an entirely different ballgame when components are tightly coupled, interdependent - or worse: co-dependent.

Key questions to ask when setting up component teams include:
  • Will each team have all the skills and means to complete the delivery lifecycle of their component? If the answer is "No," component teams will quickly find themselves blocked and unable to consistently deliver value. Proceed only if the answer is "Yes."
  • How close is the coupling between the components? If the answer is "tight," component teams will likely find themselves rendered ineffective. Proceed only if the answer is "very loose."
  • How dependent are the components on each other in delivering end user value? If the answer is anywhere on the scale from "master-slave" to "closely interlinked," at least one component team will be a bottleneck, and overall delivery performance will depend on the constraining team.
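The bottleneck effect behind the last question can be made concrete with a small sketch. The numbers and component names below are purely hypothetical - the point is that a feature touching several components can only flow as fast as the slowest team involved:

```python
# Hypothetical throughputs (features per sprint) of three component teams.
component_throughput = {"frontend": 6, "backend": 4, "billing": 2}

def delivery_rate(components_touched):
    """End-to-end rate for features spanning the given components:
    capped by the constraining (slowest) team."""
    return min(component_throughput[c] for c in components_touched)

print(delivery_rate(["frontend", "backend"]))             # capped by backend
print(delivery_rate(["frontend", "backend", "billing"]))  # capped by billing
```

However fast the frontend team works, every cross-component feature is delivered at the constraining team's pace - which is exactly why tight interdependence undermines the component team setup.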

Specialist Teams

The staple setup in corporations with hundreds, potentially thousands of IT people is the Specialist team: each team consists of specialists performing a single function in the software development lifecycle. In very large organizations, we often see a double specialization - each component has its own specialist teams, each performing only a single type of task.
This kind of "industrialized, conveyor-belt delivery" process optimizes resource efficiency over effectiveness, and typically ends up with massive bottlenecks diminishing the flow efficiency of value delivery to close to zero: regardless of how many people are involved, "there are always too few people." The problem is less the amount of work than the inherent optimization favoring workload over results.


Key questions to ask when setting up specialist teams include:
  • Where will the bottlenecks be? You will have at least one bottleneck, and if it's not where you want it to be, you won't understand why your organization is so ineffective.
  • How do we keep flow across teams balanced? Managing flow in specialist teams requires understanding the intake, throughput and flowback rates of every single team. If these are out of balance across teams, work will get stuck.
  • How will people communicate? The classic Waterfall-style "communication by document" is slow, inefficient, ineffective and error-prone. Proceed only when you have a feasible solution for communication flow.
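The flow balance question can be illustrated with a toy pipeline. All rates below are hypothetical; "flowback" stands for rework sent back from a downstream stage. Queues grow wherever a stage's intake exceeds its throughput:

```python
# Hypothetical per-sprint rates for a three-stage specialist pipeline.
stages = [
    {"name": "analysis", "throughput": 10, "flowback_in": 0},
    {"name": "coding",   "throughput": 8,  "flowback_in": 1},
    {"name": "testing",  "throughput": 6,  "flowback_in": 0},
]

def queue_growth(demand):
    """Queue growth per sprint at each stage, given external demand."""
    growth = {}
    inflow = demand
    for s in stages:
        intake = inflow + s["flowback_in"]
        growth[s["name"]] = max(0, intake - s["throughput"])
        inflow = min(intake, s["throughput"])  # what actually moves on
    return growth

print(queue_growth(10))
```

With a demand of 10 items per sprint, work piles up in front of coding and testing every single sprint - a slow-motion traffic jam that no amount of "pushing harder" upstream will fix.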

Feature Teams

The key reason for feature teams is to reduce the cross-team friction caused by delivering unusable component work which requires integration efforts at a later point in time. Feature teams minimize the handovers across team boundaries which lead to additional communication and waiting queues. 

Neither component nor skill boundaries will stop the flow of work from demand to delivery in a feature team. The way of working in feature teams is very natural - people sit together, do whatever it takes, and get the job done.


Key questions to ask when setting up feature teams include:
  • Which organizational impediments prevent us from setting up feature teams, and how do we overcome them? The barriers are often centered on management, structure and contracts rather than the shop floor.
  • How do we keep team size low and still make sure people know what they're doing? Feature teams require a lot of initial learning as every team member is confronted with a vast amount of things they don't know.
  • Which engineering practices will we use to ensure that teams won't constantly step on each others' toes? If you have no solutions, you need to find these before proceeding.

Generalist Teams

The generalist team is often considered the holy grail of Agilists. A generalist team is multi-skilled both in software engineering, and in the product domain: each team consists of individuals who can exercise multiple functions in the development lifecycle, and who are able to work on all relevant components of their product. Generalists hardly find themselves in bottleneck situations, and are always able to contribute value. Setting up generalist teams has a massive learning curve, requiring continuous multi-learning. In return, generalist teams minimize the concern of "truck count," and they are able to flexibly take on any feature that maximizes value.

Key questions to ask when setting up generalist teams include:
  • Where are our current knowledge constraints? Generalist teams immediately expose where individuals with either product domain expertise or specific lifecycle expertise are too rare to ensure each team has the seed knowledge to get started.
  • How much specialization is really required? If you're working in a domain where a lack of deep knowledge becomes a life-or-death matter, generalization may not be your primary choice. These cases are rarer than we would think. Based on the Pareto Principle, 80% of the work requires 20% of the knowledge - and generalists can do that.
  • What does it take to set up generalist teams? How do we get started? Can we start right away, should we upskill feature teams or component teams? Do we need to re-organize entirely?
The most common concern about setting up generalist teams is, "Not everyone can do everything." Indeed. It takes time and learning, and not everyone will specialize deeply in everything. There are two key points to ponder: the starting point, and the optimum point.
At the starting point, we set out with people who may find themselves unable to contribute outside their specialization, and based on where the team's constraint is, that's what they need to learn first.
At the optimum point, team members can offer basic contribution wherever it is needed - and "wherever" isn't "everywhere." A generalist team is effective when there are no skill-related bottlenecks in the team, not when everyone is a master of everything. It could mean that a tester understands some basics of coding and can make simple changes by themselves - not that they need to establish profound coding expertise. 

What's the difference between generalist and feature teams?

Feature teams are a lower standard. We can easily set up a feature team by putting an analyst, a developer and a tester into one team and giving them access to all the modules they need to work with.

Generalist teams should be feature teams. Only some feature teams are - or evolve into - generalist teams.

A generalist team consists of coders who can analyze or test, testers who can analyze and maybe write some code, and analysts who can also code or test. People who worked in smaller companies are often used to this way of working - people from a corporate environment are often unable to even grasp this concept as it's too detached from their reality.


How about Generalist Component teams?

Small teams of people, each of whom can do most of the work on their own components. This pattern is very typical in modern organizations opting for microservice architectures.

Dissolving specialization boundaries within teams is desirable to increase "truck count," and definitely useful. We minimize delays, handovers and information gaps within the team. On the other hand, the effectiveness of the overall development organization remains constrained by the lowest performing component team. Component team organizations can't solve this fundamental flow impediment, hence the teams' generalization ends up being a local optimization.

We can even find this local optimization working against us - the highest performing teams might create overload on other teams, further reducing the overall performance of the organization. High-performance generalist component teams might render the entire organization ineffective, despite best intentions.



What to choose?

In other articles on the topic of team archetypes, we will explore a little further what the implications of each archetype are. The topic is complex, hence there is no simple answer. The best answer is: "Caveat emptor." Know what you're opting for and what the consequences of your choices are.


What about Mix-And-Match?

Should every team within an organization have the same setup? Wouldn't it be better to mix-and-match? A few component teams doing bulk haulage, supported by a few specialist teams for Feature Analysis and End-to-End testing, maybe with a Task Force Feature Team for high urgency tasks? 

That's actually what we find in many organizations, so it's definitely a realistic scenario. Our question is less about "Can we do that?" than it's about, "Should we do that?" - and again: there's no simple answer. As a product matures and growth reaches a plateau, that's most likely what we're going to find. For emerging products, the suggested mixed setup is probably not going to provide the necessary performance to succeed. 

Where to start?

Try starting with the simplest setup, and only add complexity when inevitable.

And that simplest possible setup is a small number of Generalist teams. If you can, try that first.

Wednesday, January 12, 2022

The economy of incentives

People working in bigger companies often do things which don't really seem to make sense when looking at the big picture. And yet, they're often pretty reasonable in their choices. They are intuitively following basic psychological and economical principles. Let's explore.




The economy of needs, wants and incentives

Let's start out at the detail level of the individual. Each person has needs:

The need pyramid

Maslow's Hierarchy of Needs, sometimes also called "Maslow's pyramid," is built on the theory that every human being has needs, and these needs follow a hierarchical order. This picture is already a simplification; more comprehensive models are abundant on the Internet.
While each person has the same types of needs and the pyramid is the same, the level at which these needs are considered "satisfied" differs widely.
One person might feel basically satisfied with a bowl of rice and a few cups of water, whereas another person requires a more specialized diet to sustain themselves. These are the basic needs, which must be met to ensure survival. People whose basic needs aren't met will not have a mind to think about other things; their primary concern will always be meeting these needs.
Once people are satisfied at the base, they care for psychological needs - family, social ties, being accepted and valued.
On such a firm foundation begins the restless search for meaning in life - developing one's personality, being creative, pursuing a cause. At this level, people look for transcendence.
Dan Pink mentions this in his book, "Drive" - without fully realizing that it's not just the money that must be off the table, but all basic and psychological needs. People who feel their physiological well-being or social status is threatened aren't looking for transcendence, so in order to set up an organization where people look for a higher purpose, there's a lot of groundwork to be done in taking care of people first.



From "Need" to "Want"

Unmet needs cause dissatisfaction with the status quo. This is also true for the future, because people think ahead.
We build our needs pyramid from the bottom, and whenever a lower-level need becomes unmet, everything on top of that need falls out of focus. A starving person doesn't care for appreciation (many artists can tell that story.) A bullied person won't notice the office aesthetics. And so on. So, when a person has an unmet need, they want something, and action is required to satisfy the need (although it could be a different actor who serves the person in need.)
There are a lot of intermediary Wants - for example, money can be used to meet all kinds of needs. When a person is hungry, they want money to buy food. Or drink. Or to pay the rent. Once basic needs are satisfied, they might invest that money in a token of wealth that meets their need for social recognition. And so on.
Other intermediary Wants, such as a promotion, could be more complex: Is it about the job title for prestige, about the presumed job security, about the additional pay, or about rising through the ranks to improve status? Maybe all - maybe only some.
Or maybe the idea that the Want satisfies the need is an illusion?

The key takeaway here is: if there's a need, there is a want. Ignoring the underlying need will render attempts to satisfy a Want ineffective.



Incentives

Incentives don't directly affect needs. They work because they trigger (both positive and negative) wants: things the individual wants, in order to meet needs - and things the individual wants to avoid, in order to not have needs unmet.
An incentive always stimulates at least one Want. 


Positive incentives

Positive incentives give people the ability to satisfy their needs if they conduct a certain action. The most common form of positive incentive is, "Do your job, get your paycheck." Even a simple "thumbs up" from a coworker can be an incentive to try and go the extra mile. 
The strength of an incentive depends much less on the nominal amount of incentivization than on the level of unmet need. For example, you won't entice Donald Trump with a thumbs up, but a new team member who isn't certain of their position in the team yet may be willing to go to great lengths to finally get the lead developer to nod in agreement.

Negative incentives

Just like positive incentives, we can find ourselves subject to negative incentives. For example, the negative incentive "skip dinner" attached to overtime plays on our Want to not be hungry.
In society, we use negative incentives to deter certain behaviours, such as fines (which affect people differently, depending on their wealth.) 
Explicitly phrasing negative incentives helps people discover which courses of actions are desirable and which are undesirable.
Managers quickly forget that there are almost always some negative incentives attached to their words: even asking for a trivial thing might lead to the person feeling their status in front of their peers is diminished - a negative incentive, which people try to avoid.


The incentive economy

An economy is defined by wealth - in this case, we define "wealth" as satisfied needs.
In the case of incentives, we're dealing with promises of future wealth, hence, a value proposition to the recipient of the incentive.


Passive incentives

"What could be a passive incentive?" - well, that's an incentive created by doing nothing. Taking an extreme example: when a police officer continues munching donuts while a robber demands cash from the store owner, that's an incentive for the robber to commit further crimes. We see much less extreme examples in the everyday business world when managers fail to acknowledge the successes and contributions of their teams, incentivizing individuals either to cut down on performance (if stress is affecting their basic needs) - or to start hunting for a better job.

The nefarious thing about passive incentives is that you literally don't need to do anything to trigger them.

Ineffective incentives

The next type of incentive we're looking at is the ineffective incentive. A positive incentive which doesn't address a real Want is entirely ineffective - as are negative incentives which don't touch a relevant need. This is also the reason why a pay raise for a person who's already fully able to meet their basic and psychological needs is pretty ineffective, as Dan Pink figured out in his work, "Drive."
The same is true for a "fun activity at work" while people are worried about not being able to afford the rent: it addresses a higher-level need while lower-level Wants remain unmet.
We can try to manipulate people by suggesting that they do have a certain Want - a typical marketing strategy. Here we can quote Lincoln: "you can fool some of the people all of the time, and all of the people some of the time, but you can't fool all of the people all of the time." Eventually, people will realize that an incentive doesn't affect their needs, and they will ignore it. That is especially true for things often called "non-monetary incentives."
Likewise, when a person finds alternative ways of satisfying their Needs, the Want disappears, and the incentive fizzles. For example, new joiners will stop trying to achieve the recognition of their boss when they feel that the recognition of their peers means a lot more. 

Hence, long-term incentives that have alternatives tend to be ineffective. 

Perverse incentives

The final category of incentives are perverse incentives - these are different from the other categories.
When we try to incentivize an action, we usually do so to get a benefit from what people do. The term "perverse incentive" comes into play when people have the Want, but the intended incentivized action isn't actually what people do in order to meet their need - so they go on to do undesirable things. "Cobra Farming" is a notorious example of a perverse incentive.
The problem with perverse incentives is that they work extremely well - just not in the way that the incentive-giver intended. 
And there's another problem: Perverse incentives are extremely common in complex organizations where the relationship between cause and effect isn't clear.

Setting incentives without connecting to the recipient's reality often leads to perverse incentives.
Top managers coming up with broad-brushed incentivization schemes for people at shop floor level often realize this only when it's already too late.


Summary


Bringing it all together - we have to understand the real needs and how they can be met. We must learn to go beyond merely talking about our Wants.
From there, we must learn how to take care of each other's needs in effective ways.

That allows us to become explicit on the incentives that let us build "Win-Win" scenarios.

Sunday, December 19, 2021

The Agile Tragedy of Commons

There's a socioeconomic dilemma called the "Tragedy of the Commons," which effectively means: "In unregulated systems, local optimization may become so extreme that the system becomes unsustainable."
And here's why you need to understand it in order to understand "agility at scale."

Before we get started, though, let me explain the dilemma:


The Tragedy of Commons

Imagine there's a huge, pristine green pasture. The lord of the land has decreed that everyone is free to use this pasture for shepherding their flock.

The first shepherd who arrives finds a vast green pasture, and the flock is happy to graze on the fields.

Soon afterwards, another shepherd arrives, and their sheep graze happily on the lush pasture as well. Both shepherds are happy, as their sheep have ample food to eat.

In the excellent conditions, the flocks multiply.

As the flocks grow, there is no longer an overabundance - the sheep of the two shepherds begin competing for food. The first shepherd's sheep had more time to multiply, and the second shepherd's sheep lack the required conditions to multiply freely.


Both flocks compete over increasingly scarce food: the sheep lack nutrition and are threatened by starvation. The first shepherd feels that the second shepherd's presence has caused this threat to their flock. The second shepherd considers the first to be using the unfair advantage of a bigger flock to drive them off the pasture. Quarrels arise.

The feudal lord settles the dispute by dividing the once lush green pasture into an allotted segment for each shepherd, based on the size of their flock. Both shepherds now have access to less land than they could access before - but each now has full control over their flock.


Should the lord have settled the dispute in this way? Should the lord have found another solution? Take some time to think. What would have happened - had the lord not intervened?


Tragedy of Agile Commons

The Tragedy of the Commons has many, and massive, applications in the realm of software systems - more so in scaled environments. In the land of Agile, they're much more visible than in the land of Waterfalls, where the lots are already divided before they exist: Agile teams incrementally and empirically discover their next best move based on where they are today - perfect preconditions for the Tragedy of the Commons.

Common Code

Teams who haven't learnt discipline in code will find it highly irritating to see one developer interfering with another developer's code: "my" coding style is always better than "yours."

Quickly, quality problems begin arising on the Common Codebase: quarrels over style, functionality placement, inconsistencies and many others.

As more and more developers enter the scene, the tendency for code silos built around personal preference, coding style, technologies, or even domains increases.

Whereas one team can often be left to freely roam the green pastures of service implementation, the field quickly turns into a brown quagmire when multiple teams all play their preferences, until the code base becomes entirely unworkable.

Once teams build fences around their codebase, Collective Code Ownership becomes a thing of the past, and Component Teams find themselves entertaining a dreadful nightmare of coordination meetings.

Better approaches could be:

  • code conventions
  • linting rules
  • cross-team Pairing sessions
  • Code dojos
  • Continuous Integration / Continuous Delivery

- but these things are all topics large organizations struggle with after the code has been divided into silos.

Common Innovation

A green field project is often a great way to introduce new, more powerful technologies that allow teams to do more with less.

As the development organization matures and standards begin to shape the landscape, new ideas become exotic and marginal, struggling to overcome inertia.

Imagine - just for example - introducing agent-based logic into an architecture driven by RPCs and SOAP requests: There will be few takers for such an innovation.

The Tragedy of Common Innovation is that new ideas have to find a really small niche when the field is already taken by old ideas. Many good ideas go extinct before they can take hold.

With a constant decline of innovative ideas, organizations eventually find themselves investing massive efforts into servicing outdated ways of working and technologies, incapacitating their ability to deliver high value to the customer the way others do.

Better approaches might be:

  • innovation allotments
  • hackathons
  • innovation sessions
  • innovation champions
  • cross-team collaboration on innovation
  • intrapreneurship

Common Meetings

Have you ever been in a 2-hour meeting with 40 people? Did you ever pay attention to how many people actually speak? Hint: it's most likely not an even distribution.

Small organizations find their meetings very effective, but as more and more people appear on the scene, meeting effectiveness quickly declines. And there's a reason.

In an effective 3-people, 1-hour meeting, every person gets to speak roughly 20 minutes. That's a lot of time to voice ideas, offer feedback and draw conclusions. That's a 33% activity ratio. And everyone has pretty much the same understanding afterwards.

When we contrast this with a 30-people, 2-hour meeting: simply by dividing clock time, we see that every person gets to speak an average of 4 minutes while being forced to listen for an average of 116 minutes. The ratio of ideas contributed versus passivity is staggering for each individual - the activity ratio has dropped to a mere 3%! In such a scenario, the tragedy of common meetings becomes that some of the more experienced people take the stage, and everyone else becomes decoration.
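The arithmetic above generalizes into a two-line function (idealized, assuming speaking time is divided evenly among attendees):

```python
def activity_ratio(attendees, duration_minutes):
    """Average speaking minutes per attendee and their share of the meeting,
    assuming clock time is divided evenly."""
    speaking = duration_minutes / attendees
    return speaking, speaking / duration_minutes

print(activity_ratio(3, 60))    # 20 minutes each, ~33% activity
print(activity_ratio(30, 120))  # 4 minutes each, ~3% activity
```

Note that the ratio only depends on headcount: doubling the duration doesn't help, because listening time grows just as fast as speaking time.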

Solution approaches might be:

  • focus sessions
  • using a need-to-know principle
  • Law of Two Feet
  • breakout sessions
  • topic ownership

Specialisation also removes the need for everyone to participate in all discussions.

The tradeoff is mainly between not everyone getting firsthand information and people suffering through hours of only marginally relevant meetings. To any solution, there's a downside.

Common Work

A single developer can work on any code at any time, and there will be no unpredicted side effects like merge conflicts caused by others' work. Small teams will usually learn quickly how to coordinate so that they minimize mutual interference.

Without good engineering practice, delivering a larger, integrated piece of software means lots of simultaneous changes in many places. Teams will either get into a Riverdance of constantly stepping on each other's toes, or they will require so much coordination that things get really messy. Of course, the "solution," is - once again - code silos and dependency hell: productivity tanks as risks and delays rise skywards.

Every developer joining an organization that hasn't adequately dealt with the Tragedy of Common Work will make every other developer's productivity decline - up to a point where the net productivity gain of hiring additional developers may be negative, i.e. with each new hire, the organization gets less productive overall!
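This degradation can be sketched with a toy model in the spirit of Brooks' Law: communication paths grow quadratically with headcount. The output and overhead figures are purely hypothetical - only the shape of the curve matters:

```python
def net_output(developers, individual_output=10.0, overhead_per_pair=0.06):
    """Total output when every pair of developers carries a small,
    constant coordination cost (hypothetical numbers)."""
    pairs = developers * (developers - 1) / 2
    return developers * individual_output - pairs * overhead_per_pair

# Marginal gain of each additional hire shrinks - and eventually turns negative.
for n in (10, 50, 100, 200):
    print(n, net_output(n + 1) - net_output(n))
```

With these parameters, the marginal gain of a hire is roughly 10 minus 0.06 times the current headcount - so somewhere past 160 developers, each new hire makes the organization less productive overall.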

Potential solutions could be:

  • visual dependency management
  • domain separation
  • decoupling
  • joint roadmap planning
  • cyclical synchronisation points
  • communication by code

Now what?

These are just four examples of how the Tragedy of Commons matters a lot in a Scaled Agile setting, and there are a vast number of potential commons.

Regardless of whether an Enterprise is new to agile ways of working or has been practicing them for a while: you need to establish overarching rules that mitigate the conflicts, lest you run afoul of the Tragedy of the Commons.

The "Tragedy of the Commons" is entirely avoidable in a holistic system where every participant sees themselves as an integral part of the whole. The solution is coexistence and collaborative conflict resolution rather than competition.

Ground rules address unregulated, harmful growth, a lack of discipline, and myopic actions, but each rule comes with a drawback: it reduces flexibility. While Team Autonomy needs boundaries where these serve the Greater Good, it's important to set these boundaries wisely - and to revisit them where they aren't useful. That can't be done by only one group; in a common system, it has to involve all those whom it concerns.

Which boundaries will you set to prevent your organization from suffering the Tragedy of the Commons, and what is their cost?

Monday, November 29, 2021

The "Planning Tetris" Antipattern

"We need to utilize all of our Story Points" - that's a common dysfunction in many Scrum teams, and especially in SAFe's Agile Release Trains where teams operate on a Planning horizon of 3-5 Sprints. It results in an antipattern often called "Planning Tetris." It's extremely harmful, and here's why.


Although the above feature plan appears to be perfectly optimized, reality often looks different: all items generate value later than they potentially could - at a higher cost, in longer time and with lower efficiency!


Accumulating Work in Process

Planning Tetris often leads to people starting work on multiple topics in one Sprint, and then finishing it in a later Sprint. It is resource-efficient (i.e. maximizing the utilization of available time), not throughput-efficient (i.e., maximizing the rate at which value is generated.)

That leads to increased Work in Process, which is a problem for multiple reasons:

Value Denial

Just like in the sample diagram above, "Feature 1" and "Feature 2" could each be finished in a single Sprint. And still, Feature 1 doesn't provide any value in Sprint 1, and Feature 2 has no value in Sprint 2. So, we lose 1 Sprint of marketability on Feature 1 (our highest priority) - and on Feature 2 as well:
A perfect example of how maximizing utilization makes the value come later!

Loss of money

Imagine now that every feature costs less than it's worth (which it should - otherwise it wouldn't be worth developing) - and you see that the "saved" efficiency of having worked on features 3 and 4 before finishing feature 1 costs the company more money than the added benefit.
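A back-of-the-envelope cost-of-delay calculation makes the point. All figures below are hypothetical - plug in your own:

```python
def delay_cost(value_per_sprint, sprints_delayed):
    """Value forfeited by delivering a feature later than possible."""
    return value_per_sprint * sprints_delayed

# Hypothetical figures: Feature 1 earns value every sprint once live,
# while Tetris planning "saves" some utilization by interleaving work.
feature_1_value = 50_000    # value per sprint of the top-priority feature
utilization_saving = 8_000  # efficiency gained by keeping everyone busy

net_loss = delay_cost(feature_1_value, 1) - utilization_saving
print(net_loss)  # the delay costs far more than the utilization "saved"
```

Unless the utilization saving exceeds a full sprint of the top feature's value - which it rarely does - the "optimized" plan is a net loss.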

Efficiency loss

You may argue, "different people are working on the features, so there's no multitasking."
Yes - and no. What is happening?
Sprint Planning for Sprint 1 has to discuss 3 features: 1, 3 and 4. This means the whole team is discussing three different topics (none of which will be delivered in that Sprint.) The same happens in Dailies and Review - and potentially at the source code level as well. The feature interference may also bloat the complexity of technical configuration, deployment processes and the like.
The team becomes slower, hence less efficient.

Adding needless risk

In statistics, there's a phenomenon called "the high probability of low-probability events." Let me explain briefly: there's a vast number of almost infinitely unlikely events, and unfortunately, the probability that at least one of them occurs is still close to one: something will happen. You just don't know what, and when, so you can't prepare or mitigate. Since you don't know which aspect of your plan will be affected when a risk hits, you'll always be caught by surprise.
How is that a bigger problem in Planning Tetris than in sequentialized delivery?

Massive ripple effect

When you're working on one topic, and an event hits that affects your entire team, you have one problem to communicate. When the same happens as you're working on multiple topics, all of them are impacted, and you're generating a much stronger ripple effect.

Complex mitigation

As multiple topics are in process, you suddenly find yourself mitigating risks on multiple topics. And that means multiplied mitigation effort - less time to work, and at the same time a higher risk that not all mitigations are successful. You end up with a higher probability of not being able to get back on track!

Chaotic consequences

Both the ripple effect into the organization and the mitigating actions can lead to unpredicted consequences which are even harder to foresee than the triggering event. In many cases, the only feasible solution is to surrender, mark all started topics as delayed, and try to clean up the shards from there.



Prepare to Fail

There's Parkinson's Law - "work expands to fill the time available." That's often used as an argument to start another topic, because it stops gold-plating and keeps people focused.
But there's also the (F)Law of Averages: "Plans based on averages fail half the time."
The latter makes Planning Tetris a suicidal approach from a business perspective: it starts a vicious circle.

Predictable failure

Because there's no slack built into a Tetris plan, the mid-term plan will automatically fail as soon as a single feature turns out to be more complex than planned. The more features are part of our Tetris stack, the more likely it is that at least one of them will fail. And the team will usually get blamed for it. Because of that, we end up with

Conservative estimates

Teams must bake slack buffers into their feature estimates to reduce the probability of failure. But when a Tetris plan spans multiple Sprints, some feature content may not be "Ready" for implementation during the Sprint when the slack would be available - so Parkinson's Law kicks in, and the buffered estimates don't reduce the failure probability.

Declining throughput

At this point, Parkinson's Law tag-teams with the Flaw of Averages to KO the team: regardless of how conservative the estimates are, the team will still end up failing half the time. The consequence is that business throughput continues to decline (the decline has an interesting floor: the point where a Sprint contains only a single feature!)
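A small Monte Carlo sketch can make this tag-team visible. All numbers - and the right-skewed lognormal effort distribution - are assumptions for illustration only: buffering the estimate does lower the failure rate, but under Parkinson's Law the work expands to fill the buffer, so the average effort spent per feature rises and throughput declines:

```python
# Monte Carlo sketch: per-feature effort is drawn from a right-skewed
# lognormal distribution (an assumption for illustration; all numbers
# are invented). Under Parkinson's Law, work expands to fill the
# buffered estimate, so buffers lower the failure rate but raise the
# average effort actually spent per feature - throughput declines.
import random

random.seed(1)

AVG_EFFORT = 1.13  # approximate mean of lognormvariate(0.0, 0.5)

def simulate(buffer_factor: float, runs: int = 20_000):
    """Return (overrun probability, average effort spent) for one
    feature estimated at AVG_EFFORT * buffer_factor."""
    estimate = AVG_EFFORT * buffer_factor
    overruns, spent = 0, 0.0
    for _ in range(runs):
        true_effort = random.lognormvariate(0.0, 0.5)
        # Parkinson's Law: the work fills whatever time was estimated.
        spent += max(true_effort, estimate)
        if true_effort > estimate:
            overruns += 1
    return overruns / runs, spent / runs

for buf in (1.0, 1.5, 2.0):
    p_fail, effort = simulate(buf)
    print(f"buffer x{buf}: fails {p_fail:.0%}, avg effort spent {effort:.2f}")
```

Running this shows the vicious circle: each extra helping of buffer buys a lower failure rate at the price of more effort burned per feature.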


Strangulating the team

Let's take a look at the psychological impact of Planning Tetris now as well:

No space for Creativity

I have never seen an organization where Product Management was happy that developers would add "creative spaces" into a Tetris plan. It's all about churning out feature after feature after feature, without a pause, without a break. When one feature is done, another is already in progress. There is no room to be creative.

No space for Growth

The only relevant business outcome in Tetris plans is usually business value delivered. That ignores that developers are the human capital of the organization, and that growing them means growing the organization's ability to deliver value. Especially in the rapidly changing tech industry, not growing equals falling behind until eventually, the team is no longer competitive.

No space for Improvement

I often advise that developers should take some time to look at "Done" work to reflect how it could have been done better, and turning that better way into action. With Planning Tetris, that opportunity doesn't exist - another feature is waiting, and improving something that exists is always less important than delivering the next big thing. That often ends in terrible products which are no joy to deal with - for developers and customers alike!



Now ... what then?

The point that Planning Tetris is a terrible idea should be blatantly obvious.
"Now what's the better way then?" - you may ask.

It sounds incredibly simplistic, because it is actually that simple.
  1.  Reduce the number of features the team is working on in parallel to an absolute minimum. This minimizes the blast radius.
  2.  Instead of having people parallelize multiple topics, let supposedly "inefficient" or "less skilled" people take on easier parts of the work to step up their game. That reduces the impact of low-probability events and gives everyone air to breathe.
  3.  Put slack into the Sprints. The gained resilience can absorb impact. It also reduces the need for buffered estimates, countering both Parkinson's Law and the Flaw of Averages.
  4.  Agree on Pull-Forward. When the team feels idle, they can always pull future topics into unused time. Nobody complains when a topic is finished ahead of time; everyone complains when something finishes late. Pull-Forward has no ripple effects or chaotic consequences.
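The effect of point 1 can be shown with a toy calculation (feature sizes invented): total effort is identical either way, but sequentializing ships the first feature at week 3 instead of week 9, shrinking the window in which a risk event can hit everything at once:

```python
# Toy illustration (feature sizes invented): total effort is the same,
# but sequentializing ships the first feature at week 3 instead of
# week 9, shrinking the window in which risks can hit everything.

def finish_times_sequential(durations):
    """One feature at a time: each finishes before the next starts."""
    t, finishes = 0, []
    for d in durations:
        t += d
        finishes.append(t)
    return finishes

def finish_times_parallel(durations):
    """All features time-sliced equally: everything lands at the end."""
    total = sum(durations)
    return [total] * len(durations)

features = [3, 3, 3]  # three features, three weeks of effort each
print(finish_times_sequential(features))  # [3, 6, 9] - value flows early
print(finish_times_parallel(features))    # [9, 9, 9] - nothing ships before week 9
```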

Ok, too many words, so TL;DR:
  1. Sequentialize.
  2. Slack.
  3. Pull.
All problems mentioned in this article = solved.

Monday, November 15, 2021

From Standards to Baselines

Many - especially large - organizations are looking for standards that they want everyone to adhere to. The idea is that "standards reduce complexity." Yes and no. Unfortunately, there's a risk that standards create more complexity than they are intended to reduce. 

Let's take a look at the issue by using Scrum as a showcase. Whatever I say about Scrum will also apply to Kanban, DevOps, CI/CD - and many other topics.



The Standard

There's no argument that Scrum is a de-facto standard in the industry for many teams. Many organizations mandate that development teams must use Scrum, and rigorously enforce adherence to a company standard of Scrum. While it's entirely possible to use Scrum in this manner, this entirely misses the point of Scrum: as the mechanics of Scrum are honed to perfection, the core ideas of flexibility and continuous improvement are lost. Teams lose ownership as their way of working is externally imposed.

Teams using Scrum as a standard lose the ability to evolve beyond Scrum. Scrum becomes their mental shutter - they become unable to think in a different way.


Broken Scrum

Teams confined by Standard Scrum often feel that it is far too restrictive. Especially inexperienced teams often suffer from poorly implemented practices, which seem to have no value and just generate overhead for the team. Not being aware of the actual intent, and being unable to discern intent from practice, they proverbially throw out the baby with the bathwater: "Scrum is broken," and Scrum is discarded.

Such teams fall below the baseline of Scrum, and they think that Scrum is the problem.


The Baseline

Instead of considering Scrum as the confines within which development must be organized, a team can also perceive Scrum as their baseline. Understanding Scrum as a baseline means there's no prescription of what you must do or how to do it - it doesn't even tell you that you need to use Scrum. What it tells you is what you must have in place to be at least as good as a Scrum team could be.

For example - everyone should be able to tell at any point in time what the team's highest priority is.
And there should be closed feedback loops both for decisions and execution.
And the team should apply double-loop learning in at least monthly cycles.
And so on.

Now: what's the difference?


From Standard to Baseline

What may sound like a game of semantics makes a massive difference in practice:

Standards create restrictions. Baselines foster growth.

Standard Scrum teams often find themselves rendered ineffective - not because of Scrum, but because the standard stops Continuous Improvement as soon as it touches the rules of Scrum. Baseline Scrum teams aren't concerned with the rules of Scrum - they're concerned with being "at least as good" as Scrum would have them be. A team on a baseline of Scrum can do whatever they want. There is no rule that the team must use Scrum. Instead, Scrum becomes a list of things that help the team inspect and adapt.

For example, there is no rule such as "we must have Retrospectives." But there is a benchmark: the team should frequently re-examine their ways of working and actively work on improving their effectiveness. There are other means than a Sprint-end Retrospective to achieve this - for example, extended coffee breaks with deep discussion.

Standards can be measured and assessed based on adherence to a fixed set of rules and practices: it's all about checking the boxes.

Baselines can't be measured in this way. They must be measured based on outcomes: What is the value we're trying to get out of a standard?  And: are we getting at least that?

Measuring adherence to a standard leads to improvements primarily focused on compliance. At full compliance, the journey of change ends. Measuring baseline performance is much more complicated: there is no "true/false" answer as there is for compliance, and the scale is open-ended - it knows no "perfect," so there's always room for improvement.

 

Now what?

I would advise everyone to look at everything in the Agile domain - values, principles, practices, frameworks and even tools - as baselines: "How far do we go beyond that?"

If the answer is "Not even there," then have a discussion of how you can up your game. Maybe adopting that thing is a simple, easy way to improve?

However, if the answer is, "Already far beyond," then compliance is off your list of worries. Even if you don't have that thing, you most likely won't need it.

Monday, November 1, 2021

Four key metrics for transformation effectiveness

What matters for a company in an "Agile Transformation?" Well, let me give you my perspective. Here are four key metrics which I would advise to track and improve:


Customer satisfaction

Customer Satisfaction can be both a leading and a lagging indicator. It's leading, because it informs us how likely our next move will grow our business - and it's lagging, because it tells us how well we did in the past.
We can measure it asynchronously with tools like the Net Promoter Score or by observing Google ratings. Softer metrics include customer interviews. Modern, technological means include A/B tests, conversion rates and resubscription rates.
Regardless of which of these you measure: if you see this indicator going down, you're most likely doing something wrong.
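For those unfamiliar with its mechanics, the Net Promoter Score mentioned above is computed from 0-10 survey answers as the percentage of promoters (scores 9-10) minus the percentage of detractors (scores 0-6). A minimal sketch, with invented survey data:

```python
# Net Promoter Score sketch: NPS = %promoters (9-10) minus
# %detractors (0-6), on a -100..+100 scale. Survey data is invented.

def nps(scores: list[int]) -> float:
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

survey = [10, 9, 9, 8, 7, 7, 6, 5, 9, 10]  # ten illustrative responses
print(nps(survey))  # 30.0 -> 50% promoters, 20% detractors
```

Note that passives (7-8) drag the score down only by diluting the promoter percentage - which is exactly why a falling NPS is an early warning rather than a catastrophe signal.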

A proper change initiative relies on being able to track user satisfaction in some way and use that to inspect and adapt accordingly.

Employee happiness

Employee happiness is a pretty strong leading indicator for potential success: happy staff tend to do everything in their power to help the company succeed, because they like being part of this success. In reverse, unhappy staff are a leading indicator for many other problems that only become visible once they hit.

I'm not a huge fan of employee morale surveys: depending on the organizational culture, they get weaponized against management, so people always give scores of 100% - and bolt for the door at the next opportunity.
The minimum you can do is measure staff attrition - if your attrition rates are above industry, you're doing something wrong. If you're significantly below, you're probably doing something right.
At a more detailed level, it does indeed help to look at factors like psychological flow, change fatigue, diversity and compensation fairness, although we need to be careful that these are used as indicators for inspection and adaptation, not as the next management KPI to be gamed.
 

Throughput rate

Throughput rate is a leading indicator for capability and capacity: when you know your throughput rate and the amount of work ahead, you can predict fairly well what will happen when.
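The prediction is simple arithmetic in the spirit of Little's Law - a sketch with illustrative numbers:

```python
# Little's-Law-style arithmetic (illustrative numbers): known
# throughput plus known queue length yields a completion forecast.

def weeks_until_done(backlog_items: int, throughput_per_week: float) -> float:
    """Expected calendar time to drain the work currently ahead."""
    return backlog_items / throughput_per_week

# A value stream finishing 4 items per week with 26 items queued:
print(weeks_until_done(26, 4))  # 6.5 weeks
```

The same relationship read backwards explains the levers below: shrink the queue or raise throughput, and the forecast improves without anyone "working harder."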

The effectiveness of an organization can be tracked by looking at end-to-end throughput rates of the core value streams. Some examples are the duration from lead to conversion, from demand to delivery, or from order to cash.
Take a look at queues, lack of priority, overburdened employees and unavailable resources. By tweaking these levers, throughput rate can often be doubled or more, without any additional effort or expense.
Although it may seem counter-intuitive: the key is not to get people to "work harder," it is to eliminate wait time by having some people do less work. For that, we must understand where wait time accumulates and why it does so.


Financial Throughput

The proof of the pudding is financial throughput, a lagging indicator. Be wary - it can't be measured by department, and it can't be measured by unit. It also can't be measured exclusively by looking at the cost sheet and the amount of work done.
Financial throughput is the one key metric that determines business success: it's the rate at which we're earning money!
We have two significant levers for financial throughput: speeding up the rate at which we turn investment into returns, and reducing the size of the investments such as to bind less capital. Ultimately, combining both of these is the sweet spot.
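A toy calculation (all figures invented) shows how the two levers interact - "capital-months" bound before an investment pays back drop sharply when the same total investment is split into smaller, faster-returning increments:

```python
# Toy calculation (all figures invented): "capital-months" bound
# before an investment pays back. Smaller, faster increments bind
# far less capital than one big-bang investment of the same size.

def capital_bound(investment: float, months_to_return: float) -> float:
    """Capital-months tied up until the investment returns."""
    return investment * months_to_return

big_bang = capital_bound(600_000, 12)        # 600k invested, returns after a year
incremental = 6 * capital_bound(100_000, 2)  # six 100k slices, each returning in 2 months
print(big_bang / incremental)                # 6.0 -> 6x less capital bound
```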


How to improve

Customer and Employee satisfaction

These metrics depend mainly on the managerial system of the company: how goals are set, how people are treated. Usually, there's a strong correlation: happy employees make customers happy, and happy customers give positive feedback to employees.
Deming noted in his 14 points that management must work to "remove barriers that rob people of their right to pride of workmanship."

Transparency is one factor, getting out of the way is another. Removing policies that reduce people's ability to do the right thing is yet another. A management committed to quality, that is, fixing things that lead to poor outcomes, is vital here.
And, of course, the ultimate key here is letting people own their process.

Throughput rate

This third metric depends on flow efficiency. 
Note that "flow efficiency" is not "resource efficiency:" A process where people and/or resources operate without slack is usually dysfunctional and will falter at the slightest hiccup. Process flow requires resilience. 
Queues are a universal killer of throughput rate, so avoid queues wherever and whenever possible.

In software engineering, process efficiency is mainly determined by engineering practice: Software Craftsmanship and Continuous Delivery (CD). The former requires people to know how to develop software using the most appropriate techniques, such as Clean Code practices. The latter requires some tooling, a product architected for CD, a high commitment to quality, as well as policies and practices consistent with the intent of CD.


Financial Throughput

The final metric depends on how aligned decision-makers are with their customer base and their organization.

While financial throughput relies on the organization's operative throughput rate, we have to look at which things affect our financial throughput and enable our organization to do more of these - and quicker. In many cases, that means doing less of the things that have sub-optimal financial throughput: for example, eliminating "failure demand" (work that's only required because something else went wrong) or "null objectives" (targets which do not affect these four metrics).



And how about Agile Frameworks?

Nowhere does this article mention a specific "Agile Framework." This is not an oversight - frameworks are irrelevant, or potentially even harmful, in a discussion of business-relevant metrics. They could be a tool in the solution space; that depends on where we come from and which challenges we face.

For example, if we're challenged on engineering practice - we can't solve that with Scrum or SAFe.  Likewise, if we have customer satisfaction issues: Kanban doesn't even consider the topic.

Not even "Agile" itself is a relevant means or a relevant outcome. Working "Agile" is merely one possible approach that tends to be consistent with these metrics. Where agile values, principles, practices and mindset help us improve on these metrics, they are valuable. Where they don't, they aren't worth pursuing.


Sunday, October 3, 2021

Five reasons for having a Definition of Done

"Do we really need to waste our time to come up with a Definition of Done?"  - well: of course, you're free to do that or not. Just give me your attention for a few minutes to consider the multiple benefits you gain from having one.



Before we start
Many people use the term "Done" in reference to their share of the work, as in, "I am done." This leads to multiple, oftentimes inconsistent, definitions: "development done," "testing done," "deployment done." While this may be a choice the team makes, customers don't care who specifically is done with their own work. They care when they receive a product - hence:
The Definition of Done is not set for specific phases of the working process
- it refers to product increments that have passed all stages of the team's process.
As such, it encompasses all the usual activities related to an organization's development process and applies to every single product backlog item.

Now - what does a Definition of Done offer?

#1 - A Quality Standard

"But I thought you had performance tested it?" - "No, a performance test was not part of the acceptance criteria!" - "It's obvious that with that kind of performance, the feature is entirely unusable!" - "Then you should have requested us to performance test it!"

Well - in such a conversation, nobody is a winner: the customer is dissatisfied because they don't get what they need, developers are dissatisfied because they just got an extra load of work, and the Product Owner is dissatisfied because they ended up getting no deliverable value.

The Definition of Done aligns stakeholder expectations on product quality.

Should the customer be bothered to re-confirm their quality expectations on every single user story, and should they be required to re-state that this expectation applies for every release? No.

Once everyone is clear that something is a universal quality requirement, adding a simple statement like, "all pages load within less than a second," would make it clear that the team's work isn't done until the customer's demand is satisfied.



#2 - Common understanding

"I'm done." - "When can I have it?" - "Well, I still need to commit it, then we'll produce a build package, then it goes to testing, let's see where it goes from there ..." - "But didn't you say you were done?" - "I'm done with development."

When different people have different mental models of what "done" means, everyone uses the term in the way that is most convenient for them.

The Definition of Done defines how the term, "Done" is used in the organization.

So much organizational waste - from extended meetings, through people trying to do things that can't possibly succeed, all the way to massive management escalations - can be attributed to the misaligned use of this short, simple word: "Done."



#3 - Simpler communication

"I'm done." - "Did you do any necessary refactorings?" - "Will do later." - "If it's not refactored, you can't commit it! And did you get your code reviewed?" - "Do I really need to?" - "That's part of our policy. Where are your unit tests?" - "I don't think that module needs tests." - "Dude, without tests, it's un-maintainable! And did you check with the testers yet?" - "Why should I?" - "What if they found any problems?" --- "Sorry to disturb: I saw that the feature is marked as Done. Can I use it yet?" - "Almost." - "No!" - "Okay, I'm confused now, I'll set up a meeting later so you can explain the status to me."

When everyone understands what needs to be done to be "Done" (pun intended), communication is much simpler - long, deep probing to discover the actual condition of an individual work item becomes unnecessary.

The Definition of Done simplifies conversations.

Everyone should be clear what it means when someone uses the term "Done" - what must be understood, and what can be understood.

Everyone must understand, "When it's not covered by the DoD - you can't expect that someone did it, especially when it wasn't explicitly agreed beforehand."

Likewise, everyone can understand, "When it is covered by the DoD - you don't need to ask whether someone did it when people say it was done."



#4 - Providing clarity

"I need an online shop." - "No problem." - "Can you store the orders in the database?" - "We know how to do our job!" - "Can you make sure that baskets don't get lost when users close the browser?" - "That makes it much more complicated. How about we do a basic version now, and we'll add basket session persistence later on once you have a solid user base?" - "Can you make sure the shop has good performance?" - "What do you mean, 'good performance?'"

Stakeholders are often unaware of what the team normally does or doesn't do, and of what they can definitely expect from the product. Hence, they may communicate incomplete or overly specific requirements to the team, both of which are problematic.

Incomplete requirements lead to low-value products that lack essential features, oftentimes leading both parties to conclude that the other party is an idiot.

Overly specific requirements, on the other hand, usually lead to sub-optimal implementations and waste effort when there are easier, better ways to meet user expectations than specified by the customer.

The Definition of Done avoids over-specification on items covered in the DoD.
It likewise avoids under-specification for points not covered in the DoD.

Within the confines of the Definition of Done, the team gains the freedom to decide what to do and how to do it, as long as all aspects of the DoD are met. It allows customers to stay out of details that the team handles within the standards of their professional ethics.



#5 - Preventing spillover work

"We're done on this feature." - "Splendid. Did you do functional testing?" - "Yup." - "And?" - "3 defects." - "Are they fixed yet?" - "If we'd do that, we'd not meet the timeline, so we deferred them until after the Release." - "But ... doesn't that mean you still have work to do?" - "Not on the feature. Only on the defects." - "But don't defects mean the Acceptance Criteria aren't met?" - "The defects are so minor that they can be fixed later ..."

We see this happening in many organizations.  Unfortunately, there are two insidious problems here:

1. Based on the Pareto principle, the costs of the undone work could massively outweigh the cost of the done work, potentially toppling the product's entire business case. And nobody knows.

2. Forecasting future work is a challenge when capacity is drained in an ill-defined manner. The resulting loss of transparency decreases customer trust and generates stress within the team.

The Definition of Done ensures that there is no future work induced by past work.

The Definition of Done is a protection for the team, in that they will not accumulate a constantly rising pile of undone work which will eventually incapacitate them.

Likewise, a solid DoD protects the business, because there is a much lower risk that one day, developers will have to state that, "We can't deliver any further value until we invest a massive amount of time and money to clear our debt."



Summary

The reasons for having a Definition of Done may vary from team to team, and each person might find a different reason compelling. While it's definitely within the realm of possibility that none of the benefits outlined in this article are meaningful in your context, at least ponder whether the hour or two it takes to align on the key points of a Definition of Done is worth the amount of stress you might have to endure by not having one.

A Definition of Done is not cast in stone - it's up to negotiation, and points can be added and removed during Inspection and Adaptation events, such as team Retrospectives. As long as everyone can agree to a change, that change is legitimate.

If you don't have a DoD yet, try with a very simple one and take it from there.

As a conclusion of this article, I'll even throw in my favorite minified DoD:

No work remaining.

Sunday, August 8, 2021

The Product Owner Role

What concerns me in regards to the Product Owner role: it's so horribly diluted by many organizations that sometimes, practitioners ask me, "What's the meaning of my work, and what should I do in order to do my job well?"

There's so much garbage out there on the Internet regarding the Product Owner role that it's very difficult for someone without significant experience to discern a proper definition - one that would help companies write proper job descriptions and give PO practitioners guidance on how to improve. So let me give you my perspective.


Great Product Ownership

I would classify great Product Ownership into four key domains:


While one of the core responsibilities of the Product Owner is the Product Backlog, it should be nothing more than the expression of the Product Owner's intent. And this intent should be defined by acting on these four domains.

Product Leadership

The Product Owner should be providing vision and clarity to the team, the product's stakeholders, customers and users alike.

Product Vision

Who is better than the Product Owner at understanding what the product is, which problem it solves, why it's needed and where it's going? Great Product Owners build this vision, own it and inspire others to follow them in their pursuit of making it happen.

This vision then needs to be made specific by elaborating long-term, mid-term and short-term objectives - a guiding visionary goal, an actionable product goal and the immediate sprint goal.

Clarity of Purpose

While the Vision is often a bit lofty, developers need substantial clarity on "what happens now, what happens next?" - and customers will want to know "what's in it - for me?" The Product Owner must be crystal clear on where the product currently is, and where it's going next. They must be able to clearly articulate what the next steps are - and what the next steps are not. They need to be able to state at any given point in time what the highest priority is, and why that is so.

The Product Backlog is then the place where the PO maintains and communicates the order of upcoming objectives and content.

Communication

The Product Owner must communicate their product with both internal and external stakeholders. Life is never easy, so they must rally supporters, build rapport with sponsors, and resolve the inevitable conflicts amongst the various groups of interest.

Networking

The product can only be as successful as the support it receives. As such, the Product Owner must build a broad network of supporters, and continuously maintain and grow their influence in their organization - and for the product's market. Keeping a close eye on stakeholder satisfaction and interest, continuously re-kindling the fire of attention drawn to the product is essential in sustaining and fostering the product.

Diplomacy

As soon as multiple people are involved, there tend to be conflicts of interest. Even if there is only one single stakeholder, that stakeholder has choices to make, and may need to resolve between conflicting priorities themselves.

In peace times, the Product Owner builds common ground with stakeholders, so that they are more likely to speak positively of the product.
In times of crisis, the Product Owner understands the sources of conflict, ebbs the waves, reconciles differences, brings people together to work out positive solutions, and mends wounds.

Insight

The Product Owner is the go-to source for both the team and the product's stakeholders when they want to know something about the product's purpose or intent. The Product Owner has both factual knowledge and inspiring stories to share.

Product Knowledge

Caution - the Product Owner isn't a personified product instruction manual, and they don't need to be. Rather, they should be the person able to explain why the product currently is the way it is, and why it's going to be the way it's going to be. They must fully understand the product's capabilities and purpose - and they must be able to convey why these are good choices.
From a more negative take, the Product Owner must understand the weaknesses of the current product and have ideas how to leverage or compensate these.
And for all of this, the Product Owner should have the domain expertise, market information and hard data to back up their statements.

Storytelling

"Facts tell, stories sell." - the Product Owner's role is to sell the product, both to the team, and to the customers. They should be able to tell a relatable, realistic story of what users want to and/or are doing with the product, what their current pains are, and what their future benefits will be.
"Speaking to pain and pleasure" is the game - touch hearts and minds alike. The Product Owner should be NEAR their users, and bring developers NEAR as well.


Business Acumen

The Product Owner's primary responsibility is to maximize the value of the product, by prioritizing the highest value first, and by making economically sensible choices both in terms of obtaining funding and spending.

Value Decisions

There are three key value decisions a Product Owner faces every day:
  1. What is our value proposal - and what isn't?
  2. What value will we deliver now, and what later?
  3. What is not part of our value proposal, and will therefore not be delivered at all?
The question oftentimes isn't whether the customer needs something, but whether they need it so urgently that other things have to be deferred or be ditched.

When anyone, customer or developer alike, asks the Product Owner what is on the agenda today, this week, or this month - the Product Owner must be able to answer in a way that the underlying value statements are clear to all.

Economics

With infinite money and infinite time, you could build everything - but since we don't have that luxury, the Product Owner must make investment decisions - what is a positive business case, what is a negative business case, what can we afford to do - and what can we afford to not do?

The Product Owner should be able to understand the economic impact of any choice they make: more people can do more work, but burn the budget faster. Every feature has an opportunity cost - all the other features that get deferred because of it. Fines could be cheaper than implementations, so not everything "mandatory" must be done. These are just a few examples.
There is often no straightforward answer to "What should we spend our money on this month?" - and considering all of the trade-offs from every potential economic angle before bringing product related decisions to the team or towards stakeholders is quite a complex endeavour.

Economic decisions need to then be transported transparently towards the relevant organizational stakeholders - to team members, who may not understand where priorities come from, to customers who may not understand why they don't get their request served - to managers, who may not understand why yesterday's plan is already invalid.


Given all of these Product Owner responsibilities above, it should be quite clear that the Product Owner must focus and has little time to take care of things that are ...

Not the Product Owner's concern

Three domains are often seen as expectations on the Product Owner which are actually a distraction from their responsibilities; putting them onto the PO's shoulders steals the time they need to do the things that make them a good Product Owner:


Project Management

The Product Owner is not responsible for creating a project plan, tracking its progress or reporting status.

Let's briefly describe how this is supposed to happen:

Planning is a collaborative whole-team exercise, and while the Product Owner participates and provides context, a goal and a sorted backlog as input, they are merely contributing as the team creates their plan.

Developers are autonomous in their work, and the Product Owner should rely on being requested for feedback whenever there's visible progress or any impediments hinder the planned outcomes. If the team can't bear the responsibility of their autonomy properly, that would be a problem for the Scrum Master to tackle. The PO should entirely keep out of the work.

Since Sprint Reviews are the perfect opportunity to inspect and adapt both outcomes and progress, no status reporting should be required. A "gemba mindset" would indicate that if stakeholders are concerned about progress, they need to attend the Reviews, and should not rely on hearsay, that is, report documents. 


Team Organization

The Product Owner is not responsible for how the team works, when they have meetings, or who does what.

When Scrum is desired as a way of working, the team should have a Scrum Master. The worst thing a Product Owner can do with their time is bother with introducing, maintaining or optimizing Scrum - they should be able to rely on having proper Scrum in place.

Team events, such as Plannings or Reviews, help the team do their work, and as such, should be organized by the developers themselves, because only they know when and how they need these. The Scrum Master can support, and the Product Owner should attend - but the PO shouldn't be bothered with setting these up, and most definitely shouldn't run the entire show.

If anyone assigns tasks on a Scrum team, it's the team members self-organizing to do this. Having the Product Owner (or Scrum Master) do this job is an antipattern that will hurt the team's performance. The Product Owner should not even need to know who does what, or when.


Development Work

The Product Owner develops the product's position, not a technical solution. They have a team of experts to do this, and these experts (should) know better than the PO how to do this. That means the PO should be able to keep entirely out of design, implementation and testing.

Product Owners actively designing solutions often fall into the "premature optimization" trap, resulting in poor solutions. The best approach is to have the Product Owner collaborate with developers as needed to get sufficient clarity on how developers would proceed, but to focus fully on the "What" and keep entirely out of the "How."

When Product Owners have time for implementation, the product is most likely going to fail: while they're paying attention to the development, they aren't focusing on what's happening to the product and its customers out in the market.

Product Owners have a team of professionals around them who are supposed to deliver a high quality "Done" Increment. If their team has no quality assurance, the solution is to bring testers in, not to delegate testing to the Product Owner.

Thursday, August 5, 2021

Continuous Integration Benchmark Metrics

 While Continuous Integration should be a professional software development standard by now, many organizations struggle to set it up in a way that actually works properly.

I've created a small infographic based on data taken from the CircleCI blog - to provide an overview of the key metrics you may want to control, and some figures on what the numbers should look like when benchmarked against industry performance:



The underlying data is from 2019, as I could not find data from 2021.

Key Metrics

First things first - if you're successfully validating your build on every single changed line of code and it just takes a few seconds to get feedback, tracking the individual steps would be overkill. The metrics described in this article are intended to help you locate improvement potential when you're not there yet.


Build Frequency

Build frequency is concerned with how often you integrate code from your local environment. That matters because the assumption that your local version of the code is correct and consistent with the rest of the team's work is just that - an assumption, and one that becomes less and less tenable as time passes.

By committing and creating a verified, valid build, you reset the timer on that assumption, thereby reducing the risk of future failure and rework.

A good rule of thumb is to build at least daily per team member - the elite would validate their changes every couple of minutes! If you're not doing all of the following, you may have serious issues:

  • Commit isolated changes
  • Commit small changes
  • Validate the build on every single change instead of bulking up
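To make the metric concrete, here's a minimal sketch - the function name and sample data are my own invention, not from the CircleCI data - that derives the average number of verified builds per team member per active day from commit records:

```python
from datetime import datetime

def builds_per_member_per_day(commits):
    """commits: (author, ISO timestamp) pairs, one per verified build."""
    authors = {author for author, _ in commits}
    days = {datetime.fromisoformat(ts).date() for _, ts in commits}
    # average verified builds per team member per active day
    return len(commits) / (len(authors) * len(days))

commits = [
    ("alice", "2021-08-02T09:15:00"),
    ("alice", "2021-08-02T14:40:00"),
    ("bob",   "2021-08-02T11:05:00"),
]
print(builds_per_member_per_day(commits))  # 1.5 - barely above the "daily per member" rule of thumb
```

In practice you'd feed this from your Git or pipeline history rather than a hand-written list.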

Build Time

The time it takes from a committed change until the pipeline has successfully completed - indicating that the build is valid and ready for deployment into production.

Some organizations go insanely fast, with the top projects averaging 2 seconds from commit all the way into production - and it seems to work for them. I have no insight into whether there's much testing in the process - but hey, if their Mean Time to Restore (MTTR) on production failures is also just a couple of minutes, they have little to lose.

Well, let's talk about normal organizations - if you can go from Commit to Pass in about 3 and a half minutes, you're in the median range: half the organizations will still outperform you, half won't.

If you take longer than 28 minutes, you definitely have to improve - 95% of organizations can do better!
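If you want to benchmark yourself, a quick way is to summarize your own pipeline durations and compare them against the figures above (roughly 210 seconds for the median, 1680 seconds for the 95th percentile). A sketch with made-up run times:

```python
import statistics

def build_time_summary(durations_s):
    """durations_s: pipeline run times in seconds, commit to pass."""
    ordered = sorted(durations_s)
    # index of the (approximate) 95th percentile run
    p95_index = min(len(ordered) - 1, int(0.95 * len(ordered)))
    return {"median_s": statistics.median(ordered), "p95_s": ordered[p95_index]}

runs = [120, 180, 210, 240, 300, 360, 420, 600, 900, 1800]
summary = build_time_summary(runs)
print(summary)  # {'median_s': 330.0, 'p95_s': 1800}
```

In this invented sample, the team's median of 330 seconds is a bit behind the industry median, and the slowest runs are the obvious place to look.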


Build Failure Rate 

The percentage of committed changes causing a failure.

The specific root cause of a failure could be anything - build verification, compilation errors, failing test automation. Whatever it is, I'm amazed to learn that 30% of projects seem to have their engineering practice and IDE tooling so well under control that they don't have this problem at all - and that's great to hear.

If roughly a fifth of your changes cause a failure, you'd still pass as average - but if a third or more of your changes are causing problems, you should look to improve quickly and drastically!
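The rate itself is trivial to compute from your pipeline history. A sketch with hypothetical results, mapped against the thresholds just mentioned:

```python
def build_failure_rate(results):
    """results: one boolean per committed change, True = pipeline passed."""
    failures = sum(1 for passed in results if not passed)
    return failures / len(results)

results = [True, True, False, True, False, True, True, True, True, True]
rate = build_failure_rate(results)
print(f"{rate:.0%}")  # 20% - roughly average; a third or more would be alarming
```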


Pipeline Restoration Time

How long it takes to fix a problem in the pipeline.

Okay, failure happens. Not to everyone (see above), but to most. And when it does, you have failure demand - work that's only required because something failed. The top 10% of organizations can recover from such a failure within 10 minutes or less, so they don't sweat much when something goes awry. If you can recover within the hour, you're still average.

From there, we quickly get into a widely spread distribution - the middle of the field ranges between 3 and 18 hours, and the bottom 5% take multiple days. The massive variation between 3 and 18 hours is explained easily: if you can't fix it before EOB, there's an entire night between issue and resolution.

Nightly builds, which were a pretty decent practice just a decade ago, would immediately drop you to the median or below: not working between 6pm and 8am automatically puts you above 12 hours, which already places you at the bottom of the field.
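That calendar effect is easy to see when you compute restoration times from timestamps - a sketch with invented incident data:

```python
from datetime import datetime

def restoration_times_h(incidents):
    """incidents: (broken_at, fixed_at) ISO timestamp pairs."""
    return [
        (datetime.fromisoformat(fixed) - datetime.fromisoformat(broken)).total_seconds() / 3600
        for broken, fixed in incidents
    ]

incidents = [
    ("2021-08-02T10:00:00", "2021-08-02T10:45:00"),  # fixed the same morning
    ("2021-08-02T17:30:00", "2021-08-03T09:30:00"),  # broken at EOB, fixed next morning
]
print(restoration_times_h(incidents))  # [0.75, 16.0] - the overnight gap alone wrecks the metric
```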


First-time Fix Rate

Assuming you do have pipeline failures - which, as seen above, many don't - you occasionally need to provide a fix to return your pipeline to a green state.
If you do CI well, your only potential problem should be your latest commit. If you follow the rules on build frequency properly, the worst case is reverting your change - and when you're not certain that your fix will work, a revert is the best thing you can do to return to a valid build state.

Half the organizations seem to have this under control, while the bottom quartile still seems to enjoy a little bit of tinkering - with fixes being ineffective or leading to additional failures. 
If that's you, you have homework to do.
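For completeness, the metric itself can be sketched like this (the attempt counts are hypothetical): for each failure, count how many commits it took to return the build to green.

```python
def first_time_fix_rate(fix_attempts):
    """fix_attempts: per pipeline failure, the number of commits it took
    to return the build to green (1 = fixed on the first try)."""
    first_time = sum(1 for attempts in fix_attempts if attempts == 1)
    return first_time / len(fix_attempts)

attempts = [1, 1, 2, 1, 3]
print(first_time_fix_rate(attempts))  # 0.6 - two out of five failures needed more than one fix
```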


Deployment Frequency

The proof of the pudding: How often you successfully put an update into production.

Although Deployment Frequency clearly sits outside the core CI process, if you can't reliably and frequently deploy, you might have issues you shouldn't have.

If you want to be great, aim for moving from valid build to installed build many times a day. If you're content with average, once a day is probably still fine. If you can't manage at least one deployment a week, your deployment process ranks at the bottom of the barrel, and you have definite room for improvement.

There are many possible root causes for low deployment frequency, though: technical issues, organizational issues or just plain process issues. Depending on what they are, you're looking at an entirely different solution space: for example, improving technically won't help as long as your problem is an approval orgy with 17 different committees.
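Measuring the metric is the easy part - a sketch (deployment dates invented) that averages successful production deployments over the observed calendar span:

```python
from datetime import date

def deployments_per_day(deploy_dates):
    """deploy_dates: calendar dates of successful production deployments."""
    span_days = (max(deploy_dates) - min(deploy_dates)).days + 1
    return len(deploy_dates) / span_days

deploys = [date(2021, 8, 2), date(2021, 8, 2), date(2021, 8, 3),
           date(2021, 8, 4), date(2021, 8, 4), date(2021, 8, 6)]
print(deployments_per_day(deploys))  # 1.2 - slightly above the once-a-day average
```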


Conclusion

Continuous Integration is much more than having a pipeline.

Doing it well means:

  1. Integrating multiple times a day, preferably multiple times an hour
  2. Having such high quality that you can be pretty confident there are no failures in the process
  3. Not breaking a sweat when a failure does happen and you have to fix it
  4. Keeping your builds in a deployable condition at all times - with deployment itself so safe and effortless that you can do it multiple times a day

Thousands of companies world-wide can do that already. What's stopping you?