Monday, November 29, 2021

The "Planning Tetris" Antipattern

 "We need to utilize all of our Story Points" - that's a common dysfunction in many Scrum teams, and especially in SAFe's Agile Release Trains where teams operate on a Planning horizon of 3-5 Sprints. It results in an antipattern often called "Planning Tetris." It's extremely harmful, and here's why.


Although the above feature plan appears to be perfectly optimized, reality often looks different: all items generate value later than they potentially could - at a higher cost, in longer time and with lower efficiency!


Accumulating Work in Process

Planning Tetris often leads to people starting work on multiple topics in one Sprint and then finishing them in a later Sprint. It is resource-efficient (i.e., maximizing the utilization of available time), not throughput-efficient (i.e., maximizing the rate at which value is generated).

That leads to increased Work in Process, which is a problem for multiple reasons:

Value Denial

Just like in the sample diagram above, "Feature 1" and "Feature 2" could each be finished in a single Sprint. And still, Feature 1 doesn't provide any value in Sprint 1, and Feature 2 has no value in Sprint 2. So, we lose 1 Sprint of marketability on Feature 1 (our highest priority) - and on Feature 2 as well:
A perfect example of how utilizing the team makes the value come later!

Loss of money

Imagine now that every feature costs less than it's worth (which it should, otherwise it wouldn't be worth developing) - and you see that the "saved" efficiency of having worked on Features 3 and 4 before finishing Feature 1 costs the company more money than the added benefit.
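To make the money argument tangible, here is a minimal sketch with made-up numbers. The value figures, the five-Sprint horizon and both finishing schedules are assumptions for illustration only, not taken from the diagram. The Tetris plan is even given the benefit of finishing the low-value Feature 4 one Sprint early.

```python
# Hypothetical numbers for illustration only; none of them come from the article.
HORIZON = 5  # Sprints in the planning horizon (assumed)

# Value each feature earns per Sprint once it is live (arbitrary units, assumed).
value_per_sprint = {"Feature 1": 10, "Feature 2": 8, "Feature 3": 5, "Feature 4": 3}

# Sprint in which each feature is finished under each plan (assumed).
tetris_plan = {"Feature 1": 2, "Feature 2": 3, "Feature 3": 3, "Feature 4": 3}
sequential_plan = {"Feature 1": 1, "Feature 2": 2, "Feature 3": 3, "Feature 4": 4}


def total_value(finish_sprint, horizon=HORIZON):
    """Sum the value earned within the horizon.

    A feature starts earning in the Sprint after it is finished,
    so it earns for (horizon - finishing Sprint) Sprints.
    """
    return sum(
        value_per_sprint[name] * max(0, horizon - done)
        for name, done in finish_sprint.items()
    )


tetris = total_value(tetris_plan)
sequential = total_value(sequential_plan)
print(f"Tetris plan:     {tetris}")      # 62
print(f"Sequential plan: {sequential}")  # 77
```

With these assumed figures, pulling Feature 4 forward gains 3 units, while delaying Features 1 and 2 by one Sprint each loses 18.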

Efficiency loss

You may argue, "different people are working on the features, so there's no multitasking."
Yes - and no. What is happening?
Sprint Planning for Sprint 1 has to discuss three features: 1, 3 and 4. This means that the whole team is discussing three different topics, none of which will be delivered in that Sprint. The same happens in Dailies and Review, and potentially at the source code level as well. The feature interference may also bloat the complexity of technical configuration, deployment processes and the like.
The team becomes slower, hence less efficient.

Adding needless risk

In statistics, there's a phenomenon called "the high probability of low probability events." Let me explain briefly: There's an infinite number of almost infinitely unlikely events, but unfortunately, high infinity divided by low infinity is still a number close to one: Something will happen. You just don't know what, and when, so you can't prepare or mitigate. Since you don't know which aspect of your plan will be affected when a risk hits, you'll always be caught by surprise.
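For those who prefer numbers over infinities, here is a rough sketch of the same idea. It assumes the risks are independent of each other, and the figures (1,000 risks at 0.1% each) are invented for illustration.

```python
import math


def chance_something_happens(probabilities):
    """Probability that at least one of several independent events occurs."""
    nothing_happens = math.prod(1 - p for p in probabilities)
    return 1 - nothing_happens


# Assumed: 1,000 independent risks, each with a 0.1% chance of hitting in a Sprint.
risks = [0.001] * 1000
print(round(chance_something_happens(risks), 2))  # 0.63
```

Each single risk is negligible, yet under these assumptions the odds that at least one of them hits are better than even.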
How is that a bigger problem in Planning Tetris than in sequentialized delivery?

Massive ripple effect

When you're working on one topic, and an event hits that affects your entire team, you have one problem to communicate. When the same happens as you're working on multiple topics, all of them are impacted, and you're generating a much stronger ripple effect.

Complex mitigation

As multiple topics are in process, you suddenly find yourself mitigating multiple topics. And that means multiplicative mitigation effort - less time to work, and at the same time a higher risk that not all mitigations are successful. You end up with a higher probability of not being able to get back on track!

Chaotic consequences

Both the ripple effect into the organization and the mitigating actions could lead to unintended consequences which are even harder to predict than the triggering event. In many cases, the only feasible solution is to surrender and mark all started topics as delayed, and try to clean up the shards from there.



Prepare to Fail

There's Parkinson's Law - "work expands so as to fill the time available for its completion." That's often used as an argument to start another topic, because doing so stops gold-plating and keeps people focused.
But there's also the (F)Law of Averages: "Plans based on averages fail half the time."
The latter makes Planning Tetris a suicidal approach from a business perspective: it starts a vicious circle.

Predictable failure

Because there's no slack built into a Tetris plan, the mid-term plan will automatically fail as soon as a single feature turns out more complex than planned. The more features are part of our Tetris stack, the more likely at least one of them will fail. And the team will usually get blamed for it. Because of that, we end up with

Conservative estimates

Teams must build slack buffers into their feature estimates to reduce the probability of failure. When a Tetris plan spans multiple Sprints, some feature content may not be "Ready" for implementation during the Sprint when slack would be available - so Parkinson's Law kicks in, and the buffered estimates don't reduce the probability of failure.

Declining throughput

At this point, Parkinson's Law tag-teams with the Flaw of Averages to KO the team: Regardless of how conservative the estimates are, the team will still end up failing half the time. The consequence is that business throughput continues to decline. (There's an interesting bottom: when a Sprint contains only one feature!)


Strangulating the team

Let's take a look at the psychological impact of Planning Tetris now as well:

No space for Creativity

I have never seen an organization where Product Management was happy that developers would add "creative spaces" into a Tetris Plan. It's all about churning out feature, after feature, after feature, without a pause, without a break. When one feature is done, another is already in progress. There is no room to be creative.

No space for Growth

The only relevant business outcome in Tetris Plans is usually business value delivered. It ignores that developers are the human capital of the organization, and growing them is growing the organization's ability to deliver value. Especially in the rapidly changing tech industry, not growing equals falling back until eventually, the team is no longer competitive.

No space for Improvement

I often advise that developers should take some time to look at "Done" work, reflect on how it could have been done better, and turn that better way into action. With Planning Tetris, that opportunity doesn't exist - another feature is waiting, and improving something that exists is always less important than delivering the next big thing. That often ends in terrible products which are no joy to deal with - for developers and customers alike!



Now ... what then?

The point that Planning Tetris is a terrible idea should be blatantly obvious.
"Now what's the better way then?" - you may ask.

It sounds incredibly simplistic, because it is actually that simple.
  1.  Reduce the number of features the team is working on in parallel to an absolute minimum. This minimizes blast radius.
  2.  Instead of having people parallelize multiple topics, let "inefficient", "not-skilled" people take easier parts of the work to step up their game. That reduces the impact of low-probability events and gives everyone air to breathe.
  3.  Put slack into the Sprints. The gained resilience can absorb impact. It also reduces the need for buffered estimates, countering Parkinson's Law and the Flaw of Averages. It also gives people air to breathe.
  4.  Agree on Pull-Forward. When the team feels idle, they can always pull future topics into unused idle time. Nobody complains when a topic is finished ahead of time, everyone complains when something turns out late. Pull Forward has no ripple effects or chaotic consequences.
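
How much slack buys how much resilience (point 3) can be explored with a small simulation. This is only a sketch under invented assumptions: four features per plan, each estimated at its average effort of 10, with real effort varying uniformly by up to 3 in either direction. The function name and all parameters are hypothetical.

```python
import random


def failure_rate(slack, features=4, estimate=10.0, spread=3.0, runs=20_000, seed=1):
    """Share of simulated plans whose actual effort exceeds the planned capacity.

    Assumptions (all hypothetical): every feature is estimated at its average
    effort, real effort varies uniformly around that average, and capacity is
    the sum of the estimates plus a fraction `slack` of spare room.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    capacity = features * estimate * (1 + slack)
    failures = 0
    for _ in range(runs):
        actual = sum(
            rng.uniform(estimate - spread, estimate + spread)
            for _ in range(features)
        )
        if actual > capacity:
            failures += 1
    return failures / runs


for slack in (0.0, 0.1, 0.2):
    print(f"slack {slack:.0%}: plan fails in {failure_rate(slack):.0%} of runs")
```

Without slack, the plan built on averages fails roughly half the time, which is the Flaw of Averages at work. Under these assumptions, each bit of slack lowers that failure rate, and Pull-Forward takes care of the Sprints where the slack turns out not to be needed.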

Ok, too many words, so TL;DR:
  1. Sequentialize.
  2. Slack.
  3. Pull.
All problems mentioned in this article = solved.

Monday, November 15, 2021

From Standards to Baselines

Many - especially large - organizations are looking for standards that they want everyone to adhere to. The idea is that "standards reduce complexity." Yes and no. Unfortunately, there's a risk that standards create more complexity than they are intended to reduce. 

Let's take a look at the issue by using Scrum as a showcase. Whatever I say about Scrum will also apply to Kanban, DevOps, CI/CD - and many other topics.



The Standard

There's no argument that Scrum is a de-facto standard in the industry for many teams. Many organizations mandate that development teams must use Scrum, and rigorously enforce adherence to a company standard of Scrum. While it's entirely possible to use Scrum in this manner, this entirely misses the point of Scrum: as the mechanics of Scrum are honed to perfection, the core ideas of flexibility and continuous improvement are lost. Teams lose ownership as their way of working is externally imposed.

Teams using Scrum as a standard lose the ability to evolve beyond Scrum. Scrum becomes their mental shutter - they become unable to think in a different way.


Broken Scrum

Teams confined by Standard Scrum often feel that it is far too restrictive. Inexperienced teams in particular often suffer from poorly implemented practices, which seem to have no value and just generate overhead for the team. Not being aware of the actual intent, and being unable to tell intent from practice, they proverbially throw out the baby with the bathwater: "Scrum is broken," they conclude, and Scrum is discarded.

Such teams fall below the baseline of Scrum, and they think that Scrum is the problem.


The Baseline

Instead of considering Scrum as the confines within which development must be organized, a team can also perceive Scrum as their baseline. Understanding Scrum as a Baseline means that there's no prescription of what you must do or how to do it: it doesn't even tell you that you need to use Scrum. What it tells you is what you must have to be at least as good as a Scrum team could be.

For example - everyone should be able to tell at any point in time what the team's highest priority is.
And there should be closed feedback loops both for decisions and execution.
And the team should apply double-loop learning at least in monthly cycles.
And so on.

Now: what's the difference?


From Standard to Baseline

What may sound like a game of semantics makes a massive difference in practice:

Standards create restrictions. Baselines foster growth.

Standard Scrum teams often find themselves rendered ineffective, not because of Scrum, but because the standard stops Continuous Improvement as soon as it touches the rules of Scrum. Baseline Scrum teams aren't concerned with the rules of Scrum - they're concerned with being "at least as good" as Scrum would have them be. A team on a Baseline of Scrum can do whatever they want. There is no rule that the team must use Scrum. Instead, Scrum becomes a list of things that help the team inspect and adapt.

For example, there is no rule such as "we must have Retrospectives." But there is a benchmark - the team should frequently re-examine their ways of working, and actively work on improving their effectiveness. There are other means than a Sprint-end Retrospective to achieve this: for example, extended coffee breaks with deep discussion.

Standards can be measured and assessed based on adherence to a fixed set of rules and practices: it's all about checking the boxes.

Baselines can't be measured in this way. They must be measured based on outcomes: What is the value we're trying to get out of a standard? And: are we getting at least that?

Measuring adherence to a standard leads to improvements primarily focused on compliance. At full compliance, the journey of change ends. Measuring baseline performance is much more complicated. There is no "true/false" answer such as for compliance, and there's an open-ended scale that knows no "perfect" - there's always room for improvement.

 

Now what?

I would advise everyone to look at everything in the Agile domain - Values, Principles, Practices, Frameworks and even tools - as Baselines: "How far do we go beyond that?"

If the answer is "Not even there," then have a discussion of how you can up your game. Maybe adopting that thing is a simple, easy way to improve?

However, if the answer is, "Already far beyond," then compliance is off your list of worries. Even if you don't have that thing, you most likely won't need it.

Monday, November 1, 2021

Four key metrics for transformation effectiveness

What matters for a company in an "Agile Transformation"? Well, let me give you my perspective. Here are four key metrics which I would advise tracking and improving:


Customer satisfaction

Customer Satisfaction can be both a leading and a lagging indicator. It's leading, because it informs us how likely our next move will grow our business - and it's lagging, because it tells us how well we did in the past.
We can measure it asynchronously with tools like the Net Promoter Score or observing Google ratings. Softer metrics include customer interviews. Modern, technological means include A/B tests, conversion and resubscription rates.
Regardless of which of these you measure: if you see this indicator going down, you're most likely doing something wrong.

A proper change initiative relies on being able to track user satisfaction in some way and use that to inspect and adapt accordingly.
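
As one concrete way of tracking this, here is a minimal sketch of the usual Net Promoter Score calculation: on a 0-10 "would you recommend us?" scale, the share of promoters (9-10) minus the share of detractors (0-6). The survey answers below are invented for illustration.

```python
def net_promoter_score(ratings):
    """NPS from 0-10 answers: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Made-up survey answers, purely for illustration.
last_quarter = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
this_quarter = [9, 8, 7, 6, 6, 5, 10, 7, 4, 3]

print(net_promoter_score(last_quarter))  # 20.0
print(net_promoter_score(this_quarter))  # -30.0
```

The absolute number matters less than the trend: a drop like the one in this made-up data is the signal to inspect and adapt.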

Employee happiness

Employee happiness is a pretty strong leading indicator for potential success: happy staff tend to do everything in their power to help the company succeed, because they like being part of this success. Conversely, unhappy staff are a leading indicator for many other problems that only become visible once they hit.

I'm not a huge fan of employee morale surveys: depending on the organizational culture, they are weaponized against management, so people always give scores of 100% and then bolt for the door at the next opportunity.
The minimum you can do is measure staff attrition - if your attrition rates are above the industry average, you're doing something wrong. If you're significantly below, you're probably doing something right.
At a more detailed level, it does indeed help to look at factors like psychological flow, change fatigue, diversity and compensation fairness, although we need to be careful that these are used as indicators for inspection and adaptation, not as the next management KPI to be gamed.
 

Throughput rate

Throughput rate is a leading indicator for capability and capacity: when you know your throughput rate and the amount of work ahead, you can well predict what happens when.
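
A minimal sketch of such a prediction, assuming a steady throughput rate and work items of roughly comparable size. Both are simplifications, and the function names and figures are hypothetical.

```python
import math


def sprints_to_finish(items_ahead, throughput_per_sprint):
    """Forecast how many Sprints a queue of work takes at a steady throughput rate."""
    if throughput_per_sprint <= 0:
        raise ValueError("throughput must be positive")
    return math.ceil(items_ahead / throughput_per_sprint)


def finishing_sprint(position_in_queue, throughput_per_sprint):
    """Sprint in which the item at a given (1-based) queue position gets done."""
    return sprints_to_finish(position_in_queue, throughput_per_sprint)


# Assumed figures: 45 items in the queue, 6 items finished per Sprint.
print(sprints_to_finish(45, 6))  # 8
print(finishing_sprint(13, 6))   # 3
```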

The effectiveness of an organization can be tracked by looking at end-to-end throughput rates of the core value streams. Some examples are the duration from lead to conversion, from demand to delivery, or from order to cash.
Take a look at queues, lack of priority, overburdened employees and unavailable resources. By tweaking these levers, throughput rate can often be doubled or more, without any additional effort or expense.
Although it may seem counter-intuitive: the key is not to get people to "work harder," it is to eliminate wait time by having some people do less work. For that, we must understand where wait time accumulates and why it does so.


Financial Throughput

The proof of the pudding is financial throughput, a lagging indicator. Be wary - it can't be measured by department, and it can't be measured by unit. It also can't be measured exclusively by looking at the cost sheet and the amount of work done.
Financial throughput is the one key metric that determines business success: it's the rate at which we're earning money!
We have two significant levers for financial throughput: speeding up the rate at which we turn investment into returns, and reducing the size of the investments such as to bind less capital. Ultimately, combining both of these is the sweet spot.


How to improve

Customer and Employee satisfaction

These metrics depend mainly on the managerial system of the company: how goals are set, how people are treated. Usually, there's a strong correlation: happy employees make customers happy, and happy customers give positive feedback to employees.
Deming noted in his 14 points that management must work to "remove barriers that rob people of their right to pride of workmanship."

Transparency is one factor, getting out of the way is another. Removing policies that reduce people's ability to do the right thing is yet another. A management committed to quality, that is, fixing things that lead to poor outcomes, is vital here.
And, of course, the ultimate key here is letting people own their process.

Throughput rate

This third metric depends on flow efficiency. 
Note that "flow efficiency" is not "resource efficiency": A process where people and/or resources operate without slack is usually dysfunctional and will falter at the slightest hiccup. Process flow requires resilience.
Queues are a universal killer of throughput rate, so avoid queues wherever and whenever possible.
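
To put a number on it: flow efficiency is commonly calculated as the share of an item's total lead time during which someone actually works on it. The sketch below uses that common definition and invented figures.

```python
def flow_efficiency(work_time, wait_time):
    """Share of total lead time during which the item is actually worked on."""
    lead_time = work_time + wait_time
    if lead_time <= 0:
        raise ValueError("lead time must be positive")
    return work_time / lead_time


# Assumed: an item needs 3 days of work but spends 12 days waiting in queues.
print(f"{flow_efficiency(3, 12):.0%}")  # 20%

# Halving the queue time: same work, same people, lead time drops from 15 to 9 days.
print(f"{flow_efficiency(3, 6):.0%}")   # 33%
```

In this made-up case, nobody worked any faster, yet the item arrives six days earlier. That is why queues, not effort, are the first place to look.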

In software engineering, process efficiency is mainly determined by engineering practice: Software Craftsmanship and Continuous Delivery (CD). The former requires people to know how to develop software using the most appropriate techniques, such as Clean Code practice. The latter requires some tooling, a product architected for CD, a high commitment to quality as well as policies and practices consistent with the intent of CD.


Financial Throughput

The final metric depends on how aligned decision-makers are with their customer base and their organization.

While financial throughput relies on the organization's operative throughput rate, we have to look at which things affect our financial throughput and enable our organization to do more of these - and quicker. And in many cases, that means doing less of the things that have sub-optimal financial throughput. For example, eliminating "failure demand" (work that's only required because something else went wrong) or "null objectives" (targets which do not affect these four metrics).



And how about Agile Frameworks?

Nowhere does this article mention a specific "Agile Framework." This is not an oversight - frameworks are irrelevant, or potentially even harmful, in the discussion of business relevant metrics. They could be a tool in the solution space. That depends on where we come from and which challenges we face.

For example, if we're challenged on engineering practice - we can't solve that with Scrum or SAFe. Likewise, if we have customer satisfaction issues, Kanban doesn't even consider the topic.

Even "Agile" is neither a relevant means nor a relevant outcome. Working "Agile" is merely one possible approach that tends to be consistent with these metrics. Where agile values, principles, practices and mindset help us improve on these metrics, they are valuable. But when that is not the case, they aren't worth pursuing.