Friday, July 22, 2022

U-Curve Optimization doesn't apply to deployments!

Maybe you have seen this model as a suggestion how we should determine optimum batch size for deployments in software development? It's being propagated, among other places, on the official SAFe website - unfortunately, it sets people off on the wrong foot and suggests them to do the wrong thing. Hence, I'd like to correct this model -

In essence, it states is that "if you have high transaction costs for your deployments, you shouldn't deploy too often - wait for the point where the cost of delay is higher than the cost of a deployment." That makes sense, doesn't it?

The cause of Big Batches

Well - what's wrong with the model is the curve. Let's take a look at what it really looks like:

The difference

It's true that holding costs increase over time, but so do transaction costs. And they increase non-linearly. Anyone who has ever worked in IT will confirm that making a huge, massive change isn't faster, easier or cheaper than making a small change.

The amount of effort in making a deployment is usually unrelated to the amount of new features part of the deployment - the effort is determined by the amount of quality control, governance and operational activity required to put a package into production. Again, experience tells us that bigger batches don't cause less effort for QC, documentation or operations. If anything, this effort is required less often, but bigger batches typically require more tests, more documentation and more operational activity each time - and the probability of Incidents rises astronomically, which we can't exclude from the cost of change if we're halfway honest.

Metaphorically, the U-Curve graph could be interpreted as, "If exercise is tiresome, exercise less often - then you won't get tired so often. The optimum amount of exercise is going to door to receive the pizza order, but rather order half a dozen pizzas at once if the trip to the door is too exhausting, and then just eat cold pizza for a few days."

Turning back from metaphors to the world of software deployment: It's true that for some organizations, the cost of transaction exceeds the cost of holding. This means that the value produced but unavailable to users is lower than the cost of making that value available. And that means that the company is losing money while IT sits on undeployed, "finished" software. The solution, of course, can't be to wait even longer with not deploying, and losing even more money - even if that's what many IT departments do.

As shown in the model, the optimum batch size isn't achieved when the company is stuck between a rock and a hard place - finding the point where the amount of money lost by not deploying is so big that it's worth to spend a ton of money on making a deployment.

The mess

Let's look at some real world numbers from clients I have worked with.

As I hinted, some companies have complex, cumbersome deployment processes that require dozens of people weeks of work, easily costing $50000+ for a single new version. It's obvious that due to the sheer amount of time and money involved, this process happens as rarely as possible. Usually, these companies celebrate it as a success when they're able to go from quarterly releases to semiannual releases. But what happens to the value of the software in the meantime?

Just assuming that the software produced is worth the cost of production (because if it wasn't, why build it to begin with) - if the monthly cost of development is $100k, then a quarterly frequency means that the holding cost is already at $300k, and it goes up to over half a million for semiannual releases.

Given that calculation, we should assume that the optimal deployment frequency is when the holding cost reaches $50k, which would be two deployments per month. That doesn't make sense, however: when 2 deployments costs $50k each per month, then 100% of the budget would flow into deployment - of nothing.

Thus, the downward spiral begins: fewer deployments, more value lost, declining business case, pressure to deliver more, more defects, higher cost of failure, more governance, higher cost of deployments, fewer deployments ... race to the bottom!

The solution

So, how do we break free from this death spiral?

Simple: when you're playing a losing game, change the rules.

The mental model that deployments are costly and we should optimize our batch size to only deploy when the cost of deployment outweighs the holding cost is flawed. We are in that situation because we have the wrong processes to begin with. We can't keep these processes. We need to find processes that significantly reduce our deployment costs:

The cost of Continuous Deployment

Again, using real world data from a different client of mine:

This development organization had a KPI on deployment costs, and they were constantly working on making deployments more reliable, easier and faster.

Can you guess what their figures were? Given that I have anchored you at $50k before, you might think that they have optimized the process maybe to $5000 or $3000.
No! If you think so, you're off by so many orders of magnitudes that it's already funny.

I attended one of their feedback events, where they reported that they had brought down the average deployment cost from $0.09 to $0.073. Yes - less than a nickel!

This company made over 1000 deployments per day, so they were spending $73 a day, or $1460 a month, on deployments. If we calculated the accumulated cost of deployments for the whole period, they were still spending over $5000 for three months' worth of software development. But the transaction cost for each single deployment is ridiculously low.

Tell me of anything in software where the holding cost is lower than 7 Cents - and then tell me why we are building that thing? Literally: 7 Cents is mere seconds of developer time!

With a Continuous Deployment process like this, anything that's worth enough for a developer to reach for their keyboard is worth deploying without delay!

And that's the key message why the U-Curve optimization model is flawed:

Anything worth developing is worth deploying immediately.

When the cost of a single deployment is so high that anything developed isn't worth deploying immediately, you need to improve your CI/CD processes, not figure out how big you should make that batch.

If your processes, architecture, infrastructure or practices don't permit for Continuous Deployment, the correct solution is to figure out which changes you need to make so that you can continuously deploy.

Thursday, July 7, 2022

How Equivocation destroys Agile

"Scrum engages groups of people who collectively have all the skills and expertise to do the work and share or acquire such skills as needed" - What does that even mean?
This is just a door-opener for one of many equivocation fallacies which are destroying the entire "Agile" space. Let's dig into it.

Equivocation

We're talking about "equivocation" when a term is used in multiple inconsistent ways, which thereby invokes false images in the minds of the audience. A simple example of how quickly this can get out of hand: "White Supremacy is racism. Since the White House is also White, it's also a sign of insitutional racism." - it's clear that the term "White" means two different things (in one case, a mindset - in the other, a description), but it's already hard to get the thought out of our heads.

Equivocations in the "Agile" space are much more subtle, thus harder to spot, but potentially no less damaging. So, let's get into some of the equivocations that are permeating "Agile." (and I purposefully don't dig into how "Agile" itself doesn't have a single definition people would agree on)

Customer

Based on Investopedia, "A customer is an individual or business that purchases another company's goods or services. Customers are important because they drive revenues; without them, businesses cannot continue to exist."

And yet, we have this concept of "internal customers" popping up left and right.

The first Agile Principle reads, "Our highest priority is to satisfy the customer through early and continuous delivery of valuable software." So, every Agile team needs a customer - and who's that customer for most agile teams? Well, they deliver Business Support Systems, so they say that "Business is our customer!" - circular logic based on the Investopedia definition: the business is important, because without it, the business can't continue to exist. Does that make sense?

It doesn't even end there: Modern software is complex. In larger companies, we often see platform teams building platforms that other teams work on. These say, "The other development teams are our customer!" - how much more profit would we make when we get more and more development teams asking for services from our platform team while nothing in our company's relationship with the open market changes?

To quote Wikipedia, "Leading authors in management and marketing, like Peter Drucker, Philip Kotler, W. Edwards Deming, etc., have not used the term "internal customer" in their works. They consider the "customer" as a very specific role in society which represents a crucial part in the relationship between the demand and the supply. Some of the most important characteristics of any customer are that: any customer is never in a subordination line with any supplier; any customer has equal positions with the supplier within negotiations, and any customer can accept or reject any offer for a service or a product."

The equivocation fallacy is a dual use of the word "customer," in one sense meaning "entity on whom a business depends as a source of revenue" - and in another sense meaning "entity consuming the outcomes of the work done by a group of people."

Why is that a problem? Demand.

When interacting with the open market, we would like to maximize demand - that is, we would like to have more customers constantly requesting more of our product, and more features from our product - to get them addicted for more, more and more. That enables growth, and makes our company sustainable.

On the other hand, internal demand is undesirable, because it diverts resources away from serving the real customer: internal entity A doing work for internal entity B means that B must spend time with A, which B doesn't have for the customer - and A must spend time to do something for B. That costs money. This money isn't profit any more, and the time B spends with A is profit not generated: internal demand costs twice. We therefore would like to minimize internal demand.

Equivocating "customers" to include both internal and external people opens the doors for a destructive game: maximizing internal demand. It makes internal service teams feel good about themselves and their work, while the inner proceedings bleed the company dry of critical resources.

By clearly differentiating between "customers" as "those who will determine whether they will pay for what we do" and "consumers" as "those who need what we do," we avoid this pitfall.

We won't go as deeply into other equivocation fallacies - just teasing what they are, and the damage that they're doing.

Experts

I chose that little quote from the Scrum Guide as an opener on purpose, because we hear very often that "developers are the experts, trust them."

The equivocation fallacy: Scrum requires developers to be experts in order to function properly. Many Scrum folks argue that being a developer means they're an expert.

This harms the developers when they don't get the support they would sorely need, and it harms the company when the team doesn't perform properly.

Owner

Actual Owners, according to Investopedia, are "a person or entity that receives the benefits of ownership." (emphasis added)

However, when you look at Scrum's Product Owner - how often are these individuals the real beneficiaries of the stuff they allegedly "own?" They have a bunch of accountabilities and responsibilities, but rarely own anything.

The equivocation? "Being a beneficiary of" and "Being accountable for."

The consequence?

While on the one hand, an owner can do whatever they please with their property, they tend to have a self-interest in maximizing their benefits. Without this self-interest, an Owner can merely minimize the effort they have to put into whatever they are supposed to "own."

Similar things apply to other forms of "Ownership" - process ownership, code ownership, system ownership or anything else: let people reap the benefits of whatever they're supposed to own, or it's going to backfire.

Value

Investopedia defines "Value" as "the monetary, material, or assessed worth of an asset, good, or service." In the "Agile" space, however, there's a notion that a Product Increment is the value produced by the team.

That's a confusion of cause and effect: If the Product Increment has value, then the team has produced value. But merely because the team has produced an increment, it doesn't mean it is value - it could be that it actually has a negative value.

Value-Add

Not specifically an issue of "Agile," more a confusion in the entire IT industry - "value added activity" is an activity that increases the value of a product or service. Analysis, design, documentation, testing and support do not create any assets. They are therefore "non-value-added activity."

The equivocation in software development is that all these activities are also value-adding, since they're contributing to the creation of value.

Careful: Just because an activity is non-value-adding, that doesn't mean it's not required. It just means it's not increasing the value.

Think of it like this: If you do more of activity X, does the company's total value go up? If not, then X is a non-value adding activity.

The issue?

We would like to maximize our share of value-added activity, and minimize our share of non-value-added activity. By equivocating, for example, testing, to be a value-added activity, we paint a picture that testing is something we should do more of, because hey, it creates value. No. It doesn't: It's an activity necessary to create and maintain value, but all other things equal, more testing doesn't mean we're delivering more value.

Value Stream

One of the latest fads that came up with SAFe is the "value stream." It's been around since the early days of Lean (although the concept was already used by Henry Ford around a century ago.)

Yet, if you'd ask how SAFe's "Development Value Stream" is different from a Software Development Lifecycle Process, people might not know how to answer - "Isn't a value stream just a high level process?"

That's the equivocation fallacy of value streams: Whereas a value stream is defined as "the set of value-added activities between customer need and realization of value by the customer"- when we modify the meaning of "customer," "value" and "value-add," then it is nothing other than a process.

The problem?
Clearly optimized value streams are make-or-break for the success of any business. Everything is subject to an organization's core value streams. Processes need to optimize around them. When we can't discriminate the two any more and locally optimize a support process at the expense of our actual value stream, we endanger the whole company.

Closing remarks

Equivocations make improvement difficult, because the thing we're talking about may not be the thing we're talking about. Let's get clear first what we really mean when we're talking about something, remove the equivocation, find a proper label that means what it says - and improve on the things that are hidden in plain sight by using the wrong words.

Fail Fast, Move On

Pages