Monday, September 10, 2018

Why one size of "Agile" can't fit all

Have you ever wondered why there isn't one "Agile" approach that works everywhere? After all, being agile is about being flexible - so why not invent a sufficiently flexible method that's suitable in every context?

This model is an attempt at explaining:

Your context defines your approach!


The domain of Understanding

On the horizontal axis, we have the domain of "Understanding". It's what we know and understand about the things we do. Discounting dumb mistakes, we can just assume that we put our understanding to good use.

Known

In the Known, we're very clear about what the outcome of our actions will be, and therefore which action is best - for example, what happens when we produce 100,000 more units of a product we have already produced 1,000,000 of. That is mass production, factory work.

Working in the Known domain basically centers on doing the same thing as well as we can.
We try to minimize cost and effort in our attempts to maximize productivity.

Dwelling in the Known is about optimizing.

Unknown

In the Unknown, we either lack prerequisite information on the consequence of our actions - or there are simply too many possible outcomes to predict which one we will receive.

Working with the Unknown tends to boil down to eliminating potential undesirable outcomes until the only possible outcomes left are desirable. It means doing variations of things we know until we know what works best - and what doesn't.
We try to maximize the result of doing something successful while systematically evading the things we know to be unsuccessful - we explore.

Dwelling in the Unknown is about exploring.

Unknowable

The Unknowable is the realm where regardless of how much information we had, we still couldn't predict the outcome. The Unknowable is full of opportunities - and that's what makes it attractive. Untapped business opportunities only reside in this domain. A new board of Minesweeper would be a case of the "Unknowable".

An example of the Unknowable would be next week's lottery numbers - a future action has to occur before we have a chance of knowing the result of our action.

Working with the Unknowable is taking best guesses at what might potentially work and hoping that we don't hit on something we can't recoup from. It means trying out something we haven't tried before - and learning from experience.

Dwelling in the Unknowable is about discovering.


The domain of Risk

On the vertical axis, we have the domain of "Risk", or "Control". It's how well we can influence the outcome of our actions, how likely a predicted result can be produced. Again, discounting dumb mistakes, we can just assume that our controls work as designed.

Controlled

Working with controlled processes, we would deterministically get what we bargained for. For example, dropping a needle will always result in a needle on the floor, everything else would be outlandish.

Controlled processes can be fine-tuned. We try to eliminate any activity that doesn't result in the desired outcome and introduce more activity that does get us the desired outcome.

Dwelling in the Controlled is about simplifying.

Uncontrolled

There's a lot of uncontrolled stuff - things that either simply aren't likely enough to warrant a control action or things that are off our radar. A typical example of the Uncontrolled may be the likelihood of a nuclear meltdown causing fallout and radiation poisoning our Lead Developer. We might simply choose to keep this risk uncontrolled, because investing in hazmat suits for our office doesn't sound like the most sensible way of spending money.

Uncontrolled factors in our processes are risky. There are risks we choose to take knowingly - and risks we simply forgot about. While it's careless to knowingly take high-impact risks that could easily be mitigated, we can never eliminate all the risks we can think of.

Uncontrolled processes can be improved by reducing the impact of variation.

Dwelling in the Uncontrolled is about standardizing.

Uncontrollable

And finally, there's the uncontrollable stuff. Those are things we couldn't control even if we tried to - for example, flu season. We can get flu shots, but we can't guarantee this year's virus isn't resistant to the vaccine. We have to live with the Uncontrollable and hope it doesn't kill us.

At best, we can try to pull things from the Uncontrollable into the realm of the Controllable. That may work best by changing our strategy to not rely on uncontrollable things.

Dwelling in the Uncontrollable is about stabilizing.



The Domains

Having explained the model, let's take a really short look at each of the quadrants:


Known

Competing in this area, the winner is determined mostly by efficiency.

Known & Controlled

Here, we can reliably plan and trust in the plan's execution. The best way forward is to create the best plan we can think of and follow it through.
Given two competing organizations, the one with the most efficient process wins.


Known & Uncontrolled

Our plan may be disrupted by day-to-day events, slowing us down, leading to extra costs and causing inefficiency. The best way forward is to learn from mistakes, improve and return to the plan.
Given two competing organizations, the one who is best at eliminating errors wins.


Known & Uncontrollable

Our plan can also be disrupted by events outside our sphere of influence and control. This may throw us off track, requiring major effort to return to plan. The best strategy is to keep a watchful eye on anything Uncontrollable that infringes on our process and design the process to be resilient to the disruptions we identify.

Given two competing organizations, the one with the most resilience wins.




Unknown

Competing in this area, the winner is determined by speed of learning.

Unknown & Controlled

We have a clear way of taking steps toward something that may or may not turn out to be useful - so we use simple experiments to determine our path. We make the Unknown Known, then optimize.

Given two competing organizations, the one who can turn experiments into results faster wins.


Unknown & Uncontrolled

There is unpredictability in both process and in the results. Our best strategy forward is one of small steps to minimize the risk of having variance leading to undesirable outcomes. We build upon good results and backtrack from bad results.

Given two competing organizations, the one that is most rigorous in dealing with problems wins.


Unknown & Uncontrollable

There's a constant danger that we're thrown off-track, and we don't even know what the right track is, much less what the "best" track would be. We need to safeguard our path with measurements, as the path might collapse right under our feet. We constantly need to innovate just to maintain what we already have.

Given two competing organizations, the one that can adapt to circumstance best wins.



Unknowable

In this area, those who happen not to stumble upon anything of value lose by default. On the other hand, there's no guarantee of winning, either.


Unknowable & Controlled

Science is full of the Unknowable. We hypothesize, we explore - and we see if we're right. The most solid guesses lead to something we can build upon. When we discover something of value, we still need to explore the context until we have something workable.

There is no predictable or reproducible winning strategy, but if we have a means of turning discovered value into profit, we're more likely to succeed.


Unknowable & Uncontrolled

Being exposed to an unfamiliar environment requires us to make the best use out of whatever means we have. If we can't manage to get into familiar terrain, we might at least familiarize ourselves with the terrain, then work from there.

Those who lose have no success story to tell, while those who won oftentimes make it look like their success was planned all along.

Unknowable & Uncontrollable

There is no predictable strategy for anything. We can try something - and if it works, we try more of it to see if it still works. And just because something didn't work, it doesn't mean it was bad. It might have been the right thing turning out wrong.

In this realm, everyone who manages to survive is a winner.



Summary

What does "succeeding with agile" mean? It depends on what you're looking for.
Those who operate in known, controlled circumstances will be much more successful with improved planning - while those who operate in unknowable, uncontrolled circumstances are served best by not wasting time on planning.
Those who try to adapt to external risk in known processes need a different strategy than those who explore in a lab environment.


None of these environments is pure. Oftentimes, we have some things we know and control, while other things are neither known nor controllable. It would depend on what the ratio of these areas is in our work - and where we want that ratio to be.

What does that mean for agility?

We need to determine which circumstances we are in, then tailor our approach to doing that which is most likely to be a winning strategy. And we must know many different approaches, in order to select that which helps us best.

We're not served well by introducing Scrum into a Known/Controlled environment that would perform better with a linear approach - and we're not served well by introducing SAFe into an environment that would have their problems solved by Six Sigma. Likewise, we're not helping anyone by putting Kanban on top of an undetermined, unpredictable process.

What is appropriate where - fully depends on where we are and where we want to go.

And what does that mean for agile practitioners?

There is no universal context that makes any approach valid or invalid. As practitioners, we must:

  • Leave our dogmatism
  • Tailor "agility" to context
  • Understand and remove hindering constraints
  • Acknowledge Unknowns and seek help
  • Remember that "agility" isn't the end goal
  • Recognise that there is more to "agile" than "agile"

And that's why we came up with Agnostic Agile.

Saturday, September 1, 2018

Why "Agile" rarely works

Have you ever wondered why every organization wants to be agile, yet very few managers are? 
In this article, I will explore a highly philosophical model to attempt an answer.

TL;DR: Because it means changing how we see reality - and that's a price few are willing to pay!


Let's explore the model:


Yes, the main terms are German - because I realized that the English language uses the same term for "how we see the world" and "how we think the world is" - a distinction that clearly exists in the German language. Then again, I use the term "world model" - so let's go with English. Let's take a look at the model from right to left. I will just gloss over the deep concepts behind these terms, because this is a quick glance rather than a scientific essay.



Reality

Reality does not care for us - it's just us who are affected by it. The better we get at predicting how reality will respond to our interactions, the more we start to believe that we are "right" about reality - while essentially, it's just congruence between our world model and reality around us.
For example, a team might just do what they do - oftentimes oblivious of their managers' thoughts and without regard to whether any manager is even around.


Impressions

As we observe or interact with reality, it affects us - what we feel, see, how we classify things, and what we would do next.
Our impressions are strongly filtered: First, we only receive a limited amount of impressions - essential information may exist outside our impressions. Second, we have a strong tendency to only receive those impressions we are looking for.

As a specific example: A traditional manager observes an agile team in action. The manager might get the impression that "this team is working laissez-faire" because nobody is checking on people, and might also get the impression that "this can't work" because there is no visible hierarchy in the team.

To change our ways, it's quite important to discuss the impressions we receive, in order to learn where we lack "the big picture" or we're over-emphasizing details.

World view

Our world view is the main filter of the impressions we receive. Every impression consistent with our world view will be forwarded into our conscious thinking, while impressions that are inconsistent with our world view will either be reinterpreted until they fit - or they will be discarded outright.
In this sense, our world view is the "eye" through which we obtain information. A narrow world view will imply that few impressions will go through unfiltered - while a broad world view will allow us to receive a lot more impressions.

Giving an example again, when an agile team decides to abolish progress reporting in favour of live product reviews, their manager must first be open to the idea that "working software is the (only relevant) measure of progress". As long as the manager's world view does not allow measuring progress in terms of production-ready software, they will discount both the benefits of interacting with users and the additional productivity obtained by not tracking work. Instead, their world view will make them see all the problems encountered by the team as caused by not following "a proper process".

The challenge with our world view is that it is strongly related to our world model (which is why in English, that would be the same term): what we see depends on what we can see. As long as our model does not permit us to process a different perception, our view will be limited to perceptions consistent with our model.

Genuine change, then, requires us to at least permit the possibility that our model is incomplete or even "wrong".


Perception

The German word for "perception" (Wahrnehmung) literally translates as "taking as true": we can only take as true that which makes a true claim within our world model - i.e., only that which is both consistent with what we already call "true" and also within the spectrum of what we can classify as "true".
Problems arise when we have already accepted false claims as "true" - we will discard or re-classify perceptions that are actually based on relevant impressions.

To take this out of the abstract realm, as long as we swallow the idea wholesale that "order is good, chaos is bad", we will never be able to appreciate the shaping and creating power of change - because change means that we change away from that which we consider "ordered" into something that, based on our current understanding, might be "chaos". Specifically, a self-organized team may not have a spokesperson or team leader at all. Such a team appears "totally chaotic" from the perspective of a manager who is used to corresponding only with team leads - and it will be very challenging to accept such a team as mature.

As long as we rely on an immutable world model, it's really difficult to see the benefits of conditions that don't fit our model. Our perception of things that others consider "good" might be "bad", and we will classify situations accordingly.


World model

At the core of everything we see and do is our own world model. As hinted above, our world model has already decided whether any impression we receive is "true", "possibly true" or "false". Our world model helps us determine whether reality as we perceive it is "good" or "bad" and what we should then do.
The more static our world model is, the more binary this classification will be - and the simpler our decisions become. Or, in other terms, "confidence" and "certainty" depend on a rather static world model, while ideas such as "doubt" or "hesitation" are related to a shifting or shaken world model.

Let's talk about me in this example: A few years ago, I was certain that clear process definitions would solve all business problems. Today, I hold the perspective that clear process definitions in a changing world are the cause of all business problems - we need to be flexible to deal with situations that haven't happened before (and may never happen again).



Summary


"Agility" might have to shake up our world model - as long as we're striving for certainty, we apply perception filters and biases that make situationally right choices invisible and lead us off-track. At the same time, the price of adjusting and softening up our world model may be high: we may need to admit that that which we fought for, sacrificed for, stood for, are no longer valid.

And that's the difficulty with agility: before we can reap the benefits of being agile, we ourselves need to adjust our world model to be ready to be agile.




Sunday, July 15, 2018

Agility isn't for everyone!

A lot of conflict in the workplace is caused by different expectations regarding the nature of the work. And agilists may not even be helping - they might just make it worse!
Here's why:

The Stacey Matrix - annotated with character traits.

Character traits per domain


Simple work gives confidence to people who excel at tasks that others may consider "chores". Although workplace automation has abolished a lot of simple work, there are still areas where well-defined, routine processes are commonplace. The most important characteristic to success in this domain is diligence - getting stuff done properly.

Complicated work relies on getting multiple pieces of work executed correctly and in the proper sequence. This requires good coordination - putting multiple pieces of the puzzle together in the most effective way.

Complex work means that there is no one known best way of doing things, and there is no one specific goal to attain, either. Even though most of today's knowledge work occurs in this domain, people easily get irritated when they "just don't know" and still need to produce results. The essential trait here is creativity - coming up with a solution in the face of the unknown.

Chaotic work occurs when there is no clear-cut way of doing things. Many people feel challenged working under such conditions, as the constant barrage of new information often invalidates former achievements. Resilience and a high amount of flexibility helps - changing direction whenever it makes sense!


The problem with "Projects"

The complex and chaotic domains are the places where projects crumble: The base assumptions of the project fall apart as soon as people start doing work. The coordinative ability of the Project Manager is of little help when the tasks to coordinate aren't helping achieve meaningful goals. Likewise, the most diligent worker isn't helping the company when the work isn't even going in the right direction.

It's extremely difficult to run a development project with the premise that project management is merely the meta-task of coordinating development tasks, as nothing would need to be developed if everything was clear to begin with.

A lack of flexibility often causes projects to fail in the sense that the outcome is no longer needed by the time the project is done.
Likewise, a lack of creativity often causes projects to turn Red - objectives can't be met by following the plan. In unfortunate cases, the only creativity on a project team might be the project manager's ability to find excuses for the poor outcome.

Unless projects have people who exhibit the flexibility to deal with new information - and the creativity to do without proper processes or still do something useful when goals become invalid - the project is in trouble.


The problem with "Agile"

"Agile" dogma often seems to presume that all work requires flexbility, and that all workers are flexible.
Both premises are invalid. Not only are flexible, creative workers a rarity rather than a commodity - working in this domain should be an exception rather than the norm.
Creativity is often needed to pull unknown stuff into an area where slices of known work can be coordinated and executed, but that work still needs to get done.
Highly flexible people often enjoy the streaks of chaos that allow them to innovate - and they may not enjoy the grind and routine of doing the base work.

In a healthy team, there has to be a place for people who are diligent, for those who are good at coordinating stuff, for those who are creative - and for those who enjoy the whimsy of the Unknown. Put together, such a team can be extremely effective.



Summary

We need to respect that not everyone is creative and that some prefer routine - and we need to respect those who can't bear routine work and their drive for change. And we are well advised to neither compare nor mix up such work: it's just too different.


Try discovering where people see their favorite work and help them find their place accordingly.

Avoid creating a culture where people enjoying only one type of work feel left behind. It might create a dangerous monoculture!



Order vs. Chaos - a look at the science

Order creates a feeling of safety. Chaos, on the other hand, is a term with negative connotation.
But is order really what we want?

Using an analogy from material sciences, let's take a different look at the struggle of "order versus chaos".


Take a look at this model:


Order is the "frozen state" and Chaos is the "volatile state". I use this example on purpose: Kurt Lewin has proposed a 3-state change model, "Unfreeze-Change-Refreeze". It perfectly correlates to material sciences.

Lewin assumes that organizational structures and/or processes are best kept in a "frozen state". And in many organizations, that's true. Does it need to be true? Let's take a step by step look by examining the states first.


Different states

Frozen = Ordered

A frozen organization is ordered. It doesn't matter how effective or efficient the organization is - things are clear. Well, maybe not so much, but still. There are known structures to uphold, there are known people to address, there's a known protocol to follow and known processes to apply.
This makes things simple: If A, then B.
I go to work in the morning, know what I will be doing - and even when I take a week off, when I come back, I know the exact state where I will resume.

Terms used to describe a frozen state include "predictable", "reliable" and "convenient". All of these words carry positive connotations for workers and managers alike.

Hence, the desire to freeze the organizational system.

Along comes a troublemaker. A person who does things differently. Who won't accept "that's how we've always done it" as a reason. Who breaks rules to get stuff done. Who bypasses hierarchies getting in the way of success. Who, without further ado, messes with people following dysfunctional processes. Like - me.

In a frozen state, the organization will consider such a person a dangerous foreign body needing to be dealt with. A single person faced with frozen organizations will be forced into one of two choices: adapt - or get out. That's why change initiatives are well-nigh impossible without a strong guiding coalition as proposed by Kotter: It's not about doing things differently, it's about accumulating a critical mass of people with sufficient power to unfreeze the system before even bothering to really go on with the change agenda.


Liquid = Nonlinear

Getting out of a solid, frozen state in physics requires dissolving the bonds between system components. Depending on what you want to change, you may need to un-link structure, people and processes alike - simultaneously!

Dissolving links means that communication structures will no longer work as before, processes will no longer produce the same outcome - and results on any level may change. The bigger the incision, the more likely unpredicted side effects will occur.

People lose the feelings associated with an ordered state - predictability wanes, reliability is reduced, and people start experiencing the discomfort of needing to think and connect dots differently. When essential, strong links haven't been dissolved, the system returns to its former stable state as soon as possible and may even develop resistance strategies - an inoculation to ward off future change.

Going back to our physics analogy, we need to invest energy into an ordered system to unfreeze it. The amount of energy needed to make a permanent change in an ordered structure directly correlates with the strength of the links which need to be broken.

Let's take the example of water:
It takes a rather limited amount of energy to melt an ice cube.
With significantly more energy, we could split the molecular bonds and turn the ice cube into H2 and O2.
Should we desire to re-form the very atomic bonds and turn the hydrogen atoms into helium, we need to invest far more energy in a much riskier and more complicated process.

In either case, the former state will cease to exist - and that's where reactions such as fear, grief (Kübler-Ross) and entitlement come in.


Volatile = Chaotic

Volatile systems do not display the characteristics of an ordered system. The high energy inherent to each particle allows them to move rapidly - making it extremely difficult to predict the next state of even a single component in the system, much less enabling methodic control on the overall system.
In a chaotic state, change doesn't require an investment - it happens continuously and without a trigger. Change isn't something that needs to be initiated; rather, it needs to be channeled to produce something desirable.

In a volatile state, the system's drive to change is bigger than any component's capability to stabilize, hence no change is permanent.

A molecule in a gaseous state would not consider this state "anomalous" - rather, it would need to be deprived of its energy before entering any other state.

And this is an important part of volatile systems - the components have high amounts of energy!

Like order is produced by draining energy from components, chaos is produced by energizing components.


State transitions

Material state transitions can be measured and predicted with high accuracy - something not quite as simple when dealing with an organizational system composed of a complex structure including many individual people, highly interrelated processes and a potentially innumerable number of technological dependencies.

State transitions in a frozen organization are often considered undesirable, as people are afraid that something will get broken for good:

The stronger an organization's "frozen" state becomes, the more energy is required to reach the "unfrozen" state that makes a transition possible. Combined with the complexity of the system, the amount of energy required for a successful unfreeze might be roughly the same amount required to enter a volatile state - and depending on the size of the desired change, an excursion into a chaotic state may be required.

For those unfamiliar with the chaotic domain, there is the danger both that chaos gets out of hand and that, when the system settles into a stable state again, that state is undesirable.



The false Order-Chaos dichotomy

Frozen systems are stable, but they aren't very malleable. The key characteristic of frozen systems is their lack of energy. An organization with stable structures that lack energy is in constant danger of being stuck in the wrong place - and won't be able to make the necessary move to remain sustainable.

On the other end of the spectrum, chaotic systems aren't stable - and equally un-malleable, albeit for a different reason. The key characteristic of volatile systems is their high energy - and therefore the impossibility of ever nailing anything down.

It's not so easy to say whether order or chaos is better - that would depend on what you want to achieve. Based on the points above, we should realize that we're not forced to choose between order and chaos. The choice is a false dichotomy.
Just because water isn't ice, it doesn't mean that you'll die from steam burns.
It could equally be a refreshing glass of cool water or a warm cup ready to make some tea.

The idea "If it's not ordered, then it's chaos" is nothing more than a false dichotomy: There's a large spectrum of conditions between frozen and volatile - and while neither of the two extremes is specifically habitable, the range inbetween very well is.

Get used to change

When you look at a glass of water, you can't tell where every single molecule is. Even if you knew where it was five seconds ago, you can hardly tell where it will be in a few seconds. At the same time, you can tell a few things with fairly high confidence:

  • You're looking at a glass of water
  • The molecule in question is within that glass of water
  • That molecule will still be within that glass of water in a few seconds.


You can even make more reliable predictions:

  • If you throw a pebble into the glass, the water will (for the most part) remain where it was.
  • The pebble will not significantly affect the way your water behaves.
  • When you take the pebble out, the water will look like it did before.

Interpreting the analogy, the liquid, non-linear state is both more flexible than the ordered state and more reliable than the chaotic state.

If you want to keep your organization robust to outside interference, you need to abolish the idea that everything needs to be in place: The ordered state can't deal with changing circumstances. 

In an ordered state, Lewin's "unfreeze-change-refreeze" process is essential to adapt.
In a non-linear state, there is nothing to unfreeze, and nothing in need of being re-frozen.


The practical interpretation

We talk a lot about "Agile Transformation", and in the minds of people this is unfreezing a non-agile organization, changing it towards an agile organization, then refreezing it in its agile state.
There is no such thing.

The agile organization is like water, constantly in a nonlinear state. Change in an agile organization isn't a project. Instead, everyone and everything within an agile organization is constantly subject to change.

Every person thinks about new, better ways to achieve things every day - every process can be scrutinized and modified at any time. When outward circumstances change, the agile organization doesn't start a massive adaption initiative, they just do what it takes to deal with the new circumstance.

And that's why you can't "buy agile" - to be agile, you must be in that liquid, transient state - at all times!





Thursday, July 5, 2018

"Googlewins Law" - The Google Argument

Maybe you've encountered "The Google Argument" before. I call it "Googlewin's Law". What is it, and how does it damage dialogue?



In homage to Godwin's Law, I call for "Googlewin's Law", and would phrase it like this:

"As a technical discussion grows longer, the probability of a comparison involving Google approaches 1"

I have observed an emerging trend that when meaningful arguments run low, someone "pulls a Google" (alternatively LinkedIn, Amazon, Facebook) in an attempt to land a winning strike. Most often, however, the invocation of Google is nothing more than a fallacy.
Here are the three most common uses of the Google Argument:

The positive Google Argument

When developers want to do something which could be called "nerdfest" and run out of meaningful arguments why doing this is a good idea, they invoke the positive Google argument:
"We could become the next Google with this". 
Typical invocations could be: "With this ranking algorithm, we could become the next Google!" - "With this sales platform, we could become the next Amazon!"

Here is why it's a fallacy:

Google never tried to become great; they tried to do something that happened to work, and because they did that exceedingly well in all domains - from technical implementation through marketing and sales all the way to customer service - they succeeded. Oh, and they happened to have great seed funding.
Google did not become great because of one good technology, they became great because they happened to do a whole lot of other things right as well in a market where one stupid move can cost you everything.

So the next time someone pulls a positive Google on you, just ask: "What makes you so sure we won't become the next Blockbuster with that idea?"


The negative Google argument

The opposite of the positive Google argument, used as a killer argument against any form of innovation or change is the negative Google argument:

"We don't need this. We are not Google".
Typical invocations sound like: "Continuous Integration? We're not Google!" - "Microservices? We're not Google!" - "Virtual Machines? We're not Google!"

Here is why it's a fallacy:

Not everything Google does is only helpful for Google. Google uses a lot of techniques and technologies that help them achieve their mission and goals more easily and more effectively.
Google has even created quite a number of useful tools, frameworks and techniques that are available open source (such as Angular) simply because they are useful.
If everything that made Google successful was anathema, you shouldn't even be using computers!



The appeal to Google

When lacking evidence or sound arguments, what's more convenient than invoking the name of a billion-dollar-company to make your case? Who could argue against an appeal to Google:

"Google also does this." - "Google invented this!"
Typical invocations would be: "Of course we need a distributed server farm. Just look at Google, they also do that!" - "Our product search page needs semantic interpretations. Google also does this!"

Here is why it's a fallacy:

First and foremost, unless you're in the business of selling advertisement space on one of the world's most frequented websites, chances are you're not going to make profit the way Google does.
Second, Google can afford a technology infrastructure that costs billions, because that's also what generates the revenue. There's an old Latin proverb, "quod licet Iovi, non licet bovi" (lit. "what is permitted to Jupiter is not permitted to the ox").
Third, Google has many billions of dollars to invest. It doesn't hurt Google to sink $100m into a promising, yet ultimately unsuccessful innovation. I mean, yes, it hurts, but it's not lethal. Can your business afford sinking $100m for zero returns? If so, you can appeal to Google; otherwise I'd be cautious.



Summary


The next time someone invokes Google, Facebook, Amazon, LinkedIn or even companies like Zappos, Spotify or whatever - think of Googlewin's Law.

What worked for others has no guarantee of working for you - and even though you are not them, not everything they do is bad (such as, for example, breathing!).
Google is not a reason either way.

Feel free to ask, "Can you rephrase that statement with a comprehensible reason that has a connection to our business?"


Sunday, July 1, 2018

The system: People

To successfully change the culture within an organization, we need to understand the system we are operating in and how people within that system interact. This isn't as simple as it looks, so here is a standard map ...

People who are part of the system
People and Interactions over Processes and Tools ...

The above is a proximity model, where people who are more likely to interact are drawn in close vicinity. YMMV.


Your team

The first circle is your team. In case of Scrum, that would be the developers, the Product Owner and the Scrum Master.
Who interacts with whom, and what should these interactions look like? The Scrum Guide has a few things to say about mandatory Scrum interactions - for example, Planning, Refinement, Review or Retrospectives. Those are the objective, process-driven interactions.
What the Scrum Guide can't help you with: Who likes whom, who can work well together - and: which factors bring people closer together or further apart?
Detractors could be obvious stuff like different work ethics, fandom (Star Wars vs. Trekkies, for example) and skill animosities (e.g., tester vs. coder), but also hidden things like subtle bigotry, not liking another person's choice of diet (e.g., garlic/onions), etc.
All of these will affect how well your team can interact. The good news? All of this is within your team's sphere of control - as long as you can talk about it, you can find a way to knit together.


Your organization

The second circle is your organization. Most traditional enterprises have roles such as team leads, middle managers, senior managers, finance, HR, marketing, sales, customer service etc.
All of these people will affect your team one way or another and the potential interaction points are already too many to count.
While the Scrum Guide states that the Scrum Master should protect the team from outside interference, it's hardly practical to create a complete bubble of isolation around the team. Especially in the early phases of an agile transition, the direct and indirect effects of managerial roles may still be very strong, oftentimes in detrimental ways. The solution can't be digging trenches - you'll need to provide education on the effects of any influence exerted on your team.
As management finds their new role within the changing organization, the interactions with classic business departments also change: the team gets closer to functions like sales, CSR or marketing - and these people, too, will need to learn which functions are better maintained within your team and which information needs to be communicated straight between the team and them.

I have heard many Scrum masters talk about "drawing boundaries" - while I personally would favour "blurring boundaries", i.e. integrating the different business people in ways that remove indirection and delay, right to the point where information is freely available and collaboration happens without structural limitation.


Immediate surrounding

Your company doesn't operate in a vacuum. There are many second-order interactions going on, the most important being between the customer and your organization. Depending on whether your team is working as business support or in marketable product development, it's a great idea to move the team as close as possible to (in the best case: directly into contact with) the customer.

In traditional organizations, you will see that sales people and marketing try to bond with the customer - which can be both a blessing and a curse for your team. These people, too, need to learn which second-degree interactions are helpful and which aren't. For example, making promises to the customer without consulting the team is oftentimes a recipe for disaster.

Then there are suppliers and competitors. Suppliers may or may not have the flexibility to support your team's newfound ways of working, which can become a massive problem unless addressed. In many cases, vendors find themselves in an uncomfortable situation when their client (you) is asking for more flexibility, in terms of both contracts and speed.
Your competitors will strive to be more agile than you, so keep an eye on what they are doing at all times. As long as you can learn something from them, you have homework to do. In complex environments, your suppliers will also be working closely with your competitors - which can create extremely interesting dynamics that can become very dangerous for your organization unless properly managed.

And then there's all of those other second-degree interactions: Friends and family, (social) media. You can't possibly hope to change these people and we're quickly talking about thousands of possible interaction points for a single team, yet these do affect your team as well.

Never forget that these interactions are highly intertwined with all the other interactions previously mentioned: a single press release by Marketing can cause hell to break loose within your team, while a careless blog post by your developers could void a year's work of sales currying favour with a potential client.

The good news is that up to this point, the interactions are either directly within your team's sphere of control - or can at least be influenced one way or another. That means you need to find ways to make those interactions favorable for your team.

The world

Life would be simple if everything could be brought under control - but it's just not that simple: Imagine that you did everything right, and along comes a new law that makes your product illegal: What would you do? It's not as easy as having a chat with lawmakers and undoing the law.
There are so many forces far beyond your control which can devastate everything your team is doing. Thought leaders coming up with new ideas which might imply that you've been working the wrong way, industry leaders disrupting your market segment, political leaders interfering with your entire industry - and you might have no option other than to cope with it!

The impact of "world level" interactions can also be both harmful and helpful - for example, a new invention may boost your team (if management lets you take advantage of it) or favorable changes in the market may boost your sales, thereby your financial resources. There are also cases where new laws or political changes might work in your favour and drive customers straight to your company.
These same effects could be positive or negative - in some cases, both. For example, that new helpful invention might boost your team - but also force you to invest heavily into it, draining resources elsewhere. Or that new political situation might drive flocks of customers right to your company - while the necessary customization changes overburden your team!

The world is beyond influence and control. In the Cynefin framework, it would correlate with the "Chaotic" domain. You can't shield your team from the world's influence on your organization (and therefore your team). Unless you're in a segment with massive lobbying power and public influence, you can't tell the people of the world how they should behave, and you can't educate the world, either.

When it comes to interactions between the world and your organization, your only strategy is - adapt. That's what "being agile" is all about.


Summary

None of these interactions can be completely neglected - yet which of these interactions are crucial for whom, and when, constantly changes, so you're shooting at moving targets.

I hope you liked this small introduction into "People of the System" and what their interactions mean for your team. If you are an agile coach, understand that you can't always work on all of these, so you will need to observe the most important interaction points and work directly to make them favorable for your team.

"Being agile" means making the interactions within your sphere of influence more favorable - and learning to get along better with the interactions that influence you.

Sunday, June 24, 2018

Test Pyramid Explained - Part 2: Measurement Systems

Understanding the "Why" of the Test Pyramid is important in making the right decisions. This article examines the underlying foundation of testing: making statements about quality.
Why do we need to consider the test pyramid when creating our test suite?



How can we know if the software works? Whether it does what it's supposed to do? Whether it does that right?
If not, whether it's broken? What doesn't work? Why it doesn't work? What caused it to malfunction?
These are all different questions - and so the approach to answering the questions also differs. Which approach should we then take? 

Let's take a look at our test pyramid.



In an attempt to answer the questions above, we need to explore:

Measurement Systems

According to Wikipedia, a measurement system includes a number of factors, including - but not limited to - these:

  • Accuracy
  • Precision
  • Repeatability
  • Reproducibility

Miss one of the factors, and you might end up with an entirely messed up test process!

Before we can answer how these factors contribute to your testing process, we need to examine why they are relevant - and to answer the "Why" question, we need to answer the even more fundamental question:

Why test?

There are varying reasons for testing, all of which require different approaches:

  1. Ensuring you did things right.
  2. Ensuring you are doing things right.
  3. Ensuring you will do things right.
  4. Ensuring you understand things right.
  5. Ensuring you did the right things.
  6. ...

As you might guess, a test approach to ensure you did things right will look vastly different from a test approach to ensure that you will be doing the right things.
Some approaches are more reactive in nature, while others are more proactive. Some are more concerned with the process of creating software - others are more concerned with the created software.

When no tests have formerly been in place (such as in a Legacy System), you're well advised to start at the easiest level: ensuring that you did things right, i.e. ensuring that the software works as intended.
This is our classic Waterfall testing approach, where testers get confronted with allegedly "finished" software which just needs to be quality-checked.

When you have the luxury of starting with a Green Field, you're well advised to take the more challenging, yet more rewarding route: ensuring that you will be doing the right thing right - before even starting off.
This approach requires "building quality in" right from the outset, using practices such as Behaviour Driven Development, Test Driven Development and Specification by Example.

The advantage of "testing early" is that misunderstandings are caught even before they can lead to faulty software, the advantage of "testing often" is that problems get solved before they proliferate or exacerbate.

The desirable state

A perfect testing approach would minimize:

  • the risk of introducing fault into the system
  • the time required to detect potential fault in the system
  • the effort required to correct fault in the system

When taking a good look at our testing pyramid from the last article, we can realize the following:

Process Chain
  Prevent risk: Hardly helps - often doesn't even get fixed before launch.
  Execution time: Might come too late in the process.
  Correction effort: Lots of pre-analysis required; the fault has potentially already proliferated.

System
  Prevent risk: Very low - only prevents known launch failures.
  Execution time: Very slow, often gets skipped.
  Correction effort: Slow.

Integration
  Prevent risk: Low - only catches defects from proliferating in the system.
  Execution time: Slow, difficult to set up.
  Correction effort: Interrupts the flow of work.

Feature&Contract
  Prevent risk: BDD - know the risk ahead.
  Execution time: Would run all the time while working on a feature.
  Correction effort: Should only affect 1 method.

Unit
  Prevent risk: TDD - know the risk ahead.
  Execution time: Negligible; can always run.
  Correction effort: Minimal; should only affect 1 line of code.


This matrix gives the impression that any test other than Feature&Contract or Unit tests doesn't even make sense from an economic perspective - yet these types of test are the ones most often neglected, while attention is paid to the upper parts of the Test Pyramid. Why does this happen?


Precision and Accuracy

Choose your poison

Let's suppose I turn on Google Maps and want to know how long my daily commute will take.
Imagine that I get to choose between two answers:
Answer #1: "Between 1 minute and 10 hours". Wow, that's helpful - not! It's an accurate answer with low precision.
Answer #2: "45 minutes, 21 seconds and 112 milliseconds". I like that. But ... when I hit the highway, there's traffic all over the place. I end up taking three hours. This answer was very precise - just also very inaccurate.

Do you prefer high accuracy and low precision - or high precision and low accuracy?
It seems like only a dunce would answer "high precision and low accuracy", because that's like having a non-winning lottery ticket.

Approximating meaning

When we have nothing to begin with, it's a good idea to turn a huge fog of war into something more tangible, more solid - so we start with a test which brings us accuracy at the cost of precision. We approximate.
In the absence of a better strategy, a vague answer is better than no answer or a wrong answer. And that is how Process Chain tests are created.

Knowing nothing about the system, I can still easily answer a simple question, such as: "If I buy lettuce, bananas and napkins - will I have these exact three things shipped to my home?"
This is a typical process chain test, as it masks the complexity of the underlying process. The test requires little understanding of the system, yet allows the tester to make a definite yes/no statement about whether the system works as intended.
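
To illustrate, here is a rough sketch in Python of what such a process chain test could look like. The shop API is entirely hypothetical and stubbed out so the example is self-contained; in a real setup, every helper would drive a whole subsystem (shop, payment, warehouse, logistics):

    def place_order(items):
        # stub - would really go through the online shop and payment provider
        return {"id": 4711, "items": list(items)}

    def wait_for_delivery(order):
        # stub - would really poll warehouse, picking and logistics tracking
        return {"order_id": order["id"], "items": order["items"]}

    def test_ordered_items_arrive_at_home():
        order = place_order(["lettuce", "bananas", "napkins"])
        delivery = wait_for_delivery(order)
        # An accurate yes/no statement about the whole chain - but when it fails,
        # the test cannot tell us which subsystem swapped the lettuce for parsley.
        assert sorted(delivery["items"]) == ["bananas", "lettuce", "napkins"]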

Unravelling complexity

When a tester's answer to a process chain test is "It doesn't work", the entire lack of accuracy in the quality statement is thrown directly at the developers, who then need to discover why it doesn't work. Testers then get trained to make the best possible statement of quality, such as "I got parsley instead of lettuce" and "The order confirmation showed lettuce" - but the tester may never know where the problem was introduced into the system. In a complex service landscape (potentially covering B2B suppliers, partners and service providers), the analysis process is often "Happy Hunting".

The false dichotomy

Choosing either accuracy or precision is a false dichotomy - why opt for one when you can have both? What is required is a measurement system of finer granularity.
Even in the above example, we hinted that the tester is definitely able to make a more accurate statement than "It didn't work" - and they can be more precise than that, as well. Good testers would always approximate the maximum possible accuracy and precision.
Their accuracy is only limited by logic hidden from their understanding - and their precision is only limited by the means through which they can interact with the process.
Giving testers deeper insight into the logic of a system allows them to increase their accuracy.
Giving them better means of interacting with the system allows them to increase their precision.

Under perfect conditions, a test will answer with perfect accuracy and perfect precision. And that's our Unit Test. The downside? To test for all potential issues - we need a LOT of them: Any single missing unit test means that we're punching holes into our precision statements.
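
For contrast, here is a minimal unit-test sketch in Python; the discount rule and the names are made up purely for illustration. When it fails, both the cause and the location are immediately clear - perfect accuracy and precision, but only for this one behaviour:

    def loyalty_discount(years_as_customer):
        # assumed rule: 1% discount per year as a customer, capped at 10%
        return min(years_as_customer, 10) / 100

    def test_discount_is_capped_at_ten_percent():
        assert loyalty_discount(25) == 0.10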


Repeatability & Reproducibility

What's the most common joke among testers? "Works on my machine." While testers consider this a developer's measly excuse for not fixing a defect, developers consider the statement sufficient proof that the test was executed sloppily. The issue? Reproducibility.
It gets worse when the tester calls in the developer to show them the problem - and: magic - it works! The issue? Repeatability.

Reproducibility

In science, reproducibility is key - a hypothesis which can't rely on reproducible evidence is subject to severe doubts, and for good reason. To make a reliable statement of quality, therefore, we need to ensure that test results are reproducible.
This means that given the same setup, we would expect to get the same outcome.
Let's look closely at the factors affecting the reproducibility of a test:
Preconditions, the environment, the code segment in question, the method of test execution - all affect reproducibility.
As most applications are stateful (i.e. the outcome depends on the current state of the system), reproducibility requires a perfect reconstruction of the test conditions. The bigger the scope affected by the test is - the more test conditions need to be met. In the worst case scenario, the entire world could affect the test case, and our only chance of reproducing the same outcome would be to snapshot and reset the world - which, of course, we can't do.

Our goal therefore should be to minimize the essential test conditions, as every additional condition reduces reproducibility.
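
A small Python sketch of that idea - the same check written twice, with an assumed "happy hour" rule. In the first test, the wall clock is a hidden test condition; in the second, the only condition the test needs is stated in the test itself:

    import datetime

    def is_happy_hour(now=None):
        now = now or datetime.datetime.now()
        return 17 <= now.hour < 19

    def test_happy_hour_hidden_condition():
        # poorly reproducible: the outcome silently depends on when the test runs
        assert is_happy_hour() is False

    def test_happy_hour_explicit_condition():
        # reproducible: the relevant condition is part of the test setup
        six_pm = datetime.datetime(2018, 6, 24, 18, 0)
        assert is_happy_hour(six_pm) is True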

Repeatability

Another key to hypothesis testing is being able to do the same thing over and over in order to get the same outcome. The Scientific Method requires repeatability for good reason: which conclusion do we draw when doing the same thing twice leads to different outcomes?
When we create an automated system which possibly fires the same code segment millions (or even billions) of times per day, then even a 1% fault ratio is unacceptable, so we can't rely on tests that may or may not be correct - we want the software itself to always respond in the same way, and we want our software tests to do the same.
The more often we run our tests, the more repeatability we need from them. When executing a test once a week, a 1% repeatability problem means that roughly once in two years, we may need to repeat a test to get the correct result. It's an entirely different story when the test is executed a few hundred times per day - even a 1% repeatability issue would mean that we're doing nothing except figuring out why the tests have failed!
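
The back-of-the-envelope arithmetic behind that statement, with the run frequencies assumed as 52 weekly runs per year and 300 runs per day:

    failure_rate = 0.01                          # 1% repeatability problem

    weekly_runs_per_year = 52
    print(weekly_runs_per_year * failure_rate)   # ~0.5 spurious failures per year, i.e. one every two years

    runs_per_day = 300                           # "a few hundred times per day"
    print(runs_per_day * failure_rate)           # ~3 spurious failures to investigate - every single day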


Flakiness

Every developer who uses a Continuous Integration (or: Deployment) pipeline has some horror stories to tell about flaky tests. Flakiness, in short, is the result of both reproducibility and repeatability issues.
Tests become flaky when either the process isn't 100% repeatable or there are some preconditions which haven't been caught in preparing the tests.
As test complexity increases, the number of factors potentially causing flakiness increases - as does the number of test steps potentially producing flaky results.
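
A classic example of such a flakiness factor, sketched in Python with illustrative names: asserting on an asynchronous result after a fixed sleep. Whether the first test passes depends on timing, not on the code under test; the second variant waits for an explicit signal instead of guessing a duration:

    import threading, time

    def test_background_job_flaky():
        result = []
        threading.Thread(target=lambda: (time.sleep(0.05), result.append("done"))).start()
        time.sleep(0.05)                    # race: sometimes long enough, sometimes not
        assert result == ["done"]

    def test_background_job_deterministic():
        result = []
        done = threading.Event()
        threading.Thread(target=lambda: (result.append("done"), done.set())).start()
        assert done.wait(timeout=1.0)       # wait for completion instead of guessing
        assert result == ["done"]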

Let's re-examine our pyramid:

Process Chain
  Repeatability: Difficult - any change can change the outcome.
  Reproducibility: Extremely low - a consistent state across many systems is almost impossible to maintain.
  Causes of flakiness: Unknown changes, unknown configuration effects, undefined interactions, unreliable systems, unreliable infrastructure.

System
  Repeatability: Extremely low - a desired feature change can change the overall system.
  Reproducibility: Challenging - any system change can cause any test to fail.
  Causes of flakiness: Unknown configuration effects, undefined interactions, unreliable infrastructure.

Integration
  Repeatability: Low - every release has new features, so tests need updates.
  Reproducibility: Low - every feature change will change test outcomes.
  Causes of flakiness: Unknown configuration effects, unreliable infrastructure.

Feature&Contract
  Repeatability: High - feature tests are changed only when features change.
  Reproducibility: High - feature definitions are comprehensive.
  Causes of flakiness: Uncoordinated changes in API definitions.

Unit
  Repeatability: High - the test outcome should only change when the code has changed.
  Reproducibility: Extremely high - a unit test always does the same one thing.
  Causes of flakiness: None.


We again observe that testing high up in the pyramid leads to high flakiness and poor test outcomes - whereas testing far down in the pyramid creates a higher level of quality control.

A flakiness level of 10% means that from 10 tests, an average of 1 test fails - so if we include a test suite of 30 flaky Tests into a build pipeline, we're hardly ever going to get a Green Master - we just don't know if there's a software problem or something else is going on.
And 10% flakiness in Process Chains is not a bad value - I've seen numbers ranging as high as 50%, given stuff like network timeouts, uncommunicated downtimes, unreliable data in the test database etc.
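
The arithmetic behind "hardly ever a Green Master", using exactly those numbers (30 independent tests, each with a 10% chance of failing for no good reason):

    p_single_flaky_test_passes = 0.90
    p_green_build = p_single_flaky_test_passes ** 30
    print(round(p_green_build, 3))   # ~0.042 - only about 1 build in 25 comes out green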


When we want to rely on our tests, we must guarantee 100% repeatability and reproducibility to prevent flakiness - and the only way to get there is to move tests as low in the pyramid as possible.


Conclusion

In this section, we have covered some of the critical factors contributing to a reliable testing system.
Long story short: we need a testing strategy that moves tests to the lowest levels in the pyramid, otherwise our tests will be a quality issue all by themselves!




Sunday, June 17, 2018

Test Pyramid Explained - part 1

Let's take a deeper look at what the Test Pyramid is, and how it can help us achieve sustainable, high quality. In this section, we will take a look at the left part of the picture only, as understanding this portion is essential to making sense of the right side.


Yet another model of the "Test Pyramid" - there's more to it than meets the eye!

The five levels

Before we get into the "How", we will examine the "What" - the five different levels, starting from top to bottom. Why top-down? Because this is how the business looks at software.

Process Chain Tests

A process chain is a description of a user-centric feature (oftentimes, a user story), irrespective of where it is implemented. From a high level, a customer request might be something like "I want my order shipped home."
Such a process chain may consist of a large number of technical features realized across a number of subsystems, some of them potentially not even software. In our example, the process chain might look like this:

  1.  User presses "Purchase" (online shop)
  2.  User makes payment (payment provider)
  3.  Order gets sent to warehouse for picking (warehouse system)
  4.  Order is picked (picker's device + warehouse system)
  5.  Package is sent for shipment (logistics + logistics payment system)
  6.  Package is shipped (logistics + logistics tracking system)
  7.  Package arrives (logistics tracking system)
  8.  Order is closed (online shop)


As we can see from this example, it's incredibly complex to test a process chain, as each system and activity has a chance to fail. The number of potential failure scenarios is nearly infinite - regardless of how many we cover, there might still be another.

The good news is that if a process chain works, it's a guarantee that all subsystems and steps worked.
The bad news is that if the process chain doesn't work, we may need to do a lot of backtracking to discover where the failure was introduced into the system.

Regardless of how much we test elsewhere - it might just be a good idea to do at least one supervised process chain test before "going live" with a complex system. That is, if we can afford it. Many organizations might simply resort to monitoring a live system's process chain in a "friendly user pilot phase".

Duration
A process chain test might take anywhere from a few minutes to many weeks to complete. As a rule of thumb, lacking any further information, an hour to a day might be a solid guess for the execution time of such a test. This explains why we don't want hundreds of them.

System Tests

Slightly simpler than process chain tests are the commonplace system tests: the system is considered an inseparable unit - oftentimes, a "black box".

A system test would be concerned with the activities and data transfers from the time data enters one system until the sub-process within that system is closed. Returning to our example above, a system test of the Online Shop might look like this:

  1.  User presses "Purchase" (Webshop)
  2.  User's order data is persisted as "Payment Pending" (Database)
  3.  User is redirected to payment section (External Payment service)
  4.  Payment is authorized (External Payment service)
  5.  Payment authorization ID is persisted (Database)
  6.  Order Status is set to "Payment Complete" (Database)
  7.  User is redirected to "Thank you" page (Webshop)
  8.  Order is forwarded to Warehouse system
  9.  Warehouse System sends Order Acknowledged message
  10.  Order Status is set to "In Process" (Database)
Here we see that system tests, despite having a much smaller scope than a process chain, are still nearly as difficult to automate and stabilize.

Oddly enough, many so-called "test factories" test on this level, creating complex automation scripts - oftentimes based on tools such as Selenium IDE - because this is seen as a feasible way to automate tests with little effort.
The downside of automating system tests is that a minor change in the test constellation will invalidate the test - in our example, if the "Thank you" page is replaced with a modal stating "Your order has been completed.", we might have to scrap the entire test (depending on how poorly it has been written).
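As a rough sketch of why that happens, here is what such an automated check might look like with Selenium WebDriver - the URL, element id and page text are made up for illustration:

import org.openqa.selenium.By
import org.openqa.selenium.chrome.ChromeDriver

// Minimal sketch - a real system test would also need test data setup, explicit waits and teardown.
def driver = new ChromeDriver()
try {
    driver.get("https://shop.example.com/basket")           // hypothetical shop URL
    driver.findElement(By.id("purchase-button")).click()    // hypothetical element id

    // The assertion is coupled to the literal page text: replacing the "Thank you" page
    // with a modal saying "Your order has been completed." breaks the test,
    // even though the purchase feature still works.
    assert driver.pageSource.contains("Thank you")
} finally {
    driver.quit()
}

Every detail this script touches - URL, element id, page wording - belongs to someone else and can change with any release.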

I have seen entire teams spending major portions of their time both figuring out why system tests failed and keeping up with all those feature changes invalidating the tests.

Duration
System tests shouldn't take all too long, but 5-15 minutes for a single automated test case isn't unheard of. Fast system tests might finish in as little as ten seconds.

Integration Tests

Integration tests are intended to check the I/O of a system's components, usually ignoring both the larger-scope process chain and the lower-level technical details.

An integration test assumes that the preceding steps in the source system worked - the focus is on the system's entry and exit points, considering the internal logic as a black box.

In our webshop payment example, we might consider the following autonomous integration tests:

  1. When a user presses "Purchase", all items from the basket are stored in the database (UI -> Backend)
  2. When a user is forwarded to the Payment Website, the total purchase price is correctly transferred to the payment service (Backend -> payment system)
  3. When a payment is successfully completed, the payment data is correctly stored (payment system -> Backend)
  4. When an order is correctly paid, it is forwarded to the warehouse system (Backend -> warehouse system)
  5. The Warehouse system's order acknowledge is correctly processed (warehouse system -> Backend)

Integration tests are much smaller than system tests, and the root cause of failure is much easier to isolate.
The biggest downside of integration tests is that they rely on the availability and responses of the partner system. If a partner system happens to be unavailable for any reason, integration tests cannot be run.
I've seen this break the back of one webshop's test suite, which relied on a global payment provider's sandbox that failed to answer during business hours because it was constantly bombarded by thousands of clients.

Duration
Integration tests don't do all that much; their main cost is the response time of the two systems involved. Good integration tests shouldn't take more than maybe 50ms, while poor integration tests might take a few seconds.

A good way to stabilize integration tests is to mock slow or unreliable partner systems, which can also speed them up massively - but it adds complexity to the component's test suite.
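For illustration, here's a minimal sketch of such a test in Spock, replacing the payment provider with a mock - PaymentClient and OrderService are hypothetical names standing in for whatever the real components are called:

import spock.lang.Specification

// Hypothetical collaborators, named for illustration only.
interface PaymentClient {
    String authorize(BigDecimal amount)
}

class OrderService {
    PaymentClient paymentClient

    String payOrder(BigDecimal total) {
        // Forwards the total to the payment provider and returns the authorization ID.
        paymentClient.authorize(total)
    }
}

class OrderPaymentIntegrationSpec extends Specification {

    def "total purchase price is passed to the payment provider and the authorization ID comes back"() {
        given: "a mocked payment provider instead of the real sandbox"
        def payment = Mock(PaymentClient)
        def orders = new OrderService(paymentClient: payment)

        when:
        def authId = orders.payOrder(49.99)

        then: "the mock answers instantly and deterministically"
        1 * payment.authorize(49.99) >> "AUTH-4711"
        authId == "AUTH-4711"
    }
}

The trade-off described above is visible here: the test no longer depends on the partner system's availability or response time, but the mocked behaviour now has to be kept in sync with the real payment service.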

Feature & Contract Tests

This group simultaneously contains two types of testing, as these go hand in hand: feature tests cover the internal logic of how a system processes data, while contract tests validate how data is passed into and out of the system.

Here's an example of a feature test:

import spock.lang.Specification

class BasketValidationResponseSpec extends Specification {

    def "information given to customer"(BasketPojo basket, String message, Boolean status) {
        expect:
        basket.statusMessage() == message
        basket.checkState() == status

        where:
        basket                                       | message           | status
        // BasketPojo construction is illustrative: a basket of "item: quantity" entries.
        new BasketPojo(Bread: 1, Butter: 1, Book: 1) | "Valid"           | true
        new BasketPojo()                             | "Empty basket"    | false
        new BasketPojo(Bread: 199)                   | "Too much Bread"  | false
        new BasketPojo(Bread: 1, Butter: 199)        | "Too much Butter" | false
    }
}


Feature tests don't rely on any external interfaces being available, making them both reliable and fast to execute. Unlike unit tests (below), they don't test each method on its own, but might test the interaction of multiple methods and/or classes.

Contract tests are the flip side of the coin here: a feature test assumes that data is both provided in the right way and returned in a way that the interfaced component can correctly process. In an ever-changing software world, these assumptions are often untrue - contracts help create some reliability here. I don't want to go into that topic too deeply, as contracts are an entire field of their own.

Duration
The good news is that good feature and contract tests execute in as little as 20ms, making them both incredibly fast and reliable.


Unit tests

The bread and butter of software development are unit tests. They test single methods within a class, and good engineering practice dictates that any line of code beyond getters and setters should have an associated unit test.
The purpose of unit tests isn't so much to create feature-level or user-comprehensible test feedback - it's to ensure that the code is workable, even when refactored.

Unit tests will ensure your code is loosely coupled, that each method doesn't do too many things (ideally, one purpose per method), that involuntary design errors are quickly caught, and many other things which help developers.

While well-designed feature tests answer the question of "why" a piece of code does what it does, a unit test defines "how" the code does it. Strictly separating the two often doesn't make sense - the boundary can be fluid. The main difference is that a unit test never relies on anything other than the method under test, whereas a feature test might rely on full object instantiation.
Their main "downside" is that their lifetime is coupled to the method they test - whenever the method gets adjusted, the unit test either has to stay valid or needs to be modified. If the method gets deleted, the test goes as well.
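As a minimal sketch of what that looks like in practice (the PriceCalculator class and its applyDiscount method are made up for illustration):

import spock.lang.Specification

// Hypothetical class under test, for illustration only.
class PriceCalculator {
    BigDecimal applyDiscount(BigDecimal price, int percent) {
        price - (price * percent / 100)
    }
}

class PriceCalculatorSpec extends Specification {

    def "applies a percentage discount to a price"() {
        given:
        def calculator = new PriceCalculator()

        expect: "only the method under test is exercised - no collaborators, no I/O"
        calculator.applyDiscount(200.00, 10) == 180.00
    }
}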

Duration
Unit tests are extremely fast. There are even tools executing the unit tests of modified code in the background while the developer is still typing. The limiting factors here are pretty much CPU speed and RAM: executing an entire project's unit test suite shouldn't take more than a minute (excluding ramp-up time of the IDE), otherwise you're probably doing something wrong.



Given these definitions, let's do a brief...


Summary


Test Type            Duration    Accuracy    Durability
Process Chain        1 h+        Very low    Very low
System               1-15 min    Very low    Very low
Integration          50 ms+      Low         Low
Feature & Contract   10-20 ms    High        High
Unit                 < 10 ms     High        N/A

If you ask me whether there's any sane reason to test on the higher levels of the pyramid, I'd answer: "It's too slow, too expensive and too unreliable." At the same time, there are reasons to test high in the pyramid, including coarse-grained business feasibility testing, a lack of lower-level automation and/or a lack of developer skills.

In the next article of the series, I will explain the right side of the image - the testing metrics - in more detail.



Sunday, June 10, 2018

Not Scrum - not a problem

We have been warned in our CSM training: "Scrum’s roles, events, artifacts, and rules are immutable and although implementing only parts of Scrum is possible, the result is not Scrum." - any deviation from Scrum leads to a dangerous "Scrum-But" - or worse ... so, you should stick to Scrum as per Guide!

Is that even a problem? Forget it!

Why would we even care if "the result is not Scrum"?

Here are a few examples of "results that aren't Scrum" ...


Unless you are in the business of producing and selling Scrum - why would it even be a problem if "the result is not Scrum"?

Scrum is but one of many means of achieving better business outcomes. It is neither a desirable outcome, nor the focus of your attention - again, unless you're making your money from Scrum.

As agnostic agile practitioners, we aren't forced to sell Scrum. We're trying to help our clients achieve better relevant business outcomes - more sales, more revenue, new markets, happier customers. If Scrum helps us get there, we're happy with Scrum as far as it helps. When Scrum becomes a distraction or an impediment - we'll gladly throw Scrum as per Guide overboard and do something else that works.

 "If you deviate, the result is not Scrum!" is a kind of fearmongering that only works on those who don't know that there are other, equally valid approaches. There's plenty of them around.

Saturday, June 2, 2018

Things that never meant what we understood

We throw around a lot of terminology - yet we may not even know what we're saying. Here are three terms that you may have understood differently from the original author's intention:


1. Technical debt

Technical debt has been used by many to denote willfully taken shortcuts on quality.
Many developers use the term to imply that code has been developed with poor craftsmanship - for instance, lack of tests or overly complicated structure.

Ward Cunningham, the inventor of the term, originally saw Technical debt as a means of learning from the Real World - software built on today's understanding, incorporating everything we know at the moment, put into use. He took the stance that it's better to ship today, learn tomorrow and then return to the code with tomorrow's knowledge - than to wait until tomorrow before even creating any code!

In his eyes, code should always look like it was consistently built "just today", never even hinting that it had looked different years ago. Technical debt was intended to be nothing more than the existence of things we can't know yet.

Technical debt always implied high quality clean code - because that is the only way to incorporate tomorrow's learning in a sustainable way without slowing down.

2. Kaizen ("Continuous Improvement")

Kaizen is often understood as an approach of getting better at doing things.
While it's laudable to improve, many improvement initiatives are rather aimless. Scrum teams especially fall victim to such aimless changes when each Retrospective covers a different topic.

Taiichi Ohno, known as the father of the Toyota Production System which inspired Lean, Six Sigma - and Scrum, stated: "Where there is no standard, there can be no Kaizen".

Another thing that many of us Westerners seem to be unaware of: there's a difference between Kaizen and Kairyo - Kaizen being the inward-focused exercise of becoming the best we can be, which in turn enables us to improve the system, and Kairyo being the exercise of improving the system itself. This, of course, means that Kaizen can never be delegated!

Kaizen requires a long-term direction towards which people desire to improve themselves. Such a direction is often absent in an agile environment - short-term thinking prevails, and people are happy having done something which improved the process a little.

What this "something" is, and how important it is in comparison to the strategic direction may elude everyone. And there's a huge chance that when we consider what we actually want to achieve, our "improvements" might even be a step in the wrong direction.
Have you ever bothered talking about where you yourself are actually heading - and why?


3. Agile

"Agile" is extremely difficult to pinpoint.  It means something different to everyone.
Some think of it as a project management methodology, while others claim "There are no agile projects".
Some think of a specific set of principles and practices, while others state these are all optional.
Some confuse a framework with agile - some go even as far as thinking that "Agile" can be packaged.
Some are even selling a certain piece of software which allegedly is "Agile".

Yet everyone seems to forget that the bunch of 17 people meeting at Snowbird were out to define a new standard for how to better develop software - and couldn't agree on much more than 6 sentences.
Especially in the light of Kaizen above - What do 'better ways' even mean, when no direction and no standard has been defined?
A lot of confusion in the agile community is caused by people standing at different points, heading into different directions (or: not even having a direction) and aiming for different things - and then telling each other what "better" is supposed to mean.

The Agile Manifesto is nothing more than a handful of things that seemed to be consistent across the different perspectives: It answers neither What, How nor Why.
To actually make meaning from that, you need to find your own direction and start moving.




Thursday, May 31, 2018

Does "being agile" reduce IT cost?

How much truth is there to the claim that "using the agile approach could cut banks' IT spending by 20% to 30%", as proposed by BCG and other consulting companies?

I doubt that it will reduce a single penny in IT spending. Let me explain why.

Perspectives matter

Let me start with a metaphor.

As a hobby gardener, when I buy a new rosebush, I go to the garden shop and get one for around €10. It takes me about 1 hour to drive to and from the store and select a bush - and another 2 hours to plant it (hey, nobody said I'm a pro digger - a pro could do it in half an hour!).
Assuming that working time costs €80 per hour, let's compare the cost of "planting 1 rose bush".
Hobby gardener: €10 + 3 hours × €80 = €250.
Professional gardener: €10 + 0.5 hours × €80 = €50.

Superficially, you'd obviously turn to the professional gardener to plant your rose then.
Unfortunately, if I want a rose for €50, I would need to hire the gardener - full time.
And that means spending €60k annually on gardening, while I currently spend €1k.

So - it depends. Do you want to run one IT project, or do you want a cost-efficient IT that can deliver a hundred?
If I want only one rose, the professional gardener's cost is a few hundred times higher than the inefficient DIY approach.

As long as you focus exclusively on one project, cost for that one project is all that matters. Looking at IT overall, which constantly and consistently produces value, it's an entirely different story - which we will explore now.

The bottom line

Let's suppose I run an IT organization with 250 people. They cost me €20m in salary every year. Add 1000 servers that cost me €10m in hardware+maintenance every year.

In the past, I was working Waterfall.  How much did I spend? €30m per year. 
Let's say I switch to an agile approach.  How much do I spend? €30m per year.
Now where is the cost reduction? 

Truth is: Bottom line cost doesn't go down. 
Where are my savings? Do I fire people or turn off my servers to reduce cost? 

How agility reduces cost

There are many ways in which an agile organization has lower cost than a traditional organization:

Optimizing the work itself

When people are restructured into cross-functional teams where everyone working to bring a feature into production cooperates in close proximity, a lot of activity dissipates. For example, we're no longer organizing a meeting to get information from each other - we just talk. We no longer write detailed specification documents - we collaborate to draft out something that everyone can work with, and then document only that which is needed for posterity. We don't need to review DSDs any more, as we discuss until everyone has the same understanding. Oh - and since we no longer work based off a dead document, we can talk to the person. We're no longer tormenting our brains with "What did they mean?" - we just ask. And when we turn understanding into executable tests first, we're no longer getting into arguments about whether the tester tested the right thing or the developer developed the right thing.

We can save a lot of time by reducing all of this scheduling, coordinating, intermediary documentation, reinventing the wheel and pointless arguments.


Optimize the flow of work

Traditional organizations often create huge batches of work, called "projects" or even "programs". These are intended to be delivered at a certain specified date.
Agilists would cut this batch into small, independent units, purposely descope all but the single one they are working on, and get that delivered. At best, they wouldn't even let the big furball of undone work accumulate, and would start working on each item as soon as it becomes the highest priority.
By only having one thing to worry about at a time, we save efforts on task switching, status tracking and coordination.
We make work items independent and enable different teams to deliver autonomously. This massively cuts down on coordination overhead as well.

Optimize the content of work

Traditional projects contain a lot of things which were considered important when the project was defined. Nobody really knows how much these features will actually be used until they go live - i.e., until the project is completed. By default, all project features are "Must-Have" (everything else isn't delivered anyway). A project must deliver the entire scope to be fully successful, so this is what the project manager will ensure.
Studies have shown that a good 50% of software features are "rarely used" or even "never used". Any effort invested into such unused features is burnt money.
As agilists, we would constantly apply the Simplicity principle and start incrementally increasing the value delivered. If the first simple version of the feature isn't even being used, we would stop building and focus on more important things.
We reduce the waste of building useless features.

Optimize value streams

When people can eliminate useless activity, they can deliver more features in the same time. By spending less time on delivering useless features, a higher percentage of developed features will actually help the business. By delivering in small increments instead of big batches, value gets into the hands of business earlier.

Optimize responsiveness

When new information comes up that invalidates old information, in a classic project, we have three options:
1. Ignore the new information. Do another project in the future.
2. Escalate as the project's goals are in danger.
3. Scrap the project and go back to the drawing board.

In environments where new information comes up on a daily basis, it's hard to fix something for months in advance. Many project managers have resorted to option #1, as that's the only way to ever conclude any project under such circumstances. Unfortunately, this means that the business always gets sub-optimal, outdated solutions.
While IT can perform well with this option, business performs terribly.
By reducing lead and cycle times, as well as delivering in smaller, incremental amounts, we reduce the risk that a specific piece of new information devastates whatever we have been working on - and even if that happens, we reduce the loss incurred by incorporating the change.

We become more responsive to change, the main purpose of being agile.



All of this means - IT has a chance to become more efficient and more effective.
Oddly enough, none of this means that IT will be any cheaper.


Conclusion

The math is not "When you're agile, your IT cost goes down". It's "When you're agile, your business ROI goes up".
IT cost cuts are only possible when the organization is in the comfortable situation that they have too many developers and too little things that could be developed - a rather hypothetical stance, as not even corporations like Google or Amazon ever run out of work for developers.

What matters is not how much IT costs. What matters is whether an investment into IT is good. IT should have a high business value. And agility helps a lot there.

Cost - is the wrong focus. ROI is better.

Sunday, May 27, 2018

Three statements separating poor and great leaders

Many people think they are great, but when push comes to shove, they succumb to fear - and avoid doing the very thing that would be required for others to move forward. This applies to workers and managers alike - it's everyday leadership that we should all exhibit, or, as Tobias Mayer coined it, merely "thoughtful citizenship".
Here are three statements that many leaders might be too afraid to state:

I don't know

Smart people can easily come up with a plan or explanation which sounds so feasible that others will nod in appreciation. People with high charisma can make others believe even the most ludicrous things, for example: "clouds are actually giant, floating marshmallows".
This is very dangerous, as there is huge potential for leading people on rabbit chases, and in extreme cases even on witch hunts against those who disagree. The more respect a person receives, the more ready they should be to state, "I don't know". Prominent leaders exercise great caution when making claims or giving instructions, as they know from experience that a statement affecting many people can lead to massive problems, even when made with the best intentions.


I was wrong

"Hindsight 20/20". It's easy to be wise after the event. Statements about the future are always subject to error. Great leaders know this, and when they discover a dead end, they should be the first to pronounce, "I was wrong, we can't go on like this." Similarly, when they haven't seen the problem by themselves and are made aware by others, the three words will be like a rain of relief after a scorching summer: "I was wrong." - no lingering, no justification. Just closure. This open the door to progress. It frees others to go on.


I need help

Many people feel that exposing vulnerability is a sign of weakness, something that can and will be used against them. Great leaders don't care about weakness as much as about strength. Instead of saying, "I can't do this", they say, "I need your help to do this!", knowing full well that everybody is good at something different.
In some cases, they will even ask others for help not because of their own need - they do this in order to let those others grow and build them up!



Summary

"I was wrong." - "I don't know" - "I need your help." 
These three sentences sound like they describe a weak person, even though saying these three things requires an incredible amount of courage. A person using these three sentences when appropriate displays massive strength of character and integrity. On the other hand, a person too afraid to speak these simple words when needed isn't worth following.

Do you have the courage to stand in front of your team, your company - even your own family - and say these words?
What consequences do you expect when you utter them?