
Sunday, June 4, 2023

Little's Law and the Hidden Variable

Did you know there's a hidden variable in Little's Law, and that the traditional equation L = λW is missing something?
Well, it doesn't tell you something important, and that used to bug me a lot until I could pinpoint it - so let's explore.

What Little's Law says

Quoting Wikipedia: "In mathematical queueing theory, Little's law is a theorem by John Little which states that the long-term average number L of customers in a stationary system is equal to the long-term average effective arrival rate λ multiplied by the average time W that a customer spends in the system."

For example, if we take a restaurant - the average number of guests present (L) is equal to the average arrival rate of guests (λ) multiplied by the average time a guest spends in the restaurant (W).
Let's say the average arrival rate is 10 guests per hour (λ = 10 customers/hour) and the average time a guest spends in the restaurant is 30 minutes (W = 0.5 hours). Then, according to Little's Law, the average number of guests in the restaurant (L) would be 5 guests.
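
To double-check that with a few lines of Python - a trivial sketch of the formula, using the example numbers above:

```python
# Little's Law: L = λ * W
arrival_rate   = 10    # guests per hour (λ)
time_in_system = 0.5   # hours a guest spends in the restaurant (W)

L = arrival_rate * time_in_system
print(L)  # 5.0 -> on average, 5 guests are present
```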

Sounds all good - so: what is missing?

When Little's Law doesn't work

A few years ago, I had a hunch that Little's Law was missing something: Imagine that our restaurant has 5 tables and one waiter who can serve 12 guests an hour. Guests take half an hour between getting seated and paying their tab.
Does Little's Restaurant follow the traditional formula of L = λW, i.e., W = L/λ?
Would a reduction of seats lead to guests dining faster?
Would a reduction of maximum dining time generate more guests, or would people dine longer if there were more guests?
Probably not.
Is Little's Law broken?

No. But it's missing something - it doesn't account for system capacity!

Fixing Little's Law?


This modified version of Little's Law accounts for capacity: L = λW / (1 - ρ)

Now, that alone doesn't make sense yet, so I need to explain the variable ρ:
ρ represents the utilization of the system, calculated as λ / μ, where λ denotes the average arrival rate into the system and μ is the service rate capacity of our system, i.e. the reciprocal of the average service time (1/μ = Ts, where Ts is the average time required to serve a single customer). Note the difference between "service time" (net time) and "time spent in the system" (gross time) - it's critical!

The "hidden factor" that Little's Law hid in plain sight: the relationship between the average number of customers (L) and the arrival rate (λ) needs to consider the impact of utilization and system capacity on system performance! In underutilized systems, an increase in arrival rates has no impact on queues - whereas in overutilized systems, a reduction in system load won't have a visible effect until we get close to the actual capacity limits.

Returning to our restaurant example: our restaurant's capacity is currently constrained by the number of tables. As long as we have empty tables, a reduction in customers will not speed up anything. As long as all tables are full, adding more tables to the restaurant won't slow anything down. This view is completely counter-intuitive compared to the original Little's Law - but it makes sense, even in real life. Oh yeah - at some point, the waiter will be overburdened. At that point, the system capacity is no longer defined by the tables, but by the waiter. So it's still all about system capacity.

Some examples

Example values for our Restaurant
  μ | λ |  W  |  ρ  | L = λW / (1 - ρ)
  5 | 1 | 0.5 | 0.2 | 1
  5 | 2 | 0.5 | 0.4 | 2
  5 | 3 | 0.5 | 0.6 | 4
  5 | 4 | 0.5 | 0.8 | 10 --> bigger than capacity!
 10 | 1 | 0.5 | 0.1 | 1
 10 | 2 | 0.5 | 0.2 | 1
 10 | 6 | 0.5 | 0.6 | 8
 10 | 7 | 0.5 | 0.7 | 12 --> bigger than capacity!
  3 | 1 | 0.5 | 0.3 | 1
  3 | 2 | 0.5 | 0.7 | 3
  3 | 4 | 0.5 | 1.3 | -6 --> negative!
What's important to note are the three invalid records: when the arrival rates get close to or exceed the system's capacity, the numbers "break down." The real-life effects we would observe here:
  • When arrival rates start to approximate the restaurant's capacity, the computed number of guests present gets bigger than the capacity, which in fact can't happen - these invalid numbers indicate that variability could cause a backlog (queue) to form at peak times, which can only be serviced once the peaks are over.
  • Customers arriving at a full restaurant can't get serviced and might leave - i.e., the negative number hints at unserviceable demand: lost opportunity.
The important observation may sound trivial, but is often ignored in management: only when you have more seats than arriving guests can you avoid having to send anyone home. And the fewer tables, the more likely someone will have to wait. Which is why a WIP Limit of 1 is just as impractical as not controlling demand influx.
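
If you want to check these figures yourself, here's a minimal sketch in Python (my own illustration, not from any queueing library), assuming the table rounds L to whole guests:

```python
# Capacity-aware variant of Little's Law: L = λW / (1 - ρ), with ρ = λ / μ.
W = 0.5  # average time spent in the system, in hours

rows = [(5, 1), (5, 2), (5, 3), (5, 4),
        (10, 1), (10, 2), (10, 6), (10, 7),
        (3, 1), (3, 2), (3, 4)]

for mu, lam in rows:
    rho = lam / mu            # utilization (the formula breaks down if lam == mu)
    L = lam * W / (1 - rho)   # average number of customers in the system
    Lr = round(L)             # the table shows whole guests
    note = ""
    if Lr < 0:
        note = "  --> negative: unserviceable demand!"
    elif Lr > mu:
        note = "  --> bigger than capacity!"
    print(f"mu={mu:2d}  lam={lam}  rho={rho:.1f}  L={Lr:3d}{note}")
```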

Conclusion

While Little's Law has its merits - and we've been using it for years to explain the relationship between throughput, cycle time and Work in Process - we can't use Little's Law to optimize our systems properly if we don't account for System Capacity.
Taking System Capacity into account allows us to predetermine whether increasing or decreasing WIP will have a significant effect - operating significantly below capacity is capacity waste, whereas operating above system capacity causes overload waste.
Thus, the "hidden variable" is more than just important for applying Little's Law - it's crucial!

Further reading

There's an interesting whitepaper by Dr. Little which also highlights the key point of my blog article: as arrival rates approach service rate capacity (as outlined in this blog article), WIP and processing time skyrocket, "hitting a brick wall" close to the system's capacity as queuing overhead approaches infinity.

Wednesday, October 14, 2020

How to resolve the Planning Conflict

There's a seeming conflict: on the one hand, "delivering early and often" is an Agile principle - and on the other hand, "deferred commitment" is a Lean principle. This might create a planning conflict. How do you resolve it?



Planning purpose

First, we must realize that there are different reasons for planning. 

Within the team / development organization, the purpose of planning is to make sure that we have a realistic goal and we all understand what we need to do.

Towards our customers, the purpose of planning is different. They don't care who does what, and when. They care when they'll get what.

Towards other stakeholders in our organization, the purpose of planning is again different. They need to know when they're expected to contribute, and when they can get a contribution from us.


Defer commitment?

The first thing to realize here is: "Who are we committing towards?" Are we committing inside the teams to maximize value - or are we committing a certain date or scope to our customers or stakeholders?

Customers and stakeholders plan their actions based on our commitments, so in this regard, we shouldn't commit to anything that we can't keep - otherwise, we may be creating significant, non-value-adding re-planning and organizational overhead. Broken customer commitments will damage trust. If you can deliver without having to give a commitment, that's better - and even when you need to commit so that others can plan, try to commit as late as possible.

The trust issue

"Deferred commitment" requires trust. 
  • Trust in the team, that they do the best they possibly can. 
  • Trust in the organization, that they enable the team to succeed.
  • Trust in the customers and stakeholders, that they want the team to succeed.
Asking for early commitment hints at a lack of trust. The solution is not to enforce strict commitment, but to build trust. In a trustful relationship, deferred commitment shouldn't be an issue for anyone.


Deliver early?

Inside our team, we plan to deliver as much value as early as possible, because "you got what you got". To minimize risk and to avoid falling for Parkinson's Law, we should avoid keeping activity buffers that allow us to "do extra work", and we should remember that early delivery is our feedback and learning trigger.


Resolving the conflict

There is no conflict.
We work towards two separate events: 
The team's first point of feedback, and the point of business completion.
  • The first date is the earliest point in time when we can get feedback. It allows us to validate our assumptions and to verify our product. There is no guarantee of completion or finality. For internal planning, we look for earliest possible dates, so that we can reduce risk and deliver value quickly.
  • The second date is the latest point in time when we can complete a topic. We communicate this date as late as possible and try to avoid having to lock it in if we can. This minimizes the danger of expectation mismatch. For external communication, we look for latest feasible dates, so that other people's decisions don't rely on our unvalidated assumptions.

Addendum

Based on the feedback that "deferred commitment" in a Lean context is referring to decisions:
The statement "Scope X will be completed at date Y" consists of two decisions made today: a decision about what, as well as a decision about when. If there is no need to decide this today, we should not.
We try to avoid locking in a decision that has a significant risk of being wrong.
That is not the same as "we won't deliver any value until some undefined date in the future." It means, "we can't guarantee you the value until we know more."

Wednesday, February 26, 2020

ART Design - and: Solving the wrong problems

It's amazing how good organizations are at solving the wrong problem, i.e. "doing the wrong thing righter". Here is a great example of how not to design an ART:



This was the outcome of a PI-Planning. Their planning Retrospective led people to conclude that "electronic tools are much better for developing a Program Board than physical boards, because the dependencies are way too difficult to correlate with all those strings floating around, falling off, getting intertwined and so on."

This Train operates based on the assumptions that they have the right ART, the right teams - and are working on the right things. It never crossed anyone's mind that any of these assumptions might be wrong.

The wrong ART

The first question we need to ask ourselves: do we have the right Agile Release Train?
If there is an extensive dependence on people outside the Agile Release Train, or there's a massive capacity bottleneck within the ART while people outside the ART have capacity to do the blocked work, then we might want to slice the ART differently.

The wrong teams

It's okay for teams to occasionally depend upon one another. It fosters learning and information exchange. Some Product Managers even go as far as to purposely define features in a way that "swarming" across teams allows the ART to generate value faster. When teams choose to split work and collaborate in real time to maximize value generation, that's a plus.

What is not okay, and a clear indicator that we have the wrong teams: when no team can get any work done without relying on actions of other teams.
Even Component Teams should be able to release value straight into Production. When teams can only do piecemeal work, those are specialist teams that inhibit the flow of value.

I have seen more than once how teams, guided by experienced SPCs and insightful RTE's, have spontaneously used the PI-Planning to "disband and regroup" - reforming as teams capable of delivering end-to-end value: It's possible!

The wrong work

The board above is a clear example of what leads to doing the wrong work. With so many dependencies, "dependency management" is already a full-time job. It shouldn't be. It should be effortless.
The prime directive of dealing with dependencies is: "Minimize."

When I see two or three dependencies on a PI Planning board, I'm happy - it means, we have decent team constellations and the right skills in the teams.
When I see more than ten dependencies, I will express concerns about team constellation.
When I see more dependencies than features on the board, I would ask the ART to work on resolving their dependencies rather than figuring out better ways to manage them.





Friday, September 20, 2019

Plans are useless - really?

Sparked by a recent thread on LinkedIn, I would like to dig a little bit into the idea of planning in an agile environment.

Useless, worthless, pointless plans

These three terms are often conflated, even though they do not mean the same thing.

Utility

Let's start with a metaphor.
Sand may be considered to be useless in a desert.
Then again, sand is very useful when making transistor chips from its silicon.
We need sand to sustain our production of computers, which in turn sustains our modern lifestyle. While sand may seem useless in itself, the results created from it are highly useful.

Such are plans.
A plan's usefulness is the extent to which it is used.
A plan is only useless if it isn't used at all.

Calling plans "useless" is misleading - more interesting is the question which aspects of the plan are being used and which aren't.

Value

Again, let's go to a metaphor.
Water has no value to a person living on a lakeshore, and even negative value to a person who is drowning. It's precious to people living in an arid environment. It's the same water - only in a different context.

Value depends on market forces - supply and demand.

A plan's value lies in the extent to which it is actually demanded.
A plan is worthless when nobody wants it. It is also worthless when it leads us to places we don't want to go.

Calling plans "worthless" is an oversimplification. More interesting is the question how to maximize the value of planning.

Return on Investment (ROI)

An extension of value is the Return on Investment - that is, accounting for the investment made into the plan. Irrespective of whether you calculate that cost by creation effort or TCO, it does affect the net value of the plan.
Note: A plan may have a negative ROI even though it does have value!


Purpose

Let's skip the metaphor.
A plan in and of itself has no purpose. The purpose of a plan lies in achieving a goal.

A plan has a purpose if it helps us reach our goal better than we would otherwise.
A plan is pointless if it either doesn't help us reach our goal or distracts us from our goal.

Calling plans "pointless" is a hasty generalization. More interesting is the question how to ensure our plan serves its purpose.


Consequences

The best plans maximize all three attributes: utility, value and purpose.

A plan is ...

  • beneficial, if it helps make better choices.
  • worthwhile, if it results in more benefit than it cost.

The plan is, however ...
  • useless, if it doesn't get used, or is used in a way that doesn't produce benefits.
  • worthless, if it isn't needed.
  • pointless, if it doesn't lead us to our goal.


Better plans

A plan is always made within a context.
As we gain more information, we need to integrate this into our plan.
As information changes, we need to update our plan.

All these activities are part of planning.

Planning

There are two types of planning - the initial plan creation, and replanning, i.e. updating the plan.

Spending more time on planning is...

  • a benefit if it increases utility, value and purpose of the plan.
  • worthwhile if the benefit outweighs the cost.

Spending more time on planning is...

  • useless, if it doesn't add further utility.
  • worthless, if it doesn't add further value.
  • pointless, if it doesn't help us reach our goal better.

Replanning

The best plan isn't the plan with the highest accuracy, but the plan which maximizes utility, value and purpose while minimizing cost - both during initial creation and for future changes. Every replanning step should likewise minimize the deterioration in utility, value and purpose while keeping cost low.

The easiest way to achieve this is by not planning things that do not need to be planned yet, as this implicitly reduces the risk of sinking money into re-planning.

An agile planning strategy

Agile plans are extremely context sensitive. 
Depending on how fast your environment changes, your planning horizon can be longer or shorter.
Depending on how much information of a plan helps in achieving your goal, your planning scope may be bigger or smaller.



Conclusion

Both planning and plans can be useful, worthwhile and purposeful. And each can also be none of these three things.

Avoid conflating or generalizing these issues. Instead, when you think that your plan is useless, inspect and adapt on this. Likewise, when you think your plans are worthless or pointless.

Tuesday, August 27, 2019

All Unknowns aren't equal

The Stacey Matrix has invited many agile teams to use the term "Unknown" as a reason for using an "Agile Approach", i.e. winging it and finding out what happens.
In some cases, that's the most economical or even the only option. But Unknown isn't unknown, and it makes sense to classify the term a little further.




The types of Unknown

Folly

Some things are just common knowledge. For example, everyone who has a smartphone knows that you need a secure password for everything that's on the Internet. To claim that "we didn't know that 'root' doesn't make a good root password for our web server" is just plain foolish. It's not an Unknown, and there is no sensible reason to treat it as one.

Unqualified

If you work in a field, for example software development, a few things can be assumed as standard knowledge. It doesn't take much except reading a few blog articles, a beginner's book or attending a really basic class to know these things. People who don't put in the effort to know the very basics of what they're dealing with are sloppy and therefore unqualified. Absence of basic knowledge means you hired the wrong people, not that people should learn by trial and error.

Silly Excuses

Things wouldn't be common knowledge unless they were understood by a majority of the population. Claiming that it's not possible to research what happens when you disconnect a router is just a silly excuse. These also don't qualify as Unknowns for anyone who intends to keep their job - a company that accepts such unknowns is very unlikely to survive for long.

Knowledge gaps

As soon as we get to field specific knowledge, not everyone can know everything. For example, you might have hired a music PhD for your development team who contributes a lot when it comes to synergizing ideas - but they may be unaware of how to write proper tests. That's an acceptable Unknown, which can be readily covered with available information. Throw in a bit of practice and you're set.
These Unknowns are fairly predictable and can be planned quite well.

Unfeasible

You may encounter areas where nobody on your team has any knowledge, and only a few people on the planet know what they're talking about. Maybe the thing you need to know hasn't even been explored yet or it's just uneconomical to acquire enough information upfront.

Such Unknowns are quite typical when doing innovation work, and are often hard to plan, as nobody knows exactly what the result will be and which steps will lead there.
This is the domain of experimentation, adaptation, trial and error.

Unknowable

The most difficult realm for prediction is knowledge which can't possibly exist, such as information about the future. While we can either research or explore by trial and error how customers use our product, we have no way of knowing whether a policy change in a country halfway across the globe will plunge our business into an economic crisis tomorrow.

The Unknowable warrants an entirely different approach. Usually, deliberately ignoring unknowable circumstances until they occur is the best strategy (as trying to know the Unknowable can consume an infinite amount of energy with no return on investment). When something Unknowable pops up, we need to deal with it - and that's being agile.


Dealing with Unknowns

If your development team claims that something is Unknown, first classify why the thing is unknown.
  • Folly, unqualified action and silly excuses should not be tolerated. If it's a people thing, address the behaviour openly.
  • Knowledge gaps should be brought up as early as they become known, so they can be filled effectively.
  • In software development, acquiring all the knowledge is often unfeasible: either too complex, too slow or too expensive. "Experiment, Inspect and Adapt" is the most economical, pro-active strategy.
  • Everything related to the future is unknowable. Accept it. The further you look into the future, the larger the Unknowable becomes. The impact of the Unknowable on your work determines how flexible you'll need to be. 
A plan far into the future should be very crude ("a roadmap"), lest it break when all the Unknowns unravel: the further you plan into the future, the more unfeasible knowledge becomes part of the plan, and the larger the big ugly blob of the Unknowable becomes.

Sunday, July 14, 2019

When will the Agile Project be done?

"When will the project be done?" - or: "We need this project to be finished by (this deadline). Can you do this?" are typical questions which senior managers expect project managers to be able to answer. And the answer to this question will not go away in an agile environment - because other initiatives may be linked to the project. When there is no project manager, the team (and in a Scrum text, the Product Owner) should still be able to provide an answer.


Bad answers

"We're Agile. We don't do Projects any more."

Ha ha ha. Very funny. There will still be projects, although they will be within a different context. Instead of pulling up random groups of people to deliver a prepackaged bunch of work, we rely on dedicated teams with a clearly delineated responsibility for products, components or features. These agile teams will take those work packages called "projects" and deliver them.

A "Project", in an agile context, is nothing more than a label attached to a number of backlog items.
The project begins when we pull the first backlog item and ends when we complete the last.
If you so like, you might put the entire project into a single backlog item of pretty big size, something many teams call "Epic" and the result is often referred to as a "(Major) Release".


"We're Agile. We don't know"

What could be a better way to rub a manager (and/or customer) the wrong way? Imagine you were a developer ordering a new server, and to your question "When can we have this server?" or "Can we have this server by next Friday?" the admins and/or web host answered "We don't know" - would you consider that a satisfactory answer? If not, you can probably empathize with a sales manager or customer who isn't satisfied with this answer, either.


While there is definitely some truth that there are things we don't know, there are also things that we do know, and we are well able to give information based on the best information we have today.

And here is how you can do that:

A satisfactory answer

Based on the size of the project in our backlog and our velocity, we can make a forecast.
To keep things simple, we will use the term "Velocity" in a very generic way to mean "Work Done within an Iteration", without reference to any specific estimation method. We will also use "Iteration" to mean "a specific amount of time", regardless of whether that amount is a week, a calendar month or a Scrum Sprint. Even outside a Scrum context, we can still figure out how much work gets done within a fixed amount of time.

If we have an existing agile team, we should have completed some work already. As soon as we have delivered something, we can make forecasts, which we shall focus on in this article.

Disclaimer: This article does NOT address sophisticated estimation methods, statistical models for forecasting and/or progress reporting. It discusses principles only. It is easily possible to increase the level of sophistication where this is desired, needed and helpful. Information on these topics can be found on Google by searching for "Monte Carlo Method", "Agile Roadmap Planning" and "Agile Reporting".
For those interested in a mathematically more in-depth follow-up, please take a look at this post by William Davis.



We need some data

Maybe we haven't delivered anything yet - then it's all guesswork.
In this case, the best thing we can do is spend a few weeks and get things done: collect data, so that we have something to build our forecast upon.

What would be the alternative? Any forecast not based on factual evidence is pure guesswork.
The more time we spend without delivering value, the later our project will be.

We can make a forecast

We can use our historic data to make various types of forecasts.
The most relevant management forecasts are the date and scope forecast, which will also allow us to make a risk forecast. We will first examine the date forecast.

Let's take an example project:
  • Our project requires 200 points.
  • Our Velocity over the last 5 Sprints has been 8, 6, 20, 7, 9.
Management Questions:
  • Can it be delivered in 22 Iterations? (Due Date Status)
  • When will it be delivered? (ETA)
  • What will be delivered in 22 Iterations? (Project Scope)
Let's look at a schematic diagram based on Cumulative Flow, and then examine the individual forecasting lines in more detail below:

Fig. 1: A schematic overview how we can forecast dates and risks for Agile Projects

Averaged Completion

Probably the most common way to make a forecast is to average the velocity over the last few intervals and build the forecast from there. The good thing about this "average forecast" is that it's so simple.

Our example gives us an Average Velocity of 10.
Our forecast would be:
  • Due Date Status: Green.
  • ETA: 20 Iterations.
  • Project Scope: 100%
The consequence will be that the team will be given 20 Iterations for completion.
The big issue with Average Completion is that any small change or unpredicted event will devastate the forecast.


How to use Average Completion

Average Completion can be forecast based on historic data. The Average Date would be the earliest suggested ETA that you may want to communicate to the outside world. The Averaged Scope would be the corresponding committable scope on the Due Date - you have a 50% probability of making it!

How not to use Average Completion

The Average Completion Date will likely shift with every Iteration, and as soon as we fall behind plan, there's a significant risk that we won't catch up - the future isn't guaranteed to become more favorable just because the past hasn't been favorable. As soon as our Velocity is lower than average, we're going to report Amber or Red and must reduce scope.
Management will have to expect this bad news at every single Iteration Review.
This puts our team between a rock and a hard place, becoming known as a constant bringer of bad news - so you may prefer not to rely on the averaged completion date.



Minimal Completion

We can also take the best velocity achieved in a single Iteration and forecast the future from there. It gives us - at least - the earliest possible delivery date for the project. It's purely hypothetical, but it allows us to draw a boundary.

Our example would give us a Maximum Velocity of 20.
Our forecast would be:
  • Due Date Status: Green.
  • ETA: 10 Iterations.
  • Project Scope: 100%

How to use Minimal Completion Dates

The "Minimal Completion Date" is a kind of reality check. If management would expect the project to be delivered in 8 Iterations, and your Earliest Forecast already says 10, you don't even have the potential means to succeed. 
Every date suggested before the Minimal Completion Date is wishful thinking, and you're well advised to communicate Status: Red as soon as this becomes visible.

How not to use Minimal Completion Date

Some organizations like best-case forecasts and then make plans based on them.
I don't need to go into the madness of basing a finance forecast on winning the lottery 20 times straight in a row - and planning on best cases is exactly this kind of madness.
You will fail if you plan for the earliest completion date.


Expected Completion

The best way to use our historic velocity is to remove statistical outliers. Unusually large amounts of completed work are normally based on "spill-over" - work that wasn't finished before and therefore wasn't really done within the iteration period. Alternatively, work items might have been unusually fast to complete, and common sense dictates not to consider unusual events usual. Therefore, we create a purified average that eliminates unusually high figures.


Our example would give us an Expected Velocity of 7.5.
Our forecast would be:
  • Due Date Status: Amber.
  • Earliest ETA: 27 Iterations.
  • Project Scope: 80%
This means that we can commit to delivering either 80% of the scope by the due date, or that we need to move the Due Date by 5 Iterations to deliver 100% of the scope. This creates options and opens room for negotiation.


How to use Expected Completion

The realistic completion date is what we can communicate to the outside world with a decent amount of confidence. Unpredicted events that are not too far out of the norm should not affect our plan.
While many stakeholders would try to haggle around the Expected Completion Date in order to get an earlier commitment, we have to state clearly that every calendar day earlier than forecasted increases the risk of not meeting expectations.
We can indeed reduce the project's scope in order to arrive at an earlier date, and if there is a hard deadline, we can also slice the project into two portions: "Committed until Due" and "May be delivered after Due".
The good news is that in most contexts, this will satisfy all stakeholders.
The bad news is that Part 2 will usually be descoped shortly after the Due Date, so any remaining technical debt spilled over from Part 1 is going to be a recipe for disaster.

How not to use Expected Completion

Some organizations think that the interval between the average and the expected completion date is a negotiation period, and if the due date falls between these dates, they will call it a match.
I would rephrase this interval as "the period that will predictably be overrun".


Worst Case Completion

The absolute worst case is that nothing more gets finished than what we have today - so the more we get done, the more value is guaranteed - but let's ignore this scenario for the moment.

It's realistic to assume that the future could be no better than the worst thing which happened in the past. We would therefore assume the worst case completion to be based on the minimal velocity in our observation period.

Our example would give us a Minimum Velocity of 6.
Our forecast would be:
  • Due Date Status: Red.
  • Earliest ETA: 34 Iterations.
  • Project Scope: 60%


How to use Worst Case Completion

The Worst Case scenario is the risk that people have to take based on what we know today.

Especially in environments where teams tend to get interrupted with "other important work", receive changing priorities or suffer from technical debt, it may be wise to calculate a worst case scenario based on minimum velocity in the observation period.

Worst Case Completion normally results in shock and disbelief, which can be a trigger for wider systemic change: The easiest way to get away from worst case completion dates is by providing a sustainable team environment, clear focus and unchanging priorities.


How not to use Worst Case Completion

If you commit only to Worst Case Date and Scope, you're playing it safe, but you're damaging your business. You will lose your team's credibility and trust within the organization and may even spark the question whether the value generated by your team warrants the cost of the team, so you risk your job.
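
Before we quantify risks, here's a minimal sketch in Python of all four forecast types - my own illustration of the ideas above, using the example project (the post's own figures use coarser rounding):

```python
import math

# Example project from above: 200 points, velocities 8, 6, 20, 7, 9, due in 22 Iterations.
backlog_points = 200
due_iterations = 22
velocities = [8, 6, 20, 7, 9]

average = sum(velocities) / len(velocities)        # 10  -> Averaged Completion
best    = max(velocities)                          # 20  -> Minimal Completion
worst   = min(velocities)                          # 6   -> Worst Case Completion
# "Expected": purified average with the unusually high outlier removed
trimmed = [v for v in velocities if v != best]
expected = sum(trimmed) / len(trimmed)             # 7.5 -> Expected Completion

for label, v in [("average", average), ("best case", best),
                 ("expected", expected), ("worst case", worst)]:
    eta = math.ceil(backlog_points / v)                     # Iterations to deliver 100%
    scope = min(1.0, due_iterations * v / backlog_points)   # share done by the due date
    print(f"{label:10s}: ETA {eta:2d} Iterations, {scope:.1%} scope by Iteration {due_iterations}")
```

Running this yields ETAs of 20, 10, 27 and 34 Iterations respectively - matching the figures above; the 82.5% and 66.0% scope numbers are what the post rounds down to 80% and 60%.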

Quantifying risks

You can derive data as follows from the dates and numbers you have (see the sketch below):

  • The expected completion date is when the project will likely be delivered as scoped.
  • We have a predictable overrun by the duration which the average date moves past the due date.
  • The predictable scope risk by the due date is the full scope minus the expected scope.
  • The maximum project risk is the full scope minus the worst case scope.
  • The maximum project delay is the duration by which the worst case date is beyond the due date.
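
A small self-contained sketch (mine) quantifying these risks with the example figures derived above:

```python
import math

backlog_points, due_iterations = 200, 22   # example project from above
average, expected, worst = 10, 7.5, 6      # velocities derived in the previous sketch

def eta(velocity):
    """Iterations needed to burn down the full backlog at a given velocity."""
    return math.ceil(backlog_points / velocity)

predictable_overrun = max(0, eta(average) - due_iterations)       # 0 here: average ETA is 20
scope_risk          = backlog_points - due_iterations * expected  # 200 - 165 = 35 points
max_project_risk    = backlog_points - due_iterations * worst     # 200 - 132 = 68 points
max_project_delay   = max(0, eta(worst) - due_iterations)         # 34 - 22 = 12 Iterations
```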

Managing risks

We can manage the different types of risks as follows:
  • We increase confidence in our plan by working towards the Expected Date and Expected Scope.
  • We reduce date overruns by adjusting our Due Date towards the Expected Date.
  • We reduce scope risks by adjusting project scope towards the Expected Scope on the Due Date.
  • We reduce project cost by reducing project scope.
  • We reduce project duration by reducing project scope.


Dealing with moving targets

Changes to the project are very easy to manage when we know our project backlog and velocity - as the sketch after this list illustrates:
  • The addition of backlog items:
    • moves our date forecast further to the right,
    • reduces the % scope on a fixed date,
    • increases scope risk and overrun risk.
  • The removal of backlog items:
    • moves our date forecast further to the left,
    • increases the % scope on a fixed date,
    • decreases scope risk and overrun risk.
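
A tiny illustration (my numbers, reusing the example project): adding backlog items simply shifts the forecast inputs.

```python
import math

backlog_points, expected = 200 + 30, 7.5     # 30 new points land in the backlog
print(math.ceil(backlog_points / expected))  # ETA moves right: ceil(230 / 7.5) = 31 Iterations
```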


Agile Project Reporting

Based on the above information, you can communicate the current and forecasted project status in a simple matrix, such as this example:

Descriptor    | Date                       | Scope on Due Date | Status
Due Date      | December 2020              | 100%              | Amber
Change Period | June 2019                  | +5%               | Amber
Expected Date | March 2021 (+3 months)     | 80% (-20%)        | Green
Worst Case    | December 2021 (+12 months) | 60% (-40%)        | Green
Known Risks   | 3 months overrun           | 20-40% missing    | Amber
Fig. 2: An example status report for Agile Projects


This gives stakeholders high confidence that you know what you're doing and provides the aforementioned options of moving the project back to "Status: Green" by moving the Due Date or by reducing scope, or a combination thereof.

Since you have access to your backlog, you can even propose a number of feasible suggestions proactively, for example:

Option 1
1. Accept moderate likelihood of running late.
2. Ensure continued Project funding until completion.

Consequence: Due Date December 2019 might overrun up to 3 months.
Option 2
1. Add Project Phase 2 from January 2020 to March 2020
2. Move Backlog Items #122 - #143 into Phase 2
3. Provide Project funding for Phase 2.

Result 1: Due Date December 2019 met.
Result 2: 100% planned Scope delivered by March 2020.
Option 3
1. Descope Backlog Items #122 - #143

Result: 100% reduced Scope delivered in December 2020.
Fig. 3: An example risk mitigation proposal for Agile Projects


Most likely, stakeholders will swiftly agree to the second or third proposal and you can resume working as you always did.

Outcome Reporting

The above ideas simply rely on reporting numbers on the "Iron Triangle", and in some cases, executive managers ask for this. In an agile environment, we would prefer to report outcomes in terms of obtained value and business growth.

Even when such numbers as above are required, it's a good idea to spice up the report by providing quantified results such as "X amount of new users", "Y amount of business transactions", "Z dollars earned" wherever possible. This will help us drive the culture change we need to succeed.




Closing Remarks

As mentioned in the initial disclaimer, this article is merely an introduction to things you can try pragmatically - and then Inspect+Adapt until you have found what works best.

The suggested forecasting method, risk management and status reports are primitive on purpose. I will not claim that they result in an accurate forecast, because the biggest risk is in the Unknown of our plan: we could be working towards a wrong goal, the known backlog items could be missing the point, or the backlog items could be insufficient to meet the project target.

It's much more important to clarify the Unknowns than to build a better crystal ball around huge unknowns. I believe it's better to keep estimation and forecasting as simple and effortless as possible, and to spend the effort on eliminating the Unknowns instead.

The best possible agile project delivery strategy relies on the early and frequent delivery of value to eliminate critical Unknowns and maximize the probability of a positive outcome.

Frankly, it doesn't matter how long a project takes or whether the initial objective has been met when every Iteration has a positive Return on Investment - and neither does it matter that a project's initial objectives were met when the overall Return on Investment is negative.


Wednesday, January 9, 2019

Setting meaningful goals

Tasks, Stories, Features, Sprints, Products, the company strategy - they all have goals, and each of these goals has an impact on overall success. Beyond SMART and INVEST, which focus solely on the definition of a goal, let's look at some important considerations for the timespan between definition and achievement, i.e. the implementation phase.


The Goal Factors

Here are six factors which will help you define helpful goals:

Clarity

In discussions, people shouldn't be talking at cross purposes. During implementation, we need to know whether an action constitutes progress or distraction, so we can act accordingly. Clarity also affects how we determine whether a goal has been achieved or there is residual effort.

Clear goals offer little space for interpretation when the goal is achieved - and partial solutions are likely a step in the right direction. Unclear goals cause people to constantly stab in the dark.

Significance

A goal should always be so important that success as well as failure have a significant impact somewhere. If there is no impact, other things are probably higher on the list. Significance depends on the bigger picture. For example, a task or a feature are only as important as their contribution to the strategy.

Highly significant goals create a sense of urgency and importance and thereby provide a boost to motivation.

Traceability

Goals exist for a reason. Regardless of whether a goal describes a specific customer need or a strategic objective, there should be traceability of where the goal comes from, what it contributes to and who is involved.

Traceability is a two-way street, so goal definitions need to be traceable downward into implementation as much as they need to be traceable upward into strategy.

Relatability

Every person involved with a goal should be able to relate to this goal. They should be able to figure out their contribution towards success as well as the impact the achievement or failure has on them.

Goals that are highly relatable are much more likely to be achieved than goals which contributors can't relate to.

Constancy

Goals should mean the same thing from the time they are decided until met. In rhetorics, the metaphor of "Moving Goalposts" describes a surefire way to derail any effort to make progress. A goal should not change. If it becomes different, then that is a different goal.

Constancy reduces waste, as detours are avoided. Any shift in goals turns all activity towards meeting the previous definition into waste.

Flexibility

Goals should be as flexible as possible in the ways of achieving a favorable outcome. When a plan doesn't work out, alternatives need to be found without compromising either the goal or its traceable line in the organization.

When circumstances change, flexible goals only require changes to the corresponding action plan, whereas inflexible goals might cause inefficient replanning and adjustment at multiple levels.

The importance for management

Goals which meet these six factors can be monitored effectively, thanks to enhanced transparency in the following dimensions:
  • Success - was the goal met?
  • Progress - are we getting somewhere?
  • Blockages - are we moving?
  • Delays - what affects what else?
  • Waste - are we on the right track?
These items are equally important to individuals, self-managing teams and the overarching organization.

The importance for workers

Goals which meet these six factors will help workers in multiple ways. Working on such goals boosts:
  • Motivation - why is our work important?
  • Alignment - are we talking about the same thing?
  • Creativity - which options do we have to contribute?
  • Performance - how far have we gotten?
  • Accomplishment - what have we achieved?


Summary

Regardless of whether you're setting goals for the day, for the Sprint, for a Product or a Project - for a transformation program or for the entire organization, keep in mind that these goals should offer:
  • Clarity
  • Significance
  • Traceability
  • Relatability
  • Constancy
  • Flexibility
Without creating yet another futile metric, goals which do better on these six items will be more likely to contribute to your overall organizational success than others.

And most of all, if you miss setting goals - you're losing out both on the factors and on the benefits.


Thursday, May 3, 2018

The "Technical Triangle" of Effort Distribution


"The TQB Iron Triangle" has long since been overhauled. Still, there is a tradeoff that we need to make - and we all do this, every day. Unconsciously. The Effort Distribution Triangle is one way to bring the tradeoffs we make into visibility, so we can have a meaningful discussion whether we're making the right choices.


The Model

The label indicates "We aren't investing into it" and the opposing line indicates "Everything invested into resolving this".
The normal condition would be somewhere in the middle.

We have 100% of our effort to distribute. We can invest it any way we choose into delivering features, improving technology and discussion about goals and methods.
Regardless of how we distribute our effort, 100% is 100% - so choose wisely!

Opportunity Cost

By delivering features that might add value to our product, we do what development teams are supposed to do: create tangible business value. We always need to keep sight of business value, as this is the purpose of doing development work.

At the same time, excessive focus on delivery might cause us to neglect the underlying technology or communication.
In turn, by spending too much time aligning and optimizing our technology, we risk losing the time to innovate and deliver things others need. This is our opportunity cost.

Technical Debt

Phrased positively, "Continuous attention to technical excellence and good design enhances agility."
It allows us to get more things done faster, and make changes with less effort.

Taking a pessimistic view, everything that isn't as good as technically possible can be called "technical debt". We always have technical debt, so at best we can control whether we consider the debt pressing or not.

Sometimes, we just want to get something through the door, and we're cutting corners on a technical level. We might have skipped the occasional refactoring, or anything like that.
Without accusation: I have observed many developers who prefer spending time on improving technology over time spent in meetings. While there are bad meetings, the consequence might be that some people don't understand things they should understand - communication debt!


Communication Debt

As mentioned in another article, "communication debt is the communications we should have had that we didn't have".
Good communication provides alignment, transparency and clarity of purpose. It's the basis for autonomy and self-organization within a company.

Communication takes time. Scrum, for example, dedicates five Dailies per week, plus one Planning, a Review and a Retrospective, plus occasional Refinements - a total of roughly 15% of your calendar spent on pure communication, and it's incredibly difficult to function as a team by going any lower. And that doesn't even include work floor communication, such as whiteboard design sessions, pair programming, code reviews and whatever else.
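
As a quick sanity check on that 15% figure - my own back-of-the-envelope numbers, assuming a two-week Sprint of 80 working hours and the usual Scrum timeboxes:

```python
daily      = 10 * 0.25   # ten 15-minute Dailies -> 2.5 h
planning   = 4.0         # Sprint Planning
review     = 2.0         # Sprint Review
retro      = 1.5         # Sprint Retrospective
refinement = 2.0         # occasional Refinements, averaged out

total = daily + planning + review + retro + refinement   # 12.0 h
print(total / 80)        # 0.15 -> roughly 15% of an 80-hour Sprint
```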

Communication effort is an integral part of your total capacity to do things.
While some effort put into communication is in fact part of your ability to optimize both delivery and technology - there is also communication aside from that: Like letting other people know what you're doing, letting them learn from you or learning from them.



Make your choice

Take your pick anywhere on the triangle. Discuss the long-term consequences of this choice with your team. This choice is never once-for-all, you can always change it. Yet, there are choices of strategic nature and choices of tactical nature.
Strategically, a delicate balance is most suitable to maximize sustainability. Tactically, the team might decide to go all-in on technical improvements or feature delivery. That's not viable in the long run - so the Scrum Master and Product Owner should keep an eye on the triangle so it doesn't get too lopsided for a prolonged period of time.

Closing question

Who in your organization is making that choice for your team - and who is aware of the consequences of this choice?

Sunday, April 29, 2018

Environmental influence on Planning

"How can we make an Iteration (Sprint) plan to deliver an Epic when we still have too many Unknowns?" - this question plagues many teams on their journey to agility. Let's explore!

Someone asked me this question, and I have observed a similar thing in other teams before. In a (formerly) Waterfall-oriented environment, a developer claimed, "We can't tell you how much effort this is until we have the Specification document!" It's the same question, just in a different form.

Environmental influence

There are so many environmental factors contributing to this statement being perceived as a problem that needs to be solved that I am tempted to write an entire book about it. Without digging any deeper, here are just a few factors which hint that the initial question is merely the symptom of a more severe problem:

  • Fear. For example, fearing the Unknown, fearing to not deliver, fearing to fail, fearing to disappoint - the stronger the fear element is, the more likely people will demand a reliable plan.
  • Ambiguity intolerance. People or structures with low ambiguity tolerance prefer a bad plan (doing the wrong thing right) over no plan (doing the right thing).
  • Priority issues. If we're really working on the most important thing, the question isn't as much how long it takes as what is the best way forward.
  • Alignment issues. An organization which expects teams to create reliable plans upfront shouldn't be confronting development teams with completely unknown topics just a few days before visible results are expected.
  • Slicing issues. There are always opportunities to deliver something based on an upfront plan and show some form of result, although the total amount of work required to get the expected final result doesn't decrease by putting more effort into creating thin slices.
  • Scoping issues. An Epic that takes more than an Iteration can't be planned down to a Sprint. The best thing we can do is deliver a portion of the work.
  • Push process. When stakeholders push work into the team and expect a specific result in a fixed time, we end up with the typical "Iron Triangle problem": Fixed time, fixed budget, fixed scope. What's left is a compromise on quality.
It should be the responsibility of management, ScrumMasters and coaches alike to create an organization where these issues aren't so relevant.

All that said, there's always the chance that something happened and we need a kind of solution quickly. In the first step, it's important to understand which of the above environmental constraints we have, and how strong or weak these are in relation to the team's work.

For example, if the fear factor is high on the team's side, we need to approach planning and delivery differently from how we would if the main issue is unfamiliarity with slicing.



In my next article, I will explore some techniques - in complete disregard of these external constraints - that can help a team confronted with planning a completely unknown topic.


Sunday, September 10, 2017

DIRFT is misleading - Philip B. Crosby got it wrong!

Maybe you've heard the slogan "Do It Right the First Time" (DIRFT) before? Well, I have - and in the past, I firmly believed in it. No more. Let's explore this together.

What is DIRFT

To quote Wikipedia,
Crosby's response to the quality crisis was the principle of "doing it right the first time" (DIRFT). He also included four major principles: 
  • The definition of quality is conformance to requirements (requirements meaning both the product and the customer's requirements)
  • The system of quality is prevention
  • The performance standard is zero defects (relative to requirements)
  • The measurement of quality is the price of nonconformance 
His belief was that an organization that establishes good quality management principles will see savings returns that more than pay for the cost of the quality system: "quality is free". It is less expensive to do it right the first time than to pay for rework and repairs.


Problematic assumptions in DIRFT

DIRFT is based on four "principles", which I have to reduce to the level of "assumption" - because they are somewhere between shortsighted and incorrect!

#1 - Quality isn't conformance!

There are at least three major problems hidden in the very simple statement "quality is conformance to requirements" - and each of them can be disastrous!
Can we assume that people can objectively and comprehensively phrase all requirements at all - much less before validating them against the outcome?

Objectivity: I leave it to you, dear reader, to determine what happens when those who specify, those who implement and/or those who verify the requirement each understand it differently! I then challenge you to create a surefire method of detecting this before even beginning with the implementation.

Comprehensiveness: The need for objectivity would force us to create an explicit, comprehensive list of requirements. Remember: "Quality is conformance to requirements" - and if it's not defined, it's not part of quality!
I now challenge you to create a comprehensive list of requirements for even the most mundane item of daily life: a piece of paper! Don't forget that it should be fit to write upon (don't forget to specify the type of ink that may be used and at what temperature - maybe you should also include how the paper should behave when an eraser is used?)

Validation: Validation occurs on many levels; let's just get into the kind that DIRFT considers implicit: how do you ensure that requirements are specified correctly, comprehensively and consistently? Of course, you validate them. So - if you want to build the product right the first time, you have to absolutely and unconditionally ensure that each and every requirement, as well as their entirety, is valid. Only then do you have a chance to meet them.
I leave it to you to figure out a method of ensuring that a product's requirements are exhaustively valid without having even started implementation!


#2 - You can't prevent all defects

DIRFT assumes that all potential sources of defects are known and controllable. Those are another two strong assumptions that hardly hold up to a practice test. Honestly: what good is a product that is "of high quality" if it simply doesn't work the way the customer expects it to?


Omniscience: There's a proverb, "If you make something foolproof, someone will make a better idiot." Google for "Use for intended use only" - a helpless plea of designers who already realized that once the product is in the customer's hands, people will get ideas that nobody could ever anticipate!

Omnipotence: You may have gotten it completely right, but because the world behaves differently from how designers anticipated, the product still doesn't work. The only way to prevent that kind of defect is by making the product so robust that it'd effectively be omnipotent.


#3 - Zero defects are arbitrary

As explored above, requirements are an incomplete subset of what makes a product valuable, even potentially incompatible with the definition of value. By adding any implicit requirement (such as: "paper shouldn't be lethal to the touch"), the product may appear defective again. The only chance to fix the defect is by making changes - both to the design, and to the product!

Finality: One of the core, hidden and implicit assumptions of DIRFT is that there is ever a point in time where the final design is "known", and that this point in time lies before the beginning of implementation.

Perfection: Along with the finality of the design comes the assumption that this design is "perfect" in relation to whatever requirements exist. While one can reasonably argue that a "good enough" design can be procured, one will be hard-pressed to argue that the first attempt will result in a design that leaves no room for improvement anywhere.


#4 - Nonconformance is a pointless measure

As a tester, I always found joy in discovering nonconformance - more so if the nonconformance was somehow "critical". It didn't take me much longer than a few days into my first job to discover that nobody cared about nonconformance in the presence of other business-relevant metrics, such as the cost of delay or the cost of change.

Time-Value discount: The above statements come into play again. With every added requirement, the implementation effort increases. It's possible to conform to 1% of requirements now, and to 100% at an undefined point in the future. Getting it right the first time assumes that the Time-Value discount is Zero. This is diametrically opposed to the idea that "the marginal value of every technology is Zero", i.e. as time progresses, the ROI on technology decreases.

Measuring nothing: The entire hypothesis of DIRFT is that somehow, magically, the created product is done right the first time - i.e. the idea of measuring nonconformance is supposed to measure an empty set. Does it sound odd to anyone else to base a core business metric on an empty set? The mere idea of measuring nonconformance implies that Crosby was somehow aware that this is a fictional, impractical approach!


Okay, having laid the groundwork that pretty much every single word of "Do it right the first time" relies on unfounded assumptions, let's shred the final conclusion:


Quality is never free!

DIRFT assumes that when someone somehow has looked into their crystal ball to create a flawless design of a perfect product and those creating it somehow have all the required time and money on their hands to build something that matches this design, then somehow magically, "quality is free".

Let's ignore even the (infinite) costs of building, and ponder for a minute how much time and money is required to produce the perfect design which then just needs to be "done right". For any design document you produce, I will find at least one unlisted requirement, at least one relevant edge case, at least one "bad case" that can wreck the product. And the longer your requirements document is, the easier this will be.


Conclusion


We end up with "quality is free" only when we invest an infinite amount of time, effort and money into upfront work - otherwise, the entire idea of "DIRFT" isn't even applicable.

"Do It Right the First Time" is a nice ideology, but impractical for anyone doing product work, where deadlines, budget constraints and uncertainty about the customer's needs are the key constraints.
In development work, quality is an optimization goal, and while it makes a lot of sense to maximize quality, it is never free - quality is where the money and effort go!


Saturday, February 4, 2017

Guide: Sprint Planning with the Task Design Board

When attending planning sessions, I often realize that teams are stumped by the question, "What is a meaningful task?" They come up with the standard tasks - "It needs to be developed, tested, deployed" - well, that's true. Yet it's not very value-adding. And it makes the planning meeting quite boring.
Here is a suggestion for how you could approach task extraction in a way that adds a significant amount of value - and fun.

An example Task Design Board


The task design board

Let's check out what a "task design board" is: instead of huddling around a meeting table and looking at a projector, developers make their work visual. They draw a design model representing their area of work - in our case, the architecture. Then, they add task cards in places where they intend to do work, denoting what they want to change.

Step 1: Create a model

As we discussed above, in our case, we created an architecture model. Depending on what your backlog items are, you may also look into causal-loop diagrams, entity-relationship models, flowcharts or whatever floats your boat (pun intended). The model doesn't matter - as long as everyone understands why the model is useful and agrees that the model adequately displays the problem which the team is working on.
It's important that people understand the difference between "what is" and "what will be", because that is where the work-to-do comes in. In some cases, "what we still need to explore" is also a valid option. A suggestion may be to use different colored pens or small icons to denote these differences. 

Step 2: Defining tasks

After we know where we need to do some work, we add tasks which define what work we intend to do in order to move "from here to there", i.e. to the desired future state that solves the problem at hand. In the first draft, tasks can be very crude, such as "Website" - which is enough to indicate that there's work to be done.

Step 3: Refine tasks

This step is completely optional in a cross-functional team with close collaboration. If the team consists of specialists with limited interaction points, some tasks may need to be refined in order to be "workable". Let's take our example "Website". Maybe we need to break that down into "Set up webserver", "Create HTML", "Create CSS". The key in task refinement is not to create micro-slices of work, but task slices that can be delivered without causing handover delays in the process. The level at which you slice is defined by the team's skill distribution.

Step 4: Slice tasks

This is yet another optional step, depending on how quickly the team delivers. To decrease the risk and impact of blockages, tasks should not take more than 1-2 days. If the team has discovered that some task cards they created are probably significantly larger than 2 days, it may be a good idea to slice them down even further using techniques such as FURPS+, Impact Mapping or Specification by Example.

Step 5: Reduce tasks

You've probably run wild creating 100 tasks for a single Sprint now. Congratulations! That was fun. Now, we need to get pragmatic: Which of these tasks are really needed - and how many of them can we even do in a Sprint? Keep value and simplicity in mind. Your most important tool in this step is the trashbin, where all tasks go that have been created in a design frenzy. 

Step 6: Order tasks

After you know what you really want to do, put tasks into a sensible order. A few constraints in ordering will go a long way.
1 - Cluster around value: Put tasks together that result in deliverable, visible customer value.
2 - Arrange by value: Start with the clusters that deliver the highest customer value.
3 - Arrange by feasibility: Inside the cluster, first do the tasks that could be done right now, i.e. that don't have unmet prerequisites.

Step 7: Assign tasks

This step is highly debatable. I personally don't recommend doing this in Planning already - yet some teams find value in it. Take a look at the task cards and ask around who will do what.
Task assignment needs a few rules to work out properly:
1 - Pull: People take tasks. Nobody is allowed to "give" a task to anyone. 
2 - Consider Capacity: Nobody should take more tasks than they are confident they can handle.
3 - Think team: Look for ways to collaborate and do tasks together in order to proceed faster.

Step 8: Confidence vote

To conclude, let everyone take a step back and take a look at the created Sprint Plan. 
Does anyone see some unfeasible tasks? Is someone overburdened? Have we missed something? 
Everyone raise their hands in a display of confidence: Can we do this?
5 fingers = Yes!!!
3 fingers = I have some concerns.
0 fingers = No way.
Numbers in between are OK.

If we get less than 4 fingers from most team members, we should discuss and resolve the problem.
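To make the rule concrete, here is a minimal sketch of how such a vote could be evaluated - the function name and the "majority" threshold are my own reading of the rule above, not an official formula:

```python
# Minimal sketch of evaluating a fist-of-five confidence vote.
# Threshold interpretation (a majority must show 4+ fingers) is an assumption.

def needs_discussion(votes: list[int]) -> bool:
    """Re-open the discussion if fewer than half the team show 4 or 5 fingers."""
    confident = sum(1 for fingers in votes if fingers >= 4)
    return confident < len(votes) / 2

print(needs_discussion([5, 4, 4, 3, 2, 5]))  # False - most of the team is confident
print(needs_discussion([3, 3, 4, 2, 5, 3]))  # True - only 2 of 6 show 4+ fingers
```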



Summary

Are you fed up with boring projector shows and ineffective planning sessions?
Try the Task Design Board as a simple, effective and energetic low-tech tool for running a Sprint Planning, tapping deeply into the brains of developers to come up with a feasible, exciting plan.

Monday, January 30, 2017

User Story Writing - what does that mean?

I keep hearing the question "How to write good/better user stories?". A quick search on Google reveals over 25 million hits, with the first page linking to Scrum gurus such as Mike Cohn and Roman Pichler, giving examples of "good user stories" and guidelines for writing them.
Let's dig deeper. You want to write better user stories? What does the term "user story" even mean?
"I don't think it means ..." courtesy of memegenerator.net

The Connextra template

As a <ROLE> I want <FEATURE> so that <REASON>
Some folks at Connextra figured out in 2001 that this template was a good way of formulating user stories. It helped them solve their problem - and now people treat it as a (near) essential part of Scrum.
An entire industry has been created helping Product Owners "write better user stories" based on this template. There are many good tips around, including clarity of reason, specific acceptance criteria, separation of concerns and many others. All of them miss the point. The Connextra template is an "agile template for writing requirements". Using that template will not result in a "User Story".

So, what's a user story?

As heretical as this may sound: can you imagine a user telling a story to the developers?
Someone has a problem or a need and talks about it. We decide to create software to solve this problem.

What's the PO's role in that?

The PO has the main responsibility of deciding which item in the backlog is the most valuable and should therefore be delivered first. For this, it's a good idea to understand what the user's problem is, how big it is - and how much value the user gets from having it solved. This means you need to listen to the user and ask questions that help in the prioritization process.
It's OK to act as a mouthpiece for the user in cases where users don't have a voice.
In other cases, the PO has the responsibility of ensuring that developers get first-hand information and a thorough understanding of the problem.

What's the team's role in that?

In Japan, there is a philosophy that it's the student's responsibility to understand their teacher. In Western circles, students blame the teacher when they don't understand. Let's just say that a blame culture helps nobody - and that users often don't understand why they have the problem they are facing.
So, the team has the responsibility of figuring out what the user means. And there is no better way of figuring that out than by interacting and discussing with the very person who is concerned.
Rather than point fingers at the PO for requesting better stories, the team should learn to understand their users. 
Asking questions is a plausible way of learning. Creating common models is another. Blaming leads nowhere.

Experiences?

When I work as Product Owner, I'm not stuffing information into computer-aided ticket systems. And I'm not "writing user stories" at all. I create doodles. Every story card I create is a doodle. And when we get around to it, the first question my team asks is: "What do you mean by this one?"
That's where the discussion starts. It ends when we're all on the same page. 

Take-Away

If you want to write requirement specifications, please do so. Just don't call them "user stories".
If you want to work with user stories, try telling stories and asking questions.





Wednesday, January 11, 2017

Do you really want high utilization?

Let's end the discussion about whether we should optimize for maximum utilization right here, right now - with a metaphor. Ponder your own answers for the questions.

Your features are the cars. Your teams are the lanes.

Lane 1 is optimized for maximum utilization (80%+).
Lane 2 tries high utilization (50%).
Lane 3 actively minimizes utilization (as close to 0% as possible).


Question: If your goal is to get from A to B as fast as possible - on which lane would you travel?

Question: What happens when a car suddenly needs to brake? (i.e. an impediment occurs)

Question: What happens when a car needs to enter your lane? (i.e. new information becomes available)

Transfer-Question: What is the fastest way to obtain business value in product development?

Concluding Question: Since minimal time-to-market maximizes ROI - which utilization strategy should you pursue?
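If you want numbers behind the metaphor, basic queueing theory makes the effect of utilization on lead time explicit. Here is a minimal sketch using the classic M/M/1 model - an assumption, since neither traffic nor product development is that tidy, but the shape of the curve holds:

```python
# Minimal sketch: average time spent in an M/M/1 queueing system as
# utilization (rho) rises. Assumes Poisson arrivals and exponential
# service times - a simplification, but the trend is what matters.

def time_in_system(service_rate: float, utilization: float) -> float:
    """W = 1 / (mu - lambda), with lambda = rho * mu."""
    arrival_rate = utilization * service_rate
    return 1.0 / (service_rate - arrival_rate)

for rho in (0.0, 0.5, 0.8, 0.95):
    w = time_in_system(service_rate=1.0, utilization=rho)
    print(f"utilization {rho:4.0%} -> time in system: {w:5.1f}x the service time")
```

At 50% utilization, a car (or feature) spends twice its raw service time in the system; at 80%+ it takes five times as long, and the curve explodes from there - which is exactly why lane 3 wins.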

Tuesday, January 10, 2017

Normalized Story Points - what's that?

SAFe4 suggests that Story Points should be normalized across the Release Train. Additionally, it provides a method for estimating the first Sprint that could be considered inconsistent with the idea of Story Points. Let us take a closer look at the idea.


What are Story Points?

Story Points are, in short, an arbitrary measure quantifying the expected effort to get a backlog item "Done". They are expected to help the team plan their capacity for each iteration and to give the Product Owner a rough understanding of how much the team might be able to deliver within the next few months. This can be used, for example, to calculate estimated Release dates and/or scope.

There is an additional purpose that Mike Cohn also suggests in his blog: When you know the number of Story Points that can be completed per Iteration, you can assign cost estimates to backlog items, helping the PO make better business decisions.
For example, a backlog item might turn into a negative business case once the cost is known, and can then either be reworked for better ROI or discarded entirely.

SAFe picks up this idea in the WSJF concept, i.e. prioritizing features that have a good ROI/effort ratio.
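A hypothetical sketch of this arithmetic - the iteration cost, velocity and backlog numbers below are invented for illustration, and WSJF is reduced to its simplest "Cost of Delay divided by job size" form:

```python
# Hypothetical numbers, invented for illustration - not from SAFe or this post.
ITERATION_COST = 20_000  # fully loaded team cost per iteration (EUR)
VELOCITY = 40            # Story Points the team completes per iteration

cost_per_point = ITERATION_COST / VELOCITY  # -> 500 EUR per Story Point

def wsjf(cost_of_delay: float, job_size: float) -> float:
    """WSJF in its simplest form: Cost of Delay divided by job size."""
    return cost_of_delay / job_size

backlog = {"Feature A": (8_000, 13), "Feature B": (5_000, 3)}  # (CoD, size in SP)
for name, (cod, size) in sorted(backlog.items(), key=lambda kv: -wsjf(*kv[1])):
    print(f"{name}: WSJF = {wsjf(cod, size):7.1f}, est. cost = {size * cost_per_point:,.0f} EUR")
```

Feature B jumps ahead of Feature A despite its lower Cost of Delay, because it is so much cheaper to build - which is the whole point of the ROI/effort ratio.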

The most important thing about Story Point estimation is that every member within the team has an understanding of what a Story Point means to the team. It can mean something entirely different to another team, hence caution should be exercised when Story Points are referenced outside the team.


What are Normalized Story Points?

SAFe's delivery unit is the Agile Release Train (ART), effectively a "Team of Teams".
Just as a Story Point is intended to mean the same thing to one team, it should mean the same thing within a Team of Teams.
Otherwise, the Product Manager would receive different estimates from different teams and would be completely unable to use these estimates for business purposes. This would render the estimation process useless and the estimates worthless.

As such, SAFe suggests that just as the individual members of a Scrum team need a common understanding of their Story Points, the ART's individual teams require a common understanding of their Story Points to make meaningful estimates.

Why can't each team have their own Story Points?

In SAFe, all teams on the Release Train pull their work from a single, shared Program Backlog. This Program Backlog serves to consolidate all work within the ART, regardless of which team will actually pull the work.
A key Agile concept is that the work should be independent of the person who does it, as specialization leads to local optimization.
From a Lean perspective, it is better if a slower team starts the work immediately than to wait for a faster team.

Especially when cross-team collaboration is an option, the slower team can already deliver a portion of the value before the faster team becomes available to join the collaboration. This reduces the overall time that the faster team is bound and hastens final completion.

If Story Points differ among teams, every single backlog item might need to be estimated by every single team just to see which team would take how long to complete the item. This type of estimation is possible, yet it leads to tremendous waste and overhead.

If Story Points are normalized across teams, it is sufficient to get a single estimate from a single team, then look at the velocity of each team to get an understanding of which team would take how long.
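As a hypothetical sketch of that arithmetic (the team names, velocities and the estimate are invented):

```python
# One normalized estimate, produced once by a single team.
ITEM_ESTIMATE = 21  # normalized Story Points

# Each team's velocity in normalized Story Points per iteration (invented).
velocities = {"Trolls": 42, "Badgers": 28, "Falcons": 55}

for team, velocity in velocities.items():
    share = ITEM_ESTIMATE / velocity  # share of one iteration's capacity
    print(f"Team {team}: item consumes {share:.0%} of one iteration")
```

No re-estimation is needed per team - the same 21 points simply translate into different shares of each team's capacity.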

Another benefit of normalized Story Points is that when Team A needs support from Team B to meet a crucial deadline, Team B's Product Owner knows exactly how much to drop from their backlog in order to take on some stories from Team A, without wasting effort on re-estimation.

How does SAFe normalize Story Points?

In the first Program Increment, the ART is new. Both the individual teams and the ART consist of members who have not collaborated in this constellation before. Teams are in the "Storming" phase - as is the ART itself.
This means Working Agreements are unclear. The Definition of Done is just a vague ideal that hasn't been applied before and might have unexpected pitfalls. Depending on the product work, the environment may also be new and unknown. Effectively, the teams don't know anything about how much work they can do. Every estimate is a haphazard guess.

One approach might be to have a discussion first to identify a benchmark story, assign benchmark points and work from there. This discussion will lead to further discussions, all of which provide no customer value.

To avoid this, SAFe suggests the following approach:


Start with Relative Estimates

In the first PI Planning, teams take the smallest item in their backlog and assign it a "1". Then, using Relative Estimation (based on Fibonacci numbers), they assign a "2" to the next bigger item, a "3" to an item that is slightly bigger than that one - and so on. Once they have a couple of references, they can say "about as much as this/that one".

Of course - all of this is guesswork. But it's as good as any other method in the absence of empirical data. At least teams get to have a healthy discussion about "what", "how" and potential risks.
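A minimal sketch of that bucketing step - the scale below is the modified Fibonacci sequence commonly used in estimation poker, and the backlog items and relative sizes are invented:

```python
# Sketch of relative estimation: snap a gut-feel size ("times bigger than
# the smallest item") to the nearest bucket. Items and sizes are invented.
FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13, 20, 40, 100]  # modified Fibonacci

def to_story_points(relative_size: float) -> int:
    """Pick the bucket closest to the team's relative-size guess."""
    return min(FIBONACCI_SCALE, key=lambda bucket: abs(bucket - relative_size))

backlog = {"Login form": 1.0, "Search": 2.8, "Reporting": 11.0}
for item, size in backlog.items():
    print(f"{item}: {to_story_points(size)} SP")
```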


How is Velocity calculated based on Normalized Story Points?

Again, in the first Program Increment, we have absolutely no idea how many Story Points a team can deliver. Since we can produce rough person-day estimates, SAFe suggests a very simplistic approach for the first PI Planning:

We know how many team members we have and we also know how many days they *expect* to be working during the Iteration. (nobody knows when they will be sick, so that's a risk we just take).

A typical SAFe Iteration is 2 calendar weeks, so it has 10 working days. We multiply that number by the number of team members.
Base Capacity = 10*Team Members

From that iteration capacity, we deduct every day of a team member's absence. 
Adjusted Capacity = Base Capacity - (Holidays * Team Members ) - (Individual Absence)

Finally, we deduct 20% - as planning for 100% utilization is planning for disaster. We round this down.
Initial Velocity = Adjusted Capacity * 0.8

Here is an example:
Team Trolls has 6 developers. There is a single day of vacation and Tony needs to take care of something on Friday.

Base Capacity = 10*6 = 60 SP
Adjusted Capacity = 60 SP (Base) - 1*6 SP (Holidays) - 1 SP (Absence) = 53 SP
Velocity = 53 SP * 80% = 42.4, rounded down to 42 SP

So, Team Trolls would plan Iteration 1 with 42 Story Points. If the numbers don't add up, it's better to err on the lower side than to over-commit. They might choose to fill the Sprint with 39 Points, for example.
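As a small sketch, the rule of thumb above can be written as a few lines of code (the function name and parameters are mine; the formula is the one from the text):

```python
# SAFe's initial-velocity rule of thumb, transcribed from the formulas above.

def initial_velocity(team_members: int, holidays: int, individual_absence: int,
                     iteration_days: int = 10, buffer: float = 0.8) -> int:
    base_capacity = iteration_days * team_members
    adjusted = base_capacity - holidays * team_members - individual_absence
    return int(adjusted * buffer)  # round down: planning for 100% is planning for disaster

# Team Trolls: 6 developers, 1 holiday, Tony out on Friday.
print(initial_velocity(team_members=6, holidays=1, individual_absence=1))  # -> 42
```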


What happens to Velocity and Normalized Story Points over time?

In Iteration 1, we merely guessed. Guessing is better than nothing. We learn, inspect and adapt. For example, Team Trolls has discovered that they can slice through Stories like butter and can take on more points in the future - while Team Badgers has discovered they need to do significant support work for other teams (such as knowledge transfer), slowing them down. They would then take on fewer Story Points in subsequent Sprints.

Here is a sample of how an ART's velocity may develop over time:

Tracking velocity over time

As we see in this example, teams Inspect+Adapt their own plan, feeding useful values back to Product Management to Inspect+Adapt the overall PI Plan and (if applicable) the Release plans.

No re-estimation of the previously estimated backlog will be needed. As new work becomes available, "Done" Stories can be used as benchmarks for additional backlog items, keeping them in line with the current backlog.



Caution with Normalized Story Points

Story Points are not a business metric. Neither is Velocity. They are simplified planning metrics intended to minimize planning effort while providing sufficient confidence in the created plan.
The metrics are subject to the same constraints as in single-team Scrum, i.e. the following anti-patterns need to be avoided:

Do not:
  1. Assume estimates are ever "correct". They are - and remain - estimates.
  2. Measure progress based on "Story Points Delivered". Working Software is the Primary Measure of Progress.
  3. Compare teams based on their velocity. Velocity is not a performance metric.
  4. Optimize the ART structure based on velocity figures. An ART is a highly complex adaptive system.
  5. Try to maintain a constant/increasing velocity. Capacity planning is intended to minimize the risk of failure and is subject to reality. Velocity is just an indicator to improve the reliability of planning.


Conclusion

The normalization of Story Points solves a problem that does not exist in a non-scaled environment, i.e. the question "What happens to overall progress when another team takes on this backlog item?"
This helps the ART shuffle backlog items among teams in order to maximize for overall product value, rather than team utilization.

In the absence of better information, we use a crude rule of thumb to get the first Story Point figures onto our backlog. Once we have completed stories, we can determine which of them are useful as reference points. The initial tie between a Story Point and a developer-day moves towards a rather intangible virtual unit very quickly - and this must happen, because the understanding of a Story Point needs to remain consistent across teams.

In the absence of better information, we use a crude rule of thumb to get an initial team velocity. When we have completed an iteration, we use the real results as reference points for the future.
Within a few iterations, velocity's correlation to capacity-days shifts towards the intangible virtual unit of Story Points that are disconnected from time. This must happen to maintain Velocity as a functional, consistent planning tool in the ART.

In an ART, it is even harder than in Single-team Scrum to resist the urge to evaluate teams based on Velocity. The RTE has the important responsibility to maintain the integrity of Story Points by stopping any attempts (usually by management) to abuse them.

Friday, May 13, 2016

Story slicing 101

A recurring theme in agile transformations is: "Our stories are too large, and we can only deliver all or nothing - we have no idea how to slice them". The consequences? Teams can only do a few large stories; variation and failure probability are high, while predictability and flexibility are low. None of these outcomes are desirable from a business perspective - regardless of agility. Story slicing solves this.

Since we will be dealing with fairly abstract concepts, let us create a specific example to make the subject more tangible. Let us start with the "too big" story:
As a user of the platform, I want to have a response time of less than 2 seconds, so that I can spend more time actually making progress.
This is a typical case of a nonfunctional requirement, phrased as a user story with a clear success condition and a clear indication of business value. Unfortunately, for large systems, this pretty much means rewriting the entire code base - there is no way to get this done in a few days!

Step 1 - Start asking questions

The above story leaves plenty of room for interpretation. The first misunderstanding is that "we have to do everything, otherwise it might not work". Teams end up creating seemingly endless to-do lists of changes that all need to be made. But they don't really ask questions.
A good way is to break the team up into sub-groups in a Refinement session and let them build a model around the story, helping them discover questions independently.
Here is what the team might come up with:
Which type of users do we have?
Do admins have the same needs as application users?
Does it really hurt if user creation takes a bit longer?
Which function actually takes the longest?
Is 2.1 seconds a problem?
We could make transactions faster by splitting them into multiple minor steps, but that means more clicks - would the users accept that?
Regardless of what questions come up, what you need is these questions written down explicitly - not a detailed discussion to answer them (yet).

Step 2 - Bring the questions together

Different sub-groups will probably discover different questions. There is no "right" or "wrong" at this time, only different models, resulting in different questions. All questions are good, because they reveal how people think. By having each group bring their questions to the board, we can start to cluster questions. Most likely we will have more than one cluster.
For example:
Users and their needs
Function specific boundaries
Worst-case scenarios
What you need now is not the similarities within the clusters, but the differences between the clusters.

Step 3 - Slice based on clusters

Slicing is best done across differences, but keep real user value in mind. None of the slices claims to solve the entire problem; each will contribute a meaningful partial solution. For the moment, let us "forget" about the overall problem and focus specifically on partial delivery.
Here are examples of user-relevant, deliverable stories extracted from the basic story:
As a new user, I want to create a new account in less than 2 seconds.
As admin, I want to wipe a user account in less than 2 seconds.
As transaction user, I want to complete a transaction in the system in less than 2 seconds.
These stories are still quite different in size, but they are much easier to handle than the entire block. After we are "Done" on the first two stories, there is still a large amount of work to be done - but also a tangible result. Some of these stories might be discarded immediately, because the team realizes that this specific need is already met.

Step 4 - Drill in, Rinse + Repeat

Looking at our example, most of the work will probably be in the third sub-story. We can drill into this sub-story in exactly the same way we drilled into our initial story. Drill-in can be channeled by moderating the team to look for specific aspects.
Here is a small list of aspects to look for:
  • Workflow: Steps, user goals, scenarios
  • Transactions: activities, operations (CRUD)
  • Users: personae (user types), roles, responsibilities
  • Technology: configuration, context, data streams (& interfaces)
  • Data: content, types, subsets
With this list, you could instruct one group to look for workflow aspects and another group might examine data.

Here is an example of what the "data" group might come up with:

  • 60 second timeout when the database is down.
  • Stuck in a "Waiting" dialog when the Internet connection is unstable.
  • Mass update speed is proportional to amount of updates.

Step 5 - Verify & Engage

Depending on how far you take this, you can slice down any large topic to as many small topics as needed until the team arrives at the following two conclusions:

  1. We have discovered relevant areas for change
  2. We can resolve a few stories in a fairly short amount of time
Put the most important, workable stories high in your backlog, preferably starting with the first stories right in the next sprint - and sort the remaining relevant items into the backlog at an appropriate place. If you want to, you can keep the "master story" in the product backlog, but its priority will be lower than that of the lowest identified story.
After all known stories are closed, the master story will pop up again - at that time, the first question is: "Do we still have a relevant problem?"

Conclusion

World hunger stories are common. The most common problem teams encounter is that they dive into the solution space before working on the story definition. However, since the story with its Acceptance Criteria defines "success", it is most important to have a realistic goal in mind. Otherwise, there is no way to succeed.
The next time you encounter a backlog item that is "too large to do in a Sprint", try asking tough questions and slicing along the differences between the questions.