Friday, January 31, 2020

Double Queues for Faster Delivery

Is your organization constantly overburdened?
Do you have an endless list of tasks, and nothing seems to get finished? Are you unable to predict how long it will take for that freshly arriving work item to get done?
Here's a simple tip: Set up a "Waiting Queue" before you put anything into progress.

The Waiting Queue


The idea is as simple as it is powerful:
By extending the WIP constraint to the preparation queue, you get a fully controlled system in which you can reliably measure lead time. Queuing discipline guarantees that, as soon as something enters the system, we can use historical data to predict its expected delivery time.
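
As a rough sketch of the mechanics (class and method names invented for illustration, not taken from any tool): new work always lands in the waiting queue first, and is only pulled into progress when finishing an item frees a slot under the WIP limit.

```python
from collections import deque
from datetime import datetime


class DoubleQueueBoard:
    """Toy model of a board with a WIP-limited "in progress" area
    that is fed from a waiting queue in front of the system."""

    def __init__(self, wip_limit=3):
        self.wip_limit = wip_limit
        self.waiting = deque()   # the "Waiting Queue" in front of the system
        self.in_progress = []    # the WIP-constrained system itself

    def accept(self, item):
        """New work always enters the waiting queue, never the system directly."""
        self.waiting.append((item, datetime.now()))
        self._pull()

    def finish(self, item):
        """Completing an item frees a slot, which pulls the next waiting item."""
        self.in_progress = [(i, t) for (i, t) in self.in_progress if i != item]
        self._pull()

    def _pull(self):
        """Replenish: move waiting items into progress only while under the WIP limit."""
        while self.waiting and len(self.in_progress) < self.wip_limit:
            self.in_progress.append(self.waiting.popleft())
```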

This, in turn, allows us to set a proper SLA on our process in a very simple fashion: the WIP in the system, multiplied by the average service time, tells us when the average work item will be done.
This lets us give a pretty good due date estimate for any item that crosses the system boundary.
Plus, it removes friction within the system.
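
As a minimal sketch of that estimate (numbers invented for illustration), the due date is simply today plus WIP times average service time:

```python
from datetime import datetime, timedelta


def estimated_delivery(wip_in_system: int, avg_service_days: float) -> datetime:
    """Rule of thumb from above: expected completion is roughly
    (items currently in the system) x (average service time per item)."""
    return datetime.now() + timedelta(days=wip_in_system * avg_service_days)


# e.g. 8 items in the system, each taking about half a day on average:
print(estimated_delivery(wip_in_system=8, avg_service_days=0.5))  # roughly 4 days out
```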

Yes, Scrum does something like that

If you're familiar with Scrum, you'll say: "But that's exactly the Product Backlog!" - almost!
Scrum attempts to implement this "Waiting Queue" through the separation of the Sprint Backlog from the Product Backlog. While that is a pretty good mechanism to limit the WIP within the system, it leaves us stuck with an SLA of "1 Sprint" - not very useful when it comes to production issues or to optimization!
By tuning your Waiting Queue mechanics properly, you can shrink your replenishment interval to well below a day - which breaks the idea of "Sprint Planning" entirely: you become much more flexible, at no cost!

The Kanban Mechanics

Here's a causal loop model of what is happening:


Causal Loops

There are two causal loops in this model:

Clearing the Pipes

The first loop is a negative (balancing) feedback loop - moving items out of the system into the "Waiting Queue" in front of it accelerates the system! As odd as this may sound: keeping items out of the system for as long as possible reduces their wait time!

As an illustration, think of an overcrowded restaurant: by reducing the number of guests in the place and having them wait outside, the waiter can reach tables faster and there is less stress on the cook - which means you'll get your food faster than if you were standing between the tables, blocking the waiter's path!


Flushing Work

The second loop is a positive (reinforcing) feedback loop - reducing queues within the system reduces wait time within the system (which increases flow efficiency), which in turn increases our ability to get stuff done, which further reduces queues within the system.
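
To make both loops tangible, here is a toy simulation with invented parameters. The one explicit assumption is the "overcrowded restaurant" effect: effective capacity drops as more items are in progress at once. Under that assumption, capping WIP and parking the rest in the waiting queue finishes the same batch of work sooner, item by item.

```python
def simulate(total_items=12, wip_limit=None, base_capacity=1.0,
             drag_per_extra_item=0.2, work_per_item=1.0, dt=0.1):
    """Process a batch of identical items, optionally capping WIP.
    Crowding assumption: capacity shrinks with every extra concurrent item."""
    waiting = total_items
    in_progress = []   # remaining work per active item
    lead_times = []    # completion time of each item (all arrive at t=0)
    clock = 0.0
    while waiting or in_progress:
        # Pull from the waiting queue while there is a free slot.
        limit = wip_limit if wip_limit is not None else total_items
        while waiting and len(in_progress) < limit:
            in_progress.append(work_per_item)
            waiting -= 1
        # Crowding effect: more concurrent items -> less effective capacity.
        capacity = base_capacity / (1 + drag_per_extra_item * (len(in_progress) - 1))
        share = capacity * dt / len(in_progress)   # capacity is split across all WIP
        in_progress = [w - share for w in in_progress]
        clock += dt
        still_open = [w for w in in_progress if w > 0]
        lead_times.extend([clock] * (len(in_progress) - len(still_open)))
        in_progress = still_open
    return sum(lead_times) / len(lead_times), clock

print("no WIP limit:   avg lead time %.1f, last item done at %.1f" % simulate())
print("WIP limit of 3: avg lead time %.1f, last item done at %.1f" % simulate(wip_limit=3))
```

With these made-up numbers, the unlimited board finishes everything at once, late; the WIP-limited board delivers the same twelve items with a much lower average lead time, even though most of them spent time "waiting outside".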

How to Implement

This trick costs nothing except adjusting our own mental model of how we see the flow of work. You can implement it today without any actual cost in terms of reorganization, retraining, restructuring, reskilling - or whatever.
By then limiting the work you permit within your system (department, team, product organization - whatever) to only what you can achieve in a reasonable period of time, you gain control over your throughput rate and thus get much more predictable forecasts of any type.
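
As a back-of-the-envelope sketch (numbers invented), one way to choose that limit is to turn the earlier rule of thumb around - this is essentially Little's Law rearranged: the WIP cap follows from the delivery time you want to promise and the throughput you actually observe.

```python
def wip_limit_for_sla(target_days: float, throughput_per_day: float) -> int:
    """If you want to promise delivery within target_days and you historically
    finish throughput_per_day items, then roughly WIP limit = target x throughput."""
    return max(1, int(target_days * throughput_per_day))


# e.g. promising 5 working days while finishing about 2 items per day:
print(wip_limit_for_sla(target_days=5, throughput_per_day=2))  # -> 10
```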



Footnote:
The above is just one of many powerful examples of how changing our preconceived mental models enables us to create better systems - at no cost, with no risk.

2 comments:

  1. @Michael I agree with you, this unburdens the local system. Start less, lower WIP, finish faster.
    There are so many scenarios where this works:
    Trendy restaurant: people queueing makes the restaurant more in demand.
    Hospital emergency department: makes people go to another hospital if their case is too urgent.
    Airports: think web check-in, pre-printed bag tags.

    If it sits inside a larger system, like a functional area performing work before a handoff, then this is a local optimisation. In the context of a larger system, this is simply burden shifting, and the impact needs to be understood.
    (For the record, I am generally opposed to handoffs unless there are specific reasons why it makes more business sense than the alternative.)

    Anyway, nice post. I think it is worth placing this post in the context of the larger system and local vs. global optimisation.

    Cheers
    Anthony

    1. Hello Anthony,

      thanks for commenting.
      The true magic of this approach is that it's not "simply a local optimization" - it's an implementation of the Lean-Agile principle of "deferred decisions".
      When you take a look at the causal loop diagram, it doesn't just reduce the local processing time - it reduces the overall wait time *for each unit of work*, regardless of where and how it is processed.

      The local optimization is the opposite - someone pushing work into an overburdened organization with a "your problem now" attitude. It is a global optimization to make the delay visible where it is produced, rather than where it occurs.
