Sunday, December 29, 2019

Telemetry Canvas - figuring out the right metrics

To create transparency about the key information on your company, your system, your product - you need to align your metrics. Here is a simple canvas that can help you sort your thoughts and start your journey to data-driven decision making:

The Telemetry Canvas

The canvas is simple to understand. There are two main dimensions:


In an IT platform, there are mainly two types of events: Those created, conducted, orchestrated, managed or performed by automation - and those performed by humans (platform users).
Anyone whose work is affected by events should have a say in defining the most relevant items for their work.

Technical events

Everything the system and/or platform does by itself, or in support of user activity, is a technical event. We can measure technical factors such as incoming or completed transactions, inventory levels et cetera. Of course, we can categorize these by transaction type, technical component or business scenario, depending on what we are looking for.

Quality checks, build failures or network alerts are also technical events that occur frequently, and require attention.

User actions

Whatever users do may also be relevant to the performance of our organization. If our goal is to grow our user base, new registrations are a great metric to look at. In a marketing campaign, the next logical extension would be lead conversions, or trending products. We might also look at revenue generation - and whatever else has an impact on things we care about.
Even abandoning our product is relevant for our business performance, and can be classified as "action by inaction", potentially being a relevant user action.

Business outcomes

Events by themselves are meaningless. They derive their meaning from their impact on our business.

Good for business

We are looking for certain events, such as the successful start or completion of a transaction or the generation of revenue. Many of these events fall into the category of "The more, the merrier". The best events are those that cause no work, yet generate profits.

Bad for business

Some events are always bad news, for example complaints, technical errors or system outages. Even if nobody likes to have these, they are part of working reality, and we need to pay attention to the effort we sink into them.
In many organizations, the invisibility of the "bad news" metrics on the radar causes the organization to accumulate technical debt that may eventually kill the product or even the entire company!
The best businesses aren't those who successfully ignore the bad news - it's those who know that they have less bad news to handle than they can stomach!

Deriving metrics

Once we know which events we're looking at, we can determine how we measure them.
For example: when a transaction arrives in the system, we also want to know when it is completed. We measure not just the rate of incoming transactions and our inventory, but the throughput rate as well. This gives us visibility into whether we're accumulating or reducing backlog, whether we're sustainable or unsustainable!
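To make the backlog idea a bit more tangible, here is a minimal sketch in Python. The `Period` type, the function names and the weekly numbers are all made up for illustration; the only idea taken from above is that inventory changes by "arrived minus completed".

```python
from dataclasses import dataclass


@dataclass
class Period:
    arrived: int    # transactions that entered the system in this period
    completed: int  # transactions that were finished in this period


def backlog_trend(periods, opening_inventory=0):
    """Return the inventory level at the end of each period."""
    inventory = opening_inventory
    levels = []
    for p in periods:
        # Inventory grows with arrivals and shrinks with completions.
        inventory += p.arrived - p.completed
        levels.append(inventory)
    return levels


def is_sustainable(periods):
    """Assumed definition: completions keep up with arrivals overall."""
    return sum(p.completed for p in periods) >= sum(p.arrived for p in periods)


# Hypothetical weekly numbers, not real data.
weeks = [Period(120, 100), Period(110, 105), Period(90, 115)]
print(backlog_trend(weeks, opening_inventory=40))
print(is_sustainable(weeks))
```

With these assumed numbers, the inventory climbs from 40 to 60 and 65 before dropping back to 40, so looking at the arrival rate alone would have hidden two weeks of growing backlog.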


Once we have defined our metrics, we can set optimization goals. Some events are good for our business, others are bad. The general optimization direction is either "lower is good" or "higher is good". In rare cases, we have range thresholds, where neither too high nor too low is desirable.

The easiest way is to start by capturing data on the current state of a metric, then answering the question: "Is this a problem? If so, how big is it?" - determining whether the current value is good, acceptable or unacceptable.
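As a rough sketch of how such a rating could look in Python: the `Goal` type, the `rate` function and all threshold values below are assumptions for illustration. Range thresholds are left out to keep it short.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    HIGHER_IS_GOOD = "higher is good"
    LOWER_IS_GOOD = "lower is good"


@dataclass
class Goal:
    direction: Direction
    good: float        # reaching this value counts as "good"
    acceptable: float  # reaching this value still counts as "acceptable"


def rate(value, goal):
    """Classify a current value as good, acceptable or unacceptable."""
    good, acceptable = goal.good, goal.acceptable
    if goal.direction is Direction.LOWER_IS_GOOD:
        # Flip the signs so that "bigger is better" holds for the comparison.
        value, good, acceptable = -value, -good, -acceptable
    if value >= good:
        return "good"
    if value >= acceptable:
        return "acceptable"
    return "unacceptable"


# Hypothetical goals; the thresholds are made up.
conversion_rate = Goal(Direction.HIGHER_IS_GOOD, good=2.0, acceptable=1.2)
open_complaints = Goal(Direction.LOWER_IS_GOOD, good=5, acceptable=20)

print(rate(1.5, conversion_rate))
print(rate(50, open_complaints))
```

The thresholds themselves are exactly what the stakeholders need to agree on; the code only makes the agreed answer to "Is this a problem?" repeatable.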

Using the Telemetry Canvas

The canvas is a discussion facilitation tool, so don't use it on your own.

Step 1: Invite stakeholders

Bring the stakeholders of your product together - preferably not every single person, but representatives from each group. This is a non-exhaustive list of people you might want to involve:

  • Salespeople, who generate income from the product
  • Marketeers, who drive the product's growth
  • Finance, who validate the product's revenue
  • Developers, who build the solution
  • Operations, who have to deal with the live system
  • Customer Service, who have to deal with those who bought it
  • UX, who design the next step
  • Legal, who definitely don't like to have trouble with the product

The more of these functions rest within your team, the easier this exercise will be - although typically, most will be located somewhere else in the organization.

Step 2: Brainstorm events

Give everyone the opportunity to draft up events that are important to their work. There is no "right" or "wrong" at this stage, and there are no priorities, either.
It's important to remember that not all events occur within the platform, some occur around the platform, and that some events can also be caused by inaction.

Get people to write each event on sticky notes.

Step 3: Locate events on the matrix

People tend to have a pretty good understanding of whether an event is good or bad, so where to place the event on the vertical axis should be easy. In some cases, it's unclear whether an event is good or bad - then default to "Bad", because every event means data processing and work, and work that's not good is probably a bad thing.

Likewise, define the horizontal category. In some complex systems, it's unclear whether it's a user action or a technical event. Try defaulting to "user action" - you haven't discussed where to get the data from, anyway.

Step 4: Define measurement systems

As events themselves are of no value, we need to define the measurements that we want to derive from events. These can also be combination metrics, such as "Lead Time" or "Inventory Growth". What matters is that everyone in the room can agree on what would be measured.

Write each of the measurements onto post-its and put them into the field corresponding to one of the event(s) they rely on.
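Here is a small sketch of how a combination metric like "Lead Time" could be derived from two events. The event names "arrived" and "completed", the log format and the timestamps are assumptions for illustration.

```python
from datetime import datetime
from statistics import mean

# Hypothetical event log: (transaction id, event name, timestamp)
events = [
    ("T1", "arrived", datetime(2019, 12, 2, 9, 0)),
    ("T2", "arrived", datetime(2019, 12, 2, 10, 0)),
    ("T1", "completed", datetime(2019, 12, 4, 9, 0)),
    ("T3", "arrived", datetime(2019, 12, 5, 9, 0)),
    ("T2", "completed", datetime(2019, 12, 6, 10, 0)),
]


def lead_times_in_days(events):
    """Lead time per transaction: time from 'arrived' to 'completed'."""
    arrived = {}
    lead_times = {}
    for tx, name, timestamp in events:
        if name == "arrived":
            arrived[tx] = timestamp
        elif name == "completed" and tx in arrived:
            elapsed = timestamp - arrived[tx]
            lead_times[tx] = elapsed.total_seconds() / 86400  # seconds per day
    return lead_times


lead_times = lead_times_in_days(events)
print(lead_times)
print(mean(lead_times.values()))
```

Note that T3 has arrived but not completed, so it has no lead time yet. It still counts as inventory, which is why a single metric rarely tells the whole story.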

Step 5: Prioritize

Not all metrics are sufficiently important. Let each stakeholder name up to three metrics that matter to them - you still need to put work into setting up data collection, and it doesn't help you to have five hundred things on your "toDo" list.
This is not a point-based system and it's not about dot-voting, so you end up with a bunch of individual priorities.
Although it's good if multiple stakeholders value the same metrics, since that reduces complexity, it's not necessary that stakeholders agree on the value and importance of metrics.

Step 6: Validate

You should have a number of metrics in each quadrant now. If you're missing one of the quadrants, your measurement system is probably biased. Should that be the case, ask, "What are we missing?" Try reprioritizing metrics until you have at least two in each segment.
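If you capture the sticky notes digitally, the checks from steps 5 and 6 can be sketched in a few lines of Python. The stakeholder picks, the metric names and the function names are made up for illustration; the limits (three picks per stakeholder, two metrics per quadrant) are the ones suggested above.

```python
from collections import Counter
from itertools import product

EVENT_TYPES = ("technical event", "user action")
OUTCOMES = ("good", "bad")

# Hypothetical picks: stakeholder -> list of (metric, event type, outcome)
picks = {
    "Sales": [("Revenue per week", "user action", "good"),
              ("Lead conversions", "user action", "good")],
    "Operations": [("System outages", "technical event", "bad"),
                   ("Completed transactions", "technical event", "good")],
    "Customer Service": [("Complaints", "user action", "bad")],
}


def over_limit(picks, limit=3):
    """Return the stakeholders who named more metrics than the limit."""
    return [who for who, metrics in picks.items() if len(metrics) > limit]


def underfilled_quadrants(picks, minimum=2):
    """Return the quadrants holding fewer than `minimum` distinct metrics."""
    # A metric picked by several stakeholders counts only once.
    distinct = {metric for metrics in picks.values() for metric in metrics}
    counts = Counter((event_type, outcome) for _, event_type, outcome in distinct)
    return [q for q in product(EVENT_TYPES, OUTCOMES) if counts[q] < minimum]


print(over_limit(picks))
print(underfilled_quadrants(picks))
```

In this made-up example, only the "user action / good" quadrant is sufficiently filled - a hint that the group has mostly been thinking about sales so far.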

Step 7: Agree and align

Get everyone to agree that they have their most important metrics on the canvas. Address potential concerns. If necessary, reiterate that this is not intended to replace current measurement systems, nor is it a final version - it's just the beginning of a journey to align on data transparency.

Step 8: Invite for follow-ups

Once the metrics are agreed, let everyone know that there will be separate sessions to define the metrics in more detail, that is: how the data will be collected, how it will be interpreted and how it will be represented. This consumes more time, and not every detail is interesting for everyone.

Step 9: Agree on Next Steps

The Canvas is ready, but it's just a canvas. Make sure you have an action plan of what will happen next. Here's what is suggested:
1. hold follow-up sessions for defining the metrics,
2. do some implementation work to measure them,
3. present the metrics in a Review,
4. start using the available metrics in decision making,
5. Inspect, Adapt and Improve.

Joint Metrics - aligning business and development

"How do we bring business and development closer together?" - the key is to create transparency into what others see.

Start the discussion about what you should make transparent, so that everyone can draw the same conclusions. Making both technological and business information visible to everyone in real time will help you cut down a lot of pointless discussions about the best course of action.

Everybody is right!

Every person has their perspective of what is the most important thing, and often, each perspective is valid. For example, a technical person will consider that technological stability and high-quality code are important. Salespeople care about neither - they want to close as many good deals as fast as possible. People in service support feel stuck with tons of trouble tickets, social media marketers want campaigns to go viral. And the CEO just wants a smooth, expansive operation.

A developer can only use their time once. So - what should they focus on? How can a Product Owner know whether it's more important to boost sales or to fix defects?

Classic HIPPO Prioritization

Most organizations prioritize activities like this: Either we do the thing demanded by the person who shouts loudest, or the Highest Paid Person will give their opinion on what should be done ("HIPPO Priority").

Unfortunately, neither the people with the loudest voice, nor those with the highest paycheck, tend to have a full understanding of the implications of their demands. Follow the HIPPO, and everyone else will be unhappy. Disregard the HIPPO and risk being laid off.
Either way, the organization gets stuck under the tyranny of the Urgent - shifting attention between a series of disasters and escalations to fix.
There is no freedom to think of the Big Picture, maximize business value or consider what will happen a few years down the line.

A systemic view is needed

In a healthy organization, developers in their right mind wouldn't want sales to fail - and marketing wouldn't want the technology to fail. They are often simply unaware why their personal goals have such drastic consequences elsewhere!

The solution requires overarching transparency, laying all the cards on the table.
In most modern organizations, there is some kind of data that people utilize, yet everyone gathers their data from a different source and interprets it in their local context. This is not to advocate a central Data Warehouse, Master Data Management or a specific data representation tool here - the problem can't be fixed technologically: A sysop would look at logfiles, developers at source code, sales at transaction records and marketing at campaign information.

None of the data is related on the surface. Yet, in a closed system, all of these are sides of the same coin (probably, an "infinity dice" would be a more applicable metaphor).

Breaking the local optimization

Every stakeholder can define metrics for their specific area of expertise. Sales is very adept at defining what is great, what's okay and what is intolerable when it comes to closing deals. Hence, it's very easy for them to define a system of metrics that creates overall visibility on how healthy sales are. Developers can do the same for their systems, finance for revenue - and so on.

And then we lay all of this information on the table. When everyone has a say in what we're looking at, we have objectivity in whether we're doing great, alright or meh in the big picture. And make it visible to everyone.

Bringing the puzzle together

Imagine you log into your company account - and the first thing you see is where your company is doing great - and where it just plain sucks. From the customer service rep all the way to the CEO, from technology to business, everyone will have at their fingertips the information on where your biggest strength and your biggest weakness are.

It will become very intuitive and easy to make key decisions, and even when diplomatic compromise is needed, people will at least understand the impact of their choices.

Leaving the hamster wheel

Many organizations are challenged to break free from the hamster wheel of tasks and activities. Product evolution is often nothing more than putting band-aids on cracked pavement. Systems are fundamentally broken, because it's all about meeting short-term goals, and rarely about larger long-term beneficial change. The future gets sacrificed for today's needs.

Planning strategically

We need to figure out where we're constantly fire-fighting, where we're in calm waters, and which problems correlate. We can use the transparency of the data to introduce measurable strategic objectives, such as, "reduce technical debt from 50000 years to 100 years", or "increase conversion rate from 1.2% to 2%". It's totally valid to have multiple strategies with multiple objectives and multiple targets in place at the same time.

Building empathy

Another great advantage of building a common metric portfolio for everyone in the organization is that we start to get empathy for one another. Developers see when sales is struggling, marketing realizes when development is drowning. The discussion moves away from "What do I need next?" towards "Who has the biggest problem and how can we contribute to improvement?"

Friday, December 13, 2019

The nonsense called "Enterprise Agile Development"

The "Manifesto for Agile Software Development" is the basis for many "agile approaches". Enterprises worldwide have been sold on the idea that they need to become "Agile" in order to remain competitive. And anecdotal evidence of the success of "Agile" is abundant.

There's a dirty little secret that most organizations, coaches and consultants are either unaware of, don't understand, or whose impact they just don't realize: "Agile", as originally proposed, is a local optimization, intended to improve the work of software developers! This raises the question: "What if Software Development isn't even the problem?"


If we look at organizational processes from end to end, we realize that even if Product Development were a flawless, instantaneous activity, not much would be different in the big picture. Why do we spend so much time and energy to make irrelevant changes?
"95% of changes made by management today make no improvement." - Peter Scholtes
"Agile Transformation" is often one of these ineffective changes.
To find a better solution, we need to frame the problem differently.

The scope of "Agile"

"Agile" is intended to bring the entire IT development organization together. It's irrelevant whether we're talking Scrum, Crystal, XP or whatever, this idea is fundamental to "Agile".
From the time a "requirement" (or: "user story" - or whatever) is scoped for delivery until it's delivered, people from IT collaborate, involving business stakeholders, to minimize cost, overhead, quality risks and delay in the process.

Why "Agile" looks so appealing!
It sounds very appealing to any IT manager worldwide to reduce defect rates, cost and cycle times by margins of fifty percent plus each, while making IT employees happier, especially since those benefits can actually be achieved compared to siloed approaches!

But what if I told you that none of this matters at enterprise level?

The real picture

In an enterprise of 100,000+ people, there are hierarchies, reporting lines, budgets and highly complex dynamics going on. Not everyone talks to everyone else.
In an enterprise context, it's fairly normal that from idea until approval, different rounds of experts, stakeholders and boards are involved - long before an item lands in IT's actual toDo list.

The idea goes through some kind of process, where people who are not involved in actual development clarify what's actually going on and whether it's worth doing, and have to make sure it will get done.

While Scrum specifically has a "Product Owner" and Scrum proponents might argue that this is "refinement" for which the Product Owner has this responsibility - in an enterprise, that's too much work for one person, so it will be delegated: We end up with handovers in the process, multiple queues and waiting lines. 

This picture depicts what's actually going on in most larger organizations:

Careful - "Agile" doesn't even talk about the big picture!

The lead time of the yellow chevrons, in a Waterfall world, tends to be 3-6 months, and in an Agile universe, it's much shorter. 

Let's do a thought experiment: What if the yellow part of the process were conducted by a little fairy with a magic wand that could complete all these activities at zero cost, in zero time and without any defects? 

How agile, how fast, how cheap would this process now be?
In large enterprises, each of the gray chevrons takes weeks to months, is quite expensive and error-prone. Considering the end-to-end process chain, even if we disregarded everything happening in the "Agile" parts of the process - we'd still have a slow, inflexible and expensive process!
The overall effect of introducing "Agile Software Development" into Enterprise processes is negligible.
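The thought experiment is easy to run with numbers. The sketch below uses purely hypothetical lead times per stage, and the stage names are assumptions as well; plug in your own organization's values to see how much of the total is actually development.

```python
# Hypothetical lead times in days. Stage names and numbers are
# illustrative assumptions, not measurements.
# Each entry: (stage, lead time in days, is it part of development?)
stages = [
    ("idea clarification", 30, False),
    ("expert and stakeholder rounds", 45, False),
    ("board approval", 30, False),
    ("development", 120, True),  # the part that "Agile" addresses
    ("user training", 30, False),
    ("compliance audit", 21, False),
]


def total_lead_time(stages, development_factor=1.0):
    """End-to-end lead time when development takes `development_factor`
    times as long as today (0.5 = twice as fast, 0.0 = the magic fairy)."""
    return sum(
        days * development_factor if is_development else days
        for _, days, is_development in stages
    )


print(total_lead_time(stages))                          # today
print(total_lead_time(stages, development_factor=0.5))  # development twice as fast
print(total_lead_time(stages, development_factor=0.0))  # the fairy with the magic wand
```

With these assumed numbers, even the fairy only brings the end-to-end lead time down from 276 to 156 days: more than half a year of waiting remains, and none of it is development.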

End-to-end Agility

Now, we get to the revolutionary idea of simplifying the entire process by means of moving to "Business Agility": bringing developers straight to the users!

A paradigm shift: entirely removing the men-in-the-middle!
Once an item is prioritized, we jump straight into development. There's just a small fly in the ointment: how do we prioritize it? In order to determine whether it's the most valuable thing, it still needs to be refined - so the same process still applies!

The problem, unfortunately, in the Enterprise world, is not even whether we have a handover in the process between a "Refinement Team" and a "Development Team". The real problem is overall lead time!

As long as it takes weeks to do the clarification rounds, the umpteen mandatory boards only meet once in a full moon, every new feature requires comprehensive user trainings and there are manual compliance audits, there is no way for a change in development to significantly affect overall outcomes!
Unless you're part of the problem, you're not part of the solution.

The illusion of improvement

When overall processes retain their massive overhead, and communication structures remain untouched, Agile Development alone is irrelevant to the overall business outcomes. Even if we get it right - which is extremely hard in the governance straitjacket so nicely provided by many organizations - the enterprise as a whole will not feel much of an impact by the introduction of "Agile Development" into a cluttered, overcomplicated jungle of processes, rules and regulations.

If we are looking for significant improvements to the enterprise, Software Development is often not the right place to look - yet agilists have spent nearly a decade developing increasingly intricate ideas for optimizing exclusively this part!

Moving Enterprise Software Development from "Waterfall" to "Agile" is like using white sand instead of yellow sand to mix concrete - we feel as if we made a difference, while an outside observer wouldn't recognize any change!

And that's why Enterprises who have invested heavily into "Agile Software Development" often feel underwhelmed with the outcome and become entirely disillusioned.

A change in mindset

"Agile" brought a massive change in mindset, it gave developers a voice - and it's people from Technology who are now educating the business world how to change their ways of working, in order to take advantage of Agile processes.
Unfortunately, we see that many agilists are stuck in their little Technology world, not realizing that optimizations made exclusively from a Technology perspective are exactly as ineffective - and potentially equally harmful - as those made by Finance (namely, Cost Accounting - but that's another story, to be told another time).

Much more important than doing an "Agile Transformation" and converting legions of decently functioning teams with decently performing individuals to "Agile ways of working" would be to look at the real problem the organization is facing - it's not a development problem. It's a problem in the end-to-end flow of value.

I believe that not just agilists, but struggling organizations as a whole, will find significant value in moving beyond the new structure and processes suggested by Agile Frameworks and tackling the real issues their organization is facing, which are hidden in plain sight when adopting Agile Frameworks.

In the past years, I have come to treasure TameFlow as an eye-opener, to move beyond IT, beyond Technology, beyond Product Development, to see the need for a Unity of Purpose, optimization at the bottleneck and relentless improvement. TameFlow suggests not making any change to anything other than the bottleneck: Development can work however they consider fit, unless and until that's where the bottleneck is.

Closing remarks

Yes, this post is clearly promoting TameFlow - not as yet another Agile Framework or as an alternative to certain Agile Frameworks, but as an alternative way of framing the problem organizations today face.

Let's just not make changes that don't matter. Too much harm and grievance has been inflicted upon managers, developers and entire organizations in the name of this "better way of working". So many "Agile Transformations" caused great people to lose their job, their sanity or the respect they deserve, while giving the organization nothing to show in return. It's time to stop this madness.

Let's put Agile Frameworks where they perform best: into Product Development. Let's not try to stretch their purpose beyond their applicability, and let's not overzealously convert everyone to one specific way of working. Give people the freedom to work in whatever way they consider is best for them. Offer people Scrum, Kanban, SAFe or LeSS as an option if it makes them happier - but don't force anyone into any of this, especially not when development isn't even the problem.

TameFlow taught me this.