Sunday, June 27, 2021

The Code Review Department

Let me tell you the story of a company that had problems with their code quality, and solved it the Corporate way ... well, ehum ... "solved."


Poor Code Quality

When I met that organization, their development process basically looked like this:

Developers took their specifications, completed their coding and passed the code on to testing. Simple as that.

After a major initiative failed, the Lessons Learned revealed that the root cause of the failure was poor code quality, which made the code unmaintainable and hard to read, and made it difficult to fix defects without adding more defects elsewhere.

A problem that would require rules and regulation to fix. So, it was decided that Code Reviews should be done. And, of course, people complained that code reviews took time, and that if developers did the Reviews themselves, they would be even slower in getting their work done.


Stage 1: Introducing the Code Review Department

What would be the Corporate way, other than to introduce a new department of specialists dedicated to this specific task? So, they opened the Code Review Department. Let's call them CRD.

To keep Code Reviews as neutral and objective as possible, the Review Department was strictly separated from the rest of the organization: physically, functionally and logically. They were not involved in the development process, and only reviewed the code that was actually delivered.

With the backing of senior management, the CRD adopted a very strict policy to enforce better code quality and defined their departmental KPIs, most notably:

  • Remarks made per reviewer: How many review remarks each reviewer made, per commit and per month. This allowed the CRD management to see which reviewers were most keen and could best spot problems.

Now, how could this possibly go wrong?

Stage 2: Code Quality improves

Since it's easy to objectively measure this data, the reviewers got busy and lots of review comments were made. Proudly, the CRD management presented statistics to IT leadership, including their own KPIs as well as the obvious metric they could offer to measure the software delivered by the Software Development Department (SDD):

  • Code Quality per developer: How many review remarks were made, per commit and per month, per developer. This allowed the SDD to put a KPI on the quality of code provided by developers. And, conveniently, it would also justify the existence of the CRD.
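
To see how mechanical this pair of metrics is, here's a minimal sketch of how both roll-ups might be computed. The data shape, names and numbers are hypothetical, purely for illustration:

    from collections import Counter

    # Hypothetical review records: (reviewer, developer, month, remark_count)
    reviews = [
        ("alice", "dev_bob", "2021-05", 4),
        ("alice", "dev_eve", "2021-05", 0),
        ("carol", "dev_bob", "2021-05", 7),
    ]

    remarks_per_reviewer = Counter()   # CRD KPI: higher looks better
    remarks_per_developer = Counter()  # SDD KPI: lower looks better

    for reviewer, developer, month, remarks in reviews:
        remarks_per_reviewer[(reviewer, month)] += remarks
        remarks_per_developer[(developer, month)] += remarks

    print(remarks_per_reviewer)   # rewards reviewers for finding MORE remarks
    print(remarks_per_developer)  # punishes developers for those SAME remarks

Note the built-in conflict: the very same counter has to go up to make one department look good, and down to make the other look good.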

With the blessings of IT leadership, SDD Management thus adopted the KPI.

So, things were going well ...

Stage 3: Code Review has a problem

Now, developers aren't dumb people. They adopted Linting and basically got their Review remarks down to zero within a pretty short time. The CRD should be happy now, shouldn't they?

Turns out, they weren't. And here was the problem: the Remarks per reviewer metric tanked. Not only did reviewers suddenly fail their quota, CRD management also figured they probably weren't looking in the right places.

So what did the CRD reviewers, not being stupid either, do? Whenever they looked at code and spotted a pattern they considered problematic, they introduced a new rule for it.

The numbers for Remarks per reviewer rose again, and management was doubly happy: not only were the CRD's numbers fine again, the reviewers were also continuously improving their own work.

Great! Things are even getting better!

Stage 4: Developers have a problem

Developers started to get frustrated. Their Lint rules were no longer getting them 100% passed reviews. What was worse: they found that whenever the reviewers updated their rules, code was rejected again, and they needed to figure out the new rule and add it to their Lint configuration. Not only did this process consume a lot of time, it distracted from actual development work.
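
For flavor: a toy sketch, in Python, of the kind of home-grown check the developers kept bolting onto their pipeline. The tooling and the rule (flagging bare "except:" clauses) are invented for illustration:

    import ast

    # Toy lint rule: flag bare "except:" clauses, a hypothetical pattern
    # the reviewers recently started rejecting.
    def find_bare_excepts(source, filename="<code>"):
        remarks = []
        for node in ast.walk(ast.parse(source, filename=filename)):
            if isinstance(node, ast.ExceptHandler) and node.type is None:
                remarks.append(f"{filename}:{node.lineno}: bare 'except:' clause")
        return remarks

    code = "try:\n    risky()\nexcept:\n    pass\n"
    print(find_bare_excepts(code))
    # ["<code>:3: bare 'except:' clause"]

Every time the CRD invented a new pattern, another check like this appeared on the developers' side, and the arms race continued.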

Well, developers caught on that merely following the rules wasn't going to get them 100% passed reviews any more, so they began introducing honeypot infringements: they made obvious mistakes in their code so that reviewers would remark on them, and they'd already have the fix in place right when the review came in.

Everyone happy: the CRD met their KPIs, and developers were no longer forced to constantly adopt new rules. Except ...

Stage 5: Management catches on

CRD reviewers were content, because they had plenty of review comments again. Then CRD management started to measure policy violations by type, and figured out that developers had stopped improving and were making beginner mistakes again. Of course, they reported their findings to higher management. And thus, a new KPI was born:

  • Obvious mistakes per developer: How many obvious review infringements were made per team, with a target of Zero, published transparently throughout the SDD.

Well, again, developers aren't stupid people. So, obviously, they would meet their KPI. How?

You might have guessed it: they would hide their, ehum, "mistakes" in the code so they were no longer obvious, and then placed bets on who could get the most of them past Review without being caught.

Guess who won the game?

Stage 6: Code quality deteriorates

The CRD reviewers got stuck in a game of whack-a-mole with developers, who constantly improved their tricks for hiding more and more insidious coding errors, while updating their Lint rules right away whenever reviewers added a new rule.

Until that day when a developer hit the jackpot by slipping a Zero-Day exploit past Review.

The CRD management no longer trusted their own Reviewers, so they added reviews of reviews and another KPI:

  • Issues slipped past Inspection: Reviews were now a staged process where, after review by a Junior Reviewer, a Senior Reviewer would review again to figure out what the first reviewer had missed. Every Reviewer would get a Second-Review Score, and that score needed to be Zero. So, reviewers started looking even deeper.

You can guess where this is leading, right?

Stage 7: Code quality goes to hell

Now, with four-eye reviews and a whole set of KPIs, nothing could go wrong any more, right?

Code Reviewers were doing splendidly. They always had remarks and the department's numbers truly validated that a separate, neutral Code Review Department was absolutely essential. 

So the problem was fixed now.

Well, except one small fly in the ointment. Let's summarize the process from a development perspective:

  1. When developers make mistakes, they are reprimanded for doing a poor job.
  2. When developers make no mistakes, new rules are introduced, returning to (1).

Developers now felt like they were on a sinking ship: it was easier to simply continue making mistakes against existing rules than to adopt new ones. They came to accept that they couldn't meet their KPI anyway.

Since they could no longer win, they stopped caring. Eventually, the review department was relabeled "the complaints department", and nobody took their remarks seriously any more. Developers would now simply add buffer time to their estimates, calling it the "Review buffer".

By now, the CRD was firmly established, and yet they were fighting a losing battle: try as they might, they got more and more overloaded, because genuinely necessary review remarks became ever more common as ever more horrible code was delivered. They needed to add staffing, and eventually outnumbered the developers.

The Code Review Department became the last bastion against poor code quality. A bulwark defying the stormy seas of bad code.


So, what's your take ... 

is a Code Review Department a good idea?

How would you do it?

Wednesday, June 16, 2021

A day in the life of an Enterprise Coach

 "Michael, what does an Enterprise Coach do?" - it's a good question that people sometimes ask me, and frankly, I can't say I have "the" answer. So, I will give you a small peek into my journal. 

ECDE* (* = European Company Developing Everything) is a fictional client.
Like the company, the day is fictional. The described events are real. 

Disclaimer: This day is not representative of what all enterprise coaches do, nor of all the things an enterprise coach does. There is no routine. Every day is different. Every client is different. Every situation is different. Connecting the dots between all the events is much more important than all of the activities combined.


Before we start

"Enterprise Coaching" isn't simple or straightforward. There's often more than one coaching objective to pursue simultaneously, and progress requires diplomacy, patience, tons of compromises and long-term strategic thinking. Some topics can be solved in a single session, while others may take a long time to change. It may take years until people understand the things they were told on day 1 of their Agile training.

Whereas I typically coach for effectiveness, sustainability and quality, there's a potentially infinite number of enterprise coaching objectives, including - without limitation - the introduction of frameworks, methods, practices, cultures, mindset and so on. I see the latter as means to an end, not as viable objectives to pursue.

My definition of coaching success is therefore not "X amount of teams doing Scrum", "Y amount of Scrum Masters certified" or "Z Agile Release Trains launched." I define success as, "The client has the means (attitude, knowledge, learning, innovation) for achieving what they want."

On an average day, I jump a lot between all levels and functions of the organization, from team member all the way to senior management, from IT through business to administrative areas, and simultaneously work on short-term as well as long-term topics.

While I'm trying to "work myself out of a job", it's usually a lack of knowledge and experience regarding certain concepts or practices that requires me to stay involved longer and deeper than initially bargained for.


A coach's day

8:00 am - Getting started

I take some time to reflect. Yesterday was an eventful day at ECDE. A critical member in one of the teams just announced they would be leaving, we had a major production incident - and management announced they want to launch a new Agile Release Train. Business expressed dissatisfaction with one of the Release Trains and there are quarrels about funding. Okay, that's too much: I have to pick my battles.

So I choose to completely ignore the head-monopoly issue, the incidents and the business dissatisfaction. I trust that the teams can handle these: I am aware of them, and I wasn't asked for support.

There are no set priorities for coaching in ECDE. My trains self-manage their Improvement Backlogs. I haven't yet gotten senior management to adopt a company-wide "ECDE improvements" backlog, which would create much more transparency about what's actually important.

The Tyranny of the Urgent is ever-present. I have to make time for strategy, otherwise I'd just run after the latest fires. Most of the stuff I came for is long-term anyway, but there are always some quick wins.

So, what are the big roadblocks my client needs to overcome?

Ineffective organization, low adaptivity, lack of experience, and, last but not least, levels of technical debt that might exceed ECDE's net present value.

I check my calendar: Personal coaching, a strategy session and a Community workshop. Fair enough.


9:00 am - Personal Coaching / RTE

In a SAFe organization, Release Train Engineers (RTEs) are multipliers of agile ways of working, practice and mindset within the organization, which is why I spend as much time with them as I can. They often serve as culture converters, constantly struggling to protect their Agile Release Train from the pervasive Tayloristic Command+Control mindset continuously encroaching from the areas of management not yet participating in the transformation efforts.

With this come hundreds of small challenges, such as rethinking phase gates and reporting structures, decentralization, meeting information needs of developers and management alike, and driving changes to the organizational system to foster self-organization, growth and learning.

Some topics go straight into my backlog because they're over-arching and I need to address them with higher management. For others, we determine whether the RTE can handle them alone, needs methodology support (tools, methods, frameworks, canvases etc.) or self-learning resources (web articles, books etc.). I clarify the follow-ups and send some links with further information.

The RTE role is challenging to do well, and oftentimes pretty thankless. It's essential that the RTE gets those precious moments where the seeds of change come to fruition.


10:00 am - Sourcing Strategy

ECDE has outsourced to many different vendors scattered across the globe. And of course, every piece of work goes to the lowest bidder, so senior managers are looking at a development value stream as fragmented as it could possibly be. The results are underwhelming... "but hey, we're saving costs!"

I'm not a fan of cost accounting, but here I am, discussing cost of delay, opportunity costs, hidden costs, sunk costs and all the costs associated with the Seven Wastes of Lean, re-writing the business case for the current vendor selection strategy and making the Obvious visible. We can't change long-term contracts on a whim, so we need a strategy. We schedule a follow-up with Legal and Procurement to explore intermediate options.
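
To give a flavor of the math: a back-of-the-envelope sketch with entirely made-up numbers, purely to illustrate the shape of the argument:

    # Hypothetical figures, for illustration only.
    vendor_savings_per_month = 20_000   # what the lowest bidder "saves" us
    feature_value_per_month = 150_000   # value the delayed work would generate
    extra_delay_months = 2              # lead time added by the fragmented value stream

    savings = vendor_savings_per_month * extra_delay_months
    cost_of_delay = feature_value_per_month * extra_delay_months

    print(f"Savings: {savings:,}")                     # 40,000
    print(f"Cost of delay: {cost_of_delay:,}")         # 300,000
    print(f"Net effect: {savings - cost_of_delay:,}")  # -260,000

Once cost of delay is on the table, the "lowest bidder" business case rarely survives first contact.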

When you know what you need to do, and can't do it.


12:00 pm - Lunch time

The business dissatisfaction explodes into a spontaneous escalation. The line manager insists the teams must do overtime to meet the deadline for the expected fixed scope. I politely invite him to an Agile Leadership training. He declines. He demands that we must, quote, "get a proper Project Manager by the end of the month" and ends the call.

One step forward, two steps back. Happens all the time.


1:00 pm - Finally. Lunch.

A Product Owner pings me, because she's unclear about team priorities. Turns out the team wants to apply Clean Code principles, but the PO is concerned about Velocity. While I have my meal, we're having a conversation about the impact of quality upon speed and quantity. She decides to give the team room for improving their engineering practices. We agree to follow up in a month.

I shake my head. ECDE developers still need permission to do a proper job.


2:00 pm - Product Workshop

I'm joining a Product People Community workshop to introduce the concept of a Demand Kanban. I gather some materials, prepare a Mural board and grab a cup of tea. During the workshop, I explain some basic concepts, and we spend most of our time design-thinking some changes to the current process. We create a small backlog of experiments they would like to try.
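
In a nutshell, a Demand Kanban makes incoming demand flow through explicit stages with WIP limits, so work is pulled into the process rather than pushed. A toy model of the idea; the stage names and limits are invented for illustration:

    # Toy model of a Demand Kanban: demand flows through explicit stages,
    # and each stage's WIP limit decides whether an item may be pulled in.
    board = {
        "Funnel":    {"limit": None, "items": ["idea-1", "idea-2", "idea-3"]},
        "Analyzing": {"limit": 2,    "items": ["req-7"]},
        "Ready":     {"limit": 3,    "items": []},
        "Building":  {"limit": 2,    "items": ["req-5", "req-6"]},
    }

    def pull(board, from_stage, to_stage):
        """Pull the oldest item into the next stage, if its WIP limit allows."""
        src, dst = board[from_stage], board[to_stage]
        if not src["items"]:
            return None  # nothing to pull
        if dst["limit"] is not None and len(dst["items"]) >= dst["limit"]:
            return None  # stage is full: the demand has to wait (that's the point)
        dst["items"].append(src["items"].pop(0))
        return dst["items"][-1]

    print(pull(board, "Funnel", "Analyzing"))  # 'idea-1': Analyzing had room
    print(pull(board, "Funnel", "Analyzing"))  # None: Analyzing hit its WIP limit

The interesting conversations start when a stage is full, and that's where we spent most of the design-thinking time.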

The "knowledge" these POs got from their certification training is a laughing stock. I do what I can, although this will take a lot more than a handful of workshops.

 

5:00 pm - Let's call it a day.

A Scrum Master spontaneously calls. I'm really happy to have 1:1 conversations with people close to the teams, so I pick up despite my working day being over. Her question is inconspicuous. Instead of giving a quick answer, I'm curious what her team tried and learned. I smell a systemic issue of which she has barely scratched the surface.

I suggest that she could run a Topic Retro with her team. She's stumped. For her, a Retro was always a 30-minute, "Good/Bad/Improve" session focused on the last Sprint, so she asks: "How do I do a Topic Retro?" This turns into a two-hour call.

ECDE provides abysmal support for new Scrum Masters. I decide to let it go, because there's a dedicated team in charge of Scrum Mastery. I feel bad for a moment, but my energy is limited.


7:00 pm - Finally, done.

Almost. Now, all I need to do is organize my coaching Kanban, then do the administrative stuff.

I take a look at the board and scratch my head: "Solved two problems today, found five additional problems." I take a few stickies from the "Open" column and move them straight into the trash bin.

It's almost 9 pm when I turn off the computer. I reflect and once again realize that while I emphasize "Sustainable Pace" to my clients all the time, I can't continue those long days forever. I should spend more time exercising.

Tomorrow, I'll do better.