Monday, March 9, 2020

The testing bottleneck

Test appearing as a bottleneck is a recurrent theme across many organizations. In this article, we will explore why test often becomes the constraint - and ways out of the situation. Adherents of the "Theory of Constraints" will recognize this article as steps 2 and 3 of the "Five Focusing Steps". All of the proposals improve test performance - yet none of them rely on investing a single additional cent!

The test bottleneck

Test execution is a necessary activity between development and delivery - there's no way to avoid this, and no amount of "Agile" or "Shift-Left" is going to change that. Hence, the question is not so much "Where to test?", as it is, "How to approach testing?"

Test Execution - typically a bottleneck by design!
Development, Build, Test, Delivery and deployment - an inevitable sequence. How and why does test execution become the bottleneck then?

Look away from the testers and the software package as a whole, and focus on a single work item instead: any specific product change is in a "Wait state" whenever nobody is actively working on it. Hence, most of the activities listed below block the flow of work without adding any value.

Note of caution - the entire article is written with the assumption that test is the constraint - the solutions can't be applied if the constraint is known to be elsewhere!

Stop doing the wrong things

This section is a list of traditional tester activities that may quickly consume all available test capacity - and the consequence is that there's often no time left to do the work that would actually matter. 
So here are things that testers shouldn't even be doing. 


Test Setup

Traditional software test may be part of a "push process" where developers provide code and the backlog item then immediately goes onto the tester's desk: creating a running build, installing the version on a test environment, getting it to run - all the tester's problem. 

This paints a straightforward ideal picture: there should be zero tester activity and zero delay between developers providing a new version and the start of test execution.

The solution space here is simple and obvious: all the above mentioned activities should follow an automated standard process. Unless we have a 100% repeatable and reproducible process for these activities across development, test and operations, we do not have a proper guarantee that this process will yield the same outcome anyway.

There are plenty of tools out there that can be used to automate this process, and if your organization hasn't done this yet - automating build and installation is the simplest quick win for your test execution.
A little more challenging, but still almost effortless, is the automation of smoke tests - doing automatically what needs to be done anyway each time a new product version is installed.

How to do it?

When test is a bottleneck anyway, you don't benefit from adding more burden onto testers. So instead of pushing another backlog item into Testing, use the developers' time: let them automate whatever they do to create a build on their localhost, and whatever the installation manual says. If that's not enough, let developers observe what testers and ops do, and automate that as well: move towards Continuous Delivery!
The development time invested into setup automation typically pays for itself within weeks - and keeps saving money ever after! Plus, everyone in the organization will wonder how you could ever have lived without it.
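
To make this a bit more tangible, here is a minimal sketch of such an automated sequence in Python. It is only an illustration: the step names and the placeholder functions (build_package, install_on_test_environment, smoke_test) are assumptions, and in a real setup each of them would call your own build tool, installer and product checks.

```python
from typing import Callable, List, Optional, Tuple

# A step is a name plus an action that reports success (True) or failure (False).
Step = Tuple[str, Callable[[], bool]]


def run_pipeline(steps: List[Step]) -> Optional[str]:
    """Run the steps in order and stop at the first failure.

    Returns the name of the failing step, or None if every step succeeded.
    """
    for name, action in steps:
        try:
            succeeded = action()
        except Exception:
            succeeded = False  # a crashing step counts as a failed step
        if not succeeded:
            return name
    return None


# Hypothetical placeholders - replace them with calls into your own tooling.
def build_package() -> bool:
    return True


def install_on_test_environment() -> bool:
    return True


def smoke_test() -> bool:
    return True


SETUP_PIPELINE: List[Step] = [
    ("build", build_package),
    ("install", install_on_test_environment),
    ("smoke test", smoke_test),
]

failed_step = run_pipeline(SETUP_PIPELINE)
```

If failed_step is None, the new version is installed and ready for test execution without any tester having touched it. Otherwise, the name of the failing step goes straight back to the developers.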


Test Case Creation

Functional and Acceptance Testing rely on test cases - and, depending on the complexity of the change, on an entire test case catalog. It's not unusual that properly defining test cases takes as much time as development, or more.

Depending on the order of delivery, test cases may not be prepared by the time the first delivery of the product arrives. This creates two problems: first, the delivery must wait until the test case is prepared - and second, testers have to re-prioritize their work, leaving whatever else they were doing in a wait state.

A typical problem caused by the asynchronous creation of test cases and development is that testers may not have written the test cases to match exactly that which developers have already delivered (especially if increments are really small), making the test case fail upon execution, resulting in unnecessary "false positives" and communication overhead. 

The reflex solution

Many organizations defer testing until all test cases are created and the entire test object is completed. Depending on the size of the backlog items in question, the consequence is "big batch" and asynchronous processing: There is no longer a direct connection between development work and testing. We end up with a postponed, prolonged "Test Phase", which oftentimes also results in a "bugfixing phase" - which is disruptive to everyone, and unpredictable in both duration and outcome. Most organizations that choose this route inevitably compromise both on quality and sustainability.

An improved solution

Approaches like ATDD and BDD, combined with Design Workshops, allow for an early and aligned specification of acceptance criteria, test approach and test scope. Since these collaborative approaches ensure that the right questions are asked before development, people can align both on tests and development outcomes early on in the process. This ensures both that there is less discrepancy in understanding between developers and testers (which means there will be fewer defects) - and also that there will not be time passing between receiving a delivery and beginning to test. Likewise, since tests are defined before development starts, a delivery will no longer lead to interrupts and blockers on other work caused by missing test cases.
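
As an illustration, an acceptance criterion agreed in such a workshop can be written down as an executable test before development starts. The sketch below uses plain Python in a Given/When/Then style. The Cart class and the discount rule ("orders of 100.00 or more get 10% off") are invented for this example and stand in for whatever your team agrees on.

```python
class Cart:
    """Hypothetical product code, written after the tests below were agreed."""

    DISCOUNT_THRESHOLD_CENTS = 10_000  # 100.00 in the shop's currency

    def __init__(self) -> None:
        self._prices_cents = []

    def add(self, name: str, price_cents: int) -> None:
        self._prices_cents.append(price_cents)

    def total_cents(self) -> int:
        subtotal = sum(self._prices_cents)
        if subtotal >= self.DISCOUNT_THRESHOLD_CENTS:
            return subtotal * 90 // 100  # 10% discount
        return subtotal


def test_discount_applies_at_threshold() -> None:
    # Given a cart with items worth exactly 100.00
    cart = Cart()
    cart.add("book", 6_000)
    cart.add("lamp", 4_000)
    # When the total is calculated
    total = cart.total_cents()
    # Then a 10% discount is applied
    assert total == 9_000


def test_no_discount_below_threshold() -> None:
    # Given a cart with items worth 99.99
    cart = Cart()
    cart.add("book", 9_999)
    # When the total is calculated
    total = cart.total_cents()
    # Then the full price is charged
    assert total == 9_999
```

Because both the developer and the tester have seen these two scenarios before any code existed, there is nothing left to prepare when the delivery arrives.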


Bug Tracking

Another reason for putting product work into a "Wait State" that is all too common in large organizations is bug tracking - the longer the list of known bugs, the more test effort is diverted to managing the defect backlog and doing re-tests for fixes. This time eats into test execution time, and also delays the delivery of value. And this delay compounds: the more defects need fixing before a delivery can be cleared for release, the more time a backlog item spends in Wait.

The reflex solution

When bugs are an issue, the solution is often to introduce a dedicated test manager, who does nothing other than prioritize, track, monitor and report defects. This fixes neither the defects nor the problem of missing capacity. Instead, we distract from this dissipation of capacity by institutionalizing it in a formal role.

An improved solution

As ridiculous as it sounds - the easiest way to reduce bug tracking efforts is not to create bugs. As an alternative where this is not (yet) an option, the best possible choice is to produce fewer defects, and to introduce reliable mechanisms for ensuring that defects are actually resolved.
Smaller changes, i.e. smaller increments, will contain fewer opportunities for defects, and the optimal size of a delivery should have the potential to contain a maximum of one defect - the one change that was made. This comes back to Continuous Integration / Continuous Delivery.

Another part of the problem is that in traditional test management, any reported defect is a "promise" - that means more work later: both developers and testers will have more work with this defect at some point in the future. Ideally, though, developers don't only provide a fix, they also provide evidence that the fix was effective and the defect doesn't return. That's where automated regression testing comes in. Developers should automate the test that yielded the defect, use it to verify the presence of the defect in the described scenario, then use that same automated test to verify the correct behaviour. This, too, removes the capacity drain on testers.
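
Here is a small sketch of that workflow, with an invented defect: "calculating the average of an empty list crashes". Both function versions and the defect scenario are assumptions made up for this illustration.

```python
from typing import Callable, List


def average_buggy(values: List[float]) -> float:
    """Version as originally delivered: crashes on an empty list."""
    return sum(values) / len(values)


def average_fixed(values: List[float]) -> float:
    """Version after the fix: an empty list yields 0.0."""
    if not values:
        return 0.0
    return sum(values) / len(values)


def defect_is_present(average: Callable[[List[float]], float]) -> bool:
    """Automated form of the reported scenario.

    Returns True if the given implementation still shows the defect.
    """
    try:
        return average([]) != 0.0
    except ZeroDivisionError:
        return True


# Step 1: the automated test confirms the defect in the delivered version.
confirmed_before_fix = defect_is_present(average_buggy)
# Step 2: the very same test shows that the fix is effective.
still_present_after_fix = defect_is_present(average_fixed)
```

Once the check is part of the regression suite, it runs with every build, and nobody has to schedule a manual re-test.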


Test Management

Another drain on test capacity is the question of which test cases to run when, tracking how many of them did run - and how many of them were successful. As long as there are defects, bug tracking (see above) comes on top, and with that, Go/No Go Recommendations, which require both preparation and meetings. And of course, with that, a hefty load of compromise, politics and technical debt.

The solution

A consistent use of BDD/ATDD means that all functional tests will be automated as part of the development process, and evidence of their correctness will be provided as part of the build process. 
When all functional tests are automatically executed the minute that a code change is made, including both regression and changes - and there is no way to proceed in delivery as long as even just a single defect occurs, this eliminates a multitude of test management jobs:
  • There is no need to track defects, because developers can't proceed with defects.
  • Go/No Go Recommendations are always "Go", because the test suite provides evidence that there are no known functional risks.
  • Reports and evidence are collected by the system, eliminating manual effort from testers.

Test Automation

Testers often have to make an either/or decision between automating tests and executing tests. In a traditional testing mindset, the decision will most often favor doing the tests manually, and "automation will be done later". Note that "later", in this context, translates to, "as soon as there is time", which, in a bottleneck situation, is just a euphemism for "never".
The result is a vicious circle: Lack of automation means tests consume more capacity, which means there is less time for automation, which means we need to do more manual testing, which means there will be less automation. Additionally, there will be slower feedback for developers, which means there will be more defects - the consequences are already described in the other sections!

The reflex solution

Knowing the problem and understanding the vicious circle, most organizations simply decide to invest into test automation. Since in many cases, testers are not developers, they resort to the use of specialized tools for creating this automation that do not rely on developer knowledge. 
In almost every case, the automated test suites created in this way will eventually give rise to some critical problems that make the entire approach unsustainable:
  • There's a disconnect between the code and the test cases which can yield both false positives (reporting a defect when there is none) and false negatives (not finding a defect).
    • False positives create significant effort for defect analysis, which again drains testing capacity.
    • False negatives reduce confidence in the test automation and lead to further effort.
  • Automated test suites require continuous maintenance. If the test code is not Clean Code, the maintenance effort will eventually become prohibitive. Most organizations eventually come to the point where they need to trash their Test Automation created exclusively by testers.
  • Automated test suites created with tester tools often test at the wrong level, making the tests slow - in many cases, so slow that executing these tests after every code change is not an option.

A better solution

Instead of having testers, who are already constraining the performance of the development process, spend time on creating automated tests of questionable test code quality, use testers to define which scenarios can and should be automated, and use a testing framework close to the source code for creating tests that maximize execution performance. Apply rigorous Clean Code Practices, including Refactoring, to move every piece of test execution to the best possible level in the Test Pyramid. This significantly speeds up test execution. It likewise reduces the amount of effort required to maintain and update the test automation suite.
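
A quick back-of-the-envelope calculation shows why the test level matters so much. All numbers below are assumptions chosen purely for illustration (2000 tests, 30 seconds per UI test, 1 second per service-level test, 10 milliseconds per unit test) - plug in your own.

```python
from typing import Dict, Tuple

# Per test level: (number of tests, milliseconds per test).
# All figures are illustrative assumptions, not measurements.
Suite = Dict[str, Tuple[int, int]]

ALL_UI_SUITE: Suite = {"ui": (2_000, 30_000)}
PYRAMID_SUITE: Suite = {
    "unit": (1_800, 10),
    "service": (180, 1_000),
    "ui": (20, 30_000),
}


def suite_runtime_seconds(suite: Suite) -> float:
    """Total sequential runtime of a suite in seconds."""
    total_ms = sum(count * ms_per_test for count, ms_per_test in suite.values())
    return total_ms / 1000


all_ui_seconds = suite_runtime_seconds(ALL_UI_SUITE)    # 60000.0 s, about 16.7 hours
pyramid_seconds = suite_runtime_seconds(PYRAMID_SUITE)  # 798.0 s, about 13 minutes
```

Under these assumptions, the same 2000 tests take a bit more than half a working week when all of them run through the UI, and about a quarter of an hour when most of them have been moved down the pyramid.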


Functional and Acceptance Tests

We have learned from ISTQB what kind of tests are required in software testing: from the happy path through branch coverage and edge cases all the way to negative tests. Why do testers execute these tests? Because there are (probably) defects. And the new delivery can't be released until we know what the defects are, where they are, and how bad they are.

The reflex solution

When testers find many defects in functional testing, the obvious solution is to have the testers do more testing. This "more testing", in practice, means either postponing the delivery until defects are fixed (a theoretical, yet rare solution, because it is so undesirable) - or adding more testers. Neither addresses the root cause, i.e. why there are defects. Eventually, we get into the vicious circle of bug tracking and big batch delivery.

A better solution

None of the above mentioned tests need to be executed by testers. Why are there defects? We come back to having a disconnect between development and test, i.e. having built the wrong product to begin with.
Again, the solution is to ensure that quality criteria are clearly available to developers, consistently understood by everyone - and verified before software even enters testing. This sets testers free to do the tests that cannot sensibly be automated: for example, one-off tests, UX testing or exploratory tests.


Work-Arounds

Testers often spend hours setting up an intricate scenario in the system that would allow them to press that one button which would determine success or failure of the test, and therefore make or break an entire Release. They may be spending time to reverse-engineer the database, copy+paste data into web service requests, manipulate files on the system and many other things, just to be able to run their tests. None of these activities should ever be required to be done manually - and mostly, they shouldn't be the responsibility of testers. Every minute testers invest into these activities is a minute taken away from the things they really should be doing.

The reflex solution

Many organizations set up special data and configurations on their test environments which, under no circumstance whatsoever, must be used for any purpose other than the tests they are intended to be used for. In some cases, painstaking effort is invested into creating both surrounding governance and maintenance scripts that only exist to maintain the integrity of the tests.
This approach diverts massive test capacity from doing the work that matters. Every minute spent on this "solution" is a high-risk investment into an unsustainable test approach that still drains test capacity.

A better solution

The organization should have a serious discussion about what the best way to provision test environments is. The ideal situation is a "No-Touch Bootstrap", which provides a pristine test environment that is optimized to conduct all automated and manual tests with minimal effort and delay. Required data and configuration should be injected via the product's own capabilities, i.e. "design for testability", as part of the development process.
Creating optimally testable software is an exercise that involves testers communicating testing needs, designers and architects conceptualizing a way to achieve testability, and developers creating code that minimizes the effort of doing the right tests in the right way. 
Even when a legacy system doesn't offer proper testing capabilities, developers are the right people to provide scripts and other software solutions that allow testers to focus on that which matters in testing. 
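
As a sketch of what "injecting data via the product's own capabilities" could look like, the following example creates a pristine environment on every call and seeds it through the product's own registration function rather than by manipulating the database directly. CustomerService and bootstrap_environment are hypothetical names, and the in-memory SQLite database merely stands in for a real test environment.

```python
import sqlite3
from typing import Iterable


class CustomerService:
    """Hypothetical product component with its own way of creating data."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS customers "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def register(self, name: str) -> int:
        """Create a customer and return the new customer id."""
        cursor = self._db.execute(
            "INSERT INTO customers (name) VALUES (?)", (name,)
        )
        self._db.commit()
        return cursor.lastrowid

    def count(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


def bootstrap_environment(seed_customers: Iterable[str]) -> CustomerService:
    """Build a fresh, isolated environment and seed it without manual steps."""
    connection = sqlite3.connect(":memory:")  # pristine on every call
    service = CustomerService(connection)
    for name in seed_customers:
        service.register(name)  # data enters through the product itself
    return service


environment = bootstrap_environment(["Ada", "Grace"])
```

Because every call yields a new, isolated environment, there is no shared "do not touch" test data that needs governance or repair scripts.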


Doing the right things

If this article leaves you wondering what testers should be doing instead, and whether they'll still be needed at all, the short answer is "Yes".
The long answer is only just touched upon in many other posts - for example: engaging in product discovery, designing better test approaches, optimizing the test pyramid, improving existing tests, constantly improving the understanding of "how to create better tests", pushing for Zero Defect quality and shifting the test paradigm.

If we consider this short list as the value testers bring to an agile team, we'll just leave it with the short question: "How much time do testers have left to do those things after we've subtracted all of the time they're doing the things they shouldn't be doing?" All of these would have a scaled and sustainable value for the team, the product and the organization.
And still, in most organizations, the ratio is abysmally low, because people just push more work onto testers instead of finding ways to enable them to bring the value they could!

So here is my challenge: Do a pie chart and let your testers draw a slice for how much time they spend on each activity described in this article and use the outcomes as a reflection opportunity.
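
If you'd rather collect numbers than draw by hand, the slices are easy to calculate. The activities and hours below are made-up example data for a single tester's week, not measurements.

```python
from typing import Dict


def time_shares(hours_by_activity: Dict[str, float]) -> Dict[str, float]:
    """Convert hours per activity into percentage slices of the pie."""
    total = sum(hours_by_activity.values())
    if total == 0:
        raise ValueError("no hours recorded")
    return {
        activity: round(100 * hours / total, 1)
        for activity, hours in hours_by_activity.items()
    }


# Example data - replace with what your testers actually report.
example_week = {
    "Test setup": 8,
    "Bug tracking": 6,
    "Exploratory testing": 4,
    "Test design": 2,
}

slices = time_shares(example_week)
```

In this invented week, 70% of the time goes into test setup and bug tracking - activities from the "shouldn't be doing" part of this article.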

Summary

The intuitive "solutions" to capacity and performance problems in testing are neither helpful, nor sustainable. A paradigm shift is required, and part of that paradigm shift is to allow the available testers to work with maximum efficiency.

Some key activities that can maximize test efficiency include, without limitation, the ability and capacity of every team member to:

  • "Stop the Line" when the "Waiting for Test" buffer spills over, and not start more work until the pipeline clears - to reduce the amount of coordination effort required for testing,
  • Examine every activity done by testers, asking first, "What would be required to make this activity no longer needed?", and if it's inevitable: "Can someone else do this, or at least parts of it?",
  • Reduce (or remove) the possibility for defects by aligning early on, in order to eliminate all tester activity related to handling defects,
  • Engineer the software itself to ease testing,
  • Automate functional and acceptance tests as early as possible, ideally before any software is delivered (ATDD approach) and no later than the first delivery,
  • Automate time-consuming repetitive activity (especially functional regression tests),
  • Move test automation work to developers in the simplest, best possible way that is most consistent with the product's code,
  • Coach people in test execution, so as to share the workload,
  • Separate tester activity into "sustainable" and "unsustainable", and relentlessly push for higher sustainability.
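
The "Stop the Line" rule from the list above is simple enough to be written down as a team policy in a few lines. The buffer limit of 3 and the wording of the actions are assumptions for this sketch - every team has to find its own limit.

```python
def may_start_new_work(waiting_for_test: int, buffer_limit: int) -> bool:
    """New work may only start while the 'Waiting for Test' buffer has room."""
    return waiting_for_test < buffer_limit


def next_action(waiting_for_test: int, buffer_limit: int = 3) -> str:
    """Tell a team member who just finished an item what to do next."""
    if may_start_new_work(waiting_for_test, buffer_limit):
        return "pull the next backlog item"
    return "stop the line: help clear the test buffer first"
```

With the assumed limit of 3, a developer who finishes an item while three items are already waiting for test does not pull new work, but helps to get the waiting items through test.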
Depending on how much of the work described in the main section of this article your testers are doing, and how much delay is incurred in testing, you will quickly see substantial benefits in outcomes by doing the things above - and you don't need to invest a single additional cent!
