Testing Horror Story - Selenium Mania


This is a story about a team that I joined as a test automation expert and left very quickly (you'll see why). I continued to work for the company on other projects, taking on different roles, so I got to see the tragedy unfold from the sidelines.

The team had about five developers and three test engineers (four with me). They were maintaining a legacy system that had initially been developed by an external agency.

This system had next to no unit or integration tests (a single-digit number of unit tests), and the "test strategy" was to test everything via browser E2E tests using Selenium. If you are familiar with the test automation pyramid, you might also know about the test automation ice cream cone. It is the testing pyramid flipped on its head: everything is tested with E2E tests, and the team (usually) also relies heavily on manual testing, because automating everything as E2E tests is simply not feasible.

The test engineers had written close to 1,000 E2E tests with Selenium. One test run took several hours and blocked most of the resources that were also needed for building the software and executing manual tests. As a result, the automated test suite was only run on weekends.

You see this sometimes with teams that do not have a good understanding of test automation strategies:

  • Tests are developed using one E2E framework or another
  • Initially tests can be written very quickly for simple scenarios
  • More and more tests are piled on over time
  • As scenarios become more complex, the team has to work around limited test data, testing resources, etc.
  • Test execution takes longer and longer, tests become hard to maintain and flaky, and many tests end up disabled
  • Tests are executed infrequently (at night, weekends, before a release, etc.)
  • Eventually the test suite no longer serves any purpose, as execution takes too long for day-to-day work and test results are unreliable
  • The sunk-cost fallacy prevents the team from scrapping the tests, so the test suite is kept around and maintained half-heartedly

Another predictable problem was the maintainability of the test suite (or lack thereof). E2E tests are hard to maintain because their scope is so large.

Because they need so much setup (test data, browser, network resources, etc.), E2E tests often use a Shared Fixture. This makes them prone to leaks (test data is not properly cleaned up before the next test case starts, the browser carries over cookies from previous tests, etc.), which in turn makes them unpredictable and flaky.
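
A minimal sketch of the alternative, assuming pytest and the Selenium Python bindings (the URL and element IDs are made up for illustration): every test gets its own fresh browser session and tears it down afterwards, so cookies and state cannot leak into the next test.

    # Illustrative only: a fresh, isolated browser per test instead of a shared fixture.
    import pytest
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    @pytest.fixture
    def driver():
        options = webdriver.ChromeOptions()
        options.add_argument("--headless=new")
        drv = webdriver.Chrome(options=options)  # brand-new session: no cookies or state from earlier tests
        yield drv
        drv.quit()  # tear down so nothing survives into the next test

    def test_login_shows_dashboard(driver):
        driver.get("https://example.test/login")  # placeholder URL
        driver.find_element(By.ID, "username").send_keys("alice")
        driver.find_element(By.ID, "password").send_keys("secret")
        driver.find_element(By.ID, "login-button").click()
        assert "Dashboard" in driver.title

The isolation costs some startup time per test, but it removes an entire class of "works alone, fails in the suite" flakiness.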

The test engineers spent most of their time debugging test runs from the previous weekend. By the time they could confirm whether the failing tests were false positives, it was already too late - the application had already been deployed, bugs included.

The test engineers therefore had to spend their remaining time manually testing the application before any release date. The backlog of test cases that needed to be automated kept growing, as they were simply not able to A) debug all the tests, B) change tests as needed, C) manually test the application, and D) design and code new test cases all at the same time.

This is where they asked me to help them.

I know what you are thinking now: "Aha! You bolted because you thought those issues were unsolvable. Coward!"

First off: wow, harsh!

But also: not quite. Because, you see, they did not ask me to come in as an expert to address those problems. They asked me to help them automate all those test cases they were behind on.

When I pointed out that this approach simply does not scale and would only result in more maintenance work, more flakiness, and longer test runtimes, the test engineers confirmed that, yes, that was going to happen. I suggested that they might instead talk with the developers about implementing other types of tests together: e.g. unit tests, tests against the API, and performance tests.

That was when it was made clear to me that any further communication with the individual team members had to be approved by the test lead first. I saw the writing on the wall and moved to other projects.

What happened after

The test lead eventually left the team - a year or so after my short stint with them - and the team was left with the mess. (I might also add at this point that this test lead did not do any coding himself, so he never felt the pain.)

As far as I know, they never resolved their issues. The last time I spoke with the (then new) person responsible for their test strategy, he confided in me that they still spend a lot of time maintaining and debugging the tests. The team also felt it necessary to polish the number of bugs found, because they knew they would be in big trouble if management ever found out how many bugs all those E2E tests actually caught: close to zero; practically all bugs were found in manual testing sessions.

E2E Testing Frameworks

I have worked with multiple E2E testing frameworks (mostly for browsers) that all face the same issues. Be it Selenium, Cypress, Playwright, or any of the others, none of them can get around one simple fact:

GUIs are made for users, not machines

We as humans interact with software completely differently than software interacts with other software. That is why I recommend automating primarily against APIs and only adding tests that interact with the GUI when there is no other reasonable way to automate. And, for that matter, only automating against the API when a unit test will not do.

A natural result of that principle is that most of the tests I end up with are unit tests, plus some higher-level automated tests against the API and very few GUI tests. So, in essence, the distribution that the test automation pyramid recommends.
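
To make that concrete, here is a hedged sketch of an API-level check; the endpoint, payload, and response fields are assumptions for illustration only. It exercises the same behaviour a browser test would cover, but without a browser, a Selenium grid, or shared UI state, and it runs in a fraction of the time.

    # Illustrative only: the endpoint and payload below are made up.
    import requests

    def test_create_order_returns_id():
        # Exercise the behaviour directly against the API instead of
        # clicking through the GUI.
        response = requests.post(
            "https://api.example.test/orders",
            json={"item": "book", "quantity": 1},
            timeout=5,
        )
        assert response.status_code == 201
        assert "order_id" in response.json()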

The Moral of the Story

  • When you decide to introduce test automation for existing software, do not take the seemingly easy way and automate all tests via the UI. You will end up paying for that decision in the long run.
  • If you bring an expert onto the team, take their advice on board and make sure they are not fleeing the project. And if they do flee, take that as a hint.