How my team and I avoided flaky tests in TestCafe

Have you ever struggled with flaky tests? Did it ever feel like “mission impossible”? Well, brace yourself: this article will take you through how my team and I drastically reduced test flakiness in TestCafe!

Claudia Colacicchi
TUI Tech Blog

--

My name is Claudia Colacicchi. I work as a Software Quality Assurance (QA) Specialist at TUI Musement, and in my journey with test automation I have learned that automating test scenarios brings amazing advantages, but also that the main struggle is flaky tests. For this reason, I thought it would be nice to share with you all how my team and I put a strategy in place against them!

Although I am sure the majority of you already know, I would like to start by defining what a flaky test is and why it is bad in Quality Assurance. A flaky test is a software test that yields both passing and failing results despite no changes to the code or the test itself. Flaky tests are bad for a Quality Assurance team because they lead to instability and a lack of confidence, a combination to absolutely avoid. Our job is to give product teams trusted confidence in their features, so our test runs should always have the highest possible level of stability and reliability.

Let’s get to the juicy part though: how did we fix this problem? Our strategy consisted of three main approaches which I am going to describe one by one: custom retries inside TestCafe methods, test retries with TestCafe’s Quarantine Mode, and retrying missing tests.

1- Custom retries inside TestCafe methods

The very first flakiness we struggled with was our UI tests failing to find web elements to interact with in the DOM. Because of this, our tests often did not get through some specific steps. We initially thought the elements were sometimes genuinely not displayed on the page; however, the screenshots showed them completely visible, and during our debugging sessions we could see and interact with them. We were clearly dealing with flakiness, and we decided to address it with custom retries inside the specific functions that interact with those web elements.

Here is an example. On the payment page of our e-commerce website, there are fields to fill out with personal details before completing the purchase. One of them is the first name field, where customers enter their name. Well, for some reason, our tests often failed to validate the existence of this field. You can see an example of such a failure below.
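
For context, here is a rough sketch of the original, retry-free version of that check. The selector is a hypothetical placeholder, not our real locator:

```typescript
import { Selector, t } from 'testcafe';

// Hypothetical selector for the first name input on the payment page.
const firstNameField = Selector('[data-test="first-name"]');

// The plain check that failed intermittently: sometimes the element was
// reported as missing from the DOM even though screenshots showed it visible.
export async function checkFirstNameField(): Promise<void> {
  await t.expect(firstNameField.exists).ok();
}
```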

The first approach we put in place to reduce the occurrence of this flakiness was a retry that executes the assertion up to three times before failing. To do so, as part of the checkFirstNameField() method, we implemented a for loop that repeats the assertion. Inside the loop, we added an if…else statement to check whether the maximum number of desired attempts has been reached and the repetition should stop. We set the retry limit to 4 and decided to throw an error when the limit is reached: if the assertion has been attempted 3 times and still fails, the test fails and prints an error message saying “Maximum number of attempts reached. The First Name field could not be reached”.

While the maximum number of attempts is not reached, the function keeps repeating the assertion whenever it fails, because it enters the else block and executes a try…catch statement. The assertion that verifies whether the first name field exists is performed inside the try block; if it succeeds, the firstNameFieldIsAvailable variable is set to true, the for loop is exited and the test can continue on to the next method. If the assertion fails, the catch block is entered, firstNameFieldIsAvailable is set to false and the loop continues. Below, you can see the related snippet of code:
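
Here is a sketch of that logic; the selector is again a hypothetical placeholder and minor details may differ from our actual implementation:

```typescript
import { Selector, t } from 'testcafe';

// Hypothetical selector; our real page object uses its own locator.
const firstNameField = Selector('[data-test="first-name"]');

const RETRY_LIMIT = 4;

export async function checkFirstNameField(): Promise<void> {
  let firstNameFieldIsAvailable = false;

  for (let attempt = 1; !firstNameFieldIsAvailable; attempt++) {
    if (attempt === RETRY_LIMIT) {
      // The assertion has already failed three times: stop and fail the test.
      throw new Error(
        'Maximum number of attempts reached. The First Name field could not be reached'
      );
    } else {
      try {
        // TestCafe still retries this assertion internally (Smart Assertion
        // Query Mechanism) until the assertion timeout expires.
        await t.expect(firstNameField.exists).ok();
        firstNameFieldIsAvailable = true; // success: the loop condition ends the loop
      } catch (error) {
        firstNameFieldIsAvailable = false; // failed attempt: try again
      }
    }
  }
}
```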

We had to apply these custom retries on top of a feature TestCafe already provides, which performs retries itself: the built-in wait mechanism. Let me explain what it is and why, despite it, we had to come up with a customized solution to fix our problem. In the case of an assertion, which is our example, TestCafe applies its Smart Assertion Query Mechanism: instead of statically waiting for page elements to appear, it repeats the evaluation until the assertion yields a successful result or the assertion timeout expires. Was this not enough to prevent our flakiness? Unfortunately not.

You might be wondering why we did not just extend the assertion timeout and let TestCafe do its intelligent job behind the scenes with a little extra time. Well, we did, but it was not enough to make the tests significantly more stable: until we implemented our custom retries, we still faced a degree of flakiness we could not afford. The really nice thing is that, right now, we have a customized retrying mechanism implemented on top of a smart built-in TestCafe feature that re-evaluates whether the element exists inside each of our custom retries. Pretty nice combination, huh?
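
For completeness, this is roughly how that extra time can be granted, either per assertion or globally from the CLI; the selector, fixture, URL and timeout value are illustrative only:

```typescript
import { Selector } from 'testcafe';

// Hypothetical selector and fixture, for illustration only.
const firstNameField = Selector('[data-test="first-name"]');

fixture('Payment page').page('https://example.com/payment');

test('First Name field is present', async (t) => {
  // Per-assertion timeout: give this assertion up to 10 seconds to succeed.
  await t
    .expect(firstNameField.exists)
    .ok('The First Name field should exist', { timeout: 10000 });
});

// Alternatively, the timeout can be raised for every assertion at once,
// e.g. from the CLI: testcafe chrome tests/ --assertion-timeout 10000
```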

2- Test retries with TestCafe’s Quarantine Mode

In certain cases, implementing retries within TestCafe’s methods was not enough. At this point, we decided to strengthen our strategy by making use of a very nice functionality powered by TestCafe: Quarantine Mode.
Quarantine Mode is a feature aimed at reducing false negatives and instability. When a test fails, TestCafe quarantines it and repeats it until it can be considered successful (or, as we will see, until an attempt limit is reached).

How can TestCafe eventually confirm that the test is successful? For this purpose, TestCafe provides a configurable option named “success threshold”: the number of successful attempts necessary to confirm the test’s success. The value of the success threshold can be decided by the QA team. My team and I agreed to set it to 1, and it has proven to be reliable.

Are you now wondering what happens if the test is rightly failing because of a bug? Will TestCafe keep repeating the test forever because the success threshold is not met? Of course not! Another option covers this need: the “attempt limit”, which is the maximum accepted number of test attempts. Again, its value can be decided by the team; in our case we set the attempt limit to 3, making TestCafe repeat our failed test 3 times at most and then mark it as failed if it does not succeed at least once during the retries.

Below, you can find our own configuration of Quarantine Mode. To summarize, with this setup, TestCafe’s Quarantine Mode quarantines a failed test and repeats it up to 3 times (the attempt limit). Within these repetitions, if the test reaches the success threshold of 1, meaning that it succeeds at least once, TestCafe marks it as passed; otherwise, it considers it failed.
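
Here is a minimal sketch of how these two options can be set through TestCafe’s programmatic runner; the test path and browser are placeholders, and the same values can also go into the TestCafe configuration file or on the command line:

```typescript
// run-tests.ts — a minimal sketch; paths and browser are placeholders.
import createTestCafe from 'testcafe';

async function runWithQuarantine(): Promise<void> {
  const testcafe = await createTestCafe();
  try {
    const failedCount = await testcafe
      .createRunner()
      .src(['tests/'])
      .browsers(['chrome:headless'])
      .run({
        quarantineMode: {
          successThreshold: 1, // one passing attempt is enough to mark the test as passed
          attemptLimit: 3,     // stop and mark the test as failed after 3 attempts
        },
      });

    console.log(`Failed tests: ${failedCount}`);
  } finally {
    await testcafe.close();
  }
}

runWithQuarantine();
```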

Be aware that if the success threshold is met before reaching the attempt limit, the test will be marked as passed and TestCafe won’t keep executing it.

3- Retrying missing tests

On top of false negatives and other problems that made our tests wrongly fail, my team and I also faced issues with tests not running at all. Sometimes our reports showed incomplete test runs, where one or more test scenarios had not run.

You could argue that having incomplete test runs is not real test flakiness, which is true. However, we considered that it fits the broader definition of our tests failing to produce the same outcome on each individual run, which caused unreliability. That is why we included a solution to this problem in our strategy to tackle flakiness.

We addressed this issue by implementing a check that verifies which tests, if any, are missing from the report and re-triggers their run. Below, you can see the script that helped us do that; since our reporting system is Allure, you will see references to it.
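
Here is a simplified sketch of that idea rather than our exact script. It assumes that Allure writes one *-result.json file per executed test into an allure-results folder and that each file carries the test title in its name field; the list of expected test titles is purely illustrative:

```typescript
// retry-missing-tests.ts — a simplified sketch of the idea, not our exact script.
import { readdirSync, readFileSync } from 'fs';
import { join } from 'path';
import createTestCafe from 'testcafe';

// The test titles we expect to find in every report (illustrative names).
const expectedTests = [
  'Customer can fill in personal details on the payment page',
  'Customer can complete a purchase',
];

// Collect the titles of the tests that actually produced an Allure result file.
function getExecutedTests(resultsDir: string): Set<string> {
  const executed = new Set<string>();
  for (const file of readdirSync(resultsDir)) {
    if (!file.endsWith('-result.json')) continue;
    const result = JSON.parse(readFileSync(join(resultsDir, file), 'utf8'));
    if (result.name) executed.add(result.name);
  }
  return executed;
}

async function retryMissingTests(): Promise<void> {
  const executed = getExecutedTests('allure-results');
  const missing = expectedTests.filter((name) => !executed.has(name));

  if (missing.length === 0) {
    console.log('All expected tests are present in the report.');
    return;
  }

  console.log(`Re-triggering missing tests: ${missing.join(', ')}`);
  const testcafe = await createTestCafe();
  try {
    await testcafe
      .createRunner()
      .src(['tests/'])
      .browsers(['chrome:headless'])
      .filter((testName) => missing.includes(testName)) // run only the missing tests
      .run();
  } finally {
    await testcafe.close();
  }
}

retryMissingTests();
```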

Since applying the three solutions described above, my team and I have seen a significant improvement in our test runs. We have far fewer false negatives than before, and the reliability of our testing strategy has increased greatly. We have never had an incomplete test run again, and we have gained much more trust from our development teams. It was an amazing achievement that we reached as a team.

For this reason, I want to express my sincere gratitude to my team leader Aurélien Lair and my teammates Andrea Zito and David Contreras for their continuous support, their effort and passion to achieve our shared goals. This article would not exist if it was not for our amazing teamwork, thank you guys!

We really hope that the solutions described in this article can be helpful for you Quality Assurance Engineers out there. In case you are struggling with the same issues we faced, we strongly suggest trying out our solutions!
