Testing Under Pressure: How We Built an Autonomous Testing Framework

Yuvaraja · Published in GoPenAI · 8 min read · Mar 2, 2024

A while back, I worked on a project where we needed to perform automated sanity tests after deploying to a production instance. Sanity testing is a fast check to ensure that recently added or changed features in a software application are working as expected, without doing a full test. It helps make sure the essential functions are working before going into more detailed testing, acting as a quick quality assurance step. The test setup had to change dynamically based on the release type, because frontend and backend releases happened at different times and required different configurations.

The test team had only a one-hour window for sanity testing and reporting any issues. We ran our automated test suite for the sanity check, containing close to 100 test cases. Due to the application’s size, we needed to check the basic functionality of all important components, as requested by the business.

Imagine you’re getting ready to launch a rocket into space. Now, before that happens, you want to make sure everything on the rocket works perfectly — the engines, the gadgets, everything. That’s a bit like what we’re doing with our app, but instead of a rocket, it’s our computer program.

Now, here’s the tricky part — we only have one hour to do this check. Why? Well, think of it like catching a train. If you miss it, you have to wait for the next one, and that can cause delays. For us, if we miss our one-hour window, it could mean delays in getting our app out there for people to use.

So, that one hour is like our super-important deadline. We need to check everything quickly to make sure our app is good to go. If we take too long, it could cause problems for everyone involved — the team, the company, and even the people waiting to use the app. That’s why every second counts in our one-hour testing mission!

Although the tests were executed in parallel, it took a minimum of 30 to 45 minutes to complete the test execution. The test suite comprised simple, medium, and some complex test cases that assessed critical functionality in the application. The complex tests took up to 30 minutes to complete due to the nature of the application, not because of poor test performance. We faced several challenges in completing the sanity testing within the one-hour window.

Challenges Faced

Problem 1

Most of the time the tests ran without issues, and we could finish the sanity check on time. However, occasional changes were made to the frontend, and the test team might not have been notified about them, resulting in numerous test failures. The team then faced pressure to analyze all the failures and report any new issues within the remaining 15 minutes. Minor UI changes, such as modifying web element text or properties, made after the regression cycle significantly affected automation testing.

For instance, every test must pass through the Login page. If any characteristic of the web elements on the Login page changes, none of the tests will work correctly. In this case, the automation tester needs to recognize and fix the tests within the sanity window and then rerun them. The Login page consists of web elements like the UserName text box, Password text box, and Login button. If a property of the Login button, such as its Text, ID, Name, or Tag, changes, the tests may no longer be able to recognize the Login button.
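To make the failure mode concrete, here is a minimal sketch of how such a suite might locate those elements, assuming a Selenium-based page object; the element IDs (username, password, login-btn) are hypothetical. If the application renames any one of them, every test that logs in breaks at this point.

```python
from selenium.webdriver.common.by import By


class LoginPage:
    # Hypothetical locators for the Login page elements.
    USERNAME = (By.ID, "username")       # UserName text box
    PASSWORD = (By.ID, "password")       # Password text box
    LOGIN_BUTTON = (By.ID, "login-btn")  # Login button

    def __init__(self, driver):
        self.driver = driver

    def login(self, user, password):
        # A change to any of these locators (ID, name, tag, text) raises
        # NoSuchElementException for every test that passes through this page.
        self.driver.find_element(*self.USERNAME).send_keys(user)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.LOGIN_BUTTON).click()
```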

In this situation, if the Login button changes, it takes only 5 minutes for all tests using the Login page to fail. The automation tester spends the next 5 minutes figuring out what went wrong and another 5 minutes fixing the affected tests. So, a total of 15 minutes is already used up because of the Login button change. Now, there are only 45 minutes left in the sanity window.

Problem 2

Deployment-related issues sometimes caused many test failures, requiring quick analysis and reporting to stakeholders. Some deployment issues were recurring problems that could be solved quickly, but analyzing all the test failures within 15 minutes and identifying the type of issue was challenging.

For instance, if the search tests fail with a specific error ‘xyz’ on the screen, based on our experience, we understand that a particular dependency ‘abc’ is not configured correctly. In such cases, we inform the developers to properly add the dependency ‘abc’. However, manually categorizing the test failures quickly is a challenging task.

Let’s consider the example where there are 20 test failures, and only one test case failed due to the search error ‘xyz’ on the screen. This error sits in the middle or at the end of the failure list, while the remaining failures are attributed to various reasons such as ‘success message missing,’ ‘unable to update,’ ‘element not found,’ and so on. If we were to analyze these failures manually, we’d need to go through each failed test one by one. Finding the specific test failure caused by the missing ‘abc’ dependency would take at least 10 to 15 minutes.
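One way to shortcut that lookup is to keep a table of known error signatures and their likely root causes. The sketch below is only illustrative and reuses the placeholder names from the example above (the ‘xyz’ error and the ‘abc’ dependency); the approach we actually settled on is described in the Test Result Analyzer section.

```python
# Map known error signatures to their likely root cause.
# The 'xyz'/'abc' names are the placeholders used in the example above.
KNOWN_ISSUES = {
    "xyz": "Dependency 'abc' is not configured correctly - notify the developers",
    "element not found": "Possible UI change - check recent frontend deployments",
}


def diagnose(failures):
    """failures: list of (test_name, error_message) tuples."""
    for test_name, error in failures:
        for signature, cause in KNOWN_ISSUES.items():
            if signature in error:
                print(f"{test_name}: {cause}")
                break
        else:
            # No known signature matched - flag for manual analysis.
            print(f"{test_name}: needs manual analysis ({error})")
```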

Problem 3

The sanity test occurred after business hours, with one person acting as the point of contact and sending out an email when deployment was completed. At least 10 to 12 test teams, including mine, were involved, each testing their own functionality. Deployment delays, or waiting for hours before starting the sanity tests, especially late at night, added pressure. The person initiating the automated tests needed to check email constantly. Sometimes they would miss the deployment email due to the extended waiting time, leading to a reduced testing window. Under pressure, there were instances of starting tests with the wrong configuration, requiring us to stop and rerun with the correct one.

If the automation tester notices the deployment mail 10 minutes after it arrives and, rushing because they fear they have already lost time, starts the run with the wrong test configuration, they only realize the mistake mid-execution and have to stop the process and restart with the correct configuration. This can consume almost 20 to 30 minutes, leaving only 30 minutes of the sanity window.

We brought these issues to the attention of management, seeking to either extend the duration of the sanity tests or decrease the number of tests in the suite. However, extending the window would have meant higher expenses, since numerous stakeholders and teams besides mine were part of the sanity process. On the other hand, reducing the number of tests could jeopardize the quality of the application. In response, management decided to assign two additional people to assist us in the sanity process.

So, we looked for solutions to address these challenges:

  • A test framework capable of adapting to web element changes.
  • A solution to quickly analyze numerous test failures and provide a summary of the test results.
  • A solution to trigger automated tests immediately after receiving deployment emails, capable of understanding the emails and selecting the right configuration for test execution.

Solution

We continued our research for a significant amount of time while also handling our usual tasks and running the sanity tests despite the challenges. Over time, we found solutions for each problem, leading to the creation of an autonomous test framework.

Mail Reader

The Mail Reader addresses triggering automated tests immediately after a deployment email arrives: it understands the email and selects the right configuration for the test execution.

Let’s return to the scenario outlined in the Problem 3 section. With the mail reader, there are no longer instances of triggering the tests with the wrong configuration. It kicks off the test execution immediately upon receiving the deployment email and employs basic machine learning techniques to classify emails, distinguishing those related to frontend changes from those related to backend changes. It also adapts the test configuration to the type of deployment, using different configurations for frontend and backend changes. In short, the mail reader comprehends the content of the email, selects the appropriate configuration, and initiates the test execution accurately and promptly.
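As a rough illustration of the idea, here is a minimal sketch of a mail-triggered run, assuming an IMAP mailbox and a suite started from the command line. The host, credentials, subject filter, config file names, and the run_sanity.py entry point are all hypothetical, and simple keyword rules stand in for the actual email classifier.

```python
import email
import imaplib
import subprocess

# Hypothetical mail server, credentials, and configuration files.
IMAP_HOST = "imap.example.com"
USER = "qa-bot@example.com"
PASSWORD = "app-password"
CONFIG_BY_RELEASE = {
    "frontend": "config/frontend_sanity.yaml",
    "backend": "config/backend_sanity.yaml",
}


def classify(subject):
    """Toy stand-in for the classifier: keyword rules instead of a trained model."""
    subject = (subject or "").lower()
    if "frontend" in subject or "ui" in subject:
        return "frontend"
    if "backend" in subject or "api" in subject:
        return "backend"
    return None


def check_for_deployment_mail():
    with imaplib.IMAP4_SSL(IMAP_HOST) as imap:
        imap.login(USER, PASSWORD)
        imap.select("INBOX")
        # Only unread mails whose subject mentions a completed deployment.
        _, data = imap.search(None, '(UNSEEN SUBJECT "deployment completed")')
        for num in data[0].split():
            _, msg_data = imap.fetch(num, "(RFC822)")
            msg = email.message_from_bytes(msg_data[0][1])
            release_type = classify(msg["Subject"])
            if release_type:
                config = CONFIG_BY_RELEASE[release_type]
                # run_sanity.py is a placeholder for the suite's entry point.
                subprocess.run(["python", "run_sanity.py", "--config", config])


if __name__ == "__main__":
    check_for_deployment_mail()
```

In practice a script like this would be run on a schedule or as a long-lived poller, so the suite starts within moments of the deployment email landing rather than whenever someone happens to check the inbox.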

Self-Healing

Self-Healing addresses the dynamically changing UI.

The following video illustrates how self-healing addresses dynamically changing UI elements.

Now, let’s reflect on the example of the Login button discussed in the Problem 1 section. If the properties of the login button change and the web element selector in the test script is unable to locate the login button, the self-healing mechanism is triggered. It identifies the new login button and updates the element selector in the test script, allowing the execution to proceed. This self-healing solution can be accomplished within 1 to 2 minutes, as opposed to the 15 minutes required for manual resolution.
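The sketch below shows one way a self-healing lookup could work, again assuming Selenium. The fallback strategy (matching candidate elements by their visible text or value) and the locator registry are illustrative, not the exact mechanism our framework uses.

```python
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By


def find_with_healing(driver, locator, fallback_text, registry, key):
    """Try the stored locator; if it breaks, look for a likely replacement."""
    try:
        return driver.find_element(*locator)
    except NoSuchElementException:
        # Primary locator broke (e.g. the Login button's ID changed).
        # Scan likely candidates and pick one whose label still matches.
        for candidate in driver.find_elements(
            By.XPATH, "//button | //input[@type='submit']"
        ):
            label = (candidate.text or candidate.get_attribute("value") or "").strip()
            if label.lower() == fallback_text.lower():
                # Heal: remember the new locator so later lookups use it directly.
                registry[key] = (
                    By.XPATH,
                    f"//*[normalize-space()='{label}' or @value='{label}']",
                )
                return candidate
        raise  # nothing plausible found - report as a genuine failure


# Example (hypothetical names):
# button = find_with_healing(driver, (By.ID, "login-btn"), "Login",
#                            locators, "login_button")
```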

Test Result Analyzer

The Test Result Analyzer addresses the need for a quick and easy way to analyze and categorize test results.

The video below demonstrates how the Test Result Analyzer facilitates rapid analysis.

For the example provided in the Problem 2 section, the Test Result Analyzer streamlines the process. Rather than manually inspecting each test failure for errors, the analyzer compiles a list of unique errors from the test execution and associates them with the respective failed test cases. With the basic version of the Test Result Analyzer, analysis can be completed in less than 5 minutes, significantly saving time. The Advanced version goes further and automates the analysis, completing it within seconds. Further details on the Advanced version are available in the article below.
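The core of the basic version can be pictured as a group-by over error messages. This is only a sketch, assuming each failure is reported as a test name plus an error message; the single ‘xyz’ failure then appears as its own group instead of being buried in a list of twenty.

```python
from collections import defaultdict


def summarize(failures):
    """failures: list of (test_name, error_message) tuples."""
    by_error = defaultdict(list)
    for test_name, error in failures:
        by_error[error].append(test_name)

    # Most frequent errors first; each unique error is read once, not per test.
    for error, tests in sorted(by_error.items(), key=lambda kv: -len(kv[1])):
        print(f"{len(tests)} test(s) failed with: {error}")
        for name in tests:
            print(f"  - {name}")
```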

The following article provides detailed information on each of the solutions.

Conclusion

In wrapping up our journey through the twists and turns of automated sanity testing, we’ve learned that quick thinking, adaptability, and good communication are key. From pesky frontend changes to deployment hiccups, we faced it all. But the star of our story is the autonomous test framework we built. It transformed our testing game, proving that being proactive pays off. This article spills the beans on our challenges and victories, showing how we evolved to a more efficient and reliable testing approach. I hope our journey helps you and gives you some ideas for your own testing adventures.

Passionate about developing an AI-based Test Automation Framework. 💡Check out my AI-based test framework tools and demos https://link.medium.com/90mo7Trinyb