To Automate or Not to Automate? Deciding Which Tests to Automate for Maximum Efficiency

The Quality Guy
5 min read · Mar 16, 2023

As we continue to expand our automation testing efforts, it’s critical to establish a reliable system or approach for determining which test cases should be automated.

Building on my previous article, where I discussed the differences between manual and automation testing, I'd like to continue by sharing a principle-based method for assessing whether a test case is a good candidate for automation.

This approach is not a groundbreaking revelation, as several large organizations utilize it to scale their automation tests. The technique was initially presented by Angie Jones, a former Senior Automation Engineer at Twitter, in a conference talk five years ago, long before Elon Musk’s wild era began.

Let’s begin with the objectives. The aims of applying this method in our QA process are the following:

  • Uncovering the essential factors for choosing which tests to automate effectively
  • Acquiring the necessary data for informed decision-making
  • Developing a universal, principle-based method to assess whether a test should be automated

To assist in our decision-making process, we can use a scoring table built around several key factors that should be taken into account when considering whether a given scenario or test should be automated. Each factor captures a distinct aspect that can help guide our decision: Gut Feeling (G), Usage (U), Impact (I), Complexity (C), and History (H), each covered in detail below.

By using this table as a reference, we can be confident that we are taking a well-informed and objective approach to our testing process.

G for “Gut Feeling”

As you write your scenario or test case, you may feel a bit like a psychic, relying on your “gut feeling” to determine whether it should be automated or not. Don’t worry, we’ve all been there. But let’s face it, our gut feeling can sometimes be a little too trusting. I mean, have you ever had a gut feeling that you left your stove on, only to come back and find it wasn’t even turned on in the first place? Yeah, me neither.

Anyway, relying solely on our gut feeling can lead to a dangerous path of automating every test case, even the ones that shouldn’t be automated. It’s like trying to fit a square peg into a round hole. Sure, it might work, but it’s not the most efficient or effective way of doing things.

That’s why we need to look at other factors to determine if a test case should be automated or not. By doing so, we can optimize our efforts and focus on the tests that truly matter. So let’s put our psychic abilities aside and use some other factors to make the best decisions for our automation testing strategy.

U for Usage

This factor revolves around how frequently customers actually use the feature. At our organization, we can leverage Mixpanel to analyze event volumes and compare them against the functionalities we have previously identified, to determine whether usage is high or low. Additionally, we can gather feedback from our Customer Success teams or explore relevant dashboards to gain further insight into how these features are used.

However, if the feature or functionality being tested is relatively new and doesn’t have any production data yet, we can turn to our Product Owners for input. They can provide insights on the potential usage based on their product research and the frequency at which customers have requested the feature.
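To make this concrete, here is a minimal sketch of how raw event volume could be turned into a 0-10 Usage score. The reference volumes and the linear scaling are assumptions for illustration only; the actual counts would come from a Mixpanel report or dashboard, and the thresholds should be calibrated against your own data.

def usage_score(monthly_events: int, low: int = 1_000, high: int = 100_000) -> int:
    """Map a monthly event count onto a 0-10 Usage score.

    `low` is the volume below which usage is negligible; `high` is the
    volume at which usage is maximal. Both are made-up reference points
    to be calibrated against real Mixpanel data.
    """
    if monthly_events <= low:
        return 0
    if monthly_events >= high:
        return 10
    # Linear interpolation between the two reference volumes.
    return round(10 * (monthly_events - low) / (high - low))

print(usage_score(50_000))   # 5 -> moderately used feature

For a brand-new feature with no production events yet, the Product Owner's estimate can stand in for the event count.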

I for Impact

Another key factor is the impact a failure would have on our customers. For instance, if a user is unable to log in or access their data, it greatly affects their overall experience with our app. Even when we don't have data to support this, it's crucial to keep the customer impact in mind and prioritize testing accordingly. After all, happy customers are what keep our business going!

C for Complexity

When deciding whether to automate a test, it is also worth considering the complexity of implementing the automation. Complexity alone, however, should not determine the outcome; frequency of use and potential impact on customers must be weighed as well, so that the tests we automate align with business goals and objectives. For instance, although automating "liking a post" may be simpler than automating "creating a post", a broken liking feature is unlikely to cause panic among users, since they can still read, edit, and comment on posts. The overall significance of each feature must therefore be evaluated alongside its complexity before deciding whether to automate.

H for History

This is a crucial factor that requires concrete data: a feature's history is its track record of reported bugs. For a new feature that lacks any historical data on bugs or failures, we can examine a similar or related feature to get an idea of what to expect. If no related data is available either, we assign a score of 0 for this factor, but we must still make sure the feature is covered by automated testing, manual testing, or both, as ignoring it could have dire consequences. In our case, we monitor bug statistics that show bug volume per component.
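As a rough illustration, the sketch below turns per-component bug counts into a 0-10 History score. The component names, the counts, and the cap of 20 bugs are all hypothetical; the real numbers would come from whatever bug tracker feeds the statistics mentioned above.

bugs_per_component = {          # hypothetical numbers from a bug tracker
    "login": 24,
    "post-creation": 9,
    "post-liking": 2,
    "csv-export": None,         # new feature, no production history yet
}

def history_score(bug_count, cap: int = 20) -> int:
    """Scale a bug count to 0-10, treating `cap` or more bugs as a 10."""
    if bug_count is None:       # no historical data -> score 0
        return 0
    return min(10, round(10 * bug_count / cap))

for component, count in bugs_per_component.items():
    print(f"{component}: {history_score(count)}")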

To help us make more informed decisions, we can assign a score to each factor or category, except for the subjective "gut feeling" factor. A scale of 0 to 10 works well, as it is both granular enough and easy to interpret.
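Putting it together, here is a minimal sketch of the scorecard itself. Each factor is scored 0-10 as described above; the 70% and 40% decision thresholds are illustrative choices of mine, not part of the original method.

def automation_score(scores: dict) -> float:
    """Return the total as a percentage of the maximum possible score.

    Only the factors actually present in `scores` count toward the
    maximum, so a new feature can be scored on Impact and Complexity
    alone and still yield a comparable percentage.
    """
    return 100 * sum(scores.values()) / (10 * len(scores))

def decide(pct: float) -> str:
    if pct >= 70:
        return "automate"
    if pct >= 40:
        return "possibly automate; revisit later"
    return "do not automate"

# Example: a login test, with Complexity scored so that a higher value
# means it is easier to automate (invert if you score difficulty).
login_test = {"usage": 10, "impact": 10, "complexity": 8, "history": 9}
pct = automation_score(login_test)
print(f"{pct:.0f}% -> {decide(pct)}")   # 92% -> automate

Because the percentage is computed only over the factors you actually scored, this naturally supports scoring a test on just the relevant factors, as discussed in the conclusion below.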

As an illustration, during our recent workshop, we applied this approach to several test scenarios, and I’d like to share some examples of the scores we assigned for each one.

In conclusion, through this exercise we have gained a more objective understanding of which test cases we should prioritize for automation, which ones we can delay, and which ones we should avoid. It's important to note that NOT all factors need to be considered every time, but rather a combination of the relevant ones. For newer features, for example, we could use a combination of Gut Feeling, Impact, and Complexity, and rely on History or Usage for older, established features that have not yet been reviewed for automation.

By taking a more systematic approach to determining which tests to automate, we can improve the efficiency and effectiveness of our testing process.

