AI/ML in Software Test Automation

Tool vendors claim a 1000x increase in productivity. Here’s an alternative view and some critical advice when considering AI/ML test automation tools.

Blake Norrish
Slalom Build

--

The incredibly optimistic claims about the benefits of AI/ML in automation tools are hard to ignore: write automated tests 3x faster! 90% cheaper! Get a 1000x productivity improvement! (Yes, one thousand.)

While all marketing teams embellish, the degree of improvement claimed by AI/ML tool vendors seems unprecedented. If these claims were accurate, no engineering leader or automation architect could afford to ignore them.

Unfortunately, finding an unbiased, critical evaluation of these tools is challenging. Some tools are so new there is not yet significant community feedback, and most existing documentation comes directly from the vendors selling the product or from pundits without industry experience.

Our position as a service company building software for clients across dozens of industries puts us in a unique position to evaluate these tools. Our clients expect us to bring best-of-class solutions — if these AI/ML tools did provide the claimed benefits and we did not leverage them, we would quickly be out of a job.

While there isn’t space in this article to do an exhaustive and detailed evaluation of every AI/ML tool, we do want to provide our opinion on the value of these tools in general, and give you several things to keep in mind when considering them for use in your own organization.

For context, we’ve copied some of the marketing material used by AI/ML tools below. These quotes are taken verbatim from different vendor websites or advertisements. We did not include links as we do not want to single out any tool specifically.

It is sometimes hard to separate out the claimed improvements due to AI/ML from other features (low-code, etc.), but all the examples below are from tools that explicitly advertise their use of AI/ML in some way. Here they are:

  • [this tool] authors tests with a few clicks: Choose a flow in your app — our AI will prebuild the test for you. Add assertions, change test data, reorganize the flow and run in minutes. Tests are fully customizable with little-to-no code required.
  • 90% faster authoring of low code E2E tests with AI-powered stability
  • Create automated tests that survive almost any UI change and are literally “unbreakable.” Create tests with a virtual and constant retestID, that replaces ugly CSS or XPATH classifiers. No page-object pattern, no manual abstraction.
  • [this tool] lets testing teams easily create and execute exponentially more tests than they can with current automation. AI Scripting: 1000x productivity improvement, Test Designer: 10x productivity improvement, Unified Testing: 3x productivity improvement.
  • [this tool] uses AI to empower each QA person with the equivalent of 100 brainpower. Whether engineers, analysts or others, [this tool] makes them vastly more productive versus their single brainpower with current test automation. Thus, [this tool] lets you get more testing performed with far fewer resources, or much more testing performed with current resources.
  • Gain up to 250% ROI in 2 years. Reduce testing man hours by up to 92%. Cut time spent on regression testing by 87%. Save up to $7,425 for every test cycle.
  • 90% average increase in test coverage. 3x Faster test creation. 40% fewer bugs in production.

How should engineering leaders and automation architects react to claims like these? Here is our opinion on the true merit of these tools, and four specific things to consider as you evaluate them for use in your organization.

ONE: Avoid tools that don’t explicitly describe how they are using AI/ML and what problem the tool is using AI/ML to solve

There are many challenges within testing and test automation that AI/ML could potentially address. While some vendors and framework authors explicitly describe how they leverage AI/ML to solve specific problems, others only vaguely attach AI/ML to their product without providing supporting information. For example: “AI bots generate your tests for you in a fraction of the time!”

Specific areas where AI/ML is being used include:

  • Visual Regression: identification of visual differences between pages after code changes.
  • Log Analysis (AI-Ops): processing large amounts of log data to identify anomalous events.
  • Element Identification: using AI/ML to identify specific UI elements within an HTML page, rather than using normal selector algorithms.
  • Test Identification: determining which tests within a large suite are most likely to find issues given a set of changes (a simple non-AI baseline is sketched after this list).
  • Test Authoring: using natural language processing or recorded user actions to automatically generate tests for an application.
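
To make the test-identification idea concrete, here is a deliberately naive, non-AI baseline: run only the tests whose known dependencies overlap the changed files. The test names, file paths, and dependency map below are hypothetical; real systems derive this mapping from coverage data or a module graph, and AI/ML approaches attempt to learn it from historical test results.

```typescript
// A naive, non-AI baseline for test identification: run only the tests
// whose declared dependencies overlap the set of changed files.
type TestEntry = { name: string; touches: string[] };

// Hypothetical suite; real mappings come from coverage data or a module graph.
const suite: TestEntry[] = [
  { name: "checkout.spec.ts", touches: ["src/cart.ts", "src/payment.ts"] },
  { name: "search.spec.ts", touches: ["src/search.ts"] },
  { name: "profile.spec.ts", touches: ["src/user.ts"] },
];

function selectTests(changedFiles: string[], tests: TestEntry[]): string[] {
  const changed = new Set(changedFiles);
  return tests
    .filter((t) => t.touches.some((file) => changed.has(file)))
    .map((t) => t.name);
}

console.log(selectTests(["src/payment.ts"], suite)); // ["checkout.spec.ts"]
```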

Some of these are more amenable to AI/ML than others. In our opinion, the nature of visual regression and log analysis makes them good candidates for AI/ML algorithms, and many vendors are providing interesting tooling in these areas worthy of your attention.

While element identification is a problem that can be aided by AI/ML, applying AI/ML here does not address the root cause; it’s a band-aid over a symptom. If your UI automation cannot find deterministic, simple selectors for critical elements, address that problem at its source. “Self-healing” selectors suffer from the same criticism. We are not convinced of the value of using AI/ML for element identification.
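
For illustration, fixing the problem at its source can be as simple as having developers expose stable test hooks in the markup and selecting on them directly. Here is a minimal Playwright sketch; the data-testid values and URL are hypothetical:

```typescript
import { test, expect } from "@playwright/test";

test("submits the checkout form", async ({ page }) => {
  // The team owns the markup, so it can expose a stable hook:
  //   <button data-testid="checkout-submit">Place order</button>
  await page.goto("https://example.com/checkout"); // hypothetical URL

  // A deterministic selector survives CSS refactors and DOM reshuffles
  // with no "self-healing" machinery required.
  await page.getByTestId("checkout-submit").click();
  await expect(page.getByTestId("order-confirmation")).toBeVisible();
});
```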

Test identification and test authoring are areas where AI/ML claims sometimes border on the fantastical. The few legitimate tools in this area position themselves as augmenting human activities, not replacing them. The determination of what to test will always be a contextual question that relies on subjective value judgements and an understanding of human behavior. These are not things AI/ML is well suited for (well, at least until the singularity).

Regardless of which specific areas the tools are addressing, all legitimate tools explicitly tell you exactly how they are applying AI/ML. Be skeptical of all tools that can’t explain exactly what problem AI/ML is being used to solve and how it’s being used to solve it.

TWO: Question the qualifications of the author

Claims of the benefits of AI/ML in software testing should be weighed in proportion to the qualifications of the author. Unfortunately, even outside of vendor marketing teams, most content on the application of AI/ML in software testing seems to be written by people who either lack an understanding of AI/ML or have no experience with professional software testing and test automation.

For example, a Google search will return many results from academics who have a deep understanding of AI/ML but very little understanding of the software development and testing industry. They can go into great detail on deep neural networks or linear regression models, but they either struggle to explain, or completely ignore, why these approaches are superior in practice, because they are unfamiliar with the actual challenges and pressures faced by software testing and automation teams.

Another large source of content on AI/ML is blog authors, journalists, and other pundits. These authors are incentivized to make strong statements simply to generate clicks and drive traffic. They have little expertise in (or even interest in) the field they are covering, and their articles usually contain no original analysis; in fact, some are simply regurgitations of vendor marketing materials. They are compensated by reader engagement, which is more easily created with grandiose claims than with realistic ones.

However, just because content comes from a vendor, academic, or journalist does not make it wrong or without value. As we have said before, there are legitimate companies attempting to leverage AI/ML to improve test automation. For example, Jason Arbon and Tariq King of Test.ai are experts in both AI/ML and software testing, and regularly publish informative and interesting content on both their specific tool and the industry in general. We recommend following them even when we disagree with some of their conclusions.

Regardless of where the content comes from, understand the author’s qualifications before taking their assertions on the benefits of AI/ML in test automation at face value.

THREE: Are you testing software, or testing software development?

There is a significant difference between testing software after it has been developed—as an external party charged with answering the question: does this work?—and participating in active software development as a quality engineering organization. The first is testing quality, the second is building quality.

In the first, the software under test is taken as-is, and test teams need only evaluate this software as efficiently and economically as possible.

In the second, testing is interwoven with development in small if not indistinguishable feedback loops. If performed by different people, there is tight collaboration between those testing and those building. The purpose of testing is not only to answer “does this work” but (more importantly) to help build quality into software as it is created.

Why is this distinction important when evaluating AI/ML test automation tools?

In the first situation, it is perfectly reasonable for testing and automation tools to hide code and implementation details from the people building the automation. In fact, it’s preferable, as it expands the size of the talent pool qualified to do this work. Codeless or low-code tools leveraging AI/ML features can be very effective at making automation easy and approachable for non-technical resources.

This is a huge selling point for most AI/ML tools and in this specific situation those tools can be beneficial.

However, in the second situation (building quality into software) you do not want to create artificial separation between development and test activities by introducing a specialized set of AI/ML or no/low code tools just for testing, or by dumbing down those tools for the non-engineer.

Instead, you want to minimize the burden of context switching between automation and development by leveraging the same tools and tech stack in both. Developers should feel as comfortable within the automation ecosystem as within the development ecosystem, and vice versa.

If you are testing the active development of software, leveraging AI/ML and low/no-code tools is counterproductive to this objective. By design these tools attempt to separate building software into the technical activity of software development and the non-technical activity of software testing.

AI/ML tools that boast how simple and non-technical they make test automation assume you fall into the first category: that you are simply testing quality. Most technology or product companies with internal development organizations fall into the second; they are trying to build quality.

Before you invest in any tool, understand which of these two situations best describes your organization. If your team is building new software, we strongly suggest developing automation strategies around code-first, open-source tools like Playwright, Cypress.io, Webdriver.io, Puppeteer, or Selenium, rather than AI/ML or low/no-code tooling.
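
As a quick illustration of what sharing a stack buys you, here is a hedged sketch: a Playwright test written in the application’s own TypeScript toolchain. The imported validation helper and URL are hypothetical, but the point is that the test can reuse application code directly, so the test and the product never drift apart.

```typescript
// e2e/login.spec.ts: lives in the same repository and uses the same
// TypeScript toolchain as the application itself.
import { test, expect } from "@playwright/test";
// Hypothetical import: reusing the application's own validation logic
// keeps test expectations in sync with production rules.
import { isValidEmail } from "../src/validation";

test("rejects a malformed email before submission", async ({ page }) => {
  const email = "not-an-email";
  expect(isValidEmail(email)).toBe(false); // the same rule the app applies

  await page.goto("https://example.com/login"); // hypothetical URL
  await page.getByLabel("Email").fill(email);
  await expect(page.getByText("Enter a valid email")).toBeVisible();
});
```

None of this requires a separate vendor ecosystem; the test is just more TypeScript in the same repository, reviewed and refactored with the same tools as the application code.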

FOUR: Be skeptical of hyperbole

If something seems too good to be true, it probably is. This is especially true if the something includes AI/ML. Be skeptical of tools claiming orders-of-magnitude improvements or a guaranteed return on investment. Legitimate vendors understand that software testing and test automation are complex activities impacted by many variables, and even groundbreaking tools leveraging AI/ML will usually provide only incremental improvements in efficiency or effectiveness. Unfortunately, there is still no silver bullet to ensure all software is high quality.

Some vendors will make tantalizing claims, but require that you sign up for a product demo with a company representative for any actual information about the product’s implementation. They hint at legitimate, technical documentation, but hide it from you unless you provide contact information.

This is a sales tactic to capture leads, and you should avoid these companies.

For an example of an AI/ML based tool that makes reasonable claims and provides adequate information on what it does and how it works, check out the Eyes SDK for visual regression from Applitools. Unfortunately, this seems to be an exception to the rule.
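
To give a flavor of what that looks like, here is a rough sketch of the Eyes open/check/close flow using the Playwright bindings. Treat the package name and setup details (API key, runner, batch configuration) as assumptions and defer to the Applitools documentation:

```typescript
import { test } from "@playwright/test";
// Assumed package name; Applitools ships SDKs for most major frameworks.
import { Eyes, Target } from "@applitools/eyes-playwright";

test("home page has no unintended visual changes", async ({ page }) => {
  const eyes = new Eyes(); // API key and runner configuration omitted
  await page.goto("https://example.com"); // hypothetical app URL

  await eyes.open(page, "Example App", "Home page visual check");
  await eyes.check("Home page", Target.window().fully());
  await eyes.close(); // compares the capture against the stored baseline
});
```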

In Summary

It should be obvious that many claims about the benefits of AI/ML in testing and test automation are exaggerated. It seems that some vendors include AI/ML language in product descriptions simply for the marketing value, and that those tools do not represent any tangible improvement over earlier generations of products.

However, there are still legitimate vendors using AI/ML to improve software testing and test automation. The challenge for test managers, automation architects, and engineering leaders will be separating false promises and marketing material from legitimate products that provide real benefits, while ensuring that a tool’s benefits are actually relevant to the needs and challenges of their teams. In many cases, they will not be.

Do not be afraid of being labeled old-fashioned because you require proof of the value of AI/ML and the tools that leverage these algorithms. AI/ML is an exciting technology, and its application within the domain of software testing and test automation will undoubtedly add value somewhere. Unfortunately, as with any new technology, that value is hidden within an ocean of questionable ideas, exaggerations, false starts, and people simply exploiting the excitement to make money.

--

Blake Norrish
Slalom Build

Quality Engineer, Software Developer, Consultant, Pessimist — Currently Sr Director of Quality Engineering at Slalom Build.