QA Club

John Gluck

7 Principles for Test Automation

More lessons from the School of Hard Knocks

If I think of more, I’ll change the number, but for now, 7 seems fine.

I. Favor test execution before artifact deployment over after

You might read this as “favor the unit and integration tests you run before merging over any sort of test you run after your build is merged and deployed.” The primary reason to follow this recommendation is faster feedback: the more coverage you get before you build, the shorter your cycle time will be.

A lot of things have to happen to get an end-to-end test to run. You have to:

  1. Create the PR and, ideally, run the unit, component, integration, and contract tests before doing so
  2. Build the application
  3. Create an artifact for your application
  4. Deploy the application artifact
  5. Start the application
  6. Create the PR for the test harness change
  7. Build the test harness
  8. Create an artifact for the test harness
  9. Deploy the test harness
  10. Start the test execution
  11. Wait for the results

At any stage, something could cause you to stop and go back, perhaps all the way to the beginning. It should be clear that you are better off writing unit and integration tests where it makes sense, because the sooner a test fails, the faster you can repair it.
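
To make the contrast concrete, here is a minimal sketch of what the “before deployment” side can look like; the module and function names are hypothetical, but the point is that a test like this runs in milliseconds on every push, with no artifact or deployment in sight:

```python
# test_pricing.py: a hypothetical pre-merge unit test.
# It needs no artifact, no deployment, and no running services.

from pricing import apply_discount  # hypothetical module under test


def test_discount_is_capped_at_50_percent():
    # Business rule: a discount never exceeds half the list price.
    assert apply_discount(price=100.0, discount=0.75) == 50.0


def test_zero_discount_returns_list_price():
    assert apply_discount(price=100.0, discount=0.0) == 100.0
```

If either of these fails, you find out before step 1 of the list above, not after step 11.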

II. Favor existing coverage at one level over creating duplicate coverage at another (the security blanket)

Understanding the coverage you get from all of your team’s testing efforts (unit, integration, and end-to-end, both manual and automated) requires either explicit instrumentation of your end-to-end tests or serious coordination. Serious coordination requires that your team as a whole understand what the end-to-end tests cover if you haven’t figured out how to instrument that coverage. (Side note: it is possible to get these numbers; there are tools that look promising and methods for instrumenting coverage from sources you can trust.)

But in the absence of instrumented coverage, your team needs to agree on what functionality a given automated end-to-end test covers to make sure it isn’t duplicating earlier testing. If you have already thoroughly tested your business logic in unit tests, it makes no sense to create an end-to-end test that exercises the same logic more slowly.

You may argue that your integration test mocks a service that the end-to-end test uses live. Good. Integration tests should mock services that are outside their control or not the subject of the test. But:

  • If you are now running a test that tests that service, you may have exceeded your test’s scope.
  • If someone else is already testing that service, you shouldn’t test it again.
  • If you are testing that service, in other words, it is the System Under Test (SUT), then hopefully you have unit tests that exercise its business logic. If you don’t, you should.

Therefore, your end-to-end test only needs to ensure that the SUT and the service it depends on can communicate. This is probably a lot easier than you think. Testing business logic would require you to validate expected state; validating communication between a provider and a consumer only requires confirming that state has changed. We don’t need to know what the state changed to if that was already covered in the unit tests, as it should have been.
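
Here is a minimal sketch of that distinction, assuming a hypothetical HTTP API exposed by the SUT (the endpoints, fields, and port are made up for illustration):

```python
# Hypothetical end-to-end check: confirm the SUT and its dependency
# can communicate. We only assert that state changed; the exact
# values are the unit tests' job.

import requests

BASE_URL = "http://localhost:8080"  # hypothetical deployed SUT


def test_order_submission_reaches_inventory_service():
    before = requests.get(f"{BASE_URL}/inventory/sku-123").json()["reserved"]

    # Exercise the integration point.
    resp = requests.post(f"{BASE_URL}/orders", json={"sku": "sku-123", "qty": 1})
    assert resp.status_code == 201

    # Communication check only: the reservation count changed.
    after = requests.get(f"{BASE_URL}/inventory/sku-123").json()["reserved"]
    assert after != before
```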

III. Favor investment in tooling over deferred maintenance of testing

For some applications, testability, or rather the lack of it, causes significant problems that compound over time. In legacy applications especially, deferred maintenance gets papered over with extensive manual testing and/or flaky automated tests. There are strategies for testing legacy code in particular, such as characterization tests.
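
For the curious, a characterization test pins down whatever the legacy code does today so that refactoring can proceed safely. A minimal sketch, with a hypothetical legacy module and recorded outputs:

```python
# Characterization test: we don't assert what the legacy code *should*
# do, only that its current observable behavior doesn't change.

from legacy_pricing import quote  # hypothetical legacy module


def test_quote_matches_recorded_behavior():
    # These outputs were captured by running the existing code once
    # and recording its answers; they are the spec by definition.
    recorded = {
        (1, "standard"): 9.99,
        (10, "standard"): 89.90,
        (10, "premium"): 129.90,
    }
    for (qty, tier), expected in recorded.items():
        assert quote(qty, tier) == expected
```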

That said, another frequent source of deferred testability is developers’ heavy use of anonymous functions. Since anonymous functions aren’t namespaced, they can be challenging to test unless your programming language has built-in support for doing so (Hooray, Python 3).
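
A small illustration of the point (the rule and names below are made up): a behavior buried in a lambda has no name to import in a unit test, while the named equivalent is trivially testable.

```python
# Harder to test: the discount rule lives in an inline lambda,
# so a unit test has nothing to import and call directly.
prices = [100.0, 250.0]
discounted = list(map(lambda p: p * 0.9 if p > 200 else p, prices))


# Easier to test: the same rule as a named, importable function.
def bulk_discount(price: float) -> float:
    """Apply a 10% discount to items priced over 200."""
    return price * 0.9 if price > 200 else price


def test_bulk_discount_applies_only_over_threshold():
    assert bulk_discount(250.0) == 225.0
    assert bulk_discount(100.0) == 100.0
```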

Serverless architecture, a more recent development, brings with it countless opportunities for teams to cut corners, but there are also examples of how to set up a scalable automated testing approach for it. Bear in mind that by ignoring testability, the team doesn’t make the problem disappear. They just hide it and force a conflict onto testers: either testers speak up about the poor decisions someone made regarding testability, or they suck it up and take it for the team. Most prefer the latter, resulting in more time needed for testing and/or more production escapes.

IV. Favor a clear signal-to-noise ratio over false-positive inspection

There’s no soft way to say this: flaky coverage is fake coverage. A sporadically failing test gives you the illusion of security while likely costing your testing team cycles spent managing false positives. Any team that has mistaken a real defect for a false positive and let it escape to production knows this.

If you take away anything from this article, take this: any amount of non-deterministic output from a test run nullifies the entire run. Why? Because you don’t know exactly what failed unless you check every failure every time, and, I assure you, you aren’t going to do that.
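
Timing is a classic source of that non-determinism. A sketch of the difference, using a hypothetical test client (the endpoints are made up):

```python
import time


# Flaky: hopes a fixed sleep is long enough for a background job.
# Whether it passes depends on load, not on the code under test.
def test_report_is_generated_flaky(client):
    client.post("/reports")  # hypothetical test client and endpoint
    time.sleep(2)
    assert client.get("/reports/latest").status_code == 200


# Deterministic outcome: poll up to an explicit deadline, then fail
# loudly with a clear message instead of failing "sometimes."
def test_report_is_generated(client):
    client.post("/reports")
    deadline = time.monotonic() + 30
    while time.monotonic() < deadline:
        if client.get("/reports/latest").status_code == 200:
            return
        time.sleep(0.5)
    raise AssertionError("report was not generated within 30 seconds")
```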

A huge shift happens when your team starts treating failing tests as an all-hands emergency. Your team’s confidence in releases increases because deterministic test output is a reliable indicator of whether a given feature release will succeed. Cycles spent on last-minute ad-hoc checking, or ad-hoc worrying, decrease, and feature delivery speeds up, slowly, almost imperceptibly.

V. Favor singular goals per test over multiple

Each test, regardless of its type, should have a single goal. This is easy to understand for unit and component tests. For end-to-end tests, we have grown accustomed to thinking about validating “flows.”

Really, what we’re validating is accumulated state. But we can think of it like arithmetic: to validate that 1 + 2 + 3 + 4 = 10, we can validate that 1 + 2 = 3, then 3 + 3 = 6, then 6 + 4 = 10.

If we approach our end-to-end testing in this more atomic manner, our application must be able to support injectable state so we don’t have to repeat the same steps over and over to arrive at the same point. That may not seem like much of a savings, but it saves a significant amount of time in the long run because of how often we run our tests.
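
As an example, with injectable state a test can seed the application at the point of interest instead of replaying every prior step. The test-support endpoint below is a hypothetical hook, not a standard API:

```python
# Hypothetical test-only seeding hook: instead of clicking through
# registration, login, and cart creation, the test injects that
# accumulated state directly and starts at the step it cares about.

import requests

BASE_URL = "http://localhost:8080"  # hypothetical deployed SUT


def seed_checkout_state(user_id: str) -> str:
    """Create a user with a populated cart via a test-only endpoint."""
    resp = requests.post(
        f"{BASE_URL}/test-support/seed",
        json={"user": user_id, "cart": ["sku-123", "sku-456"]},
    )
    resp.raise_for_status()
    return resp.json()["session_token"]


def test_checkout_applies_standard_shipping():
    token = seed_checkout_state("user-42")
    resp = requests.post(
        f"{BASE_URL}/checkout",
        headers={"Authorization": f"Bearer {token}"},
        json={"shipping": "standard"},
    )
    assert resp.status_code == 200
```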

Automated end-to-end tests are expensive because:

  1. They require some or all of our dependencies to be up and behaving correctly before we can identify defects in our own application.
  2. They are repeated multiple times per day, every day.

VI. Favor few assertions at the end of tests over asserting at each step

It is not uncommon to find end-to-end tests that read like journeys. A journey test may walk through a given sequence of pages, asserting at each step along the way. The problem with this approach is that any failed assertion causes either:

  1. A blockage that prevents your test from completing its intended goal.
  2. In the case of soft asserts, an investigation to confirm the failure is a false positive (assuming your team tolerates false positives).

Another way to frame this is to treat every step in an automated test that is not an explicit assertion as an implicit assertion. Whatever application functionality your test had to use to reach the final assertion can be said to be working, since you ended up at the spot where you executed the final assertion that validated the test’s intention.
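
One way to picture it, with hypothetical page objects standing in for whatever your framework provides:

```python
# Sketch with made-up page objects. Every navigation step is an
# implicit assertion: if login or navigation were broken, we would
# never reach the single explicit assertion at the end.

def test_order_history_shows_latest_order(app):
    app.login("user-42")          # implicit: login works
    app.open_order_history()      # implicit: navigation works
    latest = app.latest_order()

    # The one explicit assertion: the actual goal of this test.
    assert latest.status == "SHIPPED"
```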

Think about ways you can break your journey up into multiple destinations. There is no need to duplicate coverage: if you have tested some static content on one page, it is likely wasteful to test it again. If you find yourself doing so because it saves you time and resources, you likely have tech debt, in which case see the section above on favoring “investment in tooling over deferred maintenance.”

VII. Favor team-ownership over single point of failure

Quality is a responsibility, not a role. Everyone on your team is responsible for quality, and the team needs to coordinate as a whole to agree on how and where to cover each feature.

While there are arguments for and against automated testers writing unit tests, teams need to have the conversation about whether to allow or even encourage this if their unit-test coverage is not “high value, high meaning.”

It may still be a valuable conversation to have if a team is attempting to increase velocity and already has strong coverage.

Many teams have a nested testing silo within them. The unspoken agreement on some teams seems to be “I won’t tell you how to do your job if you don’t tell me how to do mine.” My experience tells me this approach leads to increased risk. If developers don’t tell testers which risks concern them most, testers are more likely to spend their time covering less risky functionality. If testers don’t tell developers when things are hard to test, developers have no way of knowing and will blissfully continue asking for bloated end-to-end testing as if nothing were wrong with that.
