Should we really follow the golden Testing Pyramid?

5 min read · Oct 15, 2022
Photo by Eugene Tkachenko on Unsplash

I do what I can to convey what I feel in front of nature… I must forget the most elementary rules of painting, if there are any — Claude Monet

When I ask interviewees about their testing strategy, most developers confidently whip out the golden rule: the Testing Pyramid. But when asked why it has to be a pyramid, the best answer they can give is usually: high-level tests take a long time to run and are hard to maintain.

Let's revisit the rules from Cohn’s original test pyramid:

  1. Write tests with different granularity
  2. The more high-level you get the fewer tests you should have

Stick to the pyramid shape to come up with a healthy, fast and maintainable test suite: Write lots of small and fast unit tests. Write some more coarse-grained tests and very few high-level tests that test your application from end to end. Watch out that you don’t end up with a test ice-cream cone that will be a nightmare to maintain and takes way too long to run.

If you look carefully, there are at least 2 implicit assumptions:

  1. Lower-level tests are easier to maintain than higher-level tests
  2. Lower-level tests are significantly faster to run than higher-level tests

The ultimate goal of having tests is to help developers build reliable software effectively and efficiently.

Let's rethink what we really care about in our tests…

Maintenance efficiency

Have you ever modified a very small piece of code deep inside a class or component, without changing its behavior, and watched many unit tests suddenly explode, including some seemingly irrelevant ones? Or started an important refactoring where the implementation took a few hours, but you were still struggling to repair all the unit tests several days later, and ended up wanting to smash your head into the screen?

Is a large number of unit tests really easier to maintain?

For me, a large number of unit tests often causes inertia: the test code itself makes the production code painful to change. We suffer from this because our tests are deeply coupled to our implementations. We end up testing implementation details instead of behaviors.

Some people would even argue that breaking all these tests is a good thing, because it shows that the tests protect well against changes to the implementation. Do they, with so many parts mocked? I’d say “maybe”, but mostly they drag down developer productivity instead.

Personally, I don’t see how we could avoid this side effect if we write a huge number of unit tests as the testing pyramid prescribes. At the very least, you might have to redefine what the “unit” should be in your domain context.
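To make the difference concrete, here is a minimal sketch in Python. The `PriceCalculator` class, its private helpers, and the discount rule are all hypothetical and only serve as an illustration. The first test pins down how the result is computed, while the second only checks what the result is.

```python
from unittest.mock import MagicMock


class PriceCalculator:
    """Hypothetical class: order total with 10% off above 100, plus tax."""

    def __init__(self, tax_rate: float = 0.2) -> None:
        self.tax_rate = tax_rate

    def _subtotal(self, prices):
        return sum(prices)

    def _discount(self, subtotal):
        return subtotal * 0.1 if subtotal > 100 else 0.0

    def total(self, prices):
        subtotal = self._subtotal(prices)
        discounted = subtotal - self._discount(subtotal)
        return round(discounted * (1 + self.tax_rate), 2)


def test_coupled_to_implementation():
    # Mocks the private helpers and asserts on how they are called,
    # so it depends on the internal structure of total().
    calc = PriceCalculator()
    calc._subtotal = MagicMock(return_value=200)
    calc._discount = MagicMock(return_value=20.0)
    assert calc.total([150, 50]) == 216.0
    calc._subtotal.assert_called_once_with([150, 50])
    calc._discount.assert_called_once_with(200)


def test_behavior_only():
    # Only checks inputs and outputs through the public method.
    calc = PriceCalculator()
    assert calc.total([150, 50]) == 216.0  # discount applies above 100
    assert calc.total([50]) == 60.0        # no discount at or below 100
```

If `total` were refactored to compute the discount inline, with the same results but no call to `_discount`, the first test would fail while the second would keep passing.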

Testing efficiency

Our test suites are slow because we are not applying the testing pyramid

Most software engineers believe this is exactly the problem the testing pyramid solves for us: write a large number of unit tests, a medium number of integration tests, and a very small number of e2e tests.

The reality is that test suite execution time usually has little to do with the “shape” of your test types. Most slowness comes from badly implemented tests, e.g. re-creating the application context over and over in the integration tests. Of course, a suite with more high-level tests will take a bit longer to run. In the end, it’s up to you to find the right balance between test execution time and test effectiveness.

So lower-level tests are indeed faster to run, but we cannot sacrifice testing quality only for speed.
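As a rough illustration of the context re-creation problem, the sketch below uses Python's standard `unittest` module. `AppContext` is a hypothetical stand-in for whatever is expensive to start in your application (a dependency-injection container, a database connection pool, and so on), and the counter only makes the number of start-ups visible.

```python
import unittest


class AppContext:
    """Hypothetical stand-in for an expensive application context."""

    created = 0  # counts how many times the context was built

    def __init__(self) -> None:
        AppContext.created += 1  # imagine seconds of wiring here
        self.store = {}


class RebuildEveryTest(unittest.TestCase):
    def setUp(self):
        self.ctx = AppContext()  # rebuilt before every single test

    def test_write(self):
        self.ctx.store["k"] = 1
        self.assertEqual(self.ctx.store["k"], 1)

    def test_starts_empty(self):
        self.assertEqual(self.ctx.store, {})


class SharedContext(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.ctx = AppContext()  # built once for the whole class

    def setUp(self):
        self.ctx.store.clear()  # cheap state reset keeps tests isolated

    def test_write(self):
        self.ctx.store["k"] = 1
        self.assertEqual(self.ctx.store["k"], 1)

    def test_starts_empty(self):
        self.assertEqual(self.ctx.store, {})


def count_contexts(case_cls) -> int:
    """Run one test class and report how many contexts it created."""
    AppContext.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case_cls)
    result = unittest.TestResult()
    suite.run(result)
    assert result.wasSuccessful()
    return AppContext.created
```

Here `count_contexts(RebuildEveryTest)` returns 2 and `count_contexts(SharedContext)` returns 1. With hundreds of tests and a context that takes seconds to start, that difference dominates the run time regardless of the suite's shape.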

Testing effectiveness

Imagine you have been strictly following the testing pyramid rules for your beloved application. You made a change at the service level and adjusted the corresponding tests. Now, are you confident enough to deploy your changes straight to production without relying on any external validation?

Most probably not, because you know you have mocked a lot in your tests if you followed the golden rules. Conversely, if you have implemented enough high-level tests to cover all your use cases, your test suite can definitely NOT be in a "pyramid shape".

Now, when we look back at the golden rules, they never mention the reduced confidence that comes with having fewer high-level tests. At least, I could never confidently deliver something straight to production without testing all the use cases in a close-to-reality testing environment.

The best tests are tests without mocks. The fewer things you mock in your tests, the more confidence and testing effectiveness you get.

Back to the topic, should we strictly follow the testing pyramid?

I'd say it depends on the tradeoffs… In most cases: NO.

Personally, I'd suggest following a testing diamond: write a large number of integration tests and much smaller numbers of unit tests and e2e tests. I'd rather let the integration tests take a few more seconds to run than introduce this discouraging resistance to changing code and reduce testing confidence.

Photo by Milad Fakurian on Unsplash

The hard part is writing high-quality integration tests, which requires excellent test configuration to make your tests both extremely realistic and efficient to run. Some concrete ideas:

  • Reuse your running application context: creating it is what takes the most time, not executing your tests
  • Mock only external parts: make your test environment as close as possible to production
  • Write unit tests only when the test scenarios are hard to produce (e.g. database deadlock, network failure), or when there are really many conditional branches to test and you don't have to mock anything in your unit (it is self-contained)
  • Use e2e tests for the highest level only: testing multiple microservices working together to achieve a goal
find the right balance for your tests
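The "mock only external parts" idea can be sketched as follows, again in Python. Everything here is hypothetical: `OrderService`, `OrderRepository` and the payment gateway API are invented for the example, and an in-memory SQLite database stands in for a real database that you would run as close to production as you can. The point is that the service and the repository run for real, and only the third-party gateway is replaced.

```python
import sqlite3


class FakePaymentGateway:
    """Test double for the only external dependency (hypothetical API)."""

    def __init__(self) -> None:
        self.charges = []

    def charge(self, customer: str, amount_cents: int) -> bool:
        self.charges.append((customer, amount_cents))
        return amount_cents <= 10_000  # decline anything above 100.00


class OrderRepository:
    """Real persistence code, exercised against a real SQL engine."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(customer TEXT, amount_cents INTEGER, status TEXT)"
        )

    def save(self, customer: str, amount_cents: int, status: str) -> None:
        self.conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (customer, amount_cents, status),
        )
        self.conn.commit()

    def statuses(self, customer: str) -> list:
        rows = self.conn.execute(
            "SELECT status FROM orders WHERE customer = ? ORDER BY rowid",
            (customer,),
        )
        return [row[0] for row in rows]


class OrderService:
    def __init__(self, repo: OrderRepository, gateway) -> None:
        self.repo = repo
        self.gateway = gateway

    def place_order(self, customer: str, amount_cents: int) -> str:
        paid = self.gateway.charge(customer, amount_cents)
        status = "PAID" if paid else "DECLINED"
        self.repo.save(customer, amount_cents, status)
        return status


def test_place_order_integration():
    conn = sqlite3.connect(":memory:")
    repo = OrderRepository(conn)
    gateway = FakePaymentGateway()
    service = OrderService(repo, gateway)

    assert service.place_order("alice", 2_500) == "PAID"
    assert service.place_order("alice", 50_000) == "DECLINED"

    # Assert on observable behavior: what was stored and what was charged.
    assert repo.statuses("alice") == ["PAID", "DECLINED"]
    assert gateway.charges == [("alice", 2_500), ("alice", 50_000)]
    conn.close()
```

Because the test never asserts on how `OrderService` talks to `OrderRepository`, both can be refactored freely as long as the stored result stays the same.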

Benefits of implementing the test diamond correctly

  • Tests are effective and give high confidence and reliability
  • Tests embrace change and are thus more agile
  • Tests focusing on behaviors are living documentation

If developers understand that the components of a system should be decoupled from each other as much as possible, and that the architecture should be friendly to future changes, surely they can understand that the same golden rule applies to tests, right?

Happy testing!
