The Test Kite: Bulletproof Testing for Microservices

Apr 16, 2020

Introduction

A long long time ago, somewhere in 2003 or 2004, in springtime, maybe, a developer by the name of Mike Cohn first drew up a triangle that revolutionised software testing. Later published in his book, Succeeding with Agile, this triangle is commonly known as the test pyramid.

Like other great triangles throughout history (the Egyptian Pyramids, the Louvre, Toblerone), the idea was simple but ingenious. Cohn stated that there are three different tiers of testing to perform in your application: unit tests, service tests, and UI tests (or integration tests). The width of the tiers emphasises the ratio of tests you should have in each tier, i.e. you should have more unit tests, fewer service tests, and even fewer UI tests.

This may seem counterintuitive. UI tests test your application as a whole, so surely they would be better? In theory? Yes. In practice? No.

UI tests are slower to run, more complex to maintain, and, when tests inevitably do break, won’t tell you where in the system something has broken. That means you waste time bug hunting rather than writing your code like God intended. Unit tests break down the business logic of your application into lots of little individual pieces. Whilst they do not test that the individual units add up to the whole, they are faster to run, easier to maintain, and (crucially) easier to diagnose. Service tests sit in the middle in terms of complexity and time to run, and can be used to check that a group of units work together when combined, e.g. that the backend of your application (no GUI) works as intended. Therefore a mix of the three types is needed. With the right ratios in each layer, you get a test suite that is fast, easy to support, and easy to diagnose.

Nice! The test pyramid gives us a good guidance system to show us how to test our applications! And that’s the issue of testing sorted forever and ever. The end.

Well. Not quite. Along came a sledgehammer called microservices to turn that pyramid into a rubble-strewn pile of bricks.

Microservices and the Test Pyramid

Unless you’re new to software engineering, or have been living in a cave these past few years (see also: working in a bank), you will have heard about microservice architecture. The idea of microservices is to break down a monolithic application into lots of smaller decoupled applications that communicate with one another over the network (typically HTTP), with each service owning a specific and well-defined part of the business logic. This allows each part of the business logic to be deployed separately, improves fault isolation, and allows your overall platform to be more scalable.

Microservices at Amazon. Each black dot represents a single microservice.

The annoying part about microservices is that they make testing far more complex. Now, if you are a single team of a handful of developers and are developing microservices, you could probably get away with using the test pyramid. But what if you’re Amazon? Or Netflix? Or a company with hundreds or thousands of microservices distributed across different departments, buildings, regions, countries, planets, etc.?

Does it make sense to write a suite of integration tests that run across every microservice? Who would be responsible for that? How would you diagnose and locate regressions? Similarly, does it make sense for your team to do an integration test with an application owned by another team, given that you are not responsible for how their service behaves and do not have access to their underlying database? If a test breaks, how do you know whether the breakage was caused by changes in your service or theirs?

Like a Toblerone in the sun (as you may have guessed, I love a Toblerone), the test pyramid begins to break down to a sloppy mess when an organisation structures their development in this manner. So what can be done?

The Test Kite

In order to address testing in microservices, I propose that we should use what I’ve coined the test kite.

I’ll be the first to admit it… the test kite isn’t that much different from the test pyramid. “It’s basically two pyramids glued together!”, I hear you cry. (And you’re not wrong…)

The test kite isn’t intended as a complete overhaul of the test pyramid. I’d describe it as more of an iterative improvement that makes it clearer how to apply the test pyramid to microservices.

I propose that microservices should have four types of testing: unit, application, contract, and end-to-end (E2E) tests. Application tests should make up the main bulk of the tests within a microservice. There should be fewer unit tests than application tests. There should also be fewer contract tests than application tests, and fewer E2E tests than contract tests.

I’ll get onto the different layers in a later section, but first to get the test kite to work, you’re going to need to take a few steps.

Prerequisites for the Test Kite

Just like a real kite won’t fly if its wings are full of holes and its struts are made out of lead, the test kite won’t fly if your application isn’t structured correctly. It may be possible for you to apply the test kite to your microservice no matter its internal architecture, but then again you could also eat soup with a fork. 🙂

I would strongly suggest using the following patterns to help yourself out.

Ports and Adapters

Also known as hexagonal architecture. Because apparently we software engineers are suckers for geometry (triangles, kites, hexagons…).

The aim of this architecture is to decouple a microservice’s business logic from its inbound and outbound connections. An inbound connection may be the API of the service, or a message queue reader. An outbound connection could be a call from the microservice to a database or another microservice.

When your microservice’s API (primary adapter) needs to interact with your business logic, it should go through an interface (primary port). If your business logic needs to make a REST call to another microservice, it should go through an interface (secondary port) in order to reach a class responsible for that REST call (secondary adapter).

The reason for this decoupling is that your business logic is the heart of your application. It is essential for its purpose. If you change anything within the business logic, the core of your application will change. However, the adapters should be interchangeable. There may be a time when you want to change your underlying database, or you want to migrate from using a local file system to Hadoop, etc. Hexagonal architecture means that you can make these changes and be confident your business logic remains untouched. The ports (i.e. the interfaces) allow you to ‘plug-in’ different adapters as you need.
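To make the vocabulary concrete, here is a minimal sketch of the pattern in plain Java. Every name in it (GreetingUseCase, UserDirectory, GreetingService, InMemoryUserDirectory) is hypothetical and chosen only for illustration; a real service would have its own ports and would more likely use a database or REST adapter.

```java
import java.util.Map;
import java.util.Optional;

// Primary port (hypothetical): how an inbound adapter such as a REST
// controller reaches the business logic.
interface GreetingUseCase {
    String greet(String userId);
}

// Secondary port (hypothetical): what the business logic needs from the outside world.
interface UserDirectory {
    Optional<String> findName(String userId);
}

// Business logic: depends only on the ports, never on a concrete adapter.
final class GreetingService implements GreetingUseCase {
    private final UserDirectory users;

    GreetingService(UserDirectory users) {
        this.users = users;
    }

    @Override
    public String greet(String userId) {
        return users.findName(userId)
                .map(name -> "Hello, " + name + "!")
                .orElse("Hello, stranger!");
    }
}

// One possible secondary adapter, backed by a map. A database- or REST-backed
// adapter could be plugged in instead without touching GreetingService.
final class InMemoryUserDirectory implements UserDirectory {
    private final Map<String, String> namesById;

    InMemoryUserDirectory(Map<String, String> namesById) {
        this.namesById = Map.copyOf(namesById);
    }

    @Override
    public Optional<String> findName(String userId) {
        return Optional.ofNullable(namesById.get(userId));
    }
}
```

Wiring it up is a single line, new GreetingService(new InMemoryUserDirectory(Map.of("42", "Ada"))), and swapping the adapter changes only the constructor argument.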

Implementing ports and adapters will come in useful for the test kite when it comes to writing application and contract tests, but more on that later.

12 Factor Applications

Created by developers at Heroku, this is a methodology for creating portable and resilient web applications, and works exceptionally well on cloud services. Not every principle is relevant to the test kite, but there are a couple to take note of.

6) Execute the app as one or more stateless processes

Microservices should shift state to dependencies as much as possible. For example, if you need to use a cache, it is much better to use an external cache such as Hazelcast than one held in the memory of the service’s own process.

Stateless microservices make testing easier because we can then use the ports and adapters pattern to mock responses of the external dependencies. This gives us better control over the state of the underlying data when setting up individual tests.
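As a rough illustration, the sketch below puts a cache behind a secondary port so that a test can decide exactly what the “external” cache contains. The names (PriceCache, FakePriceCache, PriceLookup) and the pricing scenario are assumptions made up for this example, not part of any real library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;
import java.util.function.ToIntFunction;

// Secondary port (hypothetical): the service asks for cached prices but
// holds no cached state itself.
interface PriceCache {
    OptionalInt get(String productId);

    void put(String productId, int pricePence);
}

// Test double standing in for the external cache, so each test controls its contents.
final class FakePriceCache implements PriceCache {
    private final Map<String, Integer> entries = new HashMap<>();

    @Override
    public OptionalInt get(String productId) {
        Integer price = entries.get(productId);
        return price == null ? OptionalInt.empty() : OptionalInt.of(price);
    }

    @Override
    public void put(String productId, int pricePence) {
        entries.put(productId, pricePence);
    }
}

// Business logic: stateless, because all cached state lives behind the port.
final class PriceLookup {
    private final PriceCache cache;
    private final ToIntFunction<String> pricingBackend;

    PriceLookup(PriceCache cache, ToIntFunction<String> pricingBackend) {
        this.cache = cache;
        this.pricingBackend = pricingBackend;
    }

    int priceOf(String productId) {
        OptionalInt cached = cache.get(productId);
        if (cached.isPresent()) {
            return cached.getAsInt();
        }
        // Cache miss: ask the (slow) backend and remember the answer.
        int price = pricingBackend.applyAsInt(productId);
        cache.put(productId, price);
        return price;
    }
}
```

A test can pre-populate FakePriceCache to exercise the cache-hit path, or leave it empty to exercise the miss path, without starting any real cache.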

10) Keep development, staging, and production as similar as possible

Minimising the disparities in your application between different environments will mean that you have fewer code paths to test. Simpler code means simpler tests, and a lower chance of introducing bugs.

The Test Tiers

Application Tests

The problem with classic unit testing on a class-by-class level is that you end up with a test suite that is tightly coupled to the class design you have chosen for your application. This means that whenever you need to make a change somewhere in your service, the chances are you’ll have to fix a unit test. If you have to change a piece of business logic, you may end up having to change tens of classes, which means fixing tens of tests. Fixing tests becomes a grind, developers become unhappy, and they may end up skipping new tests altogether to avoid adding to that grind.

Testing like this (and I have actually worked in a company where we did this) is massive overkill for what you are trying to achieve. When it comes down to it, the business couldn’t give two hoots that you’ve decided to use an abstract class in one place or a fancy factory pattern in another. I’m sorry to hurt your pride over your well abstracted code, but you know it’s true.

It also means that you are not actually testing your business logic directly. You are testing lots of tiny units that together make up some form of logic. And that logic may be far removed from what your service is supposed to be doing. A microservice is all about creating a service that does a piece of business logic, and does that piece well. Therefore it makes much more sense to have a suite of tests that directly verifies that your microservice does what you say it does.

Domain of an application test, from a ports and adapters perspective.

An application test should test your microservice from a primary port of your application and verify the result returned from that port. For example, the primary port could be an API, and the result would be the response code and body of that API call. The responses of secondary ports in the application should be mocked using a test double to ensure that we are only testing the microservice and no downstream dependencies.

A suite of application tests is effectively documentation for how your microservice is supposed to behave, and will act as the most up-to-date documentation you have. By mocking the dependencies, the tests become much easier to write and maintain, and we can still verify how your application will work if your database does/doesn’t contain user data, for example. It also means that we are free to refactor the internals of our microservice however we fancy, safe in the knowledge this refactor won’t break our APIs.
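Here is a minimal sketch of what such a test could look like, written without a test framework so that it stays self-contained. The names (ProfileApi, UserStore, ProfileService, ProfileResponse) are hypothetical, and the hand-built JSON is deliberately naive; in a real project you would more likely drive the actual HTTP layer with your framework’s test support.

```java
import java.util.Optional;

// Primary port (hypothetical): what the API adapter calls.
interface ProfileApi {
    ProfileResponse getProfile(String userId);
}

// Secondary port (hypothetical): replaced by a test double in application tests.
interface UserStore {
    Optional<String> findName(String userId);
}

final class ProfileResponse {
    final int status;
    final String body;

    ProfileResponse(int status, String body) {
        this.status = status;
        this.body = body;
    }
}

// The microservice's business logic, reached only through the primary port.
final class ProfileService implements ProfileApi {
    private final UserStore store;

    ProfileService(UserStore store) {
        this.store = store;
    }

    @Override
    public ProfileResponse getProfile(String userId) {
        return store.findName(userId)
                .map(name -> new ProfileResponse(200, "{\"name\":\"" + name + "\"}"))
                .orElseGet(() -> new ProfileResponse(404, "{\"error\":\"user not found\"}"));
    }
}

final class ProfileApplicationTest {
    // The store "contains" the user: the lambda is the test double.
    static void returnsProfileWhenUserExists() {
        ProfileApi api = new ProfileService(userId -> Optional.of("Ada"));
        ProfileResponse response = api.getProfile("42");
        check(response.status == 200, "expected 200 for a known user");
        check(response.body.equals("{\"name\":\"Ada\"}"), "expected the name in the body");
    }

    // The store is empty: same entry point, different mocked state.
    static void returnsNotFoundWhenUserIsMissing() {
        ProfileApi api = new ProfileService(userId -> Optional.empty());
        ProfileResponse response = api.getProfile("42");
        check(response.status == 404, "expected 404 for an unknown user");
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

Neither test mentions how ProfileService is built internally, so its classes can be split, merged, or renamed without touching the tests.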

Note: some people call these types of tests component testing. Personally I prefer the name ‘application tests’ as it makes the scope of the tests abundantly clear. Choose whatever name you want, I’m not the boss of you.

Unit Tests

Right now you’re probably thinking, “Wait, you just spent a few paragraphs bashing unit tests? What on earth is this section doing here?”, to which my response is fair enough. But allow me to explain.

First of all, I would clarify that an application test is actually a form of unit test. But instead of your unit being a single class, you have made your API your unit. If you extend this further, anything can be a unit! A single class. A Gradle module. A library. A unit can be anything you need your unit to be for the sake of making your tests easier to work with.

Unit tests in the test kite are for closely testing a part of your microservice that needs particularly close attention (e.g. very complicated business logic) or cannot be covered by application tests, or for greatly simplifying your application tests when they become too complicated. If you are building a really simple microservice, I would even suggest skipping this tier altogether if it makes life simpler for you!
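As an illustration of the kind of logic that might earn its own unit test, here is a small sketch. The delivery-fee rule, its thresholds, and the names DeliveryFee and DeliveryFeeTest are all invented for this example.

```java
// Hypothetical rule with boundaries that are tedious to cover through the API.
final class DeliveryFee {
    private DeliveryFee() {
    }

    // Assumed rule: orders of 5000 pence or more ship free;
    // otherwise the fee is 399 pence, or 199 pence for members.
    static int feePence(int orderTotalPence, boolean member) {
        if (orderTotalPence < 0) {
            throw new IllegalArgumentException("order total must not be negative");
        }
        if (orderTotalPence >= 5000) {
            return 0;
        }
        return member ? 199 : 399;
    }
}

final class DeliveryFeeTest {
    // Exercises each side of the free-delivery boundary and the invalid input.
    static void coversTheBoundaries() {
        check(DeliveryFee.feePence(4999, false) == 399, "just under the threshold, non-member");
        check(DeliveryFee.feePence(4999, true) == 199, "just under the threshold, member");
        check(DeliveryFee.feePence(5000, false) == 0, "exactly on the threshold ships free");

        boolean rejected = false;
        try {
            DeliveryFee.feePence(-1, true);
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }
        check(rejected, "negative totals should be rejected");
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

With the boundaries pinned down here, the application tests only need one or two delivery-fee scenarios rather than every combination.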

Contract Tests

Any time your service needs to communicate with an external dependency, you enter a contract with that dependency. This contract stipulates ‘when I give you A and B, you’ll give me X and Y’. So it’s sort of like a legal contract, except you don’t need to get the solicitors involved.

There are two parties in a contract — the consumer (the service making the request) and the provider (the service providing the response). It’s the consumer’s job to ensure it is making the right request, and the provider’s job to give the right response for that request. If either party strays from the contract, then the two services can no longer communicate properly.

Contract tests are a way of validating that your microservice is making the right request and receiving the right response. In relation to ports and adapters, they should start at the secondary ports (i.e. from where the application tests mock out) and call the real external service. This allows you to test the part of your application where you are handling the request/response to that service, without any pesky business logic getting involved.

If you recall, in our application tests we used test doubles of the secondary adapters. The contract tests should be used to verify that the contracts with the test doubles are the same as those of the real provider. Otherwise our application tests could just mock any old nonsense response, and that’s no use to anyone. By doing this, you can reason transitively: the application tests show that your business logic works against the test doubles, and the contract tests show that the test doubles behave like the real provider, so you can be confident that your microservice works as intended from top to bottom.
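One way to keep the test double honest is to run the same contract checks against both the double and the real adapter. The sketch below assumes a hypothetical provider whose contract is “GET /users/{id} returns 200 with the name as plain text, or 404 if the user is unknown”; all class names are made up, and a local JDK HttpServer stands in for the provider so that the example runs anywhere. In a real suite the adapter would be pointed at the provider’s actual test environment.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Secondary port (hypothetical) that the business logic depends on.
interface UserDirectory {
    Optional<String> findName(String userId);
}

// Test double used by the application tests.
final class FakeUserDirectory implements UserDirectory {
    @Override
    public Optional<String> findName(String userId) {
        return "42".equals(userId) ? Optional.of("Ada") : Optional.empty();
    }
}

// Real secondary adapter, written against the assumed contract described above.
final class HttpUserDirectory implements UserDirectory {
    private final HttpClient client = HttpClient.newHttpClient();
    private final String baseUrl;

    HttpUserDirectory(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    @Override
    public Optional<String> findName(String userId) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/users/" + userId)).build();
        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200 ? Optional.of(response.body()) : Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}

final class UserDirectoryContract {
    // The same expectations must hold for the test double and the real adapter.
    static void verify(UserDirectory directory) {
        check(directory.findName("42").equals(Optional.of("Ada")), "known user should return its name");
        check(directory.findName("999").isEmpty(), "unknown user should return empty");
    }

    // Local stand-in for the provider (an assumption of this sketch, not a real service).
    static HttpServer startStandInProvider() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/users/", exchange -> {
                if (exchange.getRequestURI().getPath().equals("/users/42")) {
                    byte[] body = "Ada".getBytes(StandardCharsets.UTF_8);
                    exchange.sendResponseHeaders(200, body.length);
                    exchange.getResponseBody().write(body);
                } else {
                    exchange.sendResponseHeaders(404, -1); // -1 means no response body
                }
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

Calling UserDirectoryContract.verify with a FakeUserDirectory and then with an HttpUserDirectory pointed at the provider gives both sides of the argument: if the provider ever changes its behaviour, the second call fails, which tells you the double used in the application tests is out of date too.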

Contract tests are incredibly powerful for a few reasons:

Breakages should only occur due to a breach of the contract. If the tests break after you’ve made changes to how you call an external service, chances are that you’ve broken the contracts somehow. If the tests break, but you haven’t touched any code around this external call, chances are the API provider has broken the contracts somehow.
Contract tests require a contract in the first place. For new APIs, this forces different teams to negotiate what the contract should look like before implementation, which allows both teams to work independently against that contract. For existing APIs, there should already be a contract that API consumers can use.
The tests are documentation for the contract.

Contract tests can be taken further by using a consumer-driven contracts approach, such as Pact. This approach ensures that the consumer and provider of an API share their contract tests, so that both parties can independently verify they adhere to them. I won’t talk about Pact here, otherwise I’ll end up writing a bajillion more lines singing its praises, but I highly recommend taking a look at it.

Something to note is that contract tests don’t only have to be conducted with downstream microservices, but they can also be used with other dependencies such as databases, external caches, etc. Some developers call this type of testing integration testing, but ultimately it’s still a type of contract test because you’re still checking you’re using the dependency and its contract correctly.

Also note that if your microservice doesn’t have any outbound calls to other dependencies, you can skip contract tests altogether for it. Don’t ask me what shape this makes the test kite in this circumstance… 🤷‍♂️

E2E Tests

We have already established that your microservice has the correct business logic using application/unit tests. We have already established that you are making the correct API requests and receiving the correct responses in the contract tests.

This means that the E2E tests are merely a final sanity check to make sure your applications can communicate with each other in your environments. And this simplifies your E2E tests a lot. It means that instead of verifying that your database receives an audit event (for example) that looks like ABC, we can just verify that your database has received an audit event at all. Simples.

Altogether

When you apply the four tiers of the test kite to a microservice, this is the resulting effect:

A ports and adapters microservice with the test kite applied.

Application tests are to ensure the business logic in the microservice is correct. These should make up the main bulk of your tests.
Unit tests are to ensure smaller units of logic work correctly. They should only be used when you can’t use application tests, or to make your application tests simpler.
Contract tests are to ensure that your application is making the correct request to an external dependency, and that its responses are as you expect. They also verify your test doubles are valid.
E2E tests should be used to ensure that the different services are able to communicate in the live environment.

The combination of these tests gives you a test suite that verifies different elements of your microservice(s), ensures that you only test the parts of your application that are relevant to you, is effective documentation on how the different layers in the application work, and makes it abundantly clear where a regression occurs if one is introduced. And that’s what a good test suite should be — something that makes your life easier, not harder.

So make your microservice’s test suite soar to the heights (sorry/not sorry for the utterly cheesy tagline). Use the test kite.

To see the test kite in action on an actual repository, I’ve made an example Spring Boot project here to look at/play around with.

About the Author

Kyriacos Elia is a Senior Software Engineer at a UK-based consultancy. He is an advocate of TDD, testing best practices, and agile principles.