Integration testing your streaming data application

Learn how to use these popular integration testing tools to feed test data into your streaming data stack and ensure it’s set up correctly.

When setting up a streaming application, especially if you’re new to streaming data platforms like Redpanda, you’ll want to test that your application is set up correctly. You can do this using integration tests.

Integration tests check your producers and consumers against your data stream. They push test data through your application, allowing you to see if your architecture is correctly set up and working as expected.

Below, I discuss two popular libraries for integration testing: Testcontainers and Zerocode. I use these when I need to run integration tests, and nearly every developer I know uses them, as well.

In this post, you’ll learn how to run integration tests with them, too, so you can ensure your streaming application is properly configured. You can find the resources for the demos below in this GitHub repository.

The 2 best integration testing tools for streaming data stacks

1. Testcontainers

Testcontainers is a Java library that you can use to test anything that runs in a Docker container. You can use it to do integration testing on your data stream.

1.1 Prerequisites

This demo uses the Java version of Testcontainers, so you will need to add it to your project as a dependency. You can do this with Maven as shown here:

<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>testcontainers</artifactId>
    <version>1.17.2</version>
    <scope>test</scope>
</dependency>

You can check for the latest version in Maven’s central repository. You can also access the complete demo we’re about to walk through in this GitHub repo.

Next, you can move on to setting up your producer and consumer for testing.

1.2 Producer and consumer setup

Typical integration tests will include a producer that creates and sends an event. To do this with Redpanda, you can use the Apache Kafka API since Redpanda is API-compatible with Kafka.

KafkaProducer<String, String> producer = new KafkaProducer<>(
       ImmutableMap.of(
               ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
               bootstrapServers,
               ProducerConfig.CLIENT_ID_CONFIG,
               UUID.randomUUID().toString()
       ),
       new StringSerializer(),
       new StringSerializer()
);
producer.send(new ProducerRecord<>(topicName, "testcontainers", "redpanda")).get();

You will also need a consumer that consumes the event from the same topic as your producer. Again, we can set this up using the Kafka API.

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(
       ImmutableMap.of(
               ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
               bootstrapServers,
               ConsumerConfig.GROUP_ID_CONFIG,
               "tc-" + UUID.randomUUID(),
               ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,
               "earliest"
       ),
       new StringDeserializer(),
       new StringDeserializer()
);
consumer.subscribe(Collections.singletonList(topicName));
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
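
One caveat: a single poll with a 100 ms timeout can come back empty if the consumer has not finished joining its group yet, so tests often poll in a loop until a deadline passes. Below is a minimal sketch of that loop in plain Java. The names `PollLoop` and `pollUntilNonEmpty` are hypothetical, and the `Supplier` is only a stand-in for a call such as `consumer.poll(Duration.ofMillis(100))`; none of this is part of the Kafka client API.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Supplier;

final class PollLoop {

    // Calls pollOnce repeatedly until it returns a non-empty batch or the timeout elapses.
    // pollOnce is a stand-in for something like () -> consumer.poll(Duration.ofMillis(100)).
    static <T> List<T> pollUntilNonEmpty(Supplier<List<T>> pollOnce, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        do {
            List<T> batch = pollOnce.get();
            if (!batch.isEmpty()) {
                return batch;
            }
        } while (System.nanoTime() - deadline < 0);
        return List.of(); // nothing arrived in time; the caller's assertion should then fail
    }

    public static void main(String[] args) {
        // Fake poller for illustration: empty on the first two calls, one record on the third.
        int[] calls = {0};
        Supplier<List<String>> fakePoll = () -> {
            calls[0]++;
            return calls[0] < 3 ? List.<String>of() : List.of("redpanda");
        };

        List<String> records = pollUntilNonEmpty(fakePoll, Duration.ofSeconds(5));
        System.out.println(records + " after " + calls[0] + " polls");
    }
}
```

In a real test, you would assert on the returned records and treat an empty result as a failure.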

1.3 Redpanda node setup

Since we are running this test in Redpanda, you will also need to set up a Redpanda node. This is where Testcontainers comes into the picture. It allows you to create throwaway instances of the node, which will then be destroyed when tests finish running.

@Before
public void init() {
   redpanda = new RedpandaContainer("vectorized/redpanda:v22.1.4");
   redpanda.start();
}

In the above snippet, I tell Testcontainers to pull down the Redpanda Docker image and start the container behind the scenes.

docker container ls
CONTAINER ID   IMAGE                            COMMAND                  CREATED         STATUS         PORTS                                                                                         NAMES
34a719219ff9   vectorized/redpanda:v22.1.3      "sh -c 'while [ ! -f..."   7 minutes ago   Up 7 minutes   8081-8082/tcp, 9644/tcp, 0.0.0.0:49167->9092/tcp, :::49167->9092/tcp                          zealous_elion

As you can see, Testcontainers created the Redpanda container. Next, you’ll define your test as a regular JUnit test:

@Test
public void testUsage() throws Exception {
   testRedpandaFunctionality(redpanda.getHost() + ":" + redpanda.getMappedPort(9092), 1, 1);
}

You then run the test using the following command:

mvn test

If a test fails, you will see this printed in the log under the Failures and Errors sections:

Tests run: 1, Failures: 1, Errors: 0, Skipped: 0

If your test is successful, you will instead see a 0 in the Failures and Errors sections:

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

After the test finishes, you can stop the container in a JUnit teardown method:

@After
public void tearDown() {
   redpanda.stop();
}
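
The start-before, stop-after pattern matters most when a test fails, because the container should still be cleaned up. The sketch below shows the same idea in plain Java with try-with-resources. `LifecycleSketch`, `ThrowawayNode`, and `runWithNode` are hypothetical names, and the node only records lifecycle calls instead of running Docker. As far as I know, containers built on Testcontainers' `GenericContainer` implement `AutoCloseable`, so the same construct should work with them, but treat that as an assumption to check against your version.

```java
import java.util.ArrayList;
import java.util.List;

final class LifecycleSketch {

    // Hypothetical stand-in for a container: records lifecycle calls instead of running Docker.
    static final class ThrowawayNode implements AutoCloseable {
        private final List<String> events;

        ThrowawayNode(List<String> events) {
            this.events = events;
        }

        void start() {
            events.add("start");
        }

        @Override
        public void close() {
            events.add("stop");
        }
    }

    // Runs a test body against a fresh node; the node is stopped even if the body throws.
    static List<String> runWithNode(Runnable testBody) {
        List<String> events = new ArrayList<>();
        try (ThrowawayNode node = new ThrowawayNode(events)) {
            node.start();
            testBody.run();
            events.add("test passed");
        } catch (RuntimeException | AssertionError e) {
            // The resource is already closed by the time this handler runs.
            events.add("test failed");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(runWithNode(() -> { }));
        System.out.println(runWithNode(() -> { throw new IllegalStateException("boom"); }));
    }
}
```

In both runs, the node is stopped, which is the same guarantee the `@After` method gives you in JUnit.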

And that’s it! You’ve successfully run an integration test with Testcontainers.

Next, we’ll move on to another popular integration testing tool.

2. Zerocode

Zerocode is an open-source Java test automation framework that uses a declarative style of testing. In declarative testing, you don't write code. Instead, you declare scenarios that describe each step of a test in a JSON or YAML file. The Zerocode framework then interprets the scenario and executes the instructions that you specify via a custom DSL. Zerocode can be used for end-to-end testing of your data stream.

2.1 Prerequisites

Zerocode is a Java library, so it can only be used in the Java ecosystem. You can get it from this central repo and declare it as a dependency:

<dependency>
    <groupId>org.jsmart</groupId>
    <artifactId>zerocode-tdd</artifactId>
    <version>1.3.28</version>
</dependency>

You can also check out the official GitHub repository of the Zerocode framework here, and you can find the full demo I’m about to walk you through in the GitHub repo here.

2.2 Producer and consumer setup

As I did in my integration testing with Testcontainers, I also need a producer to create events with Zerocode:

{
 "name": "produce_test_message",
 "url": "kafka-topic:test-topic",
 "method": "produce",
 "request": {
   "recordType": "JSON",
   "records": [
     {
       "key": "${RANDOM.NUMBER}",
       "value": "Hello Redpanda"
     }
   ]
 },
 "assertions": {
   "status": "Ok"
 }
}

In the above snippet, I declare what our producer should do:

  • name - The scenario step name. This can be anything you want.
  • url - Specifies the Redpanda topic via the kafka-topic property and tells the producer which topic events should be sent to (Note: Although there is no Redpanda keyword here, kafka-topic will work with Redpanda).
  • method - Tells Zerocode to create a Redpanda producer.
  • request - Specifies data that should be produced.
  • recordType - The type of records to be produced/consumed. In this example, it's JSON.
  • assertions - Checks the execution response. In this example, we are verifying that producing the event was successful.

The producer declared above will send one event (with a JSON payload) whose value is “Hello Redpanda”. You then need to consume that event and check the payload. For that reason, I declare the consumer as well:

{
 "name": "consume_test_message",
 "url": "kafka-topic:test-topic",
 "method": "consume",
 "request": {
   "consumerLocalConfigs": {
     "recordType": "JSON"
   }
 },
 "retry": {
   "max": 2,
   "delay": 30
 },
 "validators": [
   {
     "field": "records[0].value",
     "value": "Hello Redpanda"
   }
 ]
}

Here is what the consumer step declares:

  • url - Specifies the topic (via the kafka-topic keyword) to consume from. This should be the same as the url you set in your producer.
  • method - Tells Zerocode to create a Redpanda consumer.
  • retry - Sets the maximum number of retries and the delay between retries in case the consumer does not find any events.

The other keywords in our consumer are the same as in the producer. In the validators block, you can see that I’m verifying that the consumed event’s value is “Hello Redpanda”, as it was written by the producer.

With the above steps, I verify that I can produce an event to the Redpanda stream and consume that same event. Once you’ve completed this step, you can move on to configuring the test.

2.3 Configuration

In the case of Testcontainers, it was the library that created a Redpanda broker (via Docker). However, before launching the Zerocode tests, you need to have the Redpanda broker up and running. For local testing, you can create a YAML file and use Docker Compose to do this.

After you have the Redpanda broker ready, you need to tell Zerocode how to reach it. You may also need to specify some properties that Zerocode will use when creating your producer and consumer. For that reason, create a properties file with the following content:

kafka.bootstrap.servers=localhost:9092
kafka.producer.properties=producer.properties
kafka.consumer.properties=consumer.properties

  • kafka.bootstrap.servers - The bootstrap server address of Redpanda. Keep in mind that there is no Redpanda keyword, but the kafka keyword works with Redpanda.
  • kafka.producer.properties - The name of the file that contains producer properties. The file is in the same folder in this example.
  • kafka.consumer.properties - The name of the file that contains consumer properties. The file is in the same folder in this example.
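
The file follows the usual Java key=value properties format. As a rough illustration of how such a file reads, the sketch below parses the same three entries with `java.util.Properties` from an in-memory string. `TargetEnvSketch` and its members are hypothetical names, and this is not meant to show how Zerocode loads the file internally.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class TargetEnvSketch {

    // The same three entries as the properties file, inlined so the sketch is self-contained.
    static final String CONTENT = String.join("\n",
            "kafka.bootstrap.servers=localhost:9092",
            "kafka.producer.properties=producer.properties",
            "kafka.consumer.properties=consumer.properties");

    // Parses key=value text into a Properties object.
    static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(CONTENT);
        // Property keys are case-sensitive, so only the lowercase key resolves.
        System.out.println(props.getProperty("kafka.bootstrap.servers"));
        System.out.println(props.getProperty("Kafka.bootstrap.servers"));
    }
}
```

Because keys are case-sensitive, it is worth copying them exactly as they appear in the file.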

Once you’ve configured the test, it’s time to write the test case.

2.4 Writing the test case

At this point in the demo, you have the scenario file and configuration. So, how do you link them and run the scenario? Behind the scenes, Zerocode uses JUnit 4 runners. For that reason, you now create a Java test class and annotate it with JUnit and Zerocode annotations:

@RunWith(ZeroCodeUnitRunner.class)
@TargetEnv("redpanda.properties")
public class RedpandaTest {
   @Test
   @Scenario("redpanda-stream-test.json")
   public void test_redpanda() {
   }
}

  • @RunWith - Specifies the Zerocode runner that will be responsible for running your scenario.
  • @TargetEnv - The name of the configuration file that Zerocode will use for the scenario. This is how you link configuration files to scenarios.
  • @Scenario - The name of the scenario file that Zerocode will run.
  • @Test - The standard JUnit annotation that marks the method as a test.

You can run the test using the following command:

mvn test

Just as with our Testcontainers test above, you will see any errors or failures printed in the log:

Tests run: 1, Failures: 1, Errors: 0, Skipped: 0

If your test is successful, no failures or errors will be noted.

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

Conclusion

Now that you’ve learned how to run two types of integration tests on your applications, you can validate that your data streams are configured correctly. These tests are also useful for checking your producers and consumers against your Redpanda nodes.

As I mentioned at the start of this article, Zerocode and Testcontainers are the two integration testing tools that I and other devs tend to use, and there aren’t many other integration testing tools available. If you know of others that we should look into, share them in the Redpanda Community on Slack, or share them on Twitter: @redpandadata. To learn more about getting started with Redpanda, view the documentation here.