Reliable releases: Focus on the product, not on the feature

Ilya Shubinsky
6 min read · Jun 5, 2023

Automation approach to validate the whole system at each execution cycle

Intro

As mentioned earlier (here, and here), different organizations may have their own definitions and structures for automation and validation teams. However, my suggested approach to ensuring the quality of a software product before release is to maintain continuous testing throughout the product’s entire lifecycle. One effective way to achieve this is a “Develop vs Develop” approach, which tests the latest commit of each component together with the latest commits of all other components, also known as system-level testing. This approach validates the interactions and dependencies between different components in the integrated system, rather than testing them in isolation. It helps identify issues in overall product behavior that may not be detected in isolated component testing. Continuous testing, including end-to-end (E2E) testing, aligns with modern software development practices and contributes to delivering a high-quality software product before it is released to production. In this article, I will demonstrate how to integrate this approach into your CI/CD infrastructure and release management processes.

Reminder

In our previous article, “Release High Quality Products: Breaking down and reassembling automation to gain its full potential”, we mapped the challenges in the field of system automation and validation across various organizations. We also covered the values and roles we should stick to in order to overcome these challenges.

Challenges under discussion in this article

  1. Difficulty in keeping all product components in sync
  2. Maintaining automation infrastructure in a fast-delivering ecosystem
  3. Supporting testing across different platforms, devices, and environments

Proposed Solution

Let’s consider an example of an e-commerce website as the software system for which we want to establish a Continuous Integration and Continuous Deployment (CICD) process. This system comprises various components, including:

  • DevOps for Deployment
  • Front End Application
  • Back End Application
  • Database
  • Security Mechanism
  • Payment Application

Let’s focus on a few test jobs we can build into our CICD infrastructure:

Nightly Job: This pipeline runs nightly and includes performance tests, end-to-end (E2E) tests, load tests, and user-flow tests. These tests validate the performance and functionality of the system as a whole. For example, it may simulate multiple users performing actions on the website concurrently, such as logging in, adding items to the cart, going through the checkout process, and making payments with valid credit cards, while reporting system utilization (CPU, memory, etc.) throughout the process. This pipeline also validates the security mechanisms in place and ensures that the system can handle expected loads and user flows. The biggest challenge in such tests is to maintain reliable expected results and to keep the tests simple, so that the steps can easily be reproduced when a defect is detected.
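As a minimal sketch of such a nightly job, the test below runs several concurrent user flows while sampling coarse process utilization. `ShopClient` and its methods are hypothetical stand-ins for a real client against the system under test:

```python
# Hypothetical sketch of a nightly load test: several concurrent user
# flows with coarse utilization reporting. ShopClient is a stub; a real
# suite would use an HTTP client against the deployed system.
import concurrent.futures
import resource  # Unix-only; use psutil or similar on other platforms
import time


class ShopClient:
    """Minimal stub standing in for a real e-commerce test client."""
    def login(self, user, password): return True
    def add_to_cart(self, item_id): return True
    def checkout(self): return "order-1"
    def pay(self, card_number): return "receipt-1"


def user_flow(user_id):
    client = ShopClient()
    assert client.login(f"user{user_id}", "secret")
    assert client.add_to_cart(item_id=42)
    order = client.checkout()
    receipt = client.pay(card_number="4111111111111111")
    return order, receipt


def test_nightly_concurrent_user_flows():
    start = time.monotonic()
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(user_flow, range(8)))
    elapsed = time.monotonic() - start
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Report utilization alongside functional results for later analysis.
    print(f"flows=8 elapsed={elapsed:.2f}s peak_rss_kb={peak_kb}")
    assert all(order and receipt for order, receipt in results)
```

Keeping the flow in one small function is what makes a nightly failure reproducible: a single call to `user_flow` replays the exact steps outside the load harness.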

Daily Job: This pipeline runs a few times a day and focuses on integration tests for each component of the system. It could include tests that validate the interactions and integrations between different components, such as the front-end application, back-end application, database, security mechanism, and payment application. For example, it may test different card types for payment, different combinations of usernames and passwords for login, and verify that the cart is updated in a timely manner when items are added or removed. The most important aspect of this pipeline is to store as much debug data as possible, making it simple to detect the “problematic component” in case of a defect. More about storing CICD data for internal purposes will be described in the next article, “Use Big Data Methods To Improve Automation Systems”.
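A sketch of what such parametrized integration tests might look like in Pytest, with debug data attached to each assertion so a failure points straight at the responsible component (the `charge` function and its fields are hypothetical):

```python
# Hypothetical daily integration test: parametrize over card types and
# attach a debug record to each assertion so a failure can be traced
# back to the "problematic component" without re-running the pipeline.
import json
import pytest

SUPPORTED_CARDS = {"amex", "mastercard", "visa"}


def charge(card_type, amount):
    """Stand-in for the payment component's API."""
    if card_type not in SUPPORTED_CARDS:
        raise ValueError(f"unsupported card type: {card_type}")
    return {"status": "approved", "card": card_type, "amount": amount}


@pytest.mark.parametrize("card_type", sorted(SUPPORTED_CARDS))
def test_payment_integration(card_type):
    result = charge(card_type, amount=10)
    # Store the raw response as the assertion message: on failure it
    # lands in the report with the component name already labeled.
    debug_record = json.dumps({"component": "payment", "response": result})
    assert result["status"] == "approved", debug_record
```

The same pattern extends to login combinations or cart updates: one parametrized test per integration point, each carrying enough context to identify the failing component.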

PR Job: This pipeline is triggered on each pull request and includes “Gate Watcher” system tests. These tests validate the overall health of the system before changes are merged into the main branch. It could include tests that verify that all of the system’s component containers are up and running and that there are no API breaks between components. It is also possible to include tests that simulate user interactions, such as logging in, buying an item, and making payments with appropriate receipts, to ensure that the changes made in the pull request do not impact overall system functionality and performance.
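A minimal sketch of such a “Gate Watcher” check, assuming each component exposes a health payload; the component names and required fields below are illustrative:

```python
# Hypothetical "Gate Watcher" sketch: before allowing a merge, verify
# every component reports healthy and that its payload still carries
# the fields other components depend on (a cheap API-break check).
REQUIRED_FIELDS = {
    "backend": {"status", "version"},
    "payment": {"status", "provider"},
}


def gate_watcher(health_responses):
    """health_responses: component name -> parsed health-check payload."""
    failures = []
    for component, required in REQUIRED_FIELDS.items():
        payload = health_responses.get(component)
        if payload is None or payload.get("status") != "ok":
            failures.append(f"{component}: not healthy")
        elif not required <= payload.keys():
            failures.append(f"{component}: missing {required - payload.keys()}")
    return failures  # an empty list means the merge gate is open


# A healthy system passes; a silently dropped field is caught as an API break.
ok = gate_watcher({
    "backend": {"status": "ok", "version": "1.2"},
    "payment": {"status": "ok", "provider": "stripe"},
})
broken = gate_watcher({
    "backend": {"status": "ok", "version": "1.2"},
    "payment": {"status": "ok"},  # "provider" dropped: an API break
})
print(ok, broken)
```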

Let’s visualize it:

Let’s start with the selected technologies for the example
(each can be replaced by any comparable framework you prefer):

Code management: GitHub (Alternatives: GitLab, Bitbucket…)
CICD Pipelines: Buildkite (Alternatives: Jenkins, CircleCI…)
Cloud storage: Amazon AWS (Alternatives: MS Azure, Google Cloud…)
Tests Framework: Pytest (Alternatives: GTest, Cypress, Catch2…)
DB, Logs and Reports: Elasticsearch (Alternatives: MongoDB, SQL…)
Image Creator: Docker
Containers Deployment: Kubernetes
Issue Tracking: Jira

PR Integration pipeline

PR Integration flow:

A Pull Request is opened on a specific component → An image is built together with unit-test execution → The image is pushed to the storage unit → The Pull Request Integration pipeline is triggered → The image from the first step is pulled together with the latest images of all other components → “Gate Watcher” tests are executed → The test outcome is reported to the code management framework → The merge is allowed / disallowed based on the test outcome.
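The chain above can be sketched as a small driver that composes the pipeline’s shell commands (a real setup would express these as Buildkite steps; the registry, image names, and test path here are hypothetical):

```python
# Sketch of the PR integration flow as composed shell commands.
# Registry, image tags, and test paths are illustrative; a real
# pipeline would execute these via its CI agent.
def pr_integration_commands(component, commit_sha, others,
                            registry="registry.example.com"):
    image = f"{registry}/{component}:{commit_sha}"
    steps = [
        f"docker build -t {image} .",   # build; unit tests run in the Dockerfile
        f"docker push {image}",         # push to the storage unit
    ]
    # Pull the latest images of all other components (Develop vs Develop).
    steps += [f"docker pull {registry}/{c}:latest" for c in others]
    # Run the gate tests; the JUnit report feeds the merge decision.
    steps.append("pytest tests/gate_watcher --junitxml=results.xml")
    return steps


for cmd in pr_integration_commands("backend", "abc123",
                                   ["frontend", "payment"]):
    print(cmd)
```

Separating command composition from execution keeps the flow testable on its own, before wiring it into the CI agent.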

Daily & Nightly pipelines

Daily Integration Flow:

Every X hours, the Daily Integration pipeline is triggered → The latest images of all components are pulled → Integration-level tests are executed → Statuses, logs, failures, and metrics are reported to the DB → Failing tests are reported to the issue tracking framework with all relevant data.

Nightly Integration Flow:

Every night, the Nightly Integration pipeline is triggered → The latest images of all components are pulled → System-level tests are executed → Statuses, logs, failures, and metrics are reported to the DB → Failing tests are reported to the issue tracking framework with all relevant data → A full platform release is created → The full platform release is deployed (based on the pipeline outcome).
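A sketch of the reporting step in these flows: shaping each failing test into one record for the results DB (an Elasticsearch index in this example stack) and one summary for the issue tracker. All field names are illustrative:

```python
# Hypothetical reporting step: one record per failing test for the
# results DB, plus a per-component summary for the issue tracker.
import datetime
import json


def failure_record(test_name, component, log_excerpt, pipeline="nightly"):
    return {
        "pipeline": pipeline,
        "test": test_name,
        "component": component,
        "log": log_excerpt,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }


def tracker_summary(records):
    """Group failing tests by component, so one ticket per component."""
    by_component = {}
    for rec in records:
        by_component.setdefault(rec["component"], []).append(rec["test"])
    return {c: sorted(tests) for c, tests in by_component.items()}


records = [
    failure_record("test_checkout", "payment", "timeout after 30s"),
    failure_record("test_login", "backend", "HTTP 500"),
]
print(json.dumps(tracker_summary(records), indent=2))
```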

In the proposed CICD process, there are three key benefits:

  • Prevention of flawed changes from being integrated into the product code: By running system tests in the PR pipeline, the CICD process ensures that only valid flows are merged into the main codebase. This prevents flawed changes from being integrated, preserving the overall quality and reliability of the system.
  • Reduction of time gaps between valid and failing test executions: The daily pipeline, which runs integration tests every few hours, helps in minimizing the time gaps between valid and failing test executions. This allows for quicker identification and resolution of issues, reducing the risk of bugs and improving the efficiency of the development process.
  • Continuous monitoring of system status: The nightly pipeline, which includes E2E tests, ensures that the entire system is thoroughly tested on a nightly basis. This provides continuous visibility into the system’s status and product’s regression, helping to identify and address any potential issues proactively.

To further enhance the automation system, it may be beneficial to refer to the “Use Big Data Methods to Improve Automation Systems” article, which can provide insights on leveraging big data techniques to enhance the debugging process, making it quicker, more efficient, and informative. This can help in further optimizing the CICD process and improving the overall quality of the software system.

Golden Tips:

  • All of these processes should be FULLY AUTOMATED. There are many convenient ways to integrate the mentioned technologies into the CICD process, and this automation will save a massive amount of manual work, such as reporting issues, tracing code changes in the code management framework’s history, etc.
  • When building your pipelines, it’s important to ensure that they are dynamic and flexible enough to handle changes in the projects you are running. For example, if you are working on several components that support the development of a big system feature, your pipelines should be easy to modify so they can run regression validation on the in-progress work of those components before they are deployed to the main (develop) branch. This will help ensure that your pipelines remain maintainable and flexible over time. For further information on how to build such pipelines, please refer to the “Validation and Release Automated Process A-Z” article.
  • In addition, it’s a good idea to stream data during execution, particularly for long, complex tests that run in a pipeline, such as nightly jobs. By supplying data “on the fly” in the form of logs or writes to databases like Elasticsearch, you can catch problems during execution without waiting for the regression cycle to complete. For more information on how big data methods can improve automation systems, please refer to the “Use Big Data Methods to Improve Automation Systems” article.
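One way to sketch this streaming idea is a logging handler that batches records and flushes them to a sink mid-run. In practice the sink would be an Elasticsearch bulk endpoint; here it is an in-memory list so the behavior is easy to see:

```python
# Sketch of "on the fly" streaming for long-running tests: a logging
# handler that batches records and flushes them to a sink during the
# run (in practice an Elasticsearch client; here an in-memory list).
import logging


class StreamingHandler(logging.Handler):
    def __init__(self, sink, batch_size=2):
        super().__init__()
        self.sink = sink          # stand-in for an Elasticsearch client
        self.batch_size = batch_size
        self.buffer = []

    def emit(self, record):
        self.buffer.append(self.format(record))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink.extend(self.buffer)  # real code: bulk-index here
            self.buffer = []


sink = []
logger = logging.getLogger("nightly")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(StreamingHandler(sink))
for step in ("login", "add_to_cart", "checkout", "pay"):
    logger.info("step %s done", step)
print(sink)  # records are visible before the run completes
```

Because records reach the sink every `batch_size` messages, a stuck nightly job surfaces in the dashboard long before the regression cycle finishes.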
