Greetings, everyone! It’s time for the next piece in my series on “Unleashing the Power of CI/CD for Android Apps.” Today, I’ll share practical techniques for running tests efficiently on CI that can benefit any company looking to reduce its development costs.
If you have a long feedback loop and high invoices for your cloud CI because tests take a long time, and you are not sure how to ensure the quality of your test cases, you are in the right place 🎉
Before we proceed, I want to clarify that this article doesn’t cover the detailed aspects of writing tests. Instead, it focuses on running tests more efficiently and measuring their quality. So, if you’re looking for insights on optimizing your testing process and evaluating its effectiveness, read on. Ready? Let’s dive in!
…Wait… A kind reminder
My references and tooling advice might focus on Android-specific details. However, it is important to note that, in theory, the underlying principles apply to any software development platform or testing environment, barring resource constraints, missing infrastructure support (legacy systems, for example), and similar limitations. While the implementation may differ between platforms, the outlined concepts and guidelines can be adapted and extended.
Quality over quantity
It is a common antipattern for companies to be trapped into using test coverage as their primary quality metric. 100% code coverage is a foolish target on its own; it is nothing but a well-known fallacy.
No one succeeds without effort…
While many coverage criteria can be used to assess the completeness of testing efforts, statement coverage, also known as line or node coverage, is often considered a minimum requirement for testing, as it helps identify simple syntax errors, basic control flow issues, and missing code paths. However, 100% line coverage on its own says as little about test quality as 10% line coverage does. Therefore, you should consider combining coverage criteria, such as branch coverage, to achieve more comprehensive testing. I like branch coverage because it requires all decision points, logical operators, and loops to be covered. Each coverage criterion has its value as well as its cost, and to me, branch coverage combined with line coverage gives a slightly better understanding of the completeness of testing efforts at a reasonable cost.
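As a small, hypothetical illustration (the class and the threshold are made up), a single test can reach 100% line coverage here while leaving one branch outcome completely untested:

```java
// Hypothetical example: calling shippingFee(60) executes every line,
// giving 100% line coverage, yet the "false" outcome of the if is
// never taken, so branch coverage is only 50%.
class ShippingCalculator {
    static int shippingFee(int orderTotal) {
        int fee = 5;
        if (orderTotal >= 50) {
            fee = 0; // free shipping at or above the threshold
        }
        return fee;
    }
}
```

A test suite asserting only `shippingFee(60) == 0` never checks that small orders actually pay the fee; branch coverage flags that gap, line coverage does not.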
Every effort counts, but …
Don’t be trapped by the fallacy of test coverage. Use it for the right purpose, such as identifying which areas of the codebase are well covered and which require additional testing. What should matter most is the quality of your tests over their quantity. So say hi to mutation testing! 👋
Mutation Testing
Mutation testing introduces intentional changes (mutations) into the code and checks whether the existing test suite can detect those mutations. Think of your tests as super-vaccines that fight code viruses! Vaccines protect us from harmful bugs, while your tests protect your codebase. But here’s the cool part: mutation testing creates mutant versions of your code, like new strains of viruses. If your tests are strong, they’ll defeat those mutants too, ensuring your code stays healthy and bug-free.
The more mutations we zap, the happier our code will be! The goal is to evaluate the quality and robustness of the test suite by measuring its ability to identify and “kill” these mutated versions of the code.
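To make that concrete, here is a hypothetical mutant: the operator `>=` flipped to `>`, a classic conditionals-boundary mutation (the class and values are made up for illustration). A weak test suite that never checks the boundary lets the mutant survive; one boundary assertion kills it:

```java
class FeeCalculator {
    // Original code under test.
    static int fee(int total) {
        return (total >= 50) ? 0 : 5;
    }

    // A mutant a mutation testing tool might generate: ">=" becomes ">".
    static int feeMutant(int total) {
        return (total > 50) ? 0 : 5;
    }
}
```

A test that only asserts `fee(60) == 0` passes against both versions, so the mutant survives undetected. Asserting `fee(50) == 0` fails against the mutant (which returns 5) and kills it — that is exactly the kind of missing assertion mutation testing surfaces.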
Nothing comes for free …
You guessed right! Mutation testing is expensive for many reasons, such as test execution time, test suite maintenance, etc. The smartest approach is to identify your most critical tests (i.e., tests related to login, signup, payment, etc.) and evaluate their quality and robustness by running mutation tests in a regular cron job.
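As a sketch of what that could look like with Gradle: the JVM PIT plugin (`info.solidsoft.pitest`) lets you scope mutation testing to specific packages. The package names, version, and threshold below are assumptions to adapt; plain Android modules may need a dedicated PIT variant.

```groovy
// build.gradle — a minimal sketch: run mutation testing only on critical classes.
plugins {
    id 'info.solidsoft.pitest' version '1.15.0' // version is an assumption; check the latest
}

pitest {
    targetClasses = ['com.example.payment.*', 'com.example.login.*'] // your critical 20%
    targetTests   = ['com.example.payment.*', 'com.example.login.*']
    threads = 4
    mutationThreshold = 80 // fail if fewer than 80% of mutants are killed
}
```

Invoke it with `./gradlew pitest` from the scheduled (cron) CI job rather than on every commit.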
Remember the Pareto principle of fault distribution …
Multiple studies use the Pareto principle to describe the fault distribution of software [1][2]. According to these studies, 20% of the software modules are responsible for 80% of the faults in a project. Of course, this principle might be neither applicable nor meaningful if that 20% of modules makes up the majority of the project/system size.
The budget for development always depends on the company. Regardless of your budget, you can start from 20% and iterate over time. So FIND your 20%, define their criticality and priority, and enhance their quality one by one. If you don’t have test metrics to find your 20%, become best friends with your QA fellows, Crashlytics, test analysis tools, etc 💛
Don’t treat quality as an option; make it the foundation of everything you do. If I am not mistaken, my professor in college yelled this at us for years: a wrong software solution is worse than a non-working software solution 😁 😅 Hats off to this wise man ✊
Speed over perfection
In a fast-changing world, speed and agility are essential to stay ahead in the competitive tech landscape. Besides, slower execution causes a longer feedback loop and increased resource consumption, which in turn increase costs: the time and resources wasted in the present and the opportunities missed in the future. Oohh, such a delightful chain of events … 😬
Please don’t take the title the wrong way; contrary to popular belief, you don’t have to compromise on perfection for the sake of speed. It might take time, but with consistent effort and efficient tooling, you can aim to have both:
Selective Testing
Time is money for many reasons, and executing all tests generally takes time. So ask yourself: do you really need to execute all of your tests on every commit? Well, not really. If you want to optimize test execution and improve the overall efficiency of the testing process, selective testing is a viable answer. By selectively running tests focused on specific areas or components of the codebase, you can reduce the time and resources required for test execution while still ensuring code quality. Voilà 🎊 🎈
The following might help you to understand it better:
1. When a change is made to a specific module in a PR (pre-merge):
- Identify the module that has changed (e.g., Module A).
- Configure the pre-merge flow to execute only the tests related to that specific module. For example, if the change is in `com.x.moduleA.ExampleClass.kt`, run `:moduleA:unittest` and `:moduleA:uitest` during the pre-merge flow.
2. If other modules depend on the changed module:
- Identify the modules that depend on the changed module (e.g., Module B depends on Module A, and Module C depends on Module B).
- Configure the pre-merge flow to also execute the tests for these dependent modules. For example, since a change in `com.x.moduleA.ExampleClass.kt` can break both dependents, run `:moduleA:unittest`, `:moduleA:uitest`, `:moduleB:unittest`, `:moduleB:uitest`, `:moduleC:unittest`, and `:moduleC:uitest` during the pre-merge flow.
More modular is better…
The more modular your software is, the better. Think about the scenario described above: if you break the system down into smaller, manageable components, you will have fewer tests to execute whenever there is a change, which reduces the time required for testing and makes the development and testing process more agile.
Gain confidence…
I suggest running all tests with a cron job at a regular cadence until you feel safe with your selective testing setup — start with once a day, then once a week — and gradually stop running the full suite on every change.
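For example, with GitHub Actions (a sketch; the workflow name, schedule, and Gradle tasks are placeholders to adapt to your project):

```yaml
# .github/workflows/all-tests.yml — scheduled full-suite safety net.
name: all-tests
on:
  schedule:
    - cron: "0 3 * * *" # nightly at 03:00 UTC; widen to weekly as confidence grows
jobs:
  full-suite:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # connectedAndroidTest additionally requires an emulator step on CI.
      - name: Run the full test suite
        run: ./gradlew test connectedAndroidTest
```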
Test tagging
I know there is at least one feature every project defines as critical, with no tolerance for failure. Fair enough. If certain test cases are critical and must be executed with every code change, regardless of selective testing, you can use test tagging. By applying test tagging, you ensure these critical test cases are consistently run to validate the changes made. You can check how test tagging is done for JUnit here. You will most probably use the same tag multiple times, so the better approach is to create a custom tag annotation to set the standard for everyone.
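For JUnit 5 on the JVM side, the filtering itself is plain Gradle configuration (a sketch; the tag name "critical" is hypothetical — in your test code you would annotate classes or methods with `@Tag("critical")`, or with a custom annotation meta-annotated with it):

```groovy
// build.gradle — run only the critical tests in the fast pre-merge job.
tasks.named('test') {
    useJUnitPlatform {
        includeTags 'critical'
    }
}
```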
Test Sharding — put on your dancing shoes and step into the world of parallel executions
As we discussed, if you break the system down into smaller, manageable components in a constructive way, you will have fewer tests to run thanks to selective testing. However, that doesn’t necessarily mean your test suite will be efficiently small, or that your management will be okay with selective testing. If either of these applies to you, you can divide a large test suite into smaller subsets or groups called shards, a technique known as test sharding. Each shard contains a subset of test cases that can be executed independently and concurrently. The primary goal of test sharding is to distribute the test execution workload across multiple machines or processes, enabling faster and more efficient testing. You can find out how to do sharding for Android projects here.
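Two concrete knobs, as a sketch: Gradle can fork JVM unit tests across multiple processes on one machine, and AndroidJUnitRunner has built-in `numShards`/`shardIndex` support for instrumented tests (the shard counts below are arbitrary examples):

```groovy
// build.gradle — shard JVM unit tests across processes on a single machine.
tasks.named('test') {
    maxParallelForks = Runtime.runtime.availableProcessors().intdiv(2) ?: 1
}

// For instrumented tests, each CI machine can run one shard, e.g.:
//   adb shell am instrument -w -e numShards 4 -e shardIndex 0 \
//     com.example.app.test/androidx.test.runner.AndroidJUnitRunner
```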
Test sharding is a fun solution for achieving an efficient testing process, but you need to allocate more machines and processes, which might not always be feasible or sufficient. Combining it with other techniques such as selective testing and test tagging can give you a better cost-value balance for the execution workload.
I see the deadlock, but let's focus on a solution: start with selective testing and test tagging if allocating more machines and processes is not initially feasible for you. Ensure the cost of quality and the importance of iterative development are understood by every party.
Ever tried. Ever failed. No matter. Try again. Fail again. Fail better …
Fear is the poison of every brilliant idea. Acknowledge it, define the risks, analyze them, and give it a try. Most of the time, the risk of doing something is almost nothing compared to its value, but it is a common “human error” to get lost dwelling on the fear of failure. Whether it is you or another party dealing with this poison, make sure iterative development is understood, and how beautiful it can be to fail if you learn something from it and do better next time. Don’t waste time dwelling on “what ifs” for too long. As long as you don’t spend millions on what you are doing, risk the company’s reputation, or risk security incidents, you will be safe. Again: define the risks, analyze them, and work out the cost-value equation. Most of the time, your equation will be far from a reason to worry.
Ah, before I forget, the following is my motto in general, and it is almost a candidate to be my next tattoo 😅
We will try, fail, learn, and fail better.
Wrap up
It might be overwhelming to follow it all, so let's wrap up:
- Don’t get tricked by the test coverage fallacy — use it for the right purpose, such as identifying which areas of the codebase are well covered and which require additional testing.
- Don’t rely on only line coverage — combine it with other coverage criteria, such as branch coverage.
- Focus on quality, not quantity — be friends with mutation testing and find a way to balance its usage against its cost.
- Remember the Pareto principle of fault distribution — FIND your 20%, define their criticality and priority, and enhance their quality one by one.
- You don’t have to compromise on perfection for the sake of speed — use selective testing to optimize test execution and improve the overall efficiency of the testing process in a meaningful way, and enhance it with test tagging by executing “critical” tests on every code change.
- Put on your dancing shoes and step into the world of parallel execution: shard your tests when the suite is still too slow.
- Try, fail, learn, and fail better — don’t be afraid of trying different things. Doing nothing is worse than failing while you are trying to do better.
What is next?
According to the plan, the next article of the series will be Automated Code Quality and Security Checks for Android Apps — well, it will take time 😅 Please keep in mind that this series focuses on giving you information rather than sample code.
Thank you for reading.
Cheers until next time 🥂