Managing iOS Tests at Scale: A Symphony

Atakan Karslı · Trendyol Tech · Jun 22, 2023

Photo by Manuel on Unsplash

Picture the complexities of an orchestra, with every instrument playing a vital role in creating a harmonious melody. Each note, each beat, and each pause are essential. Now, imagine that orchestra as Trendyol, one of Turkey’s leading e-commerce platforms. The super-app offers a wide range of products and multi-channel services, and each domain and feature is a musician with a critical role. Our task as the iOS Platform team, the conductor, is to harmonize these numerous elements and ensure that each one performs optimally and contributes to the overall masterpiece that is our application.

We have published numerous articles discussing our app structure, test extensions, and mocking techniques. While we are constantly developing other strategies to enhance the experience for both our developers and users, this article sheds light on an important section of our orchestra: testing.

If you’re interested in exploring more about our other sections, the following article is a deep dive into the story of our app’s modularization.

Current State of Testing

As the Trendyol iOS app has grown in scale, so too has the complexity and importance of our testing practices. Today, we run an astounding 25,000+ tests on each commit as an essential part of our iOS development flow, which requires careful orchestration to stay within a 30-minute window and maintain our development pace.

We have 25k unit, 1.5k UI, 250 smoke, and 500 snapshot tests.

The equation is straightforward: more developers lead to more features, and therefore, more tests. We’ve seen a ~70% surge in our test count in the last year. If this growth trend persists, we anticipate our test count nearing 50,000 within the next 1 to 1.5 years.

Our in-house DevOps system, named Event Handler, is the key factor that allows us to handle and efficiently execute this large volume of tests. Built with Swift and Vapor, it manages all aspects of our DevOps and CI/CD processes. On the other side, Jenkins, our reliable ally in this work, runs all of our pipelines and plays an instrumental role in keeping our extensive testing infrastructure healthy.
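To make the idea concrete, here is a minimal sketch of what a Vapor-based handler of this kind could look like. This is not Trendyol’s actual Event Handler: the payload fields, the Jenkins URL, the queue-manager job name, and the parameter names are all assumptions, and authentication toward Jenkins is omitted for brevity.

import Vapor

// Hypothetical payload for a commit webhook; the field names are assumptions.
struct CommitPayload: Content {
    let branch: String
    let commit: String
}

func routes(_ app: Application) throws {
    // Receive a commit webhook from the Git server and kick off the test flow
    // by triggering a parameterized Jenkins job (here, a Queue Manager job).
    app.post("webhooks", "commit") { req async throws -> HTTPStatus in
        let payload = try req.content.decode(CommitPayload.self)

        let queueManager = URI(string: "https://jenkins.example.com/job/queue-manager/buildWithParameters")
        _ = try await req.client.post(queueManager) { outgoing in
            // Pass the branch along so downstream jobs can use it as displayName.
            try outgoing.query.encode(["BRANCH": payload.branch, "COMMIT": payload.commit])
        }
        return .accepted
    }
}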

Single build, multiple uses:

In the early stages of our iOS app, we maintained various schemes and targets for tests beyond the primary Trendyol target. Our pipelines were running in parallel; however, each type of test required a full rebuild of the app. This approach was manageable when our team was smaller and the number of tests was relatively low.

Each test job had its own independent build process.

However, as our team expanded and our test count grew, the increase in job execution time began to slow us down, blocking our developers for longer periods. Recognizing that building was the bottleneck, we reconsidered our approach to running tests.

We restructured the iOS app scheme to include all test targets — Unit tests, Smoke tests, and Snapshot tests. This move eliminated redundant app rebuilds. We evolved to a more efficient flow where we built the iOS app just once, and all test stages ran on this singular build.

To achieve this goal, we included two xcodebuild parameters in our Fastfile:

  1. build_for_testing: Builds the app and all test targets for testing, but does not run the tests.
  2. test_without_building: Runs tests using the previously built products in the specified derived data path, without requiring a fresh build.
desc "Only build project"
lane :build_only do |options|
scan(
workspace:"workspace.xcworkspace",
derived_data_path: "derived_data_path",
scheme:"scheme",
device: "iPhone 15",
build_for_testing: true)
end
desc "Run unit tests"
lane :unit_tests do |options|
scan(
workspace:"workspace.xcworkspace",
derived_data_path: "derived_data_path",
scheme:"scheme",
device: "iPhone 15",
testplan: options[:testplan],
test_without_building: true,)
end

Finally, as we added more types of tests and the total feedback time increased, we transitioned to a structure where all jobs run in parallel. The first job uploads the app and the test runners to the file server, and the other Jenkins jobs pull those files into their derived data paths before they start executing tests.

To run tests without building, only the runner and the app are needed.

Thus, we arrived at a more streamlined and efficient testing process, mirroring the concept of ‘single build, multiple parallel uses’, and significantly reduced our testing pipeline times and developer waiting periods.

New commit flow: 25k+ tests run on each commit (under 30 min)

Queue Manager:

Another cause of long pipeline times was the queues that formed on the Jenkins side. These built up for several reasons:

  1. Developers often re-triggered the pipelines when test results were slow to arrive, leading to extensive queues.
  2. Repetitive commits also spawned numerous jobs, filling the queue with jobs that were already stale.

To solve this, we introduced a Jenkins job called Queue Manager and placed it in front of the other jobs. Our Event Handler project now triggers this job as a first step. It scans all test pipelines for any pending jobs associated with the branch and aborts them to clean the queue.

Achieving this was quite simple:

First, you need to set the displayName to the branch name for all of the Jenkins jobs, like so: currentBuild.displayName = branch

Then, in the Queue Manager, we scan the jobs by branch name and stop any pending builds before triggering new ones.

stages {
  stage('Abort Active Test Jobs') {
    steps {
      script {
        for (jobName in jobNames) {
          // Find builds that are still running (no result yet) for this job.
          def runningBuilds = Jenkins.instance.getItemByFullName(jobName).getBuilds().findAll { it.getResult() == null }
          runningBuilds.each { build ->
            // Abort only the builds that belong to the same branch.
            if (build.displayName.contains(branch)) {
              build.doStop()
            }
          }
        }
      }
    }
  }

  stage('Trigger Unit Test Job') {
    steps {
      script {
        // The queue is clean; trigger the first test job for this branch.
        build job: jobNames[0], parameters: []
      }
    }
  }
}

Mute System: The beauty of silence.

We use the native XCUITest framework to write our UI tests, and merging snapshot testing with UI tests gave us more tools to ensure our app’s quality. As our regression testing strategies grew in complexity, we inevitably ran into flaky and unmaintained tests caused by factors such as test data, the environment, or rushed feature development. These failed tests not only muddied the analysis of test results but also extended execution time. To overcome these challenges, we engineered a unique solution: the Mute System.

This solution has allowed us to run our regression and snapshot tests more efficiently, offering quicker feedback and yielding clearer results on merge requests. Before we dive deep into this system, you may want to explore these articles to better understand our UI tests.

Currently, the Trendyol iOS app is divided into eight domain teams. Each team focuses on the development and tests related to its domain. We reflect this distinction in the project by separating test targets and test plans. Before we started running regression (UI) tests on all merge requests, there were two main flows: daily tests ran every morning on the ‘develop’ branch, and tests ran whenever a regression branch was created.

Message blocks in order: Daily Develop, Release Candidate (RC) Branch, and Merge Request.

We were seeing failing tests on ‘develop’ every day for the reasons mentioned above, and tasks were created to fix them. Running domain-based regression tests on every merge request while these failures existed would make the same tests fail on every merge request, creating unnecessary noise.

In essence, our system mutes failing tests on the develop branch by automatically skipping them from the tests that run on merge requests. This system functions in two stages: firstly, it identifies and saves failures, and then it proceeds to skip them.

  1. Each day, after the regression tests run on our ‘develop’ branch, we store all test results in a database. This allows us to filter out the failures and categorize them as ‘mutable tests’ (a minimal sketch of this step follows the list).
  2. Then, whenever a regression test pipeline is initiated on any branch, our first step, as defined in our Groovy mute stage, is to skip these mutable tests.
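To illustrate the first stage, here is a minimal sketch of the filtering step, assuming a simple JSON report of test results. It is not Trendyol’s actual implementation: the TestResult type and the report format are hypothetical stand-ins for whatever the real pipeline stores in its database.

import Foundation

// Hypothetical shape of a single stored test result.
struct TestResult: Codable {
    let identifier: String   // e.g. "CheckoutUITests/testApplyCoupon()"
    let status: String       // e.g. "passed" or "failed"
}

// Collect the failed test identifiers from the nightly 'develop' run;
// these become the 'mutable tests' that later get skipped on merge requests.
func mutableTests(fromReportAt url: URL) throws -> [String] {
    let data = try Data(contentsOf: url)
    let results = try JSONDecoder().decode([TestResult].self, from: data)
    return results
        .filter { $0.status == "failed" }
        .map(\.identifier)
}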

Performing this skipping operation on CI while using Xcode test plans was something new, so we needed a new tool for it.

xctestplanner: Meet the Conductor

From the start, we’ve been using test plans. Xcode test plans are a great way to organize and run a collection of tests with different test configurations. A test plan is a simple JSON file with the .xctestplan extension. The only downside was the lack of a command line interface for editing them from the terminal.

xctestplanner is a simple CLI tool for managing test plans from the command line. Built with Swift and ArgumentParser, it works by adding or removing objects in the test plan’s JSON file. If you use test plans, this tool undoubtedly deserves a spot on your radar.
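For a sense of what such an edit boils down to, here is a simplified sketch, not xctestplanner’s actual implementation: it loads the .xctestplan JSON, appends the given identifiers to each target’s skippedTests array, and writes the file back while leaving every other key untouched.

import Foundation

// Append the given test identifiers to the "skippedTests" array of every
// test target in the plan, preserving all other keys in the JSON.
func skipTests(_ tests: [String], inPlanAt url: URL) throws {
    let data = try Data(contentsOf: url)
    guard var plan = try JSONSerialization.jsonObject(with: data) as? [String: Any],
          var targets = plan["testTargets"] as? [[String: Any]] else {
        throw CocoaError(.fileReadCorruptFile)
    }

    for index in targets.indices {
        let existing = targets[index]["skippedTests"] as? [String] ?? []
        targets[index]["skippedTests"] = existing + tests
    }
    plan["testTargets"] = targets

    let output = try JSONSerialization.data(withJSONObject: plan,
                                            options: [.prettyPrinted, .sortedKeys])
    try output.write(to: url)
}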

There are two main commands for setting tests in a test plan: select and skip. Our Mute System uses the ‘skip’ command to bypass the failed tests by appending them to the ‘skippedTests’ array in the test plan before those tests start to run.

xctestplanner skip TestClass1 TestClass2 -f path/to/testplan.xctestplan
It’s just like turning off the test in Xcode’s interface.

Failures are often the most time-consuming parts of test suites. Mechanisms like asynchronous checks with timeouts and automatic reruns can make a failing test take up to 100% more time than a passing one.

Therefore, by muting tests that we’re certain will fail in the merge request pipelines, we’ve managed to gain substantial advantages in pipeline execution times, often reducing the run time by up to 80%.

At the same time, this practice has made test results easier for developers to examine, making real issues more noticeable and therefore easier to address.

In this case, the test duration decreased from 34 minutes to 26 minutes, a reduction of roughly 24%.

What’s next? A Never-ending Symphony

Here at Trendyol, testing is something we truly enjoy. It is a fundamental pillar in our promise to offer a top-notch app experience to our users. Just like a symphony that plays a never-ending encore, we’re always finding new ways to make our testing better.

In the last two years, our iOS team working on the app has grown fivefold. As the figures suggest, we expect to have about 50,000 tests to look after. This growth means we have to be smart about how we manage things. To ensure our developers keep getting quick and clear feedback, the iOS Platform team is already working on things like dynamic skips and diff-based test execution.

But the best part? We’ll be sharing more about these methods in upcoming articles. If you’re interested in learning more about our evolving testing symphony, stay tuned for deeper insights into our process and the unique strategies we’re embracing at Trendyol.

Want to work on this team?

Do you want to join us on the journey of building the e-commerce platform that has the most positive impact?

Have a look at the roles we’re looking for!

Atakan Karslı · Senior Developer In Test @Trendyol | Curator @Testep