Introducing XCRemoteCache: The iOS Remote Caching Tool that Cut Our Clean Build Times by 70%

November 16, 2021 Published by Bartosz Polaczyk, Senior Engineer

At Spotify, we constantly work on creating the best developer experience possible for our iOS engineers. Improving build times is one of the most common requests for infrastructure teams and, as such, we constantly seek to improve our infrastructure toolchain.

We are excited to be open sourcing XCRemoteCache, the library we created to mitigate long local builds. As the name suggests, this library is a remote caching implementation for iOS projects with an aim to reuse Xcode target artifacts generated on Continuous Integration (CI) machines. It supports Objective-C, Swift, and ObjC+Swift targets and can be easily integrated with existing Xcode projects, including ones managed by CocoaPods or Carthage.

Best of all, XCRemoteCache resulted in a 70% decrease in clean build times (we classify a build as clean when at least 50% of all targets compile at least one file).

Background

Using our Xcode build metrics (for more details on how this works, take a look at our open source XCMetrics project), we found that building the main Spotify iOS application often takes our developers more than 10 minutes. Even though these builds are relatively rare (less than 3% of all builds), they account for more than 50% of total build time (Figure 1).

Further investigation revealed that long-lasting builds usually happen after rebasing or merging remote branches, which made a remote cache solution the perfect fit.

Figure 1: Distribution of local machines’ build times and their total build times.

Remote cache principle

A remote cache is a popular technique to speed up builds of big applications by applying the “compile once, use everywhere” approach. As long as all input files and compilation parameters are the same, instead of building a target locally, one can download artifacts that were built and shared by some other machine. A key success factor for remote caching is finding an optimal caching level. Caching units that are too granular, where every single piece of the compilation step is cacheable, may lead to extensive network traffic overhead, which can offset CPU savings. On the other hand, putting the entire codebase into a single cacheable unit may significantly degrade the cache hit rate: every single local change invalidates the remotely available cache artifacts and triggers a full local build.
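
To make the “compile once, use everywhere” idea concrete, here is a minimal sketch in Python. It is not how XCRemoteCache is implemented: the names fingerprint and ArtifactCache, the in-memory store, and the choice of SHA-256 are all assumptions made for illustration. The only point is that the cache key is derived from the input files and the compilation parameters, so identical inputs map to the same stored artifact.

```python
import hashlib


def fingerprint(inputs: dict[str, bytes], compiler_flags: list[str]) -> str:
    """Hash input file contents and compilation parameters into one cache key.

    Hypothetical helper: SHA-256 and the exact byte layout are assumptions.
    """
    digest = hashlib.sha256()
    for path in sorted(inputs):      # sort so dict ordering never changes the key
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(inputs[path])
        digest.update(b"\0")
    for flag in compiler_flags:      # flag order is kept, since it can matter
        digest.update(flag.encode())
        digest.update(b"\0")
    return digest.hexdigest()


class ArtifactCache:
    """In-memory stand-in for a remote artifact store (illustration only)."""

    def __init__(self):
        self._store = {}

    def build_or_reuse(self, inputs, compiler_flags, build):
        """Return (artifact, was_cache_hit); call build() only on a miss."""
        key = fingerprint(inputs, compiler_flags)
        if key in self._store:
            return self._store[key], True
        artifact = build(inputs, compiler_flags)
        self._store[key] = artifact
        return artifact, False
```

With this shape, the caching level is decided by what goes into inputs: one file per key gives very granular units, the whole codebase under one key gives a single unit that any change invalidates.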

The main Spotify iOS application is highly modularized and contains more than 400 independent modules configured as separate Xcode targets. Applying target-level caching was natural, and as we found out later, the right decision.

Designing the remote cache solution 

In the design phase, our aim was to come up with a solution generic enough that it could be applied to a broad range of iOS applications with minimal or no project changes. That was an ambitious goal given how the Xcode build system works. Before going straight to the applied solution, let’s consider how an Xcode build actually works. 

How does XCRemoteCache work?

In general, all caching mechanisms use a fingerprint of input files to recognize whether build products can be reused. However, finding a precise set of these files is a nontrivial task. The Xcode build system is very liberal when it comes to dependency attribution. It optimistically looks for the definition of a dependency (header files, .swiftmodule, etc.) in all available search paths, whether provided by the developer in the Xcode project settings or located in the current build product directory. As a result, developers don’t have to explicitly specify all dependencies a target uses; they just have to make sure that those dependencies are placed in the correct location before Xcode actually needs them. Because compilers find required dependencies implicitly, the simple approach of generating a fingerprint by hashing all available files in the header and framework search paths does not work well: the list of files to consider would often be too broad.

On the other hand, we observed that Xcode works quite well for local incremental builds and executes only the narrow subset of steps affected by changes since the last build. In other words, the Xcode build system knows which files are the actual input files for the compilation, but that list is generated as the compiler’s output (.d files) and is not available ahead of compilation.
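
For readers who have not looked inside a .d file: it is a Makefile-style dependency listing of the form "output: input1 input2 ...", with backslashes for line continuations and for escaping spaces in paths. The sketch below parses that format into a list of input files. The function name parse_dependency_file is hypothetical, and the parser is deliberately simplified (for example, it assumes paths contain no colons), so treat it as an illustration of the data rather than a robust reader.

```python
import re


def parse_dependency_file(text: str) -> list[str]:
    """Extract input file paths from Makefile-style .d file content.

    Simplified sketch: handles line continuations and backslash-escaped
    spaces, and assumes no path contains a colon.
    """
    joined = text.replace("\\\n", " ")  # merge continuation lines
    inputs = []
    for line in joined.splitlines():
        _, separator, dependencies = line.partition(":")
        if not separator:
            continue  # not a "target: deps" rule
        # Split on whitespace that is not escaped with a backslash.
        for token in re.split(r"(?<!\\)\s+", dependencies.strip()):
            path = token.replace("\\ ", " ")
            if path and path not in inputs:  # keep first-seen order, no duplicates
                inputs.append(path)
    return inputs
```

Feeding it the text "/build/Foo.o: /src/Foo.m \" followed by a continuation line "  /src/Foo.h" yields the two source paths, which is exactly the kind of list a fingerprint needs but that only exists once the compiler has run.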

XCRemoteCache applies a unique approach to automatically identify all input files of the compilation based on Git history and dependency lists provided as a compilation output. The generation side, called producer mode, along with the compilation product, also uploads a meta file. That file contains a list of all compilation files the compiler used and the full SHA-1 (Secure Hash Algorithm 1) commit identifier it was built against. The producer mode should run on CI for each primary branch (like master or develop) commit, as a part of the post-merge phase.

On the consumer side (aka consumer mode), XCRemoteCache finds the most recent commit in the history for which the remote server contains build artifacts and builds a fingerprint based on the input files listed in the meta file. Let’s imagine two developers, A and B, working on their local branches featureA and featureB, which branch out from master at Commit1 and Commit3, respectively (Figure 2). The CI job that produces and uploads cache artifacts to the central server has finished its work only for commits 1, 2, and 4. For some reason, Commit3 artifacts are not ready: either the build is in progress or it has failed. Developer A’s machine will reuse the artifacts generated for Commit1, while Developer B’s machine will first try Commit3, find that its artifacts are not available, and fall back to Commit2.

Figure 2: Finding a commit with artifacts to reuse.

With that procedure, XCRemoteCache gets a strict list of input files almost for free.
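
The commit lookup from the example above can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's code: newest_cached_commit is a hypothetical name, history is assumed to be the list of primary-branch commits reachable from the local branch (newest first), and has_artifacts stands in for whatever check the consumer runs against the remote server.

```python
from typing import Callable, Optional


def newest_cached_commit(
    history: list[str], has_artifacts: Callable[[str], bool]
) -> Optional[str]:
    """Return the most recent commit whose artifacts are available remotely.

    `history` must be ordered newest first. Returns None when no commit
    in the history has artifacts, in which case a full local build is needed.
    """
    for sha in history:
        if has_artifacts(sha):
            return sha
    return None


# The situation from Figure 2: CI has finished only for commits 1, 2, and 4.
uploaded = {"Commit1", "Commit2", "Commit4"}
developer_a = newest_cached_commit(["Commit1"], uploaded.__contains__)
developer_b = newest_cached_commit(
    ["Commit3", "Commit2", "Commit1"], uploaded.__contains__
)
```

Here developer_a resolves to Commit1 and developer_b to Commit2, matching the walkthrough: Commit3 is tried first for Developer B and skipped because nothing was uploaded for it.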

Assuming we have an app split into several independent targets and local branches that don’t diverge much from the primary branch, the cache hit rate will be high: the only misses are the targets that contain changes relative to the commit whose remote artifacts are used.
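
To see why the hit rate stays high, consider the sketch below. It swaps the real fingerprint comparison for a simpler stand-in: a target is treated as reusable when none of the input files listed in its meta file appear in the set of locally changed files. The function name split_targets, the shape of meta_inputs, and the target and file names are all hypothetical.

```python
def split_targets(
    meta_inputs: dict[str, set[str]], changed_files: set[str]
) -> tuple[list[str], list[str]]:
    """Split targets into (reusable, must_rebuild).

    `meta_inputs` maps a target name to the input files recorded for it;
    `changed_files` holds paths that differ from the commit whose artifacts
    are being reused. Simplification: real fingerprints are replaced by a
    set intersection on paths.
    """
    reusable, must_rebuild = [], []
    for target in sorted(meta_inputs):
        if meta_inputs[target] & changed_files:
            must_rebuild.append(target)   # at least one input changed locally
        else:
            reusable.append(target)       # every recorded input is untouched
    return reusable, must_rebuild


meta = {
    "Core": {"Core/Core.h", "Core/Core.m"},
    "Player": {"Player/Player.swift", "Core/Core.h"},
    "Search": {"Search/Search.swift"},
}
leaf_change = split_targets(meta, {"Search/Search.swift"})
shared_header_change = split_targets(meta, {"Core/Core.h"})
```

A change in Search/Search.swift costs one local target build, while a change in the shared header Core/Core.h invalidates every target that listed it as an input. That is the trade-off behind keeping modules independent.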

Build artifacts portability

Another problem to consider is “build artifacts portability” between multiple machines. Several types of compilation output files include absolute paths, so for full compatibility, some kind of normalization is required. For iOS projects, such a step is required for .swiftmodule files and debug symbols. 

Projects cloned to /dir1 and /dir2 generate .swiftmodule files that do not match on a byte level. A .swiftmodule file represents an inter-module API (the Swift counterpart to .h files) and can be included in the list of fingerprinted files. To avoid false cache misses when two machines don’t share the same absolute source root path, XCRemoteCache carries an extra fingerprint file (called a fingerprint override) next to the .swiftmodule that includes a fingerprint of all the files used in the compilation step. A fingerprint override is path agnostic, so contrary to the .swiftmodule file, it can be used as a byte-level stable fingerprint of a Swift target.
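
One way to picture a path-agnostic fingerprint is the Python sketch below. It is an assumption-laden illustration, not the format XCRemoteCache uses: the function name, the use of SHA-256, and the idea of hashing paths relative to the source root are all choices made here for the example. What matters is that the absolute clone location never enters the hash.

```python
import hashlib
from pathlib import Path


def path_agnostic_fingerprint(source_root: Path, input_files: list[Path]) -> str:
    """Fingerprint files by root-relative path and content only.

    Hypothetical sketch: two clones of the same sources produce the same
    value regardless of where they live on disk.
    """
    digest = hashlib.sha256()
    entries = sorted(
        (file.relative_to(source_root).as_posix(), file) for file in input_files
    )
    for relative_path, file in entries:
        digest.update(relative_path.encode())  # /dir1 vs /dir2 is stripped here
        digest.update(b"\0")
        digest.update(hashlib.sha256(file.read_bytes()).digest())
    return digest.hexdigest()
```

Calling it once with source_root=/dir1 and once with source_root=/dir2 on identical checkouts gives the same string, whereas changing the content of any listed file changes the result.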

Debug symbols, the other kind of path-sensitive file, are appended to the binary package to associate the machine code with the corresponding source location when debugging. XCRemoteCache leverages LLDB’s support for runtime path rewrites via the settings set target.source-map command. Both producer and consumer pass debug-prefix-map parameters to the Swift and Clang compilers to align the source root of all debug symbols, making the LLDB source mapping a simple, single-line command.
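
The sketch below shows how the two halves fit together, using Python only to assemble the strings. The compiler flags (-fdebug-prefix-map for Clang, -debug-prefix-map for Swift) and the LLDB setting are real, but the placeholder root /xcrc-fake-root and the helper names are assumptions for this example, and the code assumes the source root contains no spaces.

```python
# Hypothetical placeholder that every machine writes into debug symbols
# instead of its own absolute source root.
FAKE_ROOT = "/xcrc-fake-root"


def clang_debug_flags(source_root: str) -> list[str]:
    """Clang flag that rewrites the local source root to the shared placeholder."""
    return [f"-fdebug-prefix-map={source_root}={FAKE_ROOT}"]


def swift_debug_flags(source_root: str) -> list[str]:
    """Swift compiler flag with the same rewrite (flag and value are separate args)."""
    return ["-debug-prefix-map", f"{source_root}={FAKE_ROOT}"]


def lldb_source_map_command(source_root: str) -> str:
    """LLDB command that maps the placeholder back to this machine's checkout."""
    return f"settings set target.source-map {FAKE_ROOT} {source_root}"
```

Because producer and consumer both compile against the same placeholder, an artifact built on CI and one built locally agree on source paths, and each developer only needs the single LLDB line pointing the placeholder at their own clone.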

The Spotify story

At Spotify, we have fast, well-optimized CI jobs that no longer slow down our development feedback loop (please head to How We Gave Superpowers to Our macOS CI to read more about that), so we focused on applying XCRemoteCache on local machines. Keep in mind that the tool is able to work on CI machines and accelerate PR jobs, too.

In controlled conditions, XCRemoteCache was able to cut the build time of the very first Spotify iOS application build by 85%. This was a great achievement, but in practice, developers introduce changes locally, and some targets have to be compiled locally. Our goal was to evaluate real-world scenarios to understand the tool’s true impact. To estimate that, we rolled out the remote cache to 50% of our developers for a week and compared all build metrics.

Results exceeded our expectations: we observed a huge improvement in local build times, with median clean build and incremental build times decreasing by 70% and 6%, respectively. We classify builds as clean when at least 50% of all targets compile at least one file. Other builds that compile at least one file are incremental.

We enabled XCRemoteCache in our main application more than a year ago, and since then, it has worked flawlessly. We get very positive feedback from our developers and, as a side effect, we’ve seen the portion of clean builds almost double compared to the pre-remote cache times. Developers nowadays are twice as likely to rebase their working branches, eventually leading to fewer conflicts when creating a pull request.

How to integrate XCRemoteCache into your existing project

Now that XCRemoteCache is open source, you can apply it to your own project with minimum effort. It supports multiple project setups, including the ones managed by CocoaPods, Carthage, or any other custom dependency management. Keep in mind, though, that for best results, your project should be split into several targets or you risk frequent cache invalidations, as described above.

For seamless integration, we are open sourcing a CocoaPods plugin and provide an automated script to modify the existing .xcodeproj project.

XCRemoteCache works with any HTTP server that supports PUT, HEAD, and GET requests. You are free to pick a server that works best for you, including the two popular storage options, Amazon S3 and Google Cloud Storage. We also provide a simple Docker image that hosts a local server, perfect for the development phase.
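
If you want to check that a candidate server speaks the three verbs, a small client like the Python sketch below is enough. It uses only the standard library. The helper names and the URL layout (base URL plus an artifact key) are assumptions for this example and do not describe the paths XCRemoteCache itself requests.

```python
import urllib.error
import urllib.request
from typing import Optional


def build_request(
    method: str, base_url: str, key: str, body: Optional[bytes] = None
) -> urllib.request.Request:
    """Create a request for one artifact key (the URL layout is hypothetical)."""
    url = f"{base_url.rstrip('/')}/{key}"
    return urllib.request.Request(url, data=body, method=method)


def artifact_exists(base_url: str, key: str) -> bool:
    """HEAD: check for an artifact without downloading it."""
    try:
        with urllib.request.urlopen(build_request("HEAD", base_url, key)) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # other failures (auth, server errors) should not look like a miss


def download_artifact(base_url: str, key: str) -> bytes:
    """GET: fetch the artifact bytes."""
    with urllib.request.urlopen(build_request("GET", base_url, key)) as response:
        return response.read()


def upload_artifact(base_url: str, key: str, payload: bytes) -> None:
    """PUT: store the artifact bytes under the given key."""
    with urllib.request.urlopen(build_request("PUT", base_url, key, payload)):
        pass
```

Running upload_artifact, then artifact_exists, then download_artifact against your server with a throwaway key is a quick round-trip test before wiring the server into a real project.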

With that in place, you should be able to try out XCRemoteCache within minutes. For a list of integration steps, head to the How-to section in the GitHub repo.

Contributing to XCRemoteCache

The XCRemoteCache tool is written fully in Swift, so iOS developers can easily familiarize themselves with the codebase and potentially contribute to it. We tried to cover most of the common scenarios, but because Xcode projects may have very custom setups, some of them may not be compatible right now. Therefore, any input from the community, whether raising issues or opening pull requests, is very welcome. We believe that, together, we will be able to move the project even further and support a wider audience of iOS developers.

If you want to contribute to the codebase, make sure to check out our development guide. And if you want to work full-time on tools like that, please check out our open job positions.

I want to personally thank Erick Camacho for his help in preparing this blog post.

Xcode is a trademark of Apple Inc., registered in the U.S. and other countries.

