Speed Up Swift: Faster SwiftSyntaxMacros Package Compilation Part 2.
How to enable concurrent compilation of `swift-syntax` in the most optimal way?
Context
I explored a simpler approach to reducing swift-syntax compilation time in Part 1 of this series. Now I am going to take you on a journey through the more sophisticated approaches I tried in order to optimize Swift code for better Type Checker performance.
You’ll learn how I came up with ideas that didn’t work, and how I leveraged the experience of knowing what does not work to create a better and clearer working solution.
Let’s dive in!
Overview of swift-syntax
The swift-syntax repository consists of several packages, of which only two are important within the context of this article:
swift-syntax
CodeGeneration
While swift-syntax is used by SwiftSyntaxMacros, it is also used by the CodeGeneration package within the swift-syntax repository. Now, the interesting part: CodeGeneration generates code using swift-syntax, which is then used inside of… swift-syntax ☯️.
In my previous PR to apple/swift-syntax I mentioned some of the Type Checker slowness within the generated section of swift-syntax, which was not addressed at that time. The reason for this is simple: it was nowhere near as easy to fix as it might have seemed.
Anatomy of CodeGeneration in swift-syntax
The code generation process in the swift-syntax/CodeGeneration package is done step by step in the file named GenerateSwiftSyntax.swift. Apple engineers have applied a smart optimization in this file: large generated files are split into multiple smaller ones. It is similar to the technique I used when the team at Qonto reduced the test build time by more than 45%.
fileSpecs += ["AB", "C", "D", "EF", "GHI", "JKLMN", "OP", "QRS", "TUVWXYZ"].flatMap { (letters: String) -> [GeneratedFileSpec] in
  [
    GeneratedFileSpec(swiftSyntaxGeneratedDir + ["syntaxNodes", "SyntaxNodes\(letters).swift"], syntaxNode(nodesStartingWith: Array(letters))),
    GeneratedFileSpec(swiftSyntaxGeneratedDir + ["raw", "RawSyntaxNodes\(letters).swift"], rawSyntaxNodesFile(nodesStartingWith: Array(letters))),
  ]
}
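To make the letter-group idea concrete, here is a minimal, self-contained sketch of how such groups could partition node names into output files. The node names and the helper nodes(startingWith:in:) are hypothetical and only illustrate the bucketing; the real CodeGeneration package works on its own node definitions.

```swift
// Hypothetical node names; the real list comes from the CodeGeneration package.
let nodeNames = [
    "AccessorBlock", "BinaryOperatorExpr", "ClassDecl", "DeclModifier",
    "EnumDecl", "FunctionDecl", "YieldStmt",
]

// The same letter groups as in the snippet above.
let letterGroups = ["AB", "C", "D", "EF", "GHI", "JKLMN", "OP", "QRS", "TUVWXYZ"]

/// Returns the names whose first character is one of `letters` (hypothetical helper).
func nodes(startingWith letters: [Character], in names: [String]) -> [String] {
    names.filter { name in
        guard let first = name.first else { return false }
        return letters.contains(first)
    }
}

// Map each output file name to the nodes it would contain.
var fileContents: [String: [String]] = [:]
for letters in letterGroups {
    fileContents["SyntaxNodes\(letters).swift"] = nodes(startingWith: Array(letters), in: nodeNames)
}
```

Each group becomes one file, so the compiler can work on the groups in parallel instead of type-checking one huge file on a single core.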
Other than this, files are generated based on “templates”, which are nothing like the Stencil templates used in Sourcery. These templates are written in Swift, in a very similar way to how Swift macros are written. Files are defined using the SourceFileSyntax type, and different blocks are then used to fill in the contents accordingly.
Attempt #1: Split Big Files
What caught my eye was the fact that some files were very big and were taking a significant amount of time to compile:
Having had good experience with splitting Swift source files into multiple files to optimize the compilation pipeline, I tried splitting these files so that every struct would get its own .swift source file. Did this improve compilation duration as much as I had expected, by 50%? I was very naïve, and failed miserably:
Even though the files were split substantially, there are simply not enough CPU cores to fully parallelize the workload. And so, the first attempt failed. I moved on to the next one: smart optimization based on the intrinsic details of every use case. I also noticed that the Swift engineers at Apple had already tested type-checker performance in this package, tuning the code generation configuration with great care and attention.
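As a rough illustration of why splitting stops paying off once there are more files than cores, here is a toy scheduling model. The function name, the greedy scheduling strategy and all durations are assumptions made for illustration. They are not measurements from swift-syntax, and the model ignores per-file overhead such as parsing and module loading.

```swift
/// Estimates wall-clock time when independent compile jobs are spread over `cores`.
/// Assumption: a greedy schedule that hands the longest remaining job to the
/// least-loaded core. Real build systems schedule differently.
func estimatedWallTime(jobDurations: [Double], cores: Int) -> Double {
    precondition(cores > 0, "need at least one core")
    var load = [Double](repeating: 0, count: cores)
    for duration in jobDurations.sorted(by: >) {
        // Find the core that currently has the least work assigned.
        var target = 0
        for core in load.indices where load[core] < load[target] {
            target = core
        }
        load[target] += duration
    }
    // The build finishes when the busiest core finishes.
    return load.max() ?? 0
}

// Hypothetical numbers: 80 seconds of total work on an 8-core machine.
let twoBigFiles = estimatedWallTime(jobDurations: [Double](repeating: 40, count: 2), cores: 8)
let eightFiles = estimatedWallTime(jobDurations: [Double](repeating: 10, count: 8), cores: 8)
let eightyFiles = estimatedWallTime(jobDurations: [Double](repeating: 1, count: 80), cores: 8)
```

In this model, going from two files to eight cuts the estimate from 40 to 10 seconds, but going from eight files to eighty changes nothing: all cores are already busy.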
Attempt #2: Manual Optimization of Generated Code
To “try and see” whether a new approach to code generation would work before modifying the templates in the CodeGeneration package any further, I decided to tinker with the generated code manually to find the weak spots and see what could be done. Let me say, it was very hard to figure out what was going wrong, and even harder to optimize.
The slowest code to compile caught my attention. It is located in swift-syntax/SwiftSyntax/generated/ChildNameForKeyPath.swift. This file contains a simple function with a very large (~3454 LoC) switch statement. The function relies heavily on type inference to match an AnyKeyPath against all of the implicitly typed case labels, which are written with the key-path operator \. My approach was to split the switch statement alphabetically.
/// If the keyPath is one from a layout structure, return the property name
/// of it.
@_spi(RawSyntax)
public func childName(_ keyPath: AnyKeyPath) -> String? {
switch keyPath {}
}
/// If the keyPath is one from a layout structure, return the property name
/// of it.
@_spi(RawSyntax)
public func childNameB(_ keyPath: AnyKeyPath) -> String? {
switch keyPath {}
}
/// If the keyPath is one from a layout structure, return the property name
/// of it.
@_spi(RawSyntax)
public func childNameC(_ keyPath: AnyKeyPath) -> String? {
switch keyPath {}
}
/// If the keyPath is one from a layout structure, return the property name
/// of it.
@_spi(RawSyntax)
public func childNameD(_ keyPath: AnyKeyPath) -> String? {
switch keyPath {}
}
...
While it seemed like it helped, in reality the type checker needed the same amount of time to match the types as before; that time was merely spread across multiple functions.
Total compilation duration was not affected, and so attempt #2 failed. But it gave me a clue for the next attempt, this time in the SyntaxVisitor.swift file. Hint: the problem is not the size of the switch statement, but rather “something else”, and it is different in every case.
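For readers who have not seen this pattern, here is a minimal sketch of a switch over AnyKeyPath with key-path case labels, plus the alphabetical split. The structs AppleSyntax and BananaSyntax and the helpers childNameA and childNameB are hypothetical stand-ins; the real file covers every layout node in swift-syntax, and how the split functions were chained there is an assumption.

```swift
// Hypothetical stand-ins for syntax node structs.
struct AppleSyntax { var color: String; var weight: Int }
struct BananaSyntax { var length: Int }

/// Handles only key paths rooted in types starting with "A".
func childNameA(_ keyPath: AnyKeyPath) -> String? {
    switch keyPath {
    // Each label is a key-path literal that the type checker has to resolve
    // and then compare against the AnyKeyPath value.
    case \AppleSyntax.color: return "color"
    case \AppleSyntax.weight: return "weight"
    default: return nil
    }
}

/// Handles only key paths rooted in types starting with "B".
func childNameB(_ keyPath: AnyKeyPath) -> String? {
    switch keyPath {
    case \BananaSyntax.length: return "length"
    default: return nil
    }
}

/// Tries each alphabetical bucket in turn (assumed way of combining the pieces).
func childName(_ keyPath: AnyKeyPath) -> String? {
    childNameA(keyPath) ?? childNameB(keyPath)
}
```

The number of case labels the type checker has to resolve is the same before and after the split, which is consistent with the total compilation time not changing.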
I’d also like to mention that this “smart split” of a large switch statement did improve the performance of the Xcode source editor: scrolling and find/replace operations became much faster and even smooth!
Attempt #3: SyntaxVisitor.swift
SyntaxVisitor.swift is a large Swift file that provides the essential machinery for traversing Swift code structures and expressions. For that, there is a method named visit(_ node: Syntax). This method calls a private method named visitationFunc, and this is the method that takes more than 400 ms to compile on an M1 Max SoC with 32 GB of RAM.
The code in visitationFunc is simple: it uses a large switch statement in which every case returns a closure, and within this closure the following code is executed:
return {
  self.visitImpl($0, YieldedExpressionsClauseSyntax.self, self.visit, self.visitPost)
}
The method visitImpl has the following signature:
private func visitImpl<NodeType: SyntaxProtocol>(
  _ node: Syntax,
  _ nodeType: NodeType.Type,
  _ visit: (NodeType) -> SyntaxVisitorContinueKind,
  _ visitPost: (NodeType) -> Void
)
There are several hints here on how to approach this Type Checker slowness:
- Avoid use of Generics
- Avoid type inference of the closure argument
Although it sounds reasonable, it did not help at all:
return { (arg: Syntax) in
  self.visitImpl(arg, AccessorBlockSyntax.self, self.visit, self.visitPost)
}
Then I tried to remove generics from the visitImpl method, but was immediately hit with the infamous Swift compiler error:
The compiler is unable to type-check this expression in reasonable time
I opened an issue in apple/swift about this, as well as about a Swift compiler crash that is closely related to this optimization attempt.
So, generics are here to stay. What else? I tried splitting the large switch statement, as I did in Attempt #2 for ChildNameForKeyPath.swift. It did not fix the slow type checking, but it did show that the problem was not directly related to the switch statement.
My next take was to remove the arguments from the visitImpl method and see if that would solve the issue... and it worked! I was able to reduce the compilation time of visitationFunc by more than 90% at once! But we do need those arguments, so what is the actual problem?
Structure of SyntaxVisitor.swift
SyntaxVisitor has a lot of overloaded methods (same name, different parameter type), and those self.visit and self.visitPost references point to exactly these overloads:
open func visit(_ node: YieldStmtSyntax) -> SyntaxVisitorContinueKind {
  return .visitChildren
}
open func visitPost(_ node: YieldStmtSyntax) {
}
open func visit(_ node: YieldedExpressionListSyntax) -> SyntaxVisitorContinueKind {
  return .visitChildren
}
open func visitPost(_ node: YieldedExpressionListSyntax) {
}
open func visit(_ node: YieldedExpressionSyntax) -> SyntaxVisitorContinueKind {
  return .visitChildren
}
...
These methods are then passed as closures to visitImpl, where the generic parameter determines the node type, and so the compiler has to pick the matching method for argument #3 and argument #4 of visitImpl.
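The interplay between the generic parameter and the same-named methods can be reproduced in miniature. The sketch below uses hypothetical stand-in types (Node, NodeProtocol, AlphaNode, BetaNode, Visitor) and a Bool instead of SyntaxVisitorContinueKind; it only mirrors the shape of the generated code and is not taken from swift-syntax.

```swift
// Hypothetical stand-ins for Syntax, SyntaxProtocol and two node types.
struct Node { let kind: String }
protocol NodeProtocol { init(_ node: Node) }
struct AlphaNode: NodeProtocol { let node: Node; init(_ node: Node) { self.node = node } }
struct BetaNode: NodeProtocol { let node: Node; init(_ node: Node) { self.node = node } }

final class Visitor {
    var log: [String] = []

    // Same method names, different parameter types.
    func visit(_ node: AlphaNode) -> Bool { log.append("visit Alpha"); return true }
    func visitPost(_ node: AlphaNode) { log.append("post Alpha") }
    func visit(_ node: BetaNode) -> Bool { log.append("visit Beta"); return true }
    func visitPost(_ node: BetaNode) { log.append("post Beta") }

    private func visitImpl<NodeType: NodeProtocol>(
        _ node: Node,
        _ nodeType: NodeType.Type,
        _ visit: (NodeType) -> Bool,
        _ visitPost: (NodeType) -> Void
    ) {
        let typed = NodeType(node)
        _ = visit(typed)
        visitPost(typed)
    }

    func visitationFunc(for node: Node) -> (Node) -> Void {
        switch node.kind {
        case "alpha":
            // `self.visit` alone is ambiguous; the type checker has to combine it
            // with `AlphaNode.self` to select the matching method.
            return { self.visitImpl($0, AlphaNode.self, self.visit, self.visitPost) }
        case "beta":
            return { self.visitImpl($0, BetaNode.self, self.visit, self.visitPost) }
        default:
            return { _ in }
        }
    }
}

let visitor = Visitor()
let betaNode = Node(kind: "beta")
visitor.visitationFunc(for: betaNode)(betaNode)
```

With two node types this resolves instantly. In the real file every case presents the type checker with the full set of same-named candidates, which is a plausible reason for the cost adding up.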
I have tried to cast the closures to ease the pain of Type Checker:
return { (arg: Syntax) in
- self.visitImpl(arg, AccessorBlockSyntax.self, self.visit, self.visitPost)
+ self.visitImpl(arg, AccessorBlockSyntax.self, self.visit as ((AccessorBlockSyntax) -> SyntaxVisitorContinueKind), self.visitPost as ((AccessorBlockSyntax) -> Void))
}
But unfortunately it did not help with the compilation duration. Next I tried to wrap argument #3 and argument #4 in additional closures:
- self.visit
+ { (arg: AccessorBlockSyntax) -> SyntaxVisitorContinueKind in return self.visit(arg) }
This also did not help with the Type Checker performance, but it gave me the right idea: to get rid of the overloaded method references completely!
To do this, I needed to change the CodeGeneration of SyntaxVisitorFile.swift:
With a separate method signature for every type, the Type Checker does not need to figure anything out using generics and type inference. It is all there before compilation even starts!
And voila! It actually worked! The same was applied to the SyntaxRewriter.swift file as well…
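One way to picture “separate method signatures for every type” is a dedicated, non-generic helper per node type. The sketch below is an assumption about the general shape, not the code from the PR: the types Node, AlphaNode, BetaNode and the helper names visitImplAlphaNode and visitImplBetaNode are hypothetical.

```swift
// Hypothetical stand-ins for Syntax and two node types.
struct Node { let kind: String }
struct AlphaNode { let node: Node }
struct BetaNode { let node: Node }

final class Visitor {
    var log: [String] = []

    func visit(_ node: AlphaNode) -> Bool { log.append("visit Alpha"); return true }
    func visitPost(_ node: AlphaNode) { log.append("post Alpha") }
    func visit(_ node: BetaNode) -> Bool { log.append("visit Beta"); return true }
    func visitPost(_ node: BetaNode) { log.append("post Beta") }

    // One non-generic helper per node type: the argument type is concrete,
    // so the calls to visit and visitPost are resolved from it directly.
    private func visitImplAlphaNode(_ node: Node) {
        let typed = AlphaNode(node: node)
        _ = visit(typed)
        visitPost(typed)
    }

    private func visitImplBetaNode(_ node: Node) {
        let typed = BetaNode(node: node)
        _ = visit(typed)
        visitPost(typed)
    }

    func visitationFunc(for node: Node) -> (Node) -> Void {
        switch node.kind {
        // Each case names exactly one function; there is nothing to infer.
        case "alpha": return self.visitImplAlphaNode(_:)
        case "beta": return self.visitImplBetaNode(_:)
        default: return { _ in }
        }
    }
}

let visitor = Visitor()
let alphaNode = Node(kind: "alpha")
visitor.visitationFunc(for: alphaNode)(alphaNode)
```

The trade-off is more generated code, which is acceptable here because the file is produced by CodeGeneration rather than written by hand.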
Results
I measured the build duration of swift-syntax using the following command:
time xcodebuild -scheme swift-syntax-Package ONLY_ACTIVE_ARCH=YES -destination 'platform=macos' clean build > /dev/null
The output did not show significant improvements:
I then used SwiftCompilationTimingParser and processed its output as CSV, with the following results:
And most importantly, the following measurements were made using Xcode itself:
Altogether, these changes, along with other minor improvements, sped up the swift-syntax build by almost 2%! Personally, after weeks of trial and error, it was very satisfying to see a measurable impact that will eventually improve the developer experience ❤️ for all Apple platform developers and engineers, and will save CI usage across the globe once this new PR is merged and a new version of swift-syntax is embedded into a new Xcode release. Part 3 of this series will take an even deeper dive into this topic.
While the Swift programming language is safe and secure, relying heavily on the Type Checker, its compilation performance tends to slow things down, sometimes far too much. For example, in my video presentation at Bitrise DevOps Summit 2023 (the video is available on demand for free!), I showcase how a simple type inference might have cost tens of thousands of dollars in CI usage over 6 months, while also slowing down the development process in general.
Want to Connect?
Follow me on X (Twitter): @r_alikhamov or LinkedIn 🤝
Resources
- PR to apple/swift-syntax — https://github.com/apple/swift-syntax/pull/2328
- Previous PR to apple/swift-syntax — https://github.com/apple/swift-syntax/pull/2308
- swift-syntax — https://github.com/apple/swift-syntax
- Swift Type Checker — https://github.com/apple/swift/blob/main/docs/TypeChecker.md
- SwiftCompilationTimingParser — https://github.com/qonto/SwiftCompilationTimingParser