Swift Sequences

Gordon Fontenot

We’re incredibly excited about the new Swift programming language announced by Apple at this year’s WWDC. As a way of experimenting, we’ve begun looking into what it would be like if we rewrote Liftoff, our command line Xcode project generation/configuration tool, in Swift.

Liftoff supports a few options on the command line, so the first thing we’re trying to do is write a small command line parsing library in Swift.

We want to try avoiding importing Foundation, so we are relying on the top level constants C_ARGV and C_ARGC to get the arguments passed on the command line. Instead of working with these primitive types, we’d really rather have our own object that can hold onto a native String[]. By implementing the Sequence protocol, we could quickly iterate over these options to do whatever we need to do with them.

EDIT: After publishing this post, I received feedback from a few people about a different way to get arguments from the command line. Apparently, there is a Process struct available without importing Foundation. This struct has an arguments property, and it conforms to Sequence. This knowledge is gained entirely through trial and error, as there is currently no available documentation on it. In fact, this struct doesn’t even show up in Xcode’s generated header. I’ll proceed with this post using C_ARGV and C_ARGC, but in the final version of this, I’ll probably end up using Process.arguments.

Creating the Argument List

The requirements for the ArgumentList object are as follows:

  • Instantiate it with C_ARGV and C_ARGC
  • Transform those into a native property with the type String[]

C_ARGV is of the type UnsafePointer<CString>. It contains all of the arguments passed to our process from the command line. From the type definition alone, we know that the internal contents of the object are instances of CString. This is good, because it means that once we get to those contents, we can use the method fromCString() on String to convert them to a nicer type. We also know that we’ll be able to access the contents via subscripting, but since UnsafePointer doesn’t conform to Sequence itself, we can’t iterate through it.

C_ARGC is of the type CInt. It represents the number of arguments that were passed to our object on the command line. We can use this to generate a loop so that we can convert each CString inside C_ARGV into a String.

We can start with a struct:

struct ArgumentList {
    var arguments: String[]

    init(argv: UnsafePointer<CString>, count: CInt) {
    }
}

Here, we’ve defined a basic constructor that will take C_ARGV and C_ARGC, and a property named arguments of the type String[]. So now, we can implement our constructor to loop through the provided input from the command line and convert the arguments into String instances:

init(argv: UnsafePointer<CString>, count: CInt) {
    for i in 1..count {
        let index = Int(i);
        let arg = String.fromCString(argv[index])
        arguments.append(arg)
    }
}

This gives us an object that satisfies our basic requirements. Now we can start to look into what it would take to conform this object to Sequence.

Inspecting Sequence

Now that we have an object that behaves how we want as a container, we can start to implement the methods that will let us transparently iterate through the internal list.

The protocol that lets us do this is called Sequence, and although it seems very straightforward, it took three of us in a room watching the Advanced Swift session video, looking through the session slides, and implementing it three times to fully understand what we needed to do.

So here’s a quick overview of how the protocol works when used with for in:

When you use the for <object> in <sequence> syntax, Swift actually does some re-writing of the code under the covers. As described in the Advanced Swift session, when you write:

for x in mySequence {
    // iterations here
}

Swift actually turns that into:

var __g: Generator = mySequence.generate()
while let x = __g.next() {
    // iterations here
}

So, breaking this down:

  • Swift calls generate() on the provided Sequence, returning a Generator. This object is stored in a private variable, __g.
  • __g then gets called with next(), which returns a type of Element?. This object is conditionally unwrapped in the while let statement, and assigned to x.
  • It then performs this operation until next() has nothing left to return.

For the record, I’m not crazy about the naming here. I think it’s probably best to think Enumerator instead of Generator, at least in this use case. I’ve filed a radar to this effect, but have already gotten some feedback that this change might not be so simple.

So it looks like we’ll need to actually implement two protocols to conform to Sequence. We’ll need our ArgumentList to conform to Sequence, and we’ll need another object to conform to Generator. We can start with Generator, since it’s the one that’s actually going to be doing the work.

Implementing Generator

As previously shown, we’ll need to implement one method for Generate: next(). This method has the return type of Element?, which is really just a catch-all type defined internally due to some weirdness with protocols and Generics. For now, we’ll ignore that, and think of it as being <T>?. The important thing to get is that we need to return an Optional.

In order to iterate through our array of arguments, we’re going to use a new type: Slice. This type holds a reference to a range of an existing array. This is a bit odd, but essentially, if we create a Slice with a range from an Array, and then update that Array, the Slice is updated as well:

let array: Array = ["foo", "bar", "baz"]
let slice: Slice<String> = array[1...2]
println(slice) // prints ["bar", "baz"]

array[1] = "bat"
println(slice) // prints ["bat", "baz"]

Note that I’m adding the type signatures for those constants for illustrative purposes. The return type of a range of an array is already Slice<T>, so Swift is able to infer this information.

We’ll create a light weight ArgumentListGenerator that conforms to Generator, and has an internal items property:

struct ArgumentListGenerator: Generator {
    var items: Slice<String>
}

If you try to compile, you’ll see that the compiler throws an error, because we haven’t implemented Generator properly. We need to implement next() for the compiler to be happy:

mutating func next() -> String? {
  if items.isEmpty? { return .None }
  let element = items[0]
  items = items[1..items.count]
  return element
}

Our implementation performs a quick check to see if our Slice is empty, and performs an early return with Optional.None if so. Note that since the return type is already Optional<String>, we can omit the Optional prefix for the enum.

We can then grab the top item from items, then reset items to the rest of the Slice. This is why we declared items as mutable, and also why we declared next() as mutating.

Now note that none of this implementation is specific to Strings, or even to our ArgumentList. In fact, with a quick refactor, we can modify this object to use Generics:

struct CollectionGenerator<T>: Generator {
    var items: Slice<T>

    mutating func next() -> T? {
        if items.isEmpty { return .None }
        let item = items[0]
        items = items[1..items.count]
        return item
    }
}

This is such a generic object that seems to solve so much of the common use case here, I’m a bit baffled as to why it hasn’t been included as a part of the standard library. I’ve already filed a radar on the issue.

Now that we have our Generator, we can finally conform our ArgumentList to Sequence.

Implementing Sequence

We can start by creating an extension on ArgumentList to hold the required method:

extension ArgumentList: Sequence {
}

We can then declare the required method, generate():

extension ArgumentList: Sequence {
    func generate() -> CollectionGenerator<String> {
    }
}

Note that we’re using our generic CollectionGenerator<T> type as the return type here. All that’s left is to create a Collection Generator with our arguments:

extension ArgumentList: Sequence {
    func generate() -> CollectionGenerator<String> {
        return CollectionGenerator(items: arguments[0..arguments.endIndex])
    }
}

Now, we can quickly and easily create a list of arguments passed on the command line, and iterate through them using for in:

let arguments = ArgumentList(argv: C_ARGV, count: C_ARGC)

for argument in arguments {
    println(argument)
}

EDIT: After getting some feedback on Twitter, it looks like there are two other solutions to this problem:

Matt Bridges pointed out that there is a GeneratorOf<T> type that takes a closure as a constructor argument. That means that our generate() method can be changed to look like so:

func generate() -> GeneratorOf<String> {
    var i = 0
    return GeneratorOf<String> {
        if (i > self.arguments.count) {
            return .None
        } else {
            return self.arguments[i++]
        }
    }
}

But there’s actually an even better solution. Looking closely at the type signature for Array<T>, you can see that it’s returning a struct of the type IndexingGenerator<T[]>. That means that instead of creating our own generator, we could return the result of arguments.generate(), as long as we set our return type to IndexingGenerator<String[]>:

func generate() -> IndexingGenerator<String[]> {
    return arguments.generate()
}

Thanks to Adam Roben for pointing this out to me.

Both of these solutions remove our need for any custom Generator implementation at all, but it’s still worth understanding how all of these pieces fit together.

What’s next