WIP: [Swift+Wasm] initial support for compiling Swift to WebAssembly #24684

Closed
wants to merge 1,843 commits into from


zhuowei
Contributor

@zhuowei zhuowei commented May 10, 2019

What's in this pull request?

This pull request adds initial support for compiling Swift code to WebAssembly.

"Hello world" works, and a large subset of the stdlib already works on WebAssembly.

You can try this yourself with our cloud-hosted toolchain incorporating this port:

https://swiftwasm.org

This patch uses the WASI SDK, so WebAssembly executables generated by this port will work both in browsers and in standalone WebAssembly runtimes such as Wasmtime or Fastly's Lucet.

Links to issues

See SR-9307 for some background and the existing Swift+WASM changes in Swift that this patch depends on.

Also see emscripten#2427 for discussion and previous attempts on porting Swift to WebAssembly, which this port draws heavily upon.

Thanks

Thank you to everyone who helped make this possible:

If you would like to help, you can join us at https://github.com/swiftwasm.

Status of the port

This port is not ready for merging. The biggest blocking issues currently are:

  • Tests are disabled
  • Crash when passing non-throwing closure to function taking throwing closure

We're opening this pull request now to get advance feedback and advice, so we can fix these remaining issues and start cleaning up our patches.

Per Swift's contributing guidelines, we're planning to split each change into a separate pull request.

We've already created pull requests for some minor changes:

We welcome advice on how best to submit these changes for review.

Note that this port also requires changes to Clang and LLVM: the corresponding pull requests are:


Here's a more detailed explanation of the changes included in this pull request, and on what still needs to be done.

How WebAssembly differs from other platforms

WebAssembly is a new platform, with unique attributes that pose issues to Swift's runtime.

  • Functions have strict argument checking
    • Swift often calls functions with extra arguments
    • We couldn't find a way to fix this
    • So, for example, Optional.map crashes because it tries to call a non-throwing closure with an extra error pointer
  • Limited relocation support in the linker: it can't take the difference between two symbols
    • Swift's metadata relies on this. Our solution: switch to absolute pointers
    • This is not the first port that required this
    • We would like to find a way to merge absolute metadata support
    • or find a different workaround
    • e.g. only emitting bitcode for LTO, avoiding intermediate .wasm files
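
To make the relocation point concrete, here is a minimal C sketch of the two reference styles. It is an illustration only, not the Swift runtime's actual code: the type and function names (RelativeRef, AbsoluteRef, init_relative and so on) are hypothetical. A relative reference stores the distance from the field to its target, which is exactly the "difference between two symbols" the linker would have to compute; an absolute reference just stores the address.

```c
#include <stdint.h>

/* Hypothetical relative reference: the field holds the distance from its
   own address to the target, not the target's address. */
typedef struct {
    int32_t relative_offset;
} RelativeRef;

/* Resolve a relative reference: target = address of the field + offset. */
const void *resolve_relative(const RelativeRef *ref) {
    return (const char *)ref + ref->relative_offset;
}

/* Hypothetical absolute reference: the field simply holds the address. */
typedef struct {
    const void *target;
} AbsoluteRef;

const void *resolve_absolute(const AbsoluteRef *ref) {
    return ref->target;
}

/* Helper for this example only: computes at run time the offset that a
   linker with symbol-difference relocations would fill in at link time.
   Both pointers are assumed to lie inside the same object. */
void init_relative(RelativeRef *ref, const void *target) {
    ref->relative_offset = (int32_t)((const char *)target - (const char *)ref);
}
```

The relative form needs a link-time value of the shape "target minus field", while the absolute form needs only an ordinary address relocation.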

What's done

Swift changes

LLVM changes

0x0 -> value witness table pointer
0x4 -> metadata entry
... etc

https://github.com/apple/swift/blob/master/docs/ABI/TypeMetadata.rst

So, to make the metadata entry at 0x4 addressable on its own, a symbol is exported using LLVM's alias directive, pointing 4 bytes past the value witness table pointer.

Other metadata use similar aliases, again to export a symbol at an offset inside another symbol.

Clang doesn't emit alias directives with offset, so this wasn't already implemented by LLVM's Wasm backend. This change adds it.
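
As a rough C analogy for what an alias with an offset expresses, the sketch below treats the second symbol as a pointer into the middle of a larger object. The struct and function names are hypothetical, and the 4-byte fields only model the wasm32 layout from the listing above; the real metadata layout is described in TypeMetadata.rst.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the layout listed above, using 4-byte fields to
   stand in for wasm32 pointers. */
typedef struct {
    uint32_t value_witness_table;  /* offset 0x0 */
    uint32_t entry;                /* offset 0x4: start of the metadata entry */
} FullMetadata;

/* The aliased symbol: a second name for an address inside the full object. */
const uint32_t *metadata_address_point(const FullMetadata *full) {
    return (const uint32_t *)((const char *)full + offsetof(FullMetadata, entry));
}

/* Starting from the aliased address, step back to the enclosing object to
   reach the value witness table pointer stored just before it. */
uint32_t witness_table_from_address_point(const uint32_t *address_point) {
    const FullMetadata *full = (const FullMetadata *)
        ((const char *)address_point - offsetof(FullMetadata, entry));
    return full->value_witness_table;
}
```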

  • cherry-pick wasm linking info v2 patch from LLVM master:

Clang changes

What still needs to be done

Most important: fix swift calling convention and extra arguments

WebAssembly has strict function signature checking, so this crashes:

struct Test {
    var a:Int
}
print(Test(a: 1))

inside Optional.map.

Why? Optional.map takes a throwing closure, but is passed a non-throwing closure.

A throwing closure is compiled down to a signature similar to this:

void closure(void* arg1, void* arg2, void* swiftSelf, error** swiftError)

A non-throwing closure has signature like this:

void closure(void* arg1, void* arg2, void* swiftSelf)

without the trailing error pointer.

Swift assumes that extra arguments passed through a function pointer are ignored by the callee, so it doesn't generate a thunk if a non-throwing closure is called as a throwing closure, or if a thin function is called as a thick function.

lib/IRGen/GenFunc has a comment that explains this further.

This assumption is valid on all platforms that Swift currently supports, but doesn't work on WebAssembly thanks to the strict signature checking.
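
For illustration, this is roughly what an explicit adapter thunk would look like if written by hand in C. It is a sketch of the general technique, not something the Swift compiler emits today, and every name in it (Error, ThunkContext, throwing_thunk, increment) is hypothetical. The thunk has the four-argument throwing signature and forwards to a three-argument function, so each call goes through a pointer whose type matches the callee exactly.

```c
#include <stddef.h>

typedef struct Error Error;  /* opaque placeholder for an error object */

/* The two signatures from the discussion above. */
typedef void (*ThrowingFn)(void *arg1, void *arg2, void *swiftSelf, Error **swiftError);
typedef void (*NonThrowingFn)(void *arg1, void *arg2, void *swiftSelf);

/* Hypothetical context: pairs the wrapped function with its own self value. */
typedef struct {
    NonThrowingFn fn;
    void *self;
} ThunkContext;

/* Adapter with the throwing signature. It drops the error pointer and
   forwards the remaining arguments with a matching signature. */
void throwing_thunk(void *arg1, void *arg2, void *swiftSelf, Error **swiftError) {
    ThunkContext *ctx = swiftSelf;
    (void)swiftError;  /* never written: the wrapped function cannot throw */
    ctx->fn(arg1, arg2, ctx->self);
}

/* Example non-throwing closure: increments the int that arg1 points to. */
void increment(void *arg1, void *arg2, void *swiftSelf) {
    (void)arg2;
    (void)swiftSelf;
    *(int *)arg1 += 1;
}
```

The cost of this approach is an extra call and a context allocation per conversion, which is the overhead the existing "extra arguments are ignored" assumption avoids on native targets.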

Unfortunately, I have no idea how to fix this.

Modifying the Swift compiler to generate the thunks might be difficult. Currently, thunks are only generated when calling to a function with different calling conventions, not between functions with the same calling convention but different number of arguments. SIL's SILFunctionType doesn't even track if a function throws.

I've never worked with Swift compiler internals, so I don't even know where to start modifying IRGen.

I asked @jrose-apple, who suggested that one short-term alternative is to standardize all swiftcall functions to take only one extra parameter:

void closure(void* arg1, void* arg2, void** extraArgs)

extraArgs would point to an area on the stack, containing swiftself, swifterror, and any other extra parameters.

This way, thin, thick, and throwable thick functions would have the same number of arguments.
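
A hand-written C sketch of that idea, under the assumption that the extra area holds exactly two slots (self and error); the slot indices and function names are made up for the example. Every function has the same three-parameter signature, whether or not it uses self or can throw.

```c
#include <stddef.h>

/* Hypothetical slot layout for the extra-argument area. */
enum { EXTRA_SELF = 0, EXTRA_ERROR = 1, EXTRA_COUNT = 2 };

/* One signature shared by thin, thick, and throwing functions. */
typedef void (*UniformFn)(void *arg1, void *arg2, void **extraArgs);

/* Non-throwing, context-free function: ignores the extra area entirely. */
void add_one(void *arg1, void *arg2, void **extraArgs) {
    (void)arg2;
    (void)extraArgs;
    *(int *)arg1 += 1;
}

/* Throwing function: reports failure by filling the error slot. */
void fail_if_negative(void *arg1, void *arg2, void **extraArgs) {
    (void)arg2;
    if (*(int *)arg1 < 0) {
        extraArgs[EXTRA_ERROR] = "negative input";
    }
}

/* Caller side: build the extra area on the stack, call through the uniform
   signature, and return whatever ended up in the error slot (NULL if none). */
void *call_uniform(UniformFn fn, void *arg1, void *arg2, void *self) {
    void *extra[EXTRA_COUNT] = { self, NULL };
    fn(arg1, arg2, extra);
    return extra[EXTRA_ERROR];
}
```

Because the callee signature never varies, a strict signature check at the call site always passes; the trade-off is a stack spill on every call.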

I'm guessing this would require either Clang+Swift changes or an LLVM pass.

I found an example in Chrome PNaCL that transforms function arguments https://chromium.googlesource.com/native_client/pnacl-llvm/+/mseaborn/merge-34-squashed/lib/Transforms/NaCl/ExpandVarArgs.cpp#170, so the LLVM pass might not be too complicated.

We would really appreciate help and advice on how best to approach this.

Reenable and run tests

  • Tests are completely disabled right now.

Upstream the LLVM patches

  • Currently, the LLVM/Clang patches are based on Swift's stable branches
  • Our goal is to get the LLVM changes into LLVM upstream, so they'll be pulled into swift's upstream-with-swift branches
  • I've never worked with LLVM before, so I would appreciate advice on how best to structure patches for upstreaming.

Support building Swift stdlib for Wasm using a macOS host

  • It seems that a macOS host can only cross compile for Darwin platforms, as compiling stdlib for Wasm fails on macOS. (It works on Linux.)
  • Is the existing Android cross-compile support also affected by this?
  • @MaxDesiatov is working on getting the stdlib on Wasm building on macOS.

Split this PR into small, reviewable chunks

  • We started to do this with the simpler stdlib signature changes already

Longer-term goals

Get Swift's other libraries working

  • We have not tried building Foundation or any other Swift libraries.
  • Would these libraries/tools even work in WebAssembly, which, as of now, only supports single threaded execution?

Support link-time optimization

  • @ddunbar discussed the importance of link-time optimizations for Swift on WebAssembly in SR-9307.
  • This would reduce executable size (currently, a Swift Hello world compiles to a 7.8 MB .wasm)
    • Emscripten heavily depends on LTO to keep binary sizes down when compiling C/C++ to WebAssembly, so it's probably worth supporting
  • It can also help us avoid the relative/absolute metadata issue, since relative relocations can be represented in LLVM bitcode.

Work on ways to interop with JavaScript from Swift

@jckarter
Member

I asked @jrose-apple, who suggested that one short-term alternative is to standardize all swiftcall functions to take only one extra parameter:

void closure(void* arg1, void* arg2, void** extraArgs)
extraArgs would point to an area on the stack, containing swiftself, swifterror, and any other extra parameters.

This way, thin, thick, and throwable thick functions would have the same number of arguments.

I'm guessing this would require either Clang+Swift changes or an LLVM pass.

Rather than spill to the stack, it should be sufficient to make the two swiftself and swifterror arguments always be provided, and leave them undef when they aren't needed. Those are the only two extra arguments that you should need to worry about. Since there's already a distinct swiftcc convention at the LLVM level, it seems natural to me to introduce these arguments, if they don't exist in the LLVM signature, into the wasm-level signature in the backend.

Note that swifterror isn't a real pointer-to-pointer argument for the x86 or arm backends, but really represents an in-out register. If it's possible to model it that way in wasm too, it'd probably lead to better native code size and quality when lowered to native code.

Member

@jckarter jckarter left a comment


If you're never emitting relative references to begin with, you ought to be able to stop emitting GOT-relative pointers altogether. They're only necessary in native code image formats for referencing symbols from other binaries that don't have fixed relative offsets. You should be able to chase down all the IRGen helpers that return ConstantReference values and modify them to always return Direct references. That should save you having to support the tagging scheme here as well.

@ddunbar
Member

ddunbar commented May 11, 2019

Metadata relies on this. Solution: switch to absolute pointers

Given the newness of WASM, it would be nice to instead see if support for this can be added to WASM; the feature is useful and could eventually benefit other things.

@zhuowei
Contributor Author

zhuowei commented May 11, 2019

@jckarter That was one of the alternatives that @jrose-apple suggested. However, this may not work for all methods: According to GenFunc.cpp there could be extra "witness_method generic parameters" after the error parameter, so just adding two parameters might not be enough.

Also, re swiftcc calling convention: Wasm doesn't have registers: it is a stack machine, similar to the Java virtual machine.

@jckarter
Member

@zhuowei For witness table entry points, there is still a consistent calling convention that all witnesses use. That logic duct tapes over some representation issues in SIL; they should end up ultimately lowering to compatible LLVM level signatures.

If wasm is a stack machine, then it may still be worth considering the intent of the swiftcc special arguments in designing how it lowers to the stack machine representation. The self argument is intended to be mapped to a stable, callee-preserved register, so that (a) context-free functions are ABI compatible with closures that have captures, and (b) in a series of method calls on the same object, the caller can save code size by not having to constantly reload the self argument register. In a stack machine, you could get a similar effect by always pushing the self argument first, and maybe having the callee leave it on the stack after the call (though for a stack machine that's not an obvious win; it has a tradeoff in code size for calls to methods on different objects, since the caller will then have to pop more).

The error argument is similarly intended to be mapped to a fixed, normally callee-preserved register, which the caller sets to zero, and the callee sets to nonzero on error. This is so that nonthrowing functions are ABI compatible when used as throwing functions, and so that propagating errors through multiple stack frames can be done with minimal code size cost for the test and early return. For a stack machine, it seems like we don't really gain anything from passing a value in to the callee, since it's always zero or undef, but you could push the error value after the primary return value when the callee returns, so that the caller can easily test it and either pop or return to propagate the error upward.
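
One way to picture the "push the error value after the primary return value" idea is a function that returns both values together, as in the C sketch below. This is only an analogy for the stack-machine lowering being discussed, and the names (Result, checked_divide, average_ratio) are invented for the example.

```c
#include <stddef.h>

/* Hypothetical pair returned by every call: the primary result followed by
   an error value that is NULL on success. */
typedef struct {
    int value;
    const char *error;
} Result;

Result checked_divide(int a, int b) {
    Result r = { 0, NULL };
    if (b == 0) {
        r.error = "division by zero";
        return r;
    }
    r.value = a / b;
    return r;
}

/* Propagation costs one test and an early return per call, and nothing is
   passed in to the callee for error handling. */
Result average_ratio(int a, int b, int c) {
    Result first = checked_divide(a, c);
    if (first.error) return first;
    Result second = checked_divide(b, c);
    if (second.error) return second;
    Result out = { (first.value + second.value) / 2, NULL };
    return out;
}
```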

@jckarter
Member

Given the newness of WASM, it would be nice to instead see if support for this can be added to WASM; the feature is useful and could eventually benefit other things.

@ddunbar Swift's intended use of relative references is to reduce load time for memory-mapped native code binaries. Since wasm, AIUI, generally goes through another compilation stage on the client, it seems to me like the cost of relocating wouldn't be that big a part of the load time cost. Hopefully the wasm binary format already has a reasonably efficient way of representing references to local symbols...

@zhuowei
Contributor Author

zhuowei commented May 11, 2019

@jckarter Wasm call instructions seem to consume the arguments from the stack: https://godbolt.org/z/NjLc0f so it might not matter whether self is pushed first or last, from a code density or performance point of view.

I have no experience working on compilers, though, so that's just my guess. For what it's worth, the JVM, which has a similar stack machine and call semantics, pushes the self pointer first, but I'm not sure why it does that.

@jrose-apple
Contributor

Swift's intended use [for relative indirections is] to reduce load time for memory mapped native code binaries. Since wasm AIUI generally goes through another compilation stage on the client, it seems to me like the cost of relocating wouldn't be that big a part of the load time cost.

Some of them do reduce in-memory static data size too, but maybe that's not worth pushing a whole feature through WASM.

@apple apple deleted a comment from rnantes Jul 10, 2019
@MaxDesiatov MaxDesiatov changed the base branch from master to main September 24, 2020 08:40

@OlaSam OlaSam left a comment


It's empty :(

[pull] swiftwasm from main
…am/refactor-wasi-libc

Cherry-pick WASILibc upstreaming changes
[wasm] Add more errno support in WASILibc overlay
…esources/linux

This is a preparatory change for adding a static executable support for WASI
…eneration

This patch makes the build system copy the lnk files for each
stdlib target if needed, instead of only for the Linux target.
This patch adds a static-executable-args.lnk file for the WASI target
This part was not intended to be upstreamed for now because we should
not hack pthread ideally
[wasm] Revert unnecessary CC attr in Numeric.cpp
…am/driver-wasm-toolchain

Cherry-pick static-stdlib changes
[wasm] Revert unnecessary TaskGroup.cpp include directive
@MaxDesiatov MaxDesiatov added the WebAssembly Platform: WebAssembly label Sep 28, 2023
@MaxDesiatov
Member

MaxDesiatov commented Nov 8, 2023

This PR has accumulated too many conflicts and no longer reflects the actual difference between the fork and the upstream codebase. Most if not all relevant changes were split into separate PRs. I'm closing this one as unused.

@MaxDesiatov MaxDesiatov closed this Nov 8, 2023