WebAssembly and its platform targets

When talking about WebAssembly, one thing that often confuses people is the lack of a good analogy for the various platform targets that WebAssembly supports. This matters because all of your WebAssembly code needs to target the same platform for the pieces to work together. The platform also dictates what your WebAssembly code can actually do, like have network access. Since I have needed to explain this a few times this week, I figured I would write my explanation down.

What is a "platform target triple"?

From ChatGPT:

In software development, a platform target triple (also known as a "target triple") is a string that specifies the target platform for which a piece of software is built.

A platform target triple specifies the CPU, vendor, and platform/OS that you're targeting. You typically see it when using a compiler, since you have to tell the compiler what platform to produce code for. Examples are x86_64-unknown-linux and aarch64-apple-darwin.

🤔
Platform target triples can also specify the libc, e.g. x86_64-unknown-linux-gnu specifically runs using glibc. This isn't pertinent to our discussion and it isn't exactly a triple at that point. 😅

This matters to Python users because CPython is compiled for and targets specific platform target triples. So the CPython interpreter you run when you type python3 is for a specific platform target triple. PEP 11 lists what platforms are supported by CPython in some official fashion. In the case of WebAssembly, there are two triples at the tier 3 level:

  1. wasm32-unknown-emscripten
  2. wasm32-unknown-wasi

You will notice the CPU architecture is the same, but that the platform/OS parts differ. The key goal of this post is to explain what those two platforms/OSs represent and how they are different.
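To make the "same CPU, different platform/OS" point concrete, here is a small sketch that splits a triple into its parts. The TargetTriple class and parse_triple function are names I made up for illustration; they aren't part of any toolchain.

```python
from typing import NamedTuple


class TargetTriple(NamedTuple):
    """Hypothetical container for the parts of a platform target triple."""
    cpu: str
    vendor: str
    os: str


def parse_triple(triple: str) -> TargetTriple:
    """Split a 'cpu-vendor-os' string into its three parts."""
    # maxsplit=2 keeps any trailing libc suffix (e.g. "linux-gnu") in `os`.
    cpu, vendor, os_name = triple.split("-", 2)
    return TargetTriple(cpu, vendor, os_name)


emscripten = parse_triple("wasm32-unknown-emscripten")
wasi = parse_triple("wasm32-unknown-wasi")

print(emscripten.cpu == wasi.cpu)  # True: both target the wasm32 "CPU"
print(emscripten.os == wasi.os)    # False: the platform/OS is what differs
```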

Python code itself does not compile down to WebAssembly. You compile a Python interpreter like CPython to WebAssembly and have that run your Python code. So when I talk about compiling in this blog post, I'm referring to compiling CPython to WebAssembly, not your personal Python code.
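Since your Python code runs on top of whichever interpreter was compiled, it can ask that interpreter which platform it was built for. A minimal sketch, relying on sys.platform being "emscripten" or "wasi" on those builds (the wasm_platform helper is just an illustrative name):

```python
import sys
from typing import Optional


def wasm_platform() -> Optional[str]:
    """Return 'emscripten' or 'wasi' on a WebAssembly build, else None."""
    if sys.platform in {"emscripten", "wasi"}:
        return sys.platform
    return None


platform = wasm_platform()
if platform is None:
    print(f"Not a WebAssembly build (sys.platform={sys.platform!r})")
else:
    print(f"This CPython was compiled for wasm32-unknown-{platform}")
```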

wasm32 as a CPU

WebAssembly is an assembly language and equivalent binary format; it essentially defines a virtual 32-bit CPU. But because WebAssembly/wasm32 isn't a real CPU, you need a runtime to execute the code. And this is where the emscripten and wasi platform/OS targets become important. These platforms specify how to talk to the runtime to provide higher-level things that normally something like an OS would provide, like reading a file. These platforms then run on top of some runtime to execute WebAssembly.

💡
You can transpile WebAssembly to a concrete CPU target, e.g. x64. For instance, wasmtime has a compile command to create a compiled version of your WebAssembly code for your physical CPU. But even in this instance you still need to run the binary through wasmtime due to WebAssembly's security model.

Emscripten

One of the key runtime targets for WebAssembly is the browser. When you're targeting the browser, chances are you are using emscripten as your target platform. Emscripten is a "complete compiler toolchain to WebAssembly, ... with a special focus on ... the Web platform." So you use Emscripten as a tool to compile code to WebAssembly that is targeting the browser as a runtime.

One way that Emscripten helps your code target the browser is to use WebAssembly's JavaScript API to provide functionality to WebAssembly. Since the JavaScript API doesn't provide something like reading files, Emscripten has to come up with its own solution on top of the JavaScript API. And since Emscripten is a toolchain rather than a standardized API, it can add features that normally wouldn't be there, like dynamic linking, as quickly as Emscripten can figure out how to support them.

The issue is that anything not specified by WebAssembly is tightly coupled to how Emscripten chooses to do something. Now, if you are compiling all of your code at once, like when you're compiling all of your C code for the browser, this isn't a big deal. And since Emscripten targets the browser, it just has to make sure all the code it produces is compatible with browser standards.

But when it comes to Python and packaging (in a PyPI sense), this gets tricky since wheels are designed to contain compiled code by someone who probably didn't also compile your Python interpreter. Since Emscripten has no API/version compatibility guarantees, it means you have to compile all of your Python extension modules with the same Emscripten version for it to all work together in harmony.
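Here is a rough sketch of what "same Emscripten version" means for a compatibility check. The tag format (emscripten_3_1_32_wasm32) and both function names are assumptions for illustration only, not an official wheel tag specification.

```python
import re
from typing import Tuple

# Assumed tag shape for this sketch: emscripten_<major>_<minor>_<patch>_wasm32
TAG_PATTERN = re.compile(r"^emscripten_(\d+)_(\d+)_(\d+)_wasm32$")


def emscripten_version(tag: str) -> Tuple[int, ...]:
    """Extract the Emscripten version from a (hypothetical) platform tag."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not an Emscripten platform tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def is_compatible(interpreter_tag: str, wheel_tag: str) -> bool:
    """With no cross-version guarantees, only an exact match is safe."""
    return emscripten_version(interpreter_tag) == emscripten_version(wheel_tag)


print(is_compatible("emscripten_3_1_32_wasm32", "emscripten_3_1_32_wasm32"))  # True
print(is_compatible("emscripten_3_1_32_wasm32", "emscripten_3_1_45_wasm32"))  # False
```

Note there is no "newer is fine" rule like you might expect from other platforms; even a patch-level difference is treated as incompatible here.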

💡
If you're now thinking, "doesn't conda-forge compile everything with the same compiler toolchain?", you would be right. There is work on an emscripten-forge to compile projects in conda-forge using Emscripten so every project uses the same Emscripten version.

WASI

The other runtime target for WebAssembly is WASI. You can think of WASI as POSIX for WebAssembly: a standard specifying functions which, when provided, should act a certain way to provide some functionality. For instance, WASI specifies an fd_read function for reading a file. A WASI-compatible runtime like wasmtime can then implement that function, so when wasmtime runs WebAssembly code targeting WASI that code can read files because wasmtime provided the agreed-upon function for reading a file.
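The "agreed-upon function" idea can be modeled in plain Python. This is a toy, not how wasmtime works internally: instantiate, make_toy_runtime, and MissingImport are invented names, and the fd_read here is drastically simplified compared to the real WASI function, which works on memory buffers and returns an error code.

```python
from typing import Callable, Dict, Sequence


class MissingImport(Exception):
    """Raised when the runtime lacks a function the code expects."""


def instantiate(
    required_imports: Sequence[str],
    host_functions: Dict[str, Callable],
) -> Dict[str, Callable]:
    """Link the code's required imports against what the runtime provides."""
    missing = sorted(set(required_imports) - set(host_functions))
    if missing:
        raise MissingImport(f"runtime does not provide: {', '.join(missing)}")
    return {name: host_functions[name] for name in required_imports}


def make_toy_runtime(files: Dict[int, bytes]) -> Dict[str, Callable]:
    """A pretend WASI runtime backed by in-memory 'files' keyed by fd."""
    def fd_read(fd: int, size: int) -> bytes:
        # Simplified stand-in for WASI's fd_read.
        return files[fd][:size]

    return {"fd_read": fd_read}


runtime = make_toy_runtime({3: b"hello from the host"})
imports = instantiate(["fd_read"], runtime)
print(imports["fd_read"](3, 5))  # b'hello'

try:
    instantiate(["fd_read"], {})  # a runtime that provides nothing
except MissingImport as exc:
    print(exc)  # runtime does not provide: fd_read
```

The code only knows the name and expected behaviour of fd_read; any runtime that supplies a function matching that agreement can run it.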

When you hear about companies using WebAssembly for edge compute or running WebAssembly workloads with Kubernetes, it's on top of WASI. My personal interest in WASI stems from us experimenting with WASI for Python support in vscode.dev, as we can use it anywhere you can run VS Code: desktop or web. Pretty much all non-browser uses of WebAssembly use WASI as their runtime target. You can polyfill WASI for the browser, but since Emscripten has more feature support I don't think that approach is widely used.

The great thing about WASI being a standard is you can rely on certain compatibility guarantees. The drawback of WASI being a standard is it evolves at the speed of a standard. 😉 Because Emscripten can add new support for something at any point, it can evolve much faster. But for WASI to evolve, the group managing the spec needs to come to an agreement on changing the standard. Dynamic loading is a good example of this dichotomy. Emscripten has support because they came up with their own solution in JavaScript, but WASI doesn't have a solution right now and it won't until the spec adds it. That means the only real solution for supporting extension modules in WASI is to statically link the extension module in with CPython itself. Unfortunately that's not something Python packaging was designed for, which makes it potentially tricky.
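You can see the difference between statically linked and dynamically loaded modules from inside any CPython. The sketch below uses sys.builtin_module_names, which lists the modules compiled into the interpreter itself; the describe function is just an illustrative helper, and its classification is deliberately coarse.

```python
import importlib.machinery
import importlib.util
import sys


def describe(module_name: str) -> str:
    """Roughly classify how a top-level module gets into the interpreter."""
    if module_name in sys.builtin_module_names:
        return "statically linked into the interpreter"
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    # EXTENSION_SUFFIXES holds the file suffixes used for loadable extension modules.
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "dynamically loaded extension module"
    return "pure Python (or other)"


for name in ("sys", "json", "_ssl"):
    print(f"{name}: {describe(name)}")
```

On a WASI build, anything that would normally fall into the "dynamically loaded" bucket has to move into the "statically linked" one, which is exactly what a wheel can't do for you.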

💡
wasmtime-py lets you run WebAssembly code inside of CPython, but that's not for loading extension modules compiled for WebAssembly, just pure WebAssembly code. The key difference is an extension module coming from a wheel would need WASI to support dynamic loading, which it does not. wasmtime-py essentially provides wasmtime as a Python extension module.

Summary

There's a stack of abstractions involved in making code work when it's compiled to WebAssembly. The CPU layer, wasm32, abstracts the execution of code. The next abstraction layer above that is the one that provides things like file reading. There are two key platform targets that provide the file-reading layer of abstractions. The emscripten platform target uses the browser as a runtime and Emscripten as the platform implementation on top of the browser to provide functionality like file reading. The wasi platform target is for WASI which is a standard like POSIX for WebAssembly runtimes that aren't targeting the browser. As a standard, WASI specifies what functions a runtime is expected to provide and is not a runtime itself. Taken together, this is how you end up with wasm32-unknown-emscripten and wasm32-unknown-wasi as platform target triples when compiling your code for WebAssembly.