Packaging a WebAssembly Audio Tool

Martin Moxon · July 6, 2021 · 10 min read

A project I'm working on led me to attempt to create a reusable package from an existing WebAssembly tool by utilising modern tooling and APIs.

The tool is quiet.js - a WebAssembly build of libquiet for encoding and decoding binary data to and from audio.

The main motivation for compiling this project to WebAssembly is that the end user effectively gains the benefits of a compiled executable (in particular, not having to download and compile multiple dependencies), while the tool remains available in any modern browser on any architecture.

Since I wanted to use the tool in a project of my own, there were a few shortcomings that I felt could be addressed:

  • quiet.js is not an npm package, nor structured in such a way that it could easily be used as one, making it difficult to use as part of an application.
  • It is compiled using Emscripten and depends on a large amount of generated Emscripten JavaScript glue code, and is only accessible via the global Module object. Modern browser APIs make it relatively easy to interface directly with WebAssembly modules and access their exports.
  • The quiet.js Transmitter and Receiver logic is tightly coupled, and the control flow somewhat complex.
  • quiet.js depends on some deprecated Web Audio APIs, such as ScriptProcessorNode.
  • The build process for the WASM module itself was not available, making it difficult to determine exactly what source went into the compiled WebAssembly.

I decided to attempt a rewrite of quiet.js with a goal of resolving these issues.

Standardising the build process

While the existing WASM is available as a pre-compiled object in the repository source, I felt an important first step was to reproduce the build process.

By building the WASM in an openly accessible build pipeline, contributors can collaborate not only on the JavaScript interface code but also on the compilation steps of the WASM.

Additionally, providing a start to finish build script enables the project to adhere to the LGPL licensing of some of the dependencies, since the statically linked code is accompanied with tools that could be used to create a build using different versions of the LGPL components.

Building the WASM version was a little tricky, since many of the compilation steps assume the compilation target is native executable code, and that dependencies will exist in certain locations on disk.

While this was easier to resolve for most of the libraries through the use of various compiler flags, the current pipeline definition does include a very specific CMakeLists.txt patch that is tightly coupled to the version of libquiet being built.

The full pipeline for building the WASM can be found here

Replacing the Emscripten Glue Code

The accompanying glue code generated by Emscripten amounts to over 650kb of minified JavaScript. By targeting only the specific functions needed from libquiet, it was relatively simple to replace the functionality provided by Emscripten, while also gaining more fine-grained control over the compilation and instantiation of the WASM.

The first step involves fetching and instantiating the WASM using the streaming compilation API.

const { instance } = await WebAssembly.instantiateStreaming(fetch('quiet.wasm'))

A major benefit of decoupling the WASM build step from JavaScript specific Emscripten features is that the WASM output can potentially run on other WebAssembly runtimes, such as Wasmer.

Additionally, loading the WASM this way instead of via the existing Emscripten glue code avoids polluting the global namespace with the Module object (there are also Emscripten build flags that can generate an ES module version of the JS glue code).

One part that required replacing was logic for handling non-numeric data types, in particular strings. Whilst WebAssembly currently only supports numeric values as parameters, it is possible to pass strings between JavaScript and WebAssembly by reading and writing bytes directly to and from the WebAssembly linear memory.

"Softwayre" in WASM linear memory

function allocateArrayOnStack(instance, arr) {
  // stackAlloc and memory are exports of the Emscripten-compiled module
  const ret = instance.exports.stackAlloc(arr.length);
  const HEAP8 = new Int8Array(instance.exports.memory.buffer);
  HEAP8.set(arr, ret);
  return ret;
}


Similarly, the stack pointer (which is typically managed for you when compiling C) must be handled explicitly by calling stackSave() and stackRestore() at appropriate points:

const stack = instance.exports.stackSave();

// Operations that allocate to the stack here

instance.exports.stackRestore(stack);

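One way to keep the save/restore calls reliably paired is a small wrapper; a sketch (not part of quiet-js itself), assuming the Emscripten stackSave/stackRestore exports:

```javascript
// Run a callback between stackSave()/stackRestore() so any temporary stack
// allocations are unwound even if the callback throws.
function withStackFrame(instance, fn) {
  const stack = instance.exports.stackSave();
  try {
    return fn();
  } finally {
    instance.exports.stackRestore(stack);
  }
}
```

Any allocateArrayOnStack calls would then happen inside the callback.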

Caveat: WASM imports

Since some of libquiet's dependencies rely on system APIs such as the file system, these are automatically declared as imports in the compiled WASM.

(import "wasi_snapshot_preview1" "proc_exit" (func (;0;) (type 0)))
(import "wasi_snapshot_preview1" "clock_time_get" (func (;1;) (type 65)))
(import "wasi_snapshot_preview1" "fd_close" (func (;2;) (type 1)))
(import "wasi_snapshot_preview1" "fd_write" (func (;3;) (type 8)))
(import "wasi_snapshot_preview1" "fd_seek" (func (;4;) (type 66)))
(import "wasi_snapshot_preview1" "fd_read" (func (;5;) (type 8)))
(import "env" "__sys_getpid" (func (;6;) (type 7)))

Normally it would make sense to provide these imports using a WASI implementation such as @wasmer/wasi, which includes implementations for the required imports above, including a virtual file system.

However, since the parts of the library this tool exposes will never use any of the functions that access system APIs, I simply provided no-op stub functions as imports.

This is a workaround: a later step may be to optimise the build process to not depend on any file system APIs, and therefore not require these imports at all.
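A minimal sketch of that workaround, with the import object shaped to match the imports listed above (none of these stubs are meaningfully called on the exposed encode/decode paths):

```javascript
// No-op stubs for the WASI and env imports declared by the compiled WASM.
// Returning 0 signals success for the WASI fd_* calls.
const noop = () => 0;

const importObject = {
  wasi_snapshot_preview1: {
    proc_exit: noop,
    clock_time_get: noop,
    fd_close: noop,
    fd_write: noop,
    fd_seek: noop,
    fd_read: noop,
  },
  env: {
    __sys_getpid: noop,
  },
};

// const { instance } = await WebAssembly.instantiateStreaming(fetch('quiet.wasm'), importObject);
```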

Modernising Web Audio API usage

The quiet.js core code makes heavy use of the Web Audio API to send Audio data to the user's speakers and receive data from the user's microphone.

Re-implementing the Transmitter

An issue with the original quiet.js Transmitter implementation is that it used ScriptProcessorNode both to transmit audio and to process incoming audio from the microphone.

This is now a deprecated API, and for the case of playback of data in memory, AudioBufferSourceNode is a more modern approach.

Re-implementing the Transmitter was relatively simple - playing the audio from the speakers was a matter of creating an AudioBufferSourceNode from the encoded data in the AudioBuffer and connecting it to the existing AudioContext.

  function transmit() {
      const audioBufferNode = new AudioBufferSourceNode(audioContext);
      audioBufferNode.buffer = audioBuffer;
      audioBufferNode.connect(audioContext.destination);
      audioBufferNode.start(t);
      t += audioBuffer.duration;
  }

Here t is the playback time offset, which is necessary when the input exceeds the size of a single buffer. By keeping a running duration total, each frame is transmitted sequentially.

Re-implementing the Receiver

The original Receiver logic also made use of ScriptProcessorNode to process incoming audio from the Microphone.

This is now deprecated in favour of AudioWorklets, which can process data in a separate AudioWorklet thread.

AudioWorklets provide several benefits, including not running on the main thread, as well as providing stronger guarantees about the real-time nature of the incoming data.

However, switching to use AudioWorklets posed two additional problems:

The first is that code running in an AudioWorklet cannot make network requests, so getting access to the WASM on the worklet thread meant taking a slightly different approach to the one used for transmitting encoded data.

Having read about a similar problem discussed on the Google Developers blog, as well as the MDN documentation for WebAssembly.Module, I found that unlike a WebAssembly.Instance, an uninstantiated WebAssembly.Module can be serialised and passed between threads.
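This distinction is straightforward to verify: a compiled WebAssembly.Module carries no state and can be instantiated any number of times, once per thread that receives it. A self-contained sketch using the smallest valid module (just the magic number and version, no code):

```javascript
// The 8-byte preamble '\0asm' + version 1 is the smallest valid WASM module.
const wasmBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Compile once...
const module = new WebAssembly.Module(wasmBytes);

// ...instantiate as many times as needed (e.g. once on the main thread,
// once on the worklet thread after the module is posted across).
const a = new WebAssembly.Instance(module);
const b = new WebAssembly.Instance(module);
```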

Passing the WASM module from the main thread to the Web Audio thread

The WebAssembly compilation step was updated like so:

const { instance, module } = await WebAssembly.instantiateStreaming(fetch('quiet.wasm'))

While the instance object is used on the main thread, the module is passed to the AudioWorklet thread, which can then instantiate an instance of its own and use it in the process method of the custom AudioWorklet:

this.quietProcessorNode = new AudioWorkletNode(audioContext, 'quiet-receiver-worklet', {
  // the processor name here is illustrative; it must match the name
  // registered via registerProcessor() in the worklet file
  processorOptions: {
    quietModule: module,
  },
});

and received on the worklet side:

class ReceiverWorklet extends AudioWorkletProcessor {
  constructor(options) {
    super();
    const { quietModule } = options.processorOptions;
    // ... instantiate quietModule and use it in process()
  }
}

The second problem with transitioning to AudioWorklets is that unlike ScriptProcessorNode, which could provide the required number of samples on demand, process() delivers a fixed render quantum of 128 samples at a time, so decoding the data was not initially possible.

Fortunately, the blog post mentioned above also described a pattern for using a RingBuffer to accumulate an arbitrary number of samples.

Using this pattern, the custom AudioWorkletProcessor can decode incoming data and use its port's postMessage() to send the decoded string back to the main thread.
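The pattern boils down to a fixed-capacity buffer that absorbs 128-sample render quanta and releases samples only once a full decode frame has accumulated. A minimal sketch of the mechanism (quiet-js follows the blog post's implementation; this version is simplified for illustration):

```javascript
// A fixed-capacity ring buffer of Float32 samples. push() absorbs render
// quanta; pull() succeeds only once enough samples have accumulated.
class RingBuffer {
  constructor(capacity) {
    this.buffer = new Float32Array(capacity);
    this.readIndex = 0;
    this.writeIndex = 0;
    this.length = 0;
  }

  push(samples) {
    for (const s of samples) {
      this.buffer[this.writeIndex] = s;
      this.writeIndex = (this.writeIndex + 1) % this.buffer.length;
      this.length = Math.min(this.length + 1, this.buffer.length);
    }
  }

  pull(out) {
    if (this.length < out.length) return false; // not enough buffered yet
    for (let i = 0; i < out.length; i++) {
      out[i] = this.buffer[this.readIndex];
      this.readIndex = (this.readIndex + 1) % this.buffer.length;
    }
    this.length -= out.length;
    return true;
  }
}
```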

Packaging it all up for npm

An added layer of complexity in packaging this as an npm package is that there are two files that need special consideration: the quiet.wasm, which should be fetched asynchronously at runtime, and the quiet AudioWorklet, which, as well as running on a separate thread, lives in its own file.

Using the package involves fetching the WASM at runtime:

await WebAssembly.instantiateStreaming(fetch(PATH_TO_WASM))

and similarly evaluating the path to the AudioWorklet file at runtime:

audioContext.audioWorklet.addModule(PATH_TO_WORKLET)
When trying to use the package with bundlers such as Parcel and Webpack V4, these dependencies were not detected and bundled, and so were not accessible from the compiled package.

There are several ways these issues can be resolved, though they typically require either bundling the external assets as Blobs in the main entry file (which has its own performance downsides) or using non-standard syntax, resulting in code that may be tied to a specific bundler implementation.

Fortunately, Webpack V5 implements several ECMAScript-compatible features for handling these cases - in particular, Asset Modules and Web Worker support.

Webpack V5 can detect relative paths to assets in dependencies by using URL and import.meta.url, like so for the WASM:

await WebAssembly.instantiateStreaming(fetch(new URL('./quiet.wasm', import.meta.url)))

and for the AudioWorklet:

audioContext.audioWorklet.addModule(new URL('./quiet.worklet.js', import.meta.url))

Using Webpack as the bundler for application code allows the package to be written in such a way that it is compatible with native import.meta ES module syntax, which is already implemented in most modern browsers.

However, this did not solve every issue - when the modules are bundled together and transpiled, an implicit assumption is that all bundles have access to the same global require scope. This is not true for the AudioWorklet, which runs in the AudioWorkletGlobalScope, so any import statements would fail to resolve.

To resolve this issue, I used Rollup as a build step for the package, resolving imports before publishing and producing a single self-contained ES module for each of the main bundle and the AudioWorklet.
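Such a build step might look like the following sketch (file names are hypothetical): each entry point is flattened into a single ES module, so the worklet file carries no import statements at runtime.

```javascript
// rollup.config.js - bundle the main entry and the worklet separately,
// each into a single flat ES module with all imports inlined.
export default [
  {
    input: 'src/index.js',
    output: { file: 'dist/index.js', format: 'es' },
  },
  {
    input: 'src/quiet.worklet.js',
    output: { file: 'dist/quiet.worklet.js', format: 'es' },
  },
];
```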

The result is @moxon6/quiet-js:


A live demo is available here, as well as an editable CodeSandbox.