Implementing The Stages
In the last section we decomposed the general WebGPU compute process into three bite-sized concepts. Now, we will implement a couple of them. This simplified example illustrates the use of buffers and commands to set up and copy data from input to output arrays.
Along the way, we collect some general methods into the WebgpuCompute class, and use them in the CopyBuffer class to implement our example.
Is WebGPU Supported?
WebGPU support is still spinning up, so we need to be very careful here: it may not be supported, or may be only partially supported, on any given platform, especially mobile. navigator.gpu is the entry point for the WebGPU API, so the first step is to check that navigator.gpu is present.
if (!navigator.gpu) {
  throw Error("WebGPU is not supported.");
}
The specific application will catch this error and handle it as appropriate. Common approaches include popping up a user alert, or at least logging the error.
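One way an application might structure this is a small guard function. This is a sketch, not part of the example's API: checkWebgpuSupport is a hypothetical name, and the gpu parameter stands in for navigator.gpu so the pattern can be exercised in any JavaScript runtime.

```javascript
// Hypothetical guard: `gpu` stands in for navigator.gpu so the check can be
// exercised outside a browser. Throws exactly as the snippet above does.
function checkWebgpuSupport(gpu) {
  if (!gpu) {
    throw Error("WebGPU is not supported.");
  }
  return gpu;
}

// In a browser, the call would be: checkWebgpuSupport(navigator.gpu)
```

A caller then wraps the call in try/catch and alerts or logs as appropriate for the application.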
Get an Adapter
Now that we know WebGPU is supported, the next step is to get an adapter. A WebGPU adapter corresponds roughly to a physical GPU. Many systems have two physical GPUs: a power-efficient GPU built into the processor, and a second high-performance, high-power, discrete GPU. When we request an adapter we can specify a low-power or a high-performance preference in the options.
let adapter = await navigator.gpu.requestAdapter(options);
if (!adapter) {
  if (options) {
    throw Error("Request WebGPU adapter failed with options: " + JSON.stringify(options));
  } else {
    throw Error("Request WebGPU adapter failed.");
  }
}
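The two power preferences are expressed through the powerPreference member of the request descriptor. The descriptors below show both forms; adapterErrorMessage is a hypothetical helper that just reproduces the error-message logic from the snippet above so it can be exercised standalone.

```javascript
// The two powerPreference values WebGPU defines, as request descriptors.
const lowPower = { powerPreference: "low-power" };
const highPerformance = { powerPreference: "high-performance" };

// Hypothetical helper mirroring the error-message logic above.
function adapterErrorMessage(options) {
  return options
    ? "Request WebGPU adapter failed with options: " + JSON.stringify(options)
    : "Request WebGPU adapter failed.";
}
```

In a browser, the descriptor is passed directly: navigator.gpu.requestAdapter(highPerformance).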
Laptops, desktops, and mobile devices have a wide range of graphics adapters with a correspondingly wide range of capabilities. WebGPU encapsulates this variability through optional features and limits. All implementations of WebGPU provide a baseline of capabilities, beyond which optional features and higher resource limits can be requested. If no optional capabilities are requested, then you receive the base set of capabilities. This is a boon for compatibility: a project that works without requesting additional resources should work on most, if not all, WebGPU-capable systems.
Resource Limits
Resource limits are one of the optional capabilities that can be specified when acquiring a WebGPU device. There is a base level of resources that is expected to be available on all WebGPU systems, and, as noted above, a default request returns this base set even on more capable systems. If possible, we will avoid dependencies on optional limits or features.
We get the resource limits for a particular system from adapter.limits, an object where each property name is the name of a limit, and the property's value is that limit on the current system. We can then log the limits for a system with:
for (const prop in adapter.limits) {
  console.log(`${prop}: ${adapter.limits[prop]}`);
}
Or, we can use some slightly more elegant code to build a user-friendly table.
Limits for this system: a table with columns Limit, Available Limit, and Default Limit, populated at run time on the reader's system.
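The table-building code can be sketched as follows. The limits and defaults objects here are hand-made stand-ins: a real run would pass adapter.limits, and limitsTable is a hypothetical helper name.

```javascript
// Hedged sketch: format an adapter.limits-like object into table rows.
// `limits` and `defaults` are stand-ins for adapter.limits and the spec's
// base values; a real page would read adapter.limits directly.
function limitsTable(limits, defaults) {
  const rows = ["| Limit | Available Limit | Default Limit |", "|---|---|---|"];
  for (const prop in limits) {
    rows.push(`| ${prop} | ${limits[prop]} | ${defaults[prop] ?? "n/a"} |`);
  }
  return rows.join("\n");
}

const table = limitsTable(
  { maxBindGroups: 8 }, // pretend adapter.limits from a capable system
  { maxBindGroups: 4 }  // baseline value, for comparison
);
```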
Features
While limits vary even within a generation of graphics cards, features vary most strongly from generation to generation. Many optional features center around texture compression and data representation.
Getting the Device
Once we have a physical device, the adapter, we get the logical device. The WebGPU device is a connection to the adapter that manages and isolates the application's resources. All of our interactions with WebGPU will be through the device.
When we get the device, we take only the default limits and features. The only option we specify is the label, to identify the source of any diagnostic messages.
device = await adapter.requestDevice({label: "Our compute device"});
device.addEventListener('uncapturederror', (event) => {
  console.error(`Uncaught WebGPU error from ${device.label}: `, event.error);
});
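A real GPUDevice is an EventTarget, so the listener pattern above can be sketched with a plain EventTarget, which is built into modern JavaScript runtimes. The fakeDevice name and the dispatched event are illustrative stand-ins, not WebGPU API.

```javascript
// CPU-side sketch of the uncapturederror listener pattern, using a plain
// EventTarget as a stand-in for the GPUDevice.
const fakeDevice = new EventTarget();
fakeDevice.label = "Our compute device";

let lastMessage = null;
fakeDevice.addEventListener("uncapturederror", (event) => {
  lastMessage = `Uncaught WebGPU error from ${fakeDevice.label}`;
});

// Simulate the device reporting an otherwise-uncaptured error.
fakeDevice.dispatchEvent(new Event("uncapturederror"));
```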
Buffers
Fundamentally, buffers are blocks of memory on the GPU. Here, we allocate a buffer for a dataset, which in this example is a Float32Array.
This example sets all the available options for creating a buffer.
The label identifies the buffer in diagnostic messages. The size is the number of bytes to be allocated to the buffer. The usage is a set of bit flags that indicate the operations we plan to perform with this buffer; GPUBufferUsage.COPY_SRC means that we will copy data from the buffer.
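Because usage is a bit-flag field, flags combine with bitwise OR and are tested with bitwise AND. The numeric values below match the GPUBufferUsage constants in the WebGPU specification, but in a real page the flags come from the global GPUBufferUsage object; the Usage name here is just a local stand-in.

```javascript
// Stand-in for GPUBufferUsage; these values match the WebGPU specification.
const Usage = {
  MAP_READ:  0x0001,
  MAP_WRITE: 0x0002,
  COPY_SRC:  0x0004,
  COPY_DST:  0x0008,
};

// Combine flags with bitwise OR, test membership with bitwise AND.
const readback = Usage.COPY_DST | Usage.MAP_READ;
const canMapRead = (readback & Usage.MAP_READ) !== 0;
const canCopyFrom = (readback & Usage.COPY_SRC) !== 0;
```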
A mapped buffer has a corresponding CPU-side block of memory, which normally can be read or written as we specify MAP_READ or MAP_WRITE in our usage flags. mappedAtCreation is a special case, where any buffer can be mapped and written at its creation, streamlining the initialization process. So, we map and write to this buffer even though the MAP_WRITE flag is not set.
// Get a GPU buffer in a mapped state and an arrayBuffer for writing.
const gpuWriteBuffer = device.createBuffer({
  label: "my compute input buffer",
  size: dataset.byteLength,
  usage: GPUBufferUsage.COPY_SRC,
  mappedAtCreation: true,
});
Initializing the Buffer
getMappedRange() returns the contents of the buffer as an ArrayBuffer.
const arrayBuffer = gpuWriteBuffer.getMappedRange();
We wrap the ArrayBuffer with any of the JavaScript typed arrays, and assign values.
new Float32Array(arrayBuffer).set(dataset);
Finally, we unmap the buffer. Unmapping releases the CPU-side mapping, making the data available to the GPU and returning control of the buffer to it.
gpuWriteBuffer.unmap();
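The map-write-unmap flow above can be sketched entirely on the CPU, since the mapped range is an ordinary ArrayBuffer. Here a locally constructed ArrayBuffer stands in for the result of getMappedRange(); the Float32Array view and set() call are the same as in the text.

```javascript
// CPU-side sketch of filling a mapped buffer: getMappedRange() hands back an
// ArrayBuffer, which we view through a typed array to write our dataset.
const dataset = new Float32Array([1.5, 2.5, 3.5]);       // example data
const arrayBuffer = new ArrayBuffer(dataset.byteLength); // stands in for getMappedRange()
new Float32Array(arrayBuffer).set(dataset);              // same call as in the text
const written = new Float32Array(arrayBuffer);           // read back through a view
```

Note that byteLength is in bytes: three 32-bit floats occupy 12 bytes.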
We also need a buffer to read our data back from the GPU. This buffer is the same size as the input buffer, but, as we might expect, the usage is different. COPY_DST flags that this buffer is the destination for a copy operation. MAP_READ flags that it will be mapped to CPU-side memory for reading.
// Get a GPU buffer for reading in an unmapped state.
const gpuReadBuffer = device.createBuffer({
  label: "my compute read buffer",
  size: dataset.byteLength,
  usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ
});
Commands
Now let's do something with these buffers. Specifically, we plan to copy data from gpuWriteBuffer to gpuReadBuffer, then verify that we read back the original data. We construct a series of commands, then dispatch the commands from the CPU to the GPU.
We start with the command encoder, which allows us to build a list of commands to submit to the GPU. The only option available while creating the command encoder is the label, which we set.
const copyEncoder = device.createCommandEncoder({label: "GPU compute command encoder"});
Many commands, or sequences of commands, can be attached to a command encoder. This example exercises a single command, copyBufferToBuffer, which, as the name implies, copies the contents of one buffer to another. We copy the contents of gpuWriteBuffer to gpuReadBuffer. For each buffer we start at the beginning of the buffer, and copy the same count of bytes we set in gpuWriteBuffer. This operation works because we set the usage for gpuWriteBuffer and gpuReadBuffer to COPY_SRC and COPY_DST respectively.
copyEncoder.copyBufferToBuffer(
  gpuWriteBuffer,     // source buffer
  0,                  // source offset
  gpuReadBuffer,      // destination buffer
  0,                  // destination offset
  dataset.byteLength, // byte count to be copied
);
Now that we have our commands, we call finish to get a command buffer. This command buffer is then submitted to the device queue.
const copyCommands = copyEncoder.finish({label: "GPU Compute Command Buffer"});
device.queue.submit([copyCommands]);
The queue copies the commands and any needed data to the GPU, where they are executed as the GPU is able. Most applications submit more numerous, and more complex, commands; in these cases the asynchronous nature of the queue shows much greater value.
Retrieving Data
Once our computations are done, we need to retrieve data from the GPU. mapAsync(GPUMapMode.READ) returns a promise, which resolves when the queued operations on the buffer are complete.
await gpuReadBuffer.mapAsync(GPUMapMode.READ);
const copyArrayBuffer = gpuReadBuffer.getMappedRange();
const readData = new Float32Array(copyArrayBuffer.slice());
// slice() copied the data, so we can unmap and return the buffer to the GPU.
gpuReadBuffer.unmap();
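The round-trip verification the example builds toward can be sketched entirely on the CPU: fill a "write" ArrayBuffer, copy its bytes to a "read" ArrayBuffer, and check the readback against the original dataset. The buffer names and the byte-level copy are stand-ins for the GPU buffers and copyBufferToBuffer.

```javascript
// CPU-side sketch of the whole example's data flow.
const dataset = new Float32Array([0.5, 1.0, 1.5, 2.0]);

const writeBuffer = new ArrayBuffer(dataset.byteLength);
new Float32Array(writeBuffer).set(dataset);                  // like mappedAtCreation + set

const readBuffer = new ArrayBuffer(dataset.byteLength);
new Uint8Array(readBuffer).set(new Uint8Array(writeBuffer)); // like copyBufferToBuffer

const readData = new Float32Array(readBuffer.slice());       // like the mapped readback
const matches = readData.every((v, i) => v === dataset[i]);  // verify the round trip
```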
