Your Guide to WebGPU Compute Shaders
The GPU provides nearly ubiquitous, powerful parallel compute capabilities. The knowledge to make use of those capabilities, however, is far from ubiquitous. Built around rendering pixels to the screen, the GPU's programming model and architecture are well outside most programmers' experience. In what follows we provide detailed explanations and examples of programming idioms common in developing computational models and programs that harness the GPU for general-purpose computation.
Super speed
Quite simply, it's all about performance. We will learn a new way of thinking about computational problems, a way of thinking that is more work, but that gives us access to design elements (simulations and computations) that would otherwise be unreachable. We will take the graphics hardware driven by the billion-dollar gaming industry and use it in an entirely different way.
GPUs are so effective that an entire ecosystem of specialized languages and environments has grown up around general-purpose computing on graphics processing units (GPGPU), focused on high-performance code. We, though, will focus on portability and instructional design, limiting our discussion to techniques available from within a web browser.
Both the CPU and GPU performance curves start around 1 giga floating-point operation per second (GFLOPS); however, they rapidly diverge. By 2014 GPUs had reached over 5,300 GFLOPS, while CPUs had reached only 700 GFLOPS.
Among current-generation processors, the Intel 285K reaches 224 GFLOPS, while an RTX 5090 reaches 104.8 tera floating-point operations per second (TFLOPS), a potential 460-fold performance improvement from leveraging the GPU. Not only do we get a performance boost, we also move those computations off of the CPU, leaving it free to respond to user actions.
These figures are for single-precision, 32-bit, floating-point performance, which the GPU handles extremely well. Using higher-precision numbers, such as 64-bit floating point, dramatically decreases performance. Indeed, these higher-precision types are not widely available through WebGPU or Vulkan; they are more common in specialized tools such as OpenCL or CUDA, and even there they carry a significant performance cost. This 32-bit precision is a great fit for many applications, especially in instruction.
Wait, What? Computations on the GPU?
The GPU is for graphics, so what do we even mean when we talk about computations on the GPU?
Modern computers use small programs, called shaders, to compute the color of each pixel on the screen. Moreover, computers have specialized hardware, graphics processing units (GPUs), to run large numbers of these shaders in parallel. Computer games depend on this both for a high frame rate and for many effects such as lighting and shadows. We can bring all this parallelism and performance to bear on our problems as long as we can make them look similar to computing pixels for the screen, that is, if we can arrange the computations on a grid. Fortunately, there are a large number of problems in the sciences, engineering, and mathematics that are addressable on a grid.
(Figure: a grid whose cells are filled in by fragments, as when numerically solving a differential equation.)
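To make the grid idea concrete, here is a hypothetical CPU-side sketch (plain JavaScript, not WebGPU) of one step of a 1-D heat equation on a grid. Each output cell depends only on its immediate neighbors, so every cell could be computed independently and in parallel, which is exactly the shape of work one compute shader invocation performs.

```javascript
// One explicit finite-difference step of the 1-D heat equation
// u_t = alpha * u_xx on a grid. Each output cell reads only its
// neighbors, so all cells are independent of one another.
function heatStep(u, alpha, dt, dx) {
  const next = u.slice();
  for (let i = 1; i < u.length - 1; i++) {
    next[i] = u[i] + (alpha * dt / (dx * dx)) * (u[i - 1] - 2 * u[i] + u[i + 1]);
  }
  return next; // boundary cells are held fixed
}

// A spike of heat in the middle of a cold rod diffuses outward.
const u0 = [0, 0, 0, 1, 0, 0, 0];
const u1 = heatStep(u0, 1.0, 0.1, 1.0);
```

On the GPU, the body of that loop becomes the shader, and the loop itself disappears: the hardware runs one invocation per cell.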
Shaders were introduced into mainstream graphics in early 2001 and were almost immediately adopted for use beyond graphics. Dedicated compute shaders were introduced in 2006 and, now in 2024, have made their way into the newest web graphics API, WebGPU. Compute shaders are a significant step forward, providing a much clearer path to general-purpose computation than repurposing graphics shaders. We will see that shaders are a great fit for numeric calculations on a mesh or grid, and many other computing models have also been mapped onto graphics hardware: differential equations, financial modeling, machine learning, image processing, and even databases.
How do we do it?
What are the main elements that map a problem onto a GPU? GPUs are designed for graphics, so it will take some effort to wrap our minds around using them in a wider context. As we will see, for certain problems it is well worth the effort.
In the most straightforward approach, the compute shader reads values from an input array and computes a single result. This parallels the original graphics approach where a fragment shader reads a texture and other input values and produces the fragment color. We will start with some general descriptions, then walk through concrete examples to provide a strong introduction to using GPUs for computation.
The Data
Buffers are the primary means of exchanging data with compute shaders. A buffer is a block of memory on the GPU, created from a GPUDevice with GPUDevice.createBuffer. Buffers can also be mapped into system memory so that they can be read or written from our CPU-side code.
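As a sketch of what this looks like in practice, the snippet below creates an input buffer the shader will read and a pair of buffers for getting results back out. It assumes a GPUDevice named device has already been obtained (via navigator.gpu.requestAdapter() and adapter.requestDevice()); the sizes and names are illustrative, and this only runs in a WebGPU-capable browser.

```javascript
// Assumes `device` is a GPUDevice obtained elsewhere.
const input = new Float32Array([1, 2, 3, 4]);

// A storage buffer the shader reads; filled from the CPU side.
const inputBuffer = device.createBuffer({
  size: input.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
});
device.queue.writeBuffer(inputBuffer, 0, input);

// A storage buffer the shader writes its results into.
const outputBuffer = device.createBuffer({
  size: input.byteLength,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
});

// A mappable staging buffer: results are copied here, then read
// from JavaScript after mapAsync resolves.
const readBuffer = device.createBuffer({
  size: input.byteLength,
  usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
});
```

After the compute pass runs, we copy outputBuffer into readBuffer, then call `await readBuffer.mapAsync(GPUMapMode.READ)` and wrap `readBuffer.getMappedRange()` in a Float32Array to read the results.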
The Code
Compute shaders can loosely be thought of as fragment shaders on steroids. Where a fragment shader is invoked once per fragment, we invoke the compute shader once per result element. For example, consider matrix multiplication.
We invoke the compute shader once for each element of the result matrix.
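A WGSL sketch of that idea follows: one invocation computes one element of the product of two square matrices stored as flat arrays. The binding numbers, the fixed size N, and the workgroup size are illustrative choices, not requirements.

```wgsl
// Multiplies two N x N matrices stored as flat arrays.
// One invocation computes one element of the result.
const N : u32 = 4u;

@group(0) @binding(0) var<storage, read> a : array<f32>;
@group(0) @binding(1) var<storage, read> b : array<f32>;
@group(0) @binding(2) var<storage, read_write> result : array<f32>;

@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) id : vec3u) {
  let row = id.y;
  let col = id.x;
  // Dispatches are rounded up to whole workgroups, so guard
  // against invocations that fall outside the matrix.
  if (row >= N || col >= N) { return; }
  var sum = 0.0;
  for (var k = 0u; k < N; k = k + 1u) {
    sum = sum + a[row * N + k] * b[k * N + col];
  }
  result[row * N + col] = sum;
}
```

Notice there is no loop over rows and columns: the dispatch geometry supplies those, one invocation per result element.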
The Commands
The last step is to accumulate commands for the GPU. These commands tell the GPU which shader to execute and which resources to use. The commands are then submitted to the GPU for execution.
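A hedged sketch of that final step, assuming a compute pipeline and a bind group wiring the buffers to the shader have already been created (names here are illustrative, and this runs only in a WebGPU-capable browser):

```javascript
// Assumes `device`, a compute `pipeline`, a `bindGroup`, and a
// matrix size `N` already exist from the earlier setup steps.
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(pipeline);
pass.setBindGroup(0, bindGroup);
// Each workgroup covers an 8x8 tile of invocations, so round up
// to cover every row and column of the result.
pass.dispatchWorkgroups(Math.ceil(N / 8), Math.ceil(N / 8));
pass.end();

// Nothing has run yet; submitting the encoded commands starts
// execution on the GPU.
device.queue.submit([encoder.finish()]);
```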