Displaying GPGPU Results

Our Schrödinger example shows how to carry out physical simulations with a compute shader. Now that we have some results, we want to see what they look like. This will take us closer to mainstream graphics, but with a slight twist.

At the end of each step of our Schrödinger solver, the waveFunction array holds values for Ψ(x, t) at some time t. We will map these values into a curve on a canvas. Repeating this periodically produces an animation of the simulation in progress.

The obvious approach would be to do some processing and make draw calls to render our data as WebGPU lines. However, since this is a strictly two-dimensional curve, we do something a bit more elegant: we create a pair of triangles that cover the canvas, and color each fragment, or pixel, according to whether it falls on the curve or not.

For each fragment, or pixel, we read the waveFunction and compare its value with the value represented by the pixel. If they match, we set the color for the pixel; otherwise, we leave the pixel blank.

Color in canvas pixels corresponding to |Ψ(x, t)|² from the waveFunction array.

The renderer takes the waveFunction array as input and draws a representative curve. We represent |Ψ(x, t)|² by coloring the pixels on an xResolution × yResolution canvas that correspond to the values of |Ψ(x, t)|² at each x.

We use the @builtin(position) and work with the framebuffer coordinates in the fragment shader. These coordinates range from (0, 0) at the upper left to (xResolution, yResolution) at the lower right, with each fragment positioned at the center of its pixel. We ignore the z coordinate to create a 2D curve in the z = 0 plane.

To compute the Ψ value corresponding to a pixel, we need to find the fraction of the y height that the pixel occupies. The y pixel coordinate ranges from 0 at the top to yResolution at the bottom. The value we want is then yHeight = (yResolution - pixel.y)/yResolution. For example, with yResolution = 100, a pixel at y = 25 gives yHeight = (100 - 25)/100 = 0.75, three quarters of the way up the plot.

We color the pixel if |Ψ|² > |Ψmax|² × (yHeight − 0.5/yResolution) and |Ψ|² ≤ |Ψmax|² × (yHeight + 0.5/yResolution). We can make a straightforward translation into a shader.


 struct WaveFunctionParameters
 {
   dt: f32,              // The time step size
   xResolution: u32,     // The number of points along the x-axis, the number of elements in the array.
   length: f32,          // The full length for our simulation
 }

 struct PlotParameters
 {
   // Psi*Psi color
   psiColor: vec4f,
   // Y scale for the psi plot
   psiMax: f32,
   // Number of points along the y axis.
   yResolution: u32
 }

 // Groups 0 and 1: things that never change within a simulation.
 // The parameters for the simulation
 @group(0) @binding(0) var<storage, read> waveFunctionParameters: WaveFunctionParameters;
 // Plotting parameters, line colors, width, etc.
 @group(1) @binding(0) var<uniform> plotParameters : PlotParameters;

 // Group 2, the wave function at t, changes on each invocation.
 @group(2) @binding(0) var<storage, read> waveFunction : array<vec2f>;

 @fragment
 fn fs_main(@builtin(position) fragPos: vec4<f32>) -> @location(0) vec4<f32>
 {
   let psiMax2          = plotParameters.psiMax*plotParameters.psiMax;
   // Remember, frag position ranges from 0.5 to xResolution-0.5,
   // see https://www.w3.org/TR/webgpu/#rasterization
   let index            = i32(fragPos.x);
   let psi              = waveFunction[index];
   let yResolution1     = 1.0/f32(plotParameters.yResolution);
   let adjustedPixel    = (f32(plotParameters.yResolution)-fragPos.y) * yResolution1;
   let absPsi2          = psi.r*psi.r + psi.g*psi.g;

   return plotParameters.psiColor*(smoothstep(psiMax2*(adjustedPixel-1.5*yResolution1),
                                              psiMax2*(adjustedPixel-yResolution1),
                                              absPsi2)
                                   - smoothstep(psiMax2*(adjustedPixel+yResolution1),
                                                psiMax2*(adjustedPixel+1.5*yResolution1),
                                                absPsi2));
 }

 @vertex
 fn vs_main(@location(0) inPos: vec3<f32>) -> @builtin(position) vec4f
 {
   return vec4(inPos, 1.0);
 }
            

We use the built-in smoothstep function to smooth out the hard edges of the curve. The difference of the two smoothstep calls is 1.0 when |Ψ|² falls within the band corresponding to the pixel and drops smoothly to 0.0 just outside it.
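
For reference, smoothstep(edge0, edge1, x) returns 0.0 when x is below edge0, 1.0 when x is above edge1, and a smooth Hermite interpolation in between. A minimal JavaScript sketch of the equivalent computation:


  // Equivalent of WGSL's smoothstep(): 0 below edge0, 1 above edge1,
  // and the Hermite curve 3t^2 - 2t^3 in between.
  function smoothstep(edge0, edge1, x) {
    const t = Math.min(Math.max((x - edge0) / (edge1 - edge0), 0.0), 1.0);
    return t * t * (3.0 - 2.0 * t);
  }
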

Ψ(x) vs. x

This is a rather flat and uninteresting wave function, but, if you think about it, this is what we expect. The time-dependent Schrödinger equation describes the time evolution of a wave function. We start out with Ψ(x) = 0 and nothing in the physics changes that. Later we will look at injecting a moving particle into our simulation.

There are some interesting aspects to setting up this shader. We take as input the simulation parameters and the wave function values. In order to reuse these buffers we must create the result rendering shader and its buffers from the same WebGPU device that was used with the original simulation.
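
As a rough sketch of that structure, with hypothetical initSimulation and initRenderer helpers standing in for the simulation and rendering setup code, both sets of resources come from one GPUDevice:


  // Inside an async function or ES module.
  // initSimulation and initRenderer are hypothetical helpers; the point is that
  // a single GPUDevice is shared, so the renderer can bind the simulation's
  // waveFunction buffer directly.
  const adapter = await navigator.gpu.requestAdapter();
  const device  = await adapter.requestDevice();

  const simulation = initSimulation(device);            // compute pipeline and waveFunction buffer
  const renderer   = initRenderer(device, simulation);  // render pipeline sharing that buffer
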

We create the shader module just as we did for the compute shader, being careful to reuse the same device object that was used for the simulation.


  rendererShaderModule = device.createShaderModule({
    label: 'Schrodinger renderer shader',
    code: rendererShader
  });
            

This time we are doing rendering, so we need vertices. These vertices are the corners of the canvas in normalized device coordinates. This means we have a consistent way to reference the corners of the canvas, and we pass them through the vertex shader unaltered.


  // A pair of triangles that cover the canvas in normalized device coordinates
  const vertexData = new Float32Array([
    -1.0,  1.0, 0.0, // upper left
    -1.0, -1.0, 0.0, // lower left
     1.0,  1.0, 0.0, // upper right
     1.0, -1.0, 0.0  // lower right
  ]);
            

Create a buffer on the GPU to hold this data. This differs from the other buffers we have created in a couple of ways: the usage flag is set for VERTEX data, and the size is known up front. Because the data is a small set of well-known floating-point values, we create a typed array holding it directly and take the buffer size from that array.


  vertexBuffer = device.createBuffer({
    label: 'Position',
    mappedAtCreation: true,
    size: vertexData.byteLength,
    usage: GPUBufferUsage.VERTEX
  });
            

The process of loading the data into the buffer follows what is now a familiar path.


  const vertexArrayBuffer = vertexBuffer.getMappedRange();
  new Float32Array(vertexArrayBuffer).set(vertexData);
  vertexBuffer.unmap();
            

We need to describe the content type and layout of the data.

The arrayStride is the spacing, in bytes, between the data for successive vertices. We have x, y, and z coordinates for each vertex, so the stride is 3 × the size of a 32-bit float.

The stepMode distinguishes between per-vertex and per-instance data. We have per-vertex data, so the first vertex gets the first three coordinates, the second vertex gets the next three, and so on. For an instance buffer, the data advances on each instance rather than on each vertex.

The shaderLocation identifies the @location to match within the shader.

The offset is how far into the buffer, in bytes, the data begins. We have only vertex positions in this buffer, so, as we expect, the offset is 0. It has long been common in GPU programming to use the offset and the stride to interleave multiple data sets within a single buffer.

The format describes the layout of each vertex attribute. The vertex coordinates we supply are sets of three 32-bit floats, float32x3, but many other formats are possible.


  vertexBuffersDescriptor = [{
    arrayStride: 3 * Float32Array.BYTES_PER_ELEMENT,
    stepMode: 'vertex',
    attributes: [{
      shaderLocation: 0, // @location in shader
      offset: 0,
      format: 'float32x3'
    }]
  }];
            

The next interesting step is to set up the plot parameters buffer. In this case, we only need it visible to the fragment shader. This time we use a uniform buffer because the plot parameters are a small block of fixed-size constant data.


  plotParametersBindGroupLayout = device.createBindGroupLayout({
    entries: [{
      binding: 0,
      visibility: GPUShaderStage.FRAGMENT,
      buffer: {
        type: "uniform"
      }
    }]
  });
            

Now we create the actual buffer. As before, we set the size of the buffer from the length and type of the data fields. What might stand out here is the extra eight bytes at the end of the buffer. There are rules for the layout of uniform buffers; roughly, the size of a struct must be a multiple of the alignment of its largest member. In this case, psiColor is a vec4f of four floats, occupying 16 bytes, while psiMax and yResolution add 4 bytes each, for 24 bytes in total. Rounding up to a multiple of 16 gives 32 bytes, hence the 8 bytes of padding.


  plotParametersBuffer = device.createBuffer({
    label: 'Plot Parameters',
    mappedAtCreation: true,
    size: 4*Float32Array.BYTES_PER_ELEMENT // psiColor
          + Float32Array.BYTES_PER_ELEMENT   // psiMax
          + Uint32Array.BYTES_PER_ELEMENT    // yResolution
          + 8,                               // Required padding
    usage: GPUBufferUsage.UNIFORM
  });
            

We load the data into the buffer just as we did in the previous cases: wrap the mapped array buffer in the appropriate typed array and set the desired values. We simply ignore the padding.


  // Get the raw array buffer for the mapped GPU buffer
  const plotParametersArrayBuffer = plotParametersBuffer.getMappedRange();

  let bytesSoFar = 0;
  new Float32Array(plotParametersArrayBuffer, bytesSoFar, 4).set(psiColor);
  bytesSoFar += 4*Float32Array.BYTES_PER_ELEMENT;
  new Float32Array(plotParametersArrayBuffer, bytesSoFar, 1).set([psiMax]);
  bytesSoFar += Float32Array.BYTES_PER_ELEMENT;
  new Uint32Array(plotParametersArrayBuffer, bytesSoFar, 1).set([yResolution]);

  plotParametersBuffer.unmap();
            

WebGPU, like WebGL, renders onto a canvas. We set the canvas width to match the problem size, and choose the height to provide a good view of the data.


  <canvas width="1024" height="100" id="schrodingerResults">
  </canvas>
            

Once we have a canvas, we get and configure a WebGPU context. We get the preferred canvas format and set this as the format for the canvas's current texture. Other formats should work; however, they would incur a performance overhead as color values are translated to and from the preferred format.

We also set premultiplied alpha. If we want transparency, we will use the appropriate color values. In a little while we will add the imaginary and real parts of the wave function to the plot and make the corresponding adjustments to the colors in the revised shader.


  // Get a WebGPU context from the canvas and configure it
  const canvas = document.getElementById(canvasID);
  webGPUContext = canvas.getContext('webgpu');
  // This will be either rgba8unorm or bgra8unorm
  presentationFormat = navigator.gpu.getPreferredCanvasFormat();
  webGPUContext.configure({
    device: device,
    format: presentationFormat,
    alphaMode: 'premultiplied'
  });
            

We reuse the wave function buffer, but we need to create a new bind group layout because here we use only the single buffer containing the current values. We also make it visible to the fragment shader this time.


  const waveFunctionBindGroupLayout = device.createBindGroupLayout({
    label: "Wave function layout",
    entries: [{
      binding: 0,
      visibility: GPUShaderStage.FRAGMENT,
      buffer: {
        type: "read-only-storage"
      }
    }]
  });
            

We also need a new bind group containing only the single wave function buffer.


  const waveFunctionBindGroup = device.createBindGroup({
    layout: waveFunctionBindGroupLayout,
    entries: [{
      binding: 0,
      resource: {
        buffer: waveFunctionBuffer
      }
    }]
  });
            

The pipeline layout is built from the data buffer layouts.


  const pipelineLayout = device.createPipelineLayout({
    bindGroupLayouts: [
      parametersBindGroupLayout,     // Simulation parameters
      plotParametersBindGroupLayout, // Plot parameters
      waveFunctionBindGroupLayout    // The wave function values
    ]
  });
            

We see an immediate difference with the use of createRenderPipeline. This creates, you guessed it, a pipeline to render graphics rather than to do computations. Render pipelines are more complicated than compute pipelines: they have vertex and fragment shaders, vertex buffers in addition to data buffers, and a render target where the graphics are displayed on screen.

The primitive option specifies this as a triangle strip. For our triangle strip, vertices 1, 2, and 3 form a triangle, then vertices 2, 3, and 4 form a second triangle. These two triangles completely cover our drawing surface, so they are all we need.

The vertex option gives the shader module, the entry point for the vertex shader, and the vertex buffers. This is pretty straightforward; just remember that the vertex buffers and the storage buffers are specified separately.

The fragment option also specifies the shader module and the entry point for the fragment shader. Instead of input buffers, the fragment option specifies output targets. In this case the only output target we have is the canvas, which, as we will soon see, is the @location(0) output from the fragment shader.


  const pipeline = device.createRenderPipeline({
    label: 'Render triangles to cover the rectangular canvas.',
    layout: pipelineLayout,
    primitive: {
      topology: "triangle-strip",
    },
    vertex: {
      module: rendererShaderModule,
      entryPoint: 'vs_main',
      buffers: vertexBuffersDescriptor
    },
    fragment: {
      module: rendererShaderModule,
      entryPoint: 'fs_main',
      targets: [{
        format: presentationFormat
      }]
    }
  });
            

We start issuing commands to the GPU with a command encoder, just as with previous instances.


  const commandEncoder = device.createCommandEncoder();
            

We diverge from the compute case pretty quickly by creating a render pass. To render, we also need a target: the color attachment. The color attachments are in an array with only one element. This zeroth element targets the webGPUContext we retrieved from our canvas, making it the @location(0) output target for the fragment shader.

The loadOp is the operation carried out before we render to the canvas texture. The clear operation paints the target with the clearValue, giving us a transparent canvas everywhere except for the curve we draw.

The storeOp is applied after the rendering is finished. The store option retains the rendered values; the other option, discard, would zero the render attachment. We want to draw the attachment to the screen, so we need to store its contents.

After the attachment we set the pipeline. The pipeline describes the execution of the shaders and the resources they will consume. The resources are bound to groups just as with the compute shader, the exception being the vertex buffer, which is set explicitly via setVertexBuffer.

Finally, the draw command renders the data to the screen. The simplest form of the WebGPU draw command takes a vertex count, so this draw(4) draws our four vertices as a triangle strip.


  const passEncoder = commandEncoder.beginRenderPass({
    colorAttachments: [{
      view: webGPUContext.getCurrentTexture().createView(),
      loadOp: 'clear',
      clearValue: [0.0, 0.0, 0.0, 0.0],
      storeOp: 'store',
    }]
  });
  passEncoder.setPipeline(pipeline);
  passEncoder.setBindGroup(0, parametersBindGroup);
  passEncoder.setBindGroup(1, plotParametersBindGroup);
  passEncoder.setBindGroup(2, waveFunctionBindGroup);
  passEncoder.setVertexBuffer(0, vertexBuffer);
  passEncoder.draw(4);
  passEncoder.end();
            

The final step is consistent across pipeline types. We capture the commands from the encoder and submit them to the GPU for execution.


  const commandBuffer = commandEncoder.finish();
  device.queue.submit([commandBuffer]);
            
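
As noted at the outset, repeating the compute and render steps produces an animation of the simulation in progress. A sketch of such a loop, with hypothetical stepSimulation and renderWaveFunction functions wrapping the compute pass and the render pass described above:


  // stepSimulation and renderWaveFunction are hypothetical wrappers around the
  // compute and render command submissions shown earlier.
  function frame() {
    stepSimulation(device);       // advance the wave function by one time step
    renderWaveFunction(device);   // draw |Psi|^2 onto the canvas
    requestAnimationFrame(frame);
  }
  requestAnimationFrame(frame);
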

Next, we look at setting up an initial wave function to make this a physically interesting simulation.

This work is licensed under a Creative Commons Attribution 4.0 International License.