Leapfrog Boundary Conditions

Once again we must address the question of interaction with the boundary of the simulation. We have shown that the finite difference method encounters difficulties at the edges of the domain. Luckily, the staggered time and central difference methods are so similar that we can reuse the central difference solution.

A realistic simulation requires absorbing boundary conditions.

Clearly the finite difference method has problems at the boundary, where x = 0 or x = L. Mur boundary conditions are a simple and effective way to absorb waves arriving at the boundary, preventing their reflection back into the simulation. We worked through their application for the central difference case, and with some minor adjustments we can apply them here as well.

Start with the same update expression we used for the central difference method. We use this expression to replace the values at the boundary with those of a purely outbound wave.

Ψ(X, t+Δt) = Ψ(X+ϵΔx, t) + ((v_p Δt − Δx) / (v_p Δt + Δx)) (Ψ(X+ϵΔx, t+Δt) − Ψ(X, t))

where

ϵ = { +1 if X = 0; −1 if X = L }

We immediately see an issue: this equation requires Ψ at two distinct times, while the leapfrog method performs its updates in place and does not retain values from the prior time step. Luckily, this is easily resolved.

We create a storage buffer to hold the old wave function values. This buffer is written by the finite difference shader, and read by the boundary value shader.


  oldWaveFunctionBuffer = device.createBuffer({
    label: "Old Wave function",
    size: 2*xResolution*Float32Array.BYTES_PER_ELEMENT,
    usage: debug ? GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC : GPUBufferUsage.STORAGE
  });
          

The finite difference and boundary value shaders have slightly different usage requirements for the wave function buffers; nonetheless, we use common bind groups for them. It would be wasteful to create a new bindGroup and bindGroupLayout for the boundary value shader when we already have ones that meet or exceed its requirements.


  waveFunctionBindGroupLayout = device.createBindGroupLayout({
    label: "Wave function data.",
    entries: [
      {
        binding: 0,
        visibility: GPUShaderStage.COMPUTE,
        buffer: {
          type: "storage"
        }
      },
      {
        binding: 1,
        visibility: GPUShaderStage.COMPUTE,
        buffer: {
          type: "storage"
        }
      }
    ]
  });
          

  waveFunctionBindGroup = device.createBindGroup({
    label: "Wave function buffer binding",
    layout: waveFunctionBindGroupLayout,
    entries: [
      {
        binding: 0,
        resource: {
          buffer: waveFunctionBuffer
        }
      },
      {
        binding: 1,
        resource: {
          buffer: oldWaveFunctionBuffer
        }
      }
    ]
  });
          

Both of the compute shaders are modified to include a storage buffer holding the old values for the wave function.


  // Group 1, Current and old wave function with Ψ_r at t and Ψ_i at t+Δt/2.
  @group(1) @binding(0) var<storage, read_write> waveFunction : array<vec2f>;
  @group(1) @binding(1) var<storage, read_write> oldWaveFunction: array<vec2f>;
          

The FDTD solver simply stashes the wave function in this new buffer before updating it.


  oldWaveFunction[index].y = waveFunctionAtX.y;
  waveFunction[index].y = waveFunctionAtX.y
                           + ((waveFunctionAtXPlusDx.x - 2.0*waveFunctionAtX.x + waveFunctionAtXMinusDx.x)
                              / dx22 - V*waveFunctionAtX.x) * parameters.dt;
          
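The stash-then-update pattern is easy to see in a CPU sketch. The helper name below is hypothetical, and dx22 is assumed to be 2Δx², matching how the shader uses it; the in-place overwrite is safe because this step reads only the real part at neighboring points:

```javascript
// Illustrative CPU version of the imaginary-part leapfrog step:
// stash Ψ_i(t) into oldIm, then overwrite im in place.
function leapfrogImaginaryStep(re, im, oldIm, dt, dx, V) {
  const dx22 = 2 * dx * dx; // assumed meaning of the shader's dx22 factor
  for (let i = 1; i < re.length - 1; i++) {
    oldIm[i] = im[i];       // stash the old value for the boundary shader
    im[i] = im[i] + ((re[i + 1] - 2 * re[i] + re[i - 1]) / dx22 - V[i] * re[i]) * dt;
  }
}

const re = [0, 1, 0], im = [0, 0, 0], oldIm = [0, 0, 0];
leapfrogImaginaryStep(re, im, oldIm, 1.0, 1.0, [0, 0, 0]);
console.log(oldIm[1], im[1]); // 0 -1
```

The key point is the ordering: the old value is copied out before the in-place write, so the boundary shader can later see both Ψ(t) and Ψ(t+Δt).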

The updates for the Mur boundary shader are a bit more complex. Remember that the leapfrog method first updates the real part of the wave function, then the imaginary part.

  1. Update the real part of the wave function.
  2. Update the imaginary part of the wave function.

We need to modify it to enforce the relevant boundary condition at the end of each step.

  1. Update the real part of the wave function.
    1. Enforce the real part of the boundary conditions.
  2. Update the imaginary part of the wave function.
    1. Enforce the imaginary part of the boundary conditions.

Take another look at the boundary update expression. There is no explicit complex term, so we can break it directly into real and imaginary parts; for the real part:

Ψ_r(X, t+Δt) = Ψ_r(X+ϵΔx, t) + ((v_p Δt − Δx) / (v_p Δt + Δx)) (Ψ_r(X+ϵΔx, t+Δt) − Ψ_r(X, t))

We need explicit, separate methods to update the real and imaginary parts of the wave function at the boundary. These have exactly the same form as the original Mur boundary code; they simply separate out the updates for the real and imaginary parts of the wave function.


  @compute @workgroup_size(2)
  fn recomputeRealBoundary(@builtin(global_invocation_id) global_id : vec3u) {
    // Index will be 0 at the left edge, and 1 at the right edge.
    let index = i32(global_id.x);
    let storageBufferIndex = index*(i32(parameters.xResolution)-1);
    // +1 at the left edge, -1 at the right edge, matching ϵ.
    let offset = 1 - 2*index;
    let dx = parameters.length / f32(parameters.xResolution-1);

    waveFunction[storageBufferIndex].x = oldWaveFunction[storageBufferIndex+offset].x
                 + ((phaseVelocity*parameters.dt-dx)/(phaseVelocity*parameters.dt+dx))
                    *(waveFunction[storageBufferIndex+offset].x-oldWaveFunction[storageBufferIndex].x);
  }
          
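As a sanity check on the index arithmetic, the same expressions can be evaluated in plain JavaScript (a hypothetical helper, not part of the simulation): thread 0 lands on the left edge and looks inward with offset +1, thread 1 lands on the right edge and looks inward with offset −1, matching ϵ above.

```javascript
// Mirror the shader's edge-index arithmetic for the two boundary threads.
function edgeIndices(index, xResolution) {
  return {
    storageBufferIndex: index * (xResolution - 1), // 0 or xResolution-1
    offset: 1 - 2 * index,                         // +1 or -1, i.e. ϵ
  };
}

console.log(edgeIndices(0, 256)); // { storageBufferIndex: 0, offset: 1 }
console.log(edgeIndices(1, 256)); // { storageBufferIndex: 255, offset: -1 }
```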

The compute pipeline identifies which shader we use, as well as the specific entry point into that shader. This pipeline executes the real part of the boundary value update, reusing the parametersBindGroupLayout and waveFunctionBindGroupLayout from the finite difference compute shader.


  boundaryValueRealPipeline = device.createComputePipeline({
    label: "Real part update pipeline",
    layout: device.createPipelineLayout({
      bindGroupLayouts: [
        parametersBindGroupLayout,
        waveFunctionBindGroupLayout,
        boundaryValueParametersLayout
      ]
    }),
    compute: {
      module: boundaryValueShaderModule,
      entryPoint: "recomputeRealBoundary"
    }
  });
          

It is convenient to set up methods to append the boundary value update to the command stream.


  /**
   * Append a compute pass to implement the boundary conditions.
   *
   * @param {GPUCommandEncoder} commandEncoder The command encoder currently in use to collect GPU commands.
   * @param {GPUBindGroup} waveFunctionBindGroup The bind group, describing which wave function buffers
   *                       are bound to which indices, currently in use.
   */
  makeRealComputePass(commandEncoder, waveFunctionBindGroup)
  {
    const passEncoder = commandEncoder.beginComputePass();
    passEncoder.setPipeline(boundaryValueRealPipeline);
    passEncoder.setBindGroup(0, parametersBindGroup);
    passEncoder.setBindGroup(1, waveFunctionBindGroup);
    passEncoder.setBindGroup(2, boundaryValueBindGroup);
    // We need just one two-thread workgroup: one thread for each edge.
    passEncoder.dispatchWorkgroups(1);
    passEncoder.end();
  }
          

This all comes together when we interleave the boundary value shader invocations with the FDTD shader invocations. This loop executes count finite difference steps between animation frames. The code block references the makeRealComputePass and makeImaginaryComputePass methods; we have traced the development of makeRealComputePass, and makeImaginaryComputePass follows a similar path.

The bcEnabled flag is set to true for the "Launch with absorbing boundary conditions" case, and set to false for the "Launch without boundary conditions" case.


  running = true;

  // Recreate this because it cannot be reused after finish is invoked.
  const commandEncoder = device.createCommandEncoder();
  const workgroupCountX = Math.ceil(xResolution / WORKGROUP_SIZE);
  for (let i=0; i<count && running; i++)
  {
    const realPassEncoder = commandEncoder.beginComputePass();
    realPassEncoder.setPipeline(realPartTimeStep);
    realPassEncoder.setBindGroup(0, parametersBindGroup);
    realPassEncoder.setBindGroup(1, waveFunctionBindGroup);
    realPassEncoder.dispatchWorkgroups(workgroupCountX);
    realPassEncoder.end();

    if (bcEnabled) {
      makeRealComputePass(commandEncoder, waveFunctionBindGroup);
    }

    const imaginaryPassEncoder = commandEncoder.beginComputePass();
    imaginaryPassEncoder.setPipeline(imaginaryPartTimeStep);
    imaginaryPassEncoder.setBindGroup(0, parametersBindGroup);
    imaginaryPassEncoder.setBindGroup(1, waveFunctionBindGroup);
    imaginaryPassEncoder.dispatchWorkgroups(workgroupCountX);
    imaginaryPassEncoder.end();

    if (bcEnabled) {
      makeImaginaryComputePass(commandEncoder, waveFunctionBindGroup);
    }
  }

  // Submit GPU commands.
  const gpuCommands = commandEncoder.finish();
  device.queue.submit([gpuCommands]);
          

Task Manager

As we might expect, we see only slight differences in the performance plots with and without the boundary conditions in effect.

Staggered Time With Boundary Conditions

With boundary conditions enabled we double the number of compute passes. However, the extra passes for the boundary conditions are each a single two-thread workgroup, one thread per edge, and so generate little additional work.
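To put "little additional work" in perspective, a back-of-envelope count (with an illustrative grid size, not a measured figure) compares the boundary pass to the interior pass:

```javascript
// Each interior pass runs one invocation per grid point; each boundary
// pass runs exactly two invocations, one per edge.
const xResolution = 1024; // illustrative grid size
const boundaryThreads = 2;
const extraFraction = boundaryThreads / xResolution;
console.log((100 * extraFraction).toFixed(2) + "%"); // "0.20%"
```

The doubling of pass count mostly costs pass setup overhead, not arithmetic, which is consistent with the nearly identical utilization plots.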

With boundary conditions enabled, the graphics engine utilization fluctuates among the upper 40%'s.
With or without boundary conditions enabled, the memory copy engine workload is light.

Without Boundary Conditions

Disabling the boundary conditions reduces the compute load only slightly. Except in the few cases where you are certain they will have no effect, such as a particle in a box, I would simply leave them enabled.

With boundary conditions disabled, the graphics engine utilization peaks at just over 40%, then levels off at about 38% for a while.
With or without boundary conditions enabled, the memory copy engine workload is light.
This work is licensed under a Creative Commons Attribution 4.0 International License.