setup_binding_shapes.py

@rmccorm4
Last active May 11, 2020 18:22

Revisions

  1. rmccorm4 revised this gist May 11, 2020. 1 changed file with 6 additions and 5 deletions.
    11 changes: 6 additions & 5 deletions setup_binding_shapes.py
    @@ -12,13 +12,14 @@ def setup_binding_shapes(

         assert context.all_binding_shapes_specified

    -    host_outputs = [None] * len(output_binding_idxs)
    -    device_outputs = [None] * len(output_binding_idxs)
    -    for i, binding_index in enumerate(output_binding_idxs):
    +    host_outputs = []
    +    device_outputs = []
    +    for binding_index in output_binding_idxs:
             output_shape = context.get_binding_shape(binding_index)
             # Allocate buffers to hold output results after copying back to host
    -        host_outputs[i] = np.empty(output_shape, dtype=np.float32)
    +        buffer = np.empty(output_shape, dtype=np.float32)
    +        host_outputs.append(buffer)
             # Allocate output buffers on device
    -        device_outputs[i] = cuda.mem_alloc(host_outputs[i].nbytes)
    +        device_outputs.append(cuda.mem_alloc(buffer.nbytes))

         return host_outputs, device_outputs
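    This revision replaces the preallocated [None] * n slots and index bookkeeping with plain appends, which keeps the host and device lists in lockstep without tracking i. For context, a minimal sketch of how the returned buffers would feed an inference call; device_inputs is an assumption (allocated elsewhere, one buffer per input binding), and the ordering assumes input bindings precede output bindings, which is the common engine layout:

    import pycuda.driver as cuda

    # Copy each host input into its (assumed, pre-allocated) device buffer.
    for h_in, d_in in zip(host_inputs, device_inputs):
        cuda.memcpy_htod(d_in, h_in)

    # TensorRT expects one device pointer per binding index; concatenating
    # here assumes input bindings come before output bindings in the engine.
    bindings = [int(ptr) for ptr in device_inputs + device_outputs]
    context.execute_v2(bindings)

    # Copy results back into the numpy arrays from setup_binding_shapes().
    for h_out, d_out in zip(host_outputs, device_outputs):
        cuda.memcpy_dtoh(h_out, d_out)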
  2. rmccorm4 created this gist May 10, 2020.
    24 changes: 24 additions & 0 deletions setup_binding_shapes.py
    @@ -0,0 +1,24 @@
    +def setup_binding_shapes(
    +    engine: trt.ICudaEngine,
    +    context: trt.IExecutionContext,
    +    host_inputs: List[np.ndarray],
    +    input_binding_idxs: List[int],
    +    output_binding_idxs: List[int],
    +):
    +    # Explicitly set the dynamic input shapes, so the dynamic output
    +    # shapes can be computed internally
    +    for host_input, binding_index in zip(host_inputs, input_binding_idxs):
    +        context.set_binding_shape(binding_index, host_input.shape)
    +
    +    assert context.all_binding_shapes_specified
    +
    +    host_outputs = [None] * len(output_binding_idxs)
    +    device_outputs = [None] * len(output_binding_idxs)
    +    for i, binding_index in enumerate(output_binding_idxs):
    +        output_shape = context.get_binding_shape(binding_index)
    +        # Allocate buffers to hold output results after copying back to host
    +        host_outputs[i] = np.empty(output_shape, dtype=np.float32)
    +        # Allocate output buffers on device
    +        device_outputs[i] = cuda.mem_alloc(host_outputs[i].nbytes)
    +
    +    return host_outputs, device_outputs
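    For completeness, a hedged calling sketch. The gist itself does not show its imports (trt, np, cuda, List), so they are supplied here; the engine path "model.engine" and the input shape are illustrative assumptions, and the engine must have been built with an optimization profile that covers that shape:

    from typing import List

    import numpy as np
    import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
    import pycuda.driver as cuda
    import tensorrt as trt

    # Hypothetical engine file; any engine built with dynamic shapes works.
    with open("model.engine", "rb") as f, trt.Runtime(trt.Logger()) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # Partition binding indices into inputs and outputs.
    input_binding_idxs = [i for i in range(engine.num_bindings)
                          if engine.binding_is_input(i)]
    output_binding_idxs = [i for i in range(engine.num_bindings)
                           if not engine.binding_is_input(i)]

    # One concrete batch for a network with a single dynamic input.
    host_inputs = [np.random.random((1, 3, 224, 224)).astype(np.float32)]

    host_outputs, device_outputs = setup_binding_shapes(
        engine, context, host_inputs, input_binding_idxs, output_binding_idxs,
    )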