Utility code for performing inference using the TensorRT framework with settings optimized for AIR-T hardware.
Initializes a CUDA context for use with the selected GPU and makes it active.
The context is created with flags that enable device-mapped (pinned) memory, which supports zero-copy operations on the AIR-T.
gpu_index (int) – Index of the GPU to use; defaults to the first GPU (index 0)
- Return type: the created (and activated) CUDA context
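A minimal sketch of how such a context could be created, assuming the pycuda bindings (the function name make_cuda_context and the lazy import are illustrative, not part of this library's API; pycuda's ctx_flags.MAP_HOST is the flag that permits device-mapped memory):

```python
def make_cuda_context(gpu_index=0):
    """Hypothetical sketch: create and activate a CUDA context whose
    flags permit device-mapped (zero-copy) allocations."""
    # Assumes the pycuda package; imported lazily so the function can be
    # defined even on machines without CUDA installed.
    import pycuda.driver as cuda

    cuda.init()
    device = cuda.Device(gpu_index)
    # MAP_HOST asks the driver to allow host allocations to be mapped
    # into the device address space.
    return device.make_context(flags=cuda.ctx_flags.MAP_HOST)
```

Note that pycuda's make_context leaves the new context current on the calling thread, matching the "makes it active" behavior described above.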
A device-mapped memory buffer for sharing data between CPU and GPU.
Once created, the host field can be used to access the memory from the CPU as a numpy.ndarray, and the device field can be used to access the memory from the GPU.
# Create a buffer of 16 single-precision floats
buffer = MappedBuffer(num_elems=16, dtype=numpy.float32)
# Zero the buffer by writing to it on CPU
buffer.host[:] = 0.0
# Pass the device pointer to an API that works with GPU buffers
func_that_uses_gpu_buffer(buffer.device)
Device-mapped memory is meant for Jetson embedded GPUs like the one found on the AIR-T, where both the host and device pointers refer to the same physical memory. Using this type of memory buffer on desktop GPUs will be very slow.
num_elems (int) – Number of elements in the created buffer
dtype (numpy.dtype) – Data type of an element (e.g., numpy.float32)
host (numpy.ndarray) – Access to the buffer from the CPU
device (CUdeviceptr) – Access to the buffer from the GPU
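To make the host/device pairing concrete, here is a hypothetical CPU-only stand-in (plain NumPy, no CUDA; the class name MappedBufferSketch is invented for illustration). On the AIR-T, the real buffer would be allocated as device-mapped memory so that both fields refer to the same physical storage:

```python
import numpy as np

class MappedBufferSketch:
    """Hypothetical CPU-only stand-in for a device-mapped buffer.

    A plain NumPy array plays the role of the shared memory: `host` is
    the CPU view and `device` is the raw address of that same storage,
    standing in for the CUdeviceptr of a real zero-copy allocation.
    """

    def __init__(self, num_elems, dtype):
        # CPU-accessible view of the buffer
        self.host = np.zeros(num_elems, dtype=dtype)
        # Raw address of the same storage (stand-in for CUdeviceptr)
        self.device = self.host.ctypes.data

# Mirrors the usage example above: writes via `host` are immediately
# visible at the address held in `device`, since both name one buffer.
buf = MappedBufferSketch(num_elems=16, dtype=np.float32)
buf.host[:] = 0.0
```

This sketch only illustrates the interface; it performs no GPU allocation and would not behave like zero-copy memory on a desktop GPU.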