This application is an example of how to deploy a neural network for inference on the AIR-T. The method shown here leverages the PyCUDA interface for the shared memory buffer. PyCUDA is installed by default in AirStack.
airt_infer(model_file, cplx_input_len=2048, batch_size=128, num_batches=-1, samp_rate=31250000.0, freq=2400000000.0, chan=0, input_node_name='input', input_port_name='')
Function to receive samples and perform inference using the AIR-T
The function will read a model file (onnx, uff, or plan), optimize it if necessary using TensorRT, set up the shared memory infrastructure between the radio buffer and the cuDNN/TensorRT inference engine, configure the AIR-T's radio hardware, stream samples from the radio to the inference engine, and print the inference result.
Example usage:

    from airpack_scripts.airt.run_airt_inference import airt_infer

    onnx_file = 'saved_model.onnx'  # Trained model file
    airt_infer(
        onnx_file,
        cplx_input_len=2048,
        batch_size=128,
        num_batches=-1,
        samp_rate=31.25e6,
        freq=2.4e9,
        chan=0,
        input_node_name='input',
        input_port_name='',
    )
This function expects the input length to be specified as a number of complex samples even though the DNN operates on real samples. The number of real samples is calculated as real_input_len = 2 * cplx_input_len.
The AIR-T's radio produces data of type SOAPY_SDR_CF32, which is equivalent to np.complex64. Because a np.complex64 value has the same memory layout as two np.float32 values, the GPU memory buffer is defined with twice as many elements as the SDR buffer, but with np.float32 dtype. This is done because the neural network expects np.float32 input; the SOAPY_SDR_CF32 samples can be copied directly into the np.float32 buffer.
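As a small illustration (not code from AirPack itself), NumPy can demonstrate why this direct copy is safe: viewing a np.complex64 array as np.float32 exposes the same bytes as interleaved I/Q values, with twice the element count.

```python
import numpy as np

# A buffer of complex64 samples, as a SOAPY_SDR_CF32 stream would produce
cplx = np.array([1.0 + 2.0j, 3.0 - 4.0j], dtype=np.complex64)

# Reinterpret the same bytes as interleaved float32 (I, Q, I, Q, ...)
real = cplx.view(np.float32)

print(real)       # [ 1.  2.  3. -4.]
print(real.size)  # 4 -> twice the number of complex samples
```

No data is copied by the view; the float32 array aliases the complex buffer, which is the same property the shared GPU buffer relies on.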
This utility uses PyCUDA to create a shared (zero-copy) memory buffer that receives samples from the AIR-T's radio to be fed into the DNN. Note that this buffer is shared between the SDR and the DNN to prevent unnecessary copies. The buffer fed into the DNN for inference is a 1-dimensional array containing the samples for an entire mini-batch. For example, cplx_input_len = 2048 complex samples and batch_size = 128 yield a buffer of 2048 * 128 = 262,144 complex samples.
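The buffer-size arithmetic above can be sketched directly; the variable names below mirror the function parameters but are chosen for illustration, not taken from the AirPack source.

```python
# Buffer sizing for one inference mini-batch (defaults from airt_infer)
cplx_input_len = 2048  # complex samples per inference
batch_size = 128       # inferences per mini-batch

buff_len = cplx_input_len * batch_size  # complex samples in the shared buffer
real_len = 2 * buff_len                 # float32 values seen by the DNN

print(buff_len)  # 262144
print(real_len)  # 524288
```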
- model_file (Union[str, os.PathLike]) – Trained neural network model file (onnx, uff, or plan)
- cplx_input_len (int) – Complex samples per inference
- batch_size (int) – Mini-batch size for inference
- num_batches (int) – Number of batches to execute. -1 -> Infinite
- samp_rate (float) – Radio's sample rate in samples per second
- freq (float) – Tuning frequency of radio in Hz
- chan (int) – Radio channel to receive on (0 or 1)
- input_node_name (str) – Name of input node
- input_port_name (str) – Name of input node port (for TensorFlow 1)
- Return type