Perform Inference on AIR-T¶
Ensure that you have successfully completed the steps in
Training the Model and have a trained DNN in the .onnx
file format.
Installing AirPack on the AIR-T¶
Deepwave recommends using an Anaconda virtual environment when operating on the AIR-T. This ensures that the Python environment is set up correctly and does not affect the system Python. Anaconda is natively supported on AirStack 0.4.0+. We recommend upgrading to the latest version of AirStack before using AirPack on the AIR-T.
Copy the AirPack directory to a location on the AIR-T's file system
From the AIR-T, open a terminal
Create a conda environment called airpack using the AirPack/environments/airpack_airt.yml file. You may modify this file to fit your application.
$ cd AirPack/environments
$ conda env create -f airpack_airt.yml
$ conda activate airpack
Install AirPack on the AIR-T:
$ pip install -e <path-to-AirPack>/AirPack
where <path-to-AirPack> is the location to which you copied the AirPack directory on the AIR-T. You should now be able to import the AirPack package from your conda Python environment; a quick import check is shown below.
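As a quick sanity check (a minimal sketch, not part of the official installation steps), you can confirm that the package and its deployment helpers import cleanly from the new environment:

$ conda activate airpack
$ python -c "from airpack.deploy import trt, trt_utils; print('AirPack import OK')"

If the message prints without an ImportError, the editable install succeeded.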
Perform Inference with Trained Model¶
You may use the AirPack/airpack_scripts/airt/run_airt_inference.py script to evaluate the performance of the model and display the result. This workflow assumes that you have a trained model in the .onnx file format. More information is available in the API Documentation section of the Table of Contents.
Copy the .onnx file to the AIR-T.
Activate the airpack conda environment:
$ conda activate airpack
Run the inference script:
$ python AirPack/airpack_scripts/airt/run_airt_inference.py -m <path_to_onnx_file>
This script will automatically create an optimized plan file and then execute that plan file on the received RF samples. For future runs, the plan file may be used directly by substituting the .plan file for the .onnx file in the -m option of the above command; a sketch of building the plan file ahead of time is shown after the example output below. The output of this script will look similar to the following:
$ python airpack_scripts/airt/run_airt_inference.py -m output/tf2/saved_model.plan
[TensorRT] VERBOSE: Deserialize required 1579412 microseconds.
TensorRT Inference Settings:
  Batch Size           : 128
  Explicit Batch       : True
  Input Layer
    Name               : input
    Shape              : (128, 4096)
    dtype              : float32
  Output Layer
    Name               : output
    Shape              : (128, 12)
    dtype              : float32
  Receiver Output Size : 524,288 samples
  TensorRT Input Size  : 524,288 samples
  TensorRT Output Size : 1,536 samples
linux; GNU C++ version 7.3.0; Boost_106501; UHD_003.010.003.000-0-unknown
Receiving Data
Signal classes found for batch 0 = [0]
Signal classes found for batch 1 = [0]
Signal classes found for batch 2 = [0]
Signal classes found for batch 3 = [0]
Signal classes found for batch 4 = [0]
Signal classes found for batch 5 = [0]
Signal classes found for batch 6 = [0]
Signal classes found for batch 7 = [0]
Signal classes found for batch 8 = [0]
Signal classes found for batch 9 = [0]
Signal classes found for batch 10 = [0]
Signal classes found for batch 11 = [0]
Signal classes found for batch 12 = [0]
Signal classes found for batch 13 = [0]
Signal classes found for batch 14 = [0]
Signal classes found for batch 15 = [0]
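If you prefer to build the optimized engine ahead of time rather than letting the script convert the model on its first run, a minimal sketch along the following lines uses the same trt.onnx2plan helper that the script calls internally. The file name saved_model.onnx, the input length of 4096, and the batch size of 128 are illustrative values taken from the example output above; adjust them to your model.

# Sketch: build a TensorRT .plan engine from a trained .onnx model ahead of time.
# input_len is the real-valued input length, i.e. 2 x complex samples per
# inference window (4096 = 2 x 2048 in this example).
from airpack.deploy import trt

plan_file = trt.onnx2plan('saved_model.onnx',   # illustrative file name
                          input_len=4096,
                          max_batch_size=128)
print(f'Optimized engine written to: {plan_file}')

Subsequent runs can then pass the resulting .plan file to the -m option directly, which skips the optimization step.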
Python Code¶
This application is used as an example of how to deploy a neural network for inference on the AIR-T. The method provided here leverages the PyCUDA interface for the shared memory buffer between the radio and the inference engine. PyCUDA is installed by default in AirStack.
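For background, the shared memory buffer is a page-locked host allocation that is also mapped into the GPU's address space, so received samples can be consumed by the GPU without an explicit host-to-device copy. The following minimal PyCUDA sketch illustrates that general idea; it is an assumption about what trt_utils.MappedBuffer does internally and is independent of the script below, which handles buffer creation for you.

import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401 -- creates and activates a CUDA context

# Pinned host memory mapped into the GPU address space (zero-copy)
host_buff = cuda.pagelocked_empty(4096, np.float32,
                                  mem_flags=cuda.host_alloc_flags.DEVICEMAP)
# Device pointer usable as a CUDA kernel argument or TensorRT binding
device_ptr = np.intp(host_buff.base.get_device_pointer())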
# Copyright 2021, Deepwave Digital, Inc.
# SPDX-License-Identifier: BSD-3-Clause
import os
import pathlib
from typing import Union
import numpy as np
from SoapySDR import Device, SOAPY_SDR_RX, SOAPY_SDR_CF32, SOAPY_SDR_OVERFLOW
from airpack.deploy import trt, trt_utils
def airt_infer(model_file: Union[str, os.PathLike], cplx_input_len: int = 2048,
               batch_size: int = 128, num_batches: int = -1, samp_rate: float = 31.25e6,
               freq: float = 2400e6, chan: int = 0) -> None:
    """ Function to receive samples and perform inference using the AIR-T

    The function will input a model file (onnx, uff, or plan), optimize it if necessary
    using TensorRT, setup the shared memory infrastructure between the radio buffer and
    the cudnn/TensorRT inference, configure the AIR-T's radio hardware, stream samples
    from the radio to the inference engine, and print the inference result.

    :param model_file: Trained neural network model file (onnx, uff, or plan)
    :param cplx_input_len: Complex samples per inference
    :param batch_size: Mini-batch size for inference
    :param num_batches: Number of batches to execute. -1 -> Infinite
    :param samp_rate: Radio's sample rate in samples per second
    :param freq: Tuning frequency of radio in Hz
    :param chan: Radio channel to receive on (0 or 1)
    :return: None
    """
    model_file = pathlib.Path(model_file)  # convert from string if needed
    real_input_len = 2 * cplx_input_len

    # Optimize model if given an onnx or uff file
    if model_file.suffix == '.plan':
        plan_file_name = model_file
    elif model_file.suffix == '.onnx':
        plan_file_name = trt.onnx2plan(model_file,
                                       input_len=real_input_len,
                                       max_batch_size=batch_size)
    elif model_file.suffix == '.uff':
        plan_file_name = trt.uff2plan(model_file,
                                      input_len=real_input_len,
                                      max_batch_size=batch_size)
    else:
        raise ValueError(f'Unknown file extension {model_file.suffix}')

    # Setup the CUDA context
    trt_utils.make_cuda_context()

    # Setup the shared memory buffer
    buff_len = 2 * cplx_input_len * batch_size
    sample_buffer = trt_utils.MappedBuffer(buff_len, np.float32)

    # Set up the inference engine. Note that the output buffers are created for
    # us when we create the inference object.
    dnn = trt.TrtInferFromPlan(plan_file_name, batch_size, sample_buffer)

    # Create, configure, and activate AIR-T's radio hardware
    sdr = Device()
    sdr.setGainMode(SOAPY_SDR_RX, chan, True)
    sdr.setSampleRate(SOAPY_SDR_RX, chan, samp_rate)
    sdr.setFrequency(SOAPY_SDR_RX, chan, freq)
    rx_stream = sdr.setupStream(SOAPY_SDR_RX, SOAPY_SDR_CF32, [chan])
    sdr.activateStream(rx_stream)

    # Start receiving signals and performing inference
    print('Receiving Data')
    ctr = 0
    while ctr < num_batches or num_batches == -1:
        try:
            # Receive samples from the AIR-T buffer
            sr = sdr.readStream(rx_stream, [sample_buffer.host], buff_len)
            if sr.ret == SOAPY_SDR_OVERFLOW:  # Data was dropped, i.e., overflow
                print('O', end='', flush=True)
                continue
            # Run samples through neural network
            dnn.feed_forward()
            # Get data from DNN output layer.
            output_arr = dnn.output_buff.host
            # Reshape into matrix of shape = (batch_size, output_len)
            output_mat = output_arr.reshape(batch_size, -1)
            # Determine what the predicted signal class is for each window
            infer_result = np.argmax(output_mat, axis=1)
            # Determine the unique class values found in current batch
            classes_found = np.unique(infer_result)
            print(f'Signal classes found for batch {ctr} = {classes_found}')
        except KeyboardInterrupt:
            break
        ctr += 1
    sdr.closeStream(rx_stream)
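As a usage illustration (a minimal sketch, not the actual command-line interface of run_airt_inference.py), the function above could be invoked from a small __main__ wrapper such as the following; the -n flag here is purely illustrative.

if __name__ == '__main__':
    # Illustrative invocation only; adjust the model path and the radio
    # settings (sample rate, frequency, channel) to match your setup.
    import argparse

    parser = argparse.ArgumentParser(description='Run DNN inference on the AIR-T')
    parser.add_argument('-m', '--model_file', required=True,
                        help='Trained model file (.onnx, .uff, or .plan)')
    parser.add_argument('-n', '--num_batches', type=int, default=16,
                        help='Number of batches to run (-1 for infinite)')
    args = parser.parse_args()
    airt_infer(args.model_file, num_batches=args.num_batches)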