Perform Inference on AIR-T

Ensure that you have successfully completed the steps in Training the Model and have a trained DNN in the .onnx file format.


Installing AirPack on the AIR-T

Deepwave recommends using an Anaconda virtual environment when operating on the AIR-T. This ensures that the Python environment is set up correctly and does not affect the system Python. Anaconda is natively supported on AirStack 0.4.0+. We recommend upgrading to the latest version of AirStack before using AirPack on the AIR-T.

  • Copy the AirPack directory to a location on the AIR-T’s file system

  • From the AIR-T, open a terminal

  • Create a conda environment called airpack using the AirPack/environments/airpack_airt.yml file. You may modify this file to fit your application.

    $ cd AirPack/environments
    $ conda env create -f airpack_airt.yml
    $ conda activate airpack
    
  • Install AirPack on the AIR-T:

    $ pip install -e <path-to-AirPack>/AirPack
    

    where `<path-to-AirPack>` is the location to which you copied the AirPack directory on the AIR-T. You should now be able to import the airpack package from your conda Python environment.
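
    As a quick check, importing the package from the activated environment's Python interpreter should succeed; the session below is purely illustrative:

    >>> import airpack
    >>> airpack.__file__   # path of the editable AirPack install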

Perform Inference with Trained Model

You may use the AirPack/airpack_scripts/airt/run_airt_inference.py script to evaluate the performance of the model and display the result. This workflow assumes that you have a trained model in the .onnx file format. More information is available in the API Documentation section of the Table of Contents:

  • Copy the .onnx file to the AIR-T.

  • Activate the airpack conda environment:

    $ conda activate airpack
    
  • Run the inference script

    $ python AirPack/airpack_scripts/airt/run_airt_inference.py -m <path_to_onnx_file>
    

    This script automatically creates an optimized TensorRT plan file and then executes that plan on the received RF samples. For subsequent runs, the plan file may be used directly by passing the .plan file instead of the .onnx file to the -m option in the command above.

  • The output of this script will look similar to the following:

    $ python airpack_scripts/airt/run_airt_inference.py -m output/tf2/saved_model.plan
    [TensorRT] VERBOSE: Deserialize required 1579412 microseconds.
    TensorRT Inference Settings:
      Batch Size           : 128
      Explicit Batch       : True
      Input Layer
        Name               : input
        Shape              : (128, 4096)
        dtype              : float32
      Output Layer
        Name               : output
        Shape              : (128, 12)
        dtype              : float32
      Receiver Output Size : 524,288 samples
      TensorRT Input Size  : 524,288 samples
      TensorRT Output Size : 1,536 samples
    linux; GNU C++ version 7.3.0; Boost_106501; UHD_003.010.003.000-0-unknown
    
    Receiving Data
    Signal classes found for batch 0 = [0]
    Signal classes found for batch 1 = [0]
    Signal classes found for batch 2 = [0]
    Signal classes found for batch 3 = [0]
    Signal classes found for batch 4 = [0]
    Signal classes found for batch 5 = [0]
    Signal classes found for batch 6 = [0]
    Signal classes found for batch 7 = [0]
    Signal classes found for batch 8 = [0]
    Signal classes found for batch 9 = [0]
    Signal classes found for batch 10 = [0]
    Signal classes found for batch 11 = [0]
    Signal classes found for batch 12 = [0]
    Signal classes found for batch 13 = [0]
    Signal classes found for batch 14 = [0]
    Signal classes found for batch 15 = [0]
    

Python Code

This application serves as an example of how to deploy a neural network for inference on the AIR-T. The method provided here leverages the PyCUDA interface for the shared memory buffer. PyCUDA is installed by default in AirStack.
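
The sketch below illustrates, as an assumption about the pattern that trt_utils.MappedBuffer wraps, how PyCUDA can allocate a host-mapped (zero-copy) buffer that both the radio and the inference engine can access; the buffer size and flags shown are illustrative only.

# Sketch only: zero-copy buffer allocation with PyCUDA (assumed pattern)
import numpy as np
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda

num_floats = 2 * 2048 * 128  # 2 float32 values per complex sample, per batch
host_buff = cuda.pagelocked_empty(num_floats, np.float32,
                                  mem_flags=cuda.host_alloc_flags.DEVICEMAP)
device_ptr = np.intp(host_buff.base.get_device_pointer())
# The radio writes samples into host_buff; the inference engine reads from
# device_ptr with no explicit host-to-device copy on the AIR-T's shared memory.

The complete inference script follows.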

# Copyright 2021, Deepwave Digital, Inc.
# SPDX-License-Identifier: BSD-3-Clause

import os
import pathlib
from typing import Union
import numpy as np
from SoapySDR import Device, SOAPY_SDR_RX, SOAPY_SDR_CF32, SOAPY_SDR_OVERFLOW
from airpack.deploy import trt, trt_utils

def airt_infer(model_file: Union[str, os.PathLike], cplx_input_len: int = 2048,
               batch_size: int = 128, num_batches: int = -1, samp_rate: float = 31.25e6,
               freq: float = 2400e6, chan: int = 0) -> None:
    """ Function to receive samples and perform inference using the AIR-T

    The function will input a model file (onnx, uff, or plan), optimize it if necessary
    using TensorRT, setup the shared memory infrastructure between the radio buffer and
    the cudnn/TensorRT inference, configure the AIR-T's radio hardware, stream samples
    from the radio to the inference engine, and print the inference result.

    :param model_file:      Trained neural network model file (onnx, uff, or plan)
    :param cplx_input_len:  Complex samples per inference
    :param batch_size:      Mini-batch size for inference
    :param num_batches:     Number of batches to execute. -1 -> Infinite
    :param samp_rate:       Radio's sample rate in samples per second
    :param freq:            Tuning frequency of radio in Hz
    :param chan:            Radio channel to receive on (0 or 1)
    :return:                None
    """

    model_file = pathlib.Path(model_file)  # convert from string if needed
    real_input_len = 2 * cplx_input_len

    # Optimize model if given an onnx or uff file
    if model_file.suffix == '.plan':
        plan_file_name = model_file
    elif model_file.suffix == '.onnx':
        plan_file_name = trt.onnx2plan(model_file,
                                       input_len=real_input_len,
                                       max_batch_size=batch_size)
    elif model_file.suffix == '.uff':
        plan_file_name = trt.uff2plan(model_file,
                                      input_len=real_input_len,
                                      max_batch_size=batch_size)
    else:
        raise ValueError(f'Unknown file extension {model_file.suffix}')

    # Setup the CUDA context
    trt_utils.make_cuda_context()

    # Setup the shared memory buffer
    buff_len = real_input_len * batch_size
    sample_buffer = trt_utils.MappedBuffer(buff_len, np.float32)

    # Set up the inference engine. Note that the output buffers are created for
    # us when we create the inference object.
    dnn = trt.TrtInferFromPlan(plan_file_name, batch_size, sample_buffer)

    # Create, configure, and activate AIR-T's radio hardware
    sdr = Device()
    sdr.setGainMode(SOAPY_SDR_RX, chan, True)
    sdr.setSampleRate(SOAPY_SDR_RX, chan, samp_rate)
    sdr.setFrequency(SOAPY_SDR_RX, chan, freq)
    rx_stream = sdr.setupStream(SOAPY_SDR_RX, SOAPY_SDR_CF32, [chan])
    sdr.activateStream(rx_stream)

    # Start receiving signals and performing inference
    print('Receiving Data')
    ctr = 0
    while ctr < num_batches or num_batches == -1:
        try:
            # Receive samples from the AIR-T buffer
            sr = sdr.readStream(rx_stream, [sample_buffer.host], buff_len)
            if sr.ret == SOAPY_SDR_OVERFLOW:  # Data was dropped, i.e., overflow
                print('O', end='', flush=True)
                continue
            # Run samples through neural network
            dnn.feed_forward()
            # Get data from DNN output layer.
            output_arr = dnn.output_buff.host
            # Reshape into matrix of shape = (batch_size, output_len)
            output_mat = output_arr.reshape(batch_size, -1)
            # Determine what the predicted signal class is for each window
            infer_result = np.argmax(output_mat, axis=1)
            # Determine the unique class values found in current batch
            classes_found = np.unique(infer_result)
            print(f'Signal classes found for batch {ctr} = {classes_found}')
        except KeyboardInterrupt:
            break
        ctr += 1
    sdr.deactivateStream(rx_stream)
    sdr.closeStream(rx_stream)
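
A minimal invocation sketch follows; the model path and batch count are hypothetical and should be replaced with your own values.

# Usage sketch (hypothetical model path): run 16 batches and exit
if __name__ == '__main__':
    airt_infer('saved_model.onnx', cplx_input_len=2048, batch_size=128,
               num_batches=16)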