Tools for deploying trained neural networks and benchmarking their performance.

Here, “deploying” a neural network model means running it with a standalone inference engine, rather than from within the learning framework that was used to define and train the model.

This module supports two inference engines: onnxruntime and TensorRT. For deployment on an AIR-T, TensorRT is currently recommended for maximum performance; onnxruntime is well suited to quickly testing a model during development, or to validating a model on the server that was used to train it.