Perform inference and benchmark inference performance using the ONNX Runtime.

Benchmarks are performed by repeatedly running inference on a random input vector and measuring the total time taken.
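The measurement loop described above can be sketched as follows. This is an illustration, not the module's implementation; `run_inference` is a hypothetical stand-in for an ONNX Runtime session call, and the default shapes are assumptions:

```python
import time
import numpy as np

def benchmark(run_inference, batch_size=128, cplx_samples=2048,
              num_inferences=100, input_dtype=np.float32):
    """Time repeated inference on a random input batch (sketch)."""
    # Each complex sample contributes two real values (I and Q),
    # so the real-valued input length is 2 * cplx_samples.
    x = np.random.randn(batch_size, 2 * cplx_samples).astype(input_dtype)
    start = time.perf_counter()
    for _ in range(num_inferences):
        run_inference(x)
    elapsed = time.perf_counter() - start
    # Throughput in inferences per second (each call processes one batch).
    return num_inferences * batch_size / elapsed
```

With a real model, `run_inference` would wrap something like `session.run(None, {input_name: x})` on an `onnxruntime.InferenceSession`.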

Module Contents

airpack.deploy.onnx.onnx_bench(onnx_file, cplx_samples, batch_size=128, num_inferences=100, input_dtype=np.float32)

Benchmarks a saved model using the onnxruntime inference engine.

  • onnx_file (Union[str, os.PathLike]) – Saved model file (.onnx format)

  • cplx_samples (int) – Input length of the neural network, in complex samples. Because the network operates on real values, this is half of the network's real-valued input length.

  • batch_size (int) – How many sets of cplx_samples inputs are batched together in a single inference call

  • num_inferences (Optional[int]) – Number of inference iterations to run between throughput measurements (if None, run forever)

  • input_dtype (numpy.number) – Data type of a single value (a single I or Q value, not a complete complex (I, Q) sample); use either numpy.int16 or numpy.float32

Return type
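Since input_dtype describes a single real value rather than a complete complex sample, complex baseband data must be flattened into real values before inference. A minimal sketch, assuming interleaved I/Q ordering (the helper name is hypothetical):

```python
import numpy as np

def interleave_iq(cplx, input_dtype=np.float32):
    """Convert a vector of complex samples into interleaved real I/Q values."""
    cplx = np.asarray(cplx, dtype=np.complex64)
    out = np.empty(2 * cplx.size, dtype=input_dtype)
    out[0::2] = cplx.real  # I values at even indices
    out[1::2] = cplx.imag  # Q values at odd indices
    return out
```

An input of cplx_samples complex samples thus yields 2 * cplx_samples values of input_dtype, matching the network's real-valued input length.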