Training the Model¶

Deepwave currently recommends using TensorFlow 2 or PyTorch as the training framework. While we support the legacy TensorFlow 1, the interface for TensorFlow 2 is more user friendly and will be the supported version of TensorFlow going forward.

Start AirPack Docker Container¶

The AirPack Docker container is started the same way for all training framework. Ensure that all the steps in AirPack Installation have been completed.

Note: the AirPack directory is not contained within the docker image. It must be mounted when the container is started via the -v option. This also allows for the code and output of training to be accessible by the host machine. See below for details.

To start the docker container:
```
docker run -it \
-v <path_to_AirPack>:/AirPack \
-v <path_to_AirPack_data>:/data \
--gpus all \
<docker-image-name>
```
where:
- <path_to_AirPack> - the path to the AirStack folder on the host
- <path_to_AirPack_data> - the path to the AirStack data set. See here for more information.
- <docker-image-name> - is the name of the docker image assigned when the image was created during the AirPack Installation.
After executing this command you are in a Linux environment within the Docker container. If you are unfamiliar with Docker, it is very similar to a virtual machine. the -v flag will mount the AirPack toolbox and data in /AirPack and /data, respectively.
Set up your copy of the AirPack Python code for use. This will allow you to edit code from both inside and outside of the container, as well as import the airpack module and use it from your own custom code. Do this each time you start a new container.
```
$ pip install -e /AirPack
```

TensorFlow¶

Click on the above video to make it large and open in new window.

Train the Model¶

Run the training script from within the Docker container:
```
$ python /AirPack/airpack_scripts/tf2/run_training.py
```

The script will periodically display a terminal output similar to the following:

$ python /AirPack/airpack_scripts/tf2/run_training.py
...
Epoch 1/10
610/610 [==============================] - 11s 17ms/step - loss: 0.8916 - categorical_accuracy: 0.6776 - val_loss: 0.4115 - val_categorical_accuracy: 0.8332
Epoch 2/10
610/610 [==============================] - 9s 15ms/step - loss: 0.3341 - categorical_accuracy: 0.8670 - val_loss: 0.2753 - val_categorical_accuracy: 0.8781
Epoch 3/10
610/610 [==============================] - 9s 15ms/step - loss: 0.2468 - categorical_accuracy: 0.9020 - val_loss: 0.2411 - val_categorical_accuracy: 0.8997
Epoch 4/10
610/610 [==============================] - 9s 15ms/step - loss: 0.1669 - categorical_accuracy: 0.9376 - val_loss: 0.1700 - val_categorical_accuracy: 0.9510
Epoch 5/10
610/610 [==============================] - 9s 15ms/step - loss: 0.1401 - categorical_accuracy: 0.9542 - val_loss: 0.2257 - val_categorical_accuracy: 0.9282
Epoch 6/10
610/610 [==============================] - 9s 15ms/step - loss: 0.1013 - categorical_accuracy: 0.9691 - val_loss: 0.1021 - val_categorical_accuracy: 0.9682
Epoch 7/10
610/610 [==============================] - 9s 15ms/step - loss: 0.0772 - categorical_accuracy: 0.9771 - val_loss: 0.1036 - val_categorical_accuracy: 0.9659
Epoch 8/10
610/610 [==============================] - 9s 15ms/step - loss: 0.0691 - categorical_accuracy: 0.9800 - val_loss: 0.0849 - val_categorical_accuracy: 0.9773
Epoch 9/10
610/610 [==============================] - 9s 15ms/step - loss: 0.0446 - categorical_accuracy: 0.9881 - val_loss: 0.1164 - val_categorical_accuracy: 0.9678
Epoch 10/10
610/610 [==============================] - 9s 14ms/step - loss: 0.0616 - categorical_accuracy: 0.9834 - val_loss: 0.1057 - val_categorical_accuracy: 0.9731

Once the script has completed the training iterations, it will produce multiple files in the /AirPack/output directory including the following:
- saved_model.onnx - File that will be used for deployment on the AIR-T

Perform Inference with Trained Model¶

You may use the run_inference.py script to evaluate the performance of the model and plot the result. This script will load the saved_model.onnx file produced during training, feed the test data through the network, and create a result plot.

Run the inference script

$ python /AirPack/airpack_scripts/tf2/run_inference.py

Running this script will produce an image in /AirPack/output/saved_model_plot.png demonstrating the inference performance for each signal type.

PyTorch¶

Train the Model¶

Run the training script from within the Docker container:

$ python /AirPack/airpack_scripts/pytorch/run_training.py

The script will periodically display a terminal output similar to the following:

$ python /AirPack/airpack_scripts/pytorch/run_training.py
...
Training Progress:   0%|                                                    | 0/10 [00:00<?, ?epoch/s]Epoch 1 of 10
Train Loss: 0.0128, Train Accuracy: 0.6765
Val Loss: 0.0068, Val Accuracy: 0.8188
Training Progress:  10%|████▍                                               | 1/10 [00:28<04:20, 28.99s/epoch]Epoch 2 of 10
Train Loss: 0.0057, Train Accuracy: 0.8443
Val Loss: 0.0045, Val Accuracy: 0.8744
Training Progress:  20%|████████████                                        | 2/10 [00:58<03:52, 29.07s/epoch]Epoch 3 of 10
Train Loss: 0.0041, Train Accuracy: 0.8846
Val Loss: 0.0035, Val Accuracy: 0.9019
Training Progress:  30%|███████████████████                                 | 3/10 [01:26<03:20, 28.71s/epoch]Epoch 4 of 10
Train Loss: 0.0033, Train Accuracy: 0.9102
Val Loss: 0.0039, Val Accuracy: 0.8978
Training Progress:  40%|█████████████████████████                           | 4/10 [01:54<02:51, 28.56s/epoch]Epoch 5 of 10
Train Loss: 0.0025, Train Accuracy: 0.9328
Val Loss: 0.0031, Val Accuracy: 0.9165
Training Progress:  50%|██████████████████████████████                      | 5/10 [02:21<02:20, 28.15s/epoch]Epoch 6 of 10
Train Loss: 0.0021, Train Accuracy: 0.9449
Val Loss: 0.0022, Val Accuracy: 0.9429
Training Progress:  60%|██████████████████████████████████                  | 6/10 [02:48<01:51, 27.77s/epoch]Epoch 7 of 10
Train Loss: 0.0018, Train Accuracy: 0.9526
Val Loss: 0.0018, Val Accuracy: 0.9542
Training Progress:  70%|███████████████████████████████████████              | 7/10 [03:16<01:23, 27.74s/epoch]Epoch 8 of 10
Train Loss: 0.0015, Train Accuracy: 0.9608
Val Loss: 0.0016, Val Accuracy: 0.9599
Training Progress:  80%|████████████████████████████████████████████         | 8/10 [03:43<00:55, 27.69s/epoch]Epoch 9 of 10
Train Loss: 0.0013, Train Accuracy: 0.9666
Val Loss: 0.0016, Val Accuracy: 0.9636
Training Progress:  90%|███████████████████████████████████████████████      | 9/10 [04:11<00:27, 27.73s/epoch]Epoch 10 of 10
Train Loss: 0.0012, Train Accuracy: 0.9701
Val Loss: 0.0013, Val Accuracy: 0.9691
Training Progress: 100%|█████████████████████████████████████████████████████| 10/10 [04:40<00:00, 28.06s/epoch]

Once the script has completed the training iterations, it will produce multiple files in the /AirPack/output/pytorch directory including the following:
- saved_model.onnx - File that will be used for deployment on the AIR-T

Perform Inference with Trained Model¶

You may use the run_inference.py script to evaluate the performance of the model and plot the result. This script will load the saved_model.onnx file produced during training, feed the test data through the network, and create a result plot.

Run the inference script:

$ python /AirPack/airpack_scripts/pytorch/run_inference.py

Running this script will produce an image in /AirPack/output/pytorch/saved_model_plot.png demonstrating the inference performance for each signal type.

TensorFlow 1 (Legacy)¶

Note: Deepwave strongly recommends transitioning to TensorFlow 2 as TensorFlow 1 is deprecated.

Train the Model¶

Run the training script

$ python /AirPack/airpack_scripts/tf1/run_training.py

The script will periodically display a terminal output similar to the following:

$ python /AirPack/airpack_scripts/tf1/run_training.py
...
(0 of 6094): Training Loss = 2.494922, Testing Accuracy = 0.109375
(100 of 6094): Training Loss = 1.590902, Testing Accuracy = 0.445312
(200 of 6094): Training Loss = 0.962753, Testing Accuracy = 0.664062
(300 of 6094): Training Loss = 0.617013, Testing Accuracy = 0.812500
(400 of 6094): Training Loss = 0.499497, Testing Accuracy = 0.773438
(500 of 6094): Training Loss = 0.317061, Testing Accuracy = 0.890625
(600 of 6094): Training Loss = 0.381197, Testing Accuracy = 0.867188
(700 of 6094): Training Loss = 0.347956, Testing Accuracy = 0.843750
(800 of 6094): Training Loss = 0.464664, Testing Accuracy = 0.796875
(900 of 6094): Training Loss = 0.384519, Testing Accuracy = 0.820312
...
...
(5800 of 6094): Training Loss = 0.025528, Testing Accuracy = 0.968750
(5900 of 6094): Training Loss = 0.068839, Testing Accuracy = 0.960938
(6000 of 6094): Training Loss = 0.017364, Testing Accuracy = 0.975000
(6100 of 6094): Training Loss = 0.046915, Testing Accuracy = 0.976562

Once the script has completed the training iterations, it will produce multiple files in the /AirPack/data/output directory including the following:
- checkpoint - file that defines the location of the saved model files
- saved_model.meta - file that contains the graph and protocol buffer
- saved_model.onnx - File that will be used for deployment on the AIR-T