Audio Tokens Part 3: The First Run

5 Sep 2024 audio-tokens

Time to try the first run at training the model. Let’s see what happens!

2024-09-05 13:59:48,289 - INFO - Epoch 1
2024-09-05 13:59:48,289 - INFO - Train Loss: 0.0429, Train F1 (macro): 0.4998, Train F1 (micro): 0.9946, Train Hamming Loss: 0.0054, Train mAP: 0.0106
2024-09-05 13:59:48,289 - INFO - Val Loss: 0.0210, Val F1 (macro): 0.4991, Val F1 (micro): 0.9962, Val Hamming Loss: 0.0038, Val mAP: 0.1074
Training: 100%|████████████████████████████████████| 1022/1022 [07:40<00:00,  2.22it/s, loss=0.0196]
Validating: 100%|█████████████████████████████████████████████████| 146/146 [00:20<00:00,  7.16it/s]
2024-09-05 14:07:58,594 - INFO - Epoch 2
2024-09-05 14:07:58,594 - INFO - Train Loss: 0.0208, Train F1 (macro): 0.5210, Train F1 (micro): 0.9962, Train Hamming Loss: 0.0038, Train mAP: 0.1135
2024-09-05 14:07:58,594 - INFO - Val Loss: 0.0205, Val F1 (macro): 0.5629, Val F1 (micro): 0.9962, Val Hamming Loss: 0.0038, Val mAP: 0.1144
Training: 100%|████████████████████████████████████| 1022/1022 [07:40<00:00,  2.22it/s, loss=0.0153]
Validating: 100%|█████████████████████████████████████████████████| 146/146 [00:20<00:00,  7.17it/s]
[...]
2024-09-05 15:21:42,568 - INFO - Epoch 11
2024-09-05 15:21:42,568 - INFO - Train Loss: 0.0177, Train F1 (macro): 0.5805, Train F1 (micro): 0.9963, Train Hamming Loss: 0.0037, Train mAP: 0.1851
2024-09-05 15:21:42,568 - INFO - Val Loss: 0.0188, Val F1 (macro): 0.5663, Val F1 (micro): 0.9963, Val Hamming Loss: 0.0037, Val mAP: 0.1548
Training: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 1022/1022 [07:41<00:00,  2.21it/s, loss=0.0215]
Validating: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 146/146 [00:20<00:00,  7.15it/s]
[...]
2024-09-05 16:27:17,502 - INFO - Epoch 19
2024-09-05 16:27:17,502 - INFO - Train Loss: 0.0116, Train F1 (macro): 0.6765, Train F1 (micro): 0.9968, Train Hamming Loss: 0.0032, Train mAP: 0.4616
2024-09-05 16:27:17,502 - INFO - Val Loss: 0.0208, Val F1 (macro): 0.5698, Val F1 (micro): 0.9962, Val Hamming Loss: 0.0038, Val mAP: 0.1095
Training: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 1022/1022 [07:41<00:00,  2.21it/s, loss=0.0116]
Validating: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 146/146 [00:20<00:00,  7.13it/s]
2024-09-05 16:35:29,332 - INFO - Epoch 20
2024-09-05 16:35:29,332 - INFO - Train Loss: 0.0103, Train F1 (macro): 0.7092, Train F1 (micro): 0.9970, Train Hamming Loss: 0.0030, Train mAP: 0.5503
2024-09-05 16:35:29,332 - INFO - Val Loss: 0.0212, Val F1 (macro): 0.5895, Val F1 (micro): 0.9960, Val Hamming Loss: 0.0040, Val mAP: 0.1201
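
A note on which of these numbers to trust. Hamming loss is the fraction of (clip, label) slots predicted wrongly, so when labels are sparse, a model that predicts nothing at all already scores very well on it. The sketch below illustrates this on synthetic data. It is pure Python, every size in it is a hypothetical stand-in (the post does not state the label density), and it is not the metric code used in this run. By construction, the "predict nothing" Hamming loss equals the label density, 2/500 = 0.004, while random scores get an mAP near the floor.

```python
import random

random.seed(0)

N_SAMPLES = 1000        # hypothetical number of clips
N_CLASSES = 500         # hypothetical number of labels
POSITIVES_PER_CLIP = 2  # hypothetical: every clip carries exactly two labels

# Build sparse multi-label targets: two 1s per row, everything else 0.
y_true = []
for _ in range(N_SAMPLES):
    row = [0] * N_CLASSES
    for c in random.sample(range(N_CLASSES), POSITIVES_PER_CLIP):
        row[c] = 1
    y_true.append(row)


def hamming_loss(y_true, y_pred):
    """Fraction of label slots where the prediction differs from the target."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


def average_precision(labels, scores):
    """AP for one class: mean precision measured at the rank of each positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(y_true, y_score):
    """Mean AP over the classes that have at least one positive."""
    aps = []
    for c in range(len(y_true[0])):
        labels = [row[c] for row in y_true]
        if any(labels):
            aps.append(average_precision(labels, [row[c] for row in y_score]))
    return sum(aps) / len(aps)


# Baseline 1: predict no labels at all.
all_zero = [[0] * N_CLASSES for _ in range(N_SAMPLES)]
# Baseline 2: uninformative random scores.
random_scores = [[random.random() for _ in range(N_CLASSES)] for _ in range(N_SAMPLES)]

zero_hamming = hamming_loss(y_true, all_zero)
random_map = mean_average_precision(y_true, random_scores)
print(f"Hamming loss, predict nothing: {zero_hamming:.4f}")
print(f"mAP, random scores:            {random_map:.4f}")
```

If the real labels are similarly sparse, a Hamming loss around 0.004 says little about whether the model has learned anything, which is why mAP is the number worth reading.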

So, um. Reading the mAP values: validation mAP starts out bad (0.11), climbs to slightly less bad (0.15 at epoch 11), and then the overfitting kicks in. By epoch 20, training mAP has shot up to 0.55 while validation mAP has dropped back to 0.12. Not great.
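
To put numbers on that train/validation gap, here is a minimal sketch that pulls the mAP values out of log lines in the format shown above. The `LOG` string holds only the five epochs printed in this post (the `[...]` epochs are not available), and the `parse_map` helper is a hypothetical name introduced for illustration, not part of the training code.

```python
import re

# Log lines copied from the run above (epochs 1, 2, 11, 19, 20 only).
LOG = """\
2024-09-05 13:59:48,289 - INFO - Epoch 1
2024-09-05 13:59:48,289 - INFO - Train Loss: 0.0429, Train F1 (macro): 0.4998, Train F1 (micro): 0.9946, Train Hamming Loss: 0.0054, Train mAP: 0.0106
2024-09-05 13:59:48,289 - INFO - Val Loss: 0.0210, Val F1 (macro): 0.4991, Val F1 (micro): 0.9962, Val Hamming Loss: 0.0038, Val mAP: 0.1074
2024-09-05 14:07:58,594 - INFO - Epoch 2
2024-09-05 14:07:58,594 - INFO - Train Loss: 0.0208, Train F1 (macro): 0.5210, Train F1 (micro): 0.9962, Train Hamming Loss: 0.0038, Train mAP: 0.1135
2024-09-05 14:07:58,594 - INFO - Val Loss: 0.0205, Val F1 (macro): 0.5629, Val F1 (micro): 0.9962, Val Hamming Loss: 0.0038, Val mAP: 0.1144
2024-09-05 15:21:42,568 - INFO - Epoch 11
2024-09-05 15:21:42,568 - INFO - Train Loss: 0.0177, Train F1 (macro): 0.5805, Train F1 (micro): 0.9963, Train Hamming Loss: 0.0037, Train mAP: 0.1851
2024-09-05 15:21:42,568 - INFO - Val Loss: 0.0188, Val F1 (macro): 0.5663, Val F1 (micro): 0.9963, Val Hamming Loss: 0.0037, Val mAP: 0.1548
2024-09-05 16:27:17,502 - INFO - Epoch 19
2024-09-05 16:27:17,502 - INFO - Train Loss: 0.0116, Train F1 (macro): 0.6765, Train F1 (micro): 0.9968, Train Hamming Loss: 0.0032, Train mAP: 0.4616
2024-09-05 16:27:17,502 - INFO - Val Loss: 0.0208, Val F1 (macro): 0.5698, Val F1 (micro): 0.9962, Val Hamming Loss: 0.0038, Val mAP: 0.1095
2024-09-05 16:35:29,332 - INFO - Epoch 20
2024-09-05 16:35:29,332 - INFO - Train Loss: 0.0103, Train F1 (macro): 0.7092, Train F1 (micro): 0.9970, Train Hamming Loss: 0.0030, Train mAP: 0.5503
2024-09-05 16:35:29,332 - INFO - Val Loss: 0.0212, Val F1 (macro): 0.5895, Val F1 (micro): 0.9960, Val Hamming Loss: 0.0040, Val mAP: 0.1201
"""


def parse_map(log_text):
    """Return {epoch: {"train": mAP, "val": mAP}} parsed from the log text."""
    results = {}
    epoch = None
    for line in log_text.splitlines():
        header = re.search(r"INFO - Epoch (\d+)", line)
        if header:
            epoch = int(header.group(1))
            results[epoch] = {}
            continue
        if epoch is None:
            continue  # ignore anything before the first epoch header
        for split in ("Train", "Val"):
            match = re.search(rf"{split} mAP: ([0-9.]+)", line)
            if match:
                results[epoch][split.lower()] = float(match.group(1))
    return results


scores = parse_map(LOG)
best_epoch = max(scores, key=lambda e: scores[e]["val"])

for epoch, s in sorted(scores.items()):
    gap = s["train"] - s["val"]
    print(f"epoch {epoch:2d}  train {s['train']:.4f}  val {s['val']:.4f}  gap {gap:+.4f}")
print("best shown epoch by val mAP:", best_epoch)
```

Of the epochs shown, epoch 11 has the best validation mAP (0.1548), and by epoch 20 the gap between training and validation mAP has grown to about 0.43. One of the elided epochs may well have scored higher, so treat this as a rough picture rather than the true peak.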

There are many things to tune here, but the most glaringly obvious one is the number of tokens. Fifty tokens in the vocabulary seems really small. So I’m going to try this again, changing only the number of tokens to something a little more serious.