A complete PyTorch pipeline (training, cross-validation, and inference notebooks) used in the Kaggle competition Global Wheat Detection (May-Aug 2020).
Wheat heads were from various sources:
A few labeled images are as shown: (Blue bounding boxes)
A brief description of the contents is given here; for details, check the notebook comments.
Pre-Processing:
- Handled noisy labels (boxes that are too big or too small, etc.)
- Stratified 5-fold split based on source
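
The two pre-processing steps can be sketched roughly as below. This is a minimal illustration, not the notebook's code: the function names, the size thresholds (`min_side`, `max_area`) and the `(x, y, w, h)` box format are assumptions chosen for the example.

```python
import random
from collections import defaultdict


def filter_boxes(boxes, min_side=10, max_area=150_000):
    """Drop boxes that look implausibly small or large.

    boxes: list of (x, y, w, h) tuples in pixels.
    The thresholds are illustrative assumptions, not the notebook's values.
    """
    kept = []
    for x, y, w, h in boxes:
        if min(w, h) < min_side or w * h > max_area:
            continue
        kept.append((x, y, w, h))
    return kept


def stratified_folds(image_sources, n_folds=5, seed=42):
    """Assign every image to a fold so each source is spread evenly.

    image_sources: dict mapping image id -> source name.
    Returns a dict mapping image id -> fold index in [0, n_folds).
    """
    by_source = defaultdict(list)
    for image_id, source in image_sources.items():
        by_source[source].append(image_id)

    rng = random.Random(seed)
    folds = {}
    for source in sorted(by_source):
        ids = sorted(by_source[source])
        rng.shuffle(ids)
        # Deal the shuffled ids round-robin across the folds.
        for i, image_id in enumerate(ids):
            folds[image_id] = i % n_folds
    return folds
```

Stratifying on the source keeps each validation fold representative of every source rather than dominated by the largest one.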
Augmentations:
- Albumentations - RandomSizedCrop, HueSaturationValue, RandomBrightnessContrast, RandomRotate90, Flip, Cutout, ShiftScaleRotate
- Mixup - https://arxiv.org/pdf/1710.09412.pdf
Two images are blended into one.
- Mosaic - https://arxiv.org/pdf/2004.12432.pdf
Four images are cropped and stitched together. By default, YoloV5 stitches them onto a canvas whose size is a multiple of 32 pixels. For batch size = 4 the canvas looks like:
For batch size = 2:
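
As a rough illustration of the Mixup and Mosaic ideas above, here is a simplified NumPy sketch. It assumes same-sized images and boxes in `[x_min, y_min, x_max, y_max]` pixel format; the function names are hypothetical, and the mosaic skips the random cropping and 32-pixel canvas handling that YoloV5 performs.

```python
import numpy as np


def mixup(img_a, boxes_a, img_b, boxes_b, lam=0.5):
    """Blend two same-sized images and keep the boxes of both."""
    mixed = lam * img_a.astype(np.float32) + (1.0 - lam) * img_b.astype(np.float32)
    boxes = np.concatenate([boxes_a, boxes_b], axis=0)
    return mixed.astype(img_a.dtype), boxes


def mosaic_2x2(images, boxes_list):
    """Stitch four same-sized images into a 2x2 canvas.

    Each image's boxes are shifted by the offset of the tile it lands in.
    """
    h, w = images[0].shape[:2]
    canvas = np.zeros((2 * h, 2 * w) + images[0].shape[2:], dtype=images[0].dtype)
    out_boxes = []
    for idx, (img, boxes) in enumerate(zip(images, boxes_list)):
        row, col = divmod(idx, 2)
        y0, x0 = row * h, col * w
        canvas[y0:y0 + h, x0:x0 + w] = img
        shifted = np.asarray(boxes, dtype=np.float32).reshape(-1, 4).copy()
        shifted[:, [0, 2]] += x0  # shift x coordinates
        shifted[:, [1, 3]] += y0  # shift y coordinates
        out_boxes.append(shifted)
    return canvas, np.concatenate(out_boxes, axis=0)
```

For detection, Mixup keeps the union of both images' boxes instead of mixing the labels, which is the usual adaptation of the original classification formulation.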
Configurations:
- Default YoloV5 configuration
TensorBoard Analysis:
- YoloV5 uses TensorBoard during training by default. The best model is selected using a "fitness" criterion based on the following parameters:
Some of my TensorBoard training logs can be found at TensorBoard.dev
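
To make the "fitness" idea concrete, here is a small sketch of a weighted sum over validation metrics and how the best epoch could be picked from it. The metric set and the weights below are assumptions for illustration; check the `fitness` function in the YoloV5 utilities of the version you run for the actual values.

```python
def fitness(precision, recall, map50, map50_95, weights=(0.0, 0.0, 0.1, 0.9)):
    """Weighted combination of validation metrics (weights are assumed)."""
    metrics = (precision, recall, map50, map50_95)
    return sum(w * m for w, m in zip(weights, metrics))


def best_epoch(history):
    """Return the index of the epoch with the highest fitness.

    history: list of (precision, recall, map50, map50_95) tuples, one per epoch.
    """
    return max(range(len(history)), key=lambda i: fitness(*history[i]))
```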
Pre-Processing:
- Same as in [TRAIN]
Test Time Augmentations:
- Flips and Rotate
- Color shift
- Scale (scale down with padding)
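
With flip and rotate TTA, boxes predicted on the augmented image have to be mapped back to the original frame before they can be combined. The sketch below shows one way to do that for square images with boxes in `[x_min, y_min, x_max, y_max]` format; the helper names and the `TTA` table are hypothetical, not the notebook's API.

```python
def hflip_box(box, size):
    """Horizontal flip of a box inside a size x size image."""
    x1, y1, x2, y2 = box
    return (size - x2, y1, size - x1, y2)


def vflip_box(box, size):
    """Vertical flip of a box inside a size x size image."""
    x1, y1, x2, y2 = box
    return (x1, size - y2, x2, size - y1)


def rot90_box(box, size):
    """Counter-clockwise 90 degree rotation (same direction as np.rot90)."""
    x1, y1, x2, y2 = box
    return (y1, size - x2, y2, size - x1)


def inv_rot90_box(box, size):
    """Undo rot90_box, i.e. rotate 90 degrees clockwise."""
    x1, y1, x2, y2 = box
    return (size - y2, x1, size - y1, x2)


# Each entry pairs the forward box transform with its inverse.
TTA = {
    "hflip": (hflip_box, hflip_box),
    "vflip": (vflip_box, vflip_box),
    "rot90": (rot90_box, inv_rot90_box),
}


def deaugment(boxes, name, size):
    """Map boxes predicted on an augmented image back to the original frame."""
    _, inverse = TTA[name]
    return [inverse(box, size) for box in boxes]
```

Flips are their own inverse, while a rotation needs the opposite rotation to undo it.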
Ensemble:
- Supports ensembling multiple folds of the same model
- Non-Maximum Suppression (NMS) is used to ensemble the final predicted boxes
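
A minimal pure-Python sketch of ensembling folds with NMS follows: predictions from all folds are pooled, then overlapping boxes are suppressed in favour of the highest-scoring one. The function names and the 0.5 IoU threshold are assumptions; in practice a vectorised implementation would be used.

```python
def iou(a, b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def ensemble_folds(per_fold_boxes, per_fold_scores, iou_threshold=0.5):
    """Pool the predictions of all folds for one image and apply NMS."""
    boxes = [b for fold in per_fold_boxes for b in fold]
    scores = [s for fold in per_fold_scores for s in fold]
    keep = nms(boxes, scores, iou_threshold)
    return [boxes[i] for i in keep], [scores[i] for i in keep]
```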
Automated Threshold Calculations:
- The confidence threshold is calculated against the ground-truth labels
- This yields the optimal final CV score (metric: IoU)
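
The threshold search can be sketched as a simple sweep: for each candidate confidence threshold, score the filtered predictions against the ground truth and keep the threshold with the best mean score. The scoring below uses TP / (TP + FP + FN) at a single IoU threshold, which is a simplifying assumption, and all names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def image_precision(pred_boxes, gt_boxes, iou_threshold=0.5):
    """TP / (TP + FP + FN) using greedy one-to-one matching.

    pred_boxes must already be sorted by descending confidence.
    """
    matched, tp = set(), 0
    for p in pred_boxes:
        best_j, best_iou = -1, iou_threshold
        for j, g in enumerate(gt_boxes):
            if j in matched:
                continue
            value = iou(p, g)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp, fn = len(pred_boxes) - tp, len(gt_boxes) - tp
    denom = tp + fp + fn
    return tp / denom if denom else 1.0


def best_confidence_threshold(predictions, ground_truth, candidates):
    """Pick the candidate threshold with the best mean per-image score.

    predictions: per image, a list of (box, score) pairs.
    ground_truth: per image, a list of boxes.
    """
    best_t, best_score = None, -1.0
    for t in candidates:
        per_image = []
        for preds, gts in zip(predictions, ground_truth):
            kept = sorted((p for p in preds if p[1] >= t), key=lambda p: p[1], reverse=True)
            per_image.append(image_precision([box for box, _ in kept], gts))
        mean = sum(per_image) / len(per_image)
        if mean > best_score:
            best_t, best_score = t, mean
    return best_t, best_score
```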
Test Time Augmentations:
- Same as in [CV]
Pseudo Labelling:
- Multi-Round Pseudo Labelling pipeline based on https://arxiv.org/pdf/1908.02983.pdf
- Cross-validation is computed at the end of each round to choose the best thresholds for the pseudo labels in the next round
- Training pipeline is the same as in [TRAIN]
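
The multi-round loop described above has roughly the following shape. This is only a skeleton under stated assumptions: `train_fn`, `predict_fn` and `pick_threshold_fn` are hypothetical callables standing in for the [TRAIN] pipeline, the model's inference, and the cross-validation threshold search, and the default values are illustrative.

```python
def pseudo_label_rounds(train_set, unlabeled_images, train_fn, predict_fn,
                        pick_threshold_fn, n_rounds=2, initial_threshold=0.5):
    """Run several rounds of pseudo labelling.

    train_set: list of (image, boxes) pairs with real labels.
    unlabeled_images: list of images without labels (e.g. the test set).
    train_fn(data) -> model
    predict_fn(model, image) -> list of (box, score) pairs
    pick_threshold_fn(model) -> confidence threshold for the next round
    """
    threshold = initial_threshold
    model = train_fn(train_set)
    history = []
    for round_idx in range(n_rounds):
        # Label the unlabeled images with the current model and threshold.
        pseudo = []
        for image in unlabeled_images:
            kept = [box for box, score in predict_fn(model, image) if score >= threshold]
            if kept:
                pseudo.append((image, kept))
        # Retrain on real plus pseudo labels, then re-tune the threshold via CV.
        model = train_fn(train_set + pseudo)
        threshold = pick_threshold_fn(model)
        history.append((round_idx, len(pseudo), threshold))
    return model, history
```

Re-tuning the threshold after every round keeps the pseudo labels aligned with the current model's confidence calibration.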
Post-Processing and Result:
- Final predictions are made with ensembled combinations of TTA
Just change the directories according to your environment.
Google Colab deployed versions are available for
[TRAIN]
[CV]
In case of any deprecation issues or warnings in the future, use the modules available in the YoloV5-Mixup folder.
Acknowledging shortcomings is the first step toward progress, so here are the possible improvements that could have made my model better:
- Ensemble multi-model/multi-fold predictions for pseudo labels; currently a single model is used to make them. This would also have made the model more robust to noise.
- A GAN or style transfer could have been used to generate additional labeled images similar to the current training images, for better generalization.
- Relabeling of noisy labels using multiple folds.