Knowledge Distillation for Training a Multi-Exit Model

We use knowledge distillation to train a Multi-Exit (ME) ResNet50 model. Our ME model has 4 early-exit gates, one attached to each residual block. With this setup, we obtain 82.5%, 85%, 89%, and 92% accuracy at gates 1 through 4, respectively. To reproduce these results, run the main.py script.
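The README does not show how the exit gates or the distillation loss are wired up, so below is a minimal PyTorch sketch of one plausible setup: a self-distillation scheme where the deepest exit acts as the teacher for the three earlier gates. It assumes "residual block" refers to ResNet50's four residual stages (layer1 through layer4). All names (MultiExitResNet50, distillation_loss) and hyperparameters (num_classes=10, temperature=4.0, alpha=0.5) are illustrative assumptions, not the repository's actual API.

```python
# Sketch only: a multi-exit ResNet50 trained with self-distillation,
# where the final exit distills into the earlier gates. Names and
# hyperparameters are assumptions, not the repo's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class MultiExitResNet50(nn.Module):
    """ResNet50 with an early-exit classifier after each residual stage."""
    def __init__(self, num_classes=10):
        super().__init__()
        backbone = resnet50(weights=None)  # train from scratch
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2,
                                     backbone.layer3, backbone.layer4])
        # One exit head per stage; widths follow ResNet50's stage outputs.
        widths = [256, 512, 1024, 2048]
        self.exits = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(w, num_classes))
            for w in widths
        ])

    def forward(self, x):
        x = self.stem(x)
        logits = []
        for stage, exit_head in zip(self.stages, self.exits):
            x = stage(x)
            logits.append(exit_head(x))
        return logits  # [gate1, gate2, gate3, gate4]

def distillation_loss(all_logits, targets, temperature=4.0, alpha=0.5):
    """Cross-entropy on every exit, plus temperature-scaled KL
    distillation from the deepest exit (teacher) into the earlier ones."""
    teacher = all_logits[-1]
    loss = sum(F.cross_entropy(l, targets) for l in all_logits)
    soft_teacher = F.softmax(teacher.detach() / temperature, dim=1)
    for student in all_logits[:-1]:
        log_student = F.log_softmax(student / temperature, dim=1)
        kl = F.kl_div(log_student, soft_teacher, reduction="batchmean")
        loss = loss + alpha * (temperature ** 2) * kl
    return loss

if __name__ == "__main__":
    model = MultiExitResNet50(num_classes=10)
    images = torch.randn(2, 3, 224, 224)
    labels = torch.randint(0, 10, (2,))
    loss = distillation_loss(model(images), labels)
    loss.backward()
    print(f"loss: {loss.item():.4f}")
```

Detaching the teacher logits keeps gradients from the KL term out of the final exit, and the temperature-squared factor is the standard scaling so the distillation term's gradient magnitude stays comparable to the cross-entropy term.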