Training deep neural network (DNN) models, which has become an important task in today's software development, is often costly in terms of computational resources and time. Inspired by software reuse, building DNN models by reusing existing ones has gained increasing attention recently. Prior approaches to DNN model reuse have two main limitations: 1) reusing the entire model, when only a small part of its functionalities (labels) is required, incurs significant overhead (e.g., computational and time costs for inference); and 2) model reuse inherits the defects and weaknesses of the reused model, thus exposing the new system to security threats. To solve the above problems, we propose SeaM, a tool that re-engineers a trained DNN model to improve its reusability. Specifically, given a target problem and a trained model, SeaM utilizes a gradient-based search method to identify the model's weights that are relevant to the target problem. The re-engineered model, which retains only the relevant weights, is then reused to solve the target problem. Evaluation results on widely-used models show that the re-engineered models produced by SeaM contain only 10.11% of the original models' weights, resulting in a 42.41% reduction in inference time. For the target problem, the re-engineered models even outperform the original models in classification accuracy by 5.85%. Moreover, reusing the re-engineered models inherits an average of 57% fewer defects than reusing the entire model. We believe our approach to reducing reuse overhead and defect inheritance is an important step toward practical model reuse.
- advertorch 0.2.3
- fvcore 0.1.5.post20220512
- matplotlib 3.4.2
- numpy 1.19.2
- python 3.8.10
- pytorch 1.8.1
- torchvision 0.9.0
- tqdm 4.61.0
- GPU with CUDA support is also needed
|--- README.md : user guidance
|--- data/ : experimental data
|--- src/ : source code of our work
|------ global_config.py : setting the path
|------ binary_class/ : direct reuse on binary classification problems
|--------- model_reengineering.py : re-engineering a trained model and then reusing the re-engineered model
|--------- calculate_flop.py : calculating the number of FLOPs required by reusing the re-engineered and original models
|--------- calculate_time_cost.py : calculating the inference time required by reusing the re-engineered and original models
|--------- ......
|------ multi_class/ : direct reuse on multi-class classification problems
|--------- ......
|------ defect_inherit/ : indirect reuse
|--------- reengineering_finetune.py : re-engineering a trained model and then fine-tuning the re-engineered model
|--------- standard_finetune.py : using standard fine-tuning approach to fine-tune a trained model
|--------- eval_robustness.py : calculating the defect inheritance rate
|--------- ......
The following sections describe how to reproduce the experimental results in our paper.
- We provide the trained models and datasets used in the experiments, as well as the corresponding re-engineered models.
One can download `data/` from here and then move it to `SeaM/`.
The trained models will be downloaded automatically by PyTorch when running our project. If the download fails, please move our provided trained models to the folder according to the failure information given by PyTorch.
Due to the huge size of ImageNet, please download it from its webpage.
- Modify `self.root_dir` in `src/global_config.py`.
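For reference, the edit might look like the following excerpt. The actual class and attribute layout of `src/global_config.py` may differ; the path shown is only a placeholder.

```python
# src/global_config.py (illustrative excerpt; the real layout of the file may differ)
class GlobalConfig:
    def __init__(self):
        # Point root_dir at the directory that contains the downloaded data/ folder,
        # e.g., the root of the cloned SeaM repository.
        self.root_dir = '/path/to/SeaM'
```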
- Go to the directory of experiments related to the binary classification problems.
cd src/binary_class
- Re-engineer VGG16-CIFAR10 on a binary classification problem.
python model_reengineering.py --model vgg16 --dataset cifar10 --target_class 0 --lr_mask 0.01 --alpha 1
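As described above, re-engineering searches, with gradient descent, for the weights that are relevant to the target problem. The snippet below is a minimal, simplified sketch of such a gradient-based mask search: a frozen layer is wrapped with a learnable mask that is fitted to the target problem while a sparsity term (weighted by alpha) suppresses irrelevant weights; presumably this is what the `--lr_mask` and `--alpha` options control. The class and function names here are hypothetical, and the actual mask parameterization, covered layers, and loss in `model_reengineering.py` differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Wrap a frozen linear layer with a learnable relevance mask over its weights."""
    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad_(False)                      # original weights stay fixed
        self.mask_logits = nn.Parameter(torch.zeros_like(linear.weight))

    def forward(self, x):
        mask = torch.sigmoid(self.mask_logits)           # relaxed 0/1 relevance mask
        return F.linear(x, self.linear.weight * mask, self.linear.bias)

def mask_search_step(masked_layer, head, x, y, optimizer, alpha=1.0):
    """One gradient step: fit the target problem, push the mask toward sparsity."""
    optimizer.zero_grad()
    logits = head(masked_layer(x))
    sparsity = torch.sigmoid(masked_layer.mask_logits).mean()
    loss = F.cross_entropy(logits, y) + alpha * sparsity
    loss.backward()
    optimizer.step()
    return loss.item()
```

After the search converges, weights whose mask values fall below a threshold can be dropped, yielding a re-engineered model that retains only the weights relevant to the target problem.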
- Compute the number of FLOPs required by the original and re-engineered VGG16-CIFAR10, respectively. This command also presents the accuracy of the models.
python calculate_flop.py --model vgg16 --dataset cifar10 --target_class 0 --lr_mask 0.01 --alpha 1
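FLOP counts for an arbitrary PyTorch model can also be obtained directly with fvcore (listed in the requirements). The snippet below is a minimal sketch using torchvision's stock VGG16 and an ImageNet-sized input; it is not the exact logic of `calculate_flop.py`, which loads the original and re-engineered VGG16-CIFAR10 models and compares their counts.

```python
import torch
from torchvision.models import vgg16
from fvcore.nn import FlopCountAnalysis

# Count FLOPs of a single forward pass (illustrative model and input size).
model = vgg16(num_classes=10).eval()
dummy_input = torch.randn(1, 3, 224, 224)
flops = FlopCountAnalysis(model, dummy_input)
print(f"Total FLOPs: {flops.total():,}")
```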
- Compute the time cost for inference required by the original and re-engineered VGG16-CIFAR10, respectively. This command also presents the number of weights in each model.
python calculate_time_cost.py --model vgg16 --dataset cifar10 --target_class 0 --lr_mask 0.01 --alpha 1
- Go to the directory of experiments related to the multi-class classification problems.
cd src/multi_class
- Re-engineer ResNet20-CIFAR100 on a multi-class classification problem.
python model_reengineering.py --model resnet20 --dataset cifar100 --target_superclass_idx 0 --lr_mask 0.1 --alpha 2
- Compute the number of FLOPs required by the original and re-engineered ResNet20-CIFAR100, respectively. This command also presents the accuracy of the models.
python calculate_flop.py --model resnet20 --dataset cifar100 --target_superclass_idx 0 --lr_mask 0.1 --alpha 2
- Compute the time cost for inference required by the original and re-engineered ResNet20-CIFAR100, respectively. This command also presents the number of weights in each model.
python calculate_time_cost.py --model resnet20 --dataset cifar100 --target_superclass_idx 0 --lr_mask 0.1 --alpha 2
NOTE: When computing the time cost for inference, DeepSparse runs a model on several CPUs. The inference process can be interfered with by other active processes, leading to fluctuations in the measured inference time. In our experiments, we manually killed as many other processes as possible so that the inference process could occupy the CPUs exclusively.
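For context, the snippet below sketches how per-batch CPU inference time can be measured with warm-up iterations and averaging, which mitigates (but does not remove) the fluctuations mentioned above; it is not the exact procedure of `calculate_time_cost.py`, which relies on DeepSparse.

```python
import time
import torch

@torch.no_grad()
def measure_cpu_inference_time(model, input_shape=(1, 3, 32, 32), warmup=10, runs=100):
    """Average per-forward-pass CPU time; warm-up runs absorb one-off setup costs."""
    model = model.eval().cpu()
    x = torch.randn(*input_shape)
    for _ in range(warmup):
        model(x)
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    return (time.perf_counter() - start) / runs
```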
- Go to the directory of experiments related to the defect inheritance.
cd src/defect_inherit
- Re-engineer ResNet18-ImageNet and then fine-tune the re-engineered model on the target dataset Scenes.
python reengineering_finetune.py --model resnet18 --dataset mit67 --lr_mask 0.05 --alpha 0.5 --prune_threshold 0.6
- Compute the defect inheritance rate of fine-tuned re-engineered ResNet18-Scenes.
python eval_robustness.py --model resnet18 --dataset mit67 --eval seam --lr_mask 0.05 --alpha 0.5 --prune_threshold 0.6
- Fine-tune the original ResNet18-ImageNet on the target dataset Scenes.
python standard_finetune.py --model resnet18 --dataset mit67
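Standard fine-tuning here means starting from the ImageNet-pretrained weights, replacing the classification head for the target dataset, and training the whole network on that dataset. The snippet below illustrates this with torchvision's ResNet18 and a 67-class head (MIT Indoor Scenes has 67 categories); the data loading and hyperparameters in `standard_finetune.py` differ.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Minimal sketch of standard fine-tuning (not the exact logic of standard_finetune.py).
device = "cuda" if torch.cuda.is_available() else "cpu"

model = resnet18(pretrained=True)                   # ImageNet-pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 67)      # new head for the 67 Scenes classes
model = model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def finetune_one_epoch(loader):
    """One pass over the target-dataset loader, updating all weights."""
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```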
- Compute the defect inheritance rate of fine-tuned original ResNet18-Scenes.
python eval_robustness.py --model resnet18 --dataset mit67 --eval standard
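Conceptually, the defect inheritance rate (DIR) measures how often adversarial examples crafted against a model derived from the publicly available pretrained weights also mislead the evaluated fine-tuned model. The snippet below uses advertorch's PGD attack (advertorch is listed in the requirements) to illustrate such a transfer-based measurement; the exact protocol of `eval_robustness.py` (attack configuration, sample filtering, and which model the attack is crafted on) may differ.

```python
import torch
from advertorch.attacks import LinfPGDAttack

def defect_inheritance_rate(surrogate_model, finetuned_model, loader, device="cuda"):
    """Sketch: craft PGD adversarial examples on a surrogate derived from the public
    pretrained weights and count how often they also fool the evaluated model."""
    surrogate_model.eval().to(device)
    finetuned_model.eval().to(device)
    attack = LinfPGDAttack(surrogate_model, eps=8 / 255, nb_iter=40, eps_iter=2 / 255)
    inherited, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        adv = attack.perturb(images, labels)          # adversarial examples on the surrogate
        with torch.no_grad():
            preds = finetuned_model(adv).argmax(dim=1)
        inherited += (preds != labels).sum().item()   # counted as an inherited defect
        total += labels.numel()
    return inherited / total
```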
The following table shows the default hyperparameters. The exact settings used for re-engineering each trained model on each target problem can be read from the names of the experimental result files. For instance, for the re-engineered model file `SeaM/data/multi_class_classification/resnet20_cifar100/predefined/tsc_0/lr_head_mask_0.1_0.05_alpha_1.0.pth`, the values of the learning rate and alpha are 0.05 and 1.0, respectively (see the parsing sketch after the table below).
| Target Problem | Model Name | Learning Rate | Alpha |
|---|---|---|---|
| Binary Classification | VGG16-CIFAR10 | 0.01 | 1.00 |
| | VGG16-CIFAR100 | 0.05 | 1.50 |
| | ResNet20-CIFAR10 | 0.05 | 1.00 |
| | ResNet20-CIFAR100 | 0.12 | 1.50 |
| Multi-class Classification | ResNet20-CIFAR100 | 0.10 | 2.00 |
| | ResNet50-ImageNet | 0.05 | 2.00 |
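The helper below is hypothetical (it is not part of the repository); it simply reads the values off result file names that follow the `lr_head_mask_<lr_head>_<lr_mask>_alpha_<alpha>.pth` pattern shown above, where the first number is presumably the learning rate of the classification head.

```python
import re

def parse_result_filename(filename):
    """Extract (lr_head, lr_mask, alpha) from names like 'lr_head_mask_0.1_0.05_alpha_1.0.pth'."""
    m = re.match(r"lr_head_mask_([\d.]+)_([\d.]+)_alpha_([\d.]+)\.pth$", filename)
    if m is None:
        raise ValueError(f"Unrecognized result file name: {filename}")
    lr_head, lr_mask, alpha = map(float, m.groups())
    return lr_head, lr_mask, alpha

print(parse_result_filename("lr_head_mask_0.1_0.05_alpha_1.0.pth"))  # (0.1, 0.05, 1.0)
```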
We investigate the impact of the reduction in the number of weights on the accuracy (ACC) and the defect inheritance rate (DIR).
A threshold