In many real-life scenarios, such as medical data analysis, autonomous driving, and adversarial training, we are interested in robust deep networks. A network is robust when a relatively small perturbation of the input cannot lead to drastic changes in the output (such as a change of the predicted class). This falls under the broader field of Neural Network Certification (NNC). Two crucial problems in NNC are of profound interest to the scientific community: how to calculate the robustness of a given pre-trained network and how to construct robust networks. The common approach to constructing robust networks is Interval Bound Propagation (IBP). This paper demonstrates that IBP is sub-optimal in the first case due to its susceptibility to the wrapping effect. Even for linear activations, IBP gives strongly sub-optimal bounds. Consequently, one should use strategies immune to the wrapping effect to obtain bounds close to the optimal ones. We adapt two classical approaches dedicated to rigorous computations, Doubleton Arithmetic and Affine Arithmetic, to mitigate the wrapping effect in neural networks. These techniques yield precise results for networks with linear activation functions and are therefore resistant to the wrapping effect. As a result, we obtain bounds significantly closer to the optimal ones than those given by IBP.
The Affine Arithmetic (AA) method reduces the wrapping effect compared to the Interval Bound Propagation (IBP) method.
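To see why the wrapping effect matters even without nonlinearities, the snippet below is a minimal toy sketch (illustrative only, not code from this repository): layer-by-layer interval propagation through two 45-degree rotations doubles the radius of the input box, although the composed map is a 90-degree rotation whose exact image of the box is the box itself.

```python
import numpy as np

# Toy illustration of the wrapping effect (not code from this repository):
# propagate the box [-1, 1]^2 through two 45-degree rotations, i.e. a purely
# linear "network" whose composition is a 90-degree rotation.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def ibp_linear(W, lo, hi):
    """One IBP step for a linear layer y = W x with x in the box [lo, hi]."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    c = W @ center
    r = np.abs(W) @ radius
    return c - r, c + r

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
for _ in range(2):  # two consecutive linear layers
    lo, hi = ibp_linear(R, lo, hi)

print(hi)  # IBP gives [2., 2.], although the exact image of the box under the
           # composed 90-degree rotation is again [-1, 1]^2.
```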
- Install the requirements necessary to build the CAPD library. This library is used to implement the methods described in our paper.
- Build the CAPD library as described here.
- Install rapidjson to read datasets from `.txt` files.
- Use the `environment.yml` file to create a conda environment with the necessary libraries: `conda env create -f environment.yml`. These Python libraries are needed to train neural networks with both standard and IBP training.
For the experiments and ablation study, we use 4 publicly available datasets:
The datasets may be downloaded when the algorithm runs.
The folder `AffineAndDoubletonArithmetic` contains the necessary tools to calculate bounds using Affine Arithmetic (AA), Doubleton Arithmetic (DA), Interval Bound Propagation (IBP), and Lower Bound (LB). In the `main.cpp` file, the following functions are available:

- `runFullyConnectedTest` - calculates bounds using the AA, DA, IBP, and LB methods for fully-connected networks.
- `runConvolutionalTest` - calculates bounds using the AA, IBP, and LB methods for convolutional neural networks (CNNs).
- `runConvolutionalDoubletonTest` - calculates bounds using the DA method for CNNs.
In general, to run the script, you need to convert the dataset and the neural network architectures into a format accepted by the C++ code. The conversion utilities are contained in the `Utils/cpp_utils.py` file. For example, you can convert the weights of a neural network to a `.txt` file by running the function `save_weights2txt` with the correct arguments. To convert dataset points to the accepted format, run the function `save_data2txt`. The weights and data points should be saved in the `AffineAndDoubletonArithmetic/data` folder for the appropriate dataset (such as `svhn`).
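A conversion call might look like the sketch below; the argument names and file paths are only illustrative assumptions, so check the actual signatures of `save_weights2txt` and `save_data2txt` in `Utils/cpp_utils.py` before use.

```python
# Hypothetical usage sketch -- the argument names and paths below are
# placeholders, not the exact signatures from Utils/cpp_utils.py.
from Utils.cpp_utils import save_weights2txt, save_data2txt

model = ...        # a trained network, e.g. loaded from a checkpoint
test_points = ...  # the dataset points for which bounds will be computed

# Export the weights and data points into the folder read by the C++ code.
save_weights2txt(model, "AffineAndDoubletonArithmetic/data/svhn/weights.txt")
save_data2txt(test_points, "AffineAndDoubletonArithmetic/data/svhn/test_data.txt")
```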
When the weights and data points are saved in `.txt` format, you can run the `main.cpp` file to calculate the bounds. After compiling the code with the `make` command, you can run the compiled binary from the command line using one of two command patterns:

```
./your_program method path_to_weights.txt path_to_dataset.txt eps_start eps_end arch_type
```

when `method` is `runConvolutionalTest`, or

```
./your_program method path_to_weights.txt path_to_dataset.txt eps_start eps_end
```

when `method` is `runFullyConnectedTest` or `runConvolutionalDoubletonTest`.
The parameters are described below:
- `method` - one of the three functions (`runFullyConnectedTest`, `runConvolutionalTest`, `runConvolutionalDoubletonTest`) indicating which method should be used.
- `path_to_weights.txt` - path to the weights saved with the `.txt` extension.
- `path_to_dataset.txt` - path to the dataset saved with the `.txt` extension.
- `eps_start` - float, the starting value of the perturbation size applied to the data.
- `eps_end` - float, the ending value of the perturbation size applied to the data.
- `arch_type` - type of architecture used: `cnn_small`, `cnn_medium`, or `cnn_large`. These architectures are described in our paper.
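For example, assuming the converted files were placed under `AffineAndDoubletonArithmetic/data/svhn` and the compiled binary is named `your_program`, a hypothetical call for the medium CNN could be `./your_program runConvolutionalTest data/svhn/weights.txt data/svhn/test_data.txt 0.001 0.01 cnn_medium` (the file names and epsilon values here are purely illustrative).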
In the `Experiments` folder, there are `.json` files where one can define a set of hyperparameters to be used in the training process. Vanilla training will be run with the `epsilon` hyperparameter set to 0. To perform the training, one needs to invoke `python train.py --config=<path_to_config_file>`.
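For instance, a hypothetical call could be `python train.py --config=Experiments/svhn_ibp.json`, where the config file name is only illustrative; use whichever `.json` file in the `Experiments` folder defines the hyperparameters you want.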
The functions to reproduce the experiments described in the paper are located in the `evaluation.py` file.