Code to reproduce results from the paper:
Back to the Basics: Revisiting Out-of-Distribution Detection Baselines. ICML 2022 Workshop on Principles of Distribution Shift
Out-of-distribution (OOD) detection is the task of determining whether a datapoint comes from a different distribution than the training dataset. For example, we may train a model to classify the breed of dogs and find that there is a cat image in our dataset. This cat image would be considered out-of-distribution. This work evaluates the effectiveness of various scores to detect OOD datapoints.
This repository is intended only for scientific purposes. To detect outliers in your own data, you should instead use the implementation from the official cleanlab library.
This repository is broken into two major folders (inside `src/experiments/`):

- `OOD/`: primary benchmarking code used for the paper linked above.
- `adjusted-OOD-scores/`: additional benchmarking code to produce results from the article: A Simple Adjustment Improves Out-of-Distribution Detection for Any Classifier. Towards AI, 2022. This additional code considers OOD detection based solely on classifier predictions and adjusted versions thereof.
For each experiment, we perform the following procedure:
- Train a Neural Network model with ONLY the in-distribution training dataset.
- Use this model to generate predicted probabilities and embeddings for the in-distribution and out-of-distribution test datasets (these are considered out-of-sample predictions).
- Use out-of-sample predictions to generate OOD scores.
- Threshold OOD scores to detect OOD datapoints.
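As a concrete illustration of the last two steps, one common baseline score is the maximum softmax probability (MSP): datapoints whose top predicted probability is low receive a high OOD score and are flagged once the score exceeds a threshold. This is a minimal sketch, not the paper's exact scoring setup; the threshold value here is illustrative.

```python
import numpy as np

def msp_ood_scores(pred_probs: np.ndarray) -> np.ndarray:
    """OOD score = 1 - max predicted probability per datapoint.

    Higher scores mean the classifier is less confident, suggesting
    the datapoint may be out-of-distribution.
    """
    return 1.0 - pred_probs.max(axis=1)

def detect_ood(pred_probs: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Flag datapoints whose OOD score exceeds the threshold."""
    return msp_ood_scores(pred_probs) > threshold

# Toy example: one confident prediction, one uncertain one.
probs = np.array([[0.97, 0.02, 0.01],   # confident -> likely in-distribution
                  [0.40, 0.35, 0.25]])  # uncertain -> likely OOD
print(detect_ood(probs))  # [False  True]
```

In practice the threshold is chosen from the in-distribution scores alone (e.g. a high percentile), since no OOD labels are available at training time.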
| Experiment ID | In-Distribution | Out-of-Distribution |
|---|---|---|
| 0 | cifar-10 | cifar-100 |
| 1 | cifar-100 | cifar-10 |
| 2 | mnist | roman-numeral |
| 3 | roman-numeral | mnist |
| 4 | mnist | fashion-mnist |
| 5 | fashion-mnist | mnist |
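The table above can be captured as a simple mapping for iterating over experiments. This `EXPERIMENTS` dict is a hypothetical helper for illustration; the actual benchmark code may organize the dataset pairs differently.

```python
# Experiment ID -> (in-distribution dataset, out-of-distribution dataset),
# mirroring the table above. Illustrative only; not the repo's actual config.
EXPERIMENTS = {
    0: ("cifar-10", "cifar-100"),
    1: ("cifar-100", "cifar-10"),
    2: ("mnist", "roman-numeral"),
    3: ("roman-numeral", "mnist"),
    4: ("mnist", "fashion-mnist"),
    5: ("fashion-mnist", "mnist"),
}

for exp_id, (in_dist, ood) in EXPERIMENTS.items():
    print(f"Experiment {exp_id}: train on {in_dist}, detect {ood} as OOD")
```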
For our experiments, we use AutoGluon's ImagePredictor for image classification, which requires the training, validation, and test datasets to be image files.
Links below to download the training and test datasets in PNG format:

- cifar-10 and cifar-100: https://github.com/knjcode/cifar2png
- roman-numeral: https://worksheets.codalab.org/bundles/0x497f5d7096724783aa1eb78b85aa321f
  - There are duplicate images in this dataset (the exact same image under different file names). We use the following script to dedupe: `src/preprocess/remove_dupes.py`
- fashion-mnist: https://github.com/DeepLenin/fashion-mnist_png
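The dedupe step for the roman-numeral dataset can be sketched with a content-hash pass over the image files. This is a simplified stand-in, not the repo's actual `src/preprocess/remove_dupes.py`; the directory layout and behavior here are assumptions.

```python
import hashlib
from pathlib import Path

def remove_exact_duplicates(image_dir: str, dry_run: bool = True) -> list[str]:
    """Find (and optionally delete) byte-identical duplicate PNGs.

    Exact duplicates (same image under different file names) hash to
    the same digest; the first occurrence of each image is kept.
    Returns the paths of the duplicates found.
    """
    seen: dict[str, Path] = {}
    dupes: list[str] = []
    for path in sorted(Path(image_dir).rglob("*.png")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            dupes.append(str(path))
            if not dry_run:
                path.unlink()  # delete the duplicate file
        else:
            seen[digest] = path
    return dupes
```

Hashing file bytes only catches exact duplicates; near-duplicates (re-encoded or resized copies) would need perceptual hashing instead.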
- NVIDIA Container Toolkit: allows us to utilize our NVIDIA GPUs inside Docker environments
- autogluon==0.4.0
Clone this repo and run the commands below:
sudo docker-compose build
sudo docker-compose run --rm --service-ports dcai
Run the command below.
Note that we use a Makefile to run Jupyter Lab for convenience, so we can save args (ip, port, allow-root, etc.).
make jupyter-lab
Run the notebook below to train all models.
src/experiments/OOD/0_Train_Models.ipynb
Note that we use two neural net architectures below with AutoGluon, each using a different backend:
- swin_base_patch4_window7_224 (torch backend)
- resnet50_v1 (mxnet backend)
Here is a notebook that runs all experiments: