ART Attacks
Work in progress ...
The attack descriptions include a link to the original publication and tags describing the framework support of the implementations in ART:
- all/Numpy: implementation based on NumPy to support all frameworks
- TensorFlow: implementation optimised for TensorFlow
- PyTorch: implementation optimised for PyTorch
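
Regardless of the tag, attacks in ART are used through a common interface: wrap the model in an ART estimator, instantiate the attack with that estimator, and call its generate method. The following minimal sketch shows this for a toy PyTorch model and the Fast Gradient Method; class names and arguments follow recent ART releases and may differ in other versions.

```python
# Minimal sketch: an evasion attack run through a framework-specific estimator
# wrapper. Class names and arguments follow recent ART releases and may differ
# across versions; consult the ART documentation for your installation.
import numpy as np
import torch

from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import PyTorchClassifier

# Toy PyTorch model for 28x28 single-channel inputs with 10 classes
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Wrap the model in an ART estimator
classifier = PyTorchClassifier(
    model=model,
    loss=criterion,
    optimizer=optimizer,
    input_shape=(1, 28, 28),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

# Instantiate the attack and generate adversarial examples
x = np.random.rand(8, 1, 28, 28).astype(np.float32)
attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_adv = attack.generate(x=x)
```

The attacks currently available in ART are listed below.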
- Auto-Attack (Croce and Hein, 2020): Auto-Attack runs one or more evasion attacks, either the default attacks or attacks provided by the user, against a classification task. It optimises the attack strength by attacking only correctly classified samples and by first running the untargeted version of each attack, followed by the targeted version against each possible target label.
- Auto Projected Gradient Descent (Auto-PGD) (Croce and Hein, 2020) [all/Numpy]: Auto-PGD attacks classification tasks and optimises its attack strength by adapting the step size across iterations depending on the overall attack budget and the progress of the optimisation. After adapting its step size, Auto-PGD restarts from the best example found so far.
- Shadow Attack (Ghiasi et al., 2020) [TensorFlow, PyTorch]: Shadow Attack causes certifiably robust networks to misclassify an image and produce "spoofed" certificates of robustness by applying large but natural-looking perturbations.
- Wasserstein Attack (Wong et al., 2020) [all/Numpy]: Wasserstein Attack generates adversarial examples with minimised Wasserstein distance and perturbations that follow the content of the original images.
- PE Malware Attacks (Suciu et al., 2018; Demetrio et al., 2020; Demetrio et al., 2019) [TensorFlow]: White-box attacks related to PE malware.
- Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition (Qin et al., 2019) [TensorFlow, PyTorch]: The attack extends the previous work of Carlini and Wagner (2018) to construct effective imperceptible audio adversarial examples.
- Brendel & Bethge Attack (Brendel et al., 2019) [all/Numpy]: The Brendel & Bethge attack is a powerful gradient-based adversarial attack that follows the adversarial boundary (the boundary between the space of adversarial and non-adversarial images as defined by the adversarial criterion) to find the minimum distance to the clean image.
- Targeted Universal Adversarial Perturbations (Hirano and Takemoto, 2019) [all/Numpy]: This attack creates targeted universal adversarial perturbations by combining iterative methods for generating untargeted examples with the fast gradient sign method for creating a targeted perturbation.
- Audio Adversarial Examples: Targeted Attacks on Speech-to-Text (Carlini and Wagner, 2018) [all/Numpy]: The attack constructs targeted audio adversarial examples on automatic speech recognition.
- High Confidence Low Uncertainty (HCLU) Attack (Grosse et al., 2018) [GPy]: The HCLU attack creates adversarial examples that achieve high confidence and low uncertainty on a Gaussian process classifier.
- Iterative Frame Saliency (Inkawhich et al., 2018): The Iterative Frame Saliency attack creates adversarial examples for optical flow-based image and video classification models.
- DPatch (Liu et al., 2018) [all/Numpy]: DPatch creates digital, rectangular patches that attack object detectors.
- Robust DPatch (Liu et al., 2018; Lee and Kolter, 2019) [all/Numpy]: A robust version of DPatch that includes sign gradients and expectations over transformations.
- ShapeShifter (Chen et al., 2018)
- Projected Gradient Descent (PGD) (Madry et al., 2017)
- NewtonFool (Jang et al., 2017)
- Elastic Net (Chen et al., 2017)
- Adversarial Patch (Brown et al., 2017) [all/Numpy, TensorFlow, PyTorch]: This attack generates adversarial patches that can be printed and applied in the physical world to attack image and video classification models.
- Decision Tree Attack (Papernot et al., 2016) [all/Numpy]: The Decision Tree Attack creates adversarial examples for decision tree classifiers by exploiting the structure of the tree and searching for leaves with different classes near the leaf corresponding to the prediction for the benign sample.
- Carlini & Wagner (C&W) L_2 and L_inf attacks (Carlini and Wagner, 2016) [all/Numpy]: The Carlini & Wagner attacks in the L_2 and L_inf norms are some of the strongest white-box attacks. A major difference with respect to the original implementation (https://github.com/carlini/nn_robust_attacks) is that ART's implementation uses line search in the optimisation of the attack objective.
- Basic Iterative Method (BIM) (Kurakin et al., 2016) [all/Numpy]
- Jacobian Saliency Map (Papernot et al., 2016)
- Universal Perturbation (Moosavi-Dezfooli et al., 2016)
- Feature Adversaries (Sabour et al., 2016) [all/Numpy]: Feature Adversaries manipulates images used as inputs to neural networks to mimic the intermediate representations/layers of the original images while changing their classification.
- DeepFool (Moosavi-Dezfooli et al., 2015) [all/Numpy]: DeepFool efficiently computes perturbations that fool deep networks, and thus reliably quantifies the robustness of these classifiers.
- Virtual Adversarial Method (Miyato et al., 2015)
- Fast Gradient Method (Goodfellow et al., 2014) [all/Numpy]
- Square Attack (Andriushchenko et al., 2020)
- HopSkipJump Attack (Chen et al., 2019)
- Threshold Attack (Vargas et al., 2019)
- Pixel Attack (Vargas et al., 2019, Su et al., 2019)
- Simple Black-box Adversarial (SimBA) (Guo et al., 2019)
- Spatial Transformation (Engstrom et al., 2017)
- Query-efficient Black-box (Ilyas et al., 2017)
- Zeroth Order Optimisation (ZOO) (Chen et al., 2017)
- Decision-based/Boundary Attack (Brendel et al., 2018)
- Geometric Decision-based Attack (GeoDA) (Rahmati et al., 2020)
- Poisoning Attack on Support Vector Machines (SVM) (Biggio et al., 2013)
- Backdoor Attack (Gu et al., 2017)
- Clean-Label Backdoor Attack (Turner et al., 2018)
- Adversarial Embedding Backdoor Attack (Tan and Shokri, 2019)
- Hidden Trigger Backdoor Attack (Saha et al., 2019)
- Bullseye Polytope (Aghakhani et al., 2020)
- Backdoor Attack on Deep Generative Models (DGM) (Rawat et al. 2021)
- Clean Label Feature Collision Attack (Shafahi et al., 2018)
- Gradient Matching / Witches' Brew Attack (Geiping et al., 2020)
- Sleeper Agent Attack (Souri et al., 2021)
- BadDet Attacks (Chan et al., 2022)
  - BadDet Object Generation Attack (OGA)
  - BadDet Regional Misclassification Attack (RMA)
  - BadDet Global Misclassification Attack (GMA)
  - BadDet Object Disappearance Attack (ODA)
- Functionally Equivalent Extraction (Jagielski et al., 2019)
- Copycat CNN (Correia-Silva et al., 2018)
- KnockoffNets (Orekondy et al., 2018)
- Attribute Inference Black-Box
- Attribute Inference White-Box Lifestyle Decision Tree (Fredrikson et al., 2015)
- Attribute Inference White-Box Decision Tree (Fredrikson et al., 2015)
- Membership Inference Black-Box
- Membership Inference Black-Box Rule-Based
- Label-Only Boundary Distance Attack (Choquette-Choo et al., 2020)
- Label-Only Gap Attack (Choquette-Choo et al., 2020)
- MIFace (Fredrikson et al., 2015): Inference attack exploiting adversarial access to a model to learn information about its training data using confidence values revealed in predictions.
- Database Reconstruction: Implementation of a database reconstruction attack inferring the missing row of a training dataset for a trained model.
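
The inference attacks above follow the same pattern as the evasion attacks, running against a wrapped estimator. Below is an illustrative sketch of a black-box membership inference attack on a scikit-learn model; the module path, the attack_model_type option, and the fit/infer signatures are assumptions based on recent ART releases and should be checked against the documentation of your version.

```python
# Illustrative sketch of a black-box membership inference attack against a
# scikit-learn classifier. Module path and signatures are assumptions based on
# recent ART releases; check the ART documentation for your version.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

from art.attacks.inference.membership_inference import MembershipInferenceBlackBox
from art.estimators.classification import SklearnClassifier

# Toy data split into member (training) and non-member (held-out) records
rng = np.random.default_rng(0)
x = rng.random((200, 4)).astype(np.float32)
y = (x.sum(axis=1) > 2.0).astype(int)
x_member, y_member = x[:100], y[:100]
x_nonmember, y_nonmember = x[100:], y[100:]

# Target model and its ART wrapper
model = RandomForestClassifier(n_estimators=50).fit(x_member, y_member)
classifier = SklearnClassifier(model=model)

# Fit the attack model on known member/non-member records, then infer
# membership for queried records (1 = predicted member of the training set)
attack = MembershipInferenceBlackBox(classifier, attack_model_type="rf")
attack.fit(x_member, y_member, x_nonmember, y_nonmember)
inferred = attack.infer(x_member, y_member)
```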