Releases: Trusted-AI/adversarial-robustness-toolbox

ART 1.4.0

20 Sep 22:24

This release of ART v1.4.0 introduces framework-specific preprocessing defences, Membership Inference attacks, and support for attacks on Automatic Speech Recognition (ASR) tasks. It also adds and improves multiple evasion and poisoning attacks and defences.

Added

  • Added framework-specific preprocessing defences for PyTorch and TensorFlow v2 in all estimators. This extends the preprocessing defences of ART beyond the framework-independent Numpy implementations of earlier ART versions and makes it possible to use a framework's automatic differentiation to pass accurate loss gradients backwards through the preprocessing defences. This release also adds the first framework-specific implementations of the Spatial Smoothing preprocessing defence in PyTorch and TensorFlow v2, art.defences.preprocessor.SpatialSmoothingPyTorch and art.defences.preprocessor.SpatialSmoothingTensorFlowV2. (#510, #574)
  • Added Membership Inference attacks to evaluate leaks of information about individual training data records art.attacks.inference.membership_inference (#573)
  • Added Neural Cleanse defense against poisoned models. This is the first transformation defense against poisoning; it accepts a potentially poisoned model and returns a transformed version of the model defended against the effects of the poisoning art.defences.transformer.poison.NeuralCleanse (#604)
  • Added Imperceptible ASR evasion attack against Automatic Speech Recognition in PyTorch art.attacks.evasion.ImperceptibleASRPytorch (#605)
  • Added Adversarial Embedding poisoning attack art.attacks.poisoning.PoisoningAttackAdversarialEmbedding (#561)
  • Added new framework- and model-specific estimator for DeepSpeech in PyTorch art.estimators.speech_recognition.PyTorchDeepSpeech (#581)
  • Added support for string type for infinity norm in evasion attacks to facilitate serialisation of arguments (#575)
  • Added support for targeted attack in art.attacks.evasion.AutoAttack (#494)
  • Added targeted version of DPatch evasion attack against object detectors art.attacks.evasion.DPatch (#599)
  • Added property targeted to evasion attacks indicating whether the attack is targeted art.attacks.EvasionAttack (#500)
  • Added new framework- and model-specific estimator for Faster-RCNN in TensorFlow art.estimators.object_detection.TensorFlowFasterRCNN (#487)
  • Added ShapeShifter evasion attack against object detectors art.attacks.evasion.ShapeShifter (#487)
  • Added Simple Black-box Adversarial (SimBA) evasion attack art.attacks.evasion.SimBA (#469)
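
The string form of the infinity norm in the list above matters mainly when attack arguments are passed through JSON or a similar format. The following is a minimal sketch of why, using only the Python standard library. The helper parse_norm is hypothetical and not part of ART, and the accepted string "inf" is an assumption.

```python
import json
import math


def parse_norm(norm):
    """Hypothetical helper (not part of ART): map a serialised norm to a number.

    Assumption: the string "inf" stands for the infinity norm.
    """
    if norm == "inf":
        return math.inf
    if norm in (1, 2):
        return norm
    raise ValueError(f"unsupported norm: {norm!r}")


# Strict JSON has no literal for infinity, so the float form is rejected.
try:
    json.dumps({"norm": math.inf}, allow_nan=False)
    float_form_ok = True
except ValueError:
    float_form_ok = False

# The string form survives a JSON round trip unchanged.
config = json.loads(json.dumps({"norm": "inf", "eps": 0.3}))
restored_norm = parse_norm(config["norm"])
```

With the default allow_nan=True, json.dumps writes Infinity instead, which is not valid JSON and may be rejected by other parsers.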

Changed

  • Changed progress bars to adversarial trainer and Projected Gradient Descent implementations (#603)
  • Changed import paths of Attribute Inference and Model Inversion attacks (#592)

Removed

[None]

Fixed

  • Fixed bug in Thermometer Encoding preprocessor defense and extended it to support channels first data and video data formats (#591)
  • Fixed denormalizing in create_generator_layers in utils/resources/create_inverse_gan_models.py (#491)

ART 1.3.3

21 Aug 19:59

This release of ART 1.3.3 provides updates to ART 1.3.

Added

  • Added support for rectangular images and videos (with square and rectangular frames) to the attacks in art.attacks.evasion.adversarial_patch.AdversarialPatch. The framework-independent implementation AdversarialPatchNumpy supports videos of shape NFCHW or NFHWC and the framework-specific implementation for TensorFlow v2 AdversarialPatchTensorFlowV2 supports videos of shape NFHWC. For video data the same patch will be located at the same position on all frames. (#567)
  • Added a warning to ShadowAttack to inform users that this implementation currently only works on a single sample in a batch size of one. (#556)

Changed

  • The Dockerfile will now automatically check if requirements.txt contains newer versions of the dependencies.
  • Changed the CLEVER metric art.metrics.clever_t to only calculate required class gradients which results in a speed-up by a factor of ~4. (#539)
  • Changed the metric art.metrics.wasserstein_distance to automatically flatten the weights of the two inputs. (#545)
  • Changed art.attacks.evasion.SquareAttack to use model predictions if true labels are not provided to method generate to follow the convention of the other attacks in ART. (#537)

Removed

[None]

Fixed

  • Fixed method set_params in art.attacks.evasion.projected_gradient_descent.ProjectedGradientDescent to correctly update the attributes of the parent class. The attributes of the actual attack implementation have been set correctly before this fix. (#560)

ART 1.3.2

07 Aug 21:28

This release of ART 1.3.2 provides updates to ART 1.3.1.

Added

  • Added verbose parameter for CarliniL2Method, CarliniLInfMethod, and DeepFool attacks to disable progress bars.

Changed

  • Changed the Wasserstein attack to support rectangular images as input (#527)
  • Changed UniversalPerturbation attack to use true labels if provided in internal attacks (#526)
  • Allow None as input for parameter preprocessing of estimators (#493)
  • Allow eps to be larger than eps_step in ProjectedGradientDescent attacks if norm is not np.inf (#495)

Removed

[None]

Fixed

  • Fixed import path for ProjectedGradientDescent option in UniversalPerturbation attack (#525)
  • Fixed support for arrays as clip_values in ProjectedGradientDescentPyTorch attack for PyTorch (#521)
  • Fixed success criteria for targeted attacks with AutoProjectedGradientDescent (#513)
  • Fixed success criteria for attacks used in AutoAttack (#508)
  • Fixed example for Fast-is-better-than-Free adversarial training (#506)
  • Fixed dtype in AutoProjectedGradientDescent and SquareAttack for testing output type of estimator (#499)
  • Fixed parameters in _augment_images_with_patch calls of attack DPatch (#493)

ART 1.3.1

23 Jun 14:17

This release of ART 1.3.1 provides updates to ART 1.3.0.

Added

[None]

Changed

  • Changed the method fit of the deep-learning classifiers KerasClassifier, TensorFlowClassifier, TensorFlowV2Classifier, PyTorchClassifier, and MXClassifier in art.estimators.classification to support index labels in addition to one-hot-encoded labels. (#479)
  • Changed the preprocessing defence art.defences.preprocessor.Mp3Compression to support input in format np.float32 in addition to np.int16 and updated related notebooks. (#482)

Removed

[None]

Fixed

  • Fixed art.attacks.evasion.DeepFool to correctly apply the over-shoot step. Previously the over-shoot vector was always zero, independent of epsilon. (#476)
  • Fixed method set_params for attacks with multiple framework-specific implementations (art.attacks.evasion.AdversarialPatch and art.attacks.evasion.ProjectedGradientDescent) to set attributes correctly and updated related notebooks. Previously these attributes would have been ignored by the attack. (#481)

ART 1.3.0

15 Jun 19:17

This release of ART v1.3.0 extends ART to a library for machine learning security covering Evasion, Poisoning, Extraction and Inference. The Inference module is a new addition and includes implementations of attribute inference and model inversion attacks. A new Estimator API has been implemented and extends ART from a library for classification tasks towards a library supporting all possible machine learning tasks including object detection. Multiple state-of-the-art attacks and defenses have been implemented. The READMEs have been redesigned and new Wiki pages have been created.

Added

  • Added a new Estimator API art.estimators to abstract machine learning models in ART. It replaces the previous Classifier API art.classifiers. The new Estimator API is flexible and extensible to support all possible machine learning tasks. It currently contains implementations for classification, object detection, certification, encoding, and generation models. (#350)
  • Added a framework-specific and model-specific estimator implementation for PyTorch FasterRCNN (torchvision.models.detection.fasterrcnn_resnet50_fpn) as the first object detector estimator. All object detector estimators currently support DPatch, ProjectedGradientDescent, BasicIterativeMethod, and FastGradientMethod evasion attacks. (#350)
  • Added a new type of attack, Inference, in art.attacks.inference with first implementations of Attribute Inference and Model Inversion attacks (#439, #428)
  • Added progress bars using tqdm to all attacks and defenses to provide information about progress to the user. (#447)
  • Added install options to setup.py for frameworks and complete installs. Until now, ART installed only general non-framework dependencies. This update provides the install options all, tensorflow, pytorch, keras, mxnet, xgboost, lightgbm, catboost, gpy, and docs. (#446)
  • Added dependabot.yml to use GitHub’s Dependabot to propose updates to ART’s dependencies. (#449)
  • Added AutoAttack as a new evasion attack. AutoAttack applies a group of white- and black-box attacks (default: AutoPGD with cross-entropy and with difference-logits-ratio loss, SquareAttack, DeepFool) and is an attack approach that achieves state-of-the-art performance in defense evaluations. (#400)
  • Added Auto Projected Gradient Descent (AutoPGD) as a new evasion attack. AutoPGD adapts its step size to guarantee increasing loss in each step. (#400)
  • Added SquareAttack as a new evasion attack. SquareAttack is a black-box attack based on random search and achieves white-box performance. (#400)
  • Added ShadowAttack as a new evasion attack. ShadowAttack creates large but natural-looking perturbations that can spoof certificates of classifiers certified, for example, by Randomised Smoothing. (#409)
  • Added Wasserstein Attack as a new evasion attack. Wasserstein Attack generates adversarial examples with minimized Wasserstein distances, which allow large Lp perturbations in examples that still look natural. (#422)
  • Added DefenceGAN and InverseGAN as new preprocessor defenses. These defenses are based on Generative Adversarial Networks to remove adversarial perturbations. (#411)
  • Added the adversarial training protocol Fast Is Better Than Free as a trainer defense for PyTorch models. The Fast Is Better Than Free protocol allows very fast training of adversarially robust models. (#435)
  • Added H.264/MPEG-4 AVC video compression as preprocessor defense. This defense attempts to remove adversarial perturbations with compression of video data. (#438)
  • Added Feature Collision Clean Label attack as a new poisoning attack for KerasClassifier. This attack allows poisoning the training of a model without modifying the training labels just by adding a modified training example. (#389)
  • Added support for custom loss gradients at any layer of neural network in KerasClassifier. This method allows very sophisticated loss functions to create adversarial examples that imitate the feature representation of benign samples at any layer of the neural networks. Support of this method will be extended to other frameworks in future releases. (#389)
  • Added framework-specific implementations of ProjectedGradientDescent (PGD) evasion attack for TensorFlow v2 and PyTorch. It follows a new concept in ART where an attack implementation based on Numpy, if available, is compatible with all frameworks and framework-specific implementations can be added that take full advantage of a certain framework and only must support ART estimators for this framework. This enables ART to provide attack implementations that run as fast and accurate as possible and it will facilitate integration of original implementations by the attacks’ creators without the need to translate them into implementations based on Numpy. (#390)
  • Added utilities for deprecation of methods and arguments. (#421)
  • Added new metric for Wasserstein distance. (#410)
  • Added the Spectral Signature Defense as a new detector defense against poisoning. This defense uses spectral signatures to detect and defeat backdoor attacks. (#398)
  • Added Mp3 compression as a new preprocessor defense. This defense attempts to remove adversarial perturbations in audio data using MP3 compression. (#391)
  • Added resampling as a new preprocessor defense. This defense attempts to remove adversarial perturbations in audio data by resampling the data. (#397)
  • Added Feature Adversaries attack as a new evasion attack. This attack generates adversarial examples that minimize the difference in feature representation to a benign sample at a certain layer of a neural network. (#364)
  • Added DPatch as a new evasion attack against object detectors. This attack creates digital patches that draw the attention of object detectors to the patch area to prevent the detection of objects outside of the patched area. (#362)
  • Added a new Docker image providing installations of all machine learning frameworks supported by ART and the dependencies of ART. (#386)
  • Added a new method to check a model for obfuscated/vanishing/masked gradients. (#376)
  • Added a framework-specific implementation of the AdversarialPatch physical evasion attack for TensorFlow v2. This implementation provides more accurate loss gradients than the Numpy implementation. (#357)
  • Added Frame Saliency Attack as a new evasion attack. This attack creates adversarial examples with sparse and imperceptible perturbations, primarily intended for video data. (#358)
  • Added Python typing to all source files of ART and a mypy check to all Travis CI runs. (#425)

Changed

  • Extended notebooks demonstrating attacks and defenses with audio and video data. (#463)
  • Changed KerasClassifier to accept wildcards in the model's input shape. (#458)
  • Deactivated the gradients computation during model evaluation in PyTorchClassifier.predict which accelerates the prediction by a factor of ~2 or more. (#452)
  • Changed art.defences.detector.poison.ActivationDefence to also accept data provided with art.data_generators to support datasets larger than the available memory. (#442)
  • Changed default value of apply_predict for art.defences.preprocessor.JpegCompression to True to apply it during prediction by default. (#440)
  • Removed the smoothing factor in the tanh-to-original transformation in CarliniL2Method and CarliniLInfMethod attacks to prevent input values that are extremely close to either of the clip values from being transformed to values outside of the clip values. (#428)
  • Changed art.defences.preprocessor.SpatialSmoothing preprocessor defense to support video data. (#415)
  • Changed art.defences.preprocessor.JpegCompression preprocessor defense to support video data. (#412)
  • Changed copyright notice to “The Adversarial Robustness Toolbox (ART) Authors” and listed original copyright holders in new file AUTHORS. (#406)
  • Changed internal format of clip_values from tuple of int or float to numpy.ndarray with dtype=np.float32. (#392)
  • Moved poison detection defences to new module art.defences.detector.poison. (#399)
  • Moved Randomized Smoothing from wrapper art.wrappers to new estimators in module art.estimators.certification for TensorFlow and PyTorch and removed art.wrappers.RandomizedSmoothing. (#409)

Removed

  • Deprecated argument channel_index of art.classifiers and replaced it with argument channels_first in art.estimators. The new argument channels_first follows usage in the frameworks to describe as a Boolean if the channels dimension is the first or last dimension of a sample. The argument channel_index will be removed after ART 1.4. (#429)
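
The move from channel_index to channels_first can be pictured as a small translation step. The following is a minimal sketch, assuming 4-dimensional image batches in which index 1 means channels first (NCHW) and index 3 means channels last (NHWC). The helper name is hypothetical and not part of ART.

```python
def channels_first_from_channel_index(channel_index):
    """Hypothetical helper (not part of ART): translate the deprecated
    integer channel_index into the Boolean channels_first.

    Assumption: inputs are 4-dimensional image batches, so index 1 means
    NCHW (channels first) and index 3 means NHWC (channels last).
    """
    if channel_index == 1:
        return True
    if channel_index == 3:
        return False
    raise ValueError(
        f"channel_index={channel_index} is ambiguous; set channels_first directly"
    )


# Old-style settings mapped to the new Boolean argument.
nchw = channels_first_from_channel_index(1)
nhwc = channels_first_from_channel_index(3)
```

Any other index is rejected rather than guessed, because a Boolean cannot describe a channels dimension in the middle of a sample.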

Fixed

  • Fixed several bugs in ThermometerEncoding preprocessor defense by implementing the correct forward pass, and implemented estimate_gradients to provide gradients in the original space instead of the discretized/encoded space. (#467, #468)
  • Fixed bug in Boundary Attack to ensure that the adversarial example is projected back to the sphere in each iteration. (#426)
  • Fixed memory leak in KerasClassifier.get_activations by reusing the Keras function calculating the activations. (#417)
  • Fixed RGB-BGR conversion bug in Boundary attack notebook. (#402)
  • Fixed bug in ActivationDefence for RGB images. (#388)
  • Fixed bug in PixelAttack and ThresholdAttack to now return the benign image if no adversarial example has been found. (#384)

ART 1.2.0

15 Mar 19:30

This release of ART v1.2.0 introduces new APIs and implementations of model transforming, model training and output post-processing defences, along with new APIs and implementations of poisoning attacks and new implementations of evasion and extraction attacks. Furthermore, ART now also supports Pandas Dataframe as input to its classifier and attack methods.

Added

  • Added support for Pandas Dataframe as input to Classifiers and Attacks in addition to numpy.ndarray enabling defences and attacks on models expecting dataframes as input (#244)
  • Started a collection of notebooks of adversarial robustness evaluations by adding the evaluation of the EMPIR defence (#319)
  • Added an example notebook for adversarial attacks on video data classification (#321)
  • Added an example notebook for adversarial attacks on audio data classification (#271)
  • Added Backdoor Poisoning Attack (#292)
  • Added new API for Transformer defences (#293)
  • Added Defensive Distillation as a transformation defence (#293)
  • Added new API for Trainer defences (#)
  • Added Madry's Protocol for adversarial training as training defence (#294)
  • Added new API for Postprocessor defences (#267)
  • Added KnockoffNets as extraction attack (#230)
  • Added Few Pixel Attack as evasion attack (#280)
  • Added Threshold Attack as evasion attack (#281)
  • Added option for random epsilon as parameter to the projected gradient descent attack which selects the epsilon from a truncated normal distribution ranging [0, eps] with sigma of eps/2 (#257)
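
The random-epsilon option in the last item can be illustrated with rejection sampling from the Python standard library. This is a minimal sketch and not ART's implementation. The function name is hypothetical, and centring the normal distribution at 0 is an assumption, since the note above only states the range [0, eps] and a sigma of eps/2.

```python
import random


def sample_random_eps(eps, rng):
    """Hypothetical helper: draw an epsilon from a normal distribution
    truncated to [0, eps] with sigma = eps / 2.

    Assumption: the untruncated distribution has mean 0.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    sigma = eps / 2.0
    while True:
        candidate = rng.gauss(0.0, sigma)
        # Reject draws outside the allowed range and try again.
        if 0.0 <= candidate <= eps:
            return candidate


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_random_eps(0.3, rng) for _ in range(1000)]
```

Under these assumptions a little under half of the raw draws fall inside [0, eps], so the loop usually ends after a few iterations.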

Changed

  • Started to refactor the unittests. The tests of KerasClassifier, TensorFlowClassifier, TensorFlowV2Classifier, Boundary attack and Fast Gradient Method have been moved to the new testing system based on pytest with the other tests planned to follow in future releases. (#270)
  • Boundary and HopSkipJump attacks now work with non-square images (#288)
  • Applied Black style formatting
  • PyTorchClassifier now allows the user to select a specific GPU (#290)
  • The classifiers now accept soft-labels (probabilities) as input in their fit methods in addition to hard-labels (one-hot encoded or index labels) (#251)
  • Integrated the post-processing defences into the classifiers following the pre-processing defences (#267)
  • Run unittests with TensorFlow everywhere in v2 mode instead of compatibility mode (#264)
  • Updated Poisoning attack API (#305)
  • Increased definitions of test requirements (#302)

Removed

  • Removed implementations of post-processing defences as classifier wrappers (#267)

Fixed

  • Improved the logging of unittests (#227)
  • Updated method fit_generator in all neural network classifiers (#323)

ART 1.1.1

08 Feb 02:12

This release of ART v1.1.1 fixes two bugs in TensorFlowV2Classifier and KerasClassifier.

Added

[None]

Changed

[None]

Removed

[None]

Fixed

  • Fixed a bug in TensorFlowV2Classifier resulting in incorrect loss calculation for loss_gradients except for tensorflow.keras.losses.SparseCategoricalCrossentropy. (#279)
  • Fixed a bug in KerasClassifier that allowed predictions on input data with wrong shapes without raising any exceptions. We have now added checks for input data shape or are using the model's predict method where possible. This bug did not affect any classifier evaluated with the correct input data shape expected by the model. (#283)

ART 1.1.0

08 Jan 00:57
b8fdf2f

This release of ART v1.1.0 introduces a new class of attacks and defences for model extraction threats in addition to the existing attacks and defences for evasion and poisoning, enables top level package import of ART, and includes a Kubeflow component demonstrating an example application of ART for robustness evaluation of machine learning models.

Added

  • Added separate base classes for evasion, extraction, and poisoning attacks (#250)
  • Added the Functionally Equivalent Extraction attack for neural networks with two dense layers and ReLU activation (#231)
  • Added the Copycat CNN extraction attack (#232)
  • Added defences against model extraction attacks including output modification with reverse sigmoid, random noise, class labels, and high confidence (#234)
  • Added support for top level package import to enable import art (#240)
  • Added references to current limitations of defences (#228)
  • Added version to the ART package (#239)
  • Added a Kubeflow component using ART to run a robustness evaluation of PyTorch models with FGSM. This is a simple example and does not intend to represent a comprehensive robustness evaluation. (#206)
  • Added class gradients to art.classifiers.ScikitlearnSVC to enable targeted white-box attacks on SVM (#215)
  • Added checks to all classifiers raising an exception if the input data is of format np.uint8, np.uint16, np.uint32, or np.uint64 to avoid unexpected outcomes during input preprocessing (#226)
  • Added support for Keras 2.3 and later with TensorFlow v2 as backend (#200)

Changed

  • Changed the Fast Gradient Sign Method attack minimal perturbation implementation to prevent it from modifying the original input data (#213)
  • Changed the reporting of attack success rates to always report percentages across all attacks (#202)
  • Changed and improved the detection of the loss function in KerasClassifier (#212)

Removed

[None]

Fixed

  • Fixed a bug in the logging configuration (#190)
  • Fixed a bug in the HCLU attack by replacing the hard-coded confidence parameter (#228)
  • Fixed a bug in TensorFlowV2Classifier by adding the missing attribute _input_shape (#249)

ART 1.0.1

08 Oct 14:34

This release of ART 1.0.1 accounts for initial user feedback on v1.0.0.

Added

  • Added support for binary logistic regression with sklearn.linear_model.LogisticRegression in addition to the existing support for multi-class logistic regression (#171)

Changed

  • Extended exception messages inside of attacks checking for valid combinations of attacks and classifiers to provide better explanations of the reason for the raised exception (#174)

  • Updated Travis unit-testing to use TensorFlow 2.0.0 (#183)

Removed

[None]

Fixed

  • Fixed an issue in art.attacks.PoisoningAttackSVM where sometimes a certain class label wouldn't create unique poison points (#168)

  • Fixed typos in README (#170, #184)

ART 1.0.0

12 Sep 23:29

This is the first major release of the Adversarial Robustness 360 Toolbox (ART v1.0)!

This release generalises ART to support all possible classifier models, in addition to its existing support for neural networks. Furthermore, it generalises the label format, to accept index labels as well as one-hot encoded labels, and the input shape, to accept, for example, tabular data as input features. This release also adds new model-specific white-box and poisoning attacks and provides new methods to certify and verify the adversarial robustness of neural networks and decision tree ensembles.

Added

  • Add support for all classifiers and pipelines of scikit-learn including but not limited to LogisticRegression, SVC, LinearSVC, DecisionTreeClassifier, AdaBoostClassifier, BaggingClassifier, ExtraTreesClassifier, GradientBoostingClassifier, RandomForestClassifier, and Pipeline. (#47)

  • Add support for gradient boosted tree classifier models of XGBoost, LightGBM and CatBoost.

  • Add support for TensorFlow v2 (rc0) by introducing a new classifier TensorFlowV2Classifier providing support for eager execution and accepting callable models. KerasClassifier has been extended to provide support for TensorFlow v2 tensorflow.keras Models without eager execution. (#66)

  • Add support for models of the Gaussian Process framework GPy. (#116)

  • Add the High-Confidence-Low-Uncertainty (HCLU) adversarial example formulation as an attack on Gaussian Processes. (#116)

  • Add the Decision Tree attack as a white-box attack for decision tree classifiers (#115)

  • Add support for white-box attacks on scikit-learn’s LogisticRegression, SVC, LinearSVC, and DecisionTreeClassifier, as well as GPy and black-box attacks on all scikit-learn classifiers and XGBoost, LightGBM and CatBoost models.

  • Add Randomized Smoothing as wrapper class for neural network classifiers to provide certified adversarial robustness under the L2 norm. (#114)

  • Add the Clique Method Robustness Verification method for decision-tree-ensemble classifiers and extend it for models of XGBoost, LightGBM, and scikit-learn's ExtraTreesClassifier, GradientBoostingClassifier, RandomForestClassifier. (#124)

  • Add BlackBoxClassifier expecting only a single Python function as interface to the classifier predictions. This is the most general and versatile classifier of ART. New tutorial notebooks demonstrate BlackBoxClassifier testing the adversarial robustness of remote, deployed classifier models and of the Optical Character Recognition (OCR) engine Tesseract. (#123, #152)

  • Add the Poisoning Attack for Support Vector Machines with linear, polynomial or radial basis function kernels. (#155)

Changed

  • Introduce a new flexible API for all classifiers with an abstract base class for basic classifiers (minimal functionality to support black-box attacks), and mixins for neural networks, gradient-providing classifiers (to support white-box attacks), and decision-tree-based classifiers.

  • Update, extend and introduce new get started examples and notebook tutorials for all supported frameworks. (#47, #140)

  • Extend label format to accept index labels in addition to the already supported one-hot-encoded labels. Internally ART continues to treat labels as one-hot-encoded. This feature allows users of ART to use the label format preferred by their machine learning framework and datasets. (#126)

  • Change the order of the preprocessing steps of applying defences and standardisation/normalisation in classifiers. So far the classifiers first applied standardisation followed by defences. With this release the defences will be applied first followed by standardisation to enable comparable defence parameters across classifiers with different standardisation/normalisation parameters. (#84)

  • Use the batch_size of an attack as argument to the method predict of its classifier to reduce out-of-memory errors for large models. (#105)

  • Generalize the classifiers of TensorFlow, Keras, PyTorch, and MXNet by removing assumptions on their output (logits or probabilities). The Boolean parameter logits has been removed from Classifier API in methods predict and class_gradient. The predictions and gradients are now computed at the output of the model without any modifications. (#50, #75, #106, #150)

  • Rename TFClassifier to TensorFlowClassifier and keep TFClassifier for backward compatibility.
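
The label-format change in the list above (index labels accepted, one-hot encoding kept internally) can be sketched as a small conversion step. This is a minimal illustration in plain Python and not ART's implementation. The helper to_one_hot and its arguments are hypothetical.

```python
def to_one_hot(labels, nb_classes):
    """Hypothetical helper (not part of ART): return labels as one-hot rows.

    Accepts either index labels such as [0, 2] or rows that are already
    one-hot encoded such as [[1, 0, 0], [0, 0, 1]].
    """
    if labels and isinstance(labels[0], (list, tuple)):
        # Already one-hot encoded: copy the rows unchanged.
        return [list(row) for row in labels]

    one_hot = []
    for index in labels:
        if not 0 <= index < nb_classes:
            raise ValueError(f"label {index} is outside [0, {nb_classes})")
        row = [0.0] * nb_classes
        row[index] = 1.0
        one_hot.append(row)
    return one_hot


from_indices = to_one_hot([0, 2], nb_classes=3)
from_one_hot = to_one_hot([[0, 1, 0]], nb_classes=3)
```

Converting once at the boundary lets the rest of a pipeline assume a single label format, which is the design the note describes.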

Removed

  • Sunset support for Python 2 in preparation for its retirement on Jan 1, 2020. We have stopped running unittests with Python 2 and do not require new contributions to run with Python 2. We keep existing compatibility code for Python 2 and 3 where possible. (#83)

Fixed

  • Improve VirtualAdversarialMethod by making the computation of the L2 data normalisation more reliable and raising an exception if it is used with a model providing logits as output. Currently, VirtualAdversarialMethod is expecting probabilities as output. (#120, #157)