GitHub - JayceeLee/adversarial-toolbox: Tools for working with adversarial images

Adversarial toolbox

WIP

Simply, this toolbox offers high level toolbox to generate and work with adversarial images on deep neural networks. Adversarial examples can be a pain to generate, with most implementations being in disparate frameworks, using different parameters. Here we consolidate some of that into an easy to use and setup testing setup.

Setting up the toolbox

This project uses the standard libraries needed for most ML projects. The main routines are built on Keras with a Tensorflow backend. Theano implementations of the attacks are not explicitly included, but they would be easy to add. In addition, you also need the below libraries

numpy
scipy
matplotlib
pillow

Or install everything by first cloning the repository

git clone https://github.com/neale/adversarial-toolbox

then install dependancies with

python -r requirements.txt

This toolbox relies heavily on Cleverhans so the library is included at the current master version to make CI easier.

Train Base Keras CNN on Cifar10

Since adversarial attacks work by computing gradients with respect to an input, output pair. We need a trained model in order to generate images. You can train this model with the included base model trainer train_base_classifier.py. Only Cifar10 is supported, but its easy to plug in your own data for training.

Some models are given in the toolkit/models/ directory. But any keras model definition can be loaded for training.

Input pooling models

In addition to standard VGG and Resnet models, a generic CNN with max pooling as the input layer has been defined for use. Input pooling can be thought of as feature blurring, which helps the model generalize, at the cost of accuracy. Deep networks need to overfit in order to achieve high accuracy, but this overfitting leaves holes in the learned manifold where the output label in that space is less stable. Adversarial algorithms on CNNs largely work by finding an exploiting these holes, so input pooling is introduced to try and close these unstable label spaces.

Generate images

offers multiple ways to generate adversarial examples for standard neural network architectures

Generate images with common techniques

Fast Gradient Sign Method - FSGM
Jacobian Based Saliency Map Approach - JSMA
DeepFool --> coming soon
Universal Adversarial Images --> coming soon

Architectures supported

Vanilla convnet
VGG_16 with batchnorm
Resnet (all)
MLP --> coming soon
Logistic Classifier --> coming soon

Features

Simply, this toolbox offers am easy way to generate adversarial images on deep neural networks. Adversarial images can be a pain to generate, with most implementations being in disparate frameworks, using different parameters. Here we consolidate some of that into an easy to use and setup testing setup.

Using one of the included models, or one that you've defined, its easy to train a classifier in Keras using the included trainer. Currently only training on Cifar10 (mnist coming soon) is supported.

With a trained .h5 model, it can be used to generate the adversarial images via any of the above supported methods

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
toolbox		toolbox
.gitignore		.gitignore
readme.md		readme.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

toolbox

toolbox

.gitignore

.gitignore

readme.md

readme.md

requirements.txt

requirements.txt

setup.py

setup.py

Repository files navigation

Adversarial toolbox

WIP

Setting up the toolbox

Train Base Keras CNN on Cifar10

Input pooling models

Generate images

Architectures supported

Features

About

Releases

Packages

Languages

JayceeLee/adversarial-toolbox

Folders and files

Latest commit

History

Repository files navigation

Adversarial toolbox

WIP

Setting up the toolbox

Train Base Keras CNN on Cifar10

Input pooling models

Generate images

Architectures supported

Features

About

Resources

Stars

Watchers

Forks

Languages