Skip to content

Using Wasserstein Generative Adversarial Network to fool intrusion detection systems (IDS) into believing that malicious traffic is normal traffic.

Notifications You must be signed in to change notification settings

shahzadharoon/Intrusing-Detection-System-Attack

 
 

Repository files navigation

Intrusion Detection System Attack

A Wasserstein Generative Adversarial Network (WGAN) is used to fool Intrusion detection systems (IDS) into believing that malicious network traffic is normal traffic.

Requirements

Please make sure that you have the following requirements installed on your system:

Installation

First clone the project.

git clone https://github.com/ChrisFugl/Intrusing-Detection-System-Attack
cd Intrusing-Detection-System-Attack

Note: We recommend that you install the Python packages in a virtual environment. See the next section for how to do this, and then proceed with the rest of this section afterwards.

pip install -r requirements.txt

Virtual Environment (optional)

A virtual environment helps you to avoid that Python packages in this project does not conflict with other Python packages in your system. Follow the instructions on this site to install the virtualenv package, which enables you to create virtual environments.

Once virtualenv is installed, you will need to run the following commands to setup a virtual environment for this project.

virtualenv env

You may want to add the flag "--python python3" in case your default Python interpreter is not at version 3 (run python --version to check the Python version):

virtualenv --python python3 env

Either of the previous two commands will create a directory called env in the project directory. You need to run the following command to make use of the virtual environment.

source env/bin/activate

You are now up an running with the virtual environment. Run the following command when you want to exit this environment.

deactivate

Jupyter Support (optional)

The commands in this section should be run from inside of a virtual environment. Note that you only need to do these steps if you are using a virtual environment.

Run this command:

python -m ipykernel install --user --name=IDSA

Use this command to start a Jupyter notebook:

jupyter notebook

Select "IDSA" when creating a new notebook.

Tests

All tests are located in the tests directory. Use the following command to run all tests:

python -m unittest discover -s tests

Structure

The following is proposed for how to structure the codebase. The specific API of each file is to be decided after agreeing on a structure.

data/: Directory containing the dataset.

  • data/KDDTest.csv: NSL-KDD full testset.
  • data/KDDTrain.csv: NSL-KDD full trainingset.

data.py (already implemented): All functions needed to load and preprocess the data.

ids/: Module containing model implementations for each of the IDS algorithms. Each model implementation should expose a class with (at least) a train and a test method. These methods are to be used by train_ids.py, test_ids.py, and test_wgan.py.

  • decision_tree.py: Implementation of decision tree based IDS.
  • k_nearest_neighbours.py: Implementation of K-nearest neighbours based IDS.
  • linear_regression.py: Implementation of linear regression based IDS.
  • multi_layer_perceptron.py: Implementation of multi-layer perceptron (fully connected neural network) based IDS.
  • naive_bayes.py: Implementation of Naive Bayes based IDS.
  • random_forest.py: Implementation of random forest based IDS.
  • support_vector_machine.py: Implementation of support vector machine based IDS.

model.py: WGAN should be implemented as two classes (Generator and Discriminator) in this file.

test_ids.py: Script for evaluating the performance of a pretrained IDS model. The IDS model parameters should be accepted as a command line argument. For example:

python test_ids.py --ids saved_models/{IDS}.pt

test_wgan.py: Script for testing the performance of a pretrained WGAN against one of the pretrained IDS models. The script should accept command line arguments to specify which WGAN- and IDS parameters to use. For example:

python test_wgan.py --wgan saved_models/{WGAN}.pt --ids saved_models/{IDS}.pt

train_ids.py: Script for training the IDS models on the NSL-KDS trainingset. Command line arguments should be used to control which IDS model to train and to control hyperparameters. For example:

python train_ids.py --ids mlp --learning_rate 0.0001

train_wgan.py: Script for training the WGAN on the NSL-KDD trainingset. Command line arguments should be used to control the hyperparameters. For example:

python train_wgan.py --learning_rate 0.0001

See this site for how to save and load models in PyTorch.

API

Overview of the API from each file. See the comments in the files for documentation of each function.

data.py

This file contains functions to work with the datasets.

  • load_train(): Loads the trainingset (unprocessed).
  • load_test(): Loads the testset (unprocessed).
  • preprocess(dataframe, **kwargs): Performs preprocessing of a dataset.
  • remove_content(dataframe): Removes all content features.
  • remove_host_based(dataframe): Removes all host based features.
  • remove_time_based(dataframe): Removes all time based features.
  • get_content_columns(): Gives the column names of the content features.
  • get_host_based_columns(): Gives the column names of the host based features.
  • get_time_based_columns(): Gives the column names of the time based features.

Note: All of the functions, except the three getters, return (one or more) Pandas dataframes. Call .to_numpy() on the dataframe object to convert it to a NumPy array.

Example:

training_data, training_class, training_attack_class = preprocess(load_train())
test_data, test_class, test_attack_class = preprocess(load_test())

Preprocess is not automatically called when loading the datasets. This is intentionally designed that way in order to maintain easy access to the raw datasets.

About

Using Wasserstein Generative Adversarial Network to fool intrusion detection systems (IDS) into believing that malicious traffic is normal traffic.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%