This project shows how to use PySpark on the UNSW-NB datasets, which are used for intrusion detection.
ICCUBEA Conference Publication Link
PySpark is the Python API for Apache Spark.
PySpark also provides MLlib, which helps in implementing ML algorithms.
Running queries on huge amounts of data is difficult, and Spark makes the job easier.
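As a rough illustration of both points, here is a minimal sketch of a Spark query and an MLlib model once PySpark is installed. It is not this project's code: the function names are hypothetical, and the column names ("attack_cat", "label", "dur", "sbytes", "dbytes") are assumptions about the CSV layout that may need adjusting for your copy of the dataset.

```python
# Sketch only: the column names ("attack_cat", "label", "dur", "sbytes",
# "dbytes") are assumptions about the data, not taken from this project.


def make_session(app_name="unsw-nb-demo"):
    """Start a local Spark session (PySpark is imported here, on first use)."""
    from pyspark.sql import SparkSession

    return SparkSession.builder.master("local[*]").appName(app_name).getOrCreate()


def count_by_category(df, column="attack_cat"):
    """Query: number of records per value of `column`, largest group first."""
    return df.groupBy(column).count().orderBy("count", ascending=False)


def train_classifier(df, feature_cols=("dur", "sbytes", "dbytes"), label_col="label"):
    """MLlib: fit a logistic regression on a few numeric columns."""
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    # Pack the chosen numeric columns into the single vector column MLlib expects.
    assembler = VectorAssembler(inputCols=list(feature_cols), outputCol="features")
    train, test = assembler.transform(df).randomSplit([0.8, 0.2], seed=42)
    model = LogisticRegression(featuresCol="features", labelCol=label_col).fit(train)
    # Return the fitted model and its predictions on the held-out split.
    return model, model.transform(test)
```

A typical run would load a file with make_session().read.csv(path, header=True, inferSchema=True), call count_by_category(df).show(), and then pass the same DataFrame to train_classifier. The sketch assumes the feature columns are numeric and contain no nulls.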
Install Python ::
Download the latest version of Python
The PyCharm IDE can be used to make development easier.
Install Spark ::
Tutorial for installing Spark
Install PySpark ::
PySpark can be installed by running the command below in a terminal or IDE.
$ pip install pyspark
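To confirm that the install worked, something like the following check can be run. It is a small sketch using only the Python standard library (Python 3.8 or newer); the helper name pyspark_version is hypothetical.

```python
import importlib.metadata
import importlib.util


def pyspark_version():
    """Return the installed PySpark version string, or None if it is missing."""
    if importlib.util.find_spec("pyspark") is None:
        return None
    try:
        return importlib.metadata.version("pyspark")
    except importlib.metadata.PackageNotFoundError:
        return None


print("PySpark version:", pyspark_version() or "not installed")
```

If this reports "not installed", check that pip and the interpreter used by your IDE belong to the same Python environment.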
Feel free to contribute your ideas or changes.