This repository has been archived by the owner on Oct 19, 2021. It is now read-only.

moffeemoffee/cs-191-ml-naive-bayes-classifier-ham-spam

Setup

Folder structure

├── dataset
│  └── trec07p
│     ├── data
│     ├── delay
│     ├── full
│     ├── partial
│     └── README.txt
├── log.txt
├── naivebayes.py
├── preprocess.py
├── processed.csv
├── README.md
├── requirements.txt
└── train.py

Get the data set from https://plg.uwaterloo.ca/~gvcormac/treccorpus07/ and extract it into the dataset folder, following the provided structure.
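Before running anything, it can help to confirm the extraction matches the structure above. The following is a minimal sketch, not part of the repository: the function name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical, and the list of entries is copied from the folder tree in this README.

```python
from pathlib import Path

# Entries expected under dataset/trec07p, taken from the folder structure above.
EXPECTED_ENTRIES = ["data", "delay", "full", "partial", "README.txt"]


def missing_entries(root="dataset/trec07p"):
    """Return the expected entries that are absent under root (hypothetical helper)."""
    base = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (base / name).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing from dataset/trec07p:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root; an empty result suggests the archive was extracted into the right place.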

Pre-requisites

Python 3.7.2 (untested on other versions)

Installation

pip install -r requirements.txt

Usage

Run train.py, which uses processed.csv:

python train.py

Example usage of NaiveBayes from naivebayes.py can be seen in train.py as well.
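For readers unfamiliar with the technique, here is a rough, self-contained sketch of a multinomial naive Bayes classifier with add-one (Laplace) smoothing. It is an illustration only: `TinyNaiveBayes`, its `fit` and `predict` methods, and the toy messages are all hypothetical, and the actual interface of NaiveBayes in naivebayes.py may differ.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Hypothetical multinomial naive Bayes; NOT the NaiveBayes class from naivebayes.py."""

    def fit(self, documents, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(documents, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # ignore words never seen during training
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


if __name__ == "__main__":
    # Made-up toy messages, not taken from the trec07p data set.
    messages = [
        "win cash prize now",
        "cheap pills win big",
        "meeting agenda for monday",
        "lunch on monday",
    ]
    labels = ["spam", "spam", "ham", "ham"]
    model = TinyNaiveBayes().fit(messages, labels)
    print(model.predict("win a prize"))
    print(model.predict("agenda for lunch"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the add-one smoothing keeps a word that is absent from one class from forcing that class's probability to zero.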

To generate a new processed.csv, run preprocess.py:

python preprocess.py
