Genre prediction on novels. This project uses the Gutenberg dataset.
This repository does not include the dataset itself. If you want to train your own model, download the dataset here.
The code has been tested on macOS High Sierra, Ubuntu, and Ubuntu on the Windows Subsystem for Linux.
Make sure python3 and pip3 are installed. All dependencies will be installed by pip. If you encounter permission errors, you might have to run `sudo make deps`.
Not all Linux distros come with python-tk. To install it, run `sudo apt install python-tk`.
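The prerequisites above can be checked before running `make deps`. The following is a minimal sketch, not part of the repository: the helper name `missing_prerequisites` is hypothetical, and it assumes that python-tk shows up in Python as the `tkinter` module.

```python
import importlib.util
import shutil


def missing_prerequisites(commands=("python3", "pip3"), modules=("tkinter",)):
    """Return the names of required commands and modules that cannot be found.

    Hypothetical helper: the default lists mirror the requirements named in
    this README; `tkinter` is assumed to be the module provided by python-tk.
    """
    # shutil.which returns None when a command is not on the PATH
    missing = [cmd for cmd in commands if shutil.which(cmd) is None]
    # find_spec returns None when a top-level module cannot be located
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("missing:", ", ".join(problems) if problems else "nothing")
```

If the script reports a missing entry, install it first and then run `make deps`.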
git clone https://github.com/voschezang/books.git
cd books
make deps
Run a prediction on a sample book. The result should be 'histor'. Note that all output genres are abbreviated.
make predict
Run the predictor. _book_ should be a .txt file, e.g. `book=datasets/test/1118.txt`.
make predict book=~mybook.txt
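Because the predictor expects a .txt file, it may help to validate the path before calling `make`. The sketch below is an illustration only: `predict_command` is a hypothetical helper, not something the repository provides, and it only builds the command line shown above.

```python
from pathlib import Path


def predict_command(book):
    """Build the `make predict` command for a book (hypothetical helper).

    Raises ValueError when the file does not have a .txt extension.
    """
    # Expand a leading "~" so that paths relative to the home directory work
    path = Path(book).expanduser()
    if path.suffix.lower() != ".txt":
        raise ValueError(f"expected a .txt file, got: {book}")
    return ["make", "predict", f"book={path}"]


if __name__ == "__main__":
    # Prints the command; pass the list to subprocess.run(..., check=True)
    # from the repository root to actually execute it.
    print(" ".join(predict_command("datasets/test/1118.txt")))
```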
If you do not have pip3, you can try:
make deps2
python src/main.py mybook
(The project should have the following structure)
books/
src/
(jupyter notebooks)
(some python scripts)
datasets/
models/
(a pretrained model)
labels.csv
(other files)
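The layout above can be verified with a short script. This is a sketch under assumptions: `missing_directories` is a hypothetical helper, and it assumes that `src/`, `datasets/` and `models/` sit directly under the repository root. It does not check `labels.csv` or the pretrained model, since their exact locations are not spelled out here.

```python
from pathlib import Path


def missing_directories(root, expected=("src", "datasets", "models")):
    """Return the expected top-level directories that are absent under root.

    Hypothetical helper: the default names are taken from the structure
    listed in this README and assumed to be direct children of the root.
    """
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from inside the cloned `books` directory
    absent = missing_directories(".")
    print("missing directories:", ", ".join(absent) if absent else "none")
```

Since the dataset is not included in the repository, a fresh clone may report `datasets` as missing until the data has been downloaded.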