Text classification is a fundamental task in natural language processing and machine learning.
There are many ways to classify text, and deep learning methods have recently been applied to the task.
I introduce classification with CNN and LSTM, two representative deep learning models.
I also cover SVM and Naive Bayes, two classic methods.
python execute.py --method naive_bayes --dataset yelp_review_polarity
Classifier | Link |
---|---|
CNN | Paper |
LSTM | Keras |
Character Level CNN | Paper |
SVM | scikit-learn |
Naive Bayes | scikit-learn |
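
To make the classic baseline concrete, here is a minimal sketch of multinomial Naive Bayes with Laplace smoothing, written in plain Python. It is not the code this repository runs (the table above points to scikit-learn for that); the function names `train_naive_bayes` and `predict`, the whitespace tokenizer, and the toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(texts, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized texts."""
    word_counts = defaultdict(Counter)  # per-class word frequencies
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)

    class_docs = Counter(labels)  # number of documents per class
    return {
        "alpha": alpha,  # Laplace smoothing strength
        "vocab": vocab,
        "word_counts": word_counts,
        "totals": {c: sum(word_counts[c].values()) for c in class_docs},
        "log_prior": {c: math.log(n / len(labels)) for c, n in class_docs.items()},
    }


def predict(model, text):
    """Return the class with the highest log posterior for one text."""
    # Words never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]

    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        for token in tokens:
            count = model["word_counts"][label][token]
            score += math.log((count + alpha) / (model["totals"][label] + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for this example only.
texts = [
    "great food and friendly staff",
    "terrible service and cold food",
    "friendly staff great service",
    "cold terrible meal",
]
labels = ["positive", "negative", "positive", "negative"]

model = train_naive_bayes(texts, labels)
print(predict(model, "great friendly staff"))
print(predict(model, "cold terrible service"))
```

Scores are summed in log space so that long documents do not underflow to zero, and the smoothing term keeps a word that is missing from one class from driving that class's probability to zero.
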
- Classes: 4
- Train Data Size: 120,000
- Test Data Size: 7,600
Classifier | validation loss | validation accuracy |
---|---|---|
CNN | 0.2994 | 0.9055 |
LSTM | 0.2587 | 0.9106 |
Character Level CNN | 0.3692 | 0.8709 |
SVM | - | 0.9007 |
Naive Bayes | - | 0.9182 |
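
The two metrics in the table can be sketched as follows. The README does not say how the loss is defined, so this assumes categorical cross-entropy over predicted class probabilities, which is the usual choice for softmax classifiers. The helper names and the toy predictions are hypothetical. A classifier that outputs only a label, with no probabilities, has no such loss to report, which may be why the SVM and Naive Bayes rows show `-`.

```python
import math


def accuracy(y_true, y_prob):
    """Fraction of samples whose highest-probability class is the true class."""
    correct = 0
    for true_class, probs in zip(y_true, y_prob):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        correct += int(predicted == true_class)
    return correct / len(y_true)


def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for true_class, probs in zip(y_true, y_prob):
        # Clip to eps so a zero probability does not produce log(0).
        total -= math.log(max(probs[true_class], eps))
    return total / len(y_true)


# Toy predictions for a 4-class problem, invented for this example only.
y_true = [0, 1, 2, 3]
y_prob = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.1, 0.1, 0.5, 0.3],
    [0.4, 0.3, 0.2, 0.1],  # misclassified: true class 3 gets only 0.1
]

val_accuracy = accuracy(y_true, y_prob)
val_loss = cross_entropy(y_true, y_prob)
print(f"validation loss: {val_loss:.4f}, validation accuracy: {val_accuracy:.4f}")
```

Loss and accuracy can rank models differently: accuracy only checks the top prediction, while cross-entropy also penalizes confident mistakes.
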