arv100kri/NLP-Project1

NLP-Project1

The project has the following structure:

  1. A directory called src which contains the code for the classifiers and for the plotting functionality.
  2. A directory called Response, which contains the response files of my “main” classifier and “special” classifier. My best “main” classifier was Naive Bayes with a smoothing parameter of α = 10⁻³. I tried to improve on the Naive Bayes classifier following the paper mentioned in the project specification; the best classifier from that effort was Complement Naive Bayes, with the same smoothing parameter. (More details are in the report.)
  3. A directory called generated files, which contains the data dumped by the classifiers. Some of the files in that directory are used to plot the data, and some are used just to verify the metrics.
  4. Directories train, dev and test contain the training, development and test data respectively.
  5. The files train.key, dev.key and scorer.py are used to obtain the accuracy metrics of the training and development data.
  6. The file sentiment-vocab.tff is the sentiment vocabulary.
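
To make the Complement Naive Bayes idea mentioned above concrete, here is a minimal sketch of the usual formulation. It is not the code in src: the function names (`train_counts`, `complement_log_weights`, `predict`) are hypothetical, documents are assumed to be pre-tokenized lists of strings, and class priors are left out for brevity. Only the smoothing value α = 10⁻³ comes from this README.

```python
import math
from collections import Counter, defaultdict

ALPHA = 1e-3  # smoothing parameter quoted in this README


def train_counts(docs, labels):
    """Accumulate per-class token counts and collect the vocabulary."""
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)
    vocab = {tok for class_counts in counts.values() for tok in class_counts}
    return counts, vocab


def complement_log_weights(counts, vocab, alpha=ALPHA):
    """For each class c, estimate log P(token | all classes except c)."""
    total = Counter()
    for class_counts in counts.values():
        total.update(class_counts)
    weights = {}
    for label, own in counts.items():
        # Token counts taken from every class other than `label`.
        comp = {tok: total[tok] - own[tok] for tok in vocab}
        denom = sum(comp.values()) + alpha * len(vocab)
        weights[label] = {
            tok: math.log((comp[tok] + alpha) / denom) for tok in vocab
        }
    return weights


def predict(tokens, weights):
    """Choose the class whose complement explains the document worst.

    Tokens outside the training vocabulary are ignored (an assumption
    of this sketch).
    """
    def score(label):
        w = weights[label]
        return sum(w[tok] for tok in tokens if tok in w)
    return min(weights, key=score)
```

A toy usage example with made-up data:

```python
docs = [["good", "great"], ["bad", "awful"]]
labels = ["pos", "neg"]
counts, vocab = train_counts(docs, labels)
weights = complement_log_weights(counts, vocab)
print(predict(["good"], weights))
```

Scoring against the complement of each class, rather than the class itself, is what distinguishes this variant from plain multinomial Naive Bayes; the smoothing term keeps the logarithm defined for tokens that never occur in a given complement.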

About

Text Classification
