Sentiment Analysis (Movie Reviews)

This code classifies movie reviews as positive or negative.

Data

Inside the data/ folder, there are movie reviews, which are separated into positive and negative reviews. This data has been taken from an online database.

There are also word lists, split into positive and negative. The Python scripts later generate a more closely selected subset of these words, which is then used to improve classification.

Installation

This project uses Python 2.7. Necessary packages are:

numpy (requires Fortran to be installed)
nltk
scikit-learn

If you're on Linux or Mac, all of these packages should be available with a simple pip install.

Usage

`basic.py`

$ python basic.py

This classifies movie reviews using Naive Bayes and two variants of SVM (with linear and polynomial kernels). It trains the classifiers and prints:

the accuracy of each classifier
two simple example reviews, with the label (positive or negative) that each classifier assigns to them
a confusion matrix, plus precision, recall and F-score
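
For readers unfamiliar with the reported metrics, here is a minimal sketch of how a confusion matrix, precision, recall and F-score relate to each other. It is not the code in `basic.py`; the function names and the six example labels are made up for illustration, and only the standard library is used.

```python
def confusion_matrix(actual, predicted, labels=("pos", "neg")):
    """Return a dict mapping (actual_label, predicted_label) to a count."""
    counts = {(a, p): 0 for a in labels for p in labels}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts


def precision_recall_f1(matrix, positive="pos", negative="neg"):
    """Compute precision, recall and F1 for the `positive` class."""
    tp = matrix[(positive, positive)]  # positive reviews predicted positive
    fp = matrix[(negative, positive)]  # negative reviews predicted positive
    fn = matrix[(positive, negative)]  # positive reviews predicted negative
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels, for illustration only.
actual = ["pos", "pos", "pos", "neg", "neg", "neg"]
predicted = ["pos", "pos", "neg", "neg", "neg", "pos"]

matrix = confusion_matrix(actual, predicted)
precision, recall, f1 = precision_recall_f1(matrix)
print("precision=%.3f recall=%.3f f1=%.3f" % (precision, recall, f1))
```

The F-score here is F1, the harmonic mean of precision and recall; whether the script reports F1 or another F-beta variant is not stated above.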

You can choose whether to use all positive and negative words, or only the selected subset that is more relevant to the reviews, by setting the `word_list` variable at the top of the file.

`parameters.py`

$ python parameters.py

It contains (unstable) code for finding suitable parameters for the SVM classifiers.
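
One common way to search for parameters is an exhaustive grid search. The sketch below shows the idea only and is not taken from `parameters.py`: `grid_search`, `toy_score` and the parameter names `C` and `kernel` are assumptions, and `toy_score` is a stand-in for "train an SVM with these parameters and return its validation accuracy".

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every parameter combination and return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in scoring function (NOT a real SVM): peaks at C=1.0, linear."""
    bonus = 0.05 if params["kernel"] == "linear" else 0.0
    return 1.0 - abs(params["C"] - 1.0) / 10.0 + bonus


grid = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "poly"]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params)
```

In a real run, `evaluate` would be the expensive step, so the number of grid points is usually kept small.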

`select_words_frequency.py`

$ python select_words_frequency.py

It analyzes the existing lists of positive and negative words and selects from them by frequency and by TF-IDF, so that the classifiers can use words that are more relevant to the actual reviews. It fills in data/words/selected-words-frequency.txt and data/words/selected-tfidf.txt.
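
As a rough illustration of TF-IDF-based selection, the sketch below scores each candidate word by its summed TF-IDF over a few tokenized reviews and keeps the top `k`. The exact formula the script uses is not stated here, so the variant below (relative term frequency times `log(N / df)`), the function names and the toy reviews are all assumptions.

```python
import math
from collections import Counter


def tfidf_scores(documents, candidate_words):
    """Score each candidate word by its summed TF-IDF over all documents.

    documents: list of token lists; candidate_words: iterable of words.
    Words that never occur in any document get no score at all.
    """
    n_docs = len(documents)
    candidates = set(candidate_words)
    # Document frequency: number of documents containing each candidate.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(candidates.intersection(tokens))

    scores = Counter()
    for tokens in documents:
        if not tokens:
            continue
        term_counts = Counter(t for t in tokens if t in candidates)
        for word, count in term_counts.items():
            tf = count / float(len(tokens))
            idf = math.log(n_docs / float(doc_freq[word]))
            scores[word] += tf * idf
    return scores


def select_top_words(documents, candidate_words, k):
    """Return the k highest-scoring candidates (ties broken alphabetically)."""
    scores = tfidf_scores(documents, candidate_words)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


# Toy reviews and word list, for illustration only.
reviews = [
    "a great movie with a great cast".split(),
    "a boring and awful movie".split(),
    "great fun".split(),
]
sentiment_words = ["great", "awful", "boring", "superb"]
print(select_top_words(reviews, sentiment_words, 2))
```

A word such as "superb" that appears in the word list but in none of the reviews is dropped, which is the point of selecting words against the actual review text.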

`lib/`

It contains the core logic: it loads the appropriate words, trains on them, and performs either simple or k-fold cross-validation. It has two classifiers: Naive Bayes and SVM (3 variants).
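
To make the k-fold step concrete, here is a minimal standard-library sketch. It is not the code in `lib/`: `k_fold_indices`, `cross_validate`, `train_fn` and `predict_fn` are hypothetical names, and the folds are contiguous, so data sorted by class would need shuffling first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the mean accuracy over k folds.

    train_fn(samples, labels) -> model; predict_fn(model, sample) -> label.
    """
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train],
                         [labels[i] for i in train])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test)
        accuracies.append(correct / float(len(test)))
    return sum(accuracies) / len(accuracies)


folds = list(k_fold_indices(7, 3))
for train, test in folds:
    print("train=%s test=%s" % (train, test))
```

Every sample lands in exactly one test fold, so each review is used for evaluation once and for training in the remaining folds.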
