GitHub - richelite/classify: My text classification playground

## Classify

My text mining playground

### classifier/blog - Mood classification

Every document in ./blog/data/raw.txt is a blog entry extracted from social media (LiveJournal). The goal is to classify the mood/sentiment of each sentence as "positive" (happy or excited) or "negative" (depressed, sad, or disappointed). The raw.txt file contains 12,747 blog posts, each already labeled 1 (positive) or 0 (negative). 75% of the raw data is used for training; the remaining 25% is held out for validation.
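The 75/25 split described above can be sketched as follows. This is a minimal illustration, not the repository's own script: it assumes raw.txt holds one labeled document per line, and the function names `split_lines` and `split_file`, the fixed seed, and the shuffle-then-cut strategy are all hypothetical choices.

```python
import random


def split_lines(lines, train_fraction=0.75, seed=0):
    """Shuffle a copy of `lines` and cut it into (train, validate) lists."""
    shuffled = list(lines)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def split_file(raw_path, train_path, validate_path, train_fraction=0.75, seed=0):
    """Read one document per line (assumed format) and write the two splits."""
    with open(raw_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    train, validate = split_lines(lines, train_fraction, seed)
    with open(train_path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in train)
    with open(validate_path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in validate)
    return len(train), len(validate)
```

Under these assumptions, a 12,747-line file would yield 9,560 training lines and 3,187 validation lines; the repository's actual counts may differ depending on how it rounds or samples.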

$ python main.py -m NaiveBayes -t ./data/train.txt -v ./data/validate.txt

Precision=0.74607658506

$ python main.py -m LinearSVM -t ./data/train.txt -v ./data/validate.txt

Precision=0.831136220967

$ python main.py -m RbfSVM -t ./data/train.txt -v ./data/validate.txt

Precision=0.634337727558

$ python main.py -m LogReg -t ./data/train.txt -v ./data/validate.txt

Precision=0.829880728186
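The README does not say how the reported `Precision` value is computed. As a reference point, here is a sketch of the standard definition of precision for the positive class (label 1), alongside plain accuracy for comparison. Which of the two main.py actually prints is an assumption left open here; the function names are illustrative only.

```python
def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive: TP / (TP + FP)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    if tp + fp == 0:
        # No positive predictions at all; return 0.0 rather than divide by zero.
        return 0.0
    return tp / (tp + fp)


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
```

The two metrics can diverge noticeably when the classes are imbalanced, so it is worth checking which one a score refers to before comparing models.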

### classifier/tweet - Political Tweet Classification

Every document in ./tweet/data/raw.txt is a microblog post extracted from Twitter. The goal is to classify whether each tweet contains political content. The raw.txt file contains 200,570 tweets, each already labeled 1 (political) or 0 (non-political). 75% of the raw data is used for training; the remaining 25% is held out for validation.

$ python main.py -m NaiveBayes -t ./data/train.txt -v ./data/validate.txt

Precision=0.918970906468

$ python main.py -m LinearSVM -t ./data/train.txt -v ./data/validate.txt

Precision=0.989420533782

$ python main.py -m RbfSVM -t ./data/train.txt -v ./data/validate.txt

Precision=0.937100613958

$ python main.py -m LogReg -t ./data/train.txt -v ./data/validate.txt

Precision=0.988739280276
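The commands above all share the same three flags. A command-line interface of that shape could be parsed with `argparse` roughly as below. This is a sketch, not the contents of the repository's main.py: the short flags and the four model names come from the commands shown, while the long option names (`--model`, `--train`, `--validate`) and the help strings are assumptions.

```python
import argparse

# Model names as they appear in the commands above.
MODEL_NAMES = ["NaiveBayes", "LinearSVM", "RbfSVM", "LogReg"]


def build_parser():
    """Build a parser for the -m / -t / -v flags used in the example commands."""
    parser = argparse.ArgumentParser(
        description="Train a text classifier and score it on a validation file."
    )
    # Long option names are hypothetical; only the short flags are documented.
    parser.add_argument("-m", "--model", choices=MODEL_NAMES, required=True,
                        help="which classifier to train")
    parser.add_argument("-t", "--train", required=True,
                        help="path to the training file")
    parser.add_argument("-v", "--validate", required=True,
                        help="path to the validation file")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(
    ["-m", "LinearSVM", "-t", "./data/train.txt", "-v", "./data/validate.txt"]
)
print(args.model, args.train, args.validate)
```

Restricting `-m` with `choices` makes an unknown model name fail at parse time with a usage message instead of later during training.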
