policecar/kaggle-stackoverflow

kaggle-stackoverflow

Machine learning setup for the [Stack Overflow competition at Kaggle](https://www.kaggle.com/c/predict-closed-questions-on-stack-overflow) in 2012

(scored in the top 25%)

Usage:

$ cd src/
$ ./runme.py

This extracts 35 handcrafted features and their combinations from the training data (using NLTK for tokenization and stemming), trains an ensemble of classifiers on the feature matrix (Random Forest, Linear Discriminant Analysis, and Gradient Boosting, all from the scikit-learn toolkit), and makes predictions for the private leaderboard data.
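As a rough illustration of the ensemble step, here is a minimal sketch that fits the three scikit-learn classifier types named above and averages their predicted class probabilities. It is not the code in `runme.py`: the synthetic data, the hyperparameters, the helper names `fit_ensemble` and `predict_proba_ensemble`, and the plain averaging rule are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split


def fit_ensemble(X, y, seed=0):
    """Fit one model of each type on the same feature matrix (hypothetical helper)."""
    models = [
        RandomForestClassifier(n_estimators=50, random_state=seed),
        LinearDiscriminantAnalysis(),
        GradientBoostingClassifier(n_estimators=50, random_state=seed),
    ]
    for model in models:
        model.fit(X, y)
    return models


def predict_proba_ensemble(models, X):
    """Average the per-class probabilities of all models (assumed combination rule)."""
    return np.mean([model.predict_proba(X) for model in models], axis=0)


# Synthetic stand-in for the real feature matrix: 35 features, 5 classes (assumed).
X, y = make_classification(
    n_samples=600, n_features=35, n_informative=10,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = fit_ensemble(X_train, y_train)
proba = predict_proba_ensemble(models, X_test)
predicted = models[0].classes_[proba.argmax(axis=1)]
print("held-out accuracy:", (predicted == y_test).mean())
```

Averaging probabilities is only one way to combine models; weighted averages or stacking are common alternatives, and the repository may use a different scheme.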

Note: feature generation requires some of the NLTK corpora (you should be prompted to download them on first run)
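To give a feel for the feature generation step, the sketch below tokenizes and stems a question with NLTK and derives a few simple counts. The function `text_features` and the four features it returns are hypothetical, not the 35 features the script computes. It deliberately uses `wordpunct_tokenize` and `PorterStemmer`, which work without any corpus download, since the specific NLTK resources the script asks for are not listed here.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

_stemmer = PorterStemmer()


def text_features(title, body):
    """Return a few illustrative handcrafted features for one question."""
    # Regex-based tokenizer; keep alphanumeric tokens only and lowercase them.
    tokens = [
        tok.lower()
        for tok in wordpunct_tokenize(title + " " + body)
        if tok.isalnum()
    ]
    # Stemming maps inflected forms such as "lists" and "list" to a shared stem.
    stems = [_stemmer.stem(tok) for tok in tokens]
    return {
        "n_tokens": len(tokens),
        "n_unique_stems": len(set(stems)),
        "title_length": len(title),
        "title_ends_with_question_mark": int(title.rstrip().endswith("?")),
    }


features = text_features(
    "How do I sort a list?",
    "I tried sorting lists but sorting fails.",
)
print(features)
```

Features like these would be stacked row by row into the matrix the classifiers are trained on.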
