===============
Social networks have become a pillar of social interaction, and microblogging is a new form of communication in which users describe their current status in short posts distributed by instant messages, mobile phones, email, or the Web.
Long-text classification is a classic task in natural language processing. This project evaluates how well current classification techniques carry over to short-text classification, also known as microblog classification.
To do so, the Twitter Political Corpus data set is used. Given a tweet, the task is to decide whether it is political or non-political.
Machine learning algorithms used:
- Naïve Bayes
- Support Vector Machine
- Maximum Entropy
Keywords: Python, Natural Language Processing, NLTK, Machine Learning.
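
To make the task concrete, here is a minimal sketch of a political vs. non-political tweet classifier using multinomial Naive Bayes with add-one smoothing. It is written with the standard library only so that it runs without dependencies; it is not the project's NLTK-based implementation. The names `tokenize` and `NaiveBayes` are hypothetical, and the example tweets are invented for illustration and are not taken from the Twitter Political Corpus.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(tweet):
    """Lowercase a tweet and split it into word, hashtag, and mention tokens."""
    return re.findall(r"[#@]?\w+", tweet.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tweets, labels):
        self.class_counts = Counter(labels)      # number of tweets per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for tweet, label in zip(tweets, labels):
            tokens = tokenize(tweet)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tweet):
        total = sum(self.class_counts.values())
        tokens = tokenize(tweet)
        scores = {}
        for label, count in self.class_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                if token in self.vocab:  # skip tokens never seen in training
                    score += math.log((self.word_counts[label][token] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy data, for illustration only.
train_tweets = [
    "The senate votes on the health care bill today",
    "Obama addresses congress about the election",
    "Vote for your governor in the election #politics",
    "Just had a great coffee with friends",
    "Watching the game tonight, go team!",
    "My cat is sleeping on the keyboard again",
]
train_labels = ["political"] * 3 + ["non-political"] * 3

model = NaiveBayes().fit(train_tweets, train_labels)
print(model.predict("congress votes on the election bill"))
print(model.predict("coffee and the game with my cat"))
```

Tweets are short, so each one contributes only a handful of tokens; smoothing keeps a single unseen word from zeroing out a class score. Support Vector Machine and Maximum Entropy classifiers could be evaluated on the same token features.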