Taught by Dr. Aron Culotta at IIT
Estimated timeline: January 2016 to May 2016, with all phases expected to be complete by the end of May.
Divided into 6 phases:
- Inverted Index
- Ranking using cosine similarity
- Ranking using cosine similarity, RSV, and BM25
- Multinomial Naive Bayes classifier to classify emails as spam or non-spam
- K-means clustering of Twitter profiles based on their descriptions
- PageRank implementation
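
As a rough illustration of the first phase, an inverted index can be sketched as below. This is a minimal sketch and not the course's reference solution: the function names `build_inverted_index` and `intersect`, the whitespace tokenizer, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of documents that contain it.

    Tokenization here is a plain lowercase whitespace split (an assumption;
    a real system would likely strip punctuation and may stem terms).
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(index, terms):
    """Return ids of documents containing every query term (boolean AND)."""
    postings = [set(index.get(term.lower(), [])) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical sample corpus, used only for illustration.
docs = ["the quick brown fox", "the lazy dog", "quick dog"]
index = build_inverted_index(docs)
print(index["quick"])
print(intersect(index, ["quick", "dog"]))
```

The postings lists are kept sorted so that later phases, such as the cosine similarity and BM25 rankers, could walk them in document order; storing term frequencies alongside each document id would be a natural extension for those rankers.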