Information Retrieval
This repository contains the code required for course CS6200 (Summer 2015) at Northeastern University. The homework in this repository has 7 parts:
- Retrieval Model: (1) native model (2) probability model (3) language model
- Index
- Crawl (multi-processing)
- Evaluation
- Link Analysis Algorithms
- Machine Learning
- Spam filter with ML
Overall, the Index and Crawl parts are the most valuable parts of this repository. Both are efficient and bug-free.