csailer/textretrieval

textretrieval

Testing various algorithms for text retrieval

# TF-IDF

Combining the lecture material with the Wikipedia entry on TF-IDF (http://en.wikipedia.org/wiki/Tf%E2%80%93idf#cite_note-understanding-5), I implemented a simple TF-IDF application in Python 2.7.8. I used Visual Studio 2013 to create the project because I wanted to take Python development in VS 2013 for a test drive. All I can say is "meh": not great, but not bad for a project this simple.

If you are not using VS 2013, and I wouldn't blame you, you can still take the three source files and run them from a terminal. The file app.py has the main() function.

These are the files and their uses:

* app.py - Main file, and the one you should call from the terminal, e.g. cd into the working directory and execute python app.py
* corpus.py - Contains a few short TextBlobs simulating 3 documents.
* tfidf.py - A module containing the TFIDF class, which has several implementations of both TF and IDF.
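For readers who want a feel for what tfidf.py computes before opening it, here is a minimal sketch of one common TF and IDF variant. It is not the code from this repository: the function names tf, idf and tfidf, the add-one smoothing in the IDF denominator, and the sample corpus are all assumptions made for illustration, and it tokenizes with plain str.split() instead of TextBlob so that it runs with no extra install.

```python
from __future__ import division, print_function

import math


def tf(word, doc_words):
    """Term frequency: share of the document's tokens equal to `word`."""
    # Assumes doc_words is a non-empty list of tokens.
    return doc_words.count(word) / len(doc_words)


def idf(word, corpus):
    """Inverse document frequency, with add-one smoothing (an assumption)."""
    containing = sum(1 for doc_words in corpus if word in doc_words)
    return math.log(len(corpus) / (1 + containing))


def tfidf(word, doc_words, corpus):
    """TF-IDF score of `word` within one document of the corpus."""
    return tf(word, doc_words) * idf(word, corpus)


# Hypothetical three-document corpus, tokenized on whitespace.
corpus = [
    "python is a programming language".split(),
    "a python is also a large snake".split(),
    "the snake ate the mouse".split(),
]

for i, doc_words in enumerate(corpus):
    scores = {w: tfidf(w, doc_words, corpus) for w in set(doc_words)}
    best = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print("Document", i + 1, [(w, round(s, 3)) for w, s in best])
```

With this smoothing, a word that appears in two of the three documents gets an IDF of log(3/3) = 0, so only words unique to a single document score above zero in a corpus this small. Other variants (no smoothing, log-scaled TF, and so on) rank words differently.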

You will have to install TextBlob, e.g. pip install textblob, and read the instructions: http://textblob.readthedocs.org/en/latest/api_reference.html

Email me if you have questions: chuck@chucksailer.com. And by all means, please fork and enhance, critique, offer advice, leave abusive comments, or whatever.
