Skip to content

ajagarapu/SearchEngine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Search for static library files

Language: Pyhton Search based on Vector Space Model on large set of text files. It uses Porter Stemmer algorithm to stem the words of each of the text files. Below are the steps implemented in this project:

Step 1: Extract all the files from the folder. Step 2: For each document, split, strip, tokenize and stem the document data. Step 3: Create the document vectors. Step 4: Fetch the input query from the user. Step 5: Repeat Step 2 and Step 3 to create the query vector. Step 6: Remove the irrelevant documents and calculate the cosine similarity on the relevant documents. Step 7: Apply TF-IDF algorithm to the vectors. Step 8: Display the final results in a sorted order.

About

Search Engine using Vector Space Model, Cosine Similarity and TFIDF

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages