Skip to content

SanjanaSunil/wikipedia-search-engine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

83 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Wikipedia Search Engine

A search engine for searching Wikipedia XML dumps.

Requirements

python3 and nltk library is required to run the search engine.

Creation of Inverted Index

To create the inverted index, run the following command.

./index.sh <path_to_wiki_dump_file> <path_to_index_folder>

The arguments are the absolute paths to the Wikipedia XML dump file and the folder where the inverted index is to be created and stored.

Querying

To search, run the following command.

./search.sh <path_to_index_folder>

Enter the query one by one to get the top 10 ranked results.

About

Search engine for Wikipedia XML dumps

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published