Ajay-Thomas/Web-Crawler

Web-Crawler

An application for searching data scraped from a specified website (the Google Play Store). The search engine returns the top apps based on the number of downloads and the reviews provided by users.

The WebCrawl.py file uses the BeautifulSoup library to scrape the website and extract the necessary information, such as Name, Description, Reviews and Downloads, which it writes to the AppJson file.
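As a rough illustration of this extraction step, the sketch below parses a small inline HTML snippet with BeautifulSoup and collects the four fields into JSON. The markup, the class names, the `Link` field and the `parse_apps` helper are all assumptions made for the example; the real Play Store pages and the actual WebCrawl.py code will differ.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical markup: the real Play Store pages use different tags and class names.
SAMPLE_HTML = """
<div class="app">
  <a class="title" href="/store/apps/details?id=com.example.notes">Example Notes</a>
  <p class="description">Take quick notes and sync them across devices.</p>
  <span class="reviews">1200</span>
  <span class="downloads">50000</span>
</div>
<div class="app">
  <a class="title" href="/store/apps/details?id=com.example.photo">Example Photo</a>
  <p class="description">Photo editor with filters and collage maker.</p>
  <span class="reviews">800</span>
  <span class="downloads">20000</span>
</div>
"""


def parse_apps(html):
    """Extract one record per app card from the (assumed) markup above."""
    soup = BeautifulSoup(html, "html.parser")
    apps = []
    for card in soup.find_all("div", class_="app"):
        title = card.find("a", class_="title")
        apps.append({
            "Name": title.get_text(strip=True),
            "Link": title["href"],
            "Description": card.find("p", class_="description").get_text(strip=True),
            "Reviews": int(card.find("span", class_="reviews").get_text(strip=True)),
            "Downloads": int(card.find("span", class_="downloads").get_text(strip=True)),
        })
    return apps


apps = parse_apps(SAMPLE_HTML)
# Serialized here as a string; the project stores the equivalent data in its AppJson file.
app_json = json.dumps(apps, indent=2)
```

Keeping the parsing in a function that takes an HTML string makes it possible to try the extraction logic without sending any network requests.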

The Rank.py file returns the top k matches for the provided query. It uses scikit-learn's TfidfVectorizer to generate the term-document matrix and computes similarities using cosine similarity. Finally, the top k similarities are ranked and the relevant links are returned.
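The ranking idea can be sketched as follows, assuming scikit-learn is installed. The `APPS` records and the `top_k` helper are hypothetical and only illustrate TF-IDF plus cosine similarity; they are not the code in Rank.py.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical records standing in for the scraped AppJson data.
APPS = [
    {"Link": "/store/apps/details?id=com.example.notes",
     "Description": "Take notes and sync notes across devices"},
    {"Link": "/store/apps/details?id=com.example.photo",
     "Description": "Photo editor with filters and collage maker"},
    {"Link": "/store/apps/details?id=com.example.run",
     "Description": "Track workouts and running distance"},
]


def top_k(query, apps, k=2):
    """Return up to k (link, score) pairs, best match first."""
    descriptions = [app["Description"] for app in apps]
    vectorizer = TfidfVectorizer(stop_words="english")
    # Rows are app descriptions, columns are terms.
    doc_matrix = vectorizer.fit_transform(descriptions)
    # Project the query into the same term space as the documents.
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_matrix).ravel()
    # Indices sorted by descending similarity, truncated to k.
    ranked = scores.argsort()[::-1][:k]
    # Drop apps that share no terms with the query.
    return [(apps[i]["Link"], float(scores[i])) for i in ranked if scores[i] > 0]


results = top_k("photo editor", APPS, k=2)
```

Because the vectorizer does no stemming, a query term only matches a description that contains the same word form; "photo" would not match "photos" in this sketch.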

App.py builds the application's UI using the Tkinter library.
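A minimal sketch of what such a UI might look like is given below. The widget layout, the `format_results` and `build_ui` helpers, and the `search_fn` callback are all assumptions for illustration and do not reflect the actual contents of App.py.

```python
def format_results(links):
    """Turn a list of links into numbered lines for display."""
    if not links:
        return "No matching apps found."
    return "\n".join(f"{rank}. {link}" for rank, link in enumerate(links, start=1))


def build_ui(search_fn):
    """Build a small search window; search_fn maps a query string to a list of links."""
    # Imported here so the formatting helper stays usable without a display.
    import tkinter as tk

    root = tk.Tk()
    root.title("Web-Crawler search")

    entry = tk.Entry(root, width=40)
    entry.pack(padx=10, pady=5)

    output = tk.Label(root, justify="left", anchor="w")

    def on_search():
        # Run the search and show the ranked links below the button.
        output.config(text=format_results(search_fn(entry.get())))

    tk.Button(root, text="Search", command=on_search).pack(pady=5)
    output.pack(padx=10, pady=5)
    return root
```

Calling `build_ui(search_fn).mainloop()` would open the window, where `search_fn` is whatever function returns the ranked links for a query.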

Note: If the network from which HTTP requests are sent requires a proxy, uncomment the proxy line in the WebCrawl.py file.
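For readers unfamiliar with proxy configuration, here is one way it can be done with the standard library's `urllib.request`. The README does not show the proxy line itself or say which HTTP client WebCrawl.py uses, so the proxy address, the `use_proxy` flag and the `build_opener` helper below are purely illustrative assumptions.

```python
import urllib.request

# Hypothetical proxy address; replace it with the one your network requires.
PROXIES = {
    "http": "http://proxy.example.com:8080",
    "https": "http://proxy.example.com:8080",
}


def build_opener(use_proxy):
    """Return an opener that routes through PROXIES only when use_proxy is True."""
    # An empty mapping makes ProxyHandler ignore proxy settings from the environment.
    handler = urllib.request.ProxyHandler(PROXIES if use_proxy else {})
    return urllib.request.build_opener(handler)


opener = build_opener(use_proxy=True)
# Requests made with opener.open(url) would now be sent via the configured proxy.
```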

The example URL used here is the Google Play Store, where we crawl the top charts page. Based on the given search words, the relevant top k apps are returned.
