


PExplorer/GooglePeoplMayAsk

 
 


This repository is for learning purposes only.
It contains the following scripts:

# GooglePeoplMayAsk
This repo contains the code for scraping questions and their context from Google's "People also ask" results (called "people may ask" in this repo).
- It extracts the following details:
   - Context
   - Question
   - Answer
   - Web_link

The data requires further filtering:
- In a few cases, when the web link is not scraped properly, the wrong context may be fetched. If the web link is malformed (for example, it ends with "..."), exclude the context for that data point.
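
The filtering rule above could be sketched as follows. This is a minimal illustration, not the repo's own code: it assumes each scraped data point is a dict keyed by the field names listed above, and the helper names `has_truncated_link` and `drop_unreliable_context` are hypothetical. Treating an empty link as malformed is also an assumption.

```python
def has_truncated_link(web_link):
    """Return True if the link looks cut off (ends with "...") or is missing."""
    link = (web_link or "").strip()
    # Assumption: an empty link and a single ellipsis character also count as malformed.
    return not link or link.endswith("...") or link.endswith("\u2026")


def drop_unreliable_context(records):
    """Return copies of the records, blanking Context where Web_link is malformed."""
    cleaned = []
    for record in records:
        record = dict(record)  # copy so the scraped data is left untouched
        if has_truncated_link(record.get("Web_link")):
            record["Context"] = None
        cleaned.append(record)
    return cleaned
```

The question and answer are kept even when the context is dropped, since only the context depends on the link being followed correctly.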


# User Reviews Scraper
- Code for scraping reviews from tripoto.com


Download the Mozilla geckodriver from the link below:
https://github.com/mozilla/geckodriver/releases/tag/v0.26.0
Place the downloaded driver in the driver folder.

Make the necessary changes to the script to include more results.

The code is written using the Selenium library for Python.
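
As a rough sketch of the setup described above, a Firefox session pointing at the downloaded driver might be started like this. It assumes the driver folder is named `driver` and sits next to the script, that the binary is called `geckodriver` (`geckodriver.exe` on Windows), and that Selenium 4 is installed. Older Selenium 3 releases pass `executable_path` directly to `webdriver.Firefox` instead of using a `Service`. The function names are hypothetical.

```python
import sys
from pathlib import Path


def geckodriver_path(driver_dir="driver"):
    """Build the expected path of the geckodriver binary inside the driver folder."""
    name = "geckodriver.exe" if sys.platform.startswith("win") else "geckodriver"
    return Path(driver_dir) / name


def make_firefox(driver_dir="driver"):
    """Start Firefox through Selenium using the local geckodriver (Selenium 4 style)."""
    # Imported here so the path helper works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    service = Service(executable_path=str(geckodriver_path(driver_dir)))
    return webdriver.Firefox(service=service)
```

Call `driver = make_firefox()` before scraping and `driver.quit()` when done, so the browser process is closed.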

Possible use cases:
1) Improving reading comprehension models (SQuAD is currently the only dataset used)
2) FAQ generation
3) Any natural language generation use case
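
For the reading comprehension use case, one scraped data point could be reshaped into a SQuAD v1.1-style entry roughly as shown here. This is a sketch under assumptions: records are dicts keyed by the field names listed earlier, the helper `to_squad_entry` is hypothetical, and the answer is located by an exact substring match, so records whose answer text does not appear verbatim in the context are skipped.

```python
def to_squad_entry(record, qa_id):
    """Convert one scraped record into a SQuAD v1.1-style entry, or None if unusable."""
    context = record.get("Context") or ""
    answer = record.get("Answer") or ""
    # SQuAD needs the character offset of the answer inside the context.
    start = context.find(answer) if answer else -1
    if start < 0:
        return None
    return {
        "title": record.get("Web_link") or "",
        "paragraphs": [
            {
                "context": context,
                "qas": [
                    {
                        "id": str(qa_id),
                        "question": record["Question"],
                        "answers": [{"text": answer, "answer_start": start}],
                    }
                ],
            }
        ],
    }
```

Entries built this way can be collected into a list under a top-level `"data"` key to match the SQuAD file layout.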

