GrubMenuScraper

A web scraper for collecting menu data from restaurant websites.

Methodology

1. Obtain a list of food-related keywords from recipe websites (e.g. ['tomato', 'broccoli']).
2. Scrape restaurant websites to collect every DOM element and its text (e.g. ['<a href="???">Outback steakhouse</a>', '<h3>Clam chowder</h3>', '<h3>Whopper</h3>']).
3. Match the keywords against the scraped text to detect likely menu items, then find the DOM structure those items share to derive the best selector rule (e.g. 'body > div > div > h3').
4. Use the selector rule to extract the menu text precisely (e.g. ['Clam chowder', 'Whopper']).
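
The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code in scrape.py. The names `PathCollector`, `collect`, `best_selector`, and `extract`, the sample HTML, and the extra keyword 'clam' are all assumptions made for the example. "Best selector" is interpreted here as the DOM path that collects the most keyword hits, which is only one plausible reading of the similarity step.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class PathCollector(HTMLParser):
    """Record a (selector path, text) pair for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []    # open tags from the root down to the current node
        self.records = []  # (path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; this tolerates unclosed tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.records.append((" > ".join(self.stack), text))


def collect(html):
    """Step 2: gather DOM paths and their associated text."""
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    return parser.records


def best_selector(records, keywords):
    """Step 3: vote for the path whose text matches keywords most often."""
    votes = Counter(
        path
        for path, text in records
        if any(keyword in text.lower() for keyword in keywords)
    )
    return votes.most_common(1)[0][0] if votes else None


def extract(records, selector):
    """Step 4: keep every text node found under the winning path."""
    return [text for path, text in records if path == selector]


# Hypothetical page fragment and keyword list (step 1 is assumed done).
SAMPLE_HTML = """
<body>
  <div><a href="/">Outback steakhouse</a></div>
  <div><div>
    <h3>Clam chowder</h3>
    <h3>Tomato soup</h3>
    <h3>Whopper</h3>
  </div></div>
</body>
"""
KEYWORDS = ["tomato", "broccoli", "clam"]

records = collect(SAMPLE_HTML)
selector = best_selector(records, KEYWORDS)
menu_items = extract(records, selector)

print(selector)    # body > div > div > h3
print(menu_items)  # ['Clam chowder', 'Tomato soup', 'Whopper']
```

Note that 'Whopper' matches no keyword but is still extracted, because it shares the DOM path of the items that do. That is the point of deriving a selector rather than filtering text by keyword alone. Real pages would likely need class names or positional information in the path to separate menu items from other headings at the same depth.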

Repository contents

data_collection/scrapers/japanese
training
.gitignore
Dockerfile
README.md
geckodriver
jquery.min.js
requirements
scrape.py