LPUCapstone2021/Scraper

Scraper

Setup

$ # build docker image
$ docker build --tag scraper .

$ # execute container
$ docker run --interactive --tty \
    --name capstone-scraper \
    --mount type=bind,source="$(pwd)",target=/app \
    scraper

$ # create project and spider(s)
$ scrapy startproject Scraper
$ cd Scraper
$ scrapy genspider -t crawl cardekho cardekho.com
$ scrapy genspider -t crawl zigwheels zigwheels.com

Run crawler

$ docker start capstone-scraper
$ docker exec -it capstone-scraper bash
$ scrapy crawl cardekho -o data/data.csv
$ scrapy crawl zigwheels -o data/data.csv
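Both crawl commands above write to the same data/data.csv, and Scrapy's -o option appends to an existing file rather than overwriting it. Depending on the Scrapy version, the combined CSV may therefore contain a second header row partway through. The following is a minimal sketch, not part of this repository, of removing such repeated headers with the Python standard library. The function name and the sample columns (name, price) are hypothetical, and it assumes both spiders export the same fields in the same order.

```python
import csv
import io


def drop_repeated_headers(lines):
    """Return (header, rows), skipping blank rows and repeats of the header."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        # Empty input: nothing to clean.
        return [], []
    rows = [row for row in reader if row and row != header]
    return header, rows


# Hypothetical sample imitating two exports appended to one file.
sample = io.StringIO(
    "name,price\n"
    "Swift,550000\n"
    "name,price\n"
    "Baleno,650000\n"
)
header, rows = drop_repeated_headers(sample)
print(header)
print(rows)
```

To run it on the real file, pass an open file object instead of the in-memory sample, for example `open("data/data.csv", newline="")`.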

Clean data

Open clean.ipynb in Google Colab and use the cars.csv file in Scraper/spiders/data as its input.
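The contents of clean.ipynb are not shown here, so the following is only a sketch of the kind of cleaning scraped listings often need: trimming whitespace, dropping blank rows, and dropping exact duplicates. The function name and the columns (name, price) are hypothetical and may not match cars.csv or what the notebook actually does.

```python
import csv
import io


def clean_rows(lines):
    """Strip whitespace, drop blank rows, and drop exact duplicate rows."""
    reader = csv.DictReader(lines)
    seen = set()
    cleaned = []
    for row in reader:
        # Ignore surplus cells (stored under the key None) and treat
        # missing cells (None values) as empty strings.
        row = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        if not any(row.values()):
            continue  # every field is empty
        fingerprint = tuple(row.items())
        if fingerprint in seen:
            continue  # exact duplicate of an earlier row
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


# Hypothetical sample; the real input would be cars.csv.
sample = io.StringIO(
    "name,price\n"
    " Swift ,550000\n"
    "Swift,550000\n"
    ",\n"
    "Baleno, 650000\n"
)
cleaned = clean_rows(sample)
print(cleaned)
```

Duplicates are detected after whitespace is stripped, so rows that differ only in padding are treated as the same record.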
