Crawling the data from lazada, websosanh, compare.vn, cdiscount and cungmua with flexible configs

nhat2008/vietnam-ecommerce-crawler

Project for crawling data from lazada, websosanh, compare.vn, cdiscount and cungmua, with many cool wrappers:


1. good structure for scrapy, with items and pipelines
2. automatic proxy rotation
3. simple to run: no need to remember the command to run scrapy
4. flexible config: the crawler extracts data using the patterns in template/product.yml
5. saves data to a database: MongoDB or Elasticsearch
6. uses pybloom to check for duplicate crawled data while crawling
7. stopping after time -
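
To illustrate the duplicate check in item 6, here is a minimal sketch of how a Bloom filter can drop already-crawled URLs. It is a standard-library stand-in, not the pybloom API and not this project's code; the class `BloomFilter`, the helper `drop_duplicates` and the example URLs are all assumptions made for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Tiny Bloom filter: never misses a seen key, may rarely report a false positive."""

    def __init__(self, capacity, error_rate=0.001):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(1, int(-capacity * math.log(error_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

    def add(self, key):
        """Insert key and return True if it was (probably) already present."""
        seen = key in self
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)
        return seen


def drop_duplicates(urls, seen):
    """Yield only the URLs that the filter has not recorded yet."""
    for url in urls:
        if not seen.add(url):
            yield url


# Hypothetical product URLs, used only to show the behaviour.
seen_urls = BloomFilter(capacity=1000)
crawled = ["https://example.com/p/1", "https://example.com/p/2", "https://example.com/p/1"]
fresh = list(drop_duplicates(crawled, seen_urls))
print(fresh)
```

A Bloom filter keeps memory use fixed no matter how long the URLs are, at the cost of occasionally skipping a page it has not actually seen.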

Install the dependencies from requirements.txt, then start the crawler:

$ pip install -r requirements.txt
$ python app.py
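
The contents of app.py are not shown here, so the following is only a sketch of how a launcher script could wrap the scrapy command line and apply a time limit. It assumes Scrapy's `crawl` command with the `-s` option and the `CLOSESPIDER_TIMEOUT` setting; the function names and the spider name `lazada` are hypothetical.

```python
import subprocess


def build_command(spider, timeout_seconds=None):
    """Build the scrapy command line for one spider."""
    cmd = ["scrapy", "crawl", spider]
    if timeout_seconds is not None:
        # CLOSESPIDER_TIMEOUT asks Scrapy to close the spider after this many seconds.
        cmd += ["-s", f"CLOSESPIDER_TIMEOUT={int(timeout_seconds)}"]
    return cmd


def run(spider, timeout_seconds=None):
    """Run the crawl as a child process and return its exit code."""
    return subprocess.call(build_command(spider, timeout_seconds))


# Show the command that would be executed for a one-hour crawl.
print(" ".join(build_command("lazada", timeout_seconds=3600)))
```

Calling `run("lazada", timeout_seconds=3600)` from the project directory would then start the crawl without typing the scrapy command by hand.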
