
cpbscholten/scraper

 
 


Additions from fork

This fork fixes several spiders and other bugs related to Python 3 compatibility. The included spiders were confirmed to work correctly as of 2020.

Introduction

This is a firmware scraper that aims to download firmware images and associated metadata from supported device vendor websites.

Dependencies

Usage

  1. Configure the firmware/settings.py file. Comment out SQL_SERVER if metadata about downloaded firmware should not be inserted into a SQL server.

  2. To run a specific scraper, e.g. dlink:

scrapy crawl dlink

To run all scrapers, with at most 4 running in parallel, using GNU Parallel:

parallel -j 4 scrapy crawl ::: `for i in ./firmware/spiders/*.py; do basename ${i%.*}; done`
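
If GNU Parallel is not available, the same idea can be sketched in Python using only the standard library. This is a minimal sketch, not part of the repository: the helper names `spider_names`, `crawl`, and `crawl_all` are hypothetical, and it assumes, like the shell one-liner above, that each spider's name matches its file name under ./firmware/spiders. Unlike the one-liner, it skips files that start with an underscore (such as `__init__.py`), since those are not spiders.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def spider_names(spider_dir):
    """Return sorted spider names derived from the .py files in spider_dir.

    Assumption: each spider's name matches its file name. Files starting
    with an underscore (e.g. __init__.py) are skipped.
    """
    return sorted(
        path.stem
        for path in Path(spider_dir).glob("*.py")
        if not path.name.startswith("_")
    )


def crawl(name):
    """Run `scrapy crawl <name>` and return (name, exit code)."""
    result = subprocess.run(["scrapy", "crawl", name])
    return name, result.returncode


def crawl_all(spider_dir="./firmware/spiders", jobs=4):
    """Crawl every spider, with at most `jobs` scrapy processes at a time."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return dict(pool.map(crawl, spider_names(spider_dir)))


if __name__ == "__main__":
    # Run from the repository root so that scrapy finds the project.
    for name, code in crawl_all().items():
        print(f"{name}: exit code {code}")
```

Threads are sufficient here because each worker only waits on an external `scrapy` process; the `jobs` argument plays the role of `-j 4` in the GNU Parallel command.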
