
Web-Scrapers

This repo contains code for all of the web scrapers I have built, in the hope that it saves somebody else some work when building a scraper for the same site, or a similar one.

As for structure, for the time being I am putting scrapers for different websites (identified by their domain names) into separate folders. Where the domain name is not too long, the folder is named after it. In any case, the README within each folder gives the URL of the site's homepage along with a description of how to use the scraper in that folder.
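For example, the layout looks something like this (the folder and file names here are purely illustrative, not taken from the repo):

    example.com/
        README        <- gives https://example.com and usage instructions
        scraper.py    <- the scraper itself
    another-site.org/
        README
        scraper.py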

Notes

While it might be easier for others if each scraper had its own repo (so users could download just the scraper they want, rather than this entire repo), I prefer keeping them all centralized here.

Most, if not all, of these scrapers will not be actively kept up to date. So if the website a scraper was built against changes in a way that breaks it, users will have to update the scraper themselves to account for that.

As a last quick note: all of these scrapers are built using Python 3. That might be worth noting, given the current state of the libraries available in Python 2 and 3 for interacting with the web.
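To give a rough idea of the general shape a scraper here might take, below is a minimal sketch in Python 3. It assumes the requests and BeautifulSoup libraries and uses a placeholder URL, CSS selector, and output format; the actual scrapers in this repo may be structured differently and may use different libraries, so treat this only as an illustration.

    # A minimal sketch of a scraper in this spirit (assumes the requests and
    # BeautifulSoup libraries; the URL, CSS selector, and output format below
    # are placeholders, not taken from any scraper in this repo).
    import csv

    import requests
    from bs4 import BeautifulSoup


    def scrape(url, out_path):
        # Fetch the page; fail loudly if the request does not succeed.
        response = requests.get(url, timeout=30)
        response.raise_for_status()

        # Parse the HTML and pull out whatever elements the site exposes.
        soup = BeautifulSoup(response.text, "html.parser")
        rows = []
        for item in soup.select("div.listing"):  # placeholder selector
            rows.append([item.get_text(strip=True)])

        # Write the results out as a simple CSV.
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["title"])
            writer.writerows(rows)


    if __name__ == "__main__":
        scrape("https://example.com", "output.csv")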
