historical_movers_scrapy

Extract the top gaining and losing stocks from the stockmarketwatch.com premarket and after-hours top gainers/losers pages (./spiders/movers_scraper.py, parse())
Using the Wayback Machine middleware, intercept requests and responses to download snapshots of the site from 20 June 2016 onwards before running the spider (middlewares.py). The middleware is courtesy of https://github.com/sangaline/scrapy-wayback-machine, with some modifications
For each ticker scraped, call the Wayback Machine CDX API for the ticker's finviz page to find the dates on which the page changed (./spiders/movers_scraper.py, parse_cdx())
Find the closest date on which the finviz page was updated and scrape the required data for the ticker from the Wayback Machine archive of that page, or from the current site if that is closest (./spiders/movers_scraper.py, parse_finviz())
Store all scraped info, along with the session and move info, as a Ticker item (items.py)
For each item, check its session and move and write it to the appropriate CSV using CsvItemExporter, along with the correct date, which is determined from the session and time (pipelines.py)
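
The CDX lookup and closest-snapshot steps above can be sketched roughly as follows. This is a minimal illustration, not the code in movers_scraper.py: the function names cdx_url and closest_snapshot, the finviz quote URL pattern, and the chosen CDX query parameters are all assumptions made for the example. It assumes the CDX API is asked for JSON output, where the first row is a header and each following row is [timestamp, original URL].

```python
from datetime import datetime
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"
WAYBACK_PREFIX = "http://web.archive.org/web"


def cdx_url(ticker):
    """Build a CDX query listing captures of a ticker's finviz page (hypothetical helper)."""
    params = {
        "url": f"finviz.com/quote.ashx?t={ticker}",  # assumed finviz URL pattern
        "output": "json",                # rows of fields, header row first
        "fl": "timestamp,original",      # only the fields used below
        "filter": "statuscode:200",      # skip redirects and errors
        "collapse": "digest",            # drop consecutive captures with identical content
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(rows, target, live_url, now):
    """Return the URL (archived or live) whose date is nearest to target (hypothetical helper)."""
    # The live page counts as a candidate dated "now".
    candidates = [(abs(now - target), live_url)]
    for timestamp, original in rows[1:]:  # rows[0] is the header row
        captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
        archived = f"{WAYBACK_PREFIX}/{timestamp}/{original}"
        candidates.append((abs(captured - target), archived))
    return min(candidates, key=lambda pair: pair[0])[1]
```

In a spider, the URL from cdx_url would be requested first, the JSON response parsed into rows, and the URL returned by closest_snapshot requested next. On a tie, this sketch prefers the live page because it is listed first.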

Results:

Scraped ~3500 premarket winners and ~3500 premarket losers, plus 550 after-hours winners and 550 after-hours losers
