Automatically mines technology information from the web, stores it in a database, and makes it available for horizon scanning exercises. Supports both RSS-enabled and non-RSS sites.
- Edit config.py to specify your host, port, database name, and collection name.
- Add the URLs that provide an RSS feed and the URLs that do not to their respective variables in config.py.
- Run initiate.py.
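
As a rough illustration of the configuration steps above, a config.py might look something like the sketch below. Every variable name and value here is an assumption made for the example; check the actual config.py in the repository for the real names. The port shown is MongoDB's default and is only a placeholder, since the database engine is not named here.

```python
# config.py (illustrative sketch; all names and values below are assumptions)

# Database connection settings
HOST = "localhost"
PORT = 27017  # placeholder: MongoDB's default port
DATABASE_NAME = "horizon_scanning"  # hypothetical database name
COLLECTION_NAME = "articles"        # hypothetical collection name

# Sites that expose an RSS feed (placeholder URL)
RSS_URLS = [
    "https://example.com/feed.xml",
]

# Sites without an RSS feed, which have to be scraped (placeholder URL)
NON_RSS_URLS = [
    "https://example.org/news",
]
```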
Most RSS-enabled URLs should work without changes; for non-RSS URLs, however, the scraping logic needs some tweaking to get it working.
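
To give a feel for the kind of tweaking a non-RSS site may need, here is a minimal sketch that pulls link titles and absolute URLs out of a page's HTML using only the Python standard library. It is not the scraping code used by this project; `LinkExtractor`, `extract_links`, and the sample HTML are hypothetical names and data made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect (title, absolute_url) pairs from anchor tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None  # URL of the anchor currently being read, if any
        self._text = []    # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text
            title = " ".join("".join(self._text).split())
            if title:
                self.links.append((title, self._href))
            self._href = None


def extract_links(html, base_url):
    """Return all (title, url) pairs found in an HTML string."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page fragment
sample = (
    '<ul><li><a href="/news/1">Quantum  chip\n announced</a></li>'
    '<li><a href="https://example.org/x">Other</a></li></ul>'
)
links = extract_links(sample, "https://example.com")
```

In practice each site tends to need its own filtering on top of this (for example, keeping only links inside a particular list or section), which is where the tweaking comes in.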