Scraper for the Roskilde Festival website (www.roskilde-festival.dk). It scrapes information about the bands announced for a specific year.
I have translated the most important parts from Danish to English. The rest will follow, and at some point I will write documentation on how to use it.
- Clone/download the files from the repository
- Download and install PhantomJS version 2.1.1 (or higher) by following this guide.
- Install the PyPI packages:
pip3 install -r requirements.txt
- Done! :)
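To confirm the PhantomJS step worked, a small check like the following can help. It is only a sketch, not part of the repository: the helper names `version_tuple` and `phantomjs_is_recent_enough` are made up for illustration, and it assumes the `phantomjs` binary is on your `PATH` and prints a plain version number for `--version`.

```python
import shutil
import subprocess


def version_tuple(text):
    """Turn a version string such as '2.1.1' into a comparable tuple (2, 1, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def phantomjs_is_recent_enough(minimum="2.1.1"):
    """Return True if a `phantomjs` binary on PATH reports at least `minimum`."""
    binary = shutil.which("phantomjs")
    if binary is None:
        return False  # not installed, or not on PATH
    result = subprocess.run([binary, "--version"], capture_output=True, text=True)
    try:
        return version_tuple(result.stdout) >= version_tuple(minimum)
    except ValueError:
        return False  # the version output was not in the expected form


if __name__ == "__main__":
    if phantomjs_is_recent_enough():
        print("PhantomJS OK")
    else:
        print("PhantomJS missing or too old")
```

Comparing tuples of integers rather than raw strings avoids ranking "2.10.0" below "2.9.0".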
Run without arguments, the script finds and uses the year that the Roskilde Festival website currently uses. For example, as of December 2016 it uses 2017 as the current year:
./RfBandScraping.py
The result of the above should be the band names for the upcoming year, 2017.
If you want to scrape an earlier year, you have to give the year explicitly, like this:
./RfBandScraping.py 2016
Note: Scraping old years isn't implemented yet!
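The optional year argument could be handled roughly as follows. This is a minimal sketch using `argparse`, not the actual code in `RfBandScraping.py`; the function name `parse_year` is hypothetical. It only parses the argument, and returns `None` when no year is given so the caller can fall back to the year the website currently uses.

```python
import argparse


def parse_year(argv=None):
    """Return the year given on the command line, or None if it was omitted."""
    parser = argparse.ArgumentParser(
        description="Scrape the announced bands for a Roskilde Festival year."
    )
    # nargs="?" makes the positional argument optional.
    parser.add_argument(
        "year",
        nargs="?",
        type=int,
        default=None,
        help="festival year to scrape (default: the year the website currently uses)",
    )
    return parser.parse_args(argv).year


if __name__ == "__main__":
    year = parse_year()
    if year is None:
        print("No year given; using the website's current year")
    else:
        print("Requested year:", year)
```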
- Selenium
- BeautifulSoup4
- tqdm
- dateutil
- PyMySQL
- PhantomJS>=2.1.1
- Argument to choose to save bands in database or a file
- Arguments for when to save to the database (host, password etc.)
- Scraping old years
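As a starting point for the first two items, the arguments might look something like the sketch below. None of this exists in the script yet: the flag names (`--save-to`, `--outfile`, `--host`, `--user`, `--password`, `--database`) and the helpers `build_parser` and `validate` are assumptions made up for illustration.

```python
import argparse


def build_parser():
    """Build a parser with hypothetical options for where to save the bands."""
    parser = argparse.ArgumentParser(description="Save scraped bands to a file or a database.")
    parser.add_argument("--save-to", choices=["file", "database"], default="file",
                        help="where to store the scraped bands")
    parser.add_argument("--outfile", default="bands.txt",
                        help="output file, used with --save-to file")
    # Connection settings, only relevant with --save-to database.
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--user")
    parser.add_argument("--password")
    parser.add_argument("--database")
    return parser


def validate(args, parser):
    """Exit with a usage error if database mode lacks the required settings."""
    if args.save_to == "database" and not (args.user and args.database):
        parser.error("--user and --database are required with --save-to database")
    return args


if __name__ == "__main__":
    parser = build_parser()
    args = validate(parser.parse_args(), parser)
    print("Saving to:", args.save_to)
```

Defaulting to a file keeps the script usable without a MySQL server, while the database settings are only checked when they are actually needed.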