Sam got really excited and remade http://cufcq.com in Python Tornado, with rethinkdb.
Follow these instructions:
- Install the Python dependencies (see `requirements.txt`).
- Run `rethinkdb` in the folder, in a separate terminal tab.
- Run `python main.py` to run the site. If you have all dependencies installed, it should begin building the database.
- It should host! But it will be boring cuz there's no data in it.
- `python scraper.py` will scrape the Office of Planning and Budget for data. This command cannot be called in `python3`. In about 10 minutes you should have cloned the unorganized Office of Planning and Budget website. More specific options can be set to download one specific campus or yearterm.
- `python3 main.py --digest=20081-BD.csv` will digest that csv file. It takes about 1-2 seconds per file.
- Run `python3 main.py --digest=ALL` to digest every csv file. This will take about 5 minutes.
- `python3 main.py --cleanup` will associate data dependencies, and then build over-time data and statistics for each document, tying relevant data together. This will take about 25 minutes.
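
To give a rough feel for the digest step, here is a minimal sketch of turning one csv file into a list of documents. It is not the code in `main.py`: the names `parse_digest_name` and `digest_csv`, the sample column names, and the reading of `20081-BD.csv` as a yearterm (`20081`) plus a campus code (`BD`) are all assumptions made for illustration.

```python
import csv
import io
import os


def parse_digest_name(filename):
    """Split a name like '20081-BD.csv' into (yearterm, campus).

    Assumption: files are named '<yearterm>-<campus>.csv'.
    """
    stem, _ext = os.path.splitext(os.path.basename(filename))
    yearterm, campus = stem.split("-", 1)
    return yearterm, campus


def digest_csv(handle, filename):
    """Read csv rows from an open handle and return a list of dicts."""
    yearterm, campus = parse_digest_name(filename)
    documents = []
    for row in csv.DictReader(handle):
        # Trim stray whitespace; skip overflow cells (key None) and
        # treat missing cells (value None) as empty strings.
        doc = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        # Tag every document with where and when it came from.
        doc["yearterm"] = yearterm
        doc["campus"] = campus
        documents.append(doc)
    return documents


# Hypothetical two-column sample; the real files have their own columns.
sample = io.StringIO("Instructor,Course\nDoe , CSCI 1300\n")
docs = digest_csv(sample, "20081-BD.csv")
```

A list like `docs` is the kind of thing that could then be handed to the database in one batch, which is one plausible reason a single file only takes a second or two.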
Run `rethinkdb` in a separate terminal tab. Then:
python scraper.py
python3 main.py --digest=ALL
python3 main.py --cleanup
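
If you'd rather not babysit three commands, the same sequence can be wrapped in a small driver. This is a sketch and not something that ships with the repo: `PIPELINE` and `run_pipeline` are hypothetical names, and it assumes `python` resolves to a Python 2 interpreter (since the scraper cannot be called in `python3`) and that `rethinkdb` is already running in another tab.

```python
import subprocess

# The three steps above, in order. The scraper uses `python` because it
# cannot be called in python3; the other two steps use python3.
PIPELINE = [
    ["python", "scraper.py"],
    ["python3", "main.py", "--digest=ALL"],
    ["python3", "main.py", "--cleanup"],
]


def run_pipeline(commands=PIPELINE, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        print("running:", " ".join(argv))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a failed scrape never feeds into the digest step.
        runner(argv, check=True)
```

Calling `run_pipeline()` from the repo folder kicks off all three steps. The `runner` parameter is only there so the sequence can be dry-run or tested without actually launching anything.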
After running all of this, you will have a database equivalent to the current setup, with improvements. It can also hold Colorado Springs and Denver data. All in under an hour.
If this scq thing is gonna be successful, it needs some strong data already in it. Having the existing fcq data close by, on a similar stack to what we're building, gives us a chance to superpower scq when it launches. This will also give us a chance to "Bake Two Cakes", if you will. 🍰 Because cufcq and scq are similar stacks, whatever we find works well in one stack might also work well in the other. It also means we have two opportunities to lay down a consistent set of css and javascript, and a consistent set of well-designed handlers and apis. Eventually, we can tie integrations between cufcq and scq, and the result will have a sexy and important "ratemyprofessors" feel to it. We'll need that. As developers, we shouldn't be afraid to tackle code 💪
Totes mcgotes it's feature creep. Which is why I'm taking full responsibility for 100% of cufcq's work. I've done this once before, and I'll do it again. I have this feature under control, and I can work on it speedily with my resources and experience.
- Boilerplate
- Scraper
- Generator
- Precomputed Data
  - Associations
  - Stats
  - Chronology
- Handlers
- APIs
- Search
- Surveys
- Integrate
- Views
- Charts
- Styling