DJANGO_SETTINGS_MODULE=exampleproject.settings.dev
DATABASE_URL=postgis:///tx_highered
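As a rough illustration of what the `DATABASE_URL` value above encodes, the sketch below splits it with Python's standard library. The helper name `describe_database_url` is hypothetical, and the project's own settings may parse the URL differently. An empty host typically means the client falls back to its default local connection.

```python
from urllib.parse import urlsplit


def describe_database_url(url):
    """Split a DATABASE_URL-style string into its parts (illustrative helper)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,           # e.g. "postgis"
        "host": parts.hostname or "",     # empty when no host is given
        "port": parts.port,               # None when no port is given
        "user": parts.username or "",     # empty when no user is given
        "name": parts.path.lstrip("/"),   # database name
    }


info = describe_database_url("postgis:///tx_highered")
print(info["scheme"], info["name"])
```

Here the scheme selects the PostGIS backend and the path carries the database name; host, port, and user are all left unset.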
Complete guide to getting started (skip any steps that don't apply to you):
# install postgresql libpq-dev
git clone $REPOSITORY && cd $PROJECT_DIR  # placeholder for the directory git just created (not the shell's $PATH)
mkvirtualenv tx_higher_ed
setvirtualenvproject
add2virtualenv .
pip install -r requirements.txt
# if you need to create a database:
# `postdoc` greatly simplifies connecting to Docker databases
pip install postdoc
phd createdb --encoding=UTF8 -T template0
echo "CREATE EXTENSION postgis;" | phd psql
echo "CREATE EXTENSION postgis_topology;" | phd psql
# or if you need to reset your database:
make resetdb
# syncdb and load fixtures
make syncdb
#######################################################################
# You can stop at this point if you're just playing with the project. #
#######################################################################
# if using 2012 data, bump it up to 2014 standards
python tx_highered/scripts/2014_update.py
# get IPEDS data; requires https://github.com/texastribune/ipeds_reporter
../ipeds_reporter/csv_downloader/csv_downloader.py \
--uid data/ipeds/ipeds_institutions.uid --mvl data/ipeds
mv ~/Downloads/Data_*.csv data/ipeds
# get THECB data
cd data && make all
# load data
# timing: 10m25.069s
make load
# post-process the data
python exampleproject/manage.py tx_highered_process
####################################
# placeholder for post-2014 update #
####################################
# once that happens, the 2012->2014 specific steps can be removed and the
# import instructions above updated
This project currently requires a PostGIS database (hopefully not for long):
$ phd createdb
$ phd psql
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
You can use a SQL dump to move data from one Postgres database to another (excluding geo info):
$ phd SOURCE_DATABASE_URL pg_dump --no-owner --no-acl --table='tx_highered*' --clean > tx_highered.sql
$ phd DEST_DATABASE_URL psql -f tx_highered.sql
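The dump-and-restore steps above could also be scripted. The following is a minimal sketch, not part of the project: `dump_command`, `restore_command`, and `copy_tables` are hypothetical helpers, and it assumes `phd` takes its first argument (here the placeholders `SOURCE_DATABASE_URL` and `DEST_DATABASE_URL`) to select the database, as in the commands above. Passing arguments as a list bypasses the shell, so the `tx_highered*` pattern reaches `pg_dump` unexpanded.

```python
import subprocess


def dump_command(source, pattern="tx_highered*"):
    """Argument list mirroring the pg_dump invocation above."""
    return ["phd", source, "pg_dump", "--no-owner", "--no-acl",
            "--table=" + pattern, "--clean"]


def restore_command(dest, sql_path):
    """Argument list mirroring the psql invocation above."""
    return ["phd", dest, "psql", "-f", sql_path]


def copy_tables(source, dest, sql_path="tx_highered.sql"):
    """Dump matching tables from `source` to a file, then load it into `dest`."""
    with open(sql_path, "wb") as out:
        subprocess.run(dump_command(source), stdout=out, check=True)
    subprocess.run(restore_command(dest, sql_path), check=True)
```

Calling `copy_tables("SOURCE_DATABASE_URL", "DEST_DATABASE_URL")` would run both steps and stop with an exception if either command fails.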
- Freeze the current data in a fixture
- Edit the tx_highered_YYYY.json.gz make task
- Run the task to save the data
- Adjust the loading scripts to reference the new fixture
- Deprecate (or delete) any one-time data migration scripts, e.g. 2014_update.py won't be necessary after 2015
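To make the fixture step concrete, here is a small sketch of writing records to a gzipped JSON file named after the `tx_highered_YYYY.json.gz` pattern. It is not the project's make task: `fixture_name`, `write_fixture`, and `read_fixture` are hypothetical helpers, and the single sample record is a made-up placeholder used only to show the round trip.

```python
import gzip
import json
import os
import tempfile


def fixture_name(year):
    """File name following the tx_highered_YYYY.json.gz pattern."""
    return "tx_highered_{}.json.gz".format(year)


def write_fixture(records, path):
    """Serialize records as JSON and gzip them to `path`."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(records, fh)


def read_fixture(path):
    """Load records back from a gzipped JSON fixture."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


# Placeholder record, not real data.
sample = [{"model": "tx_highered.institution", "pk": 1, "fields": {}}]
path = os.path.join(tempfile.mkdtemp(), fixture_name(2014))
write_fixture(sample, path)
```

Reading the file back with `read_fixture(path)` is a quick way to confirm a frozen fixture is intact before pointing the loading scripts at it.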
When it asks you for an Institution, enter a list of UnitIDs generated by:
list(Institution.objects.filter(ipeds_id__isnull=False).values_list('ipeds_id', flat=True))
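If the prompt expects the UnitIDs as text rather than a Python list, something like the following may help. This is a sketch under assumptions: the comma-separated format is a guess about what the prompt accepts, `format_unit_ids` is a hypothetical helper, and the IDs in the usage line are placeholders rather than real institutions.

```python
def format_unit_ids(ipeds_ids):
    """Join unique UnitIDs into a sorted, comma-separated string."""
    unique = sorted({int(i) for i in ipeds_ids})
    return ",".join(str(i) for i in unique)


# Placeholder IDs, including a duplicate that gets dropped.
print(format_unit_ids([222222, 111111, 222222]))
```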
To re-download data from THECB's website, first find the data file you want to refresh. It will be named something like "top_10_percent.html", and next to it there will be a file called "top_10_percent.POST" holding the POST data that produced it. From that file you can recreate the report with the command:
curl -X POST -d @top_10_percent.POST http://www.txhighereddata.org/interactive/accountability/InteractiveGenerate.cfm -s -v > blahblahblah.html
If you need to modify the report, you can reverse engineer it from the POST data and the form markup.
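One way to start reverse engineering a report is to decode the saved POST data into named fields. The sketch below assumes the `.POST` file holds URL-encoded form data (which is what `curl -d @file` sends by default); `load_form_fields` and `build_report_request` are hypothetical helpers, and the field names in the usage lines are invented placeholders, not the report's real parameters.

```python
from urllib.parse import parse_qs
from urllib.request import Request

REPORT_URL = ("http://www.txhighereddata.org/interactive/accountability/"
              "InteractiveGenerate.cfm")


def load_form_fields(post_body):
    """Decode a saved POST body into {field: [values]} for inspection."""
    return parse_qs(post_body, keep_blank_values=True)


def build_report_request(post_body):
    """Build (but do not send) the POST request that regenerates a report."""
    return Request(REPORT_URL, data=post_body.encode("utf-8"), method="POST")


# Placeholder body; a real one would be read from e.g. top_10_percent.POST.
body = "inst=1&year="
fields = load_form_fields(body)
request = build_report_request(body)
```

Passing `request` to `urllib.request.urlopen` would send it, mirroring the curl command above. Editing the decoded fields and re-encoding them with `urllib.parse.urlencode(fields, doseq=True)` is one way to try a modified report.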
(c) 2012 The Texas Tribune